LLM Pre-training

Transformer Architecture

  • Self-Attention Mechanism
  • Multi-Head Attention
  • Feed Forward Network (FFN)
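The core of the attention blocks listed above can be sketched in a few lines of NumPy. This is a minimal single-head illustration (not the implementation used by any particular library): `Wq`, `Wk`, `Wv` are assumed projection matrices, and multi-head attention would simply run several such heads on split dimensions and concatenate the results.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model). Project into queries, keys, values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # pairwise similarity, scaled by sqrt(d_k)
    weights = softmax(scores, axis=-1)   # each row is a distribution over positions
    return weights @ V                   # weighted mixture of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one output vector per input position
```

The per-position FFN from the same list is just two dense layers with a nonlinearity, applied independently at every position.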

Positional Encoding

  • Absolute Positional Encoding (Sinusoidal)
  • Rotary Positional Embedding (RoPE)
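The absolute (sinusoidal) scheme above can be sketched directly from its definition: even dimensions carry sines, odd dimensions carry cosines, with geometrically spaced wavelengths. A minimal NumPy version, for illustration only:

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    # pos: (seq_len, 1), i indexes the d_model // 2 frequency pairs.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dims: sine
    pe[:, 1::2] = np.cos(angles)  # odd dims: cosine
    return pe

pe = sinusoidal_pe(16, 8)
print(pe.shape)  # (16, 8)
```

RoPE, by contrast, rotates each query/key dimension pair by a position-dependent angle inside the attention computation instead of adding a vector to the embeddings.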

Training Objectives

  • Masked Language Modeling (MLM)
  • Causal Language Modeling (CLM)
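The two objectives differ mainly in what the model is allowed to see. A small sketch of the CLM side, under the usual conventions (a lower-triangular attention mask, and targets formed by shifting the input one token left); MLM would instead replace a random subset of tokens with a mask token and predict only those:

```python
import numpy as np

def causal_mask(seq_len):
    # Position i may attend only to positions j <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def clm_shift(tokens):
    # CLM: predict token t+1 from the prefix up to token t.
    return tokens[:-1], tokens[1:]

inputs, targets = clm_shift(np.array([1, 5, 3, 2, 9]))
print(inputs, targets)  # [1 5 3 2] [5 3 2 9]
```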

AI-HPC Organization