Generative AI

  1. Transformer Architecture and Language Models
    1. Transformer Fundamentals
      1. Attention Mechanism
        1. Scaled Dot-Product Attention
        2. Multi-Head Attention
        3. Attention Patterns
      2. Positional Encoding
        1. Sinusoidal Encoding
        2. Learned Positional Embeddings
        3. Relative Position Encoding
      3. Layer Structure
        1. Self-Attention Layers
        2. Feed-Forward Networks
        3. Residual Connections
        4. Layer Normalization
    2. Language Model Pretraining
      1. Pretraining Objectives
        1. Causal Language Modeling
        2. Masked Language Modeling
        3. Prefix Language Modeling
      2. Scaling Laws
        1. Parameter Scaling
        2. Data Scaling
        3. Compute Scaling
      3. Training Infrastructure
        1. Distributed Training
        2. Mixed Precision Training
        3. Gradient Accumulation
  2. Large Language Models
    1. GPT Family
      1. GPT-1 Architecture
      2. GPT-2 Scaling
      3. GPT-3 Emergence
      4. GPT-4 Capabilities
    2. BERT and Bidirectional Models
      1. BERT Pretraining
      2. RoBERTa Improvements
      3. DeBERTa Enhancements
    3. Open Source Models
      1. LLaMA Architecture
      2. Falcon Models
      3. MPT Models
      4. Mistral Models
  3. Fine-Tuning and Adaptation
    1. Supervised Fine-Tuning
      1. Task-Specific Adaptation
      2. Few-Shot Learning
      3. In-Context Learning
    2. Instruction Tuning
      1. Instruction Following
      2. Multi-Task Learning
      3. Chain-of-Thought Training
    3. Reinforcement Learning from Human Feedback
      1. Reward Modeling
      2. Policy Optimization
      3. Constitutional AI
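
As a companion to the attention entries at the top of the outline, here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, in plain NumPy. The function name and toy shapes are illustrative choices, not taken from any specific library.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D Q, K, V arrays."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable row-wise softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a convex combination of V rows

# Toy example: 3 query positions, 3 key/value positions, d_k = 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Multi-head attention, listed under the same heading, repeats this computation in parallel over several learned projections of Q, K, and V and concatenates the results.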