1. Introduction to Generative AI
2. Mathematical and Technical Foundations
3. Machine Learning Fundamentals
4. Core Generative Model Architectures
5. Transformer Architecture and Language Models
6. Text Generation Applications
7. Image and Visual Generation
8. Audio and Speech Generation
9. Development Lifecycle and Best Practices
10. Practical Implementation Tools
11. Ethical Considerations and Responsible AI
12. Legal and Regulatory Landscape
13. Economic and Social Impact
14. Future Directions and Emerging Trends
5. Transformer Architecture and Language Models
5.1. Transformer Fundamentals
5.1.1. Attention Mechanism
5.1.1.1. Scaled Dot-Product Attention
5.1.1.2. Multi-Head Attention
5.1.1.3. Attention Patterns
5.1.2. Positional Encoding
5.1.2.1. Sinusoidal Encoding
5.1.2.2. Learned Positional Embeddings
5.1.2.3. Relative Position Encoding
5.1.3. Layer Structure
5.1.3.1. Self-Attention Layers
5.1.3.2. Feed-Forward Networks
5.1.3.3. Residual Connections
5.1.3.4. Layer Normalization
5.2. Language Model Pretraining
5.2.1. Pretraining Objectives
5.2.1.1. Causal Language Modeling
5.2.1.2. Masked Language Modeling
5.2.1.3. Prefix Language Modeling
5.2.2. Scaling Laws
5.2.2.1. Parameter Scaling
5.2.2.2. Data Scaling
5.2.2.3. Compute Scaling
5.2.3. Training Infrastructure
5.2.3.1. Distributed Training
5.2.3.2. Mixed Precision Training
5.2.3.3. Gradient Accumulation
5.3. Large Language Models
5.3.1. GPT Family
5.3.1.1. GPT-1 Architecture
5.3.1.2. GPT-2 Scaling
5.3.1.3. GPT-3 Emergence
5.3.1.4. GPT-4 Capabilities
5.3.2. BERT and Bidirectional Models
5.3.2.1. BERT Pretraining
5.3.2.2. RoBERTa Improvements
5.3.2.3. DeBERTa Enhancements
5.3.3. Open Source Models
5.3.3.1. LLaMA Architecture
5.3.3.2. Falcon Models
5.3.3.3. MPT Models
5.3.3.4. Mistral Models
5.4. Fine-Tuning and Adaptation
5.4.1. Supervised Fine-Tuning
5.4.1.1. Task-Specific Adaptation
5.4.1.2. Few-Shot Learning
5.4.1.3. In-Context Learning
5.4.2. Instruction Tuning
5.4.2.1. Instruction Following
5.4.2.2. Multi-Task Learning
5.4.2.3. Chain-of-Thought Training
5.4.3. Reinforcement Learning from Human Feedback
5.4.3.1. Reward Modeling
5.4.3.2. Policy Optimization
5.4.3.3. Constitutional AI
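
To make topic 5.1.1.1 concrete: the scaled dot-product attention from "Attention Is All You Need" computes softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch (function name and toy dimensions are illustrative, not from this outline):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarity, scaled
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of value rows

# Toy self-attention: 3 tokens, head dimension 4, Q = K = V = X
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)  # (3, 4): one output row per token
```

Each output row is a convex combination of the value rows; the sqrt(d_k) divisor keeps the dot products from saturating the softmax as the head dimension grows.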