Deep Reinforcement Learning
Introduction to Deep RL
Neural Networks in RL
Representation Learning
End-to-End Learning
Deep Q-Networks (DQN)
Neural Network Q-Function Approximation
Network Architecture Design
Loss Function Definition
Training Procedures
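As a concrete illustration of the three items above, here is a minimal sketch of a fully connected Q-network and a one-step TD loss in PyTorch. The class name, layer sizes, and batch layout are assumptions for illustration, not taken from the source; the target here is computed from the same network, with stabilization via a separate target network deferred to the next section.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNetwork(nn.Module):
    """Small MLP mapping a state vector to one Q-value per discrete action."""
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state):
        return self.net(state)

def dqn_loss(q_net, batch, gamma: float = 0.99):
    """Huber loss between Q(s, a) and the one-step bootstrapped target.
    Note: the same network produces the target here; using a frozen target
    network (next section) is the usual stabilization."""
    states, actions, rewards, next_states, dones = batch
    q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q = q_net(next_states).max(dim=1).values
        target = rewards + gamma * (1.0 - dones) * next_q
    return F.smooth_l1_loss(q_sa, target)
```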
Stabilizing Deep RL
Experience Replay
Replay Buffer Implementation
Breaking Sample Correlations
Batch Sampling Strategies
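A minimal uniform replay buffer along the lines described above. The `Transition` fields, capacity, and tensor conversions are illustrative assumptions; the key point is that sampling uniformly at random from a large buffer breaks the correlation between consecutive transitions.

```python
import random
from collections import deque, namedtuple

import torch

# Illustrative transition record; field names are assumptions, not from the source.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Fixed-size FIFO buffer with uniform random batch sampling."""

    def __init__(self, capacity: int = 100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, *transition):
        self.buffer.append(Transition(*transition))

    def sample(self, batch_size: int):
        batch = random.sample(self.buffer, batch_size)  # uniform, without replacement
        states, actions, rewards, next_states, dones = zip(*batch)
        return (torch.as_tensor(states, dtype=torch.float32),
                torch.as_tensor(actions, dtype=torch.int64),
                torch.as_tensor(rewards, dtype=torch.float32),
                torch.as_tensor(next_states, dtype=torch.float32),
                torch.as_tensor(dones, dtype=torch.float32))

    def __len__(self):
        return len(self.buffer)
```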
Fixed Q-Targets
Target Network Updates
Stabilizing Training
Update Frequencies
Gradient Clipping
Reward Clipping
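Putting the stabilization pieces together, one possible DQN training step with a periodically synchronized target network, reward clipping, and gradient-norm clipping is sketched below. It assumes the `QNetwork` and batch layout from the earlier sketches; the update frequency and clipping thresholds are illustrative hyperparameters, not values from the source.

```python
import copy
import torch

def make_target_network(q_net):
    """Frozen copy of the online network used to compute bootstrap targets."""
    target = copy.deepcopy(q_net)
    for p in target.parameters():
        p.requires_grad_(False)
    return target

def train_step(q_net, target_net, optimizer, batch, step: int,
               gamma: float = 0.99, target_update_every: int = 1000):
    states, actions, rewards, next_states, dones = batch
    rewards = rewards.clamp(-1.0, 1.0)                       # reward clipping
    q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():                                    # fixed Q-targets
        next_q = target_net(next_states).max(dim=1).values
        target = rewards + gamma * (1.0 - dones) * next_q
    loss = torch.nn.functional.smooth_l1_loss(q_sa, target)

    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(q_net.parameters(), max_norm=10.0)  # gradient clipping
    optimizer.step()

    if step % target_update_every == 0:                      # periodic hard update
        target_net.load_state_dict(q_net.state_dict())
    return loss.item()
```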
DQN Improvements
Double DQN
Overestimation Bias Problem
Double Estimation Solution
Performance Improvements
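A sketch of the Double DQN target computation: the online network selects the greedy next action and the target network evaluates it. This decoupling of selection from evaluation is what reduces the overestimation bias of the plain max-based target. Function and argument names are illustrative.

```python
import torch

def double_dqn_target(q_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Online network picks the action (selection), target network scores it (evaluation)."""
    with torch.no_grad():
        best_actions = q_net(next_states).argmax(dim=1, keepdim=True)        # selection
        next_q = target_net(next_states).gather(1, best_actions).squeeze(1)  # evaluation
        return rewards + gamma * (1.0 - dones) * next_q
```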
Dueling DQN
Value and Advantage Decomposition
Network Architecture
Aggregation Methods
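One way to write the dueling architecture, assuming the same small MLP setting as before: a shared trunk feeds separate value and advantage streams, which are recombined by subtracting the mean advantage so that the decomposition into V and A is identifiable.

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Shared trunk with value and advantage heads,
    recombined as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value_head = nn.Linear(hidden, 1)
        self.advantage_head = nn.Linear(hidden, num_actions)

    def forward(self, state):
        h = self.trunk(state)
        value = self.value_head(h)              # shape (batch, 1)
        advantage = self.advantage_head(h)      # shape (batch, num_actions)
        # Subtracting the mean advantage is one common aggregation choice.
        return value + advantage - advantage.mean(dim=1, keepdim=True)
```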
Prioritized Experience Replay
TD-Error Based Prioritization
Importance Sampling Corrections
Implementation Details
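A compact sketch of proportional prioritized replay using a flat priority array (production implementations typically use a sum tree for O(log n) sampling). The hyperparameters `alpha` and `beta` and all names are illustrative; priorities come from absolute TD errors, and the returned importance-sampling weights correct the bias introduced by non-uniform sampling.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Proportional prioritization: P(i) ~ priority_i ** alpha."""
    def __init__(self, capacity: int = 100_000, alpha: float = 0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def push(self, transition):
        # New transitions get the current maximum priority so they are seen at least once.
        max_p = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size: int, beta: float = 0.4):
        p = self.priorities[:len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the non-uniform sampling bias.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return [self.data[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps: float = 1e-5):
        self.priorities[idx] = np.abs(td_errors) + eps
```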
Rainbow DQN
Combining Multiple Improvements
Distributional RL
Noisy Networks
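As one example of a Rainbow ingredient, below is a sketch of a factorised-Gaussian noisy linear layer, which replaces epsilon-greedy exploration with learned parameter noise. Initialization constants and names follow common implementations and are assumptions, not taken from the source.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Linear layer whose weights are mu + sigma * eps with factorised Gaussian noise."""
    def __init__(self, in_features: int, out_features: int, sigma0: float = 0.5):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        self.register_buffer("eps_in", torch.zeros(in_features))
        self.register_buffer("eps_out", torch.zeros(out_features))
        bound = 1.0 / math.sqrt(in_features)
        nn.init.uniform_(self.weight_mu, -bound, bound)
        nn.init.uniform_(self.bias_mu, -bound, bound)
        nn.init.constant_(self.weight_sigma, sigma0 * bound)
        nn.init.constant_(self.bias_sigma, sigma0 * bound)
        self.reset_noise()

    @staticmethod
    def _f(x):
        # Signed square-root transform used for factorised noise.
        return x.sign() * x.abs().sqrt()

    def reset_noise(self):
        self.eps_in.copy_(self._f(torch.randn(self.in_features)))
        self.eps_out.copy_(self._f(torch.randn(self.out_features)))

    def forward(self, x):
        weight = self.weight_mu + self.weight_sigma * torch.outer(self.eps_out, self.eps_in)
        bias = self.bias_mu + self.bias_sigma * self.eps_out
        return F.linear(x, weight, bias)
```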
Deep RL for Continuous Actions
Challenges with Continuous Actions
Deep Deterministic Policy Gradient (DDPG)
Actor-Critic Architecture
Deterministic Policy Gradients
Exploration Strategies
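A condensed DDPG sketch under the usual assumptions (low-dimensional state vector, actions bounded in [-max_action, max_action]): a deterministic tanh actor, a critic that takes the state-action pair, and an update that regresses the critic onto target-network bootstraps, ascends Q(s, pi(s)) for the actor, and soft-updates the targets. Exploration would add, for example, Gaussian noise to the actor's output when acting in the environment. All names and hyperparameters here are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Actor(nn.Module):
    """Deterministic policy: state -> continuous action in [-max_action, max_action]."""
    def __init__(self, state_dim, action_dim, max_action=1.0, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, action_dim), nn.Tanh())
        self.max_action = max_action

    def forward(self, state):
        return self.max_action * self.net(state)

class Critic(nn.Module):
    """Q(s, a) for continuous actions: the action is concatenated to the state."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=1)).squeeze(1)

def ddpg_update(actor, critic, actor_tgt, critic_tgt, actor_opt, critic_opt,
                batch, gamma=0.99, tau=0.005):
    states, actions, rewards, next_states, dones = batch

    # Critic: regress Q(s, a) onto the bootstrapped target from the target networks.
    with torch.no_grad():
        target_q = rewards + gamma * (1 - dones) * critic_tgt(next_states, actor_tgt(next_states))
    critic_loss = F.mse_loss(critic(states, actions), target_q)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: deterministic policy gradient, i.e. ascend Q(s, pi(s)).
    actor_loss = -critic(states, actor(states)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Polyak (soft) updates keep the target networks slowly tracking the online ones.
    for tgt, src in ((actor_tgt, actor), (critic_tgt, critic)):
        for p_tgt, p in zip(tgt.parameters(), src.parameters()):
            p_tgt.data.mul_(1 - tau).add_(tau * p.data)
```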
Twin Delayed DDPG (TD3)
Addressing Overestimation
Delayed Policy Updates
Target Policy Smoothing
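Finally, a sketch of the TD3 target, which combines target policy smoothing (clipped noise added to the target action) with a clipped double-Q minimum over two target critics to curb overestimation. The third TD3 ingredient, delayed policy updates, simply means the actor and the target networks are updated only once every few critic updates. Names and noise scales are illustrative.

```python
import torch

def td3_target(actor_tgt, critic1_tgt, critic2_tgt, rewards, next_states, dones,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5, max_action=1.0):
    """TD3 bootstrap target: smoothed target action, min over two target critics.
    (Delayed policy updates: step the actor only every `policy_delay` critic steps.)"""
    with torch.no_grad():
        next_action = actor_tgt(next_states)
        noise = (torch.randn_like(next_action) * policy_noise).clamp(-noise_clip, noise_clip)
        next_action = (next_action + noise).clamp(-max_action, max_action)   # policy smoothing
        q1 = critic1_tgt(next_states, next_action)
        q2 = critic2_tgt(next_states, next_action)
        return rewards + gamma * (1 - dones) * torch.min(q1, q2)             # clipped double-Q
```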