Reinforcement Learning
1. Foundations of Reinforcement Learning
2. Mathematical Foundations
3. Markov Decision Processes
4. Dynamic Programming
5. Monte Carlo Methods
6. Temporal-Difference Learning
7. Function Approximation
8. Deep Reinforcement Learning
9. Policy Gradient Methods
10. Advanced Topics
11. Implementation and Practical Considerations
12. Applications and Case Studies
5. Monte Carlo Methods
5.1. Introduction to Monte Carlo RL
5.1.1. Model-Free Learning
5.1.2. Sample-Based Estimation
5.1.3. Episode-Based Learning
5.2. Monte Carlo Prediction
5.2.1. First-Visit Monte Carlo
5.2.1.1. Algorithm Description
5.2.1.2. Convergence Properties
5.2.1.3. Unbiased Estimation
5.2.2. Every-Visit Monte Carlo
5.2.2.1. Algorithm Description
5.2.2.2. Differences from First-Visit
5.2.2.3. Convergence Analysis
5.2.3. Incremental Implementation
5.2.3.1. Online Updates
5.2.3.2. Running Averages
5.2.3.3. Memory Efficiency
5.3. Monte Carlo Control
5.3.1. Monte Carlo ES (Exploring Starts)
5.3.1.1. Algorithm Description
5.3.1.2. Exploration Requirements
5.3.1.3. Convergence Properties
5.3.2. On-Policy Monte Carlo Control
5.3.2.1. Epsilon-Greedy Policies
5.3.2.2. Soft Policies
5.3.2.3. GLIE Conditions
5.3.3. Off-Policy Monte Carlo Control
5.3.3.1. Importance Sampling
5.3.3.2. Weighted Importance Sampling
5.3.3.3. Ordinary Importance Sampling
5.4. Advantages and Limitations
5.4.1. Model-Free Nature
5.4.2. Unbiased Estimates
5.4.3. High Variance
5.4.4. Episode Completion Requirements