4. Dynamic Programming
4.1. Assumptions and Prerequisites
4.1.1. Complete Knowledge of MDP
4.1.2. Finite State and Action Spaces
4.1.3. Computational Requirements
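The assumptions in 4.1 amount to holding the full transition model of a finite MDP in memory. The sketches accompanying the later subsections assume a small, hypothetical tabular representation along these lines, where P[s][a] lists (probability, next state, reward) outcomes and GAMMA is the discount factor; all states, actions, and rewards shown are illustrative.

# Hypothetical three-state MDP: complete model (4.1.1), finite state and action sets (4.1.2).
# P[s][a] is a list of (probability, next_state, reward) tuples; state 2 is absorbing.
GAMMA = 0.9
STATES = [0, 1, 2]
ACTIONS = [0, 1]
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 2, 5.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 2, 0.0)]},
}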
4.2. Policy Evaluation
4.2.1. Iterative Policy Evaluation Algorithm
4.2.2. Convergence Properties
4.2.3. Computational Complexity
4.2.4. Stopping Criteria
4.2.5. In-Place vs Synchronous Updates
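A minimal sketch of iterative policy evaluation for the tabular setting of 4.2, assuming the hypothetical P[s][a] = [(probability, next_state, reward), ...] format from the 4.1 sketch; policy[s][a] is the probability of choosing a in s, and theta is the stopping threshold of 4.2.4.

def policy_evaluation(P, states, policy, gamma=0.9, theta=1e-8):
    """Sweep the Bellman expectation backup until no value changes by more than theta."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        V_new = {}
        for s in states:
            v = 0.0
            for a, action_prob in policy[s].items():
                for prob, s_next, reward in P[s][a]:
                    v += action_prob * prob * (reward + gamma * V[s_next])
            V_new[s] = v
            delta = max(delta, abs(v - V[s]))
        V = V_new                      # synchronous update: the whole sweep uses old values
        if delta < theta:
            return V

This is the synchronous variant of 4.2.5; the in-place variant writes each backup directly into V during the sweep and usually needs fewer sweeps.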
4.3. Policy Improvement
4.3.1. Policy Improvement Theorem
4.3.2. Greedy Policy Construction
4.3.3. Policy Improvement Guarantees
4.3.4. Monotonic Improvement Property
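For 4.3, a sketch of greedy policy construction via one-step lookahead on a value function, under the same hypothetical model format; by the policy improvement theorem, the greedy policy with respect to v_pi is at least as good as pi.

def q_from_v(P, s, actions, V, gamma):
    """One-step lookahead: action values at state s given state values V."""
    return {a: sum(prob * (reward + gamma * V[s_next])
                   for prob, s_next, reward in P[s][a])
            for a in actions}

def greedy_policy(P, states, actions, V, gamma=0.9):
    """Deterministic policy that picks an action maximizing the one-step lookahead."""
    policy = {}
    for s in states:
        q = q_from_v(P, s, actions, V, gamma)
        best = max(q, key=q.get)
        policy[s] = {a: 1.0 if a == best else 0.0 for a in actions}
    return policy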
4.4. Policy Iteration
4.4.1. Policy Iteration Algorithm
4.4.1.1. Evaluation Step
4.4.1.2. Improvement Step
4.4.1.3. Termination Conditions
4.4.2. Convergence Analysis
4.4.3. Computational Complexity
4.4.4. Finite Convergence Property
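Combining the two previous sketches gives the policy iteration loop of 4.4: evaluate the current policy, improve it greedily, and stop when the policy no longer changes. Because a finite MDP has only finitely many deterministic policies and each improvement is monotonic, the loop terminates (4.4.4).

def policy_iteration(P, states, actions, gamma=0.9, theta=1e-8):
    """Alternate full policy evaluation and greedy improvement until the policy is stable."""
    policy = {s: {a: 1.0 / len(actions) for a in actions} for s in states}  # uniform start
    while True:
        V = policy_evaluation(P, states, policy, gamma, theta)    # evaluation step (4.4.1.1)
        new_policy = greedy_policy(P, states, actions, V, gamma)  # improvement step (4.4.1.2)
        if new_policy == policy:                                  # termination (4.4.1.3)
            return policy, V
        policy = new_policy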
4.5. Value Iteration
4.5.1. Value Iteration Algorithm
4.5.2. Update Rules
4.5.3. Stopping Criteria
4.5.4. Convergence Properties
4.5.5. Relationship to Policy Iteration
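A sketch of value iteration (4.5) under the same assumptions: the separate evaluation phase is collapsed into a single Bellman optimality backup per state, applied in place until the largest change falls below theta; an optimal policy can then be read off with the greedy construction from the 4.3 sketch.

def value_iteration(P, states, actions, gamma=0.9, theta=1e-8):
    """Apply the Bellman optimality backup until the largest value change is below theta."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            q = {a: sum(prob * (reward + gamma * V[s_next])
                        for prob, s_next, reward in P[s][a])
                 for a in actions}
            best = max(q.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best                      # in-place update
        if delta < theta:
            return V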
4.6. Generalized Policy Iteration
4.6.1. Interleaving Evaluation and Improvement
4.6.2. Flexible Implementation
4.6.3. Convergence Guarantees
4.7. Extensions and Variations
4.7.1. Modified Policy Iteration
4.7.2. Asynchronous Dynamic Programming
4.7.3. Prioritized Sweeping
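As an illustration of 4.7.1, modified policy iteration truncates the evaluation step to a fixed number of sweeps k instead of running it to convergence; k = 1 behaves like value iteration and large k approaches full policy iteration. A hedged sketch reusing the hypothetical greedy_policy helper from the 4.3 sketch:

def modified_policy_iteration(P, states, actions, gamma=0.9, k=5, theta=1e-8, max_iters=10000):
    """Policy iteration with the evaluation step truncated to k in-place sweeps."""
    V = {s: 0.0 for s in states}
    policy = {s: {a: 1.0 / len(actions) for a in actions} for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for _ in range(k):                                    # truncated evaluation sweeps
            for s in states:
                v = sum(action_prob * prob * (reward + gamma * V[s_next])
                        for a, action_prob in policy[s].items()
                        for prob, s_next, reward in P[s][a])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
        policy = greedy_policy(P, states, actions, V, gamma)  # greedy improvement
        if delta < theta:                                     # values settled under current policy
            return policy, V
    return policy, V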
4.8. Limitations of Dynamic Programming
4.8.1. Curse of Dimensionality
4.8.2. Model Requirements
4.8.3. Computational Scalability