Reinforcement Learning
Complete Knowledge of MDP
Finite State and Action Spaces
Computational Requirements
Iterative Policy Evaluation Algorithm
Convergence Properties
Computational Complexity
Stopping Criteria
In-Place vs Synchronous Updates
Policy Improvement Theorem
Greedy Policy Construction
Policy Improvement Guarantees
Monotonic Improvement Property
Evaluation Step
Improvement Step
Termination Conditions
Value Iteration Algorithm
Update Rules
Relationship to Policy Iteration
Interleaving Evaluation and Improvement
Flexible Implementation
Convergence Guarantees
Modified Policy Iteration
Asynchronous Dynamic Programming
Prioritized Sweeping
Curse of Dimensionality
Model Requirements
Computational Scalability