Reinforcement Learning

  1. Dynamic Programming
    1. Assumptions and Prerequisites
      1. Complete Knowledge of MDP
      2. Finite State and Action Spaces
      3. Computational Requirements
    2. Policy Evaluation
      1. Iterative Policy Evaluation Algorithm
      2. Convergence Properties
      3. Computational Complexity
      4. Stopping Criteria
      5. In-Place vs Synchronous Updates
    3. Policy Improvement
      1. Policy Improvement Theorem
      2. Greedy Policy Construction
      3. Policy Improvement Guarantees
      4. Monotonic Improvement Property
    4. Policy Iteration
      1. Policy Iteration Algorithm
        1. Evaluation Step
        2. Improvement Step
        3. Termination Conditions
      2. Convergence Analysis
        1. Computational Complexity
        2. Finite Convergence Property
    5. Value Iteration
      1. Value Iteration Algorithm
      2. Update Rules
      3. Stopping Criteria
      4. Convergence Properties
      5. Relationship to Policy Iteration
    6. Generalized Policy Iteration
      1. Interleaving Evaluation and Improvement
      2. Flexible Implementation
      3. Convergence Guarantees
    7. Extensions and Variations
      1. Modified Policy Iteration
      2. Asynchronous Dynamic Programming
      3. Prioritized Sweeping
    8. Limitations of Dynamic Programming
      1. Curse of Dimensionality
      2. Model Requirements
      3. Computational Scalability