Reinforcement Learning
1. Foundations of Reinforcement Learning
2. Mathematical Foundations
3. Markov Decision Processes
4. Dynamic Programming
5. Monte Carlo Methods
6. Temporal-Difference Learning
7. Function Approximation
8. Deep Reinforcement Learning
9. Policy Gradient Methods
10. Advanced Topics
11. Implementation and Practical Considerations
12. Applications and Case Studies
Markov Decision Processes
The Markov Property
Definition and Significance
Memoryless Property
Examples of Markov Processes
Non-Markov Processes and Solutions
Formal Definition of MDPs
State Space
Finite vs Infinite States
State Space Structure
State Features and Representation
Action Space
Available Actions per State
Action Space Constraints
Action Dependencies
Transition Probability Function
Transition Matrix Representation
Stochastic Transitions
Deterministic Special Cases
Reward Function
State-based Rewards
Action-based Rewards
State-Action-State Rewards
Expected Reward Calculation
Discount Factor
Present Value Calculation
Effects of Different Discount Rates
Undiscounted vs Discounted Returns
Returns and Value Functions
Return Definition
Cumulative Reward
Discounted Return
Average Return
State-Value Function
Expected Return from States
Value Function Properties
Action-Value Function
Expected Return from State-Action Pairs
Q-Function Properties
Relationship Between Value Functions
Policies in MDPs
Policy Definition
Deterministic Policies
State-to-Action Mapping
Stochastic Policies
Probability Distributions over Actions
Policy Parameterization
Policy Evaluation
Computing Value Functions for Given Policies
Bellman Equations
Bellman Expectation Equations
For State-Value Functions
For Action-Value Functions
Recursive Structure
Bellman Optimality Equations
Optimal State-Value Function
Optimal Action-Value Function
Uniqueness Properties
System of Linear Equations
Matrix Form Representation
Solving Bellman Equations
Optimality in MDPs
Optimal Policies
Definition of Optimality
Existence and Uniqueness
Partial Ordering of Policies
Optimal Value Functions
Optimal State Values
Optimal Action Values
Relationship to Optimal Policies
Finding Optimal Policies
Policy Extraction from Value Functions
Greedy Policy Construction
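
The last topics above (Bellman optimality equations, optimal value functions, and greedy policy construction) can be illustrated with a small sketch. The two-state, two-action MDP below is hypothetical and exists only for this example. The array layout (`P[a, s, s2]`, `R[a, s]`), the names, and the choice of value iteration as the solver are all assumptions rather than anything this outline prescribes.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP, invented purely for illustration.
# P[a, s, s2] = probability of moving from state s to state s2 under action a.
P = np.array([
    [[0.9, 0.1],   # action 0 taken in state 0
     [0.0, 1.0]],  # action 0 taken in state 1
    [[0.2, 0.8],   # action 1 taken in state 0
     [0.5, 0.5]],  # action 1 taken in state 1
])
# R[a, s] = expected immediate reward for taking action a in state s.
R = np.array([
    [0.0, 1.0],    # action 0
    [0.5, 0.0],    # action 1
])
gamma = 0.9        # discount factor, assumed to be strictly below 1


def action_values(P, R, gamma, v):
    """One-step lookahead: q[a, s] = R[a, s] + gamma * sum_s2 P[a, s, s2] * v[s2]."""
    return R + gamma * (P @ v)


def value_iteration(P, R, gamma, tol=1e-10, max_iter=10_000):
    """Repeatedly apply the Bellman optimality backup until values stop changing."""
    v = np.zeros(P.shape[1])
    for _ in range(max_iter):
        v_new = action_values(P, R, gamma, v).max(axis=0)  # best action per state
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    q = action_values(P, R, gamma, v)
    policy = q.argmax(axis=0)  # greedy policy: the maximizing action in each state
    return v, q, policy


v_star, q_star, policy = value_iteration(P, R, gamma)
print("state values:", v_star)
print("greedy policy:", policy)
```

For this particular MDP the greedy policy picks action 1 in state 0 and action 0 in state 1. The fixed point satisfies the Bellman optimality equation up to the chosen tolerance, which can be checked by comparing `v_star` with `q_star.max(axis=0)`.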