Bellman Equations
Summary
So what have we learned so far in our posts on the Introduction and on the Markov Property, Chain, Reward Process, and Decision Process?
First, we defined our basic concepts:
* State: What does our current environment look like?
* Action: What action are we taking?
* Policy $\pi(a \mid s)$: The probability of taking action $a$ in state $s$
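These three concepts can be sketched in a few lines of code. The states, actions, and probabilities below are illustrative placeholders (not taken from the posts); the point is only that a stochastic policy is a mapping from each state to a probability distribution over actions.

```python
import random

# Illustrative state and action spaces (assumed names, for demonstration only).
states = ["s0", "s1"]
actions = ["left", "right"]

# Stochastic policy: policy[s][a] is pi(a | s), the probability of
# taking action a when in state s.
policy = {
    "s0": {"left": 0.9, "right": 0.1},
    "s1": {"left": 0.2, "right": 0.8},
}

rng = random.Random(0)

def sample_action(state):
    """Sample an action a with probability pi(a | s)."""
    probs = policy[state]
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Sanity check: each state's action probabilities sum to 1.
assert all(abs(sum(p.values()) - 1.0) < 1e-9 for p in policy.values())
```

Calling `sample_action("s0")` then returns `"left"` about 90% of the time, which is exactly what "taking action $a$ in state $s$ with probability $\pi(a \mid s)$" means in practice.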