easy-21

Coursework for the Reinforcement Learning course by David Silver: study notes.

Topics
- Monte-Carlo Control
- Temporal-Difference Learning
- Linear Function Approximation
- On-Policy Learning vs. Off-Policy Learning

Check the notebooks for the studies.