Implementing and showcasing deep reinforcement learning algorithms, following the suggestions in OpenAI Spinning Up.
- Deep Q learning
- Deep Recurrent Q learning
- Double Deep Q learning
- Dueling Deep Q learning
- Dueling Double Deep Q learning
- Prioritized Experience Replay
- Vanilla Policy Gradient [Reinforce]
- Value Actor Critic
- Advantage Actor Critic
- Proximal Policy Optimization
- Generalized Advantage Estimation (GAE)
- Soft Actor Critic
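
As a rough illustration of the GAE entry in the list above, here is a minimal sketch of the advantage computation in plain Python. It is not taken from this repository: the function name `gae_advantages`, its argument layout, and the default values of `gamma` and `lam` are assumptions made for the example. It also assumes a single trajectory segment with no episode boundary in the middle, and that `values` holds one extra bootstrap entry for the state after the last step (use 0.0 if that state is terminal).

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one trajectory segment.

    rewards: list of T rewards.
    values:  list of T + 1 value estimates; the last entry is the
             bootstrap value of the state after the final step
             (assumed layout, hypothetical helper).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the discounted sum of later steps.
    for t in reversed(range(len(rewards))):
        # One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # A_t = delta_t + gamma * lam * A_{t+1}
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

With `lam=0` this reduces to the one-step TD error, and with `lam=1` it becomes the discounted return minus the value baseline, so `lam` trades bias against variance between those two extremes.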
All implementations up to Advantage Actor Critic are done. Working on updating the comments.
- Add text explaining the inner workings of the code.
- Solve other environments like CartPole and Atari 2600.
- Use CNNs.
- The actor critic implementations are bare-bones. Improvement methods still need to be implemented for better performance.