Jax Implementation of Data-regularized Q (DrQ)
(It's my lecture project of Reinforcement Learning :)
python drq.py cfg=walker_walk train_seed=0
Compare with official implementation
Running the code requires ≈38 GB GPU memory.
As I can access large memory GPUs, so I did not implement a memory-efficient replay buffer for image observations.
Leave it for future work (下次一定!)