CarRacing-v0 | VizdoomDefendCenter-v0 |
---|---|
Real Rollout | Hallucinated Rollout |
---|---|
- Source set_pythonpath.bash
- Go into the dataset/car_racing directory and run rollout_wrapper.py
- Run make_csv.py in datasets/car_racing
- Call train_vae.py
- Reconstructions decoded from sampled noise can be seen in results
To generate more data, increase the number of rollouts in rollout.bash and rollout.py. The data is currently collected with a random action policy, so there isn't much variation in the road.
- Train VAE as above (duh!)
- Call train_mdrnn.py
- Train the VAE and MDRNN
- Call python train_controller.py --n-samples 4 --pop-size 6 --target-return 950 --max-workers=12
- max-workers sets how many members of the population are evaluated in parallel; each evaluation needs its own core
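As a rough illustration of what a worker cap means, the sketch below scores a small population with a bounded pool. It is not the code in train_controller.py: `evaluate` is a hypothetical toy objective standing in for a real rollout, and `evaluate_population` is a name made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(params):
    """Hypothetical stand-in for one rollout: a toy score for a parameter vector."""
    return -sum(p * p for p in params)


def evaluate_population(population, max_workers):
    """Score every candidate, running at most `max_workers` evaluations at once."""
    # Never start more workers than there are candidates to evaluate.
    n_workers = max(1, min(max_workers, len(population)))
    # Threads keep this sketch portable; CPU-bound rollouts would normally use
    # ProcessPoolExecutor so that each worker occupies its own core.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves input order, so scores[i] belongs to population[i].
        return list(pool.map(evaluate, population))


population = [[0.0, 0.0], [1.0, 2.0], [3.0, 0.0], [0.5, 0.5], [2.0, 2.0], [1.0, 0.0]]
scores = evaluate_population(population, max_workers=12)
best = population[scores.index(max(scores))]
```

With a population of 6 and a cap of 12, this sketch only ever starts 6 workers, so setting the cap above the number of available cores or pending evaluations brings no extra speed here.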
Note that training the MDRNN requires that the VAE is well trained, and training the controller requires that both the VAE and MDRNN are well trained! It's important to retrain the VAE and MDRNN as the agent explores more of the environment. These three steps are looped in train.bash.
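The dependency order can be sketched as a small driver. The contents of train.bash are not shown here, so this is only an assumption about its shape: the helper names `build_commands` and `run` are hypothetical, the scripts are assumed to run from the current directory, and the rollout step that regenerates data is left out.

```python
import subprocess
import sys

# Order matters: each stage depends on the ones before it.
STAGES = ["train_vae.py", "train_mdrnn.py", "train_controller.py"]


def build_commands(iterations, controller_args=()):
    """Return the command lines for `iterations` passes over the three stages."""
    commands = []
    for _ in range(iterations):
        for script in STAGES:
            cmd = [sys.executable, script]
            if script == "train_controller.py":
                cmd += list(controller_args)
            commands.append(cmd)
    return commands


def run(commands, dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing stage


controller_args = ["--n-samples", "4", "--pop-size", "6",
                   "--target-return", "950", "--max-workers", "12"]
commands = build_commands(2, controller_args)
run(commands, dry_run=True)
```

Using `check=True` means a failed VAE or MDRNN run aborts the loop instead of training the controller on a poorly trained model.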
xvfb-run -a -s "-screen 0 1400x900x24 +extension RANDR" python3 train_dqn.py --task "CarRacing-v0" --train 2000 --eval 100
Note that xvfb-run is only necessary if you are training on a computer without a display connected (e.g. over SSH).
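If the same launch script has to work both on a desktop and over SSH, one option is to add the xvfb-run prefix only when no display is available. The sketch below assumes a Linux/X11 setup where an empty or missing DISPLAY variable means there is no display; `with_virtual_display` is a hypothetical helper, not part of this repository.

```python
import os


def with_virtual_display(cmd, env=None):
    """Prefix `cmd` with xvfb-run when no X display is available."""
    env = os.environ if env is None else env
    # Assumption: a non-empty DISPLAY means a usable X server is attached.
    if env.get("DISPLAY"):
        return list(cmd)
    # Same xvfb-run options as the command shown above.
    prefix = ["xvfb-run", "-a", "-s", "-screen 0 1400x900x24 +extension RANDR"]
    return prefix + list(cmd)


train_cmd = ["python3", "train_dqn.py"]
headless = with_virtual_display(train_cmd, env={})
desktop = with_virtual_display(train_cmd, env={"DISPLAY": ":0"})
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with the `-s` argument.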