PyTorch implementation of Proximal Policy Optimization (PPO) with a clipped objective function and Generalized Advantage Estimation (GAE). The model is trained on the Humanoid, Hopper, Ant, and HalfCheetah PyBullet environments. To run multiple environments in parallel processes, the SubprocVecEnv class from Stable Baselines is used (file included).
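The two pieces named above, GAE and the clipped surrogate objective, can be summarized in a short sketch. This is an illustrative outline only, not the exact code in agent.py; the function names, tensor shapes, and the gamma / lam / clip_eps values are assumptions.

```python
# Illustrative sketch of GAE and the clipped PPO objective (not the exact code in agent.py).
import torch

def compute_gae(rewards, values, masks, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout.
    `values` holds one extra bootstrap value; `masks` is 0 at terminal steps."""
    advantages = torch.zeros_like(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] * masks[t] - values[t]
        gae = delta + gamma * lam * masks[t] * gae
        advantages[t] = gae
    returns = advantages + values[:-1]
    return advantages, returns

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate objective; returns a loss to minimize."""
    ratio = (new_log_probs - old_log_probs).exp()
    surr1 = ratio * advantages
    surr2 = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(surr1, surr2).mean()
```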
| HumanoidBulletEnv-v0 | HalfCheetahBulletEnv-v0 |
|---|---|
| HopperBulletEnv-v0 | AntBulletEnv-v0 |
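The multi-process rollout setup mentioned in the introduction follows the usual SubprocVecEnv pattern: build one environment-creating callable per worker and hand the list to the vectorized wrapper. The module name `multiprocessing_env`, the environment id, and the worker count below are assumptions for illustration; the included file may expose the class under a different path.

```python
# Sketch of the parallel-environment setup; import path and values are assumptions.
import gym
import pybullet_envs  # registers the Bullet environments with gym
from multiprocessing_env import SubprocVecEnv  # the included file (name assumed)

def make_env(env_id):
    def _thunk():
        return gym.make(env_id)
    return _thunk

n_workers = 8
envs = SubprocVecEnv([make_env("HopperBulletEnv-v0") for _ in range(n_workers)])
obs = envs.reset()  # batched observations, one row per worker
```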
Important command line arguments (an example parser sketch follows the list):

- `--env` : environment name (note: works only for continuous PyBullet environments)
- `--learn` : start training the agent
- `--play` : play using a pretrained model
- `-n_workers` : number of parallel environments
- `-load` : continue training from the given checkpoint
- `-model` : path to the model or checkpoint to load
- `-ppo_steps` : number of steps collected before each update
- `-epochs` : number of updates
- `-mini_batch` : mini-batch size during the PPO update
- `-lr` : policy and critic learning rate
- `-c1` : critic loss coefficient
- `-c2` : entropy coefficient (beta)
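For orientation, the arguments above could be wired up with argparse roughly as follows; the defaults shown are illustrative assumptions, the real defaults are defined in agent.py.

```python
# Hedged sketch of the argument parsing; defaults are illustrative only.
import argparse

parser = argparse.ArgumentParser(description="PPO agent for PyBullet environments")
parser.add_argument("--env", type=str, default="HumanoidBulletEnv-v0", help="environment id")
parser.add_argument("--learn", action="store_true", help="train the agent")
parser.add_argument("--play", action="store_true", help="play using a pretrained model")
parser.add_argument("-n_workers", type=int, default=8, help="number of parallel environments")
parser.add_argument("-load", action="store_true", help="continue training from a checkpoint")
parser.add_argument("-model", type=str, default=None, help="path to a model or checkpoint")
parser.add_argument("-ppo_steps", type=int, default=2048, help="steps collected before each update")
parser.add_argument("-epochs", type=int, default=10, help="number of updates")
parser.add_argument("-mini_batch", type=int, default=64, help="mini-batch size during the PPO update")
parser.add_argument("-lr", type=float, default=3e-4, help="policy and critic learning rate")
parser.add_argument("-c1", type=float, default=0.5, help="critic loss coefficient")
parser.add_argument("-c2", type=float, default=0.01, help="entropy coefficient (beta)")
args = parser.parse_args()
```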
To train the agent:

```bash
# train a new agent
python agent.py --learn --env <ENV_ID>

# continue training from a checkpoint
python agent.py --learn --env <ENV_ID> -load -model <CHECKPOINT PATH>
```

To play:

```bash
python agent.py --play --env <ENV_ID> -model <MODEL PATH>
```