generalized-advantage-estimation

An implementation of Proximal Policy Optimization (PPO), a widely used family of reinforcement learning algorithms, with normalized Generalized Advantage Estimation (GAE) and optional batch-mode training. The loss function includes an entropy bonus.

python machine-learning reinforcement-learning entropy deep-learning neural-network optimization gae pytorch rl actor-critic proximal-policy-optimization ppo open-ai open-ai-gym generalized-advantage-estimation ppo-pytorch

Updated Dec 26, 2022
Python
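
The normalized Generalized Advantage Estimation mentioned in the description above can be sketched roughly as follows. This is a minimal pure-Python illustration, not code taken from any repository listed here; the function names `compute_gae` and `normalize_advantages`, the argument layout, and the default `gamma` and `lam` values are assumptions made for the example.

```python
import math


def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Sketch of GAE over a single rollout (hypothetical helper).

    rewards, values, dones: equal-length sequences for steps 0..T-1.
    last_value: value estimate for the state reached after the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk the rollout backwards so each step reuses the running estimate.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD residual; bootstrapping is cut at episode ends.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD residuals.
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


def normalize_advantages(advantages, eps=1e-8):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(advantages) / len(advantages)
    var = sum((a - mean) ** 2 for a in advantages) / len(advantages)
    std = math.sqrt(var)
    return [(a - mean) / (std + eps) for a in advantages]
```

With `lam=0` the estimate reduces to the one-step TD residual, and with `lam=1` it becomes the discounted return minus the value baseline; values in between trade bias against variance.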

nslyubaykin / relax_trpo_example

Example TRPO implementation with ReLAx

reinforcement-learning gae policy-gradient reinforcement-learning-algorithms continuous-control trpo generalized-advantage-estimation discrete-control

Updated Aug 29, 2022
Jupyter Notebook

nslyubaykin / relax_ppo_example

Example PPO implementation with ReLAx

reinforcement-learning gae policy-gradient reinforcement-learning-algorithms continuous-control proximal-policy-optimization ppo generalized-advantage-estimation discrete-control

Updated Aug 29, 2022
Jupyter Notebook

Here are 8 public repositories matching this topic...

bentrevett / pytorch-rl

adik993 / ppo-pytorch

hcnoh / rl-collection-pytorch

leaderj1001 / Phasic-Policy-Gradient

nslyubaykin / rnns_for_pomdp

tomasspangelo / proximal-policy-optimization

nslyubaykin / relax_trpo_example

nslyubaykin / relax_ppo_example
