Deterministic-GAIL-PyTorch

This is an attempt to implement Generative Adversarial Imitation Learning (GAIL) for deterministic policies with off-policy learning on static data. The policy never interacts with the environment except for evaluation. Instead, it is trained on policy state-action pairs, in which the policy selects actions only for states sampled from the expert data.
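The training scheme described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `Actor`, `Discriminator`, and `gail_update`, the network sizes, the label convention (expert = 1, policy = 0), and the binary cross-entropy losses are all assumptions made for the example. The state and action dimensions (24 and 4) match BipedalWalker but are used here only as placeholders.

```python
import torch
import torch.nn as nn

# Illustrative sizes (assumed): BipedalWalker has 24-dim states, 4-dim actions.
STATE_DIM, ACTION_DIM, BATCH = 24, 4, 64


class Actor(nn.Module):
    """Deterministic policy: maps a state directly to an action in [-1, 1]."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state)


class Discriminator(nn.Module):
    """Scores a (state, action) pair; returns a logit (high = looks expert)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=1))


def gail_update(actor, disc, opt_actor, opt_disc, expert_states, expert_actions):
    """One update on static expert data; no environment interaction."""
    bce = nn.BCEWithLogitsLoss()
    ones, zeros = torch.ones(BATCH, 1), torch.zeros(BATCH, 1)

    # Sample states (and the matching expert actions) from the expert dataset.
    idx = torch.randint(0, expert_states.shape[0], (BATCH,))
    states, expert_act = expert_states[idx], expert_actions[idx]

    # Discriminator step: expert pairs labeled 1, policy pairs labeled 0.
    policy_act = actor(states).detach()  # no gradient into the actor here
    loss_disc = bce(disc(states, expert_act), ones) + bce(disc(states, policy_act), zeros)
    opt_disc.zero_grad()
    loss_disc.backward()
    opt_disc.step()

    # Actor step: the policy is deterministic, so the gradient flows through
    # the discriminator straight into the actor; only the actor is stepped.
    loss_actor = bce(disc(states, actor(states)), ones)
    opt_actor.zero_grad()
    loss_actor.backward()
    opt_actor.step()

    return loss_disc.item(), loss_actor.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    actor, disc = Actor(), Discriminator()
    opt_actor = torch.optim.Adam(actor.parameters(), lr=1e-3)
    opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-3)
    # Random placeholder tensors standing in for a real expert dataset.
    expert_states = torch.randn(1000, STATE_DIM)
    expert_actions = torch.rand(1000, ACTION_DIM) * 2 - 1
    for step in range(3):
        print(gail_update(actor, disc, opt_actor, opt_disc, expert_states, expert_actions))
```

Because the actor outputs an action directly rather than a distribution, this sketch needs no policy-gradient estimator. The actor is updated by backpropagating the discriminator's output through the action, which is one plausible reading of "deterministic" and "off-policy" in the description above.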

Results

The algorithm sometimes works, depending on the type of environment, but it has high variance and its results are inconsistent.

BipedalWalker-v2

Figure: Expert Policy vs. Recovered Policy (10 expert episodes)
Figure: Epochs vs. rewards