Deterministic-GAIL-PyTorch

This is an attempt to implement Generative Adversarial Imitation Learning (GAIL) for deterministic policies with off-policy learning on static data. The policy never interacts with the environment except for evaluation. Instead, it is trained on state-action pairs in which the states are sampled from the expert data and the policy only selects the actions for those states.
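
The training scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the `Actor`, `Discriminator`, and `gail_update` names, the network sizes, the learning rates, and the binary cross-entropy loss are all assumptions made for the example, and random tensors stand in for real expert data. The point it shows is that both the discriminator and the policy are updated only on states drawn from the expert batch, with the policy gradient flowing through the discriminator into the deterministic action.

```python
import torch
import torch.nn as nn


class Actor(nn.Module):
    """Deterministic policy: maps a state to one action in [-max_action, max_action]."""

    def __init__(self, state_dim, action_dim, max_action=1.0):
        super().__init__()
        self.max_action = max_action
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),
        )

    def forward(self, state):
        return self.max_action * self.net(state)


class Discriminator(nn.Module):
    """Scores a (state, action) pair; returns a logit (higher means 'looks like expert')."""

    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=1))


def gail_update(actor, disc, actor_opt, disc_opt, exp_state, exp_action):
    """One adversarial update on a static batch of expert (state, action) pairs."""
    bce = nn.BCEWithLogitsLoss()

    # Discriminator step: expert pairs are labelled 1, policy pairs 0.
    # The policy acts on the SAME expert states; no environment step is taken.
    policy_action = actor(exp_state).detach()
    real_logits = disc(exp_state, exp_action)
    fake_logits = disc(exp_state, policy_action)
    disc_loss = (bce(real_logits, torch.ones_like(real_logits))
                 + bce(fake_logits, torch.zeros_like(fake_logits)))
    disc_opt.zero_grad()
    disc_loss.backward()
    disc_opt.step()

    # Policy step: push the discriminator's output on policy pairs toward 1.
    # Because the policy is deterministic, the gradient is backpropagated
    # through the discriminator directly into the action.
    actor_logits = disc(exp_state, actor(exp_state))
    actor_loss = bce(actor_logits, torch.ones_like(actor_logits))
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    return disc_loss.item(), actor_loss.item()


# Illustrative usage with placeholder data (NOT real expert trajectories).
# 24 and 4 are the observation and action sizes of BipedalWalker.
state_dim, action_dim = 24, 4
actor = Actor(state_dim, action_dim)
disc = Discriminator(state_dim, action_dim)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
disc_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)

exp_state = torch.randn(32, state_dim)             # stand-in for expert states
exp_action = torch.rand(32, action_dim) * 2 - 1    # stand-in for expert actions
d_loss, a_loss = gail_update(actor, disc, actor_opt, disc_opt, exp_state, exp_action)
```

In a real run, `exp_state` and `exp_action` would be mini-batches sampled from the recorded expert episodes, and the environment would be used only to measure the reward of the current policy between epochs.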

Results

Although it sometimes works (depending on the environment), the algorithm has high variance and the results are inconsistent.

BipedalWalker-v2

Expert Policy	Recovered Policy (10 expert episodes)

Epochs vs rewards
