Added E3B and validated - SuperMarioBros environment - Fixed Pretraining Mode #41

Open: roger-creus (Collaborator) wants to merge 2 commits into base: main

Description

I have implemented the E3B intrinsic reward proposed here. I have added the SuperMarioBros environment, which I have used to validate the E3B implementation. I have also fixed the pretraining mode for on-policy agents:

Before: the intrinsic rewards were only added to the extrinsic returns and advantages.
Now: in pretraining mode, the intrinsic returns and intrinsic advantages are computed. When using intrinsic + extrinsic rewards, the behavior is the same as before.

This has significantly improved the performance of intrinsic reward algorithms in pretraining mode.
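
To make the Before/Now distinction concrete, here is a minimal NumPy sketch of one way to read it. It is not the code in this PR: the helper names `gae` and `compute_targets` are hypothetical, and the use of GAE as the advantage estimator is an assumption made for illustration.

```python
import numpy as np


def gae(rewards, values, last_value, dones, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over a single rollout of length T.

    dones[t] is 1.0 if the episode ended after step t, else 0.0.
    Returns (advantages, returns), both of shape (T,).
    """
    T = len(rewards)
    advantages = np.zeros(T, dtype=np.float64)
    next_adv, next_value = 0.0, last_value
    for t in reversed(range(T)):
        not_done = 1.0 - dones[t]
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        next_adv = delta + gamma * lam * not_done * next_adv
        advantages[t] = next_adv
        next_value = values[t]
    return advantages, advantages + values


def compute_targets(ext_rewards, int_rewards, values, last_value, dones,
                    pretraining, gamma=0.99, lam=0.95):
    """Hypothetical helper illustrating the two modes described above."""
    if pretraining:
        # Pretraining: returns and advantages come from intrinsic rewards only.
        return gae(int_rewards, values, last_value, dones, gamma, lam)
    # Otherwise (assumed "as before"): estimate from extrinsic rewards,
    # then add the intrinsic rewards to both advantages and returns.
    advantages, returns = gae(ext_rewards, values, last_value, dones, gamma, lam)
    return advantages + int_rewards, returns + int_rewards
```

With `pretraining=True` the extrinsic rewards never enter the targets, so the policy is optimized purely for the exploration bonus.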

Performance of PPO+E3B in pretraining mode on the SuperMarioBros-1-1-v3 environment (i.e., without access to task rewards):

[image: PPO+E3B pretraining performance on SuperMarioBros-1-1-v3]

Motivation and Context

  1. E3B is a recent algorithm that achieves SOTA results in complex environments, so it is a valuable contribution.
  2. During the pretraining phase, the intrinsic rewards were not being optimized properly.
  3. Added the SuperMarioBros environment because it is cool and helps evaluate exploration algorithms: in Mario, good exploratory agents achieve high task rewards.
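
For readers unfamiliar with E3B (item 1), the core of the bonus can be sketched in a few lines. This is a rough illustration of the elliptical episodic bonus as I understand it from the paper, not the implementation added in this PR. The class name `EllipticalBonus` and the `ridge` parameter name are hypothetical, and the feature vectors here are plain arrays rather than outputs of a learned encoder.

```python
import numpy as np


class EllipticalBonus:
    """Episodic bonus b = phi^T C^{-1} phi, with C = ridge * I + sum(phi phi^T).

    C is reset at the start of every episode, so the bonus measures novelty
    relative to the features seen so far in the current episode.
    """

    def __init__(self, dim, ridge=0.1):
        self.dim = dim
        self.ridge = ridge
        self.reset()

    def reset(self):
        # Inverse of the initial matrix ridge * I.
        self.cov_inv = np.eye(self.dim) / self.ridge

    def step(self, phi):
        # Bonus is computed against the features seen before this step.
        u = self.cov_inv @ phi
        bonus = float(phi @ u)
        # Sherman-Morrison rank-one update of the inverse after adding phi phi^T.
        self.cov_inv -= np.outer(u, u) / (1.0 + bonus)
        return bonus
```

Repeating the same feature direction within an episode shrinks the bonus, while a direction not yet seen in the episode keeps a large one. Calling `reset()` at episode boundaries restores the initial bonus scale.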

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have opened an associated PR on the rllte-hub repository (if necessary)
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)
  • I have checked that the documentation builds using make doc (required)
