
BYOL-Offline

Bootstrap Your Own Latent (BYOL) [1] [2] and other methods commonly used for exploration, applied to offline reinforcement learning.

Updates

  • Fixed the JAX Dreamer setup -- the VAE now trains correctly.
  • The BYOL loss also looks correct (a minimal sketch of it follows this list), but it needs more testing to confirm that the uncertainty quantification actually makes sense (e.g., train it online with PPO and check whether it helps on Montezuma's Revenge or another hard-exploration game).
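
For reference, here is a minimal sketch of the BYOL-style latent-prediction loss being tested, assuming the online predictor's output and the EMA target encoder's embedding have already been computed. The function name and shapes are illustrative, not this repo's actual API:

```python
import jax
import jax.numpy as jnp


def byol_loss(online_pred: jnp.ndarray, target_emb: jnp.ndarray) -> jnp.ndarray:
    """Per-transition BYOL loss: squared L2 distance between the
    L2-normalized online prediction and the (stop-gradient) target
    embedding, equivalent to 2 - 2 * cosine similarity."""
    pred = online_pred / (jnp.linalg.norm(online_pred, axis=-1, keepdims=True) + 1e-8)
    tgt = target_emb / (jnp.linalg.norm(target_emb, axis=-1, keepdims=True) + 1e-8)
    tgt = jax.lax.stop_gradient(tgt)  # targets receive no gradient in BYOL
    # Shape (batch,); the per-transition value doubles as an uncertainty signal.
    return jnp.sum((pred - tgt) ** 2, axis=-1)
```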

General conclusions

  • More testing is needed (RL training is quite slow, and behavior cloning (BC) warmup still has to be tried), but I'm fairly sure the BYOL loss does not give a strong pessimism signal for offline RL; a sketch of how such a signal would typically be applied follows this list.
  • This makes some sense, as I observed that the BYOL loss was much smaller than the Dreamer loss and conflicted with it for high-dimensional observations.
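
For context, one standard way to turn such an uncertainty estimate into pessimism (in the style of reward-penalty methods like MOPO; this is an assumption for illustration, not necessarily how this repo applies it) is to subtract the scaled BYOL prediction error from the dataset rewards before running the offline RL algorithm:

```python
import jax.numpy as jnp


def penalized_rewards(rewards: jnp.ndarray,
                      byol_errors: jnp.ndarray,
                      penalty_scale: float = 1.0) -> jnp.ndarray:
    """Subtract the (normalized, scaled) BYOL prediction error from the
    rewards so the offline agent is pessimistic on uncertain transitions.
    `penalty_scale` is a hypothetical knob, not a parameter of this repo."""
    # Standardize the errors across the batch so the penalty magnitude is
    # comparable across environments and training stages.
    err = (byol_errors - byol_errors.mean()) / (byol_errors.std() + 1e-8)
    return rewards - penalty_scale * err
```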
