Q-Value Weighted Regression for Reinforcement Learning with discrete or continuous actions

vub-ai-lab/qwr

Q-Value Weighted Regression

Paper: Q-Value Weighted Regression: Reinforcement Learning with Limited Data

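Q-Value Weighted Regression (QWR) is a relatively simple RL algorithm that trains a stochastic policy by weighted regression: the probability of each action is pushed up with a force (weight) proportional to the advantage of that action, so actions with a higher advantage are reinforced more strongly. The advantage of an action is computed as $A(s, a) = Q(s, a) - \mathbb{E}_{a'}[Q(s, a')]$, that is, its Q-value minus the expected Q-value over the actions $a'$ that could be taken in state $s$.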
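To make the advantage formula concrete, here is a minimal sketch for a single state with three discrete actions. It is not code from this repository: the function names, the toy Q-values and the choice to take the expectation under the policy's own action probabilities are all assumptions made for illustration. The sketch uses the advantage directly as the regression weight, as described above; the actual weighting in the paper or in this repository may differ.

```python
import math

def advantages(q_values, action_probs):
    """A(s, a) = Q(s, a) - E[Q(s, a')] for one state with discrete actions.

    Assumption: the expectation is taken under `action_probs`.
    """
    expected_q = sum(p * q for p, q in zip(action_probs, q_values))
    return [q - expected_q for q in q_values]

def weighted_regression_loss(action_probs, weights):
    """Weighted negative log-likelihood: -sum_a w(a) * log pi(a | s).

    Minimizing this raises the probability of actions with a positive
    weight and lowers the probability of actions with a negative weight.
    """
    return -sum(w * math.log(p) for p, w in zip(action_probs, weights))

# Hypothetical Q-values and policy probabilities for one state.
q = [1.0, 2.0, 3.0]
pi = [0.25, 0.5, 0.25]

adv = advantages(q, pi)                    # expected Q is 2.0
loss = weighted_regression_loss(pi, adv)   # scalar a learner would minimize
```

With these toy numbers the third action has a positive advantage and the first a negative one, so a gradient step on the loss would move probability mass from the first action to the third.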

This repo implements QWR as I understand it from the paper, which releases no code. With limited hyper-parameter tuning, the code in this repository learns LunarLander and LunarLanderContinuous. It also runs on Pong, but does not seem to learn there.

Features

  • Interacts with an OpenAI Gym environment
  • Support for discrete and continuous action spaces (Discrete and Box spaces)
  • Support for Discrete, Box and Dict observation spaces. Images are fed through a NatureCNN.
  • Simple logging: values are printed on stdout, each line starting with a prefix, so that they are easy to filter and plot with gnuplot.
  • Simple code without advanced features, designed to quickly experiment with the algorithm.
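As an illustration of the prefixed logging idea, the sketch below prints one data point per line and shows how such lines can be filtered back into a series. The prefix `R`, the three-column line layout and all function names are hypothetical; the repository's actual log format may differ.

```python
def format_log_line(prefix, step, value):
    """One data point per line: '<prefix> <step> <value>'."""
    return f"{prefix} {step} {value:.3f}"

def log_scalar(prefix, step, value):
    """Print a data point on stdout, where it can be redirected to a file."""
    print(format_log_line(prefix, step, value))

def extract_series(lines, prefix):
    """Keep only the (step, value) pairs of lines that carry `prefix`."""
    series = []
    for line in lines:
        fields = line.split()
        if len(fields) == 3 and fields[0] == prefix:
            series.append((int(fields[1]), float(fields[2])))
    return series

# Hypothetical log content: two episode returns and one unrelated line.
lines = [
    format_log_line("R", 0, -210.5),
    "some unrelated debug message",
    format_log_line("R", 1, -180.25),
]
returns = extract_series(lines, "R")
```

If stdout is redirected to a file, say `log.txt` (a made-up name), a gnuplot command along the lines of `plot "< grep '^R ' log.txt" using 2:3 with lines` should draw the same series without any Python post-processing.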