rlberry-v0.3.0
Release of version 0.3.0 of rlberry.
New in 0.3.0
PR #206
- Creation of a Deep RL tutorial, in the user guide.
PR #132
- New tracker class rlberry.agents.bandit.tools.BanditTracker to track statistics to be used in Bandit algorithms.
PR #191
- Possibility to generate a profile with rlberry.agents.manager.AgentManager.
- Misc improvements on A2C.
- New StableBaselines3 wrapper rlberry.agents.stable_baselines.StableBaselinesAgent to import StableBaselines3 Agents.
PR #119
- Improving documentation for agents.torch.utils
- New replay buffer rlberry.agents.utils.replay.ReplayBuffer, aiming to replace code in utils/memories.py
- New DQN implementation, aiming to fix reproducibility and compatibility issues.
- Implements Q(lambda) in DQN Agent.
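The Q(lambda) entry above does not spell out how targets are computed. As a rough illustration only, the sketch below uses one common formulation (Peng's Q(lambda)), where the target at step t mixes the one-step bootstrap value with the lambda-return of the next step. The function name lambda_returns and its arguments are hypothetical and are not taken from rlberry.

```python
def lambda_returns(rewards, next_values, dones, gamma=0.99, lambda_=0.5):
    """Compute lambda-returns backward over a trajectory segment.

    rewards[t]     : reward received at step t
    next_values[t] : bootstrap value of the next state, e.g. max_a Q(s_{t+1}, a)
    dones[t]       : True if the episode terminated at step t
    """
    n = len(rewards)
    returns = [0.0] * n
    # Beyond the end of the segment, fall back on the last bootstrap value.
    next_return = next_values[-1]
    for t in reversed(range(n)):
        if dones[t]:
            # No bootstrapping past a terminal transition.
            returns[t] = rewards[t]
        else:
            mixed = (1.0 - lambda_) * next_values[t] + lambda_ * next_return
            returns[t] = rewards[t] + gamma * mixed
        next_return = returns[t]
    return returns
```

With lambda_=0 this reduces to the usual one-step DQN target, and with lambda_=1 it becomes a discounted return that bootstraps only at the end of the segment.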
Feb 22, 2022 (PR #126)
- Setup rlberry.__version__ (currently 0.3.0dev0)
- Record the rlberry version in an AgentManager attribute
- Override the __eq__ method of the AgentManager class to define equality of AgentManagers.
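To make the two AgentManager entries above concrete, here is a minimal sketch of a class that records a library version and overrides __eq__. The class ManagerSketch, its attributes, and the choice of which fields take part in the comparison are all assumptions made for illustration; the actual fields compared by AgentManager are not stated in these notes.

```python
class ManagerSketch:
    """Hypothetical stand-in for an experiment manager (not rlberry's class)."""

    def __init__(self, agent_name, init_kwargs, fit_budget, library_version):
        self.agent_name = agent_name
        self.init_kwargs = dict(init_kwargs)
        self.fit_budget = fit_budget
        # Version of the library that created this object, kept for traceability.
        self.library_version = library_version

    def _key(self):
        # Fields that define "the same experiment" in this sketch.
        return (self.agent_name, self.init_kwargs, self.fit_budget)

    def __eq__(self, other):
        if not isinstance(other, ManagerSketch):
            return NotImplemented
        return self._key() == other._key()


a = ManagerSketch("dqn", {"gamma": 0.99}, 1000, "0.3.0dev0")
b = ManagerSketch("dqn", {"gamma": 0.99}, 1000, "0.3.0")
c = ManagerSketch("dqn", {"gamma": 0.95}, 1000, "0.3.0")
```

In this sketch the recorded version is deliberately left out of the comparison, so a and b compare equal while a and c do not. Note that a Python class defining __eq__ without __hash__ becomes unhashable.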
Feb 14-15, 2022 (PR #97, #118)
- (feat) Add Bandits basic environments and agents. See rlberry.agents.bandits.IndexAgent and rlberry.envs.bandits.Bandit.
- Thompson Sampling bandit algorithm with gaussian or beta prior.
- Base class for bandits algorithms with custom save & load functions (called rlberry.agents.bandits.BanditWithSimplePolicy)
- (fix) Fixed bug in FiniteMDP.sample(): terminal state was being checked with self.state instead of the given state
- (feat) Option to use 'fork' or 'spawn' in rlberry.manager.AgentManager
- (feat) AgentManager output_dir now has a timestamp and a short ID by default.
- (feat) Gridworld can be constructed from string layout
- (feat) max_workers argument for rlberry.manager.AgentManager to control the maximum number of processes/threads created by the fit method.
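For readers unfamiliar with the Thompson Sampling entry in the list above, the following is a minimal sketch of the algorithm with a Beta prior on Bernoulli arms. It uses only the Python standard library and does not use rlberry's bandit classes; the function name, the uniform Beta(1, 1) prior, and the simulated arm means are assumptions made for illustration.

```python
import random


def thompson_sampling_bernoulli(arm_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson Sampling on simulated arms.

    arm_means : true success probabilities, used only to simulate rewards
    horizon   : number of pulls
    Returns the number of pulls per arm.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Draw one plausible mean per arm from its Beta posterior
        # (Beta(1, 1) prior updated with the observed counts).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < arm_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling_bernoulli([0.1, 0.9], horizon=2000)
```

As the posteriors concentrate, the better arm is sampled as the maximum more and more often, so most pulls end up on it.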
Feb 04, 2022
- Add rlberry.manager.read_writer_data to load agent's writer data from pickle files and make it simpler to customize in rlberry.manager.plot_writer_data
- Fix bug: DQN should take a tuple as environment
- Add a quickstart tutorial in the docs (quick_start)
- Add the RLSVI algorithm (tabular) rlberry.agents.RLSVIAgent
- Add the Posterior Sampling for Reinforcement Learning (PSRL) agent for tabular MDP rlberry.agents.PSRLAgent
- Add a page to help contributors in the doc (contributing)
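As background for the PSRL entry above, the sketch below shows the core planning step of posterior sampling for a tabular, finite-horizon MDP: sample one model from the posterior, then solve it by backward induction. It is not rlberry's PSRLAgent. The function names, the Dirichlet(1, ..., 1) prior on transitions, and the Beta(1, 1) prior on rewards (which assumes binary rewards) are all assumptions made for illustration.

```python
import random


def sample_dirichlet(alphas, rng):
    # A Dirichlet sample can be obtained by normalizing Gamma draws.
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def psrl_plan(trans_counts, reward_sums, visit_counts, horizon, rng):
    """Sample an MDP from the posterior and return its greedy policy.

    trans_counts[s][a][s2] : observed transition counts
    reward_sums[s][a]      : sum of observed (binary) rewards
    visit_counts[s][a]     : number of visits to (s, a)
    Returns policy[h][s], the action to take in state s at step h.
    """
    n_states = len(trans_counts)
    n_actions = len(trans_counts[0])
    states, actions = range(n_states), range(n_actions)
    # Sampled transition model and sampled mean rewards.
    P = [[sample_dirichlet([1.0 + c for c in trans_counts[s][a]], rng)
          for a in actions] for s in states]
    R = [[rng.betavariate(1 + reward_sums[s][a],
                          1 + visit_counts[s][a] - reward_sums[s][a])
          for a in actions] for s in states]
    # Backward induction on the sampled model, from the last step to the first.
    V = [0.0] * n_states
    policy = []
    for _ in range(horizon):
        Q = [[R[s][a] + sum(P[s][a][s2] * V[s2] for s2 in states)
              for a in actions] for s in states]
        policy.append([max(actions, key=lambda a: Q[s][a]) for s in states])
        V = [max(Q[s]) for s in states]
    policy.reverse()
    return policy
```

In a full agent this planning step would be repeated at the start of each episode, with the counts updated from the transitions observed while following the returned policy.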