- v1.4 [20Q1]
- 1.4.1:
- tensorforce reintegrated (due to an incompatibility between tfagents and tensorforce, tensorforce must be explicitly activated; see intro Switching backends)
- upgrade to tfagents 0.3, tensorflow 2.0.1, matplotlib 3.1.2
- kwargs for env.register_with_gym(...)
- 1.4.0:
- agent saving & loading (see intro Saving & loading a trained policy)
- lineworld as test environment included
- v1.3 [19Q4]
- 1.3.1: agent.score substituted by agent.evaluate
- 1.3.0:
- migration to tensorflow 2.0
- support for tensorforce and keras-rl suspended until support for tf 2.0 is available
- v1.2 [19Q3]
- 1.2.2: fix for CemAgent and SacAgent default backend registration
- 1.2.1: SacAgent for tfagents preview; notebook on 'Agent logging, seeding and jupyter output cells'
- 1.2.0: Agent.score
- v1.1 [19Q3]
- 1.1.23: CemAgent for keras-rl backend; DqnAgent, RandomAgent for tensorforce
- 1.1.22: DuelingDqnAgent, DoubleDqnAgent with keras-rl backend
- 1.1.21: keras-rl backend (dqn)
- 1.1.20: #54 logging in jupyter notebook solved, doc updates
- 1.1.19:
- jupyter plotting performance improved
- plot.ToMovie with support for animated gifs
- 1.1.18: tensorforce backend (ppo, reinforce)
- 1.1.11:
- plot.StepRewards, plot.Actions
- default_plots parameter (instead of default_callbacks)
- v1.0.1 [19Q3]
- api based on pluggable backends and callbacks (for plotting, logging, training durations)
- backend: tf-agents, default
- algorithms: dqn, ppo, random
- plots: State, Loss (including actor-/critic loss), Steps, Rewards
- support for creating a mp4 movie (plot.ToMovie)
- v0.1 [19Q2]
- prototype implementation / proof of concept
- hard-wired support for Ppo, Reinforce, Dqn on tf-agents
- hard-wired plots for loss, sum-of-rewards, steps and state rendering
- hard-wired mp4 rendering
- separate "public api" from concrete implementation using a frontend / backend architecture (inspired by scikit-learn, matplotlib, keras)
- pluggable backends
- extensible through callbacks (inspired by keras), with separate callback types for training, evaluation and monitoring
- pre-configurable, algorithm specific train & play loops