Release v2.5.0
New Algorithm
Cal-QL has been added to d3rlpy in v2.5.0! Please check the reproduction script here. To support faithful reproduction, `SparseRewardTransitionPicker` has also been added and is used in the reproduction script.
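For a rough idea of what Cal-QL training looks like, here is a minimal sketch. It assumes an AntMaze D4RL dataset as in the reproduction script, that the algorithm is exposed as `d3rlpy.algos.CalQLConfig`, and default hyperparameters; `SparseRewardTransitionPicker` would be supplied as the transition picker of the dataset or replay buffer, with its exact arguments shown in the reproduction script.

```python
import d3rlpy

# Minimal sketch, assuming an AntMaze task as in the reproduction script.
# SparseRewardTransitionPicker would be passed as the transition_picker of the
# dataset/replay buffer; its constructor arguments are omitted here, so see
# the reproduction script for the exact setup.
dataset, env = d3rlpy.datasets.get_d4rl("antmaze-umaze-v2")

# offline pretraining with Cal-QL (assumption: class name and hyperparameters)
calql = d3rlpy.algos.CalQLConfig().create(device="cpu")
calql.fit(dataset, n_steps=100000, n_steps_per_epoch=1000)
```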
Custom Algorithm Example
One of the most frequent questions is "How can I implement a custom algorithm on top of d3rlpy?". A new example script has been added to answer this question. Based on this example, you can build your own algorithm while still utilizing the whole training pipeline provided by d3rlpy. Please check the script here.
Enhancement
- Exporting Decision Transformer models as TorchScript and ONNX has been implemented. You can use this feature via the `save_policy` method in the same way as you do with Q-learning algorithms (see the sketch after this list).
- Tuple observation support has been added to PyTorch/ONNX export.
- Return-to-go calculation has been modified for Q-learning algorithms and is now skipped when return-to-go is not necessary.
- `n_updates` option has been added to the `fit_online` method to control the update-to-data (UTD) ratio (see the sketch after this list).
- `write_at_termination` option has been added to `ReplayBuffer`.
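As a rough illustration of the new Decision Transformer export, here is a minimal sketch. It assumes the built-in cartpole dataset, the `DiscreteDecisionTransformerConfig` class name, and that, as with Q-learning algorithms, the file extension selects TorchScript vs. ONNX.

```python
import d3rlpy

# minimal sketch: assumes the built-in cartpole dataset and default settings
dataset, env = d3rlpy.datasets.get_cartpole()

dt = d3rlpy.algos.DiscreteDecisionTransformerConfig().create(device="cpu")

# build the model from the dataset (or train it with fit()) before exporting
dt.build_with_dataset(dataset)

# the file extension selects the export format, as with Q-learning algorithms
dt.save_policy("policy.pt")    # TorchScript
dt.save_policy("policy.onnx")  # ONNX
```

The new online-training options can be combined as in the following sketch. It assumes a Pendulum environment with SAC, that `n_updates` and `write_at_termination` are plain keyword arguments of `fit_online` and `ReplayBuffer` respectively, and arbitrary hyperparameters.

```python
import gymnasium as gym
import d3rlpy

env = gym.make("Pendulum-v1")

sac = d3rlpy.algos.SACConfig().create(device="cpu")

# replay buffer that writes an episode only once it terminates
# (assumption: write_at_termination is accepted by the ReplayBuffer constructor)
buffer = d3rlpy.dataset.ReplayBuffer(
    d3rlpy.dataset.FIFOBuffer(limit=100000),
    env=env,
    write_at_termination=True,
)

# n_updates controls the update-to-data (UTD) ratio together with
# update_interval: here, 4 gradient steps per environment step
sac.fit_online(
    env,
    buffer,
    n_steps=10000,
    n_steps_per_epoch=1000,
    update_interval=1,
    n_updates=4,
)
```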
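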
Bugfix
- Action scaling has been fixed for D4RL datasets.
- Default replay buffer creation in the `fit_online` method has been fixed.