Release v2.4.0

@takuseno takuseno released this 18 Feb 03:57

Tuple observations

As of v2.4.0, d3rlpy supports tuple observations: pass a list of arrays, one per observation component, each with one row per step.

import numpy as np
import d3rlpy

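# Two observation components per step: a 100-dim vector and a 32-dim vector.
observations = [np.random.random((1000, 100)), np.random.random((1000, 32))]
actions = np.random.random((1000, 4))
rewards = np.random.random((1000, 1))
# Random 0/1 flags marking episode ends.
terminals = np.random.randint(2, size=(1000, 1))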
dataset = d3rlpy.dataset.MDPDataset(
    observations=observations,
    actions=actions,
    rewards=rewards,
    terminals=terminals,
)

You can find an example script here
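
Before building a dataset from tuple observations, it can help to confirm that every component has one row per step. The snippet below is a minimal sketch using only NumPy; check_tuple_observations is a hypothetical helper written for illustration and is not part of d3rlpy.

```python
import numpy as np


def check_tuple_observations(observations, actions):
    """Return the per-step shape of each observation component.

    Hypothetical helper (not part of d3rlpy): it only verifies that every
    component has one row per step, matching `actions`.
    """
    n_steps = actions.shape[0]
    for i, component in enumerate(observations):
        if component.shape[0] != n_steps:
            raise ValueError(
                f"component {i} has {component.shape[0]} rows, expected {n_steps}"
            )
    # Drop the leading (step) axis to get each component's per-step shape.
    return [component.shape[1:] for component in observations]


observations = [np.random.random((1000, 100)), np.random.random((1000, 32))]
actions = np.random.random((1000, 4))
shapes = check_tuple_observations(observations, actions)
print(shapes)  # [(100,), (32,)]
```

A mismatched component (for example, 999 rows against 1000 actions) raises a ValueError here, before any dataset object is constructed.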

Enhancements

  • logging_steps and logging_strategy options have been added to the fit and fit_online methods (thanks, @claudius-kienle)
  • Logging with Weights & Biases (WanDB) is now supported (thanks, @claudius-kienle)
  • Goal-conditioned environments in Minari are now supported.
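
To illustrate what the two logging options control, here is a minimal pure-Python sketch of the logging cadence. It assumes that step-based logging writes metrics at every multiple of logging_steps and that epoch-based logging writes them at the end of each epoch; logging_points and the string values "steps" and "epoch" are hypothetical stand-ins, not d3rlpy's actual API.

```python
def logging_points(n_steps, n_steps_per_epoch, logging_steps, logging_strategy):
    """Return the step counts at which metrics would be written.

    Hypothetical helper for illustration only; the strategy names
    "steps" and "epoch" are assumed labels, not library constants.
    """
    points = []
    for step in range(1, n_steps + 1):
        if logging_strategy == "steps" and step % logging_steps == 0:
            # Step-based: log every `logging_steps` gradient steps.
            points.append(step)
        elif logging_strategy == "epoch" and step % n_steps_per_epoch == 0:
            # Epoch-based: log once at the end of each epoch.
            points.append(step)
    return points


per_epoch = logging_points(3000, 1000, 500, "epoch")
per_steps = logging_points(3000, 1000, 500, "steps")
print(per_epoch)  # [1000, 2000, 3000]
print(per_steps)  # [500, 1000, 1500, 2000, 2500, 3000]
```

Under these assumptions, step-based logging gives finer-grained curves when epochs are long, at the cost of more frequent writes.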

Bugfix

  • Errors in distributed training have been fixed.
  • OPE documentation has been fixed.