Release v2.3.0
Distributed data parallel training
Distributed data parallel training with multiple nodes and GPUs has been one of the most requested features. Now it's finally available, and it's extremely easy to use.
Example:
```python
# train.py
from typing import Dict

import d3rlpy


def main() -> None:
    # GPU version:
    # rank = d3rlpy.distributed.init_process_group("nccl")
    rank = d3rlpy.distributed.init_process_group("gloo")
    print(f"Start running on rank={rank}.")

    # GPU version:
    # device = f"cuda:{rank}"
    device = "cpu:0"

    # setup algorithm
    cql = d3rlpy.algos.CQLConfig(
        actor_learning_rate=1e-3,
        critic_learning_rate=1e-3,
        alpha_learning_rate=1e-3,
    ).create(device=device)

    # prepare dataset
    dataset, env = d3rlpy.datasets.get_pendulum()

    # disable logging on rank != 0 workers
    logger_adapter: d3rlpy.logging.LoggerAdapterFactory
    evaluators: Dict[str, d3rlpy.metrics.EvaluatorProtocol]
    if rank == 0:
        evaluators = {"environment": d3rlpy.metrics.EnvironmentEvaluator(env)}
        logger_adapter = d3rlpy.logging.FileAdapterFactory()
    else:
        evaluators = {}
        logger_adapter = d3rlpy.logging.NoopAdapterFactory()

    # start training
    cql.fit(
        dataset,
        n_steps=10000,
        n_steps_per_epoch=1000,
        evaluators=evaluators,
        logger_adapter=logger_adapter,
        show_progress=rank == 0,
        enable_ddp=True,
    )

    d3rlpy.distributed.destroy_process_group()


if __name__ == "__main__":
    main()
```
You need to use the torchrun command to start training; it should already be installed once you install PyTorch.
```
$ torchrun \
   --nnodes=1 \
   --nproc_per_node=3 \
   --rdzv_id=100 \
   --rdzv_backend=c10d \
   --rdzv_endpoint=localhost:29400 \
   train.py
```
In this case, 3 processes will be launched and run the training loop in parallel. DecisionTransformer-based algorithms also support this distributed training feature. The example is also available here.
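As a rough sketch (not an official snippet from this release), a DecisionTransformer-based setup would follow the same pattern. The evaluation-related arguments of its fit() are omitted here, and passing enable_ddp to it is an assumption based on the statement above that DecisionTransformer-based algorithms support distributed training:

```python
# dt_train.py -- minimal sketch, assuming the same process-group helpers and
# enable_ddp flag shown in the CQL example above.
import d3rlpy


def main() -> None:
    rank = d3rlpy.distributed.init_process_group("gloo")  # "nccl" for GPUs

    dataset, _ = d3rlpy.datasets.get_pendulum()

    dt = d3rlpy.algos.DecisionTransformerConfig().create(device="cpu:0")

    dt.fit(
        dataset,
        n_steps=10000,
        n_steps_per_epoch=1000,
        show_progress=rank == 0,
        enable_ddp=True,  # assumed to be accepted by DT-based fit() as well
    )

    d3rlpy.distributed.destroy_process_group()


if __name__ == "__main__":
    main()
```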
Minari support (thanks, @grahamannett!)
Minari is an OSS library that provides a standard format for offline reinforcement learning datasets. Now, d3rlpy provides easy access to this library.
You can install Minari via the d3rlpy CLI.
```
$ d3rlpy install minari
```
Example:
```python
import d3rlpy

dataset, env = d3rlpy.datasets.get_minari("antmaze-umaze-v0")

iql = d3rlpy.algos.IQLConfig(
    actor_learning_rate=3e-4,
    critic_learning_rate=3e-4,
    batch_size=256,
    weight_temp=10.0,
    max_weight=100.0,
    expectile=0.9,
    reward_scaler=d3rlpy.preprocessing.ConstantShiftRewardScaler(shift=-1),
).create(device="cpu:0")

iql.fit(
    dataset,
    n_steps=1000000,
    n_steps_per_epoch=100000,
    evaluators={"environment": d3rlpy.metrics.EnvironmentEvaluator(env)},
)
```
Minimize redundant computation
Starting with this version, the computation in some algorithms has been optimized to remove redundant inference. As a result, algorithms with dual optimization, such as SAC and CQL, are significantly faster than in the previous version.
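The snippet below is not d3rlpy's actual code, but it illustrates the general idea behind this kind of optimization: in dual-optimization algorithms the actor loss and the temperature loss both need the policy's sampled action and log-probability, so running the policy once and reusing the result avoids a repeated forward pass. The function and variable names are hypothetical.

```python
import torch


def compute_actor_and_alpha_losses(policy, q_func, log_alpha, obs):
    # before: the actor loss and the temperature loss each re-ran policy(obs),
    # i.e. two separate forward passes per update step.
    # after: run the policy once and reuse the sampled action and log-prob.
    action, log_prob = policy(obs)          # single shared inference
    q_value = q_func(obs, action)

    alpha = log_alpha.exp().detach()
    actor_loss = (alpha * log_prob - q_value).mean()

    # the temperature loss reuses log_prob instead of sampling again
    target_entropy = -float(action.shape[-1])
    alpha_loss = -(log_alpha * (log_prob + target_entropy).detach()).mean()

    return actor_loss, alpha_loss
```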
Enhancements
- `GoalConcatWrapper` has been added to support goal-conditioned environments.
- `return_to_go` has been added to `Transition` and `TransitionMiniBatch`.
- `MixedReplayBuffer` has been added to sample experiences from two buffers with an arbitrary ratio (see the sketch below).
- `initial_temperature` now supports 0 in `DiscreteSAC`.
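The following is a hedged sketch of how `MixedReplayBuffer` might be used to mix an offline dataset with newly collected experience; the class location and parameter names are assumptions rather than details taken from this release note, so check the documentation for the exact constructor arguments.

```python
import d3rlpy

# an offline dataset plus an online FIFO buffer for newly collected experience
offline_dataset, env = d3rlpy.datasets.get_pendulum()
online_buffer = d3rlpy.dataset.create_fifo_replay_buffer(limit=100000, env=env)

# hypothetical argument names: two source buffers and the sampling ratio
mixed_buffer = d3rlpy.dataset.MixedReplayBuffer(
    primary_replay_buffer=offline_dataset,
    secondary_replay_buffer=online_buffer,
    secondary_mix_ratio=0.5,  # half of each mini-batch from each buffer
)
```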
Bugfix
- Getting started page has been fixed.