
v0.0.3

@sublee released this 30 Sep 07:41

Released on September 30, 2019.

Featured

torchgpipe now overlaps copy and computation using separate CUDA streams. Previously, a GPU could not compute a partition while micro-batches were being copied between GPUs, because both operations ran on the same default CUDA stream.
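The idea can be sketched in plain Python, with a thread and a queue standing in for the separate copy stream (this is an illustrative analogy, not torchgpipe's actual CUDA implementation): while one micro-batch is being "copied", the consumer is free to "compute" on the previous one.

```python
import queue
import threading


def run_pipeline(micro_batches):
    """Overlap a copy stage with a compute stage, the way separate
    CUDA streams let a GPU compute one micro-batch while the next
    one is still in flight.

    Plain-Python sketch: a background thread stands in for the copy
    stream, and doubling each value stands in for a real kernel.
    """
    copied = queue.Queue()

    def copy_worker():
        # Stand-in for device-to-device copies on a dedicated stream.
        for mb in micro_batches:
            copied.put(list(mb))
        copied.put(None)  # sentinel: no more micro-batches

    worker = threading.Thread(target=copy_worker)
    worker.start()

    results = []
    while True:
        mb = copied.get()
        if mb is None:
            break
        # Stand-in for computation; it proceeds while copy_worker
        # is already transferring the next micro-batch.
        results.append([x * 2 for x in mb])

    worker.join()
    return results
```

In the real library, the copy thread corresponds to a non-default `torch.cuda.Stream`, and synchronization between the streams is handled with CUDA events rather than a queue sentinel.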

Other Improvements

  • Added support for PyTorch 1.2.
  • Redesigned the internal pipeline parallelism to represent dependencies transparently.
  • Fixed the hanging issue when an exception is raised in a partition.
  • Fixed the unintended size accumulation in balance_by_size() (#3 by @842974287).

Breaking Changes

  • Dropped support for PyTorch 1.0.
  • Changed the type of GPipe.devices from tuple to list.
  • Removed current_microbatch(). This approach turned out to be incompatible with checkpointing.