Changelog

All changes we make to the assignment code or PDF will be documented in this file.

[unreleased] - yyyy-mm-dd

Added

Changed

Fixed

  • handout: clarify that ddp_bucketed_benchmarking doesn't require the full grid of runs.

[unreleased] - yyyy-mm-dd

Added

Changed

  • code: remove try-finally blocks in DDP tests.

Fixed

  • handout: remove outdated mention of a problem that no longer exists in the assignment.
  • handout: fix Slurm environment variables in examples.
  • handout: clarify assumptions in ddp_bucketed_benchmarking (b).

[0.0.3] - 2024-04-21

Added

Changed

  • code: remove humanfriendly from requirements.txt and add matplotlib.
  • handout: modify problem distributed_communication_multi_node to specify that multi-node measurements should be 2x1, 2x2, and 2x3.
  • handout: clarify that torch.cuda.synchronize() is necessary for timing collective communication ops, even when they are called with async_op=False.
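
The timing note above can be illustrated with a minimal sketch. It is not the assignment's benchmarking code: `time_op` and `time_all_reduce` are hypothetical helper names, and `time_all_reduce` assumes that torch is installed and that a process group has already been initialized with `torch.distributed.init_process_group`. The point is only that the clock stops after an explicit synchronization, even though the collective is called with async_op=False.

```python
import time
from typing import Callable, List


def time_op(op: Callable[[], None], sync: Callable[[], None],
            warmup: int = 5, iters: int = 20) -> List[float]:
    """Return per-call wall-clock times in seconds for `op`.

    `sync` must block until all work queued by `op` has finished,
    e.g. torch.cuda.synchronize when `op` runs on a CUDA device.
    """
    for _ in range(warmup):
        op()
    sync()  # drain warm-up work before starting the clock

    times = []
    for _ in range(iters):
        start = time.perf_counter()
        op()
        sync()  # wait for the op to finish before stopping the clock
        times.append(time.perf_counter() - start)
    return times


def time_all_reduce(numel: int = 1 << 20) -> List[float]:
    """Time dist.all_reduce on one tensor (hypothetical helper).

    Assumption: the caller has already initialized a process group.
    """
    import torch
    import torch.distributed as dist

    use_cuda = torch.cuda.is_available()
    x = torch.ones(numel, device="cuda" if use_cuda else "cpu")
    # On CPU there is no device queue to drain, so sync is a no-op.
    sync = torch.cuda.synchronize if use_cuda else (lambda: None)
    return time_op(lambda: dist.all_reduce(x, async_op=False), sync)
```

Passing the synchronization step in as a callable keeps the timing loop independent of the device, so the same loop can be reused for other operations.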

Fixed

  • handout: fixed cut-off text in problem memory_profiling (a).
  • handout: fixed mismatch between Slurm config and description text in section 3.2.
  • code: fix ToyModelWithTiedWeights to actually tie weights.
  • handout: fix typo in bucketed DDP test command; it should be pytest tests/test_ddp.py
  • handout: fix deliverable of ddp_overlap_individual_parameters_benchmarking (a) to not ask for communication time, only end-to-end step time.
  • handout: clarify analysis in optimizer_state_sharding_accounting (a).

[0.0.1] - 2024-04-17

Added

  • handout: added a short question about variability to problem benchmarking_script.

Changed

Fixed

  • handout: fixed typo in problem triton_rmsnorm_forward. The adapters should return the classes, not the .apply attribute.
  • code: added -e flag to ./cs336-systems/'[test]'
  • handout: clarified recommendation about the timeit module.
  • handout: clarified question about kernel with highest CUDA total.

[0.0.0] - 2024-04-16

Initial release.