200213 Training Large Neural Networks with Constant Memory using a New Execution Algorithm.md
https://arxiv.org/abs/2002.05645

Training Large Neural Networks with Constant Memory using a New Execution Algorithm (Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj)

A method that uses CPU memory to efficiently reduce the memory footprint of large models. As with ZeRO, Microsoft seems to have suddenly become fixated on training large models?
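A rough toy sketch of the general idea, assuming the approach amounts to keeping all parameters in host (CPU) memory and staging only the currently executing layer on the device. The `Device` class, `forward_layer_by_layer`, and the scalar "layers" are all hypothetical stand-ins for illustration, not the paper's implementation, and no real accelerator is involved.

```python
# Toy simulation only: "host" is a plain Python list holding every layer's
# parameters; "device" is a dict that should never hold more than one layer.
# All names here are hypothetical and chosen for illustration.

class Device:
    def __init__(self):
        self.resident = {}    # layer index -> parameters currently "on device"
        self.peak_layers = 0  # max number of layers resident at the same time

    def load(self, idx, params):
        # Simulates a host-to-device copy of one layer's parameters.
        self.resident[idx] = list(params)
        self.peak_layers = max(self.peak_layers, len(self.resident))

    def evict(self, idx):
        # Simulates freeing the layer's device memory after it has run.
        del self.resident[idx]


def forward_layer_by_layer(host_layers, x, device):
    """Run a forward pass while keeping one layer on the device at a time."""
    for idx, params in enumerate(host_layers):
        device.load(idx, params)
        w, b = device.resident[idx]
        # Toy "layer": elementwise affine transform followed by ReLU.
        x = [max(0.0, w * v + b) for v in x]
        device.evict(idx)
    return x


# Three scalar layers (weight, bias) that live in host memory.
host_layers = [(2.0, 1.0), (0.5, -1.0), (3.0, 0.0)]
device = Device()
out = forward_layer_by_layer(host_layers, [1.0, -2.0], device)
```

In this toy, device-side parameter memory stays at one layer no matter how many layers the model has. That is the sense in which the footprint is "constant". Activations, gradients, and transfer overlap are ignored.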

#computation