Releases: pjlab-sys4nlp/llama-moe
v1.0.0-publish
Everything seems to be ready
v0.3.2-cpt-configs_and_scripts
- Add the final data portion of Sheared LLaMA.
- Add the gate_network_type and moe_calculator_score_scale_factor arguments, and update the prob_map arguments, in the config.
- Add exec scripts.
v0.3.1-cpt-dynamic_batch_loading: Llama2 CPT with Dynamic Batch Loading
- Llama2 CPT training with a 4096 context length.
- Dynamic batch loading, following the ShearedLlama implementation.
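
For readers unfamiliar with dynamic batch loading, here is a minimal sketch of the domain-weight update as it is generally described for Sheared LLaMA: domains whose current loss exceeds a reference loss are sampled more often in later batches. The function name `update_domain_weights` and the example numbers are hypothetical and are not taken from this repository's code.

```python
import math


def update_domain_weights(weights, current_losses, reference_losses):
    """Return new per-domain sampling weights (hypothetical helper).

    Domains whose current loss is above the reference loss are upweighted
    with an exponentiated update, then the weights are renormalized.
    """
    # Excess loss per domain, clipped at zero so domains that already
    # meet their reference loss are not upweighted.
    deltas = [
        max(cur - ref, 0.0)
        for cur, ref in zip(current_losses, reference_losses)
    ]
    # Multiplicative (exponentiated) update of the previous weights.
    unnormalized = [w * math.exp(d) for w, d in zip(weights, deltas)]
    # Renormalize so the weights form a sampling distribution.
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two domains with equal starting weights; only the first lags its reference.
new_weights = update_domain_weights(
    weights=[0.5, 0.5],
    current_losses=[2.0, 1.0],
    reference_losses=[1.0, 1.0],
)
```

In this example the first domain ends up with the larger weight, so subsequent batches would draw more of their data from it.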
v0.2.1-cpt-13b: Fix 13B CPT bugs
Merge pull request #31 from pjlab-sys4nlp/scaling_13b
CPT: fix TensorBoard logging, fix gradient checkpointing, faster data loading