InternLM / InternEvo (★ 285)
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
Topics: pytorch, multi-modal, gemma, pipeline-parallelism, transformers-models, tensor-parallelism, llava, llm-training, internlm, flash-attention, zero3, llm-framework, sequence-parallelism, internlm2, ring-attention, deepspeed-ulysses, llama3, 910b
Python · Updated Sep 29, 2024
xrsrke / pipegoose (★ 77)
Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)*.
Topics: transformers, moe, data-parallelism, distributed-optimizers, model-parallelism, megatron, mixture-of-experts, pipeline-parallelism, huggingface-transformers, megatron-lm, tensor-parallelism, large-scale-language-modeling, 3d-parallelism, zero-1, sequence-parallelism
Python · Updated Dec 14, 2023
AlibabaPAI / FlashModels (★ 8)
Fast and easy distributed model training examples.
Topics: deep-learning, pytorch, zero, data-parallelism, model-parallelism, distributed-training, xla, tensor-parallelism, llm, fsdp, sequence-parallelism
Python · Updated Sep 23, 2024
InternLM / InternEvo-HFModels (★ 3)
Democratizing Hugging Face model training with InternEvo.
Topics: model-parallelism, huggingface, tensor-parallelism, llm, zero3, sequence-parallelism
Python · Updated Sep 30, 2024