InternLM / InternEvo (★ 285)
InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.
Topics: pytorch, multi-modal, gemma, pipeline-parallelism, transformers-models, tensor-parallelism, llava, llm-training, internlm, flash-attention, zero3, llm-framework, sequence-parallelism, internlm2, ring-attention, deepspeed-ulysses, llama3, 910b
Python · Updated Sep 29, 2024
xrsrke / pipegoose (★ 77)
Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)*.
Topics: transformers, moe, data-parallelism, distributed-optimizers, model-parallelism, megatron, mixture-of-experts, pipeline-parallelism, huggingface-transformers, megatron-lm, tensor-parallelism, large-scale-language-modeling, 3d-parallelism, zero-1, sequence-parallelism
Python · Updated Dec 14, 2023
AlibabaPAI / FlashModels (★ 8)
Fast and easy distributed model training examples.
Topics: deep-learning, pytorch, zero, data-parallelism, model-parallelism, distributed-training, xla, tensor-parallelism, llm, fsdp, sequence-parallelism
Python · Updated Sep 23, 2024
InternLM / InternEvo-HFModels (★ 3)
Democratizing Hugging Face model training with InternEvo.
Topics: model-parallelism, huggingface, tensor-parallelism, llm, zero3, sequence-parallelism
Python · Updated Sep 30, 2024