Releases: EleutherAI/gpt-neox
GPT-NeoX 2.0
With GPT-NeoX 2.0, we now support upstream DeepSpeed. This enables the use of new DeepSpeed features such as Curriculum Learning, Communication Logging, and Autotuning.
For any changes in upstream DeepSpeed that are fundamentally incompatible with GPT-NeoX 2.0, we do the following:
- Attempt to create a PR to upstream DeepSpeed
- Stage the PR on DeeperSpeed 2.x, so that there's always a DeepSpeed version that's guaranteed to work with GPT-NeoX 2.x.
Therefore, we recommend using DeeperSpeed 2.x unless your use case relies on a specific upstream DeepSpeed feature that we haven't yet merged into DeeperSpeed 2.x.
What's Changed
- Mup Support in #704
- Bring deepspeed_main up-to-date in #746
- Latest DeepSpeed Support in #663
- Curriculum Learning Support in #695
- Autotuning Support in #739
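
As an illustration of the new upstream features, curriculum learning is configured through the DeepSpeed JSON config. A minimal sketch of such a block follows; the specific values (sequence-length curriculum from 8 to 1024 tokens over 15,000 steps) are illustrative placeholders, not recommended settings:

```json
{
  "curriculum_learning": {
    "enabled": true,
    "curriculum_type": "seqlen",
    "min_difficulty": 8,
    "max_difficulty": 1024,
    "schedule_type": "fixed_linear",
    "schedule_config": {
      "total_curriculum_step": 15000,
      "difficulty_step": 8
    }
  }
}
```

With `curriculum_type: "seqlen"`, training starts on shorter sequences and linearly ramps up to the full sequence length, which can improve early-training stability.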
Full Changelog: v1.0...v2.0
GPT-NeoX 1.0
This is the legacy GPT-NeoX relying on the old DeeperSpeed (0.3.15). We only recommend using this release if you're loading a model trained with the old DeeperSpeed (e.g. GPT-J, GPT-NeoX-20B, or the Pythia suite).
The primary difference between this release and v2.x is the supported DeepSpeed version. If you're on 2.x, we assume you're using either the latest release of upstream DeepSpeed or DeeperSpeed 2.x.
legacy_gptj_residual.1.0.0
Merge pull request #492 from EleutherAI/dependabot/pip/requirements/n…