
0.0.50 Mixture of EasyDeL experts

Released by @erfanzar on 08 Feb 11:40

What's Changed

  • Optimize mean loss and accuracy calculation by @yhavinga in #100
  • Mixtral models are now fully supported and are PJIT-compatible
  • A wider range of models now supports FlashAttention on TPU
  • Qwen 1, Qwen 2, Phi 2, and RoBERTa are newly added models that support FlashAttention on TPU and EasyBIT
  • LoRA support has been added to the trainer via EasyDeLXRapTureConfig (see the sketch after this list)
  • Added EasyDeL Serve Engine APIs
  • Added Prompter (beta; it may be removed in future updates)
  • The training process is now 21% faster in 0.0.50 than it was in 0.0.42
  • Transform functions are now automated for all models (except MosaicMPT, which still requires the static methods)
  • The Trainer APIs have changed and are now faster, more dynamic, and more hackable
  • The default JAX version has changed to 0.4.22 for FJFormer's custom Pallas kernels
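
To make the LoRA item above concrete, here is a minimal, framework-agnostic JAX sketch of the low-rank adaptation idea. The function and parameter names (`init_lora_params`, `lora_dense`, `lora_rank`, `lora_alpha`) are illustrative assumptions only and are not the EasyDeLXRapTureConfig API.

```python
# Minimal LoRA sketch in plain JAX (illustrative only, not the EasyDeL API):
# the pretrained weight W stays frozen, and a low-rank update A @ B is learned.
import jax
import jax.numpy as jnp


def init_lora_params(key, in_dim, out_dim, lora_rank=8):
    """Initialize the low-rank adapter matrices A and B (B starts at zero)."""
    k_a, _ = jax.random.split(key)
    return {
        "A": jax.random.normal(k_a, (in_dim, lora_rank)) * 0.01,
        "B": jnp.zeros((lora_rank, out_dim)),
    }


def lora_dense(x, frozen_w, lora_params, lora_alpha=16.0, lora_rank=8):
    """y = x @ W + (alpha / r) * x @ A @ B, with W kept frozen."""
    base = x @ frozen_w
    delta = (x @ lora_params["A"]) @ lora_params["B"]
    return base + (lora_alpha / lora_rank) * delta


# Usage: only lora_params would receive gradients during fine-tuning.
key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (128, 256))   # pretrained, frozen weight
x = jax.random.normal(key, (4, 128))     # a batch of activations
params = init_lora_params(key, 128, 256)
y = lora_dense(x, w, params)
print(y.shape)  # (4, 256)
```

In this scheme only the adapter matrices A and B are trained while the pretrained weight stays frozen, which is what keeps LoRA fine-tuning cheap in parameters and optimizer state.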

New Contributors

Full Changelog: 0.0.42...0.0.50