DJLServing v0.25.0 Release

@siddvenk siddvenk released this 20 Dec 19:41
· 2 commits to 0.25.0-dlc since this release
043e7af

Key Changes

  • TensorRT-LLM integration. DJLServing can now deploy large language models using the TensorRT-LLM backend.
  • SmoothQuant support in DeepSpeed
  • Rolling batch support in DeepSpeed to boost throughput
  • Updated documentation on using DJLServing to deploy LLMs
    • Added documentation of the supported configurations for each container, along with many new examples

Enhancements

Bug Fixes

Docs

CI/CD

New Contributors

Full Changelog: v0.24.0...v0.25.0