DJLServing v0.21.0 release

@frankfliu released this 25 Feb 18:14
· 1341 commits to master since this release
583dbc0

Key Features

  • Adds FasterTransformer support (#424)
  • Adds DeepSpeed ahead-of-time partition script in DLC (#466)
  • Adds SageMaker MME support (#479)
  • Adds support for stable-diffusion-2-1-base model (#484)
  • Adds support for stable diffusion depth model (#488)
  • Adds out-of-memory protection for model loading (#496)
  • Makes load_on_devices a per-model setting (#493)
  • Improves several per-model settings
  • Improves management console model loading and inference UI (#431, #432)
  • Updates DeepSpeed to 0.8.0 (#465)
  • Upgrades PyTorch to 1.13.1 (#414)

Enhancements

  • Adds model_id support for huggingface models (#406)
  • Adds AI template package (#485)
  • Improves snakeyaml error message (#400)
  • Improves s5cmd error handling (#442)
  • Emits model inference metrics to log file (#452)
  • Supports model.pt and model.onnx file names (#459)
  • Makes batch a per-model setting (#456)
  • Keeps failed worker status for 1 minute (#463)
  • Detects engine to avoid unnecessarily downloading the MXNet engine (#481)
  • Uses temp directory instead of /tmp (#404)
  • Adds better logging and error handling for s5cmd process execution (#409)
  • Uses jacoco aggregation report plugin (#421)
  • Rolls back model if worker fails to start in synchronous mode (#427)
  • Adds fastertransformer t5 integration test (#469)
  • Prints better stack trace if channel is closed (#473)
  • Supports running FasterTransformer in MPI mode (#474)

Bug fixes

  • Adds fix to work around SageMaker changes (#401)
  • Treats empty HTTP parameter as absent (#429)
  • Fixes inference console UI bug (#439)
  • Fixes gpt-neox model name typo (#441)
  • Fixes wrong onnx configuration (#449)
  • Fixes issue with passing dtype in huggingface handler. Refactor dtype_f…
  • Fixes issues with model_dir and model_id usage that occur when s3url is…
  • Fixes broken vue tags (#453)

Breaking changes

  • Removes unnecessary Java engine adapter (#448)
  • Removes djl-central module in favor of management console (#447)
  • Sets model status to failure after exceeding the retry threshold (#455)
  • Removes DLR support (#468)

Documentation

  • Updates management API document (#436)
  • Adds dynamic batching settings to document (#462)
  • Improves plugin README (#477)
  • Fixes management_api.md broken list (#478)
  • Updates serving configuration document (#437)