In this release:

* Support for networks with attention body and smolgen added to the blas, cuda, metal and onnx backends.
* Persistent L2 cache optimization for the cuda backend. Use the `cache_opt=true` backend option to turn it on (see the usage sketch after this list).
* Some performance improvements for the cuda, onnx and blas backends.
* Added the `threads` backend option to onnx; it defaults to 0 (let onnxruntime decide), except for onnx-cpu, where it defaults to 1.
* The onnx-dml package now includes a directml.dll installation script.
* Some users experienced memory issues with onnx-dml, so the defaults were changed. This may affect performance, in which case you can use the `steps=8` backend option to get the old behavior.
* The Python bindings are now available as a package; see the README for instructions.
* Assorted fixes and code cleanups.
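For reference, backend options like the ones above are passed to lc0 through the `--backend-opts` flag alongside `--backend`. A minimal sketch, assuming the cuda and onnx backends are included in your build (exact backend names may differ):

```
# Enable the persistent L2 cache optimization on the cuda backend
lc0 benchmark --backend=cuda-fp16 --backend-opts=cache_opt=true

# Pin onnxruntime to one thread (0, the default, lets onnxruntime decide)
lc0 benchmark --backend=onnx-cpu --backend-opts=threads=1

# Restore the previous onnx-dml behavior if the new defaults hurt performance
lc0 benchmark --backend=onnx-dml --backend-opts=steps=8
```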
This discussion was created from the release v0.30.0-rc1.