Releases: microsoft/tensorflow-directml-plugin
tensorflow-directml-plugin 0.4.0
The Python packages are available as a PyPI release. To download the latest Python package automatically, run `pip install tensorflow-directml-plugin`.
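After installing, a quick sanity check is to ask TensorFlow which accelerator devices it can see. The sketch below assumes the plugin exposes DirectML adapters under TensorFlow's `"GPU"` device type; that is an assumption, not something stated in these notes. The helper name `list_directml_devices` is hypothetical.

```python
def list_directml_devices():
    """Return the names of devices TensorFlow reports as "GPU".

    Returns an empty list when TensorFlow is not installed.
    Assumption: the plugin registers its devices under the "GPU" type.
    """
    try:
        import tensorflow as tf  # normally provided by the tensorflow-cpu package
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print(list_directml_devices())
```

An empty list means either that TensorFlow is missing or that no device was registered.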
Changes in 0.4.0
- Add DirectML kernels for `CudnnRNNCanonicalToParams` and `CudnnRNNParamsToCanonical`
- Add support for grouped convolution in `Conv2DBackpropFilter` and `Conv3DBackpropFilter`
- Add `float16` support for `_FusedConv2D`
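For readers unfamiliar with grouped convolution, the sketch below shows how the group count can be inferred from tensor shapes. It assumes TensorFlow's channels-last convention, in which the filter's second-to-last dimension holds `in_channels // groups`; the filter gradient produced by the backprop-filter ops has the same shape as the filter. The helper `infer_conv_groups` is hypothetical and is not part of the plugin.

```python
def infer_conv_groups(input_shape, filter_shape):
    """Infer the number of convolution groups from shapes.

    Assumes a channels-last input (NHWC or NDHWC) and a filter whose last
    two dimensions are (in_channels // groups, out_channels).
    """
    in_channels = input_shape[-1]
    filter_in_depth = filter_shape[-2]
    out_channels = filter_shape[-1]

    if in_channels % filter_in_depth != 0:
        raise ValueError("input channels must be a multiple of the filter's input depth")
    groups = in_channels // filter_in_depth
    if out_channels % groups != 0:
        raise ValueError("output channels must be a multiple of the group count")
    return groups


if __name__ == "__main__":
    # 32 input channels with a filter input depth of 8 gives 4 groups.
    print(infer_conv_groups((1, 8, 8, 32), (3, 3, 8, 64)))
```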
tensorflow-directml-plugin 0.3.0
Changes in 0.3.0
- Set `tensorflow-cpu==2.10.0` as a hard dependency due to incompatibility with Keras 2.11's default optimizers.
- Fix overflow in `BatchNorm` ops when float16 or mixed precision is used.
- Remove unnecessary `Cast` operation in `ReduceMin` and `ReduceMax` ops.
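To illustrate the class of problem behind the `BatchNorm` overflow fix, the sketch below shows how a variance computation can overflow when every intermediate is kept in float16, whose largest finite value is 65504. It is only an illustration of float16 range limits using the standard library; it does not reproduce the plugin's kernels or its actual fix, and all function names are hypothetical.

```python
import math
import struct


def to_float16(x):
    """Round x to the nearest float16 value; return +/-inf when it does not fit."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def variance_all_float16(values):
    """Compute E[x^2] - E[x]^2 with every intermediate rounded to float16."""
    n = len(values)
    mean_sq = 0.0
    mean = 0.0
    for v in values:
        mean_sq = to_float16(mean_sq + to_float16(v * v) / n)
        mean = to_float16(mean + v / n)
    return to_float16(mean_sq - to_float16(mean * mean))


def variance_wide_accumulator(values):
    """Accumulate in a wider type (a Python float here) and round only the result."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return to_float16(variance)


if __name__ == "__main__":
    data = [300.0] * 4  # 300 * 300 = 90000, which exceeds the float16 range
    print(variance_all_float16(data), variance_wide_accumulator(data))
```

The true variance of the sample data is zero and fits easily in float16; only the intermediate squares overflow.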
tensorflow-directml-plugin 0.2.0
Changes in 0.2.0
- Improve TensorBoard profiling and capturing chrome traces
- Add support for `exponential_avg_factor != 1.0` in `FusedBatchNorm`
- Add an `int32` kernel registration for `Fill`
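As a reminder of what `exponential_avg_factor` controls, the sketch below blends a running statistic with the current batch statistic. It assumes the update rule described in TensorFlow's fused batch norm documentation, where a factor of 1.0 returns the batch statistic unchanged. The helper `update_running_stat` is hypothetical and is not the plugin's kernel.

```python
def update_running_stat(running, batch, exponential_avg_factor=1.0):
    """Blend a running mean or variance with the current batch statistic.

    Assumed rule: new = (1 - factor) * running + factor * batch.
    """
    return (1.0 - exponential_avg_factor) * running + exponential_avg_factor * batch


if __name__ == "__main__":
    print(update_running_stat(2.0, 4.0))       # factor 1.0: batch value only
    print(update_running_stat(2.0, 4.0, 0.5))  # factor 0.5: halfway between the two
```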
tensorflow-directml-plugin 0.1.1
Changes in 0.1.1
- Fix a crash in `InTopKV2` when `k` is bigger than the size of the axis dimension.
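The edge case above can be made concrete with a small reference implementation of in-top-k semantics. This is a plain-Python sketch of the behavior as I understand it from TensorFlow's documentation: a target counts as "in the top k" when fewer than `k` classes score strictly higher. It is not the plugin's kernel. When `k` exceeds the number of classes, every valid, finite target qualifies.

```python
import math


def in_top_k(predictions, targets, k):
    """Return one bool per row: is the target class among the k highest scores?

    Ties count in the target's favor. Out-of-range targets and non-finite
    target scores yield False.
    """
    results = []
    for row, target in zip(predictions, targets):
        if not 0 <= target < len(row) or not math.isfinite(row[target]):
            results.append(False)
            continue
        higher = sum(1 for score in row if score > row[target])
        results.append(higher < k)
    return results


if __name__ == "__main__":
    scores = [[0.1, 0.7, 0.2]]
    print(in_top_k(scores, [0], 1))  # class 0 has the lowest score
    print(in_top_k(scores, [0], 5))  # k is larger than the 3 available classes
```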
tensorflow-directml-plugin 0.1.0
Changes in 0.1.0
- Upgrade the DirectML version to 1.9.1, which includes minor bug fixes and performance improvements.
- Add DirectML kernels for the `RngSkip` and `RngReadAndSkip` operators.
- Add DirectML kernels for the `StatelessRandomGetKeyCounterAlg`, `StatelessRandomGetKeyCounter` and `StatelessRandomGetAlg` operators.
- Add a DirectML kernel for `SparseApplyAdagrad`.
- Add a DirectML kernel for `StatelessRandomUniformV2`.
- Add a DirectML kernel for `InTopKV2`.
- Add DirectML kernels for `MatrixDiagV3` and `MatrixDiagPartV3`.
- Add emulated support for `int64`.
- Add a dependency on `tensorflow-cpu>=2.10.0`. Users should install the `tensorflow-cpu` package instead of `tensorflow` or `tensorflow-gpu` when using `tensorflow-directml-plugin`.
- Add `int32` support for `StridedSlice`.
- Add CPU emulated versions of `UnsortedSegmentSum`, `UnsortedSegmentMax`, `UnsortedSegmentMin` and `UnsortedSegmentProd` to get rid of device placement errors in transformer models.
- Add a C API for Linux. The C API can be downloaded from the releases page in the `tensorflow-directml-plugin` GitHub repository.
- Add support for multiple devices.
- Add integer support for `Relu`.
- Add `int32` support for `Pack`.
- Fix the incomplete adapter description on Linux.
- Fix a crash in `ArgMin` and `ArgMax` when the output type was `int16` or `uint16`.
- Fix undefined behavior when retrieving a list of strings from an attribute.
- Fix a memory leak in the BFC allocator.
- Fix a memory leak in the graph optimizer.
- Fix a memory leak in `SegmentReduction`.
- Fix a memory leak in `StridedSlice`.
- Fix a memory leak in the emulated random kernels.
- Fix the validation of `Range` to allow values near `INT_MAX`.
- Get rid of warnings related to unsupported `DataFormatDimMap` and `DataFormatVecPermute` operators.
- Prevent unbounded growth of command allocator memory.
- Optimize output allocation for inputs that can be executed in-place and directly forwarded to the output.
- Increase the available memory by allowing devices to allocate shared (nonlocal) memory.
- Improve the performance of the unsorted segment operators by batching GPU->CPU copies together.
- Increase the performance of emulated operators by reducing the number of eager contexts and eager ops created.
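Because 0.1.0 introduces the `tensorflow-cpu` dependency and advises against installing `tensorflow` or `tensorflow-gpu` alongside the plugin, it can help to check which of these distributions are present in the current environment. The sketch below uses only the standard library's `importlib.metadata`; the helper names are hypothetical.

```python
from importlib import metadata

# Distribution names taken from the release notes above.
PACKAGES = ("tensorflow-cpu", "tensorflow", "tensorflow-gpu", "tensorflow-directml-plugin")


def installed_tensorflow_packages(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def conflicting_packages(found):
    """Return installed distributions that the notes say to avoid with the plugin."""
    return [name for name in ("tensorflow", "tensorflow-gpu") if found.get(name) is not None]


if __name__ == "__main__":
    found = installed_tensorflow_packages()
    print(found)
    print("conflicts:", conflicting_packages(found))
```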