Releases: microsoft/tensorflow-directml-plugin

tensorflow-directml-plugin 0.4.0

03 Feb 00:57
5973109
Pre-release

The Python packages are available as a PyPI release. To install the latest Python package, simply run pip install tensorflow-directml-plugin.
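
A quick way to confirm the plugin loaded (a minimal sketch, not part of the release notes): after installing tensorflow-cpu and tensorflow-directml-plugin, each DirectML-capable adapter should appear as a GPU device.

    import tensorflow as tf

    # The plugin registers each DirectML-capable adapter as a GPU device;
    # an empty list means the plugin did not load.
    print(tf.config.list_physical_devices("GPU"))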

Changes in 0.4.0

  • Add DirectML kernels for CudnnRNNCanonicalToParams and CudnnRNNParamsToCanonical
  • Add support for grouped convolution in Conv2DBackpropFilter and Conv3DBackpropFilter (see the sketch after this list)
  • Add float16 support for _FusedConv2D
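
A minimal sketch of the grouped-convolution case added in this release (shapes are illustrative): computing the filter gradient of a Conv2D with groups > 1 runs through Conv2DBackpropFilter.

    import tensorflow as tf

    # Grouped convolution: 8 input channels split across 4 groups.
    x = tf.random.normal([1, 16, 16, 8])
    conv = tf.keras.layers.Conv2D(filters=8, kernel_size=3, groups=4, padding="same")

    with tf.GradientTape() as tape:
        loss = tf.reduce_sum(conv(x))

    # The filter gradient is where the grouped Conv2DBackpropFilter kernel runs.
    grads = tape.gradient(loss, conv.trainable_variables)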

tensorflow-directml-plugin 0.3.0

13 Dec 05:17
e58bf87
Pre-release

The Python packages are available as a PyPI release. To install the latest Python package, simply run pip install tensorflow-directml-plugin.

Changes in 0.3.0

  • Set tensorflow-cpu==2.10.0 as a hard dependency due to incompatibility with Keras 2.11's default optimizers.
  • Fix overflow in BatchNorm ops when float16 or mixed precision is used (see the sketch after this list).
  • Remove unnecessary Cast operation in ReduceMin and ReduceMax ops.
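
A minimal sketch of the configuration covered by the BatchNorm overflow fix (model and shapes are illustrative): under the mixed_float16 policy, computations run in float16 while variables stay in float32.

    import tensorflow as tf

    tf.keras.mixed_precision.set_global_policy("mixed_float16")

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(16, 3, padding="same"),
        tf.keras.layers.BatchNormalization(),  # the op that could previously overflow
        tf.keras.layers.ReLU(),
    ])

    y = model(tf.random.normal([4, 32, 32, 3]))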

tensorflow-directml-plugin 0.2.0

21 Oct 01:43
368d8d3
Pre-release

The Python packages are available as a PyPI release. To install the latest Python package, simply run pip install tensorflow-directml-plugin.

Changes in 0.2.0

  • Improve TensorBoard profiling and Chrome trace capture (see the sketch after this list)
  • Add support for exponential_avg_factor != 1.0 in FusedBatchNorm
  • Add an int32 kernel registration for Fill
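
The improved traces can be captured with TensorFlow's programmatic profiler API; a minimal sketch (the log directory name is arbitrary):

    import tensorflow as tf

    tf.profiler.experimental.start("logs/profile")

    x = tf.random.normal([1024, 1024])
    for _ in range(10):
        x = tf.matmul(x, x) / 1024.0  # some device work to show up in the trace

    tf.profiler.experimental.stop()
    # Inspect the trace with: tensorboard --logdir logs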

tensorflow-directml-plugin 0.1.1

05 Oct 18:10
fde3375
Pre-release

The Python packages are available as a PyPI release. To install the latest Python package, simply run pip install tensorflow-directml-plugin.

Changes in 0.1.1

  • Fix a crash in InTopKV2 when k is larger than the size of the axis dimension (see the sketch below).
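
A minimal sketch of the fixed case (values are illustrative): with only 3 classes, k=5 previously crashed the DirectML InTopKV2 kernel; every target is now trivially in the top k.

    import tensorflow as tf

    predictions = tf.random.normal([4, 3])   # 4 samples, 3 classes
    targets = tf.constant([0, 2, 1, 1])

    # k larger than the class axis: returns all True instead of crashing.
    print(tf.math.in_top_k(targets, predictions, k=5))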

tensorflow-directml-plugin 0.1.0

29 Sep 17:08
536ad9a
Pre-release

The Python packages are available as a PyPI release. To install the latest Python package, simply run pip install tensorflow-directml-plugin.

Changes in 0.1.0

  • Upgrade the DirectML version to 1.9.1, which includes minor bug fixes and performance improvements.
  • Add DirectML kernels for the RngSkip and RngReadAndSkip operators.
  • Add DirectML kernels for the StatelessRandomGetKeyCounterAlg, StatelessRandomGetKeyCounter and StatelessRandomGetAlg operators.
  • Add a DirectML kernel for SparseApplyAdagrad.
  • Add a DirectML kernel for StatelessRandomUniformV2.
  • Add a DirectML kernel for InTopKV2.
  • Add DirectML kernels for MatrixDiagV3 and MatrixDiagPartV3.
  • Add emulated support for int64.
  • Add a dependency on tensorflow-cpu>=2.10.0. Users should install the tensorflow-cpu package instead of tensorflow or tensorflow-gpu when using tensorflow-directml-plugin.
  • Add int32 support for StridedSlice.
  • Add CPU emulated versions of UnsortedSegmentSum, UnsortedSegmentMax, UnsortedSegmentMin and UnsortedSegmentProd to get rid of device placement errors in transformer models.
  • Add a C API for Linux. The C API can be downloaded from the releases page in the tensorflow-directml-plugin GitHub repository.
  • Add support for multiple devices (see the sketch after this list).
  • Add integer support for Relu.
  • Add int32 support for Pack.
  • Fix the incomplete adapter description on Linux.
  • Fix a crash in ArgMin and ArgMax when the output type is int16 or uint16.
  • Fix undefined behavior when retrieving a list of strings from an attribute.
  • Fix a memory leak in the BFC allocator.
  • Fix a memory leak in the graph optimizer.
  • Fix a memory leak in SegmentReduction.
  • Fix a memory leak in StridedSlice.
  • Fix a memory leak in the emulated random kernels.
  • Fix the validation of Range to allow values near INT_MAX.
  • Get rid of warnings related to unsupported DataFormatDimMap and DataFormatVecPermute operators.
  • Prevent unbounded growth of command allocator memory.
  • Optimize output allocation for inputs that can be executed in-place and directly forwarded to the output.
  • Increase the available memory by allowing devices to allocate shared (nonlocal) memory.
  • Improve the performance of the unsorted segment operators by batching GPU->CPU copies.
  • Improve the performance of emulated operators by reducing the number of eager contexts and eager ops created.
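
With multi-device support, each DirectML-capable adapter is exposed as a separate GPU device; a minimal sketch of explicit placement (device indices depend on the machine):

    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    print(gpus)  # one entry per DirectML-capable adapter

    if len(gpus) > 1:
        # Pin work to the second adapter using the usual device string.
        with tf.device("/GPU:1"):
            x = tf.random.normal([2, 2])
            print(tf.matmul(x, x))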