Releases: ROCm/MIOpen
MIOpen v2.9.0
Notes:
- This release contains implicit GEMM algorithm performance updates and bug fixes. Additional performance improvements have been implemented for batch normalization.
Changes:
- Added new assembly implicit GEMM kernels
- Added batch normalization optimizations
- Fixed issue where miopen-hip backend install would not search for rocBLAS dependency
- Removed missing tunings from previous release cycle
- Removed deprecated implicit GEMM xDLOPs solvers
- Removed incorrect error messages from implicit GEMM solvers
- Disabled ConvAsmBwdWrW3x3 solver for stride > 1 cases
- Disabled bidirectional multi-pass kernels due to stability issues
MIOpen v2.8.0
Notes:
- This release provides additional bug fixes and support for embedded builds using MIOpen as a static library.
Changes:
- Fixed workspace size calculation for GEMM group convolutions
- Fixed performance regression for M/N
- Fixed issue with faulty compiler option
- Fixed typo in components dependency variable in CMakeLists.txt
- Fixed issues with COMgr-backed online compilation for HIP kernels
- Added cmake flag for embedding system databases when building a static library
- Added a way to disable building MIOpenDriver when building a static library
- Added CC compiler detection in ROCm environment
- Known issue: This release may show warnings for "obsolete configs" in the performance database. This can be fixed by rerunning tuning on a specific network; see tuning documentation
MIOpen v2.7.0
Notes:
- This release contains a new reduction API; see API documentation for more information. Additional features for embedded builds and further support for 3D convolutional networks have also been added.
Changes:
- Added additional tunings into performance database
- Added general reduction API
- Added cmake flag for embedding binary database into a static MIOpen build
- Added cmake flag for embedding system find-db text files into static MIOpen build
- Fixed issue with GEMM workspace size calculation for backwards data convolutions #381
- Fixed issue with 3D pooling indexing #365
MIOpen v2.6.0
Notes:
- This release contains convolution performance improvements, improved multi-threading behavior, and improved stability for half precision convolutions. Initial iteration time has been reduced with the introduction of hybrid find mode. Builds for a static library have been refined for this release.
Changes:
- Added MIOPEN_FIND_MODE=3 as the new default convolution Find mode; see the documentation for details
- Added a more runtime-parameterized version of pooling to reduce the number of online compilations
- Improved the performance of backwards spatial batch normalization for small images
- Fixed issue with std::logic_error in SQLite deleter #306
- Fixed issues with half precision stability for convolutions
- Fixed issues with multi-threaded SQLite database accesses
- Fixed issues with 3-D convolutions and incorrect parameters
- Fixed various issues with implicit GEMM static assert failures
- Removed inactive implicit GEMM convolution solvers
- Removed SCGEMM convolutional algorithm from MIOpen
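The MIOPEN_FIND_MODE setting mentioned above is an environment variable, so it can be pinned explicitly for a single process rather than relying on the library default. The following is a minimal Python sketch of that idea, not part of MIOpen itself: the helper names `env_with_find_mode` and `run_with_find_mode` are hypothetical, and the child process here is a stand-in that only echoes the variable instead of a real MIOpen workload.

```python
import os
import subprocess
import sys


def env_with_find_mode(mode="3", base=None):
    """Return a copy of an environment with MIOPEN_FIND_MODE set.

    Hypothetical helper: 'mode' defaults to "3", the default named in
    the v2.6.0 notes. The caller's own environment is left unmodified.
    """
    env = dict(os.environ if base is None else base)
    env["MIOPEN_FIND_MODE"] = str(mode)
    return env


def run_with_find_mode(cmd, mode="3"):
    """Run 'cmd' as a child process that sees the chosen Find mode."""
    return subprocess.run(
        cmd,
        env=env_with_find_mode(mode),
        capture_output=True,
        text=True,
        check=True,
    )


# Stand-in for a real MIOpen application: the child prints the value
# of MIOPEN_FIND_MODE that it inherited from the parent.
child = [sys.executable, "-c", "import os; print(os.environ['MIOPEN_FIND_MODE'])"]
result = run_with_find_mode(child)
seen_mode = result.stdout.strip()
```

Passing the variable through a copied environment keeps the setting scoped to the launched process, which can be convenient when comparing runs with different Find modes.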
MIOpen v2.5.0
Notes:
- This release contains convolution performance improvements, various minor fixes and documentation updates.
Changes:
- Added a script to detect and install appropriate precompiled kernels
- Added 3D convolution backwards weights implicit GEMM implementation
- Improved performance of convolution implicit GEMM algorithm
- Improved database coverage for batch size 1
- Improved logging and error reporting
- Improved documentation for debugging with numeric checks
- Fixed issue with potential infinities and NaNs appearing during low precision training on CNNs
MIOpen v2.4.0
Notes:
- This release contains new implementations of 3D convolutions using implicitGEMM, general performance improvements for convolutions, bug fixes, better versioning in directories, integration with the new rocclr, and dropout support in RNNs.
Changes:
- Added 3D convolutions for the implicitGEMM algorithm in the forward and backward-data passes
- Added dropout support for RNN layer; e.g., RNN-vanilla, GRU, and LSTM
- Added support for AMD's rocclr runtime and compiler
- Improved performance for implicitGEMM and Winograd algorithms
- Improved database locking
- Fixed issue with GPU memory segmentation fault on asymmetric padding #142
MIOpen v2.3.0
Notes:
- This release contains new implementations of the implicitGEMM and Winograd algorithms, performance improvements for convolutions, further support for 3D convolutional networks, and various bug fixes.
Changes:
- Added 3D Pooling layers
- Added backwards data algorithm for implicitGEMM
- Added GEMM performance improvements via relaxed constraints in rocBLAS-Tensile
- Added full CO v3 support for all kernels in MIOpen
- Added new Winograd group convolution kernels
- Added an API to query MIOpen's version
- Added parallel compilation in initial convolutional algorithm search; partial solution to #130
- Added SQLite binary program cache
- Improved logging across all layers
- Improved MIOpen's internal design for calling convolutional solvers
- Fixed various bugs for the implicitGEMM algorithm
MIOpen v2.2.1
Notes:
- This release contains bug fixes, documentation updates, and further code object version 3 support.
Changes:
- Added support for multiple ROCm installations
- Added additional support for code object v3
- Fixed issue with incorrect LRN calculation #127
- Fixed incorrect performance database documentation
- Fixed issue with incorrect workspace calculation in group convolutions
- Fixed issue with unsupported hardware instructions used with inline assembly
MIOpen v2.2.0
Notes:
- This release contains bug fixes, performance improvements, and expanded applicability for specific convolutional algorithms.
- MIOpen has posted a citable paper on arXiv.
- An SQLite database has been added to replace the text-based performance database. While the text file still exists, SQLite is used by default over the text-based performance database; see documentation for more details.
Changes:
- Added per-solution algorithm filtering environment variable for debugging
- Added SQLite3 database and build dependency. The text-based performance database support is deprecated and will be removed in the next release.
- Added citation page to documentation pointing to MIOpen's paper
- Added to the overall documentation
- Fixed fusion compilation check issue
- Fixed fusion group convolution warning
- Improved performance of forward pooling
- Improved performance of convolutions
- Improved performance of spatial training batch normalization for some large batch size input configurations
- Improved applicability of implicit GEMM convolution algorithm
- Improved performance of calls to miopenConvolutionXXXGetWorkSpaceSize() functions
- Improved conformance to code object version 3
- Disabled SCGEMM convolution algorithm by default; this algorithm is deprecated and will be removed in future releases
- Changed "hip_hcc" to "hip-hcc" for the MIOpen package requirements in CMakeLists.txt
MIOpen v2.1.0
Notes:
- This release contains new layers, bug fixes, and a new convolution algorithm.
Changes:
- Added a dropout layer API for training
- Added a new SCGEMM algorithm for convolutions
- Added further support for bfp16 in convolutions
- Added a Docker Hub link for MIOpen docker images
- Fixed issue with NaN appearing on batch normalization backwards pass in fp16
- Fixed softmax kernel bug in log mode #112
- Fixed gfx803 support issue #869
- Fixed gfx803 kernel issue #117
- Fixed issue with disabled GEMM #119
- Improved performance of batch normalization fp16 forward training layers
- Improved performance of convolution layers
- Removed MIOpenGEMM as a requirement for the HIP backend. It is now optional.