
v0.12.0

@lu229 lu229 released this 17 Nov 07:45
· 476 commits to master since this release

Release Note

The following are the highlights in this release:

Performance Optimization

We found that missing operator implementations on devices (GPU, Hexagon DSP, etc.) led to inefficient model execution, because memory synchronization between the device and the CPU consumed a lot of time. To improve efficiency, we added and enhanced some operators on the GPU (reshape, lpnorm, mvnorm, etc.) and the Hexagon DSP (s2d, d2s, sub, etc.).
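To make the cost concrete, the sketch below counts how often execution has to cross the device/CPU boundary for a given operator sequence. It is only an illustration under stated assumptions: the operator sequence, the sets of supported operators, and the function name `count_device_switches` are hypothetical and are not part of the MACE API. Each crossing stands in for one memory synchronization.

```python
def count_device_switches(ops, device_supported):
    """Count device/CPU boundary crossings for a linear operator sequence.

    An operator runs on the device if it is in `device_supported`,
    otherwise it falls back to the CPU. Every change of placement between
    consecutive operators implies one memory synchronization.
    """
    switches = 0
    previous = None
    for op in ops:
        placement = "device" if op in device_supported else "cpu"
        if previous is not None and placement != previous:
            switches += 1
        previous = placement
    return switches


# Hypothetical model: the operator names are illustrative only.
model_ops = ["conv2d", "reshape", "conv2d", "lpnorm", "softmax"]

# Before: reshape and lpnorm have no device implementation.
before = count_device_switches(model_ops, {"conv2d", "softmax"})

# After: every operator in the sequence has a device implementation.
after = count_device_switches(
    model_ops, {"conv2d", "softmax", "reshape", "lpnorm"}
)

print(before, after)
```

In this toy sequence, two unsupported operators cause four boundary crossings, and implementing them on the device removes all of them.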

Further Support For Speech Recognition

In the last version, we added support for the Kaldi framework. At Xiaomi, we have since done a lot of work to support speech recognition models, including support for flatten, upsample, and other ONNX operators, as well as some bug fixes.

CMake Support

MACE is continuously improving its compilation tools. This release adds support for building with CMake. Because it uses ccache for acceleration, the CMake build is much faster than the original Bazel build.
Related Docs: https://mace.readthedocs.io/en/latest/user_guide/basic_usage_cmake.html

Others

In this version, we added detection of performance regressions with dana, and added a “gpu_queue_window” parameter to the yml file to solve the UI lag problem caused by GPU task execution.
Related Docs: https://mace.readthedocs.io/en/latest/faq.html

Acknowledgement

Thanks to the following contributors, whose code makes MACE better.

yungchienhsu, gasgallo, albu, yunikkk