Releases: ucb-bar/gemmini
v0.7.1
v0.7.0 I-BERT, Chipyard 1.8, and new profiling scripts
- Add support for I-BERT, including its I-GELU, layernorm, and softmax variants
- Add new parallelized profiling and data collection scripts
- Support for Chipyard 1.8
- Miscellaneous area, power, performance, and functionality improvements, as well as bug fixes
v0.6.4 Dummy configs, area improvements, and bug fixes
New features:
- Add support for new "dummy configs" that give cycle-accurate performance results but do not guarantee functional correctness of matmul/conv operations
- Optimize area for reservation stations by splitting unified reservation station into separate stations for each instruction type
Bug fixes:
- Fix assertions that incorrectly fired when building some non-systolic configs
- Fix unpredictable infinite stall caused by OS interrupts
- Fix build scripts when building simulators that can generate waveforms
v0.6.3 Chisel 3.5 Bump and Depthwise Convolution Fixes
- Bump to Chisel 3.5
- Fix bug with depthwise convolutions
v0.6.2 Proxy Kernel and Floating Point Fixes
- Disable on-demand paging for proxy-kernel workloads, because on-demand paging doesn't work well with large binaries
- Fix compilation errors for large floating-point configs
v0.6.1 New Usability Improvements
- New scripts to run proxy-kernel binaries on Gemmini
- Better help messages in Gemmini convenience scripts
- New setup and quick start instructions in Gemmini's README
v0.6 Training Features and Performance/Area/Energy Optimizations
- Support for various convolutions that occur during back-propagation
- Single-porting config options for scratchpad/accumulator
- Tree-reduction pipelining options for spatial array
- New SoC and simulation counters that give insight into the performance of Gemmini accelerators and other SoC components
- New scripts that make it easier to build and test Gemmini configs
v0.2 Padding, Banked Accumulators, Unrolled mvin/mvout
- Hardware-supported padding for matrices which are not multiples of the array size
- Banked accumulators
- Loop-unrolled mvin/mvout for matrix sizes divisible by the size of the array
- Bug fixes
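
The padding item in v0.2 can be illustrated in software. The sketch below is not Gemmini's hardware implementation or its API; it only shows, under stated assumptions, what padding a matrix up to a multiple of the array size means. The names `round_up` and `pad_to_array_size` are hypothetical, and `dim` stands in for the array size of whichever config is being built.

```python
def round_up(n, multiple):
    """Round n up to the nearest multiple of `multiple`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return ((n + multiple - 1) // multiple) * multiple


def pad_to_array_size(matrix, dim, fill=0):
    """Return a copy of `matrix` (a list of equal-length rows) padded with
    `fill` so that both dimensions are multiples of `dim`.

    `dim` is an assumed stand-in for the array size; this is a software
    illustration only, not how the hardware applies padding.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    padded_rows = round_up(rows, dim)
    padded_cols = round_up(cols, dim)

    # Extend each existing row on the right, then append all-fill rows.
    out = [list(row) + [fill] * (padded_cols - cols) for row in matrix]
    out += [[fill] * padded_cols for _ in range(padded_rows - rows)]
    return out


if __name__ == "__main__":
    # A 1x3 matrix padded for an assumed array size of 2 becomes 2x4.
    for row in pad_to_array_size([[1, 2, 3]], dim=2):
        print(row)
```

With hardware-supported padding, software no longer needs an explicit step like this before moving data in; the sketch only makes the resulting shapes concrete.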