Releases: larq/compute-engine

v0.4.3

21 Sep 13:31

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

GPG key ID: 4AEE18F83AFDEB23

🎉 Features

Add support for multi-threaded interpreter (#512) @lgeiger
Add support for iterators in Interpreter.predict (#511) @lgeiger
Add Python wrapper for LCE interpreter (#507) @lgeiger

👷‍♂️ Internal Improvements

Remove redundant code paths: reference BGemm, BGemm functor. (#510) @AdamHillier
Make Interpreter::get_shapes and get_types more generic (#509) @lgeiger
Add padding to temporary arrays to ensure we don't read beyond bounds. (#505) @AdamHillier
Simplify MAKE_ZERO macro (#503) @lgeiger

⬆️ Dependencies

Bump DoozyX/clang-format-lint-action from v0.9 to v0.10 (#504) @dependabot

v0.4.2

10 Sep 16:25

🐛 Bug Fixes

Fix bug introduced by #497. Make sure OPT kernels run. (#501) @AdamHillier

v0.4.1

10 Sep 13:43

🎉 Features

Add bitpacked output support to the Aarch64 optimised int16 kernel. (#496) @AdamHillier
Add Int8 output support to the optimised binary convolution kernels. (#492) @AdamHillier

🚀 Performance

Add bitpacked output support to the Aarch64 optimised int16 kernel. (#496) @AdamHillier
Update int8 reference bitpack code (#487) @Tombana
Add an optimised Aarch64 assembly implementation of Int8 bitpacking. (#494) @AdamHillier
Add Int8 output support to the optimised binary convolution kernels. (#492) @AdamHillier
Fuse part of the back-transformation into the multiplier and bias. (#490) @AdamHillier

🐛 Bug Fixes

Fix bug in 32-bit accumulator kernel introduced by 346f72e. (#491) @AdamHillier

📖 Documentation

Fix typo in README.md (#485) @leonoverweel

👷‍♂️ Internal Improvements

Simplify path selection and remove unnecessary kernel registrations. (#497) @AdamHillier
Add inline to round and saturate (#499) @Tombana
Remove Preprocessed from OutputTransformDetails (#495) @Tombana
Use Bazel test_filter argument instead of action_env for tests (#488) @AdamHillier
Clean-up bconv2d by simplifying Init/Prepare/OneTimeSetup. (#486) @AdamHillier
Simplify BConv2DParams by removing flags. (#484) @AdamHillier

⬆️ Dependencies

Bump DoozyX/clang-format-lint-action from v0.8 to v0.9 (#489) @dependabot

v0.4.0

28 Aug 15:51

⚠️ Breaking Changes ⚠️

Split BSign into Quantize/Dequantize and separate from BConv (#457) @lgeiger, @AdamHillier, @Tombana
Change custom option enums to match flatbuffer schema (#464) @lgeiger
Set padding and fused activations as enums (#386) @lgeiger
Rename BConv attributes to follow TFLite more closely (#373) @lgeiger

🎉 Features

Support full int8 QAT models in converter (#449) @lgeiger
Add ARM32 kernel implementation (#432) @honglh
Allow setting fake default ranges to enable latency tests of int8 models (#357) @lgeiger
Publish prebuilt benchmark binaries with each new release (#398) @lgeiger
Publish dev docker images for releases (#397) @lgeiger
Use thresholds for bitpacked output (#387) @Tombana
Support dilated convolutions in converter (#385) @lgeiger

🚀 Performance

Improve performance with optimised Ruy matrix packing. (#462) @AdamHillier
Improve Aarch64 performance by removing NEON pipeline stalls. (#394) @AdamHillier

🐛 Bug Fixes

Simplify make build (#480) @Tombana
Fix int8 outputtransform (#422) @Tombana
Fix trailing quantization op with experimental_bitpacking fusion (#402) @lgeiger
Fix BMaxPool IR definition (#404) @lgeiger
Fix sequential model in end2end test: add missing quantisers. (#390) @AdamHillier

📖 Documentation

Update URLs in readme to adapt to the build docs changes (#481) @lgeiger
Update benchmark table in Readme (#470) @lgeiger
Fix broken links (#413) @koenhelwegen

👷‍♂️ Internal Improvements

Remove zero_point argument from im2col (#478) @lgeiger
Remove lce_benchmark_all binary (#477) @lgeiger
Add profiler scopes for bitpacking functions and quantization/maxpool ops. (#471) @AdamHillier
Remove Windows builds from release workflow (#476) @lgeiger
Remove input/output dimensions from BConvParams (#475) @lgeiger
Directly pass packed filter shape to bconv kernel (#474) @lgeiger
Remove output shape argument from bitpack_tensor (#473) @lgeiger
Simplify shape handling in kernel prepare method (#472) @lgeiger
Fix unittest linkopts for macos (#465) @lgeiger
Rename packbits to bitpack for consistency. (#463) @AdamHillier
Automatically upload benchmark binaries to new releases (#456) @lgeiger
Standardise on 32-bit bitpacking. Closes #446. (#461) @AdamHillier
add extra braces around padding_buffer (#455) @andrewstanfordjason
Prebuild AArch32 benchmark binary for new releases (#453) @lgeiger
Add optimised canonical bitpacking for Aarch64. Closes #435. (#443) @AdamHillier
Add Android AAR build to release workflow (#451) @lgeiger
Add make_unsigned to unpack_matrix type (#442) @Tombana
Use a 32-column C++ kernel layout when writing bitpacked output. (#441) @AdamHillier
Test against Python 3.8 by default on CI (#438) @lgeiger
Remove unused packbits_arm32 (#437) @lgeiger
Test converter against TF 2.3 (#436) @lgeiger
Update the unittest error threshold (#430) @Tombana
Refactor OutputTransform with template types (#426) @Tombana
Replace context->ReportError calls with TF_LITE_KERNEL_LOG (#403) @lgeiger
Improve error message when passing wrong converter arg (#414) @lgeiger
Use custom larq MLIR dialect for our ops (#384) @lgeiger
Add misc updates for Micro (#401) @Tombana
Upgrade TensorFlow to eaacee173897b77cdb6afd22d5e78154177a10f3 (#363) @lgeiger
Slightly optimize docker image size (#399) @lgeiger
Add new performance label to release notes (#395) @lgeiger
Move weight bitpacking into its own converter pass. (#393) @AdamHillier
Add custom dev docker file (#392) @lgeiger
Remove RUY_ASM_FLAG_HAS_BIAS (#389) @lgeiger
Remove filter_format attribute from IR and bconv kernel (#382) @lgeiger
Add pattern to ensure post_activation_multipliers are always positive (#375) @lgeiger
Upgrade github/actions (#381) @lgeiger
Add testing script for faster model conversions debugging (#380) @lgeiger

⬆️ Dependencies

Update lint dependencies (#479) @lgeiger
Bump actions/setup-python from v2.1.1 to v2.1.2 (#460) @dependabot
Update actions/upload-artifact requirement to v2.1.4 (#419, #425, #440, #448, #454, #459) @dependabot
Bump GoogleCloudPlatform/github-actions from 0.1.2 to 0.1.3 (#444) @dependabot
Upgrade tensorflow to 2.3 stable (#439) @lgeiger
Update lce_register.cc (#434) @lgeiger
Bump DoozyX/clang-format-lint-action from v0.6 to v0.8 (#427, #433) @dependabot
Bump actions/setup-python from v2 to v2.1.1 (#429) @dependabot
Upgrade TensorFlow to 2.3.0rc0 (#415) @lgeiger
Bump actions/download-artifact from v1 to v2 (#418) @dependabot
Bump toolmantim/release-drafter from v5.8.0 to v5.11.0 (#417) @dependabot
Create Dependabot config file (#416) @dependabot-preview
Upgrade TensorFlow to tensorflow@9a70ab8 (#405) @lgeiger

v0.3.1

26 May 13:53

lgeiger

🐛 Bug Fixes

Fix weight bitpacking which could lead to non-deterministic behaviour (#377) @lgeiger

v0.3.0

12 May 19:48

🎉 Features

Add experimental support for bitpacked output tensors to the converter (#352) @AdamHillier
Build wheels for Windows (#355) @lgeiger
Add support for prebuilt Python 3.8 wheels (#347) @lgeiger
Upgrade to TF 2.2 (#258) @lgeiger

📖 Documentation

Update benchmark results (#359) @lgeiger
Reformat summaries in python.converter.py (#343) @leonoverweel
Update README with QuickNet Correction (#341) @jamescook106

👷‍♂️ Internal Improvements

Add 8x4 Arm64 kernel to improve performance (#350) @AdamHillier
Build wheels using AVX support (#353) @lgeiger
Use pre-installed bazelisk on CI (#354) @lgeiger
Upgrade ARM gcc compiler to 9.2.1 (#331) @lgeiger
Add TF 2.2 to converter test matrix (#348) @lgeiger
Setup possibility to build nightly wheels on CI (#346) @lgeiger
⬆️ lint dependencies (#342) @lgeiger

v0.2.1

20 Apr 15:41

🎉 Features

Add 8-bit quantization support to kernels (#327) @Tombana
Support reading and writing bitpacked activations in C++ kernels. (#305) @AdamHillier
Support weight scaling in converter (#326) @lgeiger
Add ImageNet evaluation tool from TFLite (#313) @AdamHillier
Add reference implementation of binary convolution. (#314) @arashb

📖 Documentation

Add new benchmark numbers (#339) @Tombana
Fix broken links (#335) @lgeiger
Fix typo (#334) @lgeiger
Correct top-1 accuracies according to #315 (#319) @lgeiger
Correct model accuracies (#315) @lgeiger
Fix docs link in contributing guide (#299) @lgeiger

👷‍♂️ Internal Improvements

Fix GTEST_FILTER on CI (#338) @Tombana
Add bitpack order argument to packbits_tensor (#337) @Tombana
Remove bitpacking dynamic memory allocations (#336) @Tombana
AArch64 BMLA: Use pairwise addition to accumulate popcount results (#332) @lgeiger
Simplify ARM compiler build files (#330) @lgeiger
Consistently use bazel test and --test_output=all on CI (#329) @lgeiger
Run AAR test outside the custom op container (#322) @lgeiger
Switch CI testing from mac to linux (#321) @lgeiger
fix dependency for ref. bconv (#320) @arashb
set bazel as default builder for android AAR (#318) @arashb
fix the header guard for packbit utils (#312) @arashb
Add MLIR status handler to log converter errors (#310) @lgeiger
Consolidate four vector register loads into one instruction. (#308) @AdamHillier
add header guard for bconv impl (#304) @arashb
Fix typos and standardise std::namespace usage. (#303) @AdamHillier
Support a dynamic bitpacking bitwidth. (#302) @AdamHillier
Don't use cache if authentication fails (#301) @lgeiger
Consolidate MLIR and End2End tests (#298) @lgeiger

v0.2.0

23 Mar 18:21

⚠️ Breaking Changes ⚠️

Rename op to LceBconv2d (a3f3838) @Tombana

🎉 Features

Fused activation (#267) @Tombana
Add support for one-padding (#252) @Tombana

🐛 Bug Fixes

Fix typo that caused 64-bit op on ARM32 (#295) @Tombana

📖 Documentation

update the benchmark results (#294) @arashb

👷‍♂️ Internal Improvements

Simplify ci builds (#297) @Tombana
Add preprocessor statement to fix warning (#296) @Tombana
Enable Bazel GCS remote cache (#287) @lgeiger
Use fastbuild instead of dbg for tests. (#288) @AdamHillier
Add +1 padding to the end2end test (#283) @lgeiger
(ci-skip) Simplify pip package publishing (#282) @lgeiger
Switch converter tests to be Python only (#280) @lgeiger
Add end2end test (#269) @Tombana
Merge some bconv test params (as the # of params is limited). (#277) @AdamHillier
Replace mul + add with shl + sub (#268) @lgeiger
remove armv7 android ABI from LCE Lite AAR (#274) @arashb
Cleanup LCE AAR build (#273) @lgeiger
Remove un-used variable, fix typo. (#272) @AdamHillier
Run ARM tests with debug builds. Closes #270. (#271) @AdamHillier
Reduce normal register loads (#263) @AdamHillier
Split fused multiply into backtransform and post multiply (#260) @Tombana
Simplify padding constraint (#264) @lgeiger
Run CI on PRs (including from forks). (#265) @AdamHillier
remove legacy reference impl. of LCE bconv op (#261) @arashb
Add core tests back to CI (#259) @arashb
Simplify is binary check (#253) @lgeiger
Cleanup MLIR tests (#255) @lgeiger
Support running lit tests on macOS without needing to install GNU find (#254) @lgeiger
Update BConv TableGen definition to match TFLite Op (#256) @lgeiger

v0.1.2

26 Feb 16:10

🎉 Features

build LCE compatible TensorFlow Lite android AAR target (#238) @arashb

🐛 Bug Fixes

Use vectors instead of tflite temp buffers (#247) @arashb
Bitpack weights only on the first run (#246) @Tombana

📖 Documentation

update the pixel phone benchmark results (#249) @arashb
Move docs to larq/docs (#239) @lgeiger
Improve converter API docs (#237) @lgeiger
fix links in second table (#236) @koenhelwegen
QuickNet .h5 links (#235) @jamescook106

👷‍♂️ Internal Improvements

Add tests for MLIR passes (#248) @lgeiger
remove legacy TF ops (#243) @arashb
remove LCE tflite legacy python API (#241) @arashb

v0.1.1

20 Feb 17:31

🐛 Bug Fixes

Build wheels for macOS 10.13 (#233) @lgeiger
Fix is-binary check (#230) @lgeiger

📖 Documentation

simplify the android quickstart guide (#231) @arashb

👷‍♂️ Internal Improvements

Update actions/checkout@v2 (#232) @lgeiger
