
Feat (nn/sdpa): quantization of scaled dot-product attention #1090

Open
wants to merge 18 commits into base: dev
Conversation

nickfraser
Collaborator

@nickfraser nickfraser commented Nov 8, 2024

Reason for this PR

Make it easier for users to quantize attention layers.

Changes Made in this PR

Achieved by providing:

  • A modular equivalent of the torch.nn.functional.scaled_dot_product_attention function
  • A quantized version of this module
  • Code to convert between the three variants
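The modular equivalent can be sketched as follows. This is a minimal illustration based on the pseudocode in PyTorch's scaled_dot_product_attention documentation (which the PR says it adapts), not the PR's actual class; the class name and signature here are hypothetical.

```python
import math

import torch
import torch.nn as nn


class ScaledDotProductAttention(nn.Module):
    """Hypothetical modular equivalent of
    torch.nn.functional.scaled_dot_product_attention, following the
    pseudocode in the PyTorch documentation. Exposing attention as a
    module gives quantization hooks a place to attach."""

    def forward(self, query, key, value, attn_mask=None, scale=None):
        # Default scale is 1/sqrt(head_dim), as in the functional API.
        if scale is None:
            scale = 1.0 / math.sqrt(query.size(-1))
        # Scaled attention scores: (..., L, S)
        attn_weight = query @ key.transpose(-2, -1) * scale
        if attn_mask is not None:
            # Additive mask (e.g. -inf at disallowed positions).
            attn_weight = attn_weight + attn_mask
        attn_weight = torch.softmax(attn_weight, dim=-1)
        return attn_weight @ value
```

A quantized variant would wrap the intermediate tensors (scores, softmax output) in quantizers, and the conversion code would swap the functional call, this module, and the quantized module for one another.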

Testing Summary

Tests:

  • Layer replacement test in LLM entry-point
  • Basic accuracy test for OPT
  • Basic graph replacement test (covered by LLM entry-point test)
  • SDPA & Quant SDPA forward tests

Risk Highlight

Adapted from pseudocode in PyTorch's documentation. Otherwise, this change barely touches existing code, so it shouldn't break any existing Brevitas features.

  • This PR includes code from another work (please detail).
  • This PR contains API-breaking changes.
  • This PR depends on work in another PR (please provide links/details).
  • This PR introduces new dependencies (please detail).
  • There are coverage gaps not covered by tests.
  • Documentation updates required in subsequent PR.

Checklist

  • Code comments added to any hard-to-understand areas, if applicable.
  • Changes generate no new warnings.
  • Updated any relevant tests, if applicable.
  • No conflicts with destination dev branch.
  • I reviewed my own code changes.
  • Initial CI/CD passing.
  • 1+ reviews given, and any review issues addressed and approved.
  • Post-review full CI/CD passing.

@nickfraser nickfraser self-assigned this Nov 8, 2024
@nickfraser nickfraser added the next release PRs which should be merged for the next release label Nov 8, 2024
@nickfraser nickfraser marked this pull request as ready for review November 20, 2024 17:56
@nickfraser
Collaborator Author

We should merge #1088 before this.
