Fix (scaling)!: clamp to avoid inf/nan in forward/backward #1097

Giuseppe5 · 2024-11-18T16:03:24Z

Reason for this PR

Restricting the scale with a log op can produce inf/nan in the forward pass and/or in the gradients.

Changes Made in this PR

Clamping the scale before the restriction (new behaviour) as well as after it (existing behaviour) helps avoid that.
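A minimal pure-Python sketch of the idea, for illustration only. It is not the Brevitas implementation: `MIN_SCALE`, `clamp_min`, `to_log_domain` and `to_real_domain` are hypothetical names, and the floor value is an arbitrary assumption. It shows why a clamp is useful on both sides of a log/exp restriction: the log is undefined at 0 and its derivative 1/(s ln 2) grows without bound as s approaches 0, while the exponential can underflow to exactly 0.

```python
import math

MIN_SCALE = 1e-8  # hypothetical lower bound, chosen only for this sketch


def clamp_min(x: float, lo: float = MIN_SCALE) -> float:
    """Clamp x from below so it never reaches 0."""
    return max(x, lo)


def to_log_domain(scale: float) -> float:
    """Clamp BEFORE the log so the log never sees 0."""
    return math.log2(clamp_min(scale))


def to_real_domain(log_scale: float) -> float:
    """Clamp AFTER the exponential so underflow cannot yield a 0 scale."""
    return clamp_min(2.0 ** log_scale)


# Without the pre-clamp, math.log2(0.0) raises ValueError in pure Python
# (tensor libraries typically return -inf instead).
safe_log = to_log_domain(0.0)

# 2.0 ** -2000.0 underflows to 0.0; the post-clamp keeps the scale positive.
safe_scale = to_real_domain(-2000.0)
```

In this sketch the pre-clamp protects the log and the post-clamp protects whatever later divides by the scale.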

Testing Summary

Added tests for standalone and runtime scaling that check for inf/nan in the forward and backward passes.

Risk Highlight

This is a breaking change. Old checkpoints will still work, but new QAT/PTQ runs could produce different results than before.

This PR includes code from another work (please detail).
This PR contains API-breaking changes.
This PR depends on work in another PR (please provide links/details).
This PR introduces new dependencies (please detail).
There are coverage gaps not covered by tests.
Documentation updates required in subsequent PR.

Checklist

Code comments added to any hard-to-understand areas, if applicable.
Changes generate no new warnings.
Updated any relevant tests, if applicable.
No conflicts with destination dev branch.
I reviewed my own code changes.
Initial CI/CD passing.
1+ reviews given, and any review issues addressed and approved.
Post-review full CI/CD passing.

Giuseppe5 · 2024-11-19T10:07:15Z

src/brevitas/core/scaling/standalone.py

                out = self.restrict_scaling_pre(out)
            else:
                out = self.value
            threshold = self.restrict_threshold(self.restrict_threshold_pre(threshold))
-            out = self.clamp_scaling(self.restrict_scaling(out))


I believe self.clamp_scaling is not necessary here because we're already in log domain, and self.restrict_scaling is responsible only for rounding and potential conversion from log to real domain (i.e., exponential).

Clamping before restrict_val would mean we can't reach negative values in the log domain, and therefore can't represent scaling values between 0 and 1.
@nickfraser

Sounds right to me!
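One reading of the point above, sketched in pure Python under stated assumptions: `MIN_VAL`, `clamp_then_exp` and `exp_then_clamp` are hypothetical names, the base-2 exponential is an assumption, and none of this is the Brevitas code. A scale between 0 and 1 has a negative logarithm, so clamping a log-domain value to a small positive floor before exponentiating pushes every such scale up to roughly 1.

```python
import math

MIN_VAL = 1e-8  # hypothetical positive clamp floor, for illustration only


def clamp_then_exp(log_scale: float) -> float:
    """Clamp in the log domain, then convert to the real domain."""
    return 2.0 ** max(log_scale, MIN_VAL)


def exp_then_clamp(log_scale: float) -> float:
    """Convert to the real domain first, then clamp the real-valued scale."""
    return max(2.0 ** log_scale, MIN_VAL)


log_scale = math.log2(0.25)           # a scale of 0.25 is -2.0 in the log domain

lost = clamp_then_exp(log_scale)      # negative log value is clamped away
kept = exp_then_clamp(log_scale)      # scale below 1 survives
```

Here `kept` recovers the original scale, while `lost` can never fall below 1, which is why a clamp with a positive floor only makes sense on real-domain values.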

Giuseppe5 added 3 commits November 18, 2024 14:18

Fix (core/scaling): clamping before restrict_val

fb514ba

Test (core/scaling): add test for inf/nan restricted scale

d9fff9a

Fix

c399e9b

Giuseppe5 requested a review from nickfraser November 19, 2024 09:38

fix test values

170ef90

Giuseppe5 requested review from nickfraser and removed request for nickfraser November 19, 2024 13:03

More fixes to tests

863a00c

Giuseppe5 requested review from nickfraser and removed request for nickfraser November 19, 2024 14:21

Giuseppe5 merged commit abf4a40 into Xilinx:dev Nov 19, 2024
368 of 374 checks passed
