Allow for matrix-matrix and matrix-vector products with KFACLinearOperator and KFACInverseLinearOperator without converting to numpy #91
Conversation
Hey @runame - I've reviewed your changes and they look great!
General suggestions:
- Consider refactoring input validation and device compatibility checks into separate methods to improve code readability and maintainability.
- Explore more efficient approaches for tensor concatenation to optimize performance.
- Enhance error messages with more specific details to aid in debugging and improve the developer experience.
Here's what I looked at during the review
- 🟡 General issues: 4 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Docstrings: all looks good
curvlinops/kfac.py (Outdated)

```python
if return_tensor:
    M_torch = cat([rearrange(M, "k ... -> (...) k") for M in M_torch], dim=0)
```
suggestion (performance): Consider the efficiency of tensor concatenation.
Concatenating tensors in a loop can be inefficient, especially for large numbers of tensors. It might be beneficial to explore alternative approaches that could reduce the computational overhead, such as preallocating a tensor of the correct size and filling it.
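As a sketch of this suggestion (hypothetical helper, not the actual curvlinops code; `stack_flattened` and its arguments are illustrative names), the flatten-and-concatenate step could instead preallocate the output tensor and copy each block into its slice:

```python
import torch

def stack_flattened(M_list, num_cols):
    """Equivalent to cat([rearrange(M, "k ... -> (...) k") for M in M_list], dim=0),
    but writes into a preallocated tensor instead of concatenating."""
    total_rows = sum(M.numel() // num_cols for M in M_list)
    out = torch.empty(
        total_rows, num_cols, dtype=M_list[0].dtype, device=M_list[0].device
    )
    offset = 0
    for M in M_list:
        rows = M.numel() // num_cols
        # flatten all trailing dims and move the leading `k` axis to the back,
        # matching rearrange(M, "k ... -> (...) k")
        out[offset : offset + rows] = M.reshape(num_cols, -1).T
        offset += rows
    return out
```

Whether this actually beats `torch.cat` would need benchmarking: `torch.cat` also allocates the result once internally, so the gain, if any, comes from avoiding intermediate rearranged tensors.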
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Pull Request Test Coverage Report for Build 8360156180
💛 - Coveralls
I made some refactoring proposals.
Minor things. After this round I believe we are ready to merge, maybe up to some more minor nits.
Found some more minor things, but looks great! One more thing to discuss is whether we want to switch to using cases in the tests of KFAC.
Looks ready to me!
Addresses #71, but only for the `KFACLinearOperator` and `KFACInverseLinearOperator`. `KFACLinearOperator`/`KFACInverseLinearOperator` is arguably the only linear operator here that is likely to be used for preconditioned-gradient methods for large-scale neural networks. Therefore, adding `torch_matmat` and `torch_matvec` methods seems like a simple solution to avoid unnecessary device transfers, which are a bottleneck for this use case. However, this doesn't address the issue in general.
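For context on why such products can stay in torch cheaply: a Kronecker-factored block never needs to be materialized. A minimal sketch (the helper `kron_matvec` is hypothetical, not the curvlinops API) of the standard row-major identity (A ⊗ B) vec(X) = vec(A X Bᵀ), which is the kind of computation a `torch_matvec` can perform entirely on-device:

```python
import torch

def kron_matvec(A, B, x):
    """Compute (A ⊗ B) @ x without materializing the Kronecker product.

    Uses the row-major identity (A ⊗ B) vec(X) = vec(A @ X @ B.T),
    where x = vec(X) and X has shape (A.shape[1], B.shape[1]).
    """
    X = x.reshape(A.shape[1], B.shape[1])
    return (A @ X @ B.T).reshape(-1)

# Sanity check against the explicit Kronecker product
A, B = torch.randn(3, 3), torch.randn(4, 4)
x = torch.randn(12)
assert torch.allclose(kron_matvec(A, B, x), torch.kron(A, B) @ x, atol=1e-5)
```

For n×n factors this costs O(n³) instead of the O(n⁴) of forming the n²×n² Kronecker product first, and it never leaves the tensors' device.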