[PyTorch] Reduce the CPU overheads of GroupedLinear
#1072
Merged: timmoon10 merged 10 commits into NVIDIA:main from yaox12:xiny/fused_multi_cast_transpose on Aug 9, 2024
Commits on Aug 8, 2024
- 3457e3c
  use fused_multi_cast_transpose
  Signed-off-by: Xin Yao <xiny@nvidia.com>
- 074563b
- 324815d
  [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
- 4e57d88
  allocate output tensors in C++
  Signed-off-by: Xin Yao <xiny@nvidia.com>
- ef31897
- 26dc2a3
- 3a9d2f3
- 63b55dd
  [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
- 2d4c80b
Commits on Aug 9, 2024
- 1bb41a1