Feat (gpfq): separate float and quant forward pass for speedup #955

Closed


fabianandresgrob
Contributor

This PR speeds up GPFQ by running separate forward passes for the float and the quantized inputs. Instead of offloading the float inputs to disk, I store them under a GPFQ attribute and move them off the GPU as soon as they are collected.
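For readers unfamiliar with the pattern, here is a minimal sketch of the idea in plain PyTorch. It is not the code in this PR and does not use the Brevitas API: `FloatInputCollector`, the toy `nn.Sequential` model, and the weight rounding used as a stand-in for quantization are all hypothetical names and simplifications chosen for illustration.

```python
import torch
import torch.nn as nn


class FloatInputCollector:
    """Hypothetical helper: caches a layer's float inputs on the CPU."""

    def __init__(self, layer: nn.Module):
        self.layer = layer
        self.float_inputs = []  # kept in memory as an attribute, not on disk
        self._handle = None

    def _hook(self, module, args):
        # Detach and move to CPU so cached activations do not hold GPU memory.
        self.float_inputs.append(args[0].detach().cpu())

    def __enter__(self):
        self._handle = self.layer.register_forward_pre_hook(self._hook)
        return self

    def __exit__(self, *exc):
        self._handle.remove()


device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4)).to(device)
batches = [torch.randn(2, 8) for _ in range(3)]

# Pass 1: float forward pass, caching the inputs of the last layer.
with torch.no_grad(), FloatInputCollector(model[2]) as collector:
    for x in batches:
        model(x.to(device))

# Crude stand-in for quantizing an earlier layer (assumption, not real GPFQ).
with torch.no_grad():
    model[0].weight.copy_(torch.round(model[0].weight * 16) / 16)

# Pass 2: "quantized" forward pass. Each cached float input is moved back
# to the device only when it is compared against the quantized input.
errors = []
with torch.no_grad():
    for x, float_inp in zip(batches, collector.float_inputs):
        quant_inp = model[1](model[0](x.to(device)))
        errors.append((float_inp.to(device) - quant_inp).abs().max().item())
```

Keeping the cached tensors on the CPU trades host memory for the cost of disk I/O, which is the trade-off described above; whether it pays off depends on how much host RAM the calibration set needs.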

@fabianandresgrob fabianandresgrob marked this pull request as ready for review May 31, 2024 11:27
@fabianandresgrob fabianandresgrob force-pushed the gpfq_offload_float_acts branch 2 times, most recently from bece0c6 to 9d886c0 Compare June 3, 2024 10:12