No option to change FP8 status in graphed module after using "make_graphed_callables" #1207

Open
MaciejBalaNV opened this issue Sep 26, 2024 · 0 comments
MaciejBalaNV commented Sep 26, 2024

The function make_graphed_callables always overrides the module's forward function with a version wrapped in fp8_autocast. One issue with this approach is that once a module is wrapped, it can no longer be used without FP8, even when calling the non-graphed version. A quick pseudocode example:

import torch
import transformer_engine.pytorch as te

module = te.Linear(1028, 1028)
sample_args = (torch.randn(1028, 1028, device="cuda"),)
arg = sample_args[0]

# The _order argument makes it so that we can still use `module` as a
# non-graphed callable.
b = te.make_graphed_callables(module, sample_args, fp8_enabled=True, _order=[1, -1])

# At this point b(arg) will execute our module as a graph in FP8, which is fine
# and expected. We can also call module(arg) to use the non-graphed module in
# FP8. However, the following still executes module in FP8:
with te.fp8_autocast(enabled=False):
    module(arg)  # Still executed in FP8!

It's because of this line, which always executes the module with the FP8 status given at graph creation, ignoring the surrounding fp8_autocast context.

I'd like to see an option to run the non-graphed module without FP8 even after creating a CUDA graph with FP8 enabled.
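As a stopgap, one can keep a reference to the module's original forward before it is overridden and restore it when non-FP8 execution is needed. The sketch below illustrates the save/restore pattern with plain Python stand-ins (DummyModule and fake_make_graphed are hypothetical simplifications of te.Linear and make_graphed_callables, not TransformerEngine API):

```python
class DummyModule:
    """Stand-in for te.Linear; forward just tags its execution mode."""
    def forward(self, x):
        return ("no-fp8", x)

def fake_make_graphed(module):
    """Mimics make_graphed_callables permanently overriding module.forward
    with an FP8-wrapped version (hypothetical simplification)."""
    orig_forward = module.forward
    def fp8_forward(x):
        return ("fp8", orig_forward(x)[1])
    module.forward = fp8_forward
    return module

module = DummyModule()
original_forward = module.forward  # save a reference before graphing
fake_make_graphed(module)

assert module.forward(1) == ("fp8", 1)        # wrapped (graphed) path
assert original_forward(1) == ("no-fp8", 1)   # saved non-FP8 path still works

# Restore the original forward to run the module without FP8 again:
module.forward = original_forward
assert module.forward(1) == ("no-fp8", 1)
```

This works because the override is an instance-attribute monkey-patch, so the original bound method survives as long as a reference to it is kept; a proper fix inside make_graphed_callables would instead respect the enclosing fp8_autocast context.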

@ptrendx ptrendx added the bug Something isn't working label Sep 27, 2024