Interested in Developing MPS Support #972
Replies: 9 comments 1 reply
-
Hi,
-
Thanks for looping others into the discussion, @danthe3rd. As it stands, I don't mind leading the charge on my end. New features have a nice effect of bringing in specific people to test and contribute, so the goal wouldn't be to offload the maintenance and development onto you all. We can start with simple kernels for inference only and go from there.
-
What kernels do you have in mind for now?
-
Building out matmul, spmm, sddmm, and sparse softmax would likely be the best first step. Once those are developed and tested, I'll probably have a better idea of next steps. The overarching goal of this phase is to have the entire attention mechanism implemented in MPS kernels (forward only). A rough sketch of how those pieces compose follows.
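To make that concrete, here is a minimal forward-only reference in plain PyTorch, assuming dense tensors and illustrative names throughout. It shows the target behavior the MPS kernels would need to reproduce, with comments marking where the sparse kernels (sddmm, sparse softmax, spmm) would slot in; it is a sketch, not proposed kernel code.

```python
import math
import torch

def attention_forward_mps(q, k, v):
    # q, k, v: (batch, heads, seq_len, head_dim) tensors on the "mps" device
    scale = 1.0 / math.sqrt(q.shape[-1])
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale  # dense stand-in for sddmm
    probs = torch.softmax(scores, dim=-1)                   # dense stand-in for sparse softmax
    return torch.matmul(probs, v)                           # dense stand-in for spmm

if torch.backends.mps.is_available():
    q, k, v = (torch.randn(1, 8, 128, 64, device="mps") for _ in range(3))
    print(attention_forward_mps(q, k, v).shape)  # torch.Size([1, 8, 128, 64])
```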
-
Would it make sense to develop that as part of xFormers, though? Or maybe in a dedicated repo that we could later include as a third-party dependency in xFormers when it's ready for release.
-
Originally, I wanted to add MPS support to audiocraft, which uses xFormers. The hacky solution for getting MPS to run on the GPU is to ignore certain environment variables and bypass xFormers entirely, which is less than ideal (a sketch of what such a fallback can look like is below). I created this discussion so that MPS support can be native and thereby trickle out to other projects that use xFormers, audiocraft being one such example. The core idea is that researchers don't need to worry about the support, and users can run models on-device when the paper/repo is released. So I do think native support would be ideal. Let me know what you think.
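For reference, one way such a fallback shim could look is to route attention through PyTorch's built-in kernel on MPS and through xFormers everywhere else. This is a sketch, not audiocraft's actual workaround; the function name and layout handling are illustrative.

```python
import torch
import torch.nn.functional as F

def attention(q, k, v):
    # q, k, v: (batch, seq_len, heads, head_dim), the layout xformers expects
    if q.device.type == "mps":
        # torch's built-in kernel expects (batch, heads, seq_len, head_dim)
        q, k, v = (t.transpose(1, 2) for t in (q, k, v))
        return F.scaled_dot_product_attention(q, k, v).transpose(1, 2)
    from xformers.ops import memory_efficient_attention
    return memory_efficient_attention(q, k, v)
```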
-
cc @JadeCopet for audiocraft
-
Thanks - I'd love to hear Jade's thoughts on this too. I hope my reasoning is clear: I aim not to throw a wrench in everyone's development, but rather to work on what I believe is a natural extension of the repo. I do think that any project leveraging xFormers should have MPS support for inference at the bare minimum, hence starting the discussion here at the root. Thanks for your insight.
-
@theadamsabra can you share your "hacky solution"? I'd also like to see xformers and audiocraft running on MPS. I'm trying to build and run the new metavoice model locally (https://github.com/metavoiceio/metavoice-src) to give an old friend their voice back (they lost it due to a stroke). @danthe3rd I'd like to second the suggestion that "any project leveraging xFormers should have MPS support for inference at the bare minimum". +1000 thank-yous for maintaining this project.
-
I've recently been interested in enabling MPS support within this package and wanted to know what the first steps for contribution would be. Because there are custom CUDA kernels within xformers, I assume we would need a similar implementation of MPS kernels as well.
My initial thought is to implement MPS kernels for xformers modules to support inference/forward passes only. That way users can test things out, we can iterate quickly on performance, and we can gauge the need for MPS support in training (which could grow over time as LLM enthusiasts prefer local inference on their own data). A sketch of what a first step could look like is below.
Looking forward to hearing from you all on how we can begin something like this.
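As one possible concrete starting point, custom ops could be registered with an MPS dispatch via torch.library, beginning with composable-op reference implementations and swapping in hand-written Metal kernels later without changing call sites. This is a sketch under assumptions: the namespace, op name, and placeholder body are all hypothetical, not existing xformers code.

```python
import torch

# Hypothetical namespace and op; not part of xformers today.
lib = torch.library.Library("xformers_mps", "DEF")
lib.define("sparse_softmax(Tensor scores) -> Tensor")

def sparse_softmax_mps(scores):
    # Placeholder: a dense softmax keeps the op callable end-to-end
    # on MPS while a real sparse kernel is developed.
    return torch.softmax(scores, dim=-1)

# Register the function as the implementation for the MPS dispatch key.
lib.impl("sparse_softmax", sparse_softmax_mps, "MPS")

if torch.backends.mps.is_available():
    s = torch.randn(4, 4, device="mps")
    out = torch.ops.xformers_mps.sparse_softmax(s)
```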