
Does it currently support distributed multi-card training? #158

Open
chenrui17 opened this issue Oct 27, 2022 · 3 comments

Comments

@chenrui17

Will it be supported in the future? Currently, single-card training takes too much time.

@L-Reichardt

L-Reichardt commented Feb 27, 2023

I got the model to run on multiple GPUs; however, the training script in this repo is for a single GPU.

With current versions of torch / spconv / CUDA the model trains much faster. I rewrote it here for that purpose (for a single GPU).

@nakatomo8899

How do I run models on multiple GPUs?

@L-Reichardt

L-Reichardt commented Jun 28, 2023

@nakatomo8899 I wrote my own Distributed Data Parallel (DDP) pipeline for this (not open source). I used a combination of Lei Mao's cookbook, PyTorch's tutorials, and well-documented repos such as Swin to do this.
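
For reference (this is not that pipeline, just a minimal sketch of the general DDP pattern those resources describe), a single-file PyTorch DDP training loop looks roughly like the following. The model and dataset here are throwaway placeholders; you would swap in this repo's network and dataloader. It is meant to be launched with `torchrun`, e.g. `torchrun --nproc_per_node=4 train_ddp.py`.

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def main():
    # torchrun sets LOCAL_RANK / RANK / WORLD_SIZE for every spawned process.
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)

    # Placeholder model and dataset; replace with the repo's model and dataloader.
    model = torch.nn.Linear(32, 4).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(1024, 32), torch.randint(0, 4, (1024,)))
    sampler = DistributedSampler(dataset)  # shards the data across ranks
    loader = DataLoader(dataset, batch_size=16, sampler=sampler, num_workers=2)

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
    criterion = torch.nn.CrossEntropyLoss()

    for epoch in range(10):
        sampler.set_epoch(epoch)  # different shuffle each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()  # DDP all-reduces gradients across GPUs here
            optimizer.step()
        if dist.get_rank() == 0:
            print(f"epoch {epoch} loss {loss.item():.4f}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

The effective batch size is `batch_size * WORLD_SIZE`, so you typically scale the learning rate accordingly when moving from one GPU to several.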
