Py script doesn't utilize my gpu, how do I fix that? (linux) #670
It seems that you have not chosen a model to start with; you need to specify a model using the
This should work.
@Jerchongkong
I see that I am still plagued with the
Same problem with "slow_conv2d_cpu"; that basically means it is not using the GPU, from what I saw. I am specifying the GPU and it is still not being used.
Having the same issue...Nothing is saved in
In my case, I was using only one GPU, so the ID was 0, and the check statement does not account for that. I have fixed my issue and made a pull request here
@JuarezCulau I just made your suggested change (adding
I made this adjustment, and it SEEMS to be allowing my AMD GPU to do the work, but I'm running RIFE at the moment so I can't really test the results. All I know is that I started it, my GPU fans got REAL damn loud, and my driver crashed because RIFE was running at the same time, so I probably overloaded the GPU with too many simultaneous tasks. Not trying again until RIFE completes.

To @arianna30,
Sorry to hear that it didn't solve your problem, @arianaa30. I think I forgot to reply. I did notice that the case I opened didn't receive any response from anyone on the repository. Considering the number of open issues and pull requests, it seems this repository may be discontinued.

Regarding @Melechtna: it's been quite some time since I last used code from this repository, and my memory is a bit hazy, so please bear with me. I believe the main issue with using only one GPU with the code from the main branch is this: if you have only one GPU connected to your computer, its ID will be 0, which is expected. The purpose of that statement is to check whether the value is null, but the way it was implemented, it also returns false when the value is 0. It shouldn't pose a problem if you have multiple GPUs connected and you're using one with an ID other than 0. However, it will always return false if you're using a single GPU with ID 0, even if that GPU has torch.cuda available.

So, if the value of gpu_id is None for you, it might be a different issue, and you might be overriding a statement that should block the script's continuation. You mentioned you're using an AMD GPU, and CUDA is built for NVIDIA devices. While it might be usable through ROCm in your case, I'm not entirely sure how that works; my suggestion would be for you to look into this further.
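The truthiness bug described above can be sketched in a few lines. The function names below are illustrative, not the repository's actual code; the point is only that `if gpu_id:` in Python treats the integer 0 the same as None:

```python
# Hedged sketch of the gpu_id check discussed above. `device_is_selected_*`
# are made-up names for illustration, not functions from Real-ESRGAN.

def device_is_selected_buggy(gpu_id):
    # `if gpu_id:` treats 0 as "no GPU selected", because 0 is falsy in Python.
    return bool(gpu_id)

def device_is_selected_fixed(gpu_id):
    # Comparing explicitly against None accepts GPU ID 0 as a valid choice.
    return gpu_id is not None

print(device_is_selected_buggy(0))    # False -- single-GPU machines get skipped
print(device_is_selected_fixed(0))    # True  -- ID 0 now counts as a valid GPU
print(device_is_selected_fixed(None)) # False -- "no GPU given" is still rejected
```

This is why the problem only shows up on single-GPU machines: with multiple GPUs, a nonzero ID passes the buggy check by accident.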
The default in util.py is to set gpu_id to a value of None.

As for the ROCm implementation, it's kind of just slotted in alongside the NVIDIA one, but as far as I understand how torch handles this, as long as you've got opencv compiled with ROCm, that SHOULD be the correct statement. I've yet to have a chance to fully test this properly, as I'm still dealing with other tasks that would cause a GPU driver crash if I started a job right away.

However, with the way your code is formatted, it would likely skip the GPU attempt entirely if gpu_id is set to 0, as you've passed the argument != 0, meaning: if it doesn't equal 0, check for GPUs to render on. So either your interpretation is wrong, or that implementation is wrong.

Edit: Actually, looking over it a bit more closely, I think you've basically skipped the check entirely in all cases: gpu_id is None initially, because we haven't checked for what's available yet, so that value SHOULD be None; it's just a value that's filled in when the check is run. So I'm not even sure how that got it working for you in the first place.
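On the ROCm question raised above: PyTorch's ROCm builds reuse the CUDA device API, so device selection code written for NVIDIA generally works unchanged on AMD. A minimal sketch of a selection helper that honors gpu_id == 0 correctly, with `cuda_available` standing in for `torch.cuda.is_available()` so the logic can be shown without torch installed (the helper name is hypothetical, not from the repo):

```python
# Hedged sketch: `pick_device` is an illustrative helper, not repository code.
# On PyTorch ROCm builds, "cuda" device strings map to AMD GPUs, so the same
# logic covers both vendors (assumption noted in the lead-in).

def pick_device(gpu_id, cuda_available):
    """Return a device string, treating gpu_id == 0 as a valid selection."""
    if gpu_id is not None and cuda_available:
        return f"cuda:{gpu_id}"  # NVIDIA CUDA or AMD ROCm, depending on build
    return "cpu"                 # fall back to CPU kernels (slow_conv2d_cpu etc.)

print(pick_device(0, True))     # cuda:0
print(pick_device(None, True))  # cpu
print(pick_device(0, False))    # cpu
```

Note that an opencv build has no bearing on whether torch sees the GPU; it is the torch wheel itself that must be a ROCm build.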
@Melechtna Are you sure the GPU's ID is not modified in another part of the code? If I remember correctly, while troubleshooting, the ID was always 0 for me. I am not entirely sure I understand your line of thought on this, but I checked the code I sent in the pull request and noticed the error. Regarding "So either your interpretation is wrong, or that implementation is wrong": you are right about the implementation being wrong; it was a rookie mistake on my part, and I apologize for that. I believe I created a branch for that single modification and tested it outside of a controlled environment just to see if it worked. It should be fixed now, and you can take a look in the same case I opened before.
Unfortunately my ROCm method wasn't successful, and either didn't work at all or created an issue where the device never gets enumerated. As I said, it was a guess that I hadn't tested properly, based on dealing with opencv and some rudimentary research into PyTorch; it just took me a while to test it properly. As for the gpu_id, if it gets assigned a number later on in the code, fair enough; I didn't look through the whole thing, as my primary concern was trying to GPU-accelerate it with the AMD card I have. Not so simple as it was with RIFE, sadly. That implementation you adjusted does look far more sound, however, but unless I can figure out how to make it accept ROCm, it's not much use for me.
@Melechtna I'm sorry to hear that, but thanks for highlighting the error in the commit; it should be all good regarding that now.
I have followed the installation guide to a T. After looking at multiple issues, I have come to the conclusion that my GPU isn't being utilized. Every time I try to run the .py script, I get the result of
Input:
Output:
I am very puzzled as to why the .py script is no longer working, for it had worked for me in the past, although I have made a few changes to my PC since then, such as my kernel, graphics card, etc.
NOTE: I have tried the executable script, and it does work; however, I desire the .py script.
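To narrow down whether the .py script's environment can see the GPU at all, a quick diagnostic like the following can help; `gpu_report` is a hypothetical helper name, and the script simply reports what this particular Python environment can see (CPU-only fallback usually produces errors mentioning kernels like slow_conv2d_cpu):

```python
# Hedged diagnostic sketch: reports whether the current Python environment
# has a torch build that can see a CUDA/ROCm device. `gpu_report` is an
# illustrative name, not part of Real-ESRGAN.
import importlib.util

def gpu_report():
    # find_spec avoids an ImportError if torch isn't installed in this env.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch
    if not torch.cuda.is_available():
        return "torch installed, but no CUDA/ROCm device visible (CPU fallback)"
    return f"using {torch.cuda.get_device_name(0)}"

print(gpu_report())
```

A common cause of "it worked before, now it's CPU-only" after hardware or kernel changes is that the installed torch wheel is a CPU-only build, or the driver/toolkit no longer matches the new card.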
Another thing to consider, when I was installing, and ran:
python setup.py develop
It gave me the result of
My GPU is an Nvidia GeForce RTX 3090, whereas my prior GPU that I ran Real-ESRGAN on was an Nvidia GeForce RTX 2060.