Py script doesn't utilize my gpu, how do I fix that? (linux) #670

Open

Bleeplo opened this issue Aug 6, 2023 · 12 comments

Comments

@Bleeplo
Bleeplo commented Aug 6, 2023

I have followed the installation guide to a T. After looking at multiple issues, I have concluded that my GPU isn't being utilized. Every time I try to run the .py script, I get the following result.
Input:

python3 inference_realesrgan.py -i "/home/phil/Downloads/image.jpg" -o "output/" --tile 200

Output:

/home/phil/.local/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/phil/.local/lib/python3.11/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Testing 0 image
Error "slow_conv2d_cpu" not implemented for 'Half'
        Tile 1/195
Traceback (most recent call last):
  File "/home/phil/Real-ESRGAN2/inference_realesrgan.py", line 166, in <module>
    main()
  File "/home/phil/Real-ESRGAN2/inference_realesrgan.py", line 147, in main
    output, _ = upsampler.enhance(img, outscale=args.outscale)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/phil/Real-ESRGAN2/realesrgan/utils.py", line 221, in enhance
    self.tile_process()
  File "/home/phil/Real-ESRGAN2/realesrgan/utils.py", line 179, in tile_process
    output_start_x:output_end_x] = output_tile[:, :, output_start_y_tile:output_end_y_tile,
                                   ^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'output_tile' where it is not associated with a value

I am very puzzled as to why the .py script is no longer working, since it worked for me in the past. However, I have made a few changes to my PC since then, such as my kernel and graphics card.
NOTE: I have tried the executable script, and it does work; however, I would prefer to use the .py script.
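
For reference, a quick way to check whether the installed PyTorch build can actually see the GPU (assuming a standard pip install; the libc10_cuda.so warning above usually means the torch build in use has no CUDA libraries even though torchvision expects them) is something like:

import torch
print(torch.__version__)            # a CPU-only wheel reports e.g. "2.0.1+cpu" rather than "+cu118"
print(torch.cuda.is_available())    # False means the script will silently fall back to the CPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))

If is_available() returns False here, the "slow_conv2d_cpu" error is expected: the inference script defaults to half precision (fp16), and PyTorch's CPU convolution kernels don't support it.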

Another thing to consider: during installation, when I ran python setup.py develop, it gave me the following result.

fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
  File "/home/phil/Real-ESRGAN2/setup.py", line 84, in <module>
    setup(
  File "/usr/lib/python3.11/site-packages/setuptools/__init__.py", line 106, in setup
    _install_setup_requires(attrs)
  File "/usr/lib/python3.11/site-packages/setuptools/__init__.py", line 74, in _install_setup_requires
    dist = MinimalDistribution(attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/setuptools/__init__.py", line 56, in __init__
    super().__init__(filtered)
  File "/usr/lib/python3.11/site-packages/setuptools/dist.py", line 484, in __init__
    for ep in metadata.entry_points(group='distutils.setup_keywords'):
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 1040, in entry_points
    return SelectableGroups.load(eps).select(**params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 476, in load
    ordered = sorted(eps, key=by_group)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 1037, in <genexpr>
    eps = itertools.chain.from_iterable(
                                       ^
  File "/usr/lib/python3.11/importlib/metadata/_itertools.py", line 16, in unique_everseen
    k = key(element)
        ^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 954, in _normalized_name
    or super()._normalized_name
       ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 627, in _normalized_name
    return Prepared.normalize(self.name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 882, in normalize
    return re.sub(r"[-_.]+", "-", name).lower().replace('-', '_')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/re/__init__.py", line 185, in sub
    return _compile(pattern, flags).sub(repl, string, count)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'NoneType'

My GPU is an Nvidia GeForce RTX 3090, whereas the prior GPU that I ran Real-ESRGAN on was an Nvidia GeForce RTX 2060.
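
Side note on that setup.py traceback: Real-ESRGAN's setup.py tries to read the current git commit for its version string (hence the "fatal: not a git repository" line), so it expects to be run from a git clone rather than an unpacked archive. The usual install flow from the project README is roughly:

git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install basicsr facexlib gfpgan
pip install -r requirements.txt
python setup.py develop

The TypeError at the end of that traceback is raised by setuptools while scanning installed package metadata, which usually points at a broken or half-removed package in site-packages rather than at Real-ESRGAN itself.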

@Jerchongkong

It seems that you have not chosen a model to start with; you need to specify a model using the -n flag:

python -i "/home/phil/Downloads/image.jpg" -n realesr-animevideov3 -s 4 -o "output/"

This should work.

@Bleeplo

Bleeplo commented Aug 15, 2023

@Jerchongkong
When I tried your suggestion with the inclusion of the Python script name (inference_realesrgan.py), I got the following result.
Input: python inference_realesrgan.py -i "/home/phil/Downloads/image.jpg" -n realesr-animevideov3 -s 4 -o "output/"
Output:

/home/phil/.local/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/phil/.local/lib/python3.11/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Testing 0 image
Error "slow_conv2d_cpu" not implemented for 'Half'
If you encounter CUDA out of memory, try to set --tile with a smaller number.

I see that I am still plagued by the slow_conv2d_cpu error. Furthermore, when I try adding a --tile parameter, I get a similar error to the ones before.
Input: python inference_realesrgan.py -i "/home/phil/Downloads/image.jpg" -n realesr-animevideov3 -s 4 -o "output/" --tile 200
Output:

/home/phil/.local/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/phil/.local/lib/python3.11/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Testing 0 image
Error "slow_conv2d_cpu" not implemented for 'Half'
        Tile 1/16
Traceback (most recent call last):
  File "/home/phil/Real-ESRGAN/inference_realesrgan.py", line 166, in <module>
    main()
  File "/home/phil/Real-ESRGAN/inference_realesrgan.py", line 147, in main
    output, _ = upsampler.enhance(img, outscale=args.outscale)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/phil/Real-ESRGAN/realesrgan/utils.py", line 221, in enhance
    self.tile_process()
  File "/home/phil/Real-ESRGAN/realesrgan/utils.py", line 179, in tile_process
    output_start_x:output_end_x] = output_tile[:, :, output_start_y_tile:output_end_y_tile,
                                   ^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'output_tile' where it is not associated with a value
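
The "slow_conv2d_cpu" not implemented for 'Half' line means a half-precision (fp16) model ended up running on the CPU, where fp16 convolutions aren't implemented; the UnboundLocalError that follows is just a side effect of that failed tile. As a stop-gap while the CUDA install is being sorted out, the inference script (in recent checkouts) has a --fp32 flag that forces full precision so that CPU inference at least completes, for example:

python inference_realesrgan.py -i "/home/phil/Downloads/image.jpg" -n realesr-animevideov3 -s 4 -o "output/" --fp32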

@JuarezCulau

Same problem with "slow_conv2d_cpu"; from what I saw, that basically means the GPU is not being used. I am specifying the GPU and it is still not being used.

@arianaa30

Having the same issue... Nothing is saved in the results directory:
python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs -s 4 -o results --face_enhance

Testing 6 children-alpha
Error "slow_conv2d_cpu" not implemented for 'Half'
If you encounter CUDA out of memory, try to set --tile with a smaller number.

@JuarezCulau

JuarezCulau commented Oct 24, 2023

In my case I was using only one GPU, so the ID was 0, and the check statement does not account for that. I have fixed my issue and made a pull request here:
#679

@arianaa30

@JuarezCulau I just made your suggested change (adding and gpu_id != 0), but it didn't fix the issue for me.

@Melechtna

Melechtna commented Jan 10, 2024

Disregard; this part is wrong:

I made this adjustment, and it SEEMS to be allowing my AMD GPU to do the work, but I'm running RIFE at the moment so I can't really test the results. All I know is that I started it, my GPU fans got REAL damn loud, and my driver crashed because RIFE was running at the same time, so I probably overloaded the GPU with too many simultaneous tasks; I'm not trying again until RIFE completes.

To @arianaa30: this was also wrong, due to a misunderstanding that has since been resolved.

@JuarezCulau

Sorry to hear that it didn't solve your problem, @arianaa30. I think I forgot to reply. I did notice that the case I opened didn't receive any response from anyone on the repository. Considering the number of open issues and pull requests, it seems this repository may be discontinued.

Regarding @Melechtna: it's been quite some time since I last used code from this repository, and my memory is a bit hazy, so please bear with me. I believe the main issue with using a single GPU with the code from the main branch is that if you have only one GPU connected to your computer, its ID will be 0, which is expected. The purpose of that statement is to check whether the value is null, but the way it was implemented previously, it also evaluates to false when the value is 0. That isn't a problem if you have multiple GPUs connected and you're using one with an ID other than 0, but it will always evaluate to false if you're using a single GPU with ID 0, even when torch.cuda is available for that GPU. So if the value of gpu_id is None for you, it might be another issue, and you might be overriding a statement that should block the script's continuation.

You mentioned you're using an AMD GPU, and CUDA is built for NVIDIA devices. While it might be usable through ROCm in your case, I'm not entirely sure how that works. My suggestion would be to look into this further.
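
To illustrate the point with a generic sketch (not the exact code from utils.py): in Python, 0 is falsy, so a plain truthiness check on gpu_id treats "first GPU" the same as "no ID given", which is why a single-GPU machine with ID 0 ends up in the fallback branch:

import torch

gpu_id = 0  # typical ID on a machine with a single GPU

# Truthiness check: skips the explicit-GPU branch for gpu_id == 0 as well as for None
if gpu_id:
    device = torch.device(f'cuda:{gpu_id}')
else:
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Explicit None check: only falls back when no ID was given at all
if gpu_id is not None:
    device = torch.device(f'cuda:{gpu_id}' if torch.cuda.is_available() else 'cpu')
else:
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

print(device)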

@Melechtna

Melechtna commented Jan 10, 2024

The default in utils.py is to set gpu_id to a value of None:

https://github.com/xinntao/Real-ESRGAN/blob/5ca1078535923d485892caee7d7804380bfc87fd/realesrgan/utils.py#L39C31-L39C31

As for the ROCm implementation, it's kind of just slotted in alongside the Nvidia one, but as far as I understand how torch handles this, as long as you've got OpenCV compiled with ROCm, that SHOULD be the correct statement. I've yet to have a chance to test this properly, as I'm still dealing with other tasks that would cause a GPU driver crash should I start a job right away. However, with the way your code is formatted, it would likely skip the GPU attempt entirely if the ID is set to 0, as you've passed the argument != 0, meaning: if it doesn't equal 0, check for GPUs to render on. So either your interpretation is wrong, or that implementation is wrong.

Edit: Actually, looking over it a bit more closely, I think you've basically skipped the check entirely in all cases, as gpu_id is None initially because we haven't checked what's available; that value SHOULD be None, since it's just a value that's filled in when the check is run. So I'm not even sure how that got it working for you in the first place.

@JuarezCulau

@Melechtna Are you sure the GPU's ID is not modified in another part of the code? If I remember correctly, while troubleshooting, the ID was always 0 for me. I am not entirely sure I understand your line of thought on this, but I checked the code I sent in the pull request and noticed the error. Regarding "So either your interpretation is wrong, or that implementation is wrong": you are right about the implementation being wrong; it was a rookie mistake on my part, and I apologize for that. I believe I created a branch for that single modification and tested it outside of a controlled environment just to see if it worked. It should be fixed now, and you can take a look at the same case I opened before.

@Melechtna

Melechtna commented Jan 10, 2024

Unfortunately, my ROCm method wasn't successful; it either didn't work at all or created an issue where the device never gets enumerated. As I said, it was a guess I hadn't tested properly, based on dealing with OpenCV and some rudimentary research into PyTorch; it just took me a while to test it. As for gpu_id, if it gets assigned a number later on in the code, fair enough; I didn't look through the whole thing, as my primary concern was trying to GPU-accelerate it with the AMD card I have, which sadly is not as simple as it was with RIFE. The implementation you adjusted does look far more sound, however, but unless I can figure out how to make it accept ROCm, it's not much use for me.
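
For anyone else trying this on AMD: ROCm builds of PyTorch expose the GPU through the same torch.cuda API, so a quick check like the following (assuming a ROCm wheel of torch is installed) shows whether the HIP device is visible at all:

import torch
print(torch.__version__)              # ROCm wheels typically report something like "2.x.x+rocmX.Y"
print(torch.cuda.is_available())      # True on a working ROCm install, despite the "cuda" name
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))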

@JuarezCulau

@Melechtna I'm sorry to hear that, but thanks for highlighting the error in the commit; it should all be good regarding that now.
