Py script doesn't utilize my gpu, how do I fix that? (linux) #670

Open

Bleeplo opened this issue Aug 6, 2023 · 12 comments

Comments

@Bleeplo
Bleeplo commented Aug 6, 2023

I have followed the installation guide to a T. After looking at multiple issues, I have concluded that my GPU isn't being utilized. Every time I try to run the .py script, I get the following result.
Input:

python3 inference_realesrgan.py -i "/home/phil/Downloads/image.jpg" -o "output/" --tile 200

Output:

/home/phil/.local/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/phil/.local/lib/python3.11/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Testing 0 image
Error "slow_conv2d_cpu" not implemented for 'Half'
        Tile 1/195
Traceback (most recent call last):
  File "/home/phil/Real-ESRGAN2/inference_realesrgan.py", line 166, in <module>
    main()
  File "/home/phil/Real-ESRGAN2/inference_realesrgan.py", line 147, in main
    output, _ = upsampler.enhance(img, outscale=args.outscale)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/phil/Real-ESRGAN2/realesrgan/utils.py", line 221, in enhance
    self.tile_process()
  File "/home/phil/Real-ESRGAN2/realesrgan/utils.py", line 179, in tile_process
    output_start_x:output_end_x] = output_tile[:, :, output_start_y_tile:output_end_y_tile,
                                   ^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'output_tile' where it is not associated with a value

I am very puzzled as to why the .py script is no longer working, since it worked for me in the past. However, I have made a few changes to my PC since then, such as my kernel and graphics card.
NOTE: I have tried the executable script, and it does work; however, I would prefer to use the .py script.
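
For reference, a quick way to check whether the installed PyTorch build can actually see the GPU (assuming a standard pip install; the libc10_cuda.so warning above usually means the torch build in use has no CUDA libraries even though torchvision expects them) is something like:

import torch
print(torch.__version__)            # a CPU-only wheel reports e.g. "2.0.1+cpu" rather than "+cu118"
print(torch.cuda.is_available())    # False means the script will silently fall back to the CPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))

If is_available() returns False here, the "slow_conv2d_cpu" error is expected: the inference script defaults to half precision (fp16), and PyTorch's CPU convolution kernels don't support it.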

Another thing to consider: during installation, when I ran python setup.py develop, it gave me the following result.

fatal: not a git repository (or any of the parent directories): .git
Traceback (most recent call last):
  File "/home/phil/Real-ESRGAN2/setup.py", line 84, in <module>
    setup(
  File "/usr/lib/python3.11/site-packages/setuptools/__init__.py", line 106, in setup
    _install_setup_requires(attrs)
  File "/usr/lib/python3.11/site-packages/setuptools/__init__.py", line 74, in _install_setup_requires
    dist = MinimalDistribution(attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/setuptools/__init__.py", line 56, in __init__
    super().__init__(filtered)
  File "/usr/lib/python3.11/site-packages/setuptools/dist.py", line 484, in __init__
    for ep in metadata.entry_points(group='distutils.setup_keywords'):
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 1040, in entry_points
    return SelectableGroups.load(eps).select(**params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 476, in load
    ordered = sorted(eps, key=by_group)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 1037, in <genexpr>
    eps = itertools.chain.from_iterable(
                                       ^
  File "/usr/lib/python3.11/importlib/metadata/_itertools.py", line 16, in unique_everseen
    k = key(element)
        ^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 954, in _normalized_name
    or super()._normalized_name
       ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 627, in _normalized_name
    return Prepared.normalize(self.name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/metadata/__init__.py", line 882, in normalize
    return re.sub(r"[-_.]+", "-", name).lower().replace('-', '_')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/re/__init__.py", line 185, in sub
    return _compile(pattern, flags).sub(repl, string, count)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'NoneType'

My GPU is an Nvidia GeForce RTX 3090, whereas the prior GPU that I ran Real-ESRGAN on was an Nvidia GeForce RTX 2060.
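
Side note on that setup.py traceback: Real-ESRGAN's setup.py tries to read the current git commit for its version string (hence the "fatal: not a git repository" line), so it expects to be run from a git clone rather than an unpacked archive. The usual install flow from the project README is roughly:

git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
pip install basicsr facexlib gfpgan
pip install -r requirements.txt
python setup.py develop

The TypeError at the end of that traceback is raised by setuptools while scanning installed package metadata, which usually points at a broken or half-removed package in site-packages rather than at Real-ESRGAN itself.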

@Jerchongkong

It seems that you have not chosen a model to start with; you need to specify a model using the -n flag:

python -i "/home/phil/Downloads/image.jpg" -n realesr-animevideov3 -s 4 -o "output/"

This should work.

@Bleeplo

Bleeplo commented Aug 15, 2023

@Jerchongkong
When I tried your suggestion with the inclusion of the Python script name (inference_realesrgan.py), I got the following result.
Input: python inference_realesrgan.py -i "/home/phil/Downloads/image.jpg" -n realesr-animevideov3 -s 4 -o "output/"
Output:

/home/phil/.local/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/phil/.local/lib/python3.11/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Testing 0 image
Error "slow_conv2d_cpu" not implemented for 'Half'
If you encounter CUDA out of memory, try to set --tile with a smaller number.

I see that I am still plagued by the slow_conv2d_cpu error. Furthermore, when I try adding a --tile parameter, I get a similar error to the ones before.
Input: python inference_realesrgan.py -i "/home/phil/Downloads/image.jpg" -n realesr-animevideov3 -s 4 -o "output/" --tile 200
Output:

/home/phil/.local/lib/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/home/phil/.local/lib/python3.11/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Testing 0 image
Error "slow_conv2d_cpu" not implemented for 'Half'
        Tile 1/16
Traceback (most recent call last):
  File "/home/phil/Real-ESRGAN/inference_realesrgan.py", line 166, in <module>
    main()
  File "/home/phil/Real-ESRGAN/inference_realesrgan.py", line 147, in main
    output, _ = upsampler.enhance(img, outscale=args.outscale)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/phil/Real-ESRGAN/realesrgan/utils.py", line 221, in enhance
    self.tile_process()
  File "/home/phil/Real-ESRGAN/realesrgan/utils.py", line 179, in tile_process
    output_start_x:output_end_x] = output_tile[:, :, output_start_y_tile:output_end_y_tile,
                                   ^^^^^^^^^^^
UnboundLocalError: cannot access local variable 'output_tile' where it is not associated with a value
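
The "slow_conv2d_cpu" not implemented for 'Half' line means a half-precision (fp16) model ended up running on the CPU, where fp16 convolutions aren't implemented; the UnboundLocalError that follows is just a side effect of that failed tile. As a stop-gap while the CUDA install is being sorted out, the inference script (in recent checkouts) has a --fp32 flag that forces full precision so that CPU inference at least completes, for example:

python inference_realesrgan.py -i "/home/phil/Downloads/image.jpg" -n realesr-animevideov3 -s 4 -o "output/" --fp32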

@JuarezCulau

Same problem with "slow_conv2d_cpu"; from what I saw, that basically means the GPU is not being used. I am specifying the GPU and it is still not being used.

@arianaa30

Having the same issue... Nothing is saved in the results directory:
python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs -s 4 -o results --face_enhance

Testing 6 children-alpha
Error "slow_conv2d_cpu" not implemented for 'Half'
If you encounter CUDA out of memory, try to set --tile with a smaller number.

@JuarezCulau

JuarezCulau commented Oct 24, 2023

In my case I was using only one GPU, so the ID was 0, and the check statement does not account for that. I have fixed my issue and made a pull request here:
#679

@arianaa30

@JuarezCulau I just made your suggested change (adding and gpu_id != 0), but it didn't fix the issue for me.

@Melechtna

Melechtna commented Jan 10, 2024

Disregard; this part is wrong:

I made this adjustment, and it SEEMS to be allowing my AMD GPU to do the work, but I'm running RIFE at the moment so I can't really test the results. All I know is that I started it, my GPU fans got REAL damn loud, and my driver crashed because RIFE was running at the same time, so I probably overloaded the GPU with too many simultaneous tasks; I'm not trying again until RIFE completes.

To @arianaa30: this was also wrong, due to a misunderstanding that has since been resolved.

@JuarezCulau

Sorry to hear that it didn't solve your problem, @arianaa30. I think I forgot to reply. I did notice that the case I opened didn't receive any response from anyone on the repository. Considering the number of open issues and pull requests, it seems this repository may be discontinued.

Regarding @Melechtna: it's been quite some time since I last used code from this repository, and my memory is a bit hazy, so please bear with me. I believe the main issue with using a single GPU with the code from the main branch is that if you have only one GPU connected to your computer, its ID will be 0, which is expected. The purpose of that statement is to check whether the value is null, but the way it was implemented previously, it also evaluates to false when the value is 0. That isn't a problem if you have multiple GPUs connected and you're using one with an ID other than 0, but it will always evaluate to false if you're using a single GPU with ID 0, even when torch.cuda is available for that GPU. So if the value of gpu_id is None for you, it might be another issue, and you might be overriding a statement that should block the script's continuation.

You mentioned you're using an AMD GPU, and CUDA is built for NVIDIA devices. While it might be usable through ROCm in your case, I'm not entirely sure how that works. My suggestion would be to look into this further.
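
To illustrate the point with a generic sketch (not the exact code from utils.py): in Python, 0 is falsy, so a plain truthiness check on gpu_id treats "first GPU" the same as "no ID given", which is why a single-GPU machine with ID 0 ends up in the fallback branch:

import torch

gpu_id = 0  # typical ID on a machine with a single GPU

# Truthiness check: skips the explicit-GPU branch for gpu_id == 0 as well as for None
if gpu_id:
    device = torch.device(f'cuda:{gpu_id}')
else:
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Explicit None check: only falls back when no ID was given at all
if gpu_id is not None:
    device = torch.device(f'cuda:{gpu_id}' if torch.cuda.is_available() else 'cpu')
else:
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

print(device)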

@Melechtna

Melechtna commented Jan 10, 2024

The default in utils.py is to set gpu_id to a value of None:

https://github.com/xinntao/Real-ESRGAN/blob/5ca1078535923d485892caee7d7804380bfc87fd/realesrgan/utils.py#L39C31-L39C31

As for the ROCm implementation, it's kind of just slotted in alongside the Nvidia one, but as far as I understand how torch handles this, as long as you've got OpenCV compiled with ROCm, that SHOULD be the correct statement. I've yet to have a chance to test this properly, as I'm still dealing with other tasks that would cause a GPU driver crash should I start a job right away. However, with the way your code is formatted, it would likely skip the GPU attempt entirely if the ID is set to 0, as you've passed the argument != 0, meaning: if it doesn't equal 0, check for GPUs to render on. So either your interpretation is wrong, or that implementation is wrong.

Edit: Actually, looking over it a bit more closely, I think you've basically skipped the check entirely in all cases, as gpu_id is None initially because we haven't checked what's available; that value SHOULD be None, since it's just a value that's filled in when the check is run. So I'm not even sure how that got it working for you in the first place.

@JuarezCulau

@Melechtna Are you sure the GPU's ID is not modified in another part of the code? If I remember correctly, while troubleshooting, the ID was always 0 for me. I am not entirely sure I understand your line of thought on this, but I checked the code I sent in the pull request and noticed the error. Regarding "So either your interpretation is wrong, or that implementation is wrong": you are right about the implementation being wrong; it was a rookie mistake on my part, and I apologize for that. I believe I created a branch for that single modification and tested it outside of a controlled environment just to see if it worked. It should be fixed now, and you can take a look at the same case I opened before.

@Melechtna

Melechtna commented Jan 10, 2024

Unfortunately, my ROCm method wasn't successful; it either didn't work at all or created an issue where the device never gets enumerated. As I said, it was a guess I hadn't tested properly, based on dealing with OpenCV and some rudimentary research into PyTorch; it just took me a while to test it. As for gpu_id, if it gets assigned a number later on in the code, fair enough; I didn't look through the whole thing, as my primary concern was trying to GPU-accelerate it with the AMD card I have, which sadly is not as simple as it was with RIFE. The implementation you adjusted does look far more sound, however, but unless I can figure out how to make it accept ROCm, it's not much use for me.
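
For anyone else trying this on AMD: ROCm builds of PyTorch expose the GPU through the same torch.cuda API, so a quick check like the following (assuming a ROCm wheel of torch is installed) shows whether the HIP device is visible at all:

import torch
print(torch.__version__)              # ROCm wheels typically report something like "2.x.x+rocmX.Y"
print(torch.cuda.is_available())      # True on a working ROCm install, despite the "cuda" name
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))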

@JuarezCulau

@Melechtna I'm sorry to hear that, but thanks for highlighting the error in the commit; it should all be good regarding that now.
