Document NV error code #927

Open: wants to merge 3 commits into base: main


SunSerega
Contributor

It feels a bit backward to add it like this, but this error code has been used in production for a while and isn't defined in the XML.

The name CL_KERNEL_ILLEGAL_BUFFER_READ_WRITE_NV doesn't seem to fully describe the situation; I think I've seen it thrown in other cases.
The only thing I'm sure about is that this error is only thrown while the kernel is executing (never while compiling/linking).

I think I found this name on some random forum a while ago.
Back then I force-added it to my bindings without thinking, to get human-readable errors while testing my code.
But I can imagine this would be useful to others in the same way.

Do I need to ping someone from NV?
Also, any naming suggestions?

@nikhiljnv
Contributor

> The name CL_KERNEL_ILLEGAL_BUFFER_READ_WRITE_NV doesn't seem to fully describe the situation; I think I've seen it thrown in other cases.

Yes, this is a somewhat generic error and can mean one of many things.
If we want to name it at all, I would be more comfortable with a more generic name, such as "CL_KERNEL_ERROR_UNKNOWN_NV" or "CL_KERNEL_EXECUTION_ERROR_NV".

A couple of options:

  1. We can reuse existing error codes like CL_OUT_OF_RESOURCES or CL_INVALID_OPERATION. This would require a driver change and would possibly change the error code returned for existing apps/use-cases.
  2. Add a new error code, as mentioned above, with this enum value, so that applications continue to see the same error value, which can then be mapped to an NV-specific CL error code.
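
For illustration, option 2 on the application side might look like the minimal sketch below: a lookup that turns the raw value into a readable name. The name `CL_KERNEL_EXECUTION_ERROR_NV` is only one of the candidates suggested above, not a confirmed enum, and the local `cl_int` typedef and the helper `cl_error_name` are assumptions made so the sketch builds without an OpenCL SDK. The value -9999 is the one discussed in this thread; the two standard codes use their values from the OpenCL headers.

```c
#include <stdint.h>

/* Assumption: stand-in for cl_int from <CL/cl.h>, so this builds without an OpenCL SDK. */
typedef int32_t cl_int;

/* Standard OpenCL error codes. */
#define CL_SUCCESS            0
#define CL_OUT_OF_RESOURCES   (-5)
#define CL_INVALID_OPERATION  (-59)

/* Hypothetical name (one of the candidates in this thread) for the NV value -9999. */
#define CL_KERNEL_EXECUTION_ERROR_NV (-9999)

/* Hypothetical helper: map an error value to a human-readable name. */
static const char *cl_error_name(cl_int err)
{
    switch (err) {
    case CL_SUCCESS:                   return "CL_SUCCESS";
    case CL_OUT_OF_RESOURCES:          return "CL_OUT_OF_RESOURCES";
    case CL_INVALID_OPERATION:         return "CL_INVALID_OPERATION";
    case CL_KERNEL_EXECUTION_ERROR_NV: return "CL_KERNEL_EXECUTION_ERROR_NV";
    default:                           return "unknown OpenCL error";
    }
}
```

Because the numeric value stays the same, existing applications that compare against -9999 directly would keep working, while bindings and logging code could report the name instead of the bare number.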

@SunSerega
Contributor Author

Does CUDA also have only one error code for all possible kernel execution problems? If not, is it possible to improve the NV implementation of OpenCL to reflect the different CUDA error codes?

@SunSerega
Contributor Author

SunSerega commented Oct 26, 2023

Well, confirmation from someone at NV that the error code is more general is already great.
I've renamed the enum to match the current behavior.

But I still think it's best to review where this error is thrown and add a few new, more specific enums.
Meanwhile, the -9999 code can be left for general errors that cannot be distinguished.

2 participants