Add functionality to automatically detect NaNs in buffers/images before/after kernel enqueue #306

Open
wants to merge 4 commits into
base: main

Novermars
Contributor

Implements #305

Description of Changes

Added a boolean control, DetectNaNs. When set to true, all floating-point buffers and images are checked for NaNs before and after each kernel enqueue.
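
As a rough host-side illustration of the kind of check involved, the sketch below scans raw buffer contents for NaNs. It is not the code in this PR: the helper name `find_nans` is hypothetical, and the data is assumed to be little-endian float32.

```python
import math
import struct


def find_nans(raw):
    """Return the element indices of NaNs in a buffer of little-endian float32 values."""
    count = len(raw) // 4  # ignore any trailing partial element
    values = struct.unpack(f"<{count}f", raw[: count * 4])
    return [i for i, v in enumerate(values) if math.isnan(v)]


# Eight floats, mirroring the buffer size used in the test script of this PR.
clean = struct.pack("<8f", *([0.0] * 8))
dirty = struct.pack("<8f", 0.0, 0.0, float("nan"), 0.0, 0.0, 0.0, float("nan"), 0.0)

print(find_nans(clean))  # no NaNs present
print(find_nans(dirty))  # NaNs at element indices 2 and 6
```

Running a check like this once before and once after a kernel is what makes it possible to tell whether the kernel received a NaN or produced one.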

Testing Done

Tested with a buffer example (below) on Windows with an Intel iGPU.
Further testing is in progress.

import numpy as np
import pyopencl as cl

a_np = np.zeros(8).astype(np.float32)
a_np[:] = 0.1

b_np = np.zeros(8).astype(np.int32)
b_np[:] = 10

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)

flags = []
prg1 = cl.Program(ctx, """
__kernel void sum(__global float *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = 0.0f; 
}
""").build(flags)

prg2 = cl.Program(ctx, """
__kernel void insertNaN(__global float *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = NAN;
}
""").build(flags)

prg3 = cl.Program(ctx, """
__kernel void makeInt(__global int *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = 10;
}
""").build(flags)

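res_g = cl.Buffer(ctx, mf.READ_WRITE, a_np.nbytes)
knl1 = prg1.sum  # Use this Kernel object for repeated calls
# EnqueueCtr 0-2: sum writes zeros, so no NaN is reported.
knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)

# EnqueueCtr 3: insertNaN fills the buffer with NaNs (reported after the kernel).
knl2 = prg2.insertNaN
knl2(queue, a_np.shape, None, res_g)

# EnqueueCtr 4: sum sees the NaNs on entry (reported before the kernel), then overwrites them.
knl1(queue, a_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)
# EnqueueCtr 6: makeInt takes the same buffer as int*; no NaN is reported for it.
knl3 = prg3.makeInt
knl3(queue, b_np.shape, None, res_g)
knl1(queue, a_np.shape, None, res_g)

# EnqueueCtr 8 and 9: the insertNaN/sum pattern repeats.
knl2(queue, a_np.shape, None, res_g)

knl1(queue, a_np.shape, None, res_g)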

Gives:

After kernel: insertNaN, EnqueueCtr: 3, arg_index: 0, data type: float*, has a NaN.
Before kernel: sum, EnqueueCtr: 4, arg_index: 0, data type: float*, has a NaN.
After kernel: insertNaN, EnqueueCtr: 8, arg_index: 0, data type: float*, has a NaN.
Before kernel: sum, EnqueueCtr: 9, arg_index: 0, data type: float*, has a NaN.
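
One way to read this output: a NaN reported after an enqueue but not before it was introduced by that kernel. The sketch below parses lines in the format shown above and lists those enqueues. The regular expression and the `nan_sources` helper are assumptions based only on these four lines, not a documented log format.

```python
import re

# Pattern inferred from the four output lines above (an assumption, not a spec).
LINE_RE = re.compile(
    r"^(?P<when>Before|After) kernel: (?P<kernel>\w+), "
    r"EnqueueCtr: (?P<ctr>\d+), arg_index: (?P<arg>\d+), "
    r"data type: (?P<dtype>[^,]+), has a NaN\.$"
)


def nan_sources(log_text):
    """Return (EnqueueCtr, kernel, arg_index) for NaNs seen after but not before an enqueue."""
    before = set()
    after = {}
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a NaN report
        key = (int(m["ctr"]), int(m["arg"]))
        if m["when"] == "Before":
            before.add(key)
        else:
            after[key] = m["kernel"]
    return sorted(
        (ctr, kernel, arg)
        for (ctr, arg), kernel in after.items()
        if (ctr, arg) not in before
    )


log = """\
After kernel: insertNaN, EnqueueCtr: 3, arg_index: 0, data type: float*, has a NaN.
Before kernel: sum, EnqueueCtr: 4, arg_index: 0, data type: float*, has a NaN.
After kernel: insertNaN, EnqueueCtr: 8, arg_index: 0, data type: float*, has a NaN.
Before kernel: sum, EnqueueCtr: 9, arg_index: 0, data type: float*, has a NaN.
"""

print(nan_sources(log))  # [(3, 'insertNaN', 0), (8, 'insertNaN', 0)]
```

For the run above, this singles out the two insertNaN enqueues, while the sum enqueues only appear as consumers of an already-contaminated buffer.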
