Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

printf with string format specifier is not synchronized #883

Open
jjennychen opened this issue Jul 11, 2024 · 1 comment
Open

printf with string format specifier is not synchronized #883

jjennychen opened this issue Jul 11, 2024 · 1 comment

Comments

@jjennychen
Copy link
Collaborator

When using printf with string format specifiers (%s), the output of the specified strings appears to be unsynchronized.
Below is the outputs from chipStar assertion and CUDA assertion when running assert-cuda benchmark in HeCBench:

CUDA:

main.cu:21: void testKernel(int): block: [1,0,0], thread: [28,0,0] Assertion `gid < N` failed.
main.cu:21: void testKernel(int): block: [1,0,0], thread: [29,0,0] Assertion `gid < N` failed.
main.cu:21: void testKernel(int): block: [1,0,0], thread: [30,0,0] Assertion `gid < N` failed.
main.cu:21: void testKernel(int): block: [1,0,0], thread: [31,0,0] Assertion `gid < N` failed.

chipStar:

mmmmaaaaiiiinnnn....ccccuuuu:21: :21: :21: :21: vvvvooooiiiidddd    tttteeeessssttttKKKKeeeerrrrnnnneeeellll((((iiiinnnntttt)))): Device-side assertion `: Device-side assertion `: Device-side assertion `: Device-side assertion `ggggiiiidddd    <<<<    NNNN' failed.
' failed.
' failed.
' failed.

[Reproducer]
Compile and run the following code in a .cu file:
printf.cu

#include <iostream>

__global__ void print() {
  const char* file = "testing.c";
  const char* function = "function";
  const char* message = "assertion message";
  unsigned int line = 3;
  printf("%s:%u: %s: Device-side assertion `%s' failed.\n", file, line, function, message);
}

int main(int argc, char** argv) {
  print<<<1, 3>>>();
}

The output of the program will be something similar to this:

testing.ctesting.ctesting.c:3: :3: :3: functionfunctionfunction: Device-side assertion `: Device-side assertion `: Device-side assertion `assertion messageassertion messageassertion message' failed.
' failed.
' failed.
@pjaaskel
Copy link
Collaborator

pjaaskel commented Jul 12, 2024

Yep, this is an unfortunate difference in printf() between CUDA and OpenCL which is not trivial to fix. I don't think it requires format strings to get any possible output ordering between the threads/WIs. It's down to the OpenCL driver's printf implementation what happens. Some could flush at newline boundaries, some just push chars to a shared "stdout ring buffer" (like PoCL does).

CUDA does the actual printing on host by transferring the fmtstr and the args whereas OpenCL printf can perform it (more) on the device, which means it's not guaranteed to be flushed at string boundaries. We thought about doing a similar implementation (borrow one from AMD ROCm for instance) but it needs non-trivial amount of both compiler and runtime work to get it right and portable.

A clean way to fix this would be to propose an OpenCL extension that can be used to enforce printf() strings to get flushed at \n to make unsynchronized multi-work-item output more readable. Meanwhile we are relying on the OpenCL driver-specific behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants