enqueueWriteBuffer: Initialize host buffer to obtain accurate measurement

When a host buffer is passed as the source to enqueueWriteBuffer(),
OpenCL copies it with memcpy(). Newly allocated memory that has not yet
been written to is backed by zero pages; physical memory is allocated
only on the first write. A memcpy() from such zero pages is much faster
than a copy from populated memory, which inflates the measured bandwidth.

Therefore, initialize the host buffer before timing enqueueWriteBuffer()
to obtain accurate measurements.
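
The effect described above can be reproduced on the host alone, without
OpenCL. The following is a minimal sketch, not code from clpeak: the helper
names timeCopySeconds and makeHostBuffer are hypothetical, and whether an
untouched allocation is really backed by zero pages depends on the
allocator and the operating system.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: time a single memcpy() of `bytes` bytes, in seconds.
double timeCopySeconds(void *dst, const void *src, std::size_t bytes)
{
    auto start = std::chrono::steady_clock::now();
    std::memcpy(dst, src, bytes);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical helper: allocate `numItems` floats. With touch == true every
// byte is written once, so physical pages are committed before any timing.
// With touch == false the buffer is left exactly as the allocator returned it.
std::unique_ptr<float[]> makeHostBuffer(std::size_t numItems, bool touch)
{
    std::unique_ptr<float[]> buf(new float[numItems]);
    if (touch)
        std::memset(buf.get(), 0, numItems * sizeof(float));
    return buf;
}
```

To compare the two cases, allocate a large source buffer (tens of megabytes)
once with touch set to false and once with touch set to true, copy each into
a destination that was touched beforehand, and compare the two times. A
noticeably shorter time for the untouched source would match the inflated
"Before" figure for enqueueWriteBuffer.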

Results on Intel hardware:

    Before:
        Platform: Intel(R) OpenCL HD Graphics
          Device: Intel(R) Gen9 HD Graphics NEO
            Driver version  : 19.03.0 (Linux x64)
            Compute units   : 48
            Clock frequency : 1200 MHz

            Transfer bandwidth (GBPS)
              enqueueWriteBuffer         : 34.18
              enqueueReadBuffer          : 13.02
              enqueueMapBuffer(for read) : 14316530.00
                memcpy from mapped ptr   : 13.01
              enqueueUnmap(after write)  : inf
                memcpy to mapped ptr     : 13.37

    After:
        Platform: Intel(R) OpenCL HD Graphics
          Device: Intel(R) Gen9 HD Graphics NEO
            Driver version  : 19.03.0 (Linux x64)
            Compute units   : 48
            Clock frequency : 1200 MHz

            Transfer bandwidth (GBPS)
              enqueueWriteBuffer         : 13.44
              enqueueReadBuffer          : 12.91
              enqueueMapBuffer(for read) : 21474796.00
                memcpy from mapped ptr   : 12.91
              enqueueUnmap(after write)  : inf
                memcpy to mapped ptr     : 13.44
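
For reading the figures above, a bandwidth in GBPS is the number of bytes
moved divided by the elapsed time, scaled by 1e9. The sketch below assumes
that definition; clpeak's exact formula is not shown in this commit, and
the function name transferBandwidthGBps is hypothetical. Under this
definition an elapsed time of zero yields infinity, which is one plausible
reading of the "inf" entries, not something the source confirms.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper: bandwidth in gigabytes per second (1 GB = 1e9 bytes).
// An elapsed time of zero (or less) is reported as infinity rather than
// performing a division by zero.
double transferBandwidthGBps(std::size_t bytes, double seconds)
{
    if (seconds <= 0.0)
        return std::numeric_limits<double>::infinity();
    return static_cast<double>(bytes) / seconds / 1e9;
}
```

For example, 2,000,000,000 bytes moved in 0.5 seconds gives 4.0 GBPS. Very
short elapsed times also explain how a value can grow into the millions, as
in the enqueueMapBuffer rows.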
ssanchez11 authored and krrishnarraj committed Dec 6, 2019
1 parent e1fc832 commit 784e673
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions src/transfer_bandwidth.cpp
@@ -18,6 +18,7 @@ int clPeak::runTransferBandwidthTest(cl::CommandQueue &queue, cl::Program &prog,
   try
   {
     arr = new float[numItems];
+    memset(arr, 0, numItems * sizeof(float));
     cl::Buffer clBuffer = cl::Buffer(ctx, (CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR), (numItems * sizeof(float)));
 
     log->print(NEWLINE TAB TAB "Transfer bandwidth (GBPS)" NEWLINE);