Commit
[CUDA] Fix synchronization issue in urEnqueueMemImageCopy
For 1D images, urEnqueueMemImageCopy was using cuMemcpyAtoA, which has no asynchronous, stream-ordered version. When the copy happens between two arrays in device memory, the call may return before the copy has finished, and because the copy is not ordered on the stream, it might complete after the event returned by urEnqueueMemImageCopy is signaled as complete. This commit fixes the issue by using cuMemcpy2DAsync to copy 1D images, setting the height to 1.