Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tests for external sharing not dependant on semaphores. #1648

Merged
merged 16 commits into from
Jul 11, 2023

Conversation

pj87
Copy link
Contributor

@pj87 pj87 commented Feb 21, 2023

Additional external sharing tests that use fences instead of semaphores.

Additional external sharing tests that use fences instead of semaphores.

Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
@pj87 pj87 marked this pull request as draft February 21, 2023 11:05
Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
@nikhiljnv
Copy link
Contributor

nikhiljnv commented Feb 28, 2023

Will need to merge #1630 before merging any changes to test_vulkan to ensure build sanity.

@nikhiljnv
Copy link
Contributor

Can we make VkFence + clFinish a synchronization option to existing tests rather than creating a separate test that uses fence?
That way we can extend existing coverage across both modes (synchronization via external semaphores vs synchronization via VkFence + clFinish)

@nikhiljnv
Copy link
Contributor

@pj87
Looks like there will be some conflicts between this PR and part of #1645 due to refactoring of vulkan_wrapper moved into common place / lib.
We can go either ways, but do we have a preference which one we want to merge first? We will need to update other one based on the sequence.
Given that this is still in drafts, should we focus on #1645 first?

Apply changes for review:
- Make VkFence + clFinish a synchronization option to existing tests
instead of creating a separate test that uses fence.

Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
@pj87 pj87 marked this pull request as ready for review March 17, 2023 17:29
Copy link
Contributor

@nikhiljnv nikhiljnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some early comments.

vkQueue.submit(vkCommandBuffer, vkVk2CLSemaphore);
if (use_fence)
{
vkQueue.submit(vkCommandBuffer, fence);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will need similar changes to other files especially to test_vulkan_interop_image.cpp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So as I aready mentioned at the beginnig of the task, I don't know how it should be done for the image part. I tried changing it in this commit accordingly for bufffer and image but I ended up with deadlocks:
9e83f00#diff-2b3fe9aac0e9f06312676bdb7212bb775c8b1386f633d73db32aebb87b283f3f
Do you have any idea on how it should be implemented?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a look. We can cover image changes later if there are more issues to address.

vkQueue.submit(vkCommandBuffer, vkVk2CLSemaphore);
if (use_fence)
{
vkQueue.submit(vkCommandBuffer, fence);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take a look. We can cover image changes later if there are more issues to address.

else
{
if (use_fence)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fence needs to be reset before reusing after previous Wait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added it but now there are deadlocks there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sequence should be "Reset fence, use it in some submit operation where it gets signaled, wait on it" or "use fence in some submit operation where it gets signaled, wait on it, Reset fence". Resetting just before wait but after signal would override effect of signalling at the end of completion of work submitted earlier and wait would not wait for the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this change there is an issue with:

err = clEnqueueReadBuffer(cmd_queue1, error_1, CL_TRUE, 0,
                                              sizeof(uint8_t), error_3, 0, NULL,
                                              NULL);

it produces the following output:

log_error("&&&& vulkan_opencl_buffer test FAILED\n");

Yet I am not sure what the reason is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed May 9th. @nikhiljnv will try to reproduce these failures.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tried repro of the issue and found that synchronizations are misplaced in some scenarios.
Have requested changes which should fix data mismatches seen here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for the else branch when iter > 0 was to wait on the semaphore dispatched at the end of the last frame. We either need to wait on the last fence here, or wait on a fence at the end of the frame, and delete the if(iter==0) behavior.

Fixed following fence issues:
- Add missing link to command buffer
- Add fence reset before wait

Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
Changes made:
- wait for fence with clFinish
- queue submit with wait for fence

Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
@pj87
Copy link
Contributor Author

pj87 commented May 22, 2023

After the changes requested by @nikhiljnv I am still seeing undeterministic messages:

log_error("&&&& vulkan_opencl_buffer test FAILED\n");

It mostly doesn't reproduce, but it seems that there still may occur from time to time.

@@ -275,7 +265,7 @@ int run_test_with_two_queue(cl_context &context, cl_command_queue &cmd_queue1,

if (use_fence)
{
vkWaitForFences(vkDevice, 1, &fence, VK_TRUE, UINT64_MAX);
clFinish(cmd_queue1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fence wait is correct operation here.
Just to clarify - The places where execution moves from Vulkan to OpenCL (if blocks corresponding to else containing clVk2CLExternalSemaphore->wait) need to have fence waits.
The places where execution moves from OpenCL to Vulkan ((if blocks corresponding to else containing clCl2VkExternalSemaphore->signal)

test_conformance/vulkan/test_vulkan_interop_buffer.cpp Outdated Show resolved Hide resolved
Replaced clFinish with vkWaitForFences in Vulkan exectution context.

Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
Replaced remaining clFinish with vkWaitForFences in Vulkan exectution context.

Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
nikhiljnv
nikhiljnv previously approved these changes May 29, 2023
Copy link
Contributor

@nikhiljnv nikhiljnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

else
{
if (use_fence)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for the else branch when iter > 0 was to wait on the semaphore dispatched at the end of the last frame. We either need to wait on the last fence here, or wait on a fence at the end of the frame, and delete the if(iter==0) behavior.

test_conformance/vulkan/test_vulkan_interop_buffer.cpp Outdated Show resolved Hide resolved
test_conformance/vulkan/test_vulkan_interop_buffer.cpp Outdated Show resolved Hide resolved
test_conformance/vulkan/test_vulkan_interop_buffer.cpp Outdated Show resolved Hide resolved
Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
@pj87
Copy link
Contributor Author

pj87 commented Jun 22, 2023

@joshqti could you take a look if everything is OK now?

pj87 added 2 commits June 23, 2023 13:29
… tests.

Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
Signed-off-by: Paweł Jastrzębski <p.k.jastrzebski@gmail.com>
@bcalidas
Copy link

The latest commit looks good. This PR can be merged.

Thanks,
Balaji Calidas,
Qualcomm

Copy link
Contributor

@nikhiljnv nikhiljnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@neiltrevett neiltrevett merged commit 1ab4b26 into KhronosGroup:main Jul 11, 2023
6 checks passed
@pj87 pj87 deleted the external_sharing_fences branch November 10, 2023 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants