[L0] fix a deadlock on a recursive event rwlock
When creating a list of events to wait on, L0 would in some cases
first take a lock on a potentially completed event and then try to
get a command list, which sometimes needs to clean up all completed
events. Because that cleanup needs the event lock as well, this
caused a deadlock.

This patch moves getting a command list to before the event lock.
But because the lock is required to decide whether this command
list is actually needed, we may now fetch a command list that ends
up unused.
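
The ordering change can be sketched in isolation. This is a minimal model, not the adapter's code: `Event`, `getCommandList`, `inspectEvent` and `runDemo` are hypothetical stand-ins, and `std::shared_mutex` is assumed to play the role of `ur_shared_mutex`.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a UR event guarded by a reader/writer lock.
struct Event {
  std::shared_mutex Mutex;
  bool Completed = false;
  bool CleanedUp = false;
};

// Hypothetical stand-in for getAvailableCommandList(): it may clean up
// completed events, which needs each event's lock in exclusive mode.
void getCommandList(std::vector<Event *> &Events,
                    std::vector<std::string> &Trace) {
  for (Event *E : Events) {
    std::unique_lock<std::shared_mutex> Lock(E->Mutex);
    if (E->Completed)
      E->CleanedUp = true;
  }
  Trace.push_back("command list acquired");
}

// Fixed ordering: do the work that may take exclusive locks first, and only
// then take the shared lock on the event being inspected.
void inspectEvent(Event &E, std::vector<Event *> &Events,
                  std::vector<std::string> &Trace) {
  getCommandList(Events, Trace); // no event lock is held at this point
  std::shared_lock<std::shared_mutex> Lock(E.Mutex);
  Trace.push_back("event read-locked");
  // Swapping the two steps would request E.Mutex exclusively while this
  // thread already holds it in shared mode. For std::shared_mutex that is
  // undefined behavior, and it typically hangs.
}

// Runs the sketch once on a single completed event and returns the trace.
std::vector<std::string> runDemo() {
  Event E;
  E.Completed = true;
  std::vector<Event *> Events{&E};
  std::vector<std::string> Trace;
  inspectEvent(E, Events, Trace);
  assert(E.CleanedUp); // cleanup ran without blocking on the event lock
  return Trace;
}
```

Calling `runDemo()` returns the two trace entries in the order "command list acquired", then "event read-locked". The cost noted above is visible here too: `getCommandList` runs unconditionally, before the locked section can decide whether its result is needed.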
pbalcer committed Mar 22, 2024
1 parent 4c22f5c commit 3a64d79
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions source/adapters/level_zero/event.cpp
@@ -1307,6 +1307,10 @@ ur_result_t _ur_ze_event_list_t::createAndRetainUrZeEventList(
}
}

+ ur_command_list_ptr_t CommandList{};
+ UR_CALL(Queue->Context->getAvailableCommandList(Queue, CommandList,
+ false, true));
+
std::shared_lock<ur_shared_mutex> Lock(EventList[I]->Mutex);

if (Queue && Queue->Device != CurQueue->Device &&
@@ -1316,10 +1320,6 @@ ur_result_t _ur_ze_event_list_t::createAndRetainUrZeEventList(
bool IsInternal = true;
bool IsMultiDevice = true;

- ur_command_list_ptr_t CommandList{};
- UR_CALL(Queue->Context->getAvailableCommandList(Queue, CommandList,
- false, true));
-
UR_CALL(createEventAndAssociateQueue(
Queue, &MultiDeviceEvent, EventList[I]->CommandType, CommandList,
IsInternal, IsMultiDevice));