Assign event loop to tasks and run a task's methods in the same loop #24288

Open
wants to merge 1 commit into base: master

@shangm2 shangm2 commented Dec 19, 2024

Description

  1. This PR proposes using an event loop to run a task's methods, so that multiple tasks can share one event loop while each task's methods always execute on the same thread.

Motivation and Context

  1. We saw a large number of tasks being created for certain queries, which causes contention on the underlying synchronous queue.
  2. This PR proposes using an event loop to run a task's methods, so that multiple tasks can be assigned to the same event loop and a task's methods are always executed on the same thread.
  3. Some methods in HttpRemoteTask do not need the "synchronized" keyword if we are certain they will be executed on the same thread.
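
The idea in the list above can be sketched with plain JDK classes. This is a minimal illustration, not the PR's code: `LoopGroup` and `Task` are hypothetical names, and single-threaded `ExecutorService` instances stand in for the event loops. Several tasks share a small set of loops, yet each task only ever runs on one thread, so its counter needs neither `synchronized` nor an atomic type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class EventLoopAssignmentSketch
{
    // Hypothetical stand-in for an event loop group: a fixed set of
    // single-threaded executors handed out round-robin.
    static final class LoopGroup
    {
        private final List<ExecutorService> loops = new ArrayList<>();
        private int nextIndex;

        LoopGroup(int size)
        {
            for (int i = 0; i < size; i++) {
                loops.add(Executors.newSingleThreadExecutor());
            }
        }

        synchronized ExecutorService next()
        {
            ExecutorService loop = loops.get(nextIndex);
            nextIndex = (nextIndex + 1) % loops.size();
            return loop;
        }

        void shutdown()
        {
            loops.forEach(ExecutorService::shutdown);
        }
    }

    // Hypothetical task: every state change is posted to its one loop.
    static final class Task
    {
        private final ExecutorService eventLoop;
        // Demo bookkeeping only: names of the threads that ran this task's methods.
        final Set<String> threadsSeen = ConcurrentHashMap.newKeySet();
        private int updates; // touched only on the event loop thread

        Task(ExecutorService eventLoop)
        {
            this.eventLoop = eventLoop;
        }

        void update(CountDownLatch done)
        {
            eventLoop.execute(() -> {
                updates++;
                threadsSeen.add(Thread.currentThread().getName());
                done.countDown();
            });
        }
    }

    // Returns the largest number of distinct threads that any one task ran on.
    static int maxThreadsPerTask(int loopCount, int taskCount, int updatesPerTask)
    {
        LoopGroup group = new LoopGroup(loopCount);
        try {
            List<Task> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                tasks.add(new Task(group.next()));
            }
            CountDownLatch done = new CountDownLatch(taskCount * updatesPerTask);
            for (int round = 0; round < updatesPerTask; round++) {
                tasks.forEach(task -> task.update(done));
            }
            if (!done.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("updates did not finish in time");
            }
            return tasks.stream().mapToInt(task -> task.threadsSeen.size()).max().orElse(0);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        finally {
            group.shutdown();
        }
    }

    public static void main(String[] args)
    {
        // 8 tasks share 2 loops; each task should still see exactly one thread.
        System.out.println(maxThreadsPerTask(2, 8, 100));
    }
}
```

With 8 tasks on 2 loops, the program prints 1: tasks share threads with each other, but no task is ever spread across threads.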

Impact

Test Plan

  1. Strobelight seems fine: https://fburl.com/scuba/strobelight_java_asyncprofiler/on_demand/3qo6j1h9
     Screenshot 2024-12-22 at 16 34 47
  2. Ran verifier successfully (screenshot is coming)

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* ... :pr:`12345`
* ... :pr:`12345`

Hive Connector Changes
* ... :pr:`12345`
* ... :pr:`12345`

If release note is NOT required, use:

== NO RELEASE NOTE ==

@shangm2 shangm2 requested a review from a team as a code owner December 19, 2024 23:43
@shangm2 shangm2 requested a review from presto-oss December 19, 2024 23:43

linux-foundation-easycla bot commented Dec 19, 2024

CLA Not Signed

@shangm2 shangm2 changed the title assign event loop to tasks and run methods from a task in the same ev… assign event loop to tasks and run a task's methods in the same loop Dec 19, 2024
@shangm2 shangm2 marked this pull request as draft December 19, 2024 23:55
Member

@arhimondr arhimondr left a comment

Nice!

@@ -101,6 +104,9 @@ public class HttpRemoteTaskFactory
private final MetadataManager metadataManager;
private final QueryManager queryManager;
private final DecayCounter taskUpdateRequestSize;
// TODO: use config file to set this value
private final EventLoopGroup eventLoopGroup = new DefaultEventLoopGroup(2000,
Member

Probably not worth setting it to a value higher than the number of CPUs available on the system.

Author

Good point!
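
One way to act on the sizing suggestion above is to cap whatever value is configured at the machine's CPU count. The helper below is a hypothetical sketch (the name `eventLoopCount` and the floor of one loop are assumptions, not part of the PR).

```java
final class EventLoopSizing
{
    // Hypothetical helper: never use more loops than CPUs, and never fewer than one.
    static int eventLoopCount(int configured, int availableProcessors)
    {
        return Math.max(1, Math.min(configured, availableProcessors));
    }

    public static void main(String[] args)
    {
        int cpus = Runtime.getRuntime().availableProcessors();
        // With the hard-coded 2000 from the diff, this resolves to the CPU count.
        System.out.println(eventLoopCount(2000, cpus));
    }
}
```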

@@ -257,6 +264,7 @@ public RemoteTask createRemoteTask(
taskUpdateRequestSize,
handleResolver,
connectorTypeSerdeManager,
- schedulerStatsTracker);
+ schedulerStatsTracker,
+ eventLoopGroup.next());
Member

Does the task still need an executor? It is also worth checking the usages of updateScheduledExecutor/errorScheduledExecutor to see whether they are all still appropriate.

Member

The executor is used to run the attached StateChangeListeners. This is another thing that seems to be broken. A StateChangeListener has to have its own executor attached. It shouldn't be the responsibility of the event producer to provide a thread on which to run a foreign callback. Instead, the creator of a callback should supply the executor that the callback runs on.

Author

@shangm2 shangm2 Dec 20, 2024

Totally makes sense. Let me create a separate PR (given that this may require big changes) to let the creator of the listener handle its own execution. I learned a lot from your comment and really appreciate it.
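
The design discussed in this thread, where the creator of a callback also supplies the executor it runs on, might look roughly like the sketch below. `SketchStateMachine` is a hypothetical, heavily simplified class and is not Presto's StateMachine; the registration shape mirrors the listener-plus-executor style already visible in the diff (`addListener(..., directExecutor())`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical, heavily simplified state holder; not Presto's StateMachine.
final class SketchStateMachine<T>
{
    // A listener paired with the executor its creator chose for it.
    private static final class Registration<T>
    {
        final Consumer<T> listener;
        final Executor executor;

        Registration(Consumer<T> listener, Executor executor)
        {
            this.listener = listener;
            this.executor = executor;
        }
    }

    // Assumed to be accessed from a single event loop thread, hence no locking.
    private final List<Registration<T>> registrations = new ArrayList<>();
    private T state;

    SketchStateMachine(T initialState)
    {
        this.state = initialState;
    }

    void addStateChangeListener(Consumer<T> listener, Executor listenerExecutor)
    {
        registrations.add(new Registration<>(listener, listenerExecutor));
    }

    void set(T newState)
    {
        state = newState;
        for (Registration<T> registration : registrations) {
            // The producer only hands off; the callback runs where its creator asked.
            registration.executor.execute(() -> registration.listener.accept(newState));
        }
    }

    T get()
    {
        return state;
    }

    public static void main(String[] args)
    {
        SketchStateMachine<String> machine = new SketchStateMachine<>("PLANNED");
        // Runnable::run acts as a same-thread executor to keep the demo deterministic.
        machine.addStateChangeListener(state -> System.out.println("now " + state), Runnable::run);
        machine.set("RUNNING");
    }
}
```

With this shape the state machine no longer needs an executor of its own, which is the point raised in the review.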

taskStatusFetcher.start();
taskInfoFetcher.start();
}
taskEventLoop.execute(() -> {
Member

Does this change remove all synchronized sections?

Author

@shangm2 shangm2 Dec 21, 2024

@arhimondr Yes, all synchronized sections are now async. The only thing I am not certain about is this change: https://github.com/prestodb/presto/pull/24288/files#r1894673304. The original synchronized implementation returns a future, but now we need to notify the listener in a sort of async way. I wonder what your thoughts are on this. So far, I am testing the code with the verifier and things look good. We might also get a small performance gain if we use regular variables in place of the atomic ones, since they don't need to be atomic any more.

@shangm2 shangm2 force-pushed the shangma/use_netty branch 5 times, most recently from 7bb112b to 9e5e4a7 Compare December 20, 2024 07:37
@tdcmeehan
Contributor

Very nice change. So I understand, is the effect of this to reduce the number of IO threads related to task management from # of tasks to # of queries?

@arhimondr
Member

@tdcmeehan Yeah, the idea is to reduce the number of event processing threads and remove synchronization with the eventual goal of coordinator supporting more tasks running concurrently. Currently we are seeing scalability issues when the number of tasks grows beyond 30-50K. Dispatching events on standard Java thread pools becomes a problem.

@tdcmeehan tdcmeehan self-assigned this Dec 20, 2024
if (splitQueueHasSpace) {
future.set(null);
}
whenSplitQueueHasSpace.createNewListener().addListener(() -> future.set(null), directExecutor());
Author

@shangm2 shangm2 Dec 21, 2024

[Placeholder]

@shangm2 shangm2 marked this pull request as ready for review December 22, 2024 03:31
@amitkdutta amitkdutta changed the title assign event loop to tasks and run a task's methods in the same loop Assign event loop to tasks and run a task's methods in the same loop Dec 22, 2024
@amitkdutta
Contributor

@shangm2 This is a great change. Please add details in testing section if you have performed any shadow tests. Will be great if we can run it through problematic traffic and show before/after with this change.

@shangm2
Author

shangm2 commented Dec 23, 2024

@shangm2 This is a great change. Please add details in testing section if you have performed any shadow tests. Will be great if we can run it through problematic traffic and show before/after with this change.

Yep. I am making screenshots of the results of running the verifier against this change. One issue is that this PR changes the way we use the thread pool, and I am not sure which metrics we can use to compare before and after the change. What do you think, @arhimondr? Can you suggest some metrics I can use to confirm the change? Thanks. I did run Strobelight and did not see anything suspicious: https://fburl.com/scuba/strobelight_java_asyncprofiler/on_demand/3qo6j1h9
Screenshot 2024-12-22 at 16 34 47

{
taskUpdateTimeline.add(System.nanoTime());
- executor.execute(this::sendUpdate);
+ taskEventLoop.execute(this::sendUpdate);
Member

Can call sendUpdate directly here, since sendUpdate itself sends a callback to an event loop

Author

nice catch. thanks!

whenSplitQueueHasSpaceThreshold = OptionalLong.of(weightThreshold);
updateSplitQueueSpace();
}
if (splitQueueHasSpace) {
Member

Let's make splitQueueHasSpace volatile and keep the shortcut. There is a chance it may be important.
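
A rough sketch of the suggested shortcut, under stated assumptions: `SplitQueueSketch` is a hypothetical simplification (not the PR's HttpRemoteTask), `CompletableFuture` replaces the settable future seen in the diff, and the event loop is assumed to run submitted work one item at a time. The `volatile` flag lets any thread take the fast path without a hop to the loop, while the waiter list stays confined to the loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical simplification; not the PR's HttpRemoteTask code.
final class SplitQueueSketch
{
    private final Executor taskEventLoop; // assumed to run tasks one at a time, in order
    private final List<CompletableFuture<Void>> waiters = new ArrayList<>(); // event loop only
    private volatile boolean splitQueueHasSpace; // written on the loop, read from any thread

    SplitQueueSketch(Executor taskEventLoop)
    {
        this.taskEventLoop = taskEventLoop;
    }

    CompletableFuture<Void> whenSplitQueueHasSpace()
    {
        if (splitQueueHasSpace) {
            // Shortcut: answer immediately without posting to the event loop.
            return CompletableFuture.completedFuture(null);
        }
        CompletableFuture<Void> future = new CompletableFuture<>();
        taskEventLoop.execute(() -> {
            // Re-check on the loop, since the flag may have flipped in the meantime.
            if (splitQueueHasSpace) {
                future.complete(null);
            }
            else {
                waiters.add(future);
            }
        });
        return future;
    }

    void setSplitQueueHasSpace(boolean hasSpace)
    {
        taskEventLoop.execute(() -> {
            splitQueueHasSpace = hasSpace;
            if (hasSpace) {
                List<CompletableFuture<Void>> ready = new ArrayList<>(waiters);
                waiters.clear();
                ready.forEach(waiter -> waiter.complete(null));
            }
        });
    }

    public static void main(String[] args)
    {
        // Runnable::run acts as a same-thread "loop" to keep the demo deterministic.
        SplitQueueSketch queue = new SplitQueueSketch(Runnable::run);
        CompletableFuture<Void> waiting = queue.whenSplitQueueHasSpace();
        System.out.println("done before space is available: " + waiting.isDone());
        queue.setSplitQueueHasSpace(true);
        System.out.println("done after space is available: " + waiting.isDone());
    }
}
```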

@@ -234,6 +230,8 @@ public final class HttpRemoteTask
private final DecayCounter taskUpdateRequestSize;
private final SchedulerStatsTracker schedulerStatsTracker;

private final EventLoop taskEventLoop;
Member

  • Atomic fields no longer need to be atomic (nextSplitId, needsUpdate, sendPlan, started, aborted, outputBuffers). It is better to remove the atomics, as they may suggest that the variables are accessed from multiple threads concurrently.
  • Remove GuardedBy annotations, as the accesses are no longer protected by a mutex. Instead you may want to add a comment at the beginning of the class and explain what is going on in this class (e.g.: that all interactions with the TaskState have to be done via an event loop)

Author

Good point. Updating the code.
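
To illustrate the two review points above (plain fields instead of atomics, plus a class-level comment describing the threading model), here is a minimal sketch. `ConfinedTaskSketch` and its methods are hypothetical; only the field names `nextSplitId` and `needsUpdate` come from the review comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/**
 * Threading model (hypothetical sketch, not the PR's HttpRemoteTask):
 * every mutable field is read and written only on the task's event loop thread.
 * Public methods never touch state directly; they post work to the loop.
 */
final class ConfinedTaskSketch
{
    private final ExecutorService taskEventLoop; // assumed to be single-threaded

    // Plain fields: atomic types here would wrongly suggest concurrent access.
    private long nextSplitId;
    private boolean needsUpdate;

    ConfinedTaskSketch(ExecutorService taskEventLoop)
    {
        this.taskEventLoop = taskEventLoop;
    }

    void addSplits(int count)
    {
        taskEventLoop.execute(() -> {
            nextSplitId += count;
            needsUpdate = true;
        });
    }

    // Demo helper: reads the state on the loop and hands the value back to the caller.
    long awaitNextSplitId()
    {
        try {
            return taskEventLoop.submit(() -> nextSplitId).get(30, TimeUnit.SECONDS);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        }
    }

    // Many caller threads post work concurrently; the count still adds up exactly.
    static long demo(int callerThreads, int callsPerThread)
    {
        ExecutorService loop = Executors.newSingleThreadExecutor();
        try {
            ConfinedTaskSketch task = new ConfinedTaskSketch(loop);
            List<Thread> callers = new ArrayList<>();
            for (int i = 0; i < callerThreads; i++) {
                Thread caller = new Thread(() -> {
                    for (int call = 0; call < callsPerThread; call++) {
                        task.addSplits(1);
                    }
                });
                callers.add(caller);
                caller.start();
            }
            for (Thread caller : callers) {
                caller.join();
            }
            return task.awaitNextSplitId();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        finally {
            loop.shutdown();
        }
    }

    public static void main(String[] args)
    {
        System.out.println(demo(4, 1000)); // 4 callers x 1000 calls each
    }
}
```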

@@ -367,10 +361,10 @@ public HttpRemoteTask(
initialTask.getTaskStatus(),
taskStatusRefreshMaxWait,
taskStatusCodec,
executor,
taskEventLoop,
Member

Maybe just pass it once and call it taskEventLoop

Author

Sure. I was trying not to modify the code in other files and keep this pr straightforward but looks like TaskInfoFetcher and ContinuousTaskStatusFetcher are only being used here. Let me update them as well.

executor,
updateScheduledExecutor,
errorScheduledExecutor,
taskEventLoop,
Member

ditto

@@ -604,7 +609,7 @@ public void onFailure(Throwable failedReason)
doRemoveRemoteSource(errorTracker, request, future);
}
else {
- errorRateLimit.addListener(() -> doRemoveRemoteSource(errorTracker, request, future), errorScheduledExecutor);
+ errorRateLimit.addListener(() -> doRemoveRemoteSource(errorTracker, request, future), taskEventLoop);
Member

Run the callback itself in the event loop as well

Author

Good point. Thanks for catching this!
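
As a rough illustration of running a response callback on the event loop regardless of which thread completes the request, the sketch below uses the JDK's `CompletableFuture.handleAsync` with an explicit executor. The class and method names are hypothetical, and the PR itself works with a different future type, so treat this only as an analogy for the suggestion.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CallbackDispatchSketch
{
    // Returns a future holding the name of the thread that handled the response.
    // Both the success and the failure branch run on taskEventLoop.
    static CompletableFuture<String> handleOnLoop(CompletableFuture<String> response, Executor taskEventLoop)
    {
        return response.handleAsync((value, failure) -> Thread.currentThread().getName(), taskEventLoop);
    }

    public static void main(String[] args)
    {
        ExecutorService loop = Executors.newSingleThreadExecutor(
                runnable -> new Thread(runnable, "task-event-loop-0"));
        try {
            CompletableFuture<String> response = new CompletableFuture<>();
            CompletableFuture<String> handled = handleOnLoop(response, loop);
            // Completed from the main thread, but handled on the loop thread.
            response.completeExceptionally(new RuntimeException("simulated HTTP failure"));
            System.out.println(handled.join());
        }
        finally {
            loop.shutdown();
        }
    }
}
```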

taskInfoFetcher.start();
}
taskEventLoop.execute(() -> {
try (SetThreadName ignored = new SetThreadName("HttpRemoteTask-%s", taskId)) {
Member

I would recommend not overriding the thread name (here and in other places). It is easier to diagnose threading-related issues when the threads are named consistently (e.g., allowing you to find all threads belonging to a certain thread pool and all related events for it).
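
One way to get consistent names is to set them once, when the event loop threads are created, instead of renaming a thread per task. The factory below is a hypothetical JDK-only sketch; the prefix `task-event-loop` and the daemon setting are assumptions for illustration.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory: every thread gets a stable "<prefix>-<n>" name.
final class NamedLoopThreadFactory implements ThreadFactory
{
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger();

    NamedLoopThreadFactory(String prefix)
    {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable runnable)
    {
        Thread thread = new Thread(runnable, prefix + "-" + counter.getAndIncrement());
        thread.setDaemon(true);
        return thread;
    }

    public static void main(String[] args)
    {
        ThreadFactory factory = new NamedLoopThreadFactory("task-event-loop");
        System.out.println(factory.newThread(() -> {}).getName());
        System.out.println(factory.newThread(() -> {}).getName());
    }
}
```

A thread dump can then be filtered by the shared prefix to find every thread in the pool.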

httpClient,
maxErrorDuration,
errorScheduledExecutor,
taskEventLoop,
Member

Let's do the same exercise (removing synchronization) for the ContinuousTaskStatusFetcher and the TaskInfoFetcher

@arhimondr
Member

It would be great if we could stress test this change (run many queries with many stages concurrently) and see if there are any major bottlenecks by profiling the event loop executor for both on-cpu and off-cpu activity.

Member

@arhimondr arhimondr left a comment

Love the change. Thank you for working on it

@shangm2
Author

shangm2 commented Dec 23, 2024

It would be great if we could stress test this change (run many queries with many stages concurrently) and see if there are any major bottlenecks by profiling the event loop executor for both on-cpu and off-cpu activity.

Noted. Let me resolve the comments and run tableau queries against the change.


4 participants