Regularly poll executors to track their utilization #613

squito · 2013-05-15T21:40:52Z

This isn't ready to be merged yet (needs tests & docs), but wanted to get some feedback.

The point of this PR is to provide a really high-level metric on how well the cluster is being utilized, by simply polling what percentage of cores are active. I'm hopeful this will help diagnose a lot of cases where jobs are slow b/c Spark is being used incorrectly -- imbalanced jobs, driver spending all its time merging accumulator values, long breaks between stages, etc.

The ugly part of this is plumbing the events from StandaloneSchedulerBackend --> ClusterScheduler --> SparkContext --> DAGScheduler --> SparkListener. I'm not sure what the right architecture is in general to let any arbitrary component get an event to the SparkListeners.

AmplabJenkins · 2013-05-15T21:41:30Z

Thank you for your pull request. An admin will review this request soon.

squito · 2013-05-15T21:44:43Z

core/src/main/scala/spark/scheduler/SparkListener.scala

+    //update ALL active stages
+    activeStageToExecutorStatus.foreach{case(k,v) =>
+      activeStageToExecutorStatus += k -> (v + executorStatus)
+    }


Note that ExecutorStatus messages are sent even when there aren't any active stages. This means that you can also measure the cluster utilization across the lifetime of a Spark context, not just within a stage. E.g., this would help diagnose the case where every stage is very well distributed, but between stages there is a lot of work happening on the master. I just decided not to include that in StatsReportListener.

(You could achieve the same effect w/out actually sending all the messages when there are no active stages, but the cluster is idle anyway, so why not.)
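To make the per-stage accumulation above concrete, here is a minimal sketch of folding polled samples into a running summary for each active stage. The names `ExecutorStatus`, `ActivitySummary`, and `UtilizationTracker` and their fields are assumptions for illustration only; the classes in this PR may be shaped differently.

```scala
// Hypothetical types for illustration; the PR's actual classes may differ.
final case class ExecutorStatus(activeCores: Int, totalCores: Int)

// Running aggregate over many polled samples.
final case class ActivitySummary(activeCoreSamples: Long, totalCoreSamples: Long) {
  def +(s: ExecutorStatus): ActivitySummary =
    ActivitySummary(activeCoreSamples + s.activeCores, totalCoreSamples + s.totalCores)

  // Fraction of sampled core slots that were busy; 0.0 before any sample arrives.
  def utilization: Double =
    if (totalCoreSamples == 0L) 0.0 else activeCoreSamples.toDouble / totalCoreSamples
}

object ActivitySummary {
  val empty: ActivitySummary = ActivitySummary(0L, 0L)
}

object UtilizationTracker {
  // Fold one sample into the summary of every active stage.
  // Building a new immutable map avoids mutating a collection while iterating over it.
  def record(
      activeStages: Map[Int, ActivitySummary],
      sample: ExecutorStatus): Map[Int, ActivitySummary] =
    activeStages.map { case (stageId, summary) => stageId -> (summary + sample) }
}
```

Keeping a second summary that is never reset at stage boundaries would give the context-lifetime view mentioned above.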

rxin · 2013-05-15T21:49:38Z

Why not ganglia or other external monitoring tools?

ryanlecompte · 2013-05-15T21:55:22Z

core/src/main/scala/spark/scheduler/ExecutorStatusPoller.scala

+/**
+ *
+ */
+abstract class ExecutorStatusPoller extends Logging {


I'd suggest using a java.util.concurrent.ScheduledThreadPoolExecutor instead of an infinite while loop. This would also let you schedule the poller at a fixed interval without having to manage the sleep "catchup" time yourself, e.g.:

    import java.util.concurrent.{Executors, TimeUnit}

    val pool = Executors.newSingleThreadScheduledExecutor()
    val poller = new Runnable() {
      override def run(): Unit = {
        // poll each executorId here
      }
    }
    // schedule repeated task
    pool.scheduleAtFixedRate(poller, 0, waitBetweenPolls, TimeUnit.MILLISECONDS)

This also lets you gracefully stop the poller via:

    // gracefully shutdown the poller
    pool.shutdown()
    pool.awaitTermination(30, TimeUnit.SECONDS)

squito · 2013-05-16

Good point, I will make that change. This also got me thinking -- do I even want to create a new thread at all? Is there an appropriate thread pool for these repeated tasks already?

squito · 2013-05-16T00:04:20Z

@rxin yeah ganglia could provide something pretty close to this. But I thought this was useful b/c

(1) this is so simple, it's useful to have even if you don't set up ganglia. And as this is integrated right into Spark, it's easier to connect these measures w/ what's going on in your code (you don't just have a ganglia graph w/ times, which you've then got to connect back to what was going on in your code). It's not measuring the exact same thing as ganglia would w/ core utilization, but I can't make a really strong case for why this is particularly better. If there is eventually really tight integration w/ ganglia, then maybe this could get ripped out.

(2) I was hoping that this might get used for more thorough polling of the executors, eg. stack trace sampling, task progress, etc. So might be a useful stepping stone even if the "core utilization" part is dropped eventually.

mateiz · 2013-06-13T20:46:51Z

Imran, this approach looks good to me, but I'm going to send it to Patrick, who's been looking at monitoring stuff too. I think these are reasonable API calls to add to the listener though.

pwendell · 2013-06-14T18:13:49Z

Hey Imran,

Just a high level question (haven't done a close look yet). If this is all of the information we are collecting - why do you need to poll the executors in the first place?

The information about which tasks are running on which executor, and when, is available directly at the driver. You could actually get much finer-grained utilization statistics using that information without the need to add RPCs.

Patrick
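As a rough illustration of the driver-only alternative, the sketch below computes utilization over a time window from task start and end times. `TaskInterval` and `DriverSideUtilization` are hypothetical names, and the calculation assumes each task occupies exactly one core while it runs. It reflects only the driver's view of when tasks start and finish.

```scala
// Hypothetical record of one task as seen by the driver (times in milliseconds).
final case class TaskInterval(executorId: String, startMs: Long, endMs: Long)

object DriverSideUtilization {
  // Fraction of available core-time occupied by tasks within [windowStartMs, windowEndMs).
  // Assumes each task occupies exactly one core while it runs.
  def utilization(
      tasks: Seq[TaskInterval],
      totalCores: Int,
      windowStartMs: Long,
      windowEndMs: Long): Double = {
    require(totalCores > 0, "totalCores must be positive")
    require(windowEndMs > windowStartMs, "window must be non-empty")
    // Clip each task to the window and sum the overlapping durations.
    val busyMs = tasks.map { t =>
      val overlap = math.min(t.endMs, windowEndMs) - math.max(t.startMs, windowStartMs)
      math.max(0L, overlap)
    }.sum
    busyMs.toDouble / (totalCores.toDouble * (windowEndMs - windowStartMs))
  }
}
```

Unlike fixed-interval polling, this uses exact task boundaries, so the result does not depend on a sampling rate.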

squito · 2013-06-15T15:47:54Z

Ah well, part of the motivation for this is that we noticed huge delays between when the executor thinks it's finished a task and when the driver has fully registered it -- over 50% of the time, actually, for some of our workloads. We were able to fix this when we discovered it (in our case it seemed to be mostly the cost of merging the accumulator results on the driver), but I'd really like to make this inefficiency more obvious.

If we really wanted to, we could probably do this entirely within the driver, but I guess I did it this way b/c I was hoping to piggyback more info on top of those RPCs in the future -- eg., maybe jvm metrics, shuffle status, counters could also come via the same mechanism.

stephenh · 2013-06-17T18:52:55Z

FWIW I've wanted a quick & dirty "what's the cluster utilization like?" UI for a long time--agreed that other tools should be used for more extensive monitoring, but it would be nice to have some really basic info available for free/out of the box.

(I haven't looked at the code, so can't comment on the approach/implementation, just chiming in to say I'd enjoy seeing this happen.)

AmplabJenkins · 2013-08-05T21:33:55Z

Thank you for your pull request. An admin will review this request soon.

Imran Rashid added 8 commits May 15, 2013 14:22

add simple ExecutorStatus

8908db3

plumb ExecutorStatus through system. (still missing polling)

ffeee81

regularly poll the executors for their status (just standalone cluster for now)

6bdd9be

add StageStarted event

89f4cb5

StatsReportListener summarizes executor utilization at end of every stage

656422c

setup executor status polling in local scheduler

9008a55

stage started events when waiting stages start

f71adb1

b/c of initialization order, dag scheduler may be null

fb0db76

Imran Rashid added 4 commits June 19, 2013 22:38

change ExecutorStatusPoller to use a concurrent executor with a fixed rate scheduler

b7db7b6

combine allExecutors and totalCoreCount

8441f11

shutdown the statusPoller

718af51

rename fields in ExecutorActivitySummary to make it more obvious that it aggregates many samples

f8bba7a

pwendell pushed a commit to andyk/mesos-spark that referenced this pull request May 5, 2014

delete no use var

4bf24f7

Author: wangfei <wangfei_hello@126.com> Closes mesos#613 from scwf/masterIndex and squashes the following commits: 1463056 [wangfei] delete no use var: masterIndex
