
For SPARK-527, Support spark-shell when running on YARN #868

Open · wants to merge 1 commit into master
Conversation

colorant

In the current YARN mode, the application runs inside the Application Master as a user program, so the whole SparkContext lives on a remote node.

This approach cannot support applications that involve local interaction and need to run where they are launched.

So in this pull request I add a YarnClientClusterScheduler and a corresponding backend.

With this scheduler, the user application is launched locally, while the executors are launched by YARN on remote nodes through a thin Application Master that only launches the executors and monitors the Driver actor's status, so that when the client application is done, it can finish the YARN application as well.

This enables spark-shell to run on YARN.

It also enables other Spark applications to run their SparkContext locally with the master URL "yarn-client". Thus, e.g., SparkPi can print its result on the local console instead of in the log on the remote machine where the AM runs.

The docs are also updated to show how to use this yarn-client mode.
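As a hedged sketch of how launching in this mode might look (the env-var names and jar path below are assumptions based on 0.8-era Spark conventions, not quoted from this patch's docs):

```shell
# Hypothetical launch sketch for yarn-client mode; SPARK_JAR path and the
# MASTER variable handling are assumptions, not the patch's actual docs.
export YARN_CONF_DIR=/etc/hadoop/conf   # tell the client where the YARN cluster config lives
export SPARK_JAR=./repl-bin/target/spark-repl-bin-0.8.0-SNAPSHOT-shaded.jar
MASTER=yarn-client ./spark-shell        # driver runs locally; executors are launched by YARN
```

With this, the shell's SparkContext stays on the local machine while the thin AM on the cluster only manages executors.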

@AmplabJenkins

Thank you for your pull request. An admin will review this request soon.

@tgravescs

Hey Raymond, this is great. I tried this out quick and I have a couple of comments/questions.

I had to use the repl shaded jar not the yarn jar because otherwise the workers would throw an exception about the ExecutorClassLoader not found.

How did you modify the classpath to pick up the YarnClientImpl on the client side? If I run spark-shell it gets the error:
java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/client/YarnClientImpl
I had to add the hadoop classpath into the run script.
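That workaround might look roughly like this (a sketch only; it assumes the 0.8-era run script honors SPARK_CLASSPATH, and uses the standard `hadoop classpath` command, which prints the Hadoop jars' classpath):

```shell
# Sketch of the workaround: make the YARN client classes (e.g. YarnClientImpl)
# visible to the local driver by prepending the Hadoop classpath.
export SPARK_CLASSPATH="$(hadoop classpath):$SPARK_CLASSPATH"
./spark-shell
```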

The paths to the example jar and yarn shaded jar in your instructions are incorrect (although the normal yarn example is too); it's in examples/target/spark-examples-0.8.0-SNAPSHOT.jar.

I'm not that familiar with the spark-shell yet; do you know how much load this puts on the client? Thinking about it more, it's probably no more load than, say, the Apache Hive shell puts on the client.

I tried this on both a non-secure and secure yarn cluster and both worked!

thanks.

@colorant (Author)

@tgravescs glad to know that it works for you ;)

You are right: when running spark-shell, the repl fat jar should be used; I will update the doc. About the YarnClientImpl path, I didn't do anything special to make it work; I think it should already be included in the fat jar. For me, as long as I export YARN_CONF_DIR, everything works.

About the example and yarn shaded jars, they do have the Scala version prefix on my side; I'm not sure whether that's because we don't have the same build environment. I use sbt/sbt assembly to prepare the package; I haven't tried the mvn build.

The shell by itself shouldn't put much load on the client, I think, while other client programs might, say if they have heavy application logic or involve a huge number of small tasks that need to talk to the executors a lot.

Thanks for the comments ;)

@rxin (Member)

rxin commented Aug 29, 2013

Cool - when you are done with this, can you help us make Shark work on YARN too? :)

@tgravescs

Ah, I'm using mvn; that must produce different output directories than sbt.

I'm not sure how you get YarnClientImpl unless sbt packages differently, because it is not in spark-repl-bin-0.8.0-SNAPSHOT-shaded.jar; it is in spark-yarn-0.8.0-SNAPSHOT-shaded.jar. Are you setting the classpath any other way? Would you mind setting SPARK_PRINT_LAUNCH_COMMAND=1 and seeing what your classpath looks like? It seems like the compute-classpath script should include yarn, or we should have a shaded jar that includes both now.
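For reference, that debugging step is a one-liner (a sketch; in 0.8-era Spark this variable made the launch script echo the full java command, including the classpath, before starting the JVM):

```shell
# Print the exact java command (and thus the classpath) that spark-shell
# is launched with, without otherwise changing its behavior.
SPARK_PRINT_LAUNCH_COMMAND=1 ./spark-shell
```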

@tgravescs

Sorry, ignore my comment; it does look like sbt is adding YarnClientImpl into the repl assembly jar. The mvn package command does not seem to do that. I'll file a separate issue for it.

@mateiz (Member)

mateiz commented Aug 30, 2013

Guys, regarding the output JARs, take a look at it after this patch: #857. It will make the JAR file the same for sbt and Maven.

Matei


@mateiz (Member)

mateiz commented Aug 31, 2013

Also, I saw there's a TODO about lost executor IDs. How important is that?

@colorant (Author)

colorant commented Sep 2, 2013

@mateiz Oh, I think detection of lost executors should already be handled by the Driver actor through Akka remote events. I was just wondering whether there are other mechanisms in the YARN framework that could be used to enhance or complement the error/failure detection.

@mateiz mentioned this pull request Sep 5, 2013
@falaki

falaki commented Sep 13, 2013

Will this be included in 0.8?

@colorant (Author)

Hi, I have updated the patch to the latest code: synced with the new assembly, scripts, and package paths, and the YARN docs are updated as well. Please review.

@mateiz (Member)

mateiz commented Oct 22, 2013

Hey @colorant, I'm curious, can you submit this against the new Apache Spark repo (https://github.com/apache/incubator-spark)? It would be a great feature to include in the next 0.8.x release.

@colorant (Author)

@mateiz OK, sure, I will sync the code to the current trunk and submit a pull request there.
