Building Shark Master Branch

michaeljones46 edited this page May 2, 2014 · 5 revisions

Shark's latest master branch depends on Spark's master branch, which is usually not yet published to Maven. We can, however, publish Spark to the local Ivy repository:

git clone
cd spark
sbt/sbt package publish-local
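To confirm that publish-local actually wrote something, you can list what ended up in the local Ivy repository. The sketch below is a hypothetical helper, not part of Spark or Shark. It assumes sbt's default Ivy home of ~/.ivy2, and the organization name you pass to it (for example org.apache.spark) is an assumption to check against your own repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the modules published under one organization
# in a local Ivy repository.
#   $1  organization name (assumed, e.g. org.apache.spark)
#   $2  Ivy home directory (optional, defaults to ~/.ivy2)
# The body runs in a subshell so its variables do not leak into the caller.
list_local_ivy_modules() (
  org="$1"
  ivy_home="${2:-$HOME/.ivy2}"
  repo_dir="$ivy_home/local/$org"
  if [ ! -d "$repo_dir" ]; then
    echo "nothing published under $repo_dir" >&2
    return 1
  fi
  # One module directory per line.
  ls -1 "$repo_dir"
)
```

Calling list_local_ivy_modules org.apache.spark after the build should print one line per published module. A non-zero exit status means the directory does not exist, which usually points to a failed publish or a different organization name.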

Then check out the AMPLab distribution of Apache Hive and build it.

git clone -b shark-0.11
cd hive
ant package

ant package builds all Hive jars and puts them into the build/dist directory. On the EC2 AMI, you may first have to install ant-antlr.noarch and ant-contrib.noarch:

yum install ant-antlr.noarch
yum install ant-contrib.noarch
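Once ant package finishes, a quick way to check that the Hive build produced its jars is to count them under build/dist. This is a minimal sketch. count_jars is a hypothetical helper name, and it searches the whole directory tree because the exact subdirectory layout is not assumed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the .jar files anywhere below a directory.
#   $1  directory to search (e.g. build/dist inside the Hive checkout)
# Prints the count; prints 0 and returns 1 if the directory does not exist.
count_jars() (
  dir="$1"
  [ -d "$dir" ] || { echo 0; return 1; }
  # tr strips the padding that some wc implementations add.
  find "$dir" -type f -name '*.jar' | wc -l | tr -d ' '
)
```

Run from the hive directory, count_jars build/dist should print a number greater than zero. A result of 0 suggests the build did not complete.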

Now check out Shark:

git clone
cd shark

Edit the configuration file conf/

#!/usr/bin/env bash


export HIVE_DEV_HOME="/scratch/rxin/hive"
export HIVE_HOME="$HIVE_DEV_HOME/build/dist"

SPARK_JAVA_OPTS="-Dspark.local.dir=/tmp "
SPARK_JAVA_OPTS+="-Dspark.kryoserializer.buffer.mb=10 "
SPARK_JAVA_OPTS+="-verbose:gc -XX:-PrintGCDetails -XX:+PrintGCTimeStamps "

export SCALA_VERSION=2.9.3
export SCALA_HOME="/scratch/rxin/scala-2.9.3"
export SPARK_HOME="/scratch/rxin/spark"
export HADOOP_HOME="/scratch/rxin/hadoop-"
export JAVA_HOME="/usr/lib/jvm/java-6-openjdk/jre"
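The paths in this configuration are specific to one machine, so they need to be changed to match your own checkout locations. The sketch below reports which variables are unset or point to a directory that does not exist. check_env_dirs is a hypothetical helper, not something shipped with Shark.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for each variable NAME given as an argument, report
# whether it is set and points to an existing directory.
# Returns 0 only if every named variable passes.
# The body runs in a subshell so its variables do not leak into the caller.
check_env_dirs() (
  status=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name points to a missing directory: $value"
      status=1
    else
      echo "$name ok: $value"
    fi
  done
  return "$status"
)
```

After sourcing the configuration file, a call such as check_env_dirs HIVE_HOME SCALA_HOME SPARK_HOME HADOOP_HOME JAVA_HOME gives one line per variable, so a wrong path shows up before the Shark build is started.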

Finally, build Shark:

sbt/sbt package