This is a Spark template for generating a new Spark application project. It comes bundled with:
- main and test source directories
- ScalaTest
- ScalaCheck
- SBT configuration for sbt 0.13.0, Scala 2.10.4, and ScalaTest 2.0 dependencies
- project name, package, and version customizable as variables
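In a giter8 template, such variables conventionally live in a `default.properties` file under `src/main/g8`. A hypothetical example (the actual property names and defaults in this template may differ):

```
name=basic-spark
package=com.example
version=0.1.0-SNAPSHOT
```

When you run `g8`, each property becomes one of the short questions, with the value shown here offered as the default.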
First, install Conscript:
$ curl https://raw.github.com/n8han/conscript/master/setup.sh | sh
The Conscript command is installed as ~/bin/cs. If ~/bin is on your PATH, you can install giter8:
$ cs n8han/giter8
Next, the following command generates a new Spark application project:
$ g8 nttdata-oss/basic-spark-project
After answering a few short questions, you will have the project directory (default: basic-spark). In the project directory you will find README.rst, which explains how to run the sample applications.
You can read the sample project's README.rst from this link.
- Spark 1.2.0 -> Spark 1.3.1
- CDH5.2.1 -> CDH5.3.3
- Spark 1.1.0 -> Spark 1.2.1
- CDH5.2.1 -> CDH5.3.1
Added the following sample applications, adapted slightly from the official Spark examples. The differences are not in the algorithms but in the mechanisms for handling classes and parameters.
- WordCount, RandomTextWriter (the test data generator for WordCount) and Words (dictionary file)
- GroupByTest
- SparkLR
- SparkHdfsLR and SparkLRTestDataGenerator (the test data generator for SparkHdfsLR)
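The WordCount sample follows the classic map-reduce pattern. Stripped of the Spark API, its core logic (split lines into words, count occurrences) can be sketched in plain Scala as below; the object name `WordCountSketch` is hypothetical, and the Spark version applies the same flatMap/map/reduce pipeline to an RDD[String] instead of a local Seq:

```scala
object WordCountSketch {
  // Count word occurrences in a sequence of lines -- the same
  // transformation chain the Spark sample applies to an RDD[String]
  // produced by RandomTextWriter.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))   // split each line into words
      .filter(_.nonEmpty)         // drop empty tokens
      .groupBy(identity)          // group identical words
      .map { case (word, occurrences) => (word, occurrences.size) }

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("spark spark hadoop", "spark"))
    println(counts("spark"))  // 3
    println(counts("hadoop")) // 1
  }
}
```

In the actual Spark sample, `groupBy` plus `map` would be replaced by `map(w => (w, 1)).reduceByKey(_ + _)`, which avoids materializing the per-word groups.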
- ScalaTest 2.0 for testing
- sbt 0.12.4
- Scala 2.10.3
- Spark 0.9.0
- Hadoop 2.3 (CDH5)
- SparkPi
- ScalaTest 2.0 for testing
- sbt 0.12.4
- Scala 2.10.3
- Spark 0.9.0
- Hadoop 2.2 (CDH5b2)
- SparkPi