diff --git a/README.md b/README.md
index f522a158..6f01b8bc 100755
--- a/README.md
+++ b/README.md
@@ -6,76 +6,4 @@ modification to the existing data nor queries. Shark supports Hive's query langu
 metastore, serialization formats, and user-defined functions.
 
 
-## Build
-
-Shark requires Hive 0.7.0 and Spark (0.4-SNAPSHOT).
-
-Get Hive from Apache:
-
-    $ export HIVE_HOME=/path/to/hive
-    $ wget http://archive.apache.org/dist/hive/hive-0.7.0/hive-0.7.0-bin.tar.gz
-    $ tar xvzf hive-0.7.0-bin.tar.gz
-    $ mv hive-0.7.0-bin $HIVE_HOME
-
-Get Spark from Github, compile, and publish to local ivy:
-
-    $ git clone https://github.com/mesos/spark.git spark
-    $ cd spark
-    $ sbt/sbt publish-local
-
-Get Shark from Github:
-
-    $ git clone git://github.com/amplab/shark.git shark
-    $ cd shark
-
-Before building Shark, first modify the config file:
-
-    $ conf/shark-env.sh
-
-Compile Shark (make sure `$HIVE_HOME` is set in `conf/shark-env.sh`):
-
-    $ sbt/sbt products
-
-
-## Execution
-
-There are several executables in /bin:
-
-* `shark`: Runs Shark CLI.
-* `shark-withinfo`: Runs Shark with INFO level logs printed to the console.
-* `shark-withdebug`: Runs Shark with DEBUG level logs printed to the console.
-* `shark-shell`: Runs Shark scala console. This provides an experimental feature
-to convert Hive QL queries into `TableRDD`.
-* `clear-buffer-cache.py`: Automatically clears OS buffer caches on Mesos EC2
-clusters. This is handy for performance studies.
-
-
-## Runtime Configuration
-
-Shark reuses Hive's configuration files, which are loaded from `$HIVE_HOME/conf`.
-
-We also include a few Shark-specific configuration parameters that can be set
-in the same way as you would set configuration parameters in Hive (e.g. from the
-Shark CLI):
-
-    shark> set shark.exec.mode = [hive | shark (default)]
-    shark> set shark.explain.mode = [hive | shark (default)]
-
-
-## Caching
-
-Shark caches tables in memory as long as their name ends in "`_cached`". For example,
-if you have a table named "test", you can create a cached version of it as follows:
-
-    shark> CREATE TABLE test_cached AS SELECT * FROM test;
-
-
-References
----------
-For information on setting up Hive or HiveQl, please read:
-https://cwiki.apache.org/confluence/display/Hive/GettingStarted
-
-For information on Spark, please read:
-https://github.com/mesos/spark
-
-
+# For current documentation, see the [Shark Project Wiki](https://github.com/amplab/shark/wiki)