Spark-enabled GOR
GOR made scalable through the Apache Spark engine (https://spark.apache.org).
To build from source:
git clone git@github.com:gorpipe/gor-spark.git
cd gor-spark
./gradlew clean installDist
After the build completes, you can run Spark SQL queries from within GOR:
spark/build/install/gor-scripts/bin/gorpipe "select * from genes.gor limit 10"
spark/build/install/gor-scripts/bin/gorpipe "create xxx = select * from <(select * from genes.gor) where Gene_Symbol like 'B%'; gor [xxx] | top 10"
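The second command defines a named intermediate result with a create statement and then reads it back in a GOR query as [xxx]. Below is a minimal Python sketch of composing that query string and passing it to the built binary. It assumes the install path shown above and that the query is passed as a single argument; build_query and run_gorpipe are hypothetical helper names for illustration, not part of GOR.

```python
import subprocess
from pathlib import Path

# Path produced by `./gradlew clean installDist`, relative to the gor-spark checkout.
GORPIPE = Path("spark/build/install/gor-scripts/bin/gorpipe")


def build_query(name: str, source: str, prefix: str, rows: int = 10) -> str:
    """Compose a create statement followed by a GOR query over its result."""
    create = (
        f"create {name} = select * from <(select * from {source}) "
        f"where Gene_Symbol like '{prefix}%'"
    )
    return f"{create}; gor [{name}] | top {rows}"


def run_gorpipe(query: str) -> str:
    """Run gorpipe with the query as one argument and return its stdout."""
    result = subprocess.run(
        [str(GORPIPE), query], capture_output=True, text=True, check=True
    )
    return result.stdout


# Reproduces the query string used in the second command above.
query = build_query("xxx", "genes.gor", "B")
print(query)
```

Calling run_gorpipe(query) from the gor-spark directory would then execute it. Passing the query as a list element avoids shell quoting problems with the embedded single quotes and angle brackets.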
Scala demo: gorspark.scala
spark-shell --packages org.gorpipe:gor-spark:3.10.2 --exclude-packages "org.apache.logging.log4j:log4j-core,org.apache.logging.log4j:log4j-api" -I gorspark.scala
Python demo: gorspark.py
pyspark --packages org.gorpipe:gor-spark:3.10.2 --exclude-packages "org.apache.logging.log4j:log4j-core,org.apache.logging.log4j:log4j-api" -I gorspark.py
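Both demo commands share the same package coordinate and the same log4j exclusions. As a rough sketch, the shared flags can be kept in one place and the command line assembled from them. The launcher_args helper is a hypothetical name, and the -I flag is simply copied from the commands above rather than verified for each shell.

```python
import shlex

# Package coordinate and exclusions taken from the demo commands above.
GOR_SPARK = "org.gorpipe:gor-spark:3.10.2"
EXCLUDED = [
    "org.apache.logging.log4j:log4j-core",
    "org.apache.logging.log4j:log4j-api",
]


def launcher_args(shell: str, script: str) -> list[str]:
    """Build the argument list for a Spark shell with gor-spark on the classpath."""
    return [
        shell,
        "--packages", GOR_SPARK,
        "--exclude-packages", ",".join(EXCLUDED),  # comma-separated, no spaces
        "-I", script,  # flag copied as-is from the commands above
    ]


scala_cmd = launcher_args("spark-shell", "gorspark.scala")
print(shlex.join(scala_cmd))
```

The resulting list can be handed to subprocess.run directly, so a version bump of gor-spark only needs to be made in one constant.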