Prerequisites for running this:
- A functional Druid cluster.
- A version of Hive that supports the Druid Storage Handler. This includes Apache Hive version 2.2 or later, or Hortonworks Data Platform (HDP) 2.6 or later.
- Apache Maven and gcc, for the data generator.
Before continuing, identify the following:
- Your desired data scale, in gigabytes. For example, a scale of 1000 is roughly 1 TB of data.
- Your HiveServer2 host:port (a quick connectivity check is sketched after this list)
- The Druid overlord host
- The username and password for your Druid metadata database
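To confirm the HiveServer2 host:port before you start, you can open a Beeline session against it. This is a minimal sketch assuming an unsecured cluster; the host, port, and user shown are placeholders, and a Kerberized cluster needs extra JDBC URL parameters.

# Replace host, port, and user with your own values.
beeline -u "jdbc:hive2://hive.example.com:10500" -n hive -e "SELECT 1;"

If this returns a single row, the host:port you pass to the scripts below should work.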
Process:
- Build the data generator (native code)
- Package the data generator into a JAR file so it can be run as a MapReduce job that generates data inside the Hadoop cluster
- Run a MapReduce job to generate "CSV" data within HDFS
- Run a Hive job to convert this "CSV" data into Hive tables (see the first sketch after this list)
- Run a Hive job to push pre-aggregated data into Druid (see the second sketch after this list). If you're not using HDP, this step may require you to create additional HDFS directories and set their permissions.
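First sketch: the CSV-to-Hive step amounts to exposing the generated files as an external table and rewriting them into an ORC-backed table. The table name, columns, delimiter, and HDFS path below are hypothetical; the real DDL is in the scripts driven by 00load.sh.

# convert_csv.sql (hypothetical): map generated CSV files, then copy to ORC.
cat > convert_csv.sql <<'EOF'
CREATE EXTERNAL TABLE lineitem_csv (
  l_orderkey BIGINT,
  l_quantity DOUBLE,
  l_shipdate STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION '/tmp/benchmark-data/lineitem';

CREATE TABLE lineitem STORED AS ORC AS
SELECT * FROM lineitem_csv;
EOF

beeline -u "jdbc:hive2://hive.example.com:10500" -f convert_csv.sql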
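Second sketch: the push into Druid goes through Hive's DruidStorageHandler. Everything here is illustrative: the table and column names are hypothetical, the HDFS paths and ownership are examples (HDP typically creates them for you), and the hive.druid.* property names should be verified against your Hive version before relying on them.

# Create HDFS working directories for Druid segments if they do not exist.
hdfs dfs -mkdir -p /apps/druid/warehouse
hdfs dfs -chown druid:hadoop /apps/druid/warehouse

# push_to_druid.sql (hypothetical): index pre-aggregated rows into a Druid
# datasource via Hive's DruidStorageHandler. Point the properties at your
# own Overlord and metadata database.
cat > push_to_druid.sql <<'EOF'
SET hive.druid.overlord.address.default=druid.example.com:8090;
SET hive.druid.metadata.uri=jdbc:mysql://druid.example.com:3306/druid;
SET hive.druid.metadata.username=druid;
SET hive.druid.metadata.password=password;

CREATE TABLE ssb_druid
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES (
  "druid.segment.granularity" = "MONTH",
  "druid.query.granularity" = "DAY")
AS
SELECT CAST(l_shipdate AS TIMESTAMP) AS `__time`,
       l_orderkey,
       SUM(l_quantity) AS total_quantity
FROM lineitem
GROUP BY CAST(l_shipdate AS TIMESTAMP), l_orderkey;
EOF

beeline -u "jdbc:hive2://hive.example.com:10500" -f push_to_druid.sql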
If all goes well, you only need to run three commands:
- sh 00datagen.sh [scale] [hiveserver2:port]
- sh 00load.sh [scale] [hiveserver2:port] [overlord] [username] [password]
- sh 00run.sh [hiveserver2:port]
Example to run at scale 100:
sh 00datagen.sh 100 hive.example.com:10500
sh 00load.sh 100 hive.example.com:10500 druid.example.com druid password
sh 00run.sh hive.example.com:10500