RTBHOUSE/bq-tool

Big Query ETL Tool

Big Query ETL tool for creating a Big Query table schema from an Avro schema and serializing Avro objects to JSON.

Creating Big Query schema:

hadoop jar bq-tool-1.0-jar-with-dependencies.jar com.rtbhouse.bq.avro.SchemaConverter hdfs:///user/myuser/avro_schema_file.avsc > bq_schema.bqsc

Creating JSON that can be loaded into Big Query with a "bq load" job:

hadoop jar bq-tool-1.0-jar-with-dependencies.jar com.rtbhouse.bq.avro.AvroToJson com.rtbhouse.bq.avro.AvroToJson -s avro_schema_file.avsc -i /user/myuser/in_avro_files.avro -o /user/myuser/out_json_files

usage:  hadoop jar bq-tool-1.0-jar-with-dependencies.jar com.rtbhouse.bq.avro.AvroToJson com.rtbhouse.bq.avro.AvroToJson []
options:
-f,--file         Avro file or directory to be processed.
-m,--mapsize      Max split mapsize in MB.
-o,--output       HDFS output directory.
-r,--rowsize      Max json row size bytes.
-s,--avroschema   Avro schema file to be processed.
-u,--usage        Print usage.
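
The README stops at producing the JSON files, so the following is a minimal sketch of the remaining "bq load" step. It is not part of bq-tool. It assumes the AvroToJson output is newline-delimited JSON, that the file written by SchemaConverter is accepted by bq as a schema file, and that the table name (mydataset.mytable) and the local file name (out.json) are hypothetical placeholders. The script only prints the commands by default; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: copy the JSON produced by AvroToJson out of HDFS and load it into Big Query.
# Assumptions (not from this README): the table name, the local file name, that the
# JSON output is newline-delimited, and that bq_schema.bqsc is a valid bq schema file.
set -eu

# Print the command when DRY_RUN=1 (the default); execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: load_json_to_bq HDFS_DIR SCHEMA_FILE TABLE [LOCAL_JSON]
load_json_to_bq() {
  hdfs_dir=$1
  schema_file=$2
  table=$3
  local_json=${4:-out.json}

  # Merge the part files from the HDFS output directory into one local file.
  run hadoop fs -getmerge "$hdfs_dir" "$local_json"

  # Load the merged file, passing the generated schema file as the last argument.
  run bq load --source_format=NEWLINE_DELIMITED_JSON "$table" "$local_json" "$schema_file"
}

load_json_to_bq /user/myuser/out_json_files bq_schema.bqsc mydataset.mytable
```

Merging to a local file keeps the sketch short; for large outputs, copying the files to Cloud Storage and pointing "bq load" at a gs:// URI is the more usual route.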
