We use this tool heavily to convert a couple of GB of JSON files into Avro every day.
It would be useful for us if the tool accepted multiple JSON files as input, hence this commit.
Previously, to convert a batch of JSON files, json2avro could only be used like this:
cat file1.json file2.json file3.json | json2avro -S schema_file output.avro
With this patch, json2avro can also be used like this:
json2avro -S schema_file file1.json file2.json file3.json output.avro
This removes the need for cat, or any other utility, to concatenate the input files.
For a batch of 160 MB of JSON files, running json2avro with multiple input files saves between 1 and 1.5 seconds compared with the cat pipeline above.
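The argument handling implied by the new usage can be sketched as follows. This is only an illustration, not the code in this patch: `positional_args` and `split_positional` are hypothetical names, and `first` is assumed to be the index of the first non-option argument (for example `optind` after `getopt`). The last positional argument is treated as the output file and everything before it as input files; an empty input list means the tool should fall back to reading stdin, which keeps the old `cat ... | json2avro` form working.

```c
#include <stddef.h>

/* Hypothetical helper for illustration; not taken from json2avro's source. */
typedef struct {
    char **inputs;       /* input JSON paths; n_inputs == 0 means read stdin */
    size_t n_inputs;
    const char *output;  /* last positional argument: the Avro output file */
} positional_args;

/*
 * Split argv[first..argc-1] into input files and one output file.
 * `first` is assumed to be the index of the first non-option argument.
 * Returns 0 on success, -1 if no output path is present.
 */
static int split_positional(int argc, char **argv, int first,
                            positional_args *out)
{
    if (out == NULL || first < 0 || first >= argc)
        return -1;                       /* no positional arguments at all */

    out->inputs = argv + first;          /* everything before the last one */
    out->n_inputs = (size_t)(argc - 1 - first);
    out->output = argv[argc - 1];        /* the last one is the output */
    return 0;
}
```

With this split, the caller would loop over `inputs`, opening each file in turn, or read stdin once when `n_inputs` is zero.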