- Install sbt on your system, for example with Homebrew:
brew install sbt
- In the TwitterToKafka and Query5_TwitterToKafka files, change the Kafka bootstrap servers from localhost:9092 to kafka:9092:
props.put("bootstrap.servers", "kafka:9092")
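A quick way to confirm that no source file still points at the old address is to search for it. The sketch below is only an illustration: the helper name `leftover_localhost` is hypothetical, and it assumes the Scala sources are `.scala` files under `src/` (the usual sbt layout).

```shell
# Hypothetical helper: list source lines that still mention localhost:9092.
# Assumes sources are *.scala files under src/ (pass another directory as $1).
# Exits 0 if leftovers were found, non-zero if none were found.
leftover_localhost() {
  grep -rnF --include='*.scala' 'localhost:9092' "${1:-src}"
}
```

Running `leftover_localhost` from the project root prints each matching file and line number; no output means every occurrence has been replaced.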
- In KafkaToMongo and Query5_KafkaToMongo, update the MongoDB URIs in the Spark session builder so that the host matches the MongoDB container name (mongo here):
.config("spark.mongodb.input.uri", "mongodb://root:root@mongo:27017")
.config("spark.mongodb.output.uri", "mongodb://root:root@mongo:27017")
- In a terminal, navigate to the project directory and run:
sbt clean assembly
This command builds a single JAR file that bundles the project's files together with all of its library dependencies.
Building the JAR can take some time (around 5 minutes), and the resulting file is quite large (200+ MB),
so it is better not to push it to the remote repository.
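One way to keep the JAR out of the repository is to ignore the build output directory. The sketch below assumes the JAR is written somewhere under `target/`, which is the sbt default; `ignore_once` is a hypothetical helper name, not part of this project.

```shell
# Hypothetical helper: add a pattern to .gitignore unless it is already listed.
ignore_once() {
  pattern="$1"
  file="${2:-.gitignore}"
  touch "$file"
  # Nothing to do if the exact pattern is already present.
  grep -qxF -- "$pattern" "$file" && return 0
  # Make sure the file ends with a newline before appending to it.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$pattern" >> "$file"
}
```

Calling `ignore_once 'target/'` from the project root adds the entry once; repeated calls leave `.gitignore` unchanged.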
- To execute an object packaged in the JAR file, run the following command in the terminal:
java -cp <JAR_file_name> <object_name>
- The same command is used to run the JAR file from Airflow through a BashOperator.
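To avoid typing the JAR path by hand, the run step can be wrapped in a small script. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the JAR is assumed to sit under `target/`, and its name is assumed to follow the sbt-assembly default pattern `<name>-assembly-<version>.jar`.

```shell
# Hypothetical helpers; the names and the target/ location are assumptions.

# Print the path of an assembly JAR found under target/ (or under $1).
# Returns non-zero if no matching JAR exists.
find_assembly_jar() {
  jar=$(find "${1:-target}" -type f -name '*-assembly-*.jar' 2>/dev/null | head -n 1)
  [ -n "$jar" ] && printf '%s\n' "$jar"
}

# Run the given object from that JAR, mirroring:
#   java -cp <JAR_file_name> <object_name>
run_object() {
  object="$1"
  jar=$(find_assembly_jar) || {
    echo "No assembly JAR found; run 'sbt clean assembly' first." >&2
    return 1
  }
  java -cp "$jar" "$object"
}
```

For example, `run_object KafkaToMongo` would run that object from the project root (prefix the package name if the object is declared inside one). If these functions are saved in a script, a call to that script could in principle serve as the `bash_command` of the BashOperator, provided it runs from the project directory.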