ZTM -> NiFi -> Kafka -> Spark -> Kafka -> MongoDB
The project can be run locally or in a cloud environment.
To run the project locally, use docker-compose with the configuration written in:
https://github.com/m-qlas/public-transport-traffic-analisys/blob/main/docker-compose/nifi-kafka-single-mongo.yml.
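Assuming the compose file has been downloaded into the repository's docker-compose directory (the exact path is an assumption), the stack can be brought up with standard Docker Compose commands, for example:

```shell
# Start all containers defined in the compose file, detached.
docker-compose -f docker-compose/nifi-kafka-single-mongo.yml up -d

# Verify that every service is up.
docker-compose -f docker-compose/nifi-kafka-single-mongo.yml ps
```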
Docker containers with Zookeeper, Kafka, Kafka Connect, NiFi and MongoDB will be started.
After the NiFi container has started, import and run the templates from: https://github.com/m-qlas/public-transport-traffic-analisys/tree/main/Nifi_templates
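Templates can be imported through the NiFi web UI (the template-upload icon in the toolbar). Alternatively, as a sketch assuming an unsecured NiFi exposed on localhost:8080 (the port depends on the compose file, and the template filename here is a placeholder), the REST API can be used:

```shell
# Upload a template XML into the root process group via the NiFi REST API.
curl -X POST \
  -F template=@Nifi_templates/example_template.xml \
  http://localhost:8080/nifi-api/process-groups/root/templates/upload
```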
No additional configuration is needed after the container starts.
The configuration files from:
https://github.com/m-qlas/public-transport-traffic-analisys/tree/main/Kafka_connect_confs
have to be placed at /etc/kafka inside the Kafka Connect container.
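Assuming the Kafka Connect container is named kafka-connect (the actual name depends on the compose file; check `docker ps`), the files can be copied in with:

```shell
# Copy the Kafka Connect worker and connector configs into the container.
docker cp Kafka_connect_confs/connect-standalone.properties kafka-connect:/etc/kafka/
docker cp Kafka_connect_confs/connect-mongo.properties kafka-connect:/etc/kafka/
```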
Next, the Kafka Connect sink to MongoDB can be started inside the container with the command:
connect-standalone /etc/kafka/connect-standalone.properties /etc/kafka/connect-mongo.properties
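For reference, a minimal connect-mongo.properties for the official MongoDB Kafka sink connector typically looks like the sketch below; the topic names, database and connection URI here are placeholders, not necessarily the project's actual values:

```properties
name=mongo-sink
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
tasks.max=1
topics=tram,bus
connection.uri=mongodb://mongo:27017
database=transport
# When no collection is set, the sink writes each topic to a
# collection with the same name as the topic.
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
```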
Activate a Python environment with the pyspark library on your machine. First, create the DataFrame with the schedule by running the script in:
https://github.com/m-qlas/public-transport-traffic-analisys/blob/main/ImportSchedule.py
After the download has finished, start stream processing for trams and buses with:
https://github.com/m-qlas/public-transport-traffic-analisys/blob/main/BusStream.py
https://github.com/m-qlas/public-transport-traffic-analisys/blob/main/TramStream.py
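The stream jobs consume vehicle-position events from Kafka and relate them to the imported schedule. As a rough illustration of the kind of per-record logic involved (the field names and timestamp format here are hypothetical, not the project's actual schema), a vehicle's delay against a schedule entry could be computed like this:

```python
import json
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # hypothetical timestamp format

def compute_delay_seconds(event_json: str, scheduled_time: str) -> float:
    """Return how many seconds late a vehicle is, given a JSON position
    event and the scheduled time for the same stop (both hypothetical)."""
    event = json.loads(event_json)
    actual = datetime.strptime(event["time"], TIME_FORMAT)
    planned = datetime.strptime(scheduled_time, TIME_FORMAT)
    return (actual - planned).total_seconds()

# Example: a tram reported at 12:05:30 against a 12:03:00 schedule entry.
record = '{"line": "17", "vehicle": "3612", "time": "2023-05-01 12:05:30"}'
print(compute_delay_seconds(record, "2023-05-01 12:03:00"))  # 150.0
```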
Once everything is configured and the streams are running, data will be written to the tram and bus collections in MongoDB. Example queries to retrieve the data can be found in: https://github.com/m-qlas/public-transport-traffic-analisys/blob/main/mongo_commands.js
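For a quick check that data is actually flowing, one can open a mongosh session against the MongoDB container and inspect the collections (collection names as described above; the database depends on the connector configuration):

```javascript
// Inside a mongosh session, in the database the sink connector writes to:
db.tram.countDocuments()   // how many tram records have arrived so far
db.bus.find().limit(5)     // sample a few bus records
```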