We want to estimate the travel time of a trip between two points in a city. Our input data are car locations, produced to Kafka every 5 seconds.
Because the data volume is huge, we divide the city into 10,000 points, assign each location record to its closest point, compute the weights between points with a Spark aggregation, and produce the result back to Kafka.
Example: Tehran divided into points.
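As a rough illustration of the idea, the sketch below reads location records from Kafka with Spark Structured Streaming, snaps each record to the closest point of a regular grid, and aggregates per-point counts as a stand-in for the edge weights. This is a minimal sketch, not the project's actual pipeline: the topic names (`car-locations`, `point-weights`), the JSON schema, and the grid origin/spacing are assumptions for illustration.

```python
# A minimal sketch, NOT the project's actual pipeline code. It assumes a
# "car-locations" topic with JSON records {car_id, lat, lon, timestamp},
# a regular grid over the city, and a "point-weights" output topic.
# Running it needs the Kafka connector, e.g.
#   spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.2 ...
from pyspark.sql import SparkSession
import pyspark.sql.functions as F
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("road-traffic-sketch").getOrCreate()

schema = (StructType()
          .add("car_id", StringType())
          .add("lat", DoubleType())
          .add("lon", DoubleType())
          .add("timestamp", TimestampType()))

# Read the raw location stream from Kafka and parse the JSON payload.
locations = (spark.readStream
             .format("kafka")
             .option("kafka.bootstrap.servers", "localhost:9093")
             .option("subscribe", "car-locations")
             .load()
             .select(F.from_json(F.col("value").cast("string"), schema).alias("r"))
             .select("r.*"))

# Snap each record to the closest point of a 100x100 grid (10,000 points).
# With a regular grid this reduces to rounding to the grid resolution;
# the origin and spacing below are made-up values for illustration.
LAT0, LON0, STEP = 35.5, 51.0, 0.005
snapped = locations.withColumn(
    "point_id",
    (F.round((F.col("lat") - LAT0) / STEP) * 100
     + F.round((F.col("lon") - LON0) / STEP)).cast("long"))

# A windowed aggregation as a stand-in for the real weight computation
# (which would pair consecutive points per car to estimate travel time):
# here we simply count reports per point per minute.
weights = (snapped
           .withWatermark("timestamp", "1 minute")
           .groupBy(F.window("timestamp", "1 minute"), "point_id")
           .count())

# Serialize the aggregates and produce them back to Kafka.
query = (weights
         .select(F.to_json(F.struct("window", "point_id", "count")).alias("value"))
         .writeStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "localhost:9093")
         .option("topic", "point-weights")
         .option("checkpointLocation", "storage/checkpoints/sketch")
         .outputMode("update")
         .start())
query.awaitTermination()
```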
Install the prerequisites, clone the project, and set up a Python virtual environment:

```bash
sudo apt update -y
sudo apt install -y git python3 python3-pip python3-venv
git clone https://github.com/hoseinlook/road-traffic-graph.git
cd road-traffic-graph
cp -n .env.example .env
nano .env  # adjust the environment variables as needed
python3.8 -m venv venv
source venv/bin/activate
pip install -U pip
pip install -r requirements.txt
```
To run this project, first bring up the infrastructure (Kafka and ZooKeeper) with Docker:

```bash
docker-compose up
```
- Kafka bootstrap server: localhost:9093
- ZooKeeper server: localhost:2181
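Once the containers are up, a quick way to sanity-check the broker is a produce/consume round trip with the `kafka-python` client. This snippet is illustrative and not part of the project; it needs an extra dependency (`pip install kafka-python`) and uses an arbitrary topic name:

```python
# An illustrative broker round trip, not part of the project; assumes the
# kafka-python package and an arbitrary "smoke-test" topic.
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(bootstrap_servers="localhost:9093")
producer.send("smoke-test", b"hello kafka")
producer.flush()

consumer = KafkaConsumer("smoke-test",
                         bootstrap_servers="localhost:9093",
                         auto_offset_reset="earliest",
                         consumer_timeout_ms=5000)
for message in consumer:
    print(message.value)  # expect b'hello kafka'
    break
```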
Now start the PySpark pipeline:

```bash
source venv/bin/activate
python -m pipeline
```
- Spark web UI: localhost:4040
In a second terminal, generate sample car-location data:

```bash
source venv/bin/activate
python -m generate_data
```
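For a feel of what such a generator produces, a minimal stand-in might look like the following. The topic name, schema, and coordinate ranges are assumptions matching the earlier sketch; the real `generate_data` module may work differently:

```python
# A minimal stand-in for a location generator, assuming the "car-locations"
# topic and JSON schema from the pipeline sketch above; the real
# generate_data module may differ. Requires kafka-python.
import json
import random
import time
from datetime import datetime, timezone

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9093",
    value_serializer=lambda d: json.dumps(d).encode("utf-8"))

while True:
    record = {
        "car_id": f"car-{random.randint(1, 100)}",
        "lat": 35.5 + random.random() * 0.5,   # rough latitude band, made up
        "lon": 51.0 + random.random() * 0.7,   # rough longitude band, made up
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    producer.send("car-locations", record)
    time.sleep(5)  # cars report their location every 5 seconds
```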
You can inspect Spark checkpoints, Kafka data, and ZooKeeper data in the `storage` directory.
To explore the results with Spark in Jupyter (batch mode, not streaming):

```bash
jupyter-notebook .
```
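Inside the notebook you can read the aggregated output back from Kafka as a static DataFrame. The `point-weights` topic name follows the earlier sketch and is an assumption about the project's actual topics; the session needs the same spark-sql-kafka connector package as the pipeline:

```python
# Read the aggregated output back as a static DataFrame (batch, not
# streaming); "point-weights" is an assumed topic name from the sketch above.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("traffic-explore").getOrCreate()

weights = (spark.read
           .format("kafka")
           .option("kafka.bootstrap.servers", "localhost:9093")
           .option("subscribe", "point-weights")
           .option("startingOffsets", "earliest")
           .load()
           .select(F.col("value").cast("string").alias("value")))

weights.show(20, truncate=False)
```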