Real-Time-Stock-Market-Data-Pipeline-with-Kafka

This project builds a real-time data engineering pipeline that processes stock market data with Apache Kafka. It combines Kafka with Python and several AWS services to ingest, store, and query streaming data.

Technologies Employed

Programming Language: Python
Amazon Web Services (AWS):
- S3 (Simple Storage Service)
- Athena
- Glue Crawler
- Glue Catalog
- EC2
Apache Kafka for real-time data streaming
SQL for querying data and analysis

Project Architecture

The architecture uses Kafka for real-time data ingestion and AWS services for storage (S3), cataloging (Glue Crawler and Glue Catalog), and querying (Athena). It illustrates a typical data engineering workflow for large-scale streaming data.
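The ingestion side of this architecture can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes records are JSON-encoded and that the producer object exposes `send(topic, value=...)`, as kafka-python's `KafkaProducer` does. The topic name, the record fields, and the S3 key layout in `s3_key_for` are hypothetical.

```python
import json
from typing import Any, Iterable, Mapping


def encode_record(record: Mapping[str, Any]) -> bytes:
    """Serialize one record to UTF-8 JSON bytes for use as a Kafka message value."""
    return json.dumps(dict(record), sort_keys=True).encode("utf-8")


def publish_records(producer: Any, topic: str, records: Iterable[Mapping[str, Any]]) -> int:
    """Send each record to `topic` and return the number of records sent.

    `producer` is assumed to expose send(topic, value=...), as kafka-python's
    KafkaProducer does; any object with that method works.
    """
    count = 0
    for record in records:
        producer.send(topic, value=encode_record(record))
        count += 1
    return count


def s3_key_for(topic: str, sequence: int) -> str:
    """Build a hypothetical S3 object key for the record with this sequence number."""
    return f"{topic}/record_{sequence:08d}.json"
```

With kafka-python installed, a `KafkaProducer` instance configured with the broker address could be passed as `producer`. On the consumer side, each received message could be written to S3 under a key such as the one `s3_key_for` returns, which is one possible layout for a Glue Crawler to scan.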

Dataset

The pipeline does not depend on a particular dataset; the focus is on the operational aspects of building and managing it. The dataset used here is available in the files section of this repository.
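Because the pipeline is dataset-agnostic, one way to feed it is to read any CSV file row by row and treat each row as a record. The sketch below assumes a CSV with a header line; the column names in the sample are hypothetical and do not describe the repository's dataset.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


def iter_rows(source: TextIO) -> Iterator[Dict[str, str]]:
    """Yield each CSV row as a dict keyed by the header line."""
    for row in csv.DictReader(source):
        yield dict(row)


# In-memory sample standing in for a real file opened with open(path, newline="").
sample = io.StringIO("Date,Close\n2020-01-02,10.5\n")
rows = list(iter_rows(sample))
```

Each dict in `rows` can then be serialized and sent to Kafka as one message.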
