ak-abhilash/Real-Time-Stock-Market-Data-Pipeline-with-Kafka

Real-Time-Stock-Market-Data-Pipeline-with-Kafka

This project builds a real-time data engineering pipeline that processes stock market data with Apache Kafka. The pipeline combines Kafka with Python and several AWS services to ingest, store, catalog, and query streaming data.

Technologies Employed

  • Programming Language: Python
  • Amazon Web Services (AWS):
    • S3 (Simple Storage Service)
    • Athena
    • Glue Crawler
    • Glue Catalog
    • EC2
  • Apache Kafka for real-time data streaming
  • SQL for querying and analyzing data
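
As a rough illustration of the SQL step, the sketch below runs an aggregate query over a few made-up price rows. It uses Python's built-in `sqlite3` module purely as a local stand-in for Athena. The table name `stock_data`, its columns, and the sample values are assumptions for illustration, not the repository's actual schema.

```python
import sqlite3

# Hypothetical sample rows: (symbol, close price). The values are made up.
ROWS = [
    ("AAA", 10.0),
    ("AAA", 14.0),
    ("BBB", 20.0),
]


def average_close_by_symbol(rows):
    """Return {symbol: average close} using a plain SQL GROUP BY."""
    conn = sqlite3.connect(":memory:")  # local stand-in for Athena
    try:
        conn.execute("CREATE TABLE stock_data (symbol TEXT, close REAL)")
        conn.executemany("INSERT INTO stock_data VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT symbol, AVG(close) FROM stock_data "
            "GROUP BY symbol ORDER BY symbol"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(average_close_by_symbol(ROWS))
```

A similar `SELECT ... GROUP BY` statement could be run in Athena once a table exists in the Glue Catalog, although Athena's SQL dialect differs from SQLite's in places.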

Architecture

[Figure: Project Architecture]

The architecture uses Kafka for real-time data ingestion and AWS services for storage (S3), cataloging (Glue Crawler and Glue Catalog), and querying (Athena). It illustrates a typical data engineering workflow for large-scale streaming data.
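
The flow described above can be sketched locally without any external services. This is a minimal simulation, not the repository's code: a `queue.Queue` stands in for a Kafka topic, a plain dictionary stands in for an S3 bucket, and the function names and the object key scheme are hypothetical.

```python
import json
import queue


def produce(records, topic):
    """Serialize each record to JSON bytes and publish it to the topic."""
    for record in records:
        topic.put(json.dumps(record).encode("utf-8"))


def consume_to_store(topic, store, prefix="stock_market"):
    """Drain the topic, writing one JSON object per message into the store."""
    count = 0
    while True:
        try:
            message = topic.get_nowait()
        except queue.Empty:
            break  # nothing left to read in this simulation
        key = f"{prefix}_{count}.json"  # hypothetical object key scheme
        store[key] = json.loads(message.decode("utf-8"))
        count += 1
    return count


if __name__ == "__main__":
    topic = queue.Queue()  # stand-in for a Kafka topic
    bucket = {}            # stand-in for an S3 bucket
    produce([{"symbol": "AAA", "close": 10.0}], topic)
    consume_to_store(topic, bucket)
    print(bucket)
```

In a real deployment the producer and consumer would be separate processes talking to a Kafka broker, and the consumer would write objects to S3 so that a Glue Crawler could catalog them for Athena.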

Dataset

The pipeline can be adapted to different datasets; the focus is on the operational aspects of building and managing it. The dataset is available in the repository's files section.
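
One way to keep the producer side independent of a particular dataset is to build each message from whatever columns the CSV header declares. The sketch below assumes the dataset is a CSV file with a header row. The function name `rows_to_messages` and the two-column sample are hypothetical.

```python
import csv
import io
import json


def rows_to_messages(csv_text):
    """Yield one JSON-encoded message per CSV row, keyed by the header names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield json.dumps(row).encode("utf-8")


# Made-up sample standing in for the real dataset file.
SAMPLE = "symbol,close\nAAA,10.0\nBBB,20.0\n"

if __name__ == "__main__":
    for message in rows_to_messages(SAMPLE):
        print(message)
```

Because `csv.DictReader` reads every field as a string, any numeric conversion would have to happen downstream, for example in the consumer or in the table definition used for querying.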
