
This repository serves as a hands-on implementation of a Big Data platform focused on processing parliamentary data from the website of the Moroccan Parliament. The project aims to calculate Key Performance Indicators (KPIs) to evaluate the engagement level of each government.


chaimaebouyarmane/Big_Data


Parliament Data Processing - Big Data Implementation

This project implements a Big Data platform that processes parliamentary data collected from the website of the Moroccan Parliament. Its primary goal is to calculate Key Performance Indicators (KPIs) that measure each government's level of engagement.

Project Features

  • Data Extraction: Uses Python and BeautifulSoup to scrape parliamentary data from the web.
  • Big Data Tools: Uses the Cloudera distribution with Hadoop, HDFS, MapReduce, Hive, and HBase for scalable, distributed data processing.
  • Visualization: Uses PowerBI to build visualizations and dashboards that present the calculated KPIs.
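As a rough illustration of the data extraction feature, the sketch below parses a small HTML table with BeautifulSoup. The markup, the table id `questions`, the column layout, and the function name `parse_questions` are all hypothetical; the real parliament pages will have a different structure, and the repository's own scripts may work differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the actual pages will differ.
SAMPLE_HTML = """
<table id="questions">
  <tr><th>Ministry</th><th>Question</th><th>Answered</th></tr>
  <tr><td>Health</td><td>Hospital staffing</td><td>yes</td></tr>
  <tr><td>Education</td><td>School transport</td><td>no</td></tr>
</table>
"""


def parse_questions(html):
    """Return one dict per data row of the (assumed) questions table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="questions")
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) != 3:
            continue  # the header row has <th> cells only, so it is skipped
        ministry, question, answered = cells
        rows.append(
            {
                "ministry": ministry,
                "question": question,
                "answered": answered == "yes",
            }
        )
    return rows


if __name__ == "__main__":
    for row in parse_questions(SAMPLE_HTML):
        print(row)
```

In a real run, the HTML would come from an HTTP request to the parliament website rather than from an inline string.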

Tools Used

  • Cloudera: Big Data platform for data management and processing.
  • Python: Programming language for scripting and data manipulation.
  • BeautifulSoup: Python library for web scraping and data extraction.
  • Hadoop: Distributed storage and processing framework.
  • HDFS: Hadoop Distributed File System for reliable and scalable storage.
  • MapReduce: Programming model for processing large datasets.
  • Hive: Data warehouse infrastructure for querying and managing large datasets.
  • HBase: NoSQL database for real-time, scalable data storage.
  • PowerBI: Business intelligence tool for data visualization and reporting.
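To make the MapReduce model concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce phases in plain Python. It computes a response rate per government, which is only an assumed example of a KPI; the input format (tab-separated government and answered flag) and all function names are hypothetical and not taken from the repository. On a cluster, Hadoop would run the same phases across many machines.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Emit (government, (1, answered_flag)) for one tab-separated input line."""
    government, answered = line.strip().split("\t")
    yield government, (1, 1 if answered == "yes" else 0)


def reducer(key, values):
    """Sum the counts for one government and return its response rate."""
    total = 0
    answered = 0
    for count, flag in values:
        total += count
        answered += flag
    return key, answered / total


def run_job(lines):
    """Simulate map -> shuffle/sort -> reduce on a list of input lines."""
    pairs = [pair for line in lines for pair in mapper(line)]
    pairs.sort(key=itemgetter(0))  # shuffle phase: bring equal keys together
    results = {}
    for key, group in groupby(pairs, key=itemgetter(0)):
        name, rate = reducer(key, (value for _, value in group))
        results[name] = rate
    return results


if __name__ == "__main__":
    sample = ["gov_a\tyes", "gov_b\tyes", "gov_a\tno"]
    print(run_job(sample))
```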

Getting Started

  1. Set Up Cloudera: Install and configure Cloudera on your system.
  2. Clone the Repository: Clone this repository to your local machine using git clone https://github.com/chaimaebouyarmane/Big_Data.git.
  3. Data Extraction: Run the Python scripts to extract the parliamentary data.
  4. Big Data Processing: Use Hadoop, MapReduce, Hive, and HBase to process the extracted data.
  5. Visualization: Use PowerBI to visualize the calculated KPIs.
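One possible bridge between steps 3 and 4 is to write the scraped records as tab-separated text, a format that can be copied into HDFS and read by a Hive table declared with tab-delimited fields. The sketch below shows that serialization step only; the field names and the helper `to_tsv` are assumptions for illustration, not the repository's actual schema.

```python
import csv
import io

# Hypothetical column order for the exported file.
FIELDS = ["government", "ministry", "answered"]


def to_tsv(records):
    """Serialize a list of dicts to tab-separated text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for record in records:
        writer.writerow([record[field] for field in FIELDS])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {"government": "gov_a", "ministry": "Health", "answered": "yes"},
        {"government": "gov_a", "ministry": "Education", "answered": "no"},
    ]
    print(to_tsv(sample), end="")
```

No header row is written here, on the assumption that the column names are declared in the table definition instead.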

Contact 👥

Feel free to reach out if you have any questions or suggestions:

Chaimae BOUYARMANE
