Skip to content

This project analyzes greenhouse gas emissions data by country over the years. The goal is to identify trends and patterns in emissions across different nations, providing insights into the global distribution of greenhouse gas emissions.

License

Notifications You must be signed in to change notification settings

Exploratory-Data-Analyses/greenhouse-gas-emission-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Greenhouse Gas Emissions Analysis

Overview

This project involves analyzing greenhouse gas emissions data to understand emissions trends across different countries and years. The analysis includes data cleaning, visualization of emissions by country and year, and forecasting future emissions using a Random Forest model.

Tools

  • Pandas: For data manipulation and cleaning.
  • Matplotlib: For creating visualizations.
  • Scikit-learn: For building and evaluating the Random Forest model.
  • NumPy: For numerical operations.
  • Janitor: For data cleaning utilities.

Folder Structure

.
├── data/
│   └── data.csv                # Input data file with greenhouse gas emissions data
├── figures/
│   ├── country.jpg             # Bar plot of emissions by country
│   ├── yearly.jpg              # Line plot of yearly emissions
│   ├── country-yearly.jpg      # Bar plot of emissions by countries over years
│   └── forecast.jpg            # Line plot of forecasted emissions
├── scripts/
│   └── analysis.py             # Main script for data analysis
├── requirements.txt            # Required Python packages
├── README.md                   # This readme file
├── data-description.md         # Description of the dataset
└── methodology.md              # Methodology used in the analysis
└── results.md                  # Results and findings from the analysis

Installation

To set up the environment for this project, install the required Python packages using:

pip install requirements.txt

Usage

  1. Prepare the Data: Ensure that the dataset is placed in the data folder and named data.csv.

  2. Run the Analysis: Execute the main script to perform the data analysis:

    python scripts/analysis.py

    The script performs the following tasks:

    • Data Loading and Cleaning: Loads the dataset and performs initial cleaning operations, including renaming columns and converting units.

    • Data Visualization:

      • Emissions by Country: Generates a bar plot to show total emissions for each country.
      • Yearly Emissions: Creates a line plot to visualize the trend of emissions over the years.
      • Yearly Emissions by Country: Produces a bar plot showing emissions from the top countries over time.
    • Predictive Modeling: Uses a Random Forest model to predict future greenhouse gas emissions and visualize the forecast.

Analysis Details

Data Loading and Cleaning

The dataset is loaded and cleaned by renaming columns and adjusting units. The year is converted to a period index to handle time spans effectively.

Data Visualization

  1. Emissions by Country: A bar plot is generated to illustrate the total greenhouse gas emissions by country, providing insight into which countries have the highest contributions.

  2. Yearly Emissions: A line plot shows the trend of total emissions over the years, helping identify patterns or changes in emission levels.

  3. Yearly Emissions by Country: A bar plot visualizes emissions by the top countries across different years, highlighting significant contributors and trends.

Predictive Modeling

A Random Forest model is used to predict future emissions. The model is trained on historical data, and the Mean Absolute Percentage Error (MAPE) is calculated to evaluate its performance. Forecasts for future years are visualized to provide insight into potential future emission levels.

Results

The analysis yields several key insights, including:

  • The top countries contributing to greenhouse gas emissions.
  • Trends in emissions over the years.
  • Forecasts of future emissions based on historical data.

Contributing

Contributions are welcome! Please fork the repository and submit pull requests with improvements or additional features.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.


The conversation continues on Kaggle

About

This project analyzes greenhouse gas emissions data by country over the years. The goal is to identify trends and patterns in emissions across different nations, providing insights into the global distribution of greenhouse gas emissions.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages