SoundaryaBaskaran/ML-Project

End-to-End Machine Learning Project

Student Performance Indicator

Introduction to the Data

Problem Statement

The objective of this project is to investigate how various factors influence student performance (test scores). We will explore the relationship between test scores and the following variables:

  1. Gender: Analyze whether there are gender-based differences in performance.
  2. Ethnicity: Investigate how ethnicity impacts test scores.
  3. Parental Level of Education: Explore correlations between parents’ education levels and student performance.
  4. Lunch: Examine whether the type of lunch (standard or free/reduced) affects test scores.
  5. Test Preparation Course: Understand the impact of completing a test preparation course on student outcomes.

Dataset Information

The dataset includes the following columns:

  • Gender: Sex of students (Male/Female)
  • Race/Ethnicity: Ethnicity of students (Group A, B, C, D, E)
  • Parental Level of Education: Parents’ final education (Bachelor’s degree, Some college, Master’s degree, Associate’s degree, High school)
  • Lunch: Type of lunch (Standard or Free/Reduced)
  • Test Preparation Course: Whether the test preparation course was completed or not
  • Math Score
  • Reading Score
  • Writing Score

Dataset Source Link: https://www.kaggle.com/datasets/spscientist/students-performance-in-exams?datasetId=74977

The dataset consists of 8 columns and 1,000 rows.
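
As a quick sanity check on the schema listed above, the following minimal sketch validates a CSV header against the eight expected columns. The lowercase column names are an assumption about how the downloaded CSV labels them, `validate_header` is a hypothetical helper that is not part of this repository, and the sample row is made up purely to exercise the check.

```python
import csv
import io

# Assumed column names; adjust them if the downloaded CSV uses different labels.
CATEGORICAL_COLUMNS = [
    "gender",
    "race/ethnicity",
    "parental level of education",
    "lunch",
    "test preparation course",
]
SCORE_COLUMNS = ["math score", "reading score", "writing score"]
EXPECTED_COLUMNS = CATEGORICAL_COLUMNS + SCORE_COLUMNS


def validate_header(fieldnames):
    """Return the expected columns that are missing from a CSV header."""
    present = set(fieldnames or [])
    return [name for name in EXPECTED_COLUMNS if name not in present]


# A single made-up row, used only to exercise the check.
sample_csv = io.StringIO(
    ",".join(EXPECTED_COLUMNS) + "\n"
    "female,group B,bachelor's degree,standard,none,70,75,72\n"
)
reader = csv.DictReader(sample_csv)
missing = validate_header(reader.fieldnames)
rows = list(reader)
```

With the real file, `sample_csv` would be replaced by an open file handle, and an empty `missing` list would indicate that all eight columns are present.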

Approach for the project

  1. Data Ingestion:

    • In the data ingestion phase, the data is first read from a CSV file.
    • The data is then split into training and testing sets, and each set is saved as a CSV file.
  2. Data Transformation:

    • In this phase, a ColumnTransformer pipeline is created.
    • For numeric variables, standard scaling is applied.
    • For categorical variables, one-hot encoding is performed, after which the encoded data is scaled with StandardScaler.
    • The preprocessor is saved as a pickle file.
  3. Model Training:

    • In this phase, the base models are tested. The best model found was linear regression.
    • This model is saved as a pickle file.
  4. Prediction Pipeline:

    • This pipeline converts the input data into a DataFrame and provides functions that load the pickle files and predict the final result in Python.
  5. Flask App Creation:

    • A Flask app with a user interface is created to predict the math score inside a web application.

      [Image: Screenshot 2024-08-20 220514]
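
The ingestion, transformation, and training steps above can be sketched roughly as follows. This is a minimal illustration rather than the repository's actual code: the column names, the choice of the math score as the target, the file names, the 75/25 split, and the helper names (`make_toy_data`, `build_preprocessor`, `run`) are all assumptions, and the eight rows are made up so that the example runs without the real dataset. `StandardScaler(with_mean=False)` is used on the one-hot output because centering is not supported for sparse matrices.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed column names and target (the web app is described as predicting the math score).
NUMERIC = ["reading score", "writing score"]
CATEGORICAL = [
    "gender",
    "race/ethnicity",
    "parental level of education",
    "lunch",
    "test preparation course",
]
TARGET = "math score"


def make_toy_data():
    """Made-up rows for illustration only; not taken from the real dataset."""
    return pd.DataFrame({
        "gender": ["female", "male"] * 4,
        "race/ethnicity": ["group A", "group B", "group C", "group D"] * 2,
        "parental level of education": ["high school", "some college"] * 4,
        "lunch": ["standard", "standard", "free/reduced", "standard"] * 2,
        "test preparation course": ["none", "completed"] * 4,
        "math score": [60, 72, 55, 80, 66, 90, 48, 77],
        "reading score": [65, 70, 58, 78, 70, 88, 52, 74],
        "writing score": [63, 68, 57, 80, 72, 85, 50, 73],
    })


def build_preprocessor():
    """Scale numeric columns; one-hot encode, then scale, categorical columns."""
    numeric = Pipeline([("scaler", StandardScaler())])
    categorical = Pipeline([
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
        ("scaler", StandardScaler(with_mean=False)),  # no centering on sparse data
    ])
    return ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])


def run(workdir):
    # 1. Data ingestion: read the CSV, split it, and save both parts as CSV.
    raw_path = os.path.join(workdir, "data.csv")
    make_toy_data().to_csv(raw_path, index=False)
    df = pd.read_csv(raw_path)
    train_df, test_df = train_test_split(df, test_size=0.25, random_state=42)
    train_df.to_csv(os.path.join(workdir, "train.csv"), index=False)
    test_df.to_csv(os.path.join(workdir, "test.csv"), index=False)

    # 2. Data transformation: fit the preprocessor on training data and pickle it.
    preprocessor = build_preprocessor()
    X_train = preprocessor.fit_transform(train_df[NUMERIC + CATEGORICAL])
    X_test = preprocessor.transform(test_df[NUMERIC + CATEGORICAL])
    with open(os.path.join(workdir, "preprocessor.pkl"), "wb") as f:
        pickle.dump(preprocessor, f)

    # 3. Model training: fit linear regression and pickle the model.
    model = LinearRegression().fit(X_train, train_df[TARGET])
    with open(os.path.join(workdir, "model.pkl"), "wb") as f:
        pickle.dump(model, f)
    return model.predict(X_test)


with tempfile.TemporaryDirectory() as workdir:
    predictions = run(workdir)
    saved_files = sorted(os.listdir(workdir))
```

Fitting the preprocessor only on the training split keeps test-set statistics out of the scaler, and `handle_unknown="ignore"` lets the encoder tolerate categories that appear only in the test split.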

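The prediction pipeline described in step 4 might look something like the sketch below, assuming the preprocessor and model were pickled as in the earlier steps. `StudentRecord`, `load_object`, and `predict_math_score` are hypothetical names introduced for illustration, and the column names are assumed to match those used when the preprocessor was fitted.

```python
import pickle
from dataclasses import dataclass

import pandas as pd


@dataclass
class StudentRecord:
    """Hypothetical container for one set of user-supplied inputs."""
    gender: str
    race_ethnicity: str
    parental_level_of_education: str
    lunch: str
    test_preparation_course: str
    reading_score: float
    writing_score: float

    def to_dataframe(self):
        # Column names are assumed to match the ones seen during fitting.
        return pd.DataFrame([{
            "gender": self.gender,
            "race/ethnicity": self.race_ethnicity,
            "parental level of education": self.parental_level_of_education,
            "lunch": self.lunch,
            "test preparation course": self.test_preparation_course,
            "reading score": self.reading_score,
            "writing score": self.writing_score,
        }])


def load_object(path):
    """Load a pickled object (for example, the preprocessor or the model)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_math_score(record, preprocessor_path, model_path):
    """Transform one record with the saved preprocessor and predict with the saved model."""
    preprocessor = load_object(preprocessor_path)
    model = load_object(model_path)
    features = preprocessor.transform(record.to_dataframe())
    return float(model.predict(features)[0])
```

A web route could build a `StudentRecord` from the submitted form fields and pass it to `predict_math_score`. Pickle files should only be loaded from trusted locations, since unpickling can execute arbitrary code.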