
Disaster Response Pipeline Project

This project builds a disaster response pipeline that classifies disaster messages efficiently. During disaster events, messages must be sorted and categorized quickly and accurately so that they reach the appropriate disaster relief agencies. The project provides a web app that lets emergency workers input new messages and receive classification results across multiple categories.

Project Overview

We use a dataset of real disaster messages provided by Appen (formerly Figure 8) to build a machine learning pipeline that categorizes these messages. The project consists of three main components:

  • ETL Pipeline: A data cleaning pipeline that loads the messages and categories datasets, merges them, cleans the data, and stores the result in a SQLite database (see the ETL sketch after this list).
  • ML Pipeline: A machine learning pipeline that loads data from the SQLite database, splits it into training and test sets, builds a combined text-processing and machine learning pipeline, trains and tunes a model using GridSearchCV, reports results on the test set, and exports the final model as a pickle file (see the ML sketch after this list).
  • Flask Web App: A web app where an emergency worker can input a new message and get classification results in several categories. The web app also displays visualizations of the data.
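
In outline, the ETL steps map onto a handful of pandas and SQLAlchemy calls. The sketch below is illustrative rather than the repo's actual process_data.py; the table name messages and the semicolon-separated category format are assumptions based on the standard layout of this dataset.

```python
# Minimal sketch of the ETL steps; illustrative, not the repo's process_data.py.
import pandas as pd
from sqlalchemy import create_engine

# Load the two raw CSV files and merge them on their shared id column.
messages = pd.read_csv("data/disaster_messages.csv")
categories = pd.read_csv("data/disaster_categories.csv")
df = messages.merge(categories, on="id")

# Expand the semicolon-separated categories column (e.g. "related-1;request-0;...")
# into one binary column per category.
cats = df["categories"].str.split(";", expand=True)
cats.columns = cats.iloc[0].str.slice(stop=-2)          # "related-1" -> "related"
cats = cats.apply(lambda col: col.str[-1].astype(int))  # keep the trailing 0/1

# Replace the raw column, drop duplicate rows, and persist to SQLite.
df = pd.concat([df.drop(columns=["categories"]), cats], axis=1).drop_duplicates()
engine = create_engine("sqlite:///data/DisasterResponse.db")
df.to_sql("messages", engine, index=False, if_exists="replace")  # table name assumed
```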
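Likewise, a minimal version of the ML pipeline can be sketched with scikit-learn. The estimator, parameter grid, and column names below are assumptions; the repo's train_classifier.py may differ.

```python
# Minimal sketch of the ML pipeline; illustrative, not the repo's train_classifier.py.
import joblib
import pandas as pd
from sqlalchemy import create_engine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Load the cleaned data written by the ETL pipeline.
engine = create_engine("sqlite:///data/DisasterResponse.db")
df = pd.read_sql_table("messages", engine)
X = df["message"]
Y = df.drop(columns=["id", "message", "original", "genre"])  # columns assumed

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

# TF-IDF text features feeding one classifier per category.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(RandomForestClassifier())),
])

# Small illustrative grid; GridSearchCV refits the best model automatically.
params = {"clf__estimator__n_estimators": [50, 100]}
model = GridSearchCV(pipeline, params, cv=3)
model.fit(X_train, Y_train)
print(model.score(X_test, Y_test))

joblib.dump(model.best_estimator_, "models/cls_model.pkl")
```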

Web App - Home page (images/web-app-1.png)

Web App - Classified text page (images/web-app-2.png)
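
For orientation, the classification page boils down to one Flask route that runs the pickled model on the submitted message. This is a hedged sketch, not the repo's actual run.py; the route, template name, port, and relative paths are illustrative.

```python
# Illustrative sketch of the classification endpoint; the route, template
# name, port, and relative paths are assumptions, not the repo's run.py.
import joblib
import pandas as pd
from flask import Flask, render_template, request
from sqlalchemy import create_engine

app = Flask(__name__)

# Load the cleaned data (for category names) and the trained model once.
engine = create_engine("sqlite:///../data/DisasterResponse.db")
df = pd.read_sql_table("messages", engine)
category_names = df.columns[4:]  # assumes id, message, original, genre come first
model = joblib.load("../models/cls_model.pkl")

@app.route("/go")
def go():
    # Classify the message typed by the emergency worker and render one
    # predicted label per category on the results page.
    query = request.args.get("query", "")
    labels = model.predict([query])[0]
    results = dict(zip(category_names, labels))
    return render_template("go.html", query=query, classification_result=results)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=3001, debug=True)  # port is illustrative
```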

Required Libraries

This code requires Python 3.8 or higher and the following libraries:

  • Flask==2.2.3
  • joblib==1.2.0
  • nltk==3.8.1
  • numpy==1.22.4
  • pandas==1.5.0
  • plotly==5.14.0
  • scikit_learn==1.2.2
  • SQLAlchemy==1.4.47

Getting Started

To set up this project on your local machine, follow these steps:

  1. Clone this repository: git clone git@github.com:peronecode/disaster-response.git
  2. Install the required Python packages: pip install -r requirements.txt
  3. Run the ETL pipeline script: python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
  4. Run the ML pipeline script: python models/train_classifier.py data/DisasterResponse.db models/cls_model.pkl (a quick sanity check of the exported model is shown after these steps)
  5. Run the web app: cd app && python run.py
  6. Open a web browser and go to the address printed in the terminal.
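
As a quick sanity check after step 4, the exported model can be loaded and queried directly in Python. The path below assumes you run this from the repository root; if the pickled pipeline references a custom tokenizer (e.g. from utils/text_util.py), that module must be importable before loading.

```python
# Quick sanity check of the exported model; run from the repository root.
# If the pickled pipeline uses a custom tokenizer, the module that defines
# it (e.g. utils/text_util.py) must be importable before loading.
import joblib

model = joblib.load("models/cls_model.pkl")
message = ["We need water and medical supplies after the storm"]
print(model.predict(message))  # one 0/1 prediction per category
```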

Project Structure

  • app/: Contains the Flask web app code and templates.
  • data/: Contains the raw data, ETL pipeline script (process_data.py), and the SQLite database generated by the ETL pipeline.
  • models/: Contains the ML pipeline script (train_classifier.py) and the trained model as a pickle file.
  • requirements.txt: Lists the required Python packages for this project.
  • utils/: Contains text_util.py, shared text-processing helpers used by both pipelines to avoid duplicated code (a sketch of this kind of helper follows this list).
  • images/: Screenshots of the web app.
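
As an illustration of the kind of helper utils/text_util.py might centralize, here is a hypothetical tokenizer built on nltk; the actual function name and processing steps in the repo may differ.

```python
# Hypothetical sketch of the kind of helper utils/text_util.py centralizes;
# the actual function name and steps in the repo may differ.
import re
import nltk
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

nltk.download(["punkt", "wordnet"], quiet=True)

def tokenize(text):
    """Lowercase, strip punctuation, tokenize, and lemmatize a message."""
    text = re.sub(r"[^a-zA-Z0-9]", " ", text.lower())
    lemmatizer = WordNetLemmatizer()
    return [lemmatizer.lemmatize(tok) for tok in word_tokenize(text)]
```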

Licensing, Author and Acknowledgements

The dataset of real disaster messages was provided by Appen (formerly Figure 8). This project was developed for the Udacity Data Scientist Nanodegree; thanks, Udacity!
