This project requires Python 3.x and the following Python libraries installed:
This project is part of the Data Science Nanodegree program by Udacity, in collaboration with Figure Eight. The dataset contains pre-labelled tweets and messages from real-life disasters. In this project, messages are categorized so that each one can be sent to the appropriate disaster relief agency.
This project is divided into three sections:
- ETL pipeline: extracts data from the source, transforms it for analysis, then loads it into a SQLite database.
- Machine learning pipeline: trains a model to classify disaster messages.
- Web app: lets a disaster relief agency categorize new messages.
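The extract, transform, and load steps can be illustrated with a minimal sketch that uses only the Python standard library. It is not the project's `process_data.py`: the column names (`id`, `message`, `genre`, `water`, `floods`), the sample rows, and the table name `messages` are assumptions made up for the example, and the real script may rely on other libraries.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the real CSV layout is not described in this README.
MESSAGES_CSV = """id,message,genre
1,We need water and food,direct
2,Roads are flooded near the bridge,news
2,Roads are flooded near the bridge,news
"""

CATEGORIES_CSV = """id,water,floods
1,1,0
2,0,1
2,0,1
"""


def extract(text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(messages, categories):
    """Transform: join both sources on id and skip duplicate ids."""
    labels = {row["id"]: row for row in categories}
    seen, merged = set(), []
    for row in messages:
        if row["id"] in seen or row["id"] not in labels:
            continue
        seen.add(row["id"])
        cat = labels[row["id"]]
        merged.append((int(row["id"]), row["message"].strip(), row["genre"],
                       int(cat["water"]), int(cat["floods"])))
    return merged


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages "
        "(id INTEGER PRIMARY KEY, message TEXT, genre TEXT, "
        "water INTEGER, floods INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO messages VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


# An in-memory database keeps the example from touching any files.
conn = sqlite3.connect(":memory:")
rows = transform(extract(MESSAGES_CSV), extract(CATEGORIES_CSV))
load(rows, conn)
stored = conn.execute(
    "SELECT id, genre, water, floods FROM messages ORDER BY id"
).fetchall()
print(stored)
```

Storing one 0/1 column per category, as assumed here, gives a later training step one target column per label.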
Run the following commands in the project's root directory to set up your database and model.
- To run the ETL pipeline that cleans the data and stores it in the database:
python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
- To run the ML pipeline that trains the classifier and saves it:
python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
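Both commands pass their file paths as positional arguments. As a rough sketch of how a script could read the three ETL arguments, the example below uses `argparse`. The argument names `messages_filepath`, `categories_filepath`, and `database_filepath` are hypothetical, and the project's scripts may parse their arguments differently.

```python
import argparse


def build_parser():
    """Build a parser for the three positional paths of the ETL command."""
    parser = argparse.ArgumentParser(
        description="Clean disaster messages and store them in SQLite."
    )
    # Hypothetical argument names, chosen only for this illustration.
    parser.add_argument("messages_filepath", help="CSV file with the messages")
    parser.add_argument("categories_filepath", help="CSV file with the categories")
    parser.add_argument("database_filepath", help="SQLite database to write")
    return parser


# Parse the same paths used in the ETL command above. A real script would
# call parse_args() with no arguments so that it reads sys.argv.
args = build_parser().parse_args([
    "data/disaster_messages.csv",
    "data/disaster_categories.csv",
    "data/DisasterResponse.db",
])
print(args.messages_filepath, args.categories_filepath, args.database_filepath)
```

With positional arguments, a missing path makes `argparse` exit with a usage message instead of failing later with a less clear error.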
Run the following command in the app's directory to run your web app.
python run.py
Go to http://0.0.0.0:3001/ in your browser.
Screenshots of the app interface:
- Bar chart that shows distribution of message genres:
- Bar chart that shows distribution of message categories:
This project is licensed under the MIT License. See the LICENSE file for details.