This project is part of the Data Science Nanodegree Program by Udacity, in collaboration with Figure Eight. The dataset contains pre-labelled tweets and messages from real-life disaster events. The aim of the project is to build a Natural Language Processing (NLP) model that categorizes messages in real time.
The project is divided into the following sections:
- Data Processing: an ETL pipeline to extract data from the source, clean the data, and save it in a proper database structure
- Machine Learning Pipeline: to train a model able to classify text messages into categories
- Web App: to show model results in real time.
- Python 3.5+
- Machine Learning Libraries: NumPy, SciPy, Pandas, Scikit-Learn
- Natural Language Processing Libraries: NLTK
- SQLite Database Libraries: SQLAlchemy
- Model Loading and Saving Libraries: Pickle, Joblib
- Web App and Data Visualization: Flask, Plotly
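The dependencies can be installed with pip. NLTK also needs its corpora downloaded before the pipelines will run; a minimal setup sketch (the specific corpora are an assumption based on typical tokenization and lemmatization needs):

```python
# Install the dependencies first (run in a shell):
#   pip install numpy scipy pandas scikit-learn nltk sqlalchemy joblib flask plotly

# Download the NLTK data used for tokenization and lemmatization.
# The corpora names below are an assumption; adjust them to whatever
# the tokenizer in this repository actually requires.
import nltk
nltk.download(['punkt', 'wordnet', 'stopwords'])
```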
Clone the git repository:
git clone https://github.com/alirezakfz/Disaster_Response_Pipelines.git
Run the following commands in the project's root directory to set up your database and model.
- To run the ETL pipeline that cleans the data and stores it in the database:
python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
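Under the hood, process_data.py follows the classic ETL steps: load the two CSVs, merge them on the shared message id, expand the semicolon-separated categories column into binary flags, drop duplicates, and write the result to a SQLite table. A simplified sketch of those steps (the table name, and the `related-1;request-0;...` category format, are assumptions based on the Figure Eight dataset):

```python
import pandas as pd
from sqlalchemy import create_engine

# Extract: load messages and categories, then merge on the shared id column.
messages = pd.read_csv('data/disaster_messages.csv')
categories = pd.read_csv('data/disaster_categories.csv')
df = messages.merge(categories, on='id')

# Transform: expand 'related-1;request-0;...' into one binary column per category.
cats = df['categories'].str.split(';', expand=True)
cats.columns = [value.split('-')[0] for value in cats.iloc[0]]
for col in cats:
    cats[col] = cats[col].str[-1].astype(int)
df = pd.concat([df.drop(columns='categories'), cats], axis=1).drop_duplicates()

# Load: save the cleaned table to the SQLite database.
engine = create_engine('sqlite:///data/DisasterResponse.db')
df.to_sql('Messages', engine, index=False, if_exists='replace')
```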
- To run the ML pipeline that trains the classifier and saves it:
python models/train_classifier.py data/DisasterResponse.db models/classifier.gzip
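train_classifier.py builds a typical NLTK + Scikit-Learn text pipeline: vectorize the tokenized messages, weight them with TF-IDF, and fit one classifier per category with a multi-output wrapper. A hedged sketch of the core pipeline (the RandomForest estimator is an assumption; the actual script may use a different classifier):

```python
import joblib
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Bag-of-words -> TF-IDF -> one classifier per category column.
# A custom NLTK tokenizer can be plugged in via CountVectorizer(tokenizer=...).
pipeline = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', MultiOutputClassifier(RandomForestClassifier())),
])

# X is the message text, Y the binary category columns loaded from the database:
# pipeline.fit(X_train, Y_train)

# Persist the fitted model; explicit compression keeps the file small and
# matches the models/classifier.gzip path used in the command above.
# joblib.dump(pipeline, 'models/classifier.gzip', compress=3)
```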
Run the following command in the app's directory to launch the web app.
python run.py
Go to http://0.0.0.0:3001/
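run.py wires the trained model into a small Flask app: it loads the cleaned data and the serialized model at startup, renders Plotly visualizations of the training set, and classifies any message submitted through the form. A minimal sketch of the classification route (the route name, template, table name, and column offset are assumptions):

```python
import joblib
import pandas as pd
from flask import Flask, render_template, request
from sqlalchemy import create_engine

app = Flask(__name__)

# Load the cleaned data and the trained model once at startup.
engine = create_engine('sqlite:///data/DisasterResponse.db')
df = pd.read_sql_table('Messages', engine)        # table name is an assumption
model = joblib.load('models/classifier.gzip')

@app.route('/go')
def go():
    # Classify the submitted message and map predictions to category names.
    query = request.args.get('query', '')
    labels = model.predict([query])[0]
    results = dict(zip(df.columns[4:], labels))   # category columns assumed to start at index 4
    return render_template('go.html', query=query, classification_result=results)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=3001)
```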
In the Notebook_Workspace folder you can find two Jupyter notebooks that walk through how the model works step by step:
- ETL Preparation Notebook: learn everything about the implemented ETL pipeline
- ML Pipeline Preparation Notebook: look at the Machine Learning Pipeline developed with NLTK and Scikit-Learn
You can use ML Pipeline Preparation Notebook to re-train the model or tune it through a dedicated Grid Search section.
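The Grid Search section relies on Scikit-Learn's GridSearchCV, which tunes hyperparameters across the whole pipeline using the `<step>__<parameter>` naming convention. A hedged example (the grid values are illustrative, not the ones used in the notebook):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', MultiOutputClassifier(RandomForestClassifier())),
])

# Grid keys follow '<step>__<parameter>'; nested estimators add another level.
parameters = {
    'vect__ngram_range': [(1, 1), (1, 2)],
    'clf__estimator__n_estimators': [50, 100],
}

cv = GridSearchCV(pipeline, param_grid=parameters, cv=3, verbose=2)
# cv.fit(X_train, Y_train)
# print(cv.best_params_)
```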
app/templates/*: templates/html files for web app
data/process_data.py: An Extract, Transform, Load (ETL) pipeline used for data cleaning, feature extraction, and storing the data in a SQLite database
models/train_classifier.py: A machine learning pipeline that loads data from the database, trains a model, and saves the trained model to disk (e.g. models/classifier.gzip) for later use
run.py: Launches the Flask web app that classifies disaster messages
- Udacity for providing an amazing Data Science Nanodegree Program
- Figure Eight for providing the relevant dataset to train the model