Key Steps:
- Data Wrangling & Cleaning
- Exploratory Data Analysis (EDA) with Python & SQL
- Interactive Maps (Folium) & Dashboards (Plotly Dash)
- Machine Learning (Classification/Regression)
- Presentation & Insights
# Applied Data Science Capstone Project
**Course:** IBM Data Science Professional Certificate (Coursera)
**Author:** Jefferson Firmino Mendes
**GitHub:** www.github.com/jeffthedeveloper
## 🎯 Project Overview
This project demonstrates an end-to-end data science workflow, from **data collection** to **predictive modeling**, as part of the IBM/Coursera Applied Data Science Capstone. It includes:
- **Data wrangling** (cleaning, APIs, web scraping)
- **Exploratory Data Analysis (EDA)** with Python & SQL
- **Interactive visualizations** (Folium maps, Plotly Dash)
- **Machine learning** (classification/regression) for predictions
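
As a rough illustration of the data wrangling step, the sketch below cleans a small in-memory CSV using only the standard library. The sample rows, the column names (`launch_site`, `payload_mass_kg`, `outcome`) and the `clean_rows` helper are assumptions made for this example and are not taken from the project's notebooks, which rely on Pandas.

```python
import csv
import io

# Made-up sample data; the columns are illustrative, not the project's real schema.
RAW_CSV = """launch_site,payload_mass_kg,outcome
Site A,2500,Success
Site B,,Failure
 Site C ,9600,success
Site A,not available,Success
"""


def clean_rows(raw_text):
    """Trim whitespace, normalise labels, and drop rows with an unusable payload mass."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        try:
            mass = float(row["payload_mass_kg"].strip())
        except ValueError:
            continue  # missing or non-numeric payload mass: skip the row
        cleaned.append({
            "launch_site": row["launch_site"].strip(),
            "payload_mass_kg": mass,
            # Case-insensitive comparison so "Success" and "success" agree.
            "success": row["outcome"].strip().lower() == "success",
        })
    return cleaned


rows = clean_rows(RAW_CSV)
for r in rows:
    print(r)
```

Dropping incomplete rows is only one possible policy; imputing missing values may suit a real dataset better.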

Repository structure:

    ├── data/                  # Raw & processed datasets
    ├── notebooks/             # Jupyter notebooks (EDA, ML, etc.)
    │   ├── 1_Data_Collection.ipynb
    │   ├── 2_Data_Wrangling.ipynb
    │   ├── 3_EDA_SQL_Analysis.ipynb
    │   └── 4_Predictive_Modeling.ipynb
    ├── scripts/               # Python helper scripts
    ├── docs/                  # Reports, presentations (PDF)
    ├── app/                   # Plotly Dash/Flask app (if applicable)
    └── README.md

- Python (Pandas, NumPy, Matplotlib, Seaborn)
- SQL (SQLite, PostgreSQL, or IBM Db2)
- Interactive Maps: Folium
- Dashboarding: Plotly Dash
- Machine Learning: Scikit-learn, XGBoost
- Version Control: Git/GitHub
- Exploratory Analysis: [Brief insight, e.g., "70% of SpaceX launches reuse the booster"]
- Predictive Model: [e.g., "Random Forest achieved 85% accuracy in classifying accident severity"]
- Interactive Tools: [e.g., "Folium maps revealed regional trends in accidents"]
- Clone the repo: `git clone https://github.com/jeffthedeveloper/Applied-Data-Science-Capstone-End-to-End-Analysis-with-Python-SQL-and-Machine-Learning.git`
- Install dependencies: `pip install -r requirements.txt`
- Run Jupyter notebooks: `jupyter lab`
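
After installing, a quick check like the one below can confirm that the main libraries are importable before opening the notebooks. The module list is an assumption based on the tools named in this README; the actual `requirements.txt` may list different or additional packages, and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Import names assumed from the tools listed in this README (note that
# scikit-learn is imported as "sklearn"); adjust to match requirements.txt.
REQUIRED_MODULES = [
    "pandas", "numpy", "matplotlib", "seaborn",
    "folium", "dash", "sklearn", "xgboost",
]


def find_missing(modules):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```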
This project is part of educational coursework and is released under the MIT License.
Click here: https://drive.google.com/file/d/1lgArDDKVNuzi1ucUSftZAgvOwFCHdOkG/view?usp=sharing
### **Customization Tips:**
- For **SpaceX projects**, highlight:
  - Falcon 9 landing predictions
  - Folium launch site analysis
- For **Accident Severity projects**, focus on:
  - SQL queries for crash hotspots
  - Classification model (e.g., Logistic Regression)
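
For the crash-hotspot idea, a SQL query could look roughly like the sketch below, run here against an in-memory SQLite database. The `crashes` table, its columns, the sample rows and the `top_hotspots` helper are all hypothetical and stand in for whatever schema a real accident dataset uses.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crashes (id INTEGER PRIMARY KEY, district TEXT, severity INTEGER)"
)
conn.executemany(
    "INSERT INTO crashes (district, severity) VALUES (?, ?)",
    [("North", 2), ("North", 3), ("North", 1),
     ("Centre", 4), ("Centre", 2), ("South", 1)],
)

# Rank districts by number of crashes; ties are broken alphabetically.
HOTSPOT_QUERY = """
SELECT district, COUNT(*) AS crash_count, AVG(severity) AS avg_severity
FROM crashes
GROUP BY district
ORDER BY crash_count DESC, district ASC
LIMIT ?
"""


def top_hotspots(connection, limit=2):
    """Return (district, crash_count, avg_severity) tuples for the busiest districts."""
    return connection.execute(HOTSPOT_QUERY, (limit,)).fetchall()


hotspots = top_hotspots(conn)
for district, crash_count, avg_severity in hotspots:
    print(f"{district}: {crash_count} crashes, mean severity {avg_severity:.1f}")
conn.close()
```

The same query shape should carry over to PostgreSQL or IBM Db2 with minor changes, although the row-limiting clause may differ between engines.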