Welcome to the repository containing the Jupyter notebooks from the Creativa Data Science Bootcamp! This bootcamp is designed to guide you through essential data science concepts and hands-on practices in Python.
This repository includes notebooks from sessions 2 to 5 of the bootcamp, which cover the following topics:
- Session 1: Getting Started with Python for Data Science
  Introduces Python and explores the different types of data and the fields that work with data.
- Session 2: Pandas and Data Cleaning
  Learn how to handle and clean data effectively using Pandas, a powerful data manipulation library.
- Session 3: Data Preprocessing, EDA, and Feature Engineering
  Explore the key steps in preparing and understanding your data, including Exploratory Data Analysis (EDA) and feature engineering.
- Session 4: Time Series Analysis and Forecasting
  Dive into time series data and forecasting techniques using XGBoost, along with building a simple machine learning model from scratch.
- Session 5: Building a Classification Model
  Implement and evaluate a classification model using XGBoost, with steps for data normalization, feature importance analysis, and model tuning.
Each notebook is well-documented with explanations, visualizations, and code comments to make it easy to follow along. You can clone or download the repository and run the notebooks locally using Jupyter or any compatible platform.
To set up your environment, you can use the following command to install the required dependencies:

    pip install -r requirements.txt

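After installing, a quick check that the main libraries can be imported may save some debugging inside the notebooks. The snippet below is a minimal sketch, not part of the bootcamp material: the package names in `ASSUMED_PACKAGES` are guesses based on the session topics (the authoritative list is whatever `requirements.txt` contains), and `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumption: these import names are inferred from the session topics above.
# Adjust the list to match the actual contents of requirements.txt.
ASSUMED_PACKAGES = ["pandas", "xgboost", "notebook"]


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed packages are importable.")
```

If anything is reported as missing, re-running the install command inside the same environment that Jupyter uses is usually the first thing to try.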
This project is licensed under the MIT License - see the LICENSE file for details.