This repository contains the exercises and projects from the 42 Python for Data Science Piscine. The curriculum covers Python programming with a focus on data manipulation, analysis, and object-oriented programming.
It starts with the basics and progresses to libraries such as NumPy, Pandas, and Matplotlib.
Basics & Environment Setup
- Introduction to Python syntax and scripts.
- Standard input/output operations.
- Basic data types and control structures.
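As a rough illustration of the topics above, the sketch below reads lines from a text stream, uses a dictionary and loops, and prints a formatted result. The `count_words` function and the sample text are hypothetical and are not taken from the exercises; `io.StringIO` stands in for standard input so the script runs without typing anything.

```python
import io
import sys


def count_words(stream=sys.stdin):
    """Count case-insensitive word occurrences in a text stream."""
    counts = {}
    for line in stream:
        for word in line.split():
            word = word.lower()
            counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical sample input; pass sys.stdin instead to read from the terminal.
sample = io.StringIO("Hello world\nhello Python\n")
counts = count_words(sample)

# Standard output: one "word: count" line per word, in alphabetical order.
for word, n in sorted(counts.items()):
    print(f"{word}: {n}")
```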
Numerical Computing with NumPy
- Understanding arrays and matrix operations.
- Image manipulation (loading, slicing, modifying pixels).
- Broadcasting and vectorization.
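The NumPy topics above can be sketched on a small synthetic image. This is only an assumed example: the array stands in for a loaded picture, and the grayscale weights are the common luma coefficients, not something the exercises are known to prescribe.

```python
import numpy as np

# Synthetic 4x6 RGB "image" with shape (height, width, channels).
# An exercise would normally load this from a file instead.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[..., 0] = 200  # fill the red channel everywhere

# Modifying pixels: paint the central 2x2 block white.
image[1:3, 2:4] = 255

# Slicing: take the top-left 2x3 region (copied so later edits do not affect it).
crop = image[:2, :3].copy()

# Broadcasting: the (3,) weight vector is applied to every pixel's channels.
weights = np.array([0.299, 0.587, 0.114])
gray = (image * weights).sum(axis=-1)

# Vectorization: invert every pixel in one expression, with no Python loop.
inverted = 255 - image

print(crop.shape, gray.shape, inverted.dtype)
```

Because `weights` has shape `(3,)` and `image` has shape `(4, 6, 3)`, NumPy aligns the trailing axis and repeats the weights across all rows and columns.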
Data Analysis with Pandas
- Loading and processing CSV/JSON datasets.
- Data cleaning and filtering.
- Statistical analysis and aggregation.
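A minimal sketch of the loading, cleaning, filtering, and aggregation steps listed above. The CSV content, column names, and values are invented for illustration and do not come from the Piscine datasets; the text is read from memory so the example runs without any file.

```python
import io

import pandas as pd

# Hypothetical dataset; note the missing value on the second Japan row.
csv_text = """country,year,life_expectancy
France,2000,79.0
France,2010,81.5
Japan,2000,81.0
Japan,2010,
Brazil,2000,70.0
Brazil,2010,73.5
"""

# Loading: read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Cleaning: drop rows where the measurement is missing.
clean = df.dropna(subset=["life_expectancy"])

# Filtering: keep rows matching a boolean condition.
recent = clean[clean["year"] >= 2010]

# Aggregation: per-country mean and maximum.
summary = clean.groupby("country")["life_expectancy"].agg(["mean", "max"])
print(summary)
```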
Object-Oriented Programming
- Classes, inheritance, and polymorphism.
- Error handling and exceptions.
- Special methods (dunder methods).
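The sketch below combines the object-oriented topics above in one small hierarchy. All class names, including the `InvalidDimensionError` exception, are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod


class InvalidDimensionError(ValueError):
    """Hypothetical custom exception for non-positive dimensions."""


class Shape(ABC):
    """Base class: every subclass must implement area()."""

    @abstractmethod
    def area(self) -> float:
        ...

    # Dunder methods: customize repr() and the < operator used by sorted().
    def __repr__(self) -> str:
        return f"{type(self).__name__}(area={self.area():.2f})"

    def __lt__(self, other: "Shape") -> bool:
        return self.area() < other.area()


class Rectangle(Shape):
    def __init__(self, width: float, height: float) -> None:
        if width <= 0 or height <= 0:
            raise InvalidDimensionError("dimensions must be positive")
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Square(Rectangle):
    """Inheritance: a square is a rectangle with equal sides."""

    def __init__(self, side: float) -> None:
        super().__init__(side, side)


# Polymorphism: sorted() compares shapes through the shared area() interface.
shapes = sorted([Rectangle(2, 5), Square(2), Rectangle(1, 3)])
print(shapes)

# Error handling: catch the custom exception raised by the constructor.
try:
    Square(-1)
except InvalidDimensionError as exc:
    error_message = str(exc)
    print(f"rejected: {error_message}")
```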
Data-Oriented Design & Visualization
- Advanced plotting with Matplotlib/Seaborn.
- Data visualization best practices.
- Performance optimization.
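As one possible illustration of the plotting practices above (labeled axes, a title, and a legend), the sketch below draws two made-up series with Matplotlib. The data and labels are hypothetical, and the `Agg` backend is selected only so the script runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: two linear trends over the same years.
years = np.arange(2000, 2011)
series_a = 70 + 0.35 * (years - 2000)
series_b = 79 + 0.25 * (years - 2000)

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(years, series_a, label="Series A")
ax.plot(years, series_b, label="Series B")

# Label everything a reader needs to interpret the figure.
ax.set_xlabel("Year")
ax.set_ylabel("Value")
ax.set_title("Two hypothetical series")
ax.legend()
fig.tight_layout()

# Save to an in-memory buffer; pass a filename instead to write a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
print(f"rendered {len(png_bytes)} bytes")
```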
Requirements
- Python 3.10+
- Libraries: numpy, pandas, matplotlib, seaborn
pip install numpy pandas matplotlib seaborn
Each folder corresponds to a specific module and contains numbered exercises (e.g., ex00, ex01). To run a specific exercise:
cd "2 - DataTable/ex00"
python3 load_csv.py