GitHub - Irene-Busah/Big-Data-Science: Course repository for 95-885 Data Science & Big Data, Fall 2025. Contains Python implementations, covering multiple classes in Data Science, Big Data, Machine Learning, and related topics. Includes notebooks, code, and practice exercises across probability, optimization, algorithms, and applied computing.

📊 Data Science & Big Data

95-885, Fall 2025 – Carnegie Mellon University

This repository contains coursework and hands-on implementations for 95-885 Data Science & Big Data. It includes Python notebooks, code files, assignments, and practice exercises covering:

- Probability & statistical modeling
- Algorithms & optimization
- Applied machine learning
- Big Data processing & distributed systems
- Applied computing & data engineering

🎯 Key Learning Outcomes

Designing end-to-end data science and machine learning solutions, from data ingestion and preprocessing to modeling, evaluation, and deployment. Projects reflect real-world use cases and apply to both industry problems and academic research.
Hands-on practice with Big Data tools, including:
- Apache Spark for distributed data processing
- Hadoop ecosystem tools
- Cloud data handling
Building production-ready pipelines with tools such as pandas, scikit-learn, PySpark, and Hadoop Streaming, and integrating those pipelines with machine learning models.
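
To give a feel for the Hadoop Streaming style of processing mentioned above, here is a minimal word-count sketch that simulates the map, shuffle/sort, and reduce stages in a single Python process. It is an illustration only, not code from this repository: the names `mapper`, `reducer`, and `run_local` are hypothetical. In an actual Hadoop Streaming job the mapper and reducer would typically be separate scripts that read from stdin and write tab-separated key/value lines to stdout.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit (word, 1) pairs, as a streaming mapper would emit 'word<TAB>1' lines."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Sum counts per word. Input must already be sorted by key,
    which is what the shuffle/sort stage provides between map and reduce."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce locally (hypothetical helper)."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    sample = ["big data big ideas", "data science"]
    print(run_local(sample))
```

Keeping the mapper and reducer as plain generator functions makes them easy to test locally before wiring them into a distributed job.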

🧠 Contents

Class Labs/: Jupyter notebooks used in class labs and practical projects

Assignments/: Clean, tested Python scripts and reports

Documents/: Sample research papers

Projects/: Capstone or course mini-projects on real datasets

🚀 Skills Gained

By the end of this course, learners will be able to:

- Handle massive datasets and perform distributed computation
- Apply statistical methods and ML models to large-scale problems
- Understand performance bottlenecks in data pipelines
- Translate academic theory into practical, scalable solutions

Folders and files (17 commits):
- Assignments/
- Class Labs/
- Documents/
- Projects/
- .DS_Store
- .gitignore
- README.md
