This course aims to familiarize you with various data science pipelines, using examples with different data types. It is suitable for students who already have some experience in processing data and who will work (or are currently working) with large amounts of data, with a focus on obtaining insights from data through prediction or explanation techniques. The course is not intended to cover all topics in data science exhaustively. Instead, it introduces ways of working with structured data (e.g., sensor measurements) and unstructured data (e.g., text and images).

Keep in mind that this course does not aim to teach the details of programming, machine learning, statistics, or visualization. Instead, it teaches you how to integrate various techniques (e.g., data wrangling, statistical analysis, data modeling, data visualization) to perform a data science task. Also note that this course assumes the datasets have already been collected for you; it does not teach you how to collect data in the real world. Data collection is a topic that could take a very long time to explain and is mostly out of the scope of this course.

By the end of the course, we expect you to be able to:

Explain and execute the entire data science pipeline (including data pre-processing, wrangling, analysis, modeling, evaluation, and visualization).
Perform data science tasks with images (e.g., object recognition), text (e.g., topic modeling), and structured data (e.g., those from sensor networks) using the Python programming language.
Critically reflect on model performance using various metrics and obtain meaningful insights from data analysis.
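
To make the pipeline stages named above concrete, here is a minimal sketch in plain Python. It is only an illustration, not course material: the sensor readings are synthetic and made up, and the function names (`preprocess`, `split`, `fit`, `evaluate`) are hypothetical. A real assignment would more likely use packages such as pandas or scikit-learn.

```python
import statistics

# Synthetic (made-up) sensor readings as (temperature, humidity) pairs.
# None marks a missing measurement.
raw = [(10.0, 80.5), (12.0, 75.5), (None, 70.0), (16.0, 68.5),
       (18.0, 63.5), (20.0, None), (22.0, 56.0), (24.0, 52.5)]


def preprocess(rows):
    """Pre-processing: drop rows that contain a missing value."""
    return [row for row in rows if None not in row]


def split(rows, n_held_out):
    """Hold out the last n_held_out rows for evaluation."""
    return rows[:-n_held_out], rows[-n_held_out:]


def fit(rows):
    """Modeling: ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def evaluate(model, rows):
    """Evaluation: mean absolute error on the given rows."""
    slope, intercept = model
    return statistics.mean(abs(y - (slope * x + intercept)) for x, y in rows)


clean = preprocess(raw)
train, held_out = split(clean, n_held_out=2)
model = fit(train)
mae = evaluate(model, held_out)
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, MAE={mae:.2f}")
```

Evaluating on rows that were not used for fitting is what makes the error estimate meaningful; here the held-out rows are simply the last two, which is an arbitrary choice for the sketch.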

Recommended Prior Knowledge

This course expects you to have the following prior knowledge:

Intermediate level of Python programming (e.g., knowing different data types and data structures, knowing how to set up the Jupyter Notebook programming environment)
Basic level of machine learning (e.g., knowing what supervised and unsupervised learning means, understanding the differences between classification and regression)
Basic level of information visualization (e.g., knowing how to draw plots using Python packages, understanding the differences between a bar chart and a histogram)
Basic level of research methods (e.g., knowing what “research questions” mean, understanding basic hypothesis testing methods like the t-test)
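
As a quick self-check on the visualization item above, the sketch below shows the counting that sits behind each chart type: a bar chart counts occurrences of discrete categories, while a histogram counts numeric values that fall into contiguous, equal-width bins. The function names and the example values are hypothetical and only meant as an illustration; plotting itself is left out.

```python
from collections import Counter


def bar_chart_counts(labels):
    """Bar chart: one count per distinct category; bars have no inherent order."""
    return dict(Counter(labels))


def histogram_counts(values, bin_width, start):
    """Histogram: count numeric values per bin, keyed by each bin's left edge."""
    counts = Counter()
    for value in values:
        index = int((value - start) // bin_width)
        counts[start + index * bin_width] += 1
    return dict(sorted(counts.items()))


# Made-up example data.
data_types = ["text", "image", "text", "sensor", "text"]
measurements = [1.2, 1.8, 2.5, 3.1, 3.9, 4.0]

bars = bar_chart_counts(data_types)
bins = histogram_counts(measurements, bin_width=1.0, start=1.0)
print(bars)
print(bins)
```

Changing `bin_width` changes the shape of the histogram, whereas the bar chart counts depend only on the categories themselves.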

Teaching Method and Contact Hours

Lecture
Seminar
Self-study

Study Materials

Software:

Jupyter Notebook with Python

Assessment

There are two partial exams and weekly assignments.

Remarks

All course information will be posted on Canvas.

Lectures, teaching materials, and assessment materials will all be in English. Work sessions will be given in either Dutch or English, depending on the TA’s choice.

kingilsildor/IK-Data_Science
