Course description
Data volumes are growing rapidly with the use of social media, the Internet of Things, and new data acquisition methods. Large-scale data analysis is computationally demanding, which increases the time our analyses take. As scientists, we need tools to transform, process, and analyse our data to understand our physical and natural world. In scientific computing, fast interaction during preprocessing, data exploration, and analysis is crucial. However, the volume and complexity of the data make these tasks highly time-consuming and sometimes intractable.
In this course, we start with an introduction to the Python language and discuss its features. We will practice with different Python functions and compare them with equivalents from other libraries to speed up our analysis, and we will give a brief introduction to Matplotlib, a widely used visualization module. Each seminar session will include hands-on practical exercises to apply each method and concept. We will then move on to parallel computing in Python, with a focus on multi-threaded tasks via the Numba compiler. By the end of these seminars, you will be able to execute multi-core tasks in Python and perform multi-core analyses (CPU/GPU).
Learning Objectives
By the end of this course, you will have learned about the following:
- Designing scripts in Python for data analysis.
- Traditional scientific programming pipelines to transform and visualize data.
- Widely used Python packages for scientific computing (e.g., NumPy, scikit-learn, and Matplotlib).
- Main challenges and technical aspects of sequential vs. parallel computing.
- Coding high-performance functions written directly in Python using the Numba compiler (CPU and GPU).
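To give a flavour of the last objective, here is a minimal sketch of a multi-core function written with Numba. It is only an illustration under stated assumptions, not taken from the course material: the function name `estimate_pi` and the midpoint-rule example are hypothetical, and the fallback branch simply runs the same loop sequentially in plain Python when Numba is not installed. GPU execution is not shown.

```python
import math

try:
    # Numba compiles the decorated function to machine code.
    from numba import njit, prange
except ImportError:
    # Assumption for illustration: without Numba, fall back to plain,
    # sequential Python so the example still runs.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]          # used as @njit
        return lambda func: func    # used as @njit(...)

    prange = range


@njit(parallel=True)
def estimate_pi(n_steps):
    """Approximate pi by integrating 4 / (1 + x^2) over [0, 1]
    with the midpoint rule (hypothetical example function)."""
    step = 1.0 / n_steps
    total = 0.0
    # With Numba, prange splits the iterations across CPU cores and
    # combines the per-thread partial sums of `total`.
    for i in prange(n_steps):
        x = (i + 0.5) * step
        total += 4.0 / (1.0 + x * x)
    return total * step


print(estimate_pi(1_000_000), math.pi)
```

With Numba installed, the first call includes compilation time, so any timing comparison against the plain Python version should use a second call.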
Who should attend
This course is aimed at graduate students and postdocs interested in the Python programming language, its introductory concepts, and its applications in high-performance computing (HPC). It requires a basic understanding of programming (e.g., data structures, control statements) or experience in other programming languages (e.g., MATLAB, R).
Prerequisites
A basic understanding of programming or experience with another programming language (e.g., MATLAB, R) is required. An introduction to Python will be provided for self-study before the seminar starts.
This course was created for the Goethe Research Academy for Early Career Researchers (GRADE), Goethe University Frankfurt, Germany. June 2022.