This course is an intensive introduction to the most widely used machine learning methods.
- The first goal is to provide a basic intuitive understanding of these techniques: what they are good for, how they work, how they relate to one another, and their strengths and weaknesses.
- The second goal is to provide a hands-on feel for these methods through experiments with suitable data sets, using Jupyter notebooks.
- The third goal is to understand machine learning methods at a deeper level by delving into their mathematical underpinnings. This is crucial to being able to adapt and modify existing methods, and to combine them creatively.
Topics covered:
- Taxonomy of prediction problems
- Basics of linear algebra and probability
- Nearest neighbor methods and families of distance functions
- Generalization: what it means; overfitting; selecting parameters using cross-validation
- Generative modeling for classification, especially using the multivariate Gaussian
- Linear regression and its variants
- Logistic regression
- Optimization: deriving stochastic gradient descent algorithms and testing convexity
- Linear classification using the support vector machine
- Nonlinear modeling using basis expansion and kernel methods
- Decision trees, boosting, and random forests
- Methods for flat and hierarchical clustering
- Principal component analysis
- Autoencoders, distributed representations, and deep learning
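As a taste of the hands-on component, the first method on the list can be sketched in a few lines of NumPy: a minimal 1-nearest-neighbor classifier that labels each test point with the label of its closest training point under Euclidean distance. The toy data below is illustrative, not from the course materials.

```python
import numpy as np

def nearest_neighbor_predict(X_train, y_train, X_test):
    """Predict each test point's label as that of its single closest
    training point (Euclidean distance) -- the simplest member of the
    nearest-neighbor family."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train),
    # computed via broadcasting.
    dists = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Index of the nearest training point for each test point.
    return y_train[dists.argmin(axis=1)]

# Toy data: two well-separated 2-D clusters (illustrative only).
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.1, -0.1], [4.8, 5.2]])

print(nearest_neighbor_predict(X_train, y_train, X_test))  # -> [0 1]
```

Swapping the squared-Euclidean expression for another distance function, or taking a majority vote over the k smallest distances instead of the single minimum, turns this into the more general k-NN method discussed in the course.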