Machine-Learning-and-Real-world-Data

This repository contains my implementations of the machine learning and data analysis algorithms from Machine Learning and Real-world Data, a course taught at the University of Cambridge.

Aims

This course introduces machine learning algorithms as used in real-world applications, and the experimental methodology necessary to perform statistical analysis of large-scale data from unpredictable processes. It consists of 3 extended practicals, as follows:

Statistical classification: Determining movie review sentiment using Naive Bayes (7 sessions);
Sequence analysis: Hidden Markov Modelling and its application to a task from biology, namely predicting protein interactions with a cell membrane (4 sessions);
Analysis of social networks, including detection of cliques and central nodes (5 sessions).
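
As an illustration of the first practical, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not the code from this repository: the function names `train_naive_bayes` and `classify`, the pre-tokenised input format, and the labels are assumptions made for the example.

```python
import math
from collections import Counter


def train_naive_bayes(docs):
    """Estimate log-priors and smoothed log-likelihoods.

    docs: list of (tokens, label) pairs, where tokens is a list of words.
    Returns (log_priors, log_likelihoods, vocab).
    """
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    for tokens, label in docs:
        word_counts[label].update(tokens)

    vocab = {w for counts in word_counts.values() for w in counts}
    log_priors = {c: math.log(n / len(docs)) for c, n in label_counts.items()}

    log_likelihoods = {}
    for c, counts in word_counts.items():
        # Add-one smoothing: every vocabulary word gets one extra count.
        total = sum(counts.values()) + len(vocab)
        log_likelihoods[c] = {w: math.log((counts[w] + 1) / total) for w in vocab}
    return log_priors, log_likelihoods, vocab


def classify(tokens, log_priors, log_likelihoods, vocab):
    """Return the label with the highest posterior log-score.

    Words that never occurred in training are ignored (an assumption).
    """
    scores = {
        c: log_priors[c] + sum(log_likelihoods[c][w] for w in tokens if w in vocab)
        for c in log_priors
    }
    return max(scores, key=scores.get)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied over a long review.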

Syllabus

Topic One: Statistical Classification [7 sessions]. Introduction to sentiment classification. Naive Bayes parameter estimation. Statistical laws of language. Statistical tests for classification tasks. Cross-validation and test sets. Uncertainty and human agreement.
Topic Two: Sequence Analysis [4 sessions]. Hidden Markov Models (HMM) and HMM training. The Viterbi algorithm. Using an HMM in a biological application.
Topic Three: Social Networks [5 sessions]. Properties of networks: degree, diameter. Betweenness centrality. Clustering using betweenness centrality.
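
To make Topic Two more concrete, the following is a minimal sketch of the Viterbi algorithm for a first-order HMM, working in log space. It is an illustration rather than the repository's implementation: the function signature, the dictionary-based parameter layout, and the omission of explicit start and end states are all assumptions. The observation sequence is assumed to be non-empty.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most probable hidden state sequence for `observations`.

    log_start[s]     : log P(first state = s)
    log_trans[p][s]  : log P(next state = s | current state = p)
    log_emit[s][o]   : log P(observation = o | state = s)
    """
    # best[t][s] is the log-probability of the best path ending in s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s] is the predecessor of s on that best path

    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

The table `best` has one entry per time step and state, so the running time is proportional to the sequence length times the square of the number of states.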

Objectives

By the end of the course you should be able to:

understand and program two simple supervised machine learning algorithms;
use these algorithms in statistically valid experiments, including the design of baselines, evaluation metrics, statistical testing of results, and provision against overtraining;
visualise the connectivity and centrality in large networks;
use clustering (i.e., a type of unsupervised machine learning) for detection of cliques in unstructured networks.
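
The network objectives rest on betweenness centrality. As a sketch only, the code below computes node betweenness for an undirected, unweighted graph in the style of Brandes' algorithm. The adjacency-dictionary input format and the function name `betweenness_centrality` are assumptions for this example, not the interface used in the practicals; every node is assumed to appear as a key of the dictionary.

```python
from collections import deque


def betweenness_centrality(graph):
    """graph: dict mapping each node to an iterable of its neighbours."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}       # number of shortest paths
        dist = {v: -1 for v in graph}       # -1 marks "not yet visited"
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Accumulate dependencies, farthest nodes first.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                centrality[w] += delta[w]

    # Each unordered pair of endpoints was counted once from each end.
    return {v: c / 2 for v, c in centrality.items()}
```

An edge-based variant of the same accumulation step gives the edge betweenness scores that betweenness-based clustering removes one edge at a time.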
