Spring 2018 / DM843/DM856
Unsupervised Learning

General Information

One trend can be observed across almost all fields of informatics: we have to cope with an ever-increasing amount of available data of all kinds. The sheer volume makes it impossible to inspect a dataset "by hand", let alone deduce knowledge from it, without sophisticated computer-aided help. In this course we discuss one of the most common mechanisms of unsupervised machine learning for investigating datasets: clustering. Clustering separates a given dataset into groups of similar objects, the clusters, and thus allows for a better understanding of the data and its structure. We discuss a number of clustering methods and their application to various fields such as biology, economics, and sociology.
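To make "groups of similar objects" concrete, here is a minimal sketch of one well-known clustering method, k-means (Lloyd's algorithm). It is an illustration only: this page does not state which methods the course covers, and the function name `kmeans`, the toy data, and the choice of the first k points as starting centroids are all assumptions made for the example.

```python
import math

def kmeans(points, k, iterations=20):
    """Cluster equal-length numeric tuples with Lloyd's algorithm (sketch)."""
    # Assumption: the first k points serve as initial centroids. This keeps
    # the example deterministic; it is not a recommended initialisation.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: centroids no longer move.
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: two visibly separated groups in the plane.
data = [(1.0, 1.1), (0.9, 1.0), (1.2, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centroids = kmeans(data, k=2)
print(labels)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster receives which label number depends on the initialisation.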

Lectures

# Date Content Slides Comments
1 Mon, 05.02.2018 Introduction here
2 Fri, 09.02.2018 Mathematical Foundations here
3 Mon, 12.02.2018 Detecting Clusters Graphically here
4 Fri, 16.02.2018 PCA here
5 Mon, 19.02.2018 Proximity Measures here
6 Fri, 23.02.2018
7 Mon, 26.02.2018
8 Fri, 02.03.2018
9 Mon, 05.03.2018
10 Fri, 09.03.2018
11 Mon, 12.03.2018
12 Fri, 16.03.2018
13 Mon, 19.03.2018
14 Fri, 23.03.2018

Exercises

# Date Questions Download Solutions
1 Thu, 15.02.2018 Introduction to R here
2 Thu, 22.02.2018 PCA & PCoA here (datasets: UK Dataset, Denmark, Germany)
3 Thu, 01.03.2018
4 Thu, 08.03.2018
5 Thu, 15.03.2018
6 Thu, 22.03.2018

Materials

  • All lecture slides are relevant for the exam.
  • All readings noted in the lecture list are relevant for the exam.
  • Brian S. Everitt, Sabine Landau, Morven Leese, Daniel Stahl, Cluster Analysis, 5th Edition, ISBN: 9780470749913
  • A good introduction to R: here
  • R Cheat-Sheets: here