An introduction to clustering and the expectation maximisation algorithm Part 2 – Richard Turner Microsoft Research Ltd

April 11, 2019 @ 9:30 am – 11:00 am
Microsoft Research Ltd, 21 Station Road, Cambridge
Microsoft Research Cambridge Talks Admins

Clustering methods assign ‘similar’ data points to the same cluster and ‘dissimilar’ data points to different clusters. They are applied in a diverse range of areas, including data-driven understanding of disease sub-types, identification of communities in social networks, and email spam filtering. Clustering is therefore one of the central tasks in unsupervised machine learning.

In the second lecture I will describe how learning in this model is handled through the Expectation Maximisation (EM) algorithm, which can be applied to many latent variable models. We will cover this from the general variational viewpoint, connecting it to the wider class of variational inference methods. Finally, we will look at the behaviour of the EM algorithm for the mixture of Gaussians and identify strengths and weaknesses of the approach.
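
For readers who have not seen EM for a mixture of Gaussians before, the following is a minimal sketch in Python. It is not taken from the lecture. It assumes one-dimensional data, a fixed number of components, and hand-picked starting values; the function names are hypothetical and chosen only for this illustration. The E-step computes each component's responsibility for each data point, and the M-step re-estimates the means, variances and mixing weights from those responsibilities.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_mixture_of_gaussians(data, means, variances, weights, n_iters=50):
    """Run EM on a 1-D mixture of Gaussians (illustrative sketch only).

    Returns the fitted means, variances, mixing weights, and the
    log-likelihood evaluated at the start of each iteration.
    """
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    log_likelihoods = []
    for _ in range(n_iters):
        # E-step: responsibility of component j for point x under the
        # current parameters, i.e. p(component j | x).
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        log_likelihoods.append(log_lik)

        # M-step: responsibility-weighted maximum-likelihood updates.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2
                               for r, x in zip(resp, data)) / nj
            # Assumption: a small floor guards against a collapsing variance.
            variances[j] = max(variances[j], 1e-6)
            weights[j] = nj / len(data)
    return means, variances, weights, log_likelihoods


if __name__ == "__main__":
    # Synthetic data from two well-separated Gaussians (illustrative only).
    random.seed(0)
    data = ([random.gauss(-2.0, 0.5) for _ in range(200)]
            + [random.gauss(3.0, 1.0) for _ in range(200)])
    fitted = em_mixture_of_gaussians(data, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5])
    print(fitted[0], fitted[1], fitted[2])
```

The recorded log-likelihood should never decrease from one iteration to the next, which is a convenient check on an implementation. The variance floor in the sketch hints at one known weakness: without it, a component can collapse onto a single data point and the likelihood becomes unbounded.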