In this EN.553.738 course project, we study graph-based methods for semi-supervised learning (SSL), presenting their theoretical motivations and experimental results on benchmark datasets.
Suppose we have a large number of samples, only a few of which are labeled.
The main idea here is that we can exploit all of the samples we have, labeled and unlabeled, to understand the underlying geometry of the data, and leverage this information to learn an appropriate classifier.
While there are a number of different ways of tackling the semi-supervised learning problem, such as generative, consistency-regularization, and pseudo-labeling methods, in this work we restrict our focus to graph-based semi-supervised learning.
We can represent each sample by a vertex in a graph whose edge weights measure the similarity between samples. Such a graph can capture the underlying structure of the data, and this information can be exploited in the semi-supervised learning problem. After constructing a similarity graph from both labeled and unlabeled samples, the label information can be propagated to the unlabeled samples through the graph. Note that the large number of unlabeled samples allows us to construct a rich graph that encodes the geometric structure of the data. This forms the basis for graph-based semi-supervised learning techniques. In this work, we consider four such methods and discuss in detail their theoretical motivations, algorithmic implementations, experimental results on several benchmarks, and limitations:
- Laplacian eigenmaps-based SSL [1]
- Eigenfunction-based Regularization on Graph with Function-Adapted Kernels [2]
- Diffusion-based Regularization on Graph with Function-Adapted Kernels [2]
- Poisson Learning [3]
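
The general recipe described above (build a similarity graph from all samples, then propagate the known labels through it) can be illustrated with a minimal sketch. This is not an implementation of any of the four methods listed; it is a plain harmonic-function label propagation on a k-nearest-neighbor graph, chosen only because it is short. The function names `knn_gaussian_graph` and `propagate_labels`, the bandwidth heuristic, the small ridge term, and the toy two-cluster data are all assumptions made for illustration.

```python
import numpy as np


def knn_gaussian_graph(X, k=10, sigma=None):
    """Symmetric k-nearest-neighbor graph with Gaussian edge weights (illustrative)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(D2, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(D2, axis=1)[:, :k]  # indices of the k nearest neighbors
    if sigma is None:
        # Assumed heuristic: median distance to the k-th nearest neighbor.
        sigma = np.sqrt(np.median(D2[np.arange(n), nbrs[:, -1]]))
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    W[rows, cols] = np.exp(-D2[rows, cols] / (2.0 * sigma ** 2))
    return np.maximum(W, W.T)  # symmetrize


def propagate_labels(W, labeled_idx, labels, n_classes):
    """Harmonic label propagation: solve L_uu F_u = -L_ul Y_l for the unlabeled nodes."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    Y = np.eye(n_classes)[labels]  # one-hot labels of the labeled nodes
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]
    # Tiny ridge (an assumption) keeps the system solvable if some connected
    # component of the graph contains no labeled node.
    ridge = 1e-8 * np.eye(len(unlabeled_idx))
    F_u = np.linalg.solve(L_uu + ridge, -L_ul @ Y)
    pred = np.empty(n, dtype=int)
    pred[labeled_idx] = labels
    pred[unlabeled_idx] = F_u.argmax(axis=1)
    return pred


# Toy data: two well-separated Gaussian clusters, two labeled samples per class.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[-3.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[3.0, 0.0], scale=0.5, size=(100, 2)),
])
y_true = np.repeat([0, 1], 100)
labeled_idx = np.array([0, 1, 100, 101])

W = knn_gaussian_graph(X, k=10)
pred = propagate_labels(W, labeled_idx, y_true[labeled_idx], n_classes=2)
accuracy = np.mean(pred == y_true)
print(f"accuracy with {len(labeled_idx)} labels: {accuracy:.3f}")
```

The dense matrices here are only practical for a few thousand samples; for larger datasets one would typically switch to sparse graph representations and iterative solvers.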
Group Members: Ashwin De Silva, Jeremy Welland, Sai Koukuntla, Zhenghan Fang
Course Instructor: Prof. Mauro Maggioni
[project report] [slides]
[1] Mikhail Belkin and Partha Niyogi. Using manifold structure for partially labeled classification. Advances in neural information processing systems, 15, 2002.
[2] Arthur D Szlam, Mauro Maggioni, and Ronald R Coifman. Regularization on graphs with function-adapted diffusion processes. Journal of Machine Learning Research, 9(8), 2008.
[3] Jeff Calder, Brendan Cook, Matthew Thorpe, and Dejan Slepcev. Poisson learning: Graph based semi-supervised learning at very low label rates, 2020.