Abstract: Semi-supervised learning is the problem of finding missing labels; more precisely, one has a data set of feature vectors of which an (often small) subset is labelled. The semi-supervised learning assumption is that similar feature vectors should have similar labels, which means one needs a geometry on the set of feature vectors. A typical way to represent this geometry is via a graph whose nodes are the feature vectors and whose edges are weighted by some measure of similarity. Laplace learning is a popular graph-based method for solving the semi-supervised learning problem that essentially requires one to minimise a Dirichlet energy defined on the graph (hence the Euler-Lagrange equation is Laplace's equation). However, at low labelling rates Laplace learning typically performs poorly. This is due to the lack of regularity, or ill-posedness, of solutions to Laplace's equation in any dimension greater than or equal to two. The random walk interpretation of Laplace learning allows one to characterise how close one is to entering this ill-posed regime. In particular, it yields a lower bound on the number of labels required and even provides a route to correcting the bias. Correcting the bias leads to a new method, called Poisson learning. Finally, the ideas behind correcting the bias in Laplace learning have motivated a new graph neural network architecture which does not suffer from the over-smoothing phenomenon. In particular, this type of neural network, which we call GRAND++ (GRAph Neural Diffusion with a source term), enables one to use deep architectures. This is joint work with Jeff Calder, Dejan Slepčev, Brendan Cook, Tan Nguyen, Hedi Xia, Thomas Strohmer, Andrea Bertozzi, Stanley Osher and Bao Wang.
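
To make the two methods concrete, here is a minimal sketch in standard graph-based learning notation; the symbols $w_{ij}$, $u$, $x_i$, $y_j$ and $m$ are not fixed in the abstract and are assumptions following the usual conventions of this literature. Given edge weights $w_{ij}$ and labels $y_1, \dots, y_m$ at the nodes $x_1, \dots, x_m$, Laplace learning minimises the graph Dirichlet energy

$$
E(u) = \frac{1}{2} \sum_{i,j} w_{ij} \bigl( u(x_i) - u(x_j) \bigr)^2
\quad \text{subject to} \quad u(x_j) = y_j \ \text{for } j = 1, \dots, m,
$$

whose Euler-Lagrange equation is the graph Laplace equation $\mathcal{L}u(x_i) := \sum_j w_{ij} \bigl( u(x_i) - u(x_j) \bigr) = 0$ at the unlabelled nodes. Poisson learning replaces the hard label constraints with point sources centred at the labelled nodes, solving

$$
\mathcal{L}u(x_i) = \sum_{j=1}^{m} (y_j - \bar{y}) \, \delta_{ij},
\qquad \bar{y} = \frac{1}{m} \sum_{j=1}^{m} y_j,
$$

which is the bias correction referred to above.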