The kernel trick is a well-known machine learning method for producing a real-valued measure of similarity between data points in a wide range of settings. Kernel methods for network analysis provide a way of assigning real values to pairs of vertices of a graph. These values may reflect similarity across any number of graphical properties, such as the neighbours two vertices share, or, in a more dynamic context, the influence that a change in the state of one vertex might have on another.
By using the kernel trick it is possible to approximate the distribution of features on the vertices of a graph in a way that respects the graphical relationships between vertices. Kernel based methods have long been used, for instance in inferring protein function from other proteins within Protein Interaction Networks (PINs).
The general form of such methods is

$$K : V \times V \to \mathbb{R}_{\geq 0}, \qquad f(v) = \sum_{u \in V} K(v, u)\, y(u)$$

where $V$ are the vertices of the graph, $K$ is the graph kernel, $\mathbb{R}_{\geq 0}$ are the non-negative reals, $y$ is the vertex feature of interest (e.g. protein annotations) and $f$ is the function that approximates the relationship between the vertices and their features.
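As a minimal sketch (toy numbers, not taken from any real dataset), a precomputed kernel matrix `K` can spread a feature `y`, known on some vertices, to the rest of the graph via the kernel-weighted sum `f = K @ y`:

```python
import numpy as np

# Toy 4-vertex graph with a hand-made similarity kernel K.
# A binary annotation y is present on vertices 0 and 1 only;
# the kernel-weighted sum f = K @ y scores the unlabelled
# vertices in proportion to their similarity to labelled ones.
K = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.0, 0.1, 0.7, 1.0],
])
y = np.array([1.0, 1.0, 0.0, 0.0])  # annotation on vertices 0 and 1
f = K @ y
print(f)  # vertices 0 and 1 score high; 2 and 3 score low
```

In practice the weights would be fitted rather than applied raw, but the structure — kernel similarities gating how vertex features propagate — is the same.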
The Graph Laplacian
The most well-known basis for the graph kernel, and the one that has received considerable development in biological network analysis, is the so-called graph Laplacian matrix $L$. The Laplacian characterises each vertex of the graph in terms of its adjacency matrix $A$ and degree diagonal matrix $D$, via $L = D - A$. Informally, $L$ characterises the flow into and out of each vertex $v_i$: the $i$-th row contains firstly the flow into $v_i$ on the diagonal ($L_{ii} = \deg(v_i)$) and secondly the flow to other vertices on the off-diagonals ($L_{ij} = -A_{ij}$).
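As a concrete illustration, the Laplacian of a small path graph can be assembled directly from its adjacency and degree matrices (a minimal numpy sketch):

```python
import numpy as np

# Build L = D - A for the undirected path graph 0-1-2-3.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
D = np.diag(A.sum(axis=1))  # degree diagonal matrix
L = D - A
print(L)
# Degree on the diagonal, -1 for each neighbour; every row sums to
# zero, reflecting the "flow in equals flow out" reading above.
```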
The Diffusion Kernel
Because of its relationship to the Laplacian, a Laplacian-based graph kernel characterises the diffusion or flow of information between vertices, and is related to other measures of vertex influence such as random walk measures (a property inherited from the Laplacian matrix). Diffusion or heat kernels, as these matrices are called, have been used in a variety of contexts, from analysing protein function to comparing fMRI results. The diffusion kernel $K$ can be written as
$$K = e^{-\beta L} = U e^{-\beta \Lambda} U^\top$$

where $U$ is the eigenvector matrix of $L$ and $\Lambda$ is the diagonal matrix of its eigenvalues, with scaling parameter $\beta$ (where increasing $\beta$ means increasing the scale or extent of the diffusion process). Approximate feature maps that use diffusion kernels optimise both for good point approximations of $y$ at a vertex and also for local agreement of $f$ with its most influential (read: nearby and strongly connected) neighbours.
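The eigendecomposition form of the kernel can be sketched directly with numpy (a toy path graph and an arbitrary choice of $\beta$ are assumed here):

```python
import numpy as np

# Diffusion kernel K = U exp(-beta * Lambda) U^T on a path graph.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
L = np.diag(A.sum(axis=1)) - A
beta = 0.5                             # diffusion scale (assumed value)
evals, U = np.linalg.eigh(L)           # L = U diag(evals) U^T
K = U @ np.diag(np.exp(-beta * evals)) @ U.T
# K is symmetric positive definite, and entries decay with graph
# distance: vertex 0 is "closer" to its neighbour 1 than to vertex 3.
```

Larger values of `beta` flatten `K` out, letting influence diffuse further across the graph, which matches the reading of $\beta$ as the extent of the diffusion process.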
Relationship with the Heat Equation and Information Diffusion
The diffusion kernel on graphs is actually part of a broader class of kernels which all relate to the heat equation, a PDE describing the physical propagation of heat through a medium such as a wire. This diffusion is usually thought to occur over a period of time as well as space (hence the parameter $\beta$, which plays the role of time in the equation above).
This provides an appealing analogy in which the graphical diffusion kernel encodes the propensity for the propagation of information (heat) between vertices in the graph. This is a desirable property, for instance, in protein annotation inference, where proteins between which information propagates most easily are also thought to influence each other biologically. The implication is that increased kernel similarity may indicate an increased probability of shared protein function and thus, for instance, shared functional annotations and biochemical properties.
Deeper Analogies with the Heat Equation: Heat Spectral Wavelets
The solution of the physical heat equation can be decomposed into a number of frequency components, or spectra, in the Fourier domain. Each of these frequencies corresponds to a spatial resolution of heat propagation. These too have a graphical analogue in the form of the heat spectral wavelet. These wavelets function just like the Fourier series they are based on, providing a decomposition of the diffusion kernel in terms of wavelets $\psi_{\beta, a}$. These wavelets are the projection of $K$ onto the vertex $a$. Thus

$$\psi_{\beta, a} = U e^{-\beta \Lambda} U^\top \delta_a = K \delta_a$$

where $\delta_a$ selects out the $a$-th column of $K$. By scaling $\beta$ it is then possible to explore the influence on a vertex from each other vertex, and across multiple scales.
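The column-selection view of the wavelet can be sketched in a few lines of numpy (a small path graph and an arbitrary $\beta$ are assumed):

```python
import numpy as np

# The wavelet centred at vertex a is the projection of the diffusion
# kernel onto that vertex: psi = K @ delta_a, i.e. column a of K.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
L = np.diag(A.sum(axis=1)) - A
evals, U = np.linalg.eigh(L)
beta = 0.5
K = U @ np.diag(np.exp(-beta * evals)) @ U.T

a = 0
delta = np.zeros(4)
delta[a] = 1.0          # indicator vector for vertex a
psi = K @ delta         # identical to K[:, a]
# psi peaks at vertex a itself and decays over its neighbourhood.
```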
Applications: Topological and Time Series Analysis
In the same way that $f$ maps $V$ into $\mathbb{R}_{\geq 0}$, heat spectral wavelets are vectors that map each vertex $a$ into a $|V|$-dimensional space, providing multiscale information on the position of vertex $a$ in the graph. This has been used to construct higher-order spatial embeddings of graphs, which has interesting applications, e.g. in biological network comparison using Topological Data Analysis (TDA) and other geometric analysis methods. More recently, diffusion kernels have been generalised to the time domain, where each vertex may have a time series associated with it (e.g. a series of expression level measurements for each protein in the PIN).
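One illustrative construction (an assumed sketch, not a method specified in the text) concatenates each vertex's wavelet coefficients at several scales, giving every vertex a multiscale embedding vector that downstream geometric or topological methods can consume:

```python
import numpy as np

# Multiscale vertex embedding: stack wavelet coefficients at several
# diffusion scales beta (scale values chosen arbitrarily here).
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
L = np.diag(A.sum(axis=1)) - A
evals, U = np.linalg.eigh(L)

scales = [0.1, 1.0, 10.0]
embedding = np.hstack([
    U @ np.diag(np.exp(-b * evals)) @ U.T  # one |V| x |V| kernel per scale
    for b in scales
])
# Row v of `embedding` describes vertex v's position in the graph at
# fine, medium, and coarse diffusion scales.
print(embedding.shape)  # (4, 12)
```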
Further Reading
Kernel Methods and ‘The Kernel Trick’
Kernel methods in machine learning – kernel machines (2008)
Diffusion kernels on graphs and other discrete structures (2002)
Kernels for Inference of Protein Function & fMRI Analysis
Classifying HCP task-fMRI networks using heat kernels (2016)
Gene function prediction from functional association networks using kernel partial least squares regression (2015)
Diffusion kernel-based logistic regression models for protein function prediction (2006)
Heat Spectral Wavelets and Time Series Analysis
Spectral graph wavelets for structural role similarity in networks (2018)
Tracking network dynamics: A survey using graph distances (2018)
Kernel based reconstruction of space-time functions on dynamic graphs (2017)