SMGI: Sparse Multiple Graph Integration


Description

Graph-based approaches have been most successful in semisupervised learning. In this paper, we focus on label propagation in graph-based semisupervised learning. One essential point of label propagation is that the performance is heavily affected by incorporating underlying manifold of given data into the input graph. The other more important point is that in many recent real-world applications, the same instances are represented by multiple heterogeneous data sources. A key challenge under this setting is to integrate different data representations automatically to achieve better predictive performance. In this paper, we address the issue of obtaining the optimal linear combination of multiple different graphs under the label propagation setting. For this problem, we propose a new formulation with the sparsity (in coefficients of graph combination) property which cannot be rightly achieved by any other existing methods. This unique feature provides two important advantages: 1) the improvement of prediction performance by eliminating irrelevant or noisy graphs and 2) the interpretability of results, i.e., easily identifying informative graphs on classification. We propose efficient optimization algorithms for the proposed approach, by which clear interpretations of the mechanism for sparsity is provided. Through various synthetic and two real-world data sets, we empirically demonstrate the advantages of our proposed approach not only in prediction performance but also in graph selection ability.


SMGI automatically selects effective graphs for label node classification.

Implementation

Download link of SMGI is here

References


Masayuki Karasuyama (karasuyama [at] nitech.ac.jp)

Department of Computer Science, Nagoya Institute of Technology, Japan