Understanding the roles of sub-graph features for graph classification: An empirical study perspective

Publication Type:
Conference Proceeding
International Conference on Information and Knowledge Management, Proceedings, 2013, pp. 817 - 822
Issue Date:
Filename Description Size
Thumbnail2013001513OK.pdf1.12 MB
Adobe PDF
Full metadata record
Graph classification concerns the learning of discriminative models, from structured training data, to classify previously unseen graph samples into specific categories, where the main challenge is to explore structural information in the training data to build classifiers. One of the most common graph classification approaches is to use sub-graph features to convert graphs into instance-feature representations, so generic learning algorithms can be applied to derive learning models. Finding good sub-graph features is regarded as an important task for this type of learning approaches, despite that there is no comprehensive understanding on (1) how effective subgraph features can be used for graph classification? (2) how many sub-graph features are sufficient for good classification results? (3) does the length of the sub-graph features play major roles for classification? and (4) whether some random sub-graphs can be used for graph representation and classification? Motivated by the above concerns, we carry out empirical studies on four real-world graph classification tasks, by using three types of sub-graph features, including frequent sub-graphs, frequent subgraph selected by using information gain, and random sub-graphs, and by using two types of learning algorithms including Support Vector Machines and Nearest Neighbour. Our experiments show that (1) the discriminative power of sub-graphs varies by their sizes; (2) random sub-graphs have a reasonably good performance; (3) number of sub-graphs is important to ensure good performance; and (4) increasing number of sub-graphs reduces the difference between classifiers built from different sub-graphs. Our studies provide a practical guidance for designing effective sub-graph based graph classification methods. Copyright 2013 ACM.
Please use this identifier to cite or link to this item: