Heterogeneous network analysis on academic collaboration networks
Heterogeneous networks are complex network models that can contain multiple types of objects and relationships. Research on heterogeneous networks has attracted increasing interest because they model real-world situations better than traditional homogeneous networks, which can have only one type of object and one type of relationship. For example, the Facebook network has vertices that include photographs, companies, movies, news and messages, with different relationships among these objects. Heterogeneous networks are also especially useful for representing complex abstract concepts such as friendship and academic collaboration. Because these concepts are hard to measure directly, heterogeneous networks represent them through concrete, measurable objects and relationships. As a result, heterogeneous networks are applied in many areas, including social networks, the World Wide Web and research publication networks. This motivates the thesis to study network analysis in the context of heterogeneous networks. In the past, homogeneous networks were the focus of network analysis, so many methods proposed in previous studies of social network analysis were designed for homogeneous networks. Although heterogeneous networks can be considered an extension of homogeneous networks, most of these methods are not applicable to heterogeneous networks because they can handle only one type of object and relationship. Network analysis has three basic problems: community detection, link prediction and object ranking. These three problems underlie many practical applications, such as network structure extraction, recommendation systems and search engines.
Community detection, also called clustering, aims to find the community structure of a network, that is, subgroups of closely related vertices, which helps people understand how the network is organised. Link prediction is the task of finding links that do not currently exist in a network but may appear in the future. Object ranking can be viewed as an object evaluation task that orders a set of objects by their importance, relevance or other user-defined criteria. In addition to these three research issues, determining the number of clusters a priori is also important because it can significantly improve the quality of community detection. This thesis works on heterogeneous networks and proposes a set of methods to address four main research problems in network analysis: community detection, determining the number of clusters, link prediction and object ranking. The thesis makes four contributions. Contribution 1 proposes a Multiple Semantic-path Clustering method that helps users achieve a desired clustering in heterogeneous networks. Contribution 2 develops a Leader Detection and Grouping Clustering method that determines the number of clusters a priori, thereby improving the quality of clustering. Contribution 3 introduces a Network Evolution-based Link Prediction method that improves link prediction accuracy by modeling the evolution patterns of objects. Contribution 4 proposes a co-ranking method that works on complex bipartite heterogeneous networks in which vertices of one type can connect to one another directly and indirectly. The performance of all methods developed in the thesis, in terms of clustering quality, link prediction accuracy and ranking effectiveness, is evaluated on a research management dataset from the University of Technology, Sydney (UTS) and on the public bibliographic DBLP (DataBase systems and Logic Programming) dataset.
Moreover, the results of all proposed methods are compared with those of state-of-the-art methods, and the experiments suggest that the proposed methods outperform them in both quantitative and qualitative analysis.
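To make the notions of typed vertices and link prediction more concrete, the following is a minimal Python sketch of a toy bibliographic network. It is not one of the methods proposed in the thesis. The data, the relation names `writes` and `published_in`, and the helper functions are all hypothetical and chosen for illustration only. The sketch stores edges as typed triples, follows the author-paper-author path to count co-authorships, and then scores a missing author pair with the standard common-neighbours heuristic.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (source, relation, target) triples.
# Each vertex is a (type, name) pair, so the network is heterogeneous.
EDGES = [
    (("author", "alice"), "writes", ("paper", "p1")),
    (("author", "bob"), "writes", ("paper", "p1")),
    (("author", "bob"), "writes", ("paper", "p2")),
    (("author", "carol"), "writes", ("paper", "p2")),
    (("paper", "p1"), "published_in", ("venue", "v1")),
    (("paper", "p2"), "published_in", ("venue", "v1")),
]


def sources_by_target(edges, relation):
    """Map each target vertex to the set of sources linked by `relation`."""
    index = defaultdict(set)
    for source, rel, target in edges:
        if rel == relation:
            index[target].add(source)
    return index


def coauthor_counts(edges):
    """Count author-paper-author paths for each unordered author pair."""
    counts = defaultdict(int)
    for authors in sources_by_target(edges, "writes").values():
        for a, b in combinations(sorted(authors), 2):
            counts[(a[1], b[1])] += 1
    return dict(counts)


def common_neighbour_score(counts, x, y):
    """Score a candidate link (x, y) by the number of shared co-authors."""
    def coauthors(v):
        return {b if a == v else a for a, b in counts if v in (a, b)}
    return len(coauthors(x) & coauthors(y))


counts = coauthor_counts(EDGES)
score = common_neighbour_score(counts, "alice", "carol")
```

In this toy network, `alice` and `carol` have never written a paper together but share the co-author `bob`, so the heuristic assigns their candidate link a non-zero score. Projecting onto a single vertex type in this way discards the venue information, which is the kind of loss that motivates methods designed directly for heterogeneous networks.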