Cross-domain learning for network representations

Publication Type: Thesis
Issue Date: 2019
Network representation aims to learn a latent feature space so that artificial intelligence algorithms can be applied to the latent features. These features are learned from the information hidden in network structures and provide knowledge for traditional machine learning tasks such as node classification, recommendation and data visualization. Because networks are structured data, representation performance is limited by how the structure is searched, so a good node sampling strategy plays an important role in network representation. Recent research has driven significant progress in network representation by employing random walks as the network sampling strategy. However, real-world large-scale information networks are naturally structurally sparse. Existing random walk-based approaches take a domain-specific view: they represent nodes as vectors using knowledge learned from a single network, which cannot guarantee a good representation. To address these gaps, this research proposes a framework and develops two algorithms that adapt useful information across relational large-scale information networks, allowing structural information to be transferred from one network to another to improve the performance of network representation. First, a novel framework for transferring structures across large-scale information networks (FTLSIN) is proposed. FTLSIN consists of a two-layer random walk that measures the relations between two networks and predicts the links across them. Second, a cross-domain network representation algorithm (CDNR) is proposed to demonstrate how knowledge transfers across domains. CDNR transfers structural information from dense networks to sparse networks and further defines the two-layer random walk in unsupervised feature learning through a cross-domain node mapping procedure and a cross-domain walk mapping procedure.
Third, a cross-domain similarity learning algorithm (CDSL) is proposed to acquire the most relevant knowledge from the external network. CDSL is nested in the biased random walk-based node sampling and targets the minimum cost of searching the neighborhood in a biased random walk that considers first-order and second-order walking. The neighborhood is described by a dual centrality indicator that combines closeness centrality and betweenness centrality. The framework and the two algorithms contribute to the fields of both transfer learning and network representation.
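The biased random walk and the closeness component of the centrality indicator mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's CDSL implementation: the function names `biased_walk` and `closeness`, the adjacency-dictionary input, and the return/in-out parameters `p` and `q` (borrowed from the common node2vec-style weighting) are all assumptions made for illustration, and betweenness centrality is omitted for brevity.

```python
import random
from collections import deque


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Sample one second-order biased random walk.

    adj maps each node to the set of its neighbours (undirected graph).
    p and q are hypothetical bias parameters in the node2vec style:
    a small p favours returning to the previous node, a small q favours
    moving away from it.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducible sampling order
        if not nbrs:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # First step has no previous node, so it is a plain
            # first-order (uniform) choice among neighbours.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay within the previous node's neighbourhood
            else:
                weights.append(1.0 / q)  # move outward, away from the previous node
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


def closeness(adj, node):
    """Closeness centrality of `node`: (reachable nodes) / (sum of BFS distances)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0
```

For a small graph such as `{0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}`, calling `biased_walk(adj, 0, 10, p=0.5, q=2.0)` yields a walk that tends to stay near its starting region, while `closeness(adj, 1)` scores the hub node higher than the leaf node 3.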