Homophily, structure, and content augmented network representation learning

Zhang, D; Yin, J; Zhu, X; Zhang, C

Publication Type: Conference Proceeding
Citation: Proceedings - IEEE International Conference on Data Mining, ICDM, 2017, pp. 609 - 618
Issue Date: 2017-01-31

Closed Access

Filename	Description	Size
HSCA.pdf	Published version	306 kB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Zhang, D https://orcid.org/0000-0002-1803-5768	en_US
dc.contributor.author	Yin, J	en_US
dc.contributor.author	Zhu, X	en_US
dc.contributor.author	Zhang, C https://orcid.org/0000-0001-5715-7154	en_US
dc.date.issued	2017-01-31	en_US
dc.identifier.citation	Proceedings - IEEE International Conference on Data Mining, ICDM, 2017, pp. 609 - 618	en_US
dc.identifier.isbn	9781509054725	en_US
dc.identifier.issn	1550-4786	en_US
dc.identifier.uri	http://hdl.handle.net/10453/102723
dc.relation	http://purl.org/au-research/grants/arc/DP140102206
dc.relation.ispartof	Proceedings - IEEE International Conference on Data Mining, ICDM	en_US
dc.relation.isbasedon	10.1109/ICDM.2016.139	en_US
dc.title	Homophily, structure, and content augmented network representation learning	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/DVC (International)
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
pubs.organisational-group	/University of Technology Sydney/Strength - ACRI - Australia China Relations Institute
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

© 2016 IEEE. Advances in social networking and communication technologies have produced a growing number of applications in which data is not only characterized by rich content information but also connected through complex relationships representing social roles and dependencies between individuals. To enable knowledge discovery from such networked data, network representation learning (NRL) aims to learn vector representations for network nodes, so that off-the-shelf machine learning algorithms can be applied directly. To date, existing NRL methods either focus primarily on network structure or simply combine node content and topology for learning. We argue that in information networks, information mainly originates from three sources: (1) homophily, (2) topology structure, and (3) node content. Homophily is the social phenomenon in which individuals sharing similar attributes (content) tend to be directly connected through local relational ties, whereas topology structure places more emphasis on global connections. To ensure effective network representation learning, we propose to combine the three information sources in one learning objective function, so that the interplay among the three is enforced by requiring the learned network representations to (1) be consistent with node content and topology structure and (2) follow the social homophily constraints in the learned space. Experiments on multi-class node classification demonstrate that the representations learned by the proposed method consistently outperform state-of-the-art NRL methods, especially for very sparsely labeled networks.
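
As a rough illustration of the idea in the abstract, the sketch below folds three terms into a single objective: a structure term, a content term, and a homophily penalty on directly linked nodes. The abstract does not give the form of the objective, so everything here is an assumption made for illustration: the function names (`homophily_penalty`, `combined_objective`), the squared-error terms, the `proximity` and `content` inputs, and the weights `alpha` and `mu` are all hypothetical and should not be read as the authors' formulation.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def squared_distance(u: Vector, v: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def homophily_penalty(emb: Dict[int, Vector],
                      edges: List[Tuple[int, int]]) -> float:
    """Sum of squared distances between embeddings of linked nodes.

    Assumed form: small when directly connected nodes sit close
    together in the learned space.
    """
    return sum(squared_distance(emb[i], emb[j]) for i, j in edges)


def combined_objective(emb: Dict[int, Vector],
                       content: Dict[int, Vector],
                       proximity: Dict[Tuple[int, int], float],
                       edges: List[Tuple[int, int]],
                       alpha: float = 1.0,
                       mu: float = 1.0) -> float:
    """Hypothetical single objective over three information sources.

    structure: inner products of embeddings should match a given
               pairwise proximity score (assumed squared error).
    content:   embeddings should stay close to content vectors that
               are assumed to live in the same space already.
    homophily: linked nodes should have nearby embeddings.
    """
    structure_term = sum(
        (score - dot(emb[i], emb[j])) ** 2
        for (i, j), score in proximity.items()
    )
    content_term = sum(squared_distance(emb[i], content[i]) for i in emb)
    return (structure_term
            + alpha * content_term
            + mu * homophily_penalty(emb, edges))


if __name__ == "__main__":
    # Toy data, invented for this example only.
    emb = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
    content = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
    proximity = {(0, 1): 1.0, (0, 2): 0.0, (1, 2): 0.0}
    # Linking the dissimilar nodes 0 and 2 incurs a homophily cost.
    print(combined_objective(emb, content, proximity, [(0, 2)], mu=0.5))
```

In this toy setup the structure and content terms are zero, so the value of the objective comes entirely from the homophily penalty on the edge between two dissimilar nodes. A learning procedure would minimize such an objective over the embeddings, which this sketch does not attempt.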

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/102723