Unsupervised multi-author document decomposition based on hidden Markov model

Aldebei, K; He, X; Jia, W; Yang, J

Publication Type: Conference Proceeding
Citation: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, 2016, vol. 2, pp. 706-714
Issue Date: 2016-01-01

Closed Access

Filename	Description	Size	Format
20160530.pdf	Accepted Manuscript version	743.65 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Aldebei, K	en_US
dc.contributor.author	He, X https://orcid.org/0000-0001-8962-540X	en_US
dc.contributor.author	Jia, W https://orcid.org/0000-0002-0940-3338	en_US
dc.contributor.author	Yang, J	en_US
dc.date.available	2016-05-25	en_US
dc.date.issued	2016-01-01	en_US
dc.identifier.citation	54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers, 2016, 2 pp. 706 - 714	en_US
dc.identifier.isbn	9781510827585	en_US
dc.identifier.uri	http://hdl.handle.net/10453/120112
dc.description.abstract	© 2016 Association for Computational Linguistics. This paper proposes an unsupervised approach for segmenting a multi-author document into authorial components. The key novelty is that we utilize the sequential patterns hidden among document elements when determining their authorship. For this purpose, we adopt a Hidden Markov Model (HMM) and construct a sequential probabilistic model to capture the dependencies between sequential sentences and their authorship. An unsupervised learning method is developed to initialize the HMM parameters. Experimental results on benchmark datasets demonstrate the significant benefit of our idea, and our approach outperforms state-of-the-art methods on all tests. As an example of its applications, the proposed approach is applied to attributing the authorship of a document and also shows promising results.	en_US
dc.relation.ispartof	54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers	en_US
dc.title	Unsupervised multi-author document decomposition based on hidden Markov model	en_US
dc.type	Conference Proceeding
utslib.citation.volume	2	en_US
utslib.for	080104 Computer Vision	en_US
utslib.for	0805 Distributed Computing	en_US
utslib.for	080106 Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney/Strength - CRIN - Realtime Information Networks
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	2	en_US

Abstract:

© 2016 Association for Computational Linguistics. This paper proposes an unsupervised approach for segmenting a multi-author document into authorial components. The key novelty is that we utilize the sequential patterns hidden among document elements when determining their authorship. For this purpose, we adopt a Hidden Markov Model (HMM) and construct a sequential probabilistic model to capture the dependencies between sequential sentences and their authorship. An unsupervised learning method is developed to initialize the HMM parameters. Experimental results on benchmark datasets demonstrate the significant benefit of our idea, and our approach outperforms state-of-the-art methods on all tests. As an example of its applications, the proposed approach is applied to attributing the authorship of a document and also shows promising results.
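
The sequential idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: this record does not describe the paper's emission model or its parameter initialization. The sketch below assumes two authors, a hypothetical "sticky" transition matrix, and made-up per-sentence likelihoods, and applies standard Viterbi decoding to show how sequence information may smooth an isolated sentence-level decision.

```python
import math


def viterbi(log_emissions, log_trans, log_start):
    """Return the most likely author sequence under an HMM.

    log_emissions[t][s]: log-likelihood of sentence t under author s.
    log_trans[r][s]:     log-probability of moving from author r to author s.
    log_start[s]:        log-probability that sentence 0 is by author s.
    """
    n_states = len(log_start)
    # best[t][s]: best log-score of any path that ends in author s at sentence t
    best = [[log_start[s] + log_emissions[0][s] for s in range(n_states)]]
    back = []  # back[t-1][s]: best predecessor of author s at sentence t
    for t in range(1, len(log_emissions)):
        row, ptr = [], []
        for s in range(n_states):
            prev = max(range(n_states),
                       key=lambda r: best[-1][r] + log_trans[r][s])
            row.append(best[-1][prev] + log_trans[prev][s] + log_emissions[t][s])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: best[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def to_log(rows):
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical per-sentence likelihoods for authors 0 and 1 (illustrative only).
emissions = [[0.8, 0.2], [0.7, 0.3], [0.45, 0.55],
             [0.6, 0.4], [0.1, 0.9], [0.2, 0.8]]
# Assumed transition matrix: consecutive sentences usually share an author.
transitions = [[0.9, 0.1], [0.1, 0.9]]
start = [0.5, 0.5]

# Labelling each sentence independently ignores its neighbours.
independent = [max(range(2), key=lambda s: row[s]) for row in emissions]
# Viterbi decoding also weighs the transitions between neighbouring sentences.
path = viterbi(to_log(emissions), to_log(transitions),
               [math.log(p) for p in start])

print(independent)  # [0, 0, 1, 0, 1, 1]
print(path)         # [0, 0, 0, 0, 1, 1]
```

With these assumed numbers, the third sentence slightly favours author 1 when judged alone, but the decoded sequence assigns it to author 0 because its neighbours do. The transition probabilities and likelihoods are placeholders; in an unsupervised setting they would have to be estimated from the document itself.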

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/44018