Explicitly and implicitly exploiting the hierarchical structure for mining website interests on news events

Publication Type:
Journal Article
Citation:
Information Sciences, 2017, 420 pp. 263 - 277
Issue Date:
2017-12-01
Files in This Item:
Filename: Xuan explicitly.pdf
Description: Accepted manuscript version
Size: 3.99 MB
Format: Adobe PDF
© 2017 Elsevier Inc. After a news event, many different websites publish coverage of that event, each expressing its own commentary, perspectives, and viewpoints. Websites form around a specific set of interests to cater to different audiences, and discovering these interests can help audiences, especially people and organizations that are interested in news, select the most appropriate websites to use as their sources of information. This paper presents three methods for formally defining and mining a website's interests, each of which is explicitly or implicitly based on a hierarchical structure: website-webpage-keyword. The first, and most straightforward, method explicitly uses keyword-layer network communities and the mapping relations between websites and keywords. The second method expands upon the first with an iterative algorithm that combines both the mapping relations and the network relations from the website-webpage-keyword structure to further refine the keyword-layer network communities. In the third method, a website topic model implicitly captures the mapping relations among the websites, webpages, and keywords. The performance of the three proposed methods in website interest mining is compared using a bespoke evaluation metric. The experimental results show that the iterative procedure designed in the second method improves website interest mining performance, and that the website topic model in the third method achieves the best performance among the three methods.
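As a rough illustration of the idea behind the first method, the sketch below builds a toy website-webpage-keyword hierarchy, links keywords that co-occur on a webpage, groups them into communities, and maps each website onto those communities. It is not the paper's algorithm: the data, the function names (`keyword_graph`, `communities`, `website_interests`), and the use of connected components as a stand-in for community detection are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: website -> webpage -> keywords (not from the paper).
SITES = {
    "site_a": {"a1": ["election", "vote"], "a2": ["vote", "poll"]},
    "site_b": {"b1": ["goal", "league"], "b2": ["league", "match"]},
    "site_c": {"c1": ["election", "poll"], "c2": ["match", "goal"]},
}


def keyword_graph(sites):
    """Link two keywords whenever they co-occur on the same webpage."""
    graph = {}
    for pages in sites.values():
        for keywords in pages.values():
            for kw in keywords:
                graph.setdefault(kw, set())  # keep isolated keywords too
            for u, v in combinations(sorted(set(keywords)), 2):
                graph[u].add(v)
                graph[v].add(u)
    return graph


def communities(graph):
    """Assumed stand-in for community detection: connected components."""
    label = {}
    next_id = 0
    for start in sorted(graph):
        if start in label:
            continue
        label[start] = next_id
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour not in label:
                    label[neighbour] = next_id
                    stack.append(neighbour)
        next_id += 1
    return label


def website_interests(sites, label):
    """Share of each website's keyword occurrences per keyword community."""
    interests = {}
    for site, pages in sites.items():
        counts = defaultdict(int)
        for keywords in pages.values():
            for kw in keywords:
                counts[label[kw]] += 1
        total = sum(counts.values()) or 1  # avoid dividing by zero
        interests[site] = {c: n / total for c, n in sorted(counts.items())}
    return interests


if __name__ == "__main__":
    labels = communities(keyword_graph(SITES))
    for site, profile in website_interests(SITES, labels).items():
        print(site, profile)
```

In this toy setting a website's interest is simply a distribution over keyword communities, so a site covering only one community is fully concentrated on it, while a site covering two is split between them. A real keyword network would need a proper community detection method in place of connected components.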