An Approach of Hierarchical Concept Clustering on Medical Short Text Corpus

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
2013 6th International Conference on Biomedical Engineering and Informatics (BMEI 2013), 2013, pp. 509 - 518
Issue Date:
2013-01
Abstract:
Hierarchical clustering and conceptual clustering are two important families of cluster analysis methods, and a variety of approaches have been proposed in previous work. However, few methods are designed to run on medical short text databases and construct a hierarchical concept taxonomy. This paper proposes a new method, Hierarchical Concept Clustering on Medical Short Text corpus (HCCST), which offers a new solution for constructing an actionable disease taxonomy from real medical data. The approach has three advantages. First, HCCST adopts a new similarity measure designed to cover the problems that arise when computing distances between medical short texts. Second, it proposes an adaptive clustering method for synonymous disease names that does not require the size of the clusters to be predefined. Third, it uses a mutual information based method for recognizing potential hierarchical concept pairs, which improves on the subsumption method, to create a hierarchical disease taxonomy. The evaluation is conducted on a Chinese medical disease name text data set, and the results show that HCCST achieves satisfactory performance.
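The third component can be illustrated with a small sketch. The abstract does not give HCCST's formulas, so the code below is not the authors' method: it assumes the classic document co-occurrence form of subsumption (a term x is treated as a parent of y when P(x|y) is high and P(y|x) < 1) and assumes that pointwise mutual information (PMI) is used as a pre-filter for candidate pairs. The function names, the thresholds `pmi_min` and `subsume_min`, and the toy English records are all hypothetical.

```python
import math
from itertools import permutations


def doc_freq(docs):
    """Count how many records (sets of terms) each term appears in."""
    counts = {}
    for doc in docs:
        for term in doc:
            counts[term] = counts.get(term, 0) + 1
    return counts


def co_freq(docs, x, y):
    """Count the records that contain both x and y."""
    return sum(1 for doc in docs if x in doc and y in doc)


def pmi(n_docs, count_x, count_y, joint):
    """Pointwise mutual information from record-level counts (joint > 0)."""
    return math.log((joint * n_docs) / (count_x * count_y))


def hierarchy_pairs(docs, pmi_min=0.0, subsume_min=0.8):
    """Return (parent, child) pairs under the assumptions stated above.

    pmi_min and subsume_min are illustrative thresholds, not values
    taken from the paper.
    """
    counts = doc_freq(docs)
    pairs = []
    for parent, child in permutations(sorted(counts), 2):
        joint = co_freq(docs, parent, child)
        if joint == 0:
            continue  # terms never co-occur, so no relation is proposed
        if pmi(len(docs), counts[parent], counts[child], joint) <= pmi_min:
            continue  # assumed PMI pre-filter for weakly associated pairs
        p_parent_given_child = joint / counts[child]
        p_child_given_parent = joint / counts[parent]
        # Subsumption test: the parent almost always accompanies the child,
        # but the child does not always accompany the parent.
        if p_parent_given_child >= subsume_min and p_child_given_parent < 1.0:
            pairs.append((parent, child))
    return pairs


# Toy records: each set holds the disease terms found in one short text.
example_docs = [
    {"hepatitis", "hepatitis B"},
    {"hepatitis", "hepatitis B"},
    {"hepatitis", "hepatitis C"},
    {"hepatitis"},
    {"diabetes", "type 2 diabetes"},
    {"diabetes"},
    {"hypertension"},
]

for parent, child in hierarchy_pairs(example_docs):
    print(parent, "->", child)
```

On these toy records the broader terms come out as parents of the narrower ones, and "hypertension" stays unattached because it never co-occurs with another term. A real pipeline would first need the similarity and synonym clustering steps described in the abstract, which this sketch leaves out.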