Exploring Associations within Disease-Gene Pairs: Bibliometrics, Word Embedding, and Network Analytics

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
2022 Portland International Conference on Management of Engineering and Technology (PICMET), 2022, 00, pp. 1-7
Issue Date:
2022-09-14
Filename:
Exploring associations within disease-gene pairs.pdf
Description:
Submitted version
Size:
521.4 kB (Adobe PDF)
Topic extraction and relationship identification are attracting increasing interest from the bibliometric community, as well as from related fields in biomedicine. Many recent biomedical studies have revealed pairwise associations between genes and diseases, which raises the problem of predicting and investigating newly emerging pairs. This paper proposes a method to predict and rank disease-gene pairs based on both semantic similarities between textual contexts and topological similarities between nodes within a disease-gene network. Specifically, genes and diseases are identified via a term clumping process, and association strengths are calculated from co-occurrence frequency and a pre-trained Word2Vec model. An integrated disease-gene network is then constructed, and potential emerging disease-gene pairs are captured through a modified link prediction approach. We applied the proposed method to a dataset of 27,727 scientific articles on atrial fibrillation to demonstrate the reliability of the model. The empirical insights derived from the case highlight implicit associations within the highly ranked disease-gene pairs and provide a reference for stakeholders in the cardiovascular field.
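The two signals described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the two-dimensional vectors standing in for pre-trained Word2Vec embeddings, the blending weight `alpha`, and the plain common-neighbour score (used here in place of the paper's modified link prediction approach, which the abstract does not specify) are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy data: the disease and gene terms found in each article.
records = [
    {"atrial fibrillation", "KCNQ1", "stroke"},
    {"atrial fibrillation", "KCNQ1"},
    {"atrial fibrillation", "SCN5A"},
    {"heart failure", "SCN5A"},
    {"heart failure", "stroke"},
]
diseases = {"atrial fibrillation", "heart failure", "stroke"}
genes = {"KCNQ1", "SCN5A"}

# Hypothetical 2-d vectors standing in for pre-trained Word2Vec embeddings.
vectors = {
    "atrial fibrillation": [0.9, 0.1],
    "KCNQ1": [0.8, 0.2],
    "SCN5A": [0.2, 0.9],
    "heart failure": [0.3, 0.8],
    "stroke": [0.6, 0.5],
}


def cooccurrence(records):
    """Count how many records mention each unordered pair of terms."""
    counts = Counter()
    for terms in records:
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def association_strength(a, b, counts, vectors, alpha=0.5):
    """Blend normalised co-occurrence with embedding similarity.

    alpha is an assumed weight, not a value taken from the paper.
    """
    key = tuple(sorted((a, b)))
    co = counts.get(key, 0) / max(counts.values())
    return alpha * co + (1 - alpha) * cosine(vectors[a], vectors[b])


def rank_unlinked_pairs(counts, diseases, genes):
    """Score disease-gene pairs that never co-occur by shared neighbours."""
    neighbours = {}
    for a, b in counts:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    scored = []
    for d in sorted(diseases):
        for g in sorted(genes):
            if g in neighbours.get(d, set()):
                continue  # already linked, so not a candidate emerging pair
            shared = neighbours.get(d, set()) & neighbours.get(g, set())
            scored.append(((d, g), len(shared)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


counts = cooccurrence(records)
ranking = rank_unlinked_pairs(counts, diseases, genes)
```

In this sketch, `association_strength` scores pairs that already co-occur, while `rank_unlinked_pairs` proposes pairs with no direct link yet. A real pipeline would load actual embeddings and would likely use a weighted similarity index rather than a raw neighbour count.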