An Intelligent Bibliometric System for Knowledge Association and Hierarchy Discovery

Publication Type:
Thesis
Issue Date:
2023
Unravelling intricate knowledge patterns and uncovering the intelligence concealed within scientific literature are persistent objectives of the bibliometric and data mining research communities. The rapid proliferation of scientific publications, coupled with the increasing prevalence of cross-/multi-/inter-disciplinary collaborations and the expanding scope of knowledge, poses continuous challenges for scholars seeking to keep abreast of the latest advancements and attain a comprehensive understanding of their respective domains. Present knowledge mining methodologies struggle to accommodate diverse emerging demands flexibly and often require prior expertise from domain specialists to achieve effective analysis, which limits their practicality in real-world knowledge analysis tasks. Aiming to contribute more adaptive and feasible knowledge mining approaches, this thesis combines bibliometric and management theories with data mining and natural language processing techniques (i.e., intelligent bibliometrics) to construct an intelligent bibliometric system for 1) knowledge association analysis and inference and 2) knowledge hierarchy extraction and characterisation. The system consists of two methodologies, with scientific literature corpora as the input and bioentity rankings, bioentity association predictions and topic hierarchy visualisations as the output. The first methodology is a heterogeneous bioentity analysis methodology (HBAM), which focuses on the biomedical domain and provides a literature-based knowledge discovery approach that ranks extracted bioentities and predicts undiscovered bioentity associations. This methodology leverages bioentities' heterogeneity and latent semantic similarities to facilitate more comprehensive bioentity ranking and more accurate bioentity association prediction.
The second methodology focuses on knowledge hierarchies and develops two hierarchical topic tree (HTT) models to extract and visualise topic hierarchies from scientific literature data adaptively. Both models can generate consistent research topics and solid parent-child topic relationships; the second model is refined to be parameter-free and is more adaptive. Lastly, the constructed intelligent bibliometric system integrates the proposed methods into a work pipeline, and a graphical user interface developed in Python provides an accessible way for users without a technical background to conduct customised analysis. Academic researchers, policymakers, and entrepreneurs in certain domains can benefit from the system's ability to uncover knowledge associations and profile knowledge hierarchies for informed decision-making.