Modelling population-level treatment pathways using patient-level administrative health records
- Publication Type: Thesis
- Issue Date: 2024
Open Access
This item is open access.
Treatment pathways are structured guidelines detailing the sequence of medical interventions for managing specific diseases, such as the progression from diagnosis to surgery and post-operative care for cancer patients. They are regularly revised as more effective treatments are discovered. By retrospectively analysing a patient group's health records, researchers can determine the most frequently used treatment regimens for that population. This process, known as pathway inference, holds enormous potential, yet it remains under-researched. This thesis examines the challenges surrounding pathway inference and makes three key contributions to the field.
The first contribution highlights and examines the two methods used to analyse administrative healthcare records (AHRs): direct and indirect analytics. While direct analytics analyses attributes of AHRs, indirect analytics uses AHRs to infer clinical events that are not directly observed. By placing pathway inference within this framework, this contribution elucidates the shortcomings of existing pathway inference studies and highlights the key challenges faced by the field. Specifically, the clinical data in AHRs is incompatible with treatment pathways, creating a 'semantic gap' between the two. This thesis addresses the two main problems stemming from this gap.
Because of the semantic gap, pathway inference solutions could not previously be compared or validated. The second contribution, CatSyn, pioneers the quantitative evaluation of these techniques using ground-truth pathways based on synthetic data. I validated CatSyn's ability to synthesise realistic tabular AHRs and, through its use, found that existing pathway inference techniques capture the semantic content of AHRs well but fall short in leveraging the temporal information necessary for accurate pathway inference.
Additionally, the semantic gap revealed that the temporal information contained within AHRs is inherently complex and difficult to harness. The third contribution, Defrag, introduces a state-of-the-art, neural network (NN)-based approach for pathway inference. Defrag uses a novel, self-supervised learning objective that is specifically engineered to capture both the semantic and temporal context of AHR event sequences. Extensive ablation studies revealed that the combination of Defrag's NN-based approach and its loss function enables it to outperform other methods and alternative NN configurations. I also demonstrated Defrag's effectiveness by identifying best-practice pathway fragments for breast cancer, lung cancer, and melanoma in the real-world MIMIC-IV dataset.
Through the methods it introduces, this thesis supports future work in pathway inference. This research paves the way for population-level treatment analysis, more accurate and efficient treatment planning, and, ultimately, better patient outcomes.