Predictive risk modelling under different data access scenarios: Who is identified as high risk and for how long?

Publication Type:
Journal Article
BMJ Open, 2018, 8 (2)
© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved.
Objective: This observational study critically explored the performance of different predictive risk models simulating three data access scenarios, comparing: (1) sociodemographic and clinical profiles; (2) consistency in high-risk designation across models; and (3) persistence of high-risk status over time.
Methods: Cross-sectional health survey data (2006-2009) for more than 260 000 Australian adults aged 45+ years were linked to longitudinal individual hospital, primary care, pharmacy and mortality data. Three risk models predicting acute emergency hospitalisations were explored, simulating conditions where data are accessed through primary care practice management systems, or through hospital-based electronic records, or through a hypothetical 'full' model using a wider array of linked data. High-risk patients were identified using different risk score thresholds. Models were reapplied monthly for 24 months to assess persistence in high-risk categorisation.
Results: The three models displayed similar statistical performance. Three-quarters of patients in the high-risk quintile from the 'full' model were also identified using the primary care or hospital-based models, with the remaining patients differing according to age, frailty, multimorbidity, self-rated health, polypharmacy, prior hospitalisations and imminent mortality. The use of higher risk prediction thresholds resulted in lower levels of agreement in high-risk designation across models and greater morbidity and mortality in identified patient populations. Persistence of high-risk status varied across approaches according to updated information on utilisation history, with up to 25% of patients reassessed as lower risk within 1 year.
Conclusion/implications: Small differences in risk predictors or risk thresholds resulted in comparatively large differences in who was classified as high risk and for how long. Pragmatic predictive risk modelling design decisions based on data availability or projected high-risk patient numbers may therefore influence which individuals are identified as high risk, as well as overall case mix and risk persistence. Routine data linkage would enable greater flexibility in developing and optimising predictive risk models appropriate to both case-finding and performance measurement applications.
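The idea of agreement in high-risk designation across models can be illustrated with a minimal sketch. This is not the study's code or method: the function names `high_risk_ids` and `agreement`, the patient IDs and all risk scores are hypothetical and made up for illustration. It flags the top fraction of patients by risk score under two models, then reports what share of one model's high-risk group the other model also flags.

```python
def high_risk_ids(scores, top_fraction=0.2):
    """Return the IDs whose risk scores fall in the top `top_fraction`."""
    # round() guards against floating-point overshoot such as 10 * 0.3.
    n_flagged = max(1, round(len(scores) * top_fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_flagged])


def agreement(reference, other):
    """Share of the reference high-risk group that `other` also flags."""
    return len(reference & other) / len(reference)


# Made-up risk scores for ten hypothetical patients under two models.
full_model = {
    "p01": 0.91, "p02": 0.85, "p03": 0.40, "p04": 0.78, "p05": 0.15,
    "p06": 0.33, "p07": 0.72, "p08": 0.05, "p09": 0.60, "p10": 0.22,
}
primary_care_model = {
    "p01": 0.88, "p02": 0.35, "p03": 0.45, "p04": 0.80, "p05": 0.20,
    "p06": 0.30, "p07": 0.75, "p08": 0.10, "p09": 0.83, "p10": 0.25,
}

# Compare a looser threshold (top 40%) with a stricter one (top 20%).
for fraction in (0.4, 0.2):
    reference = high_risk_ids(full_model, fraction)
    other = high_risk_ids(primary_care_model, fraction)
    print(f"top {fraction:.0%}: agreement = {agreement(reference, other):.2f}")
```

With these invented scores, the stricter threshold yields lower agreement than the looser one, which is the direction of effect the abstract describes; the magnitudes carry no meaning.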