Benchmarking large language models for supply chain risk identification: an extended evaluation within the LARD-SC framework

Publisher: Springer Nature
Publication Type: Journal Article
Citation: Service Oriented Computing and Applications, 2025
Issue Date: 2025-01-01
Operational resilience in modern global supply chains depends on timely and accurate identification of emerging risks. While daily news has become a primary source for such insights, the sheer volume and unstructured nature of these data pose significant analytical challenges, requiring advanced tools to extract relevant and actionable information. This paper presents an extended evaluation of the LARD-SC framework, a service-oriented architecture for supply chain risk management, by benchmarking five large language model (LLM) variants on their capacity to detect, classify, and interpret risks. Drawing on a curated set of 120 real-world news articles on Apple’s Tier 1 suppliers, we adopt a standardized, prompt-based assessment to compare GPT-3.5 Turbo, GPT-4o, GPT-4o mini, Claude 3.5 Sonnet, and Claude 3.5 Haiku. Using expert-reviewed metrics, namely the Risk Validation Rate (RVR), Potential Risk Rate (PRR), and False Identification Rate (FIR), we derive a comprehensive Relative Performance Index (RPI) for comparison. Our analysis confirms that the GPT-4o variants produce the most consistently accurate risk identifications, achieving higher proportions of validated outcomes while minimizing false positives. These results highlight the promise of LLM-driven analytics for early risk detection in complex supply chains, along with practical considerations such as the influence of prompt engineering, interpretability demands, and the impact of data availability. The findings offer a blueprint for organizations seeking to improve resilience by systematically harnessing the capabilities of LLMs within service-oriented risk management ecosystems.
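To make the three expert-reviewed metrics concrete, the sketch below shows one plausible way to compute them. It assumes that each risk a model reports receives exactly one expert label (validated, potential, or false) and that RVR, PRR, and FIR are the corresponding shares of all reported risks; the abstract does not state the exact definitions, so this reading is an assumption. The function name `risk_rates` and the sample labels are hypothetical, and the RPI is not reproduced because its formula is not given here.

```python
from collections import Counter

# Assumed label set: one expert label per risk reported by a model.
LABELS = ("validated", "potential", "false")


def risk_rates(expert_labels):
    """Return (RVR, PRR, FIR) as shares of all labelled risks.

    Assumption: each rate is the count of one label divided by the
    total number of risks the model reported.
    """
    counts = Counter(expert_labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labelled risks to score")
    return tuple(counts[label] / total for label in LABELS)


# Illustrative labels only, not data from the study.
sample = ["validated"] * 6 + ["potential"] * 3 + ["false"] * 1
rvr, prr, fir = risk_rates(sample)
print(f"RVR={rvr:.2f} PRR={prr:.2f} FIR={fir:.2f}")
```

Under this reading the three rates sum to one for each model, so a higher RVR necessarily comes at the expense of PRR or FIR.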