Causality for Interpretable Machine Learning

Publication Type: Thesis
Issue Date: 2023
Recent years have seen a marked surge in the adoption of machine learning (ML) techniques across a broad spectrum of fields, including image analysis, text categorization, credit scoring, and recommendation systems. While these techniques have made significant strides across sectors, researchers are increasingly concerned about their intrinsic “black-box” nature, and the need to interpret machine learning models has consequently taken center stage in scholarly debate. Conventional approaches to machine learning interpretability, however, have focused primarily on associative rather than causal relationships. This thesis seeks to bridge that gap by developing and enhancing both causal inference and counterfactual methodologies for the causal interpretation of machine learning models. It first offers a comprehensive review of the causal analysis techniques applied to machine learning models. It then proposes a novel approach to causal inference anchored in the concept of dynamic propensity scores. For counterfactual explanation, the thesis puts forward two strategies: one that prioritizes causality in order to preserve the causal relationships within counterfactual instances, and another built on a normalizing-flows framework designed to yield scalable and robust counterfactual samples. For counterfactual fairness, the thesis formulates a min-max strategy that achieves counterfactual fairness even under an imperfect structural causal model. Collectively, this research aims to enhance the interpretability of machine learning models through causal explanations and counterfactual analyses.
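The “dynamic” propensity score method is a contribution of the thesis itself; as background only, the sketch below illustrates the classical static propensity score and inverse-propensity weighting that such work builds on. The synthetic data-generating process, variable names, and the true effect of 2.0 are illustrative assumptions, not drawn from the thesis.

```python
# Minimal sketch: classical (static) propensity scores with
# inverse-propensity weighting (IPW) on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 3))                      # observed confounders
t = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))  # treatment depends on x[:, 0]
y = 2.0 * t + x[:, 0] + rng.normal(size=n)       # true treatment effect = 2.0

# Step 1: estimate the propensity score e(x) = P(T = 1 | X = x).
e = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]

# Step 2: IPW estimate of the average treatment effect.
ate = np.mean(t * y / e - (1 - t) * y / (1 - e))
print(f"IPW ATE estimate: {ate:.2f} (ground truth: 2.00)")
```

Because treatment assignment here depends on the confounder x[:, 0], a naive difference in group means would be biased; reweighting by the estimated propensity score recovers an approximately unbiased estimate of the effect.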