A Pareto-smoothing method for causal inference using generalized Pareto distribution

Publication Type:
Journal Article
Neurocomputing, 2020, 378, pp. 142-152
Filename: 1-s2.0-S0925231219314717-main.pdf
Description: Published version
Size: 1.39 MB
Format: Adobe PDF
© 2019 Elsevier B.V. Causal inference aims to estimate the effect of a treatment or intervention on a target outcome variable and has received great attention in fields ranging from economics and statistics to machine learning. Observational causal inference is challenging because pre-treatment variables may influence both the treatment and the outcome, resulting in confounding bias. The classic inverse propensity weighting (IPW) estimator can, in theory, eliminate this bias. In observational studies, however, the propensity scores used in the IPW estimator must be estimated from finite data and may take extreme values. The resulting importance weights are highly variable, which makes the estimated causal effect unstable or even misleading. In this paper, by reframing the IPW estimator in the importance sampling framework, we propose a Pareto-smoothing method to tackle this problem. The generalized Pareto distribution (GPD) from extreme value theory is fitted to the upper tail of the estimated importance weights, and the weights in that tail are replaced using the order statistics of the fitted GPD. To validate the new method, we conducted extensive experiments on simulated and semi-simulated datasets. Compared with two existing methods for importance weight stabilization, namely weight truncation and self-normalization, the proposed method generally achieves better performance in settings with a small sample size and high-dimensional covariates. Its application to a real-world health dataset indicates its utility in estimating causal effects for program evaluation.
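The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `pareto_smooth` and `ipw_ate`, the 20% tail fraction, the maximum-likelihood GPD fit via SciPy, the quantile positions (k - 0.5)/m used to approximate the expected order statistics, and the simulated data are all assumptions made for illustration. For brevity the true propensity scores stand in for estimated ones.

```python
import numpy as np
from scipy.stats import genpareto


def pareto_smooth(weights, tail_frac=0.2):
    """Replace the largest weights with quantiles of a GPD fitted to the tail.

    tail_frac is an assumed tuning choice, not a value taken from the paper.
    Requires more observations than the tail size.
    """
    w = np.asarray(weights, dtype=float)
    m = int(np.ceil(tail_frac * w.size))   # number of tail weights to smooth
    order = np.argsort(w)
    tail_idx = order[-m:]                  # indices of the m largest weights
    threshold = w[order[-m - 1]]           # largest weight below the tail
    exceed = w[tail_idx] - threshold       # exceedances over the threshold
    # Maximum-likelihood GPD fit with the location fixed at zero (assumption).
    shape, _, scale = genpareto.fit(exceed, floc=0)
    # Approximate expected order statistics by evenly spaced GPD quantiles.
    probs = (np.arange(1, m + 1) - 0.5) / m
    out = w.copy()
    out[tail_idx] = threshold + genpareto.ppf(probs, shape, loc=0, scale=scale)
    return out


def ipw_ate(y, t, e, smooth=False):
    """IPW estimate of the average treatment effect, optionally smoothed."""
    w1 = t / e                  # weights for treated units, zero elsewhere
    w0 = (1 - t) / (1 - e)      # weights for control units, zero elsewhere
    if smooth:
        # Smooth each arm's nonzero weights separately.
        w1[t == 1] = pareto_smooth(w1[t == 1])
        w0[t == 0] = pareto_smooth(w0[t == 0])
    return np.mean(w1 * y) - np.mean(w0 * y)


# Simulated example with a single confounder x and a true effect of 1.0.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-2.0 * x))   # true propensity; estimated in practice
t = rng.binomial(1, e)
y = 1.0 * t + x + rng.normal(size=n)

print("plain IPW:   ", ipw_ate(y, t, e))
print("smoothed IPW:", ipw_ate(y, t, e, smooth=True))
```

Smoothing each treatment arm separately keeps the zero weights of the other arm out of the tail fit. Weights below the threshold are left unchanged, and the smoothed tail preserves the original ranking of the weights.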