A Causal Dirichlet Mixture Model for Causal Inference from Observational Data

Lin, A; Lu, J; Xuan, J; Zhu, F; Zhang, G

Publisher: Association for Computing Machinery (ACM)
Publication Type: Journal Article
Citation: ACM Transactions on Intelligent Systems and Technology, 2020, 11, (3), pp. 1-29
Issue Date: 2020-05-01

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Lin, A
dc.contributor.author	Lu, J https://orcid.org/0000-0003-0690-4732
dc.contributor.author	Xuan, J https://orcid.org/0000-0002-8367-6908
dc.contributor.author	Zhu, F https://orcid.org/0000-0001-8089-4769
dc.contributor.author	Zhang, G https://orcid.org/0000-0003-3960-0583
dc.date.accessioned	2020-10-14T05:39:33Z
dc.date.available	2020-10-14T05:39:33Z
dc.date.issued	2020-05-01
dc.identifier.citation	ACM Transactions on Intelligent Systems and Technology, 2020, 11, (3), pp. 1-29
dc.identifier.issn	2157-6904
dc.identifier.issn	2157-6912
dc.identifier.uri	http://hdl.handle.net/10453/143237
dc.language	en
dc.publisher	Association for Computing Machinery (ACM)
dc.relation	http://purl.org/au-research/grants/arc/DP170101632
dc.relation.ispartof	ACM Transactions on Intelligent Systems and Technology
dc.relation.isbasedon	10.1145/3379500
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 0806 Information Systems
dc.title	A Causal Dirichlet Mixture Model for Causal Inference from Observational Data
dc.type	Journal Article
utslib.citation.volume	11
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0806 Information Systems
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	closed_access	*
dc.date.updated	2020-10-14T05:39:28Z
pubs.issue	3
pubs.publication-status	Published
pubs.volume	11
utslib.citation.issue	3

Abstract:

© 2020 ACM. Estimating causal effects by making causal inferences from observational data is common practice in scientific studies, business decision-making, and daily life. In today's data-driven world, causal inference has become a key part of the evaluation process for many purposes, such as examining the effects of medicine or the impact of an economic policy on society. However, although the literature contains some excellent models, there is room to improve their representation power and their ability to capture complex relationships. For these reasons, we propose a novel prior called Causal DP and a model called CDP. The prior captures the complex relationships between covariates, treatments, and outcomes in observational data using a rational probabilistic dependency structure. The model is Bayesian, nonparametric, and generative and is not based on the assumption of any parametric distribution. CDP is designed to estimate various kinds of causal effects - average, conditional average, average treated, quantile, and so on. It performs well with missing covariates and does not suffer from overfitting. Comparative experiments on synthetic datasets against several state-of-the-art methods demonstrate that CDP has a superior ability to capture complex relationships. Further, a simple evaluation to infer the effect of a job training program on trainee earnings from real-world data shows that CDP is both effective and useful for causal inference.
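
The kinds of causal effect named in the abstract (average, conditional average, average on the treated, quantile) can be illustrated with a small sketch. This is not the CDP model or the authors' code. It assumes a hypothetical toy dataset in which both potential outcomes (`y0` untreated, `y1` treated) are known for every unit, which real observational data never provides; a model such as CDP would have to infer the missing one. All names and numbers below are made up for illustration.

```python
from statistics import mean, median

# Hypothetical toy units (illustrative values, not from the paper).
# x: a binary covariate, t: treatment flag, y0/y1: potential outcomes.
units = [
    {"x": 0, "t": 1, "y0": 1.0, "y1": 3.0},
    {"x": 0, "t": 0, "y0": 2.0, "y1": 3.0},
    {"x": 1, "t": 1, "y0": 4.0, "y1": 8.0},
    {"x": 1, "t": 0, "y0": 5.0, "y1": 6.0},
]


def ate(units):
    """Average treatment effect: mean of y1 - y0 over all units."""
    return mean(u["y1"] - u["y0"] for u in units)


def cate(units, x):
    """Conditional average effect: mean of y1 - y0 among units with covariate x."""
    return mean(u["y1"] - u["y0"] for u in units if u["x"] == x)


def att(units):
    """Average effect on the treated: mean of y1 - y0 among units with t == 1."""
    return mean(u["y1"] - u["y0"] for u in units if u["t"] == 1)


def median_effect(units):
    """Quantile effect at the 0.5 quantile: median of y1 minus median of y0."""
    return median(u["y1"] for u in units) - median(u["y0"] for u in units)


if __name__ == "__main__":
    print("ATE:", ate(units))
    print("CATE (x=0):", cate(units, 0))
    print("CATE (x=1):", cate(units, 1))
    print("ATT:", att(units))
    print("Median effect:", median_effect(units))
```

Note that the quantile effect here compares the marginal distributions of the two potential outcomes, so it generally differs from the median of the individual differences.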

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/143237