A smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes

Fan, Y; He, M; Su, L; Zhou, XH

Publication Type: Journal Article
Citation: Scandinavian Journal of Statistics, 2019, 46 (2), pp. 446 - 469
Issue Date: 2019-06-01

Closed Access

Filename	Description	Size
A smoothed Q‐learning algorithm for estimating optimal dynamic treatment regimes.pdf	Published Version	622.37 kB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Fan, Y	en_US
dc.contributor.author	He, M https://orcid.org/0000-0003-2663-9526	en_US
dc.contributor.author	Su, L	en_US
dc.contributor.author	Zhou, XH	en_US
dc.date.issued	2019-06-01	en_US
dc.identifier.citation	Scandinavian Journal of Statistics, 2019, 46 (2), pp. 446 - 469	en_US
dc.identifier.issn	0303-6898	en_US
dc.identifier.uri	http://hdl.handle.net/10453/139260
dc.description.abstract	© 2018 Board of the Foundation of the Scandinavian Journal of Statistics. In this paper, we propose a smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes. In contrast to the Q-learning algorithm in which nonregular inference is involved, we show that, under assumptions adopted in this paper, the proposed smoothed Q-learning estimator is asymptotically normally distributed even when the Q-learning estimator is not, and its asymptotic variance can be consistently estimated. As a result, inference based on the smoothed Q-learning estimator is standard. We derive the optimal smoothing parameter and propose a data-driven method for estimating it. The finite sample properties of the smoothed Q-learning estimator are studied and compared with several existing estimators including the Q-learning estimator via an extensive simulation study. We illustrate the new method by analyzing data from the Clinical Antipsychotic Trials of Intervention Effectiveness–Alzheimer's Disease (CATIE-AD) study.	en_US
dc.relation.ispartof	Scandinavian Journal of Statistics	en_US
dc.relation.isbasedon	10.1111/sjos.12359	en_US
dc.subject.classification	Statistics & Probability	en_US
dc.title	A smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes	en_US
dc.type	Journal Article
utslib.citation.volume	2	en_US
utslib.citation.volume	46	en_US
utslib.for	0104 Statistics	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Business
utslib.copyright.status	closed_access
pubs.issue	2	en_US
pubs.publication-status	Published	en_US
pubs.volume	46	en_US

Abstract:

© 2018 Board of the Foundation of the Scandinavian Journal of Statistics. In this paper, we propose a smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes. In contrast to the Q-learning algorithm in which nonregular inference is involved, we show that, under assumptions adopted in this paper, the proposed smoothed Q-learning estimator is asymptotically normally distributed even when the Q-learning estimator is not, and its asymptotic variance can be consistently estimated. As a result, inference based on the smoothed Q-learning estimator is standard. We derive the optimal smoothing parameter and propose a data-driven method for estimating it. The finite sample properties of the smoothed Q-learning estimator are studied and compared with several existing estimators including the Q-learning estimator via an extensive simulation study. We illustrate the new method by analyzing data from the Clinical Antipsychotic Trials of Intervention Effectiveness–Alzheimer's Disease (CATIE-AD) study.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/139260