Extensions of the External Validation for Checking Learned Model Interpretability and Generalizability

Publisher:
Elsevier BV
Publication Type:
Journal Article
Citation:
Patterns (New York, N.Y.), 2020, 1(8), 100129
Issue Date:
2020-11-13
We discuss the validation of machine learning models, a standard practice for determining model efficacy and generalizability. We argue that internal validation approaches, such as cross-validation and bootstrapping, cannot guarantee the quality of a machine learning model, owing to potentially biased training data and the complexity of the validation procedure itself. To better evaluate the generalization ability of a learned model, we suggest using data from external sources as validation datasets, an approach known as external validation. Because external validation has received little research attention, and in particular lacks a well-structured and comprehensive study, we discuss why it is necessary and propose two extensions of the external validation approach that may help reveal the true domain-relevant model within a candidate set. Moreover, we suggest a procedure for checking whether a set of validation datasets is valid and introduce statistical reference points for detecting problems in external data.
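
To make the contrast between internal and external validation concrete, the sketch below compares k-fold cross-validation scores against performance on a simulated "external" sample, and uses a per-feature two-sample Kolmogorov-Smirnov test as one plausible stand-in for the statistical reference points mentioned above. It is illustrative only and does not reproduce the procedure proposed in the paper; the datasets, the magnitude of the covariate shift, and the significance threshold are all hypothetical.

```python
# Illustrative sketch only; not the paper's proposed extensions.
# Assumes scikit-learn and scipy are available; the "external" sample
# is simulated with a hypothetical covariate shift.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Simulate one task, then split it into an internal (training) source
# and an "external" sample as if collected at a different site.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_int, y_int = X[:500], y[:500]
rng = np.random.default_rng(1)
X_ext = X[500:] + rng.normal(scale=0.5, size=(500, 10))  # hypothetical shift
y_ext = y[500:]

model = LogisticRegression(max_iter=1000)

# Internal validation: k-fold cross-validation on the training source.
cv_scores = cross_val_score(model, X_int, y_int, cv=5)
print(f"internal 5-fold CV accuracy: {cv_scores.mean():.3f}")

# External validation: fit on all internal data, score on the external set.
model.fit(X_int, y_int)
print(f"external accuracy: {model.score(X_ext, y_ext):.3f}")

# One plausible statistical reference point for detecting external data
# problems: a per-feature two-sample Kolmogorov-Smirnov test comparing
# internal and external feature distributions.
for j in range(X_int.shape[1]):
    stat, p = ks_2samp(X_int[:, j], X_ext[:, j])
    if p < 0.01:
        print(f"feature {j}: distribution shift suspected (KS p = {p:.2g})")
```

A large gap between the cross-validation and external accuracies, together with flagged features, is the kind of signal that internal validation alone cannot surface.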