Two-Speed Deep-Learning Ensemble for Classification of Incremental Land-Cover Satellite Image Patches

Horry, MJ; Chakraborty, S; Pradhan, B; Shulka, N; Almazroui, M

Publisher: Springer Nature
Publication Type: Journal Article
Citation: Earth Systems and Environment, 2023, pp. 1-16
Issue Date: 2023-01-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Horry, MJ
dc.contributor.author	Chakraborty, S https://orcid.org/0000-0002-0102-5424
dc.contributor.author	Pradhan, B https://orcid.org/0000-0001-9863-2054
dc.contributor.author	Shulka, N
dc.contributor.author	Almazroui, M
dc.date.accessioned	2023-05-08T22:57:39Z
dc.date.available	2023-05-08T22:57:39Z
dc.date.issued	2023-01-01
dc.identifier.citation	Earth Systems and Environment, 2023, pp. 1-16
dc.identifier.issn	2509-9426
dc.identifier.issn	2509-9434
dc.identifier.uri	http://hdl.handle.net/10453/170282
dc.language	en
dc.publisher	Springer Nature
dc.relation.ispartof	Earth Systems and Environment
dc.relation.isbasedon	10.1007/s41748-023-00343-3
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Two-Speed Deep-Learning Ensemble for Classification of Incremental Land-Cover Satellite Image Patches
dc.type	Journal Article
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Civil and Environmental Engineering
pubs.organisational-group	/University of Technology Sydney/Strength - CAMGIS - Centre for Advanced Modelling and Geospatial Information Systems
utslib.copyright.status	open_access	*
dc.date.updated	2023-05-08T22:57:37Z
pubs.publication-status	Published

Abstract:

High-velocity data streams present a challenge to deep learning-based computer vision models because of the resources needed to retrain on new incremental data. This study presents a novel staggered training approach using an ensemble model comprising: (i) a resource-intensive, high-accuracy vision transformer; and (ii) a fast-training but less accurate convolutional neural network (CNN) with a low parameter count. The vision transformer provides a scalable and accurate base model, while the CNN quickly incorporates new data into the ensemble. Incremental data are simulated by dividing the very large So2Sat LCZ42 satellite image dataset into four intervals. The CNN is trained every interval and the vision transformer is trained every half interval. We call this combination of a complementary ensemble with staggered training a “two-speed” network. The novelty of the approach lies in the staggered training schedule, which lets the ensemble incorporate new data efficiently by retraining the high-speed CNN in advance of the resource-intensive vision transformer, thereby allowing stable, continuous improvement of the ensemble. Additionally, the ensemble model for each data increment outperforms each of its component models, with a best accuracy of 65% against a holdout test partition of the RGB version of the So2Sat dataset.
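
The staggered schedule described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact retraining cadence of the vision transformer, so the sketch assumes it is trained on the first interval (as the base model) and then on every second interval, while the CNN is retrained on every interval. The function names, the `vit_every` parameter, and the weighted averaging of class probabilities in `ensemble_predict` are all hypothetical choices made for the example.

```python
NUM_INTERVALS = 4  # from the abstract: the dataset is divided into four intervals


def staggered_schedule(num_intervals=NUM_INTERVALS, vit_every=2):
    """Return (interval, cnn_trained_through, vit_trained_through) per interval.

    ASSUMPTION: the vision transformer is trained at interval 1 and then on
    every `vit_every`-th interval after that; the CNN is retrained every
    interval. The abstract does not specify this exact cadence.
    """
    cnn_through = 0
    vit_through = 0
    plan = []
    for interval in range(1, num_intervals + 1):
        cnn_through = interval  # fast model: always sees the newest increment
        if (interval - 1) % vit_every == 0:
            vit_through = interval  # slow model: catches up only periodically
        plan.append((interval, cnn_through, vit_through))
    return plan


def ensemble_predict(cnn_probs, vit_probs, vit_weight=0.5):
    """Blend per-class probabilities and return the winning class index.

    ASSUMPTION: weighted averaging is used here purely for illustration; the
    abstract does not state how the ensemble combines its members.
    """
    blended = [
        vit_weight * v + (1.0 - vit_weight) * c
        for c, v in zip(cnn_probs, vit_probs)
    ]
    return max(range(len(blended)), key=blended.__getitem__)


if __name__ == "__main__":
    for interval, cnn_through, vit_through in staggered_schedule():
        print(f"interval {interval}: CNN through {cnn_through}, ViT through {vit_through}")
```

Under these assumptions the CNN runs one increment ahead of the vision transformer on the second and fourth intervals, which is the sense in which the fast model incorporates new data before the slow model is retrained.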

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/170282