Unseen land cover classification from high-resolution orthophotos using integration of zero-shot learning and convolutional neural networks

Pradhan, B; Al-Najjar, HAH; Sameen, MI; Tsang, I; Alamri, AM

Publisher: MDPI
Publication Type: Journal Article
Citation: Remote Sensing, 2020, 12, (10)
Issue Date: 2020-05-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Pradhan, B https://orcid.org/0000-0001-9863-2054
dc.contributor.author	Al-Najjar, HAH
dc.contributor.author	Sameen, MI
dc.contributor.author	Tsang, I
dc.contributor.author	Alamri, AM
dc.date.accessioned	2021-02-04T04:15:37Z
dc.date.available	2021-02-04T04:15:37Z
dc.date.issued	2020-05-01
dc.identifier.citation	Remote Sensing, 2020, 12, (10)
dc.identifier.issn	2072-4292
dc.identifier.uri	http://hdl.handle.net/10453/145820
dc.language	English
dc.publisher	MDPI
dc.relation.ispartof	Remote Sensing
dc.relation.isbasedon	10.3390/rs12101676
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	0203 Classical Physics, 0406 Physical Geography and Environmental Geoscience, 0909 Geomatic Engineering
dc.title	Unseen land cover classification from high-resolution orthophotos using integration of zero-shot learning and convolutional neural networks
dc.type	Journal Article
utslib.citation.volume	12
utslib.for	0203 Classical Physics
utslib.for	0406 Physical Geography and Environmental Geoscience
utslib.for	0909 Geomatic Engineering
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAMGIS - Centre for Advanced Modelling and Geospatial Information Systems
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Information, Systems and Modelling
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	open_access	*
pubs.consider-herdc	false
dc.date.updated	2021-02-04T04:15:15Z
pubs.issue	10
pubs.publication-status	Published
pubs.volume	12
utslib.citation.issue	10

Abstract:

© 2020 by the authors. Zero-shot learning (ZSL) is an approach for classifying objects that were not seen during the training phase, and it has been shown to be useful for real-world applications, especially when sufficient training data are lacking. Only a limited number of studies have been carried out on ZSL, especially in the field of remote sensing. This research investigates the use of a convolutional neural network (CNN) as a feature extraction and classification method for land cover mapping using high-resolution orthophotos. In the feature extraction phase, we used a CNN model with a single convolutional layer to extract discriminative features. In the second phase, we used class attributes learned from the Word2Vec model (pre-trained on Google News) to train a second CNN model that performed class signature prediction; this model used both the features extracted by the first CNN and the class attributes during training, and only the features during prediction. We trained and tested our models on datasets collected over two subareas in the Cameron Highlands (training dataset, first test dataset) and in Ipoh (second test dataset), Malaysia. Several experiments were conducted on the feature extraction and classification models regarding the main parameters, such as the network's layers and depth, the number of filters, and the impact of Gaussian noise. The best models were then selected using various accuracy metrics, such as top-k categorical accuracy for k = 1, 2, 3, recall, precision, and F1-score. The best model for feature extraction achieved 0.953 F1-score, 0.941 precision, and 0.882 recall for the training dataset; 0.904 F1-score, 0.869 precision, and 0.949 recall for the first test dataset; and 0.898 F1-score, 0.870 precision, and 0.838 recall for the second test dataset. The best model for classification achieved an average of 0.778 top-one, 0.890 top-two, and 0.942 top-three accuracy, 0.798 F1-score, 0.766 recall, and 0.838 precision for the first test dataset, and 0.737 top-one, 0.906 top-two, and 0.924 top-three accuracy, 0.729 F1-score, 0.676 recall, and 0.790 precision for the second test dataset. The results demonstrated that the proposed ZSL approach is a promising tool for land cover mapping based on high-resolution photos.
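
The class signature prediction and the top-k categorical accuracy mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the second model outputs one predicted signature vector per sample and that classes are ranked by cosine similarity to their signatures, a similarity measure the abstract does not specify. The function names, class names, and three-dimensional toy vectors are hypothetical stand-ins for real Word2Vec embeddings.

```python
import numpy as np


def rank_classes(pred_signatures, class_signatures):
    """Rank classes per sample by cosine similarity, most similar first.

    Cosine similarity is an assumption made for this sketch.
    """
    p = pred_signatures / np.linalg.norm(pred_signatures, axis=1, keepdims=True)
    c = class_signatures / np.linalg.norm(class_signatures, axis=1, keepdims=True)
    sims = p @ c.T                    # shape: (n_samples, n_classes)
    return np.argsort(-sims, axis=1)  # indices sorted by descending similarity


def top_k_accuracy(ranking, true_labels, k):
    """Fraction of samples whose true class is among the k best-ranked classes."""
    hits = (ranking[:, :k] == true_labels[:, None]).any(axis=1)
    return float(hits.mean())


# Hypothetical class signatures (toy stand-ins for Word2Vec vectors).
class_names = ["water", "forest", "road"]
class_signatures = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Hypothetical signatures predicted from image features for four samples.
pred_signatures = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.5, 0.1, 0.4],
    [0.1, 0.6, 0.3],
])
true_labels = np.array([0, 1, 2, 0])

ranking = rank_classes(pred_signatures, class_signatures)
top1 = top_k_accuracy(ranking, true_labels, 1)
top2 = top_k_accuracy(ranking, true_labels, 2)
top3 = top_k_accuracy(ranking, true_labels, 3)
print(top1, top2, top3)
```

Because prediction only compares a predicted vector with a table of class signatures, a class that had no training images can in principle be recognised by adding its signature to that table, which is the property that makes the setting zero-shot.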

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/145820