Visual Recognition in RGB Images and Videos by Learning from RGB-D Data

Li, W; Chen, L; Xu, D; Van Gool, L

Publication Type: Journal Article
Citation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40 (8), pp. 2030 - 2036
Issue Date: 2018-08-01

Closed Access

	Filename	Description	Size	Format
	08000401.pdf	Published Version	275.73 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Li, W	en_US
dc.contributor.author	Chen, L	en_US
dc.contributor.author	Xu, D https://orcid.org/0000-0003-2775-9730	en_US
dc.contributor.author	Van Gool, L	en_US
dc.date.issued	2018-08-01	en_US
dc.identifier.citation	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40 (8), pp. 2030 - 2036	en_US
dc.identifier.issn	0162-8828	en_US
dc.identifier.uri	http://hdl.handle.net/10453/131316
dc.description.abstract	© 1979-2012 IEEE. In this work, we propose a framework for recognizing RGB images or videos by learning from RGB-D training data that contains additional depth information. We formulate this task as a new unsupervised domain adaptation (UDA) problem, in which we aim to take advantage of the additional depth features in the source domain and also cope with the data distribution mismatch between the source and target domains. To handle the domain distribution mismatch, we propose to learn an optimal projection matrix to map the samples from both domains into a common subspace such that the domain distribution mismatch can be reduced. Such a projection matrix can be effectively optimized by exploiting different strategies. Moreover, we also use different ways to utilize the additional depth features. To simultaneously cope with the above two issues, we formulate a unified learning framework called domain adaptation from multi-view to single-view (DAM2S). By defining various forms of regularizers in our DAM2S framework, different strategies can be readily incorporated to learn robust SVM classifiers for classifying the target samples, and three methods are developed under our DAM2S framework. We conduct comprehensive experiments for object recognition, cross-dataset and cross-view action recognition, which demonstrate the effectiveness of our proposed methods for recognizing RGB images and videos by learning from RGB-D data.	en_US
dc.relation.ispartof	IEEE Transactions on Pattern Analysis and Machine Intelligence	en_US
dc.relation.isbasedon	10.1109/TPAMI.2017.2734890	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Visual Recognition in RGB Images and Videos by Learning from RGB-D Data	en_US
dc.type	Journal Article
utslib.citation.volume	8	en_US
utslib.citation.volume	40	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	0806 Information Systems	en_US
utslib.for	0906 Electrical and Electronic Engineering	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access
pubs.issue	8	en_US
pubs.publication-status	Published	en_US
pubs.volume	40	en_US

Abstract:

© 1979-2012 IEEE. In this work, we propose a framework for recognizing RGB images or videos by learning from RGB-D training data that contains additional depth information. We formulate this task as a new unsupervised domain adaptation (UDA) problem, in which we aim to take advantage of the additional depth features in the source domain and also cope with the data distribution mismatch between the source and target domains. To handle the domain distribution mismatch, we propose to learn an optimal projection matrix to map the samples from both domains into a common subspace such that the domain distribution mismatch can be reduced. Such a projection matrix can be effectively optimized by exploiting different strategies. Moreover, we also use different ways to utilize the additional depth features. To simultaneously cope with the above two issues, we formulate a unified learning framework called domain adaptation from multi-view to single-view (DAM2S). By defining various forms of regularizers in our DAM2S framework, different strategies can be readily incorporated to learn robust SVM classifiers for classifying the target samples, and three methods are developed under our DAM2S framework. We conduct comprehensive experiments for object recognition, cross-dataset and cross-view action recognition, which demonstrate the effectiveness of our proposed methods for recognizing RGB images and videos by learning from RGB-D data.
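
Illustrative sketch (not from the paper): the abstract's idea of projecting source and target samples into a common subspace so that the distribution mismatch shrinks can be pictured with a toy example. The code below is not the DAM2S objective and does not use depth features or SVMs. It assumes, purely for illustration, that the mismatch is measured as the squared distance between the two domain means, and it uses the simplest projection that removes that gap: dropping the direction along which the means differ. All function names and the synthetic data are hypothetical.

```python
import random


def mean_vector(samples):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def mean_mismatch(source, target):
    """Squared Euclidean distance between the two domain means.

    Assumption: this stands in for a distribution-mismatch measure;
    the abstract does not say which measure the paper uses.
    """
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def remove_direction(samples, direction):
    """Project each sample onto the subspace orthogonal to `direction`."""
    norm_sq = sum(d * d for d in direction)
    if norm_sq == 0.0:
        # The means already coincide, so there is nothing to remove.
        return [list(x) for x in samples]
    projected = []
    for x in samples:
        coeff = sum(a * d for a, d in zip(x, direction)) / norm_sq
        projected.append([a - coeff * d for a, d in zip(x, direction)])
    return projected


def project_to_common_subspace(source, target):
    """Apply one shared projection to both domains (toy stand-in)."""
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    gap = [a - b for a, b in zip(mu_s, mu_t)]
    return remove_direction(source, gap), remove_direction(target, gap)


# Synthetic "RGB features": the target domain is shifted relative to the source.
rng = random.Random(0)
source_rgb = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]
target_rgb = [[rng.gauss(0.5, 1.0) for _ in range(4)] for _ in range(60)]

before = mean_mismatch(source_rgb, target_rgb)
src_p, tgt_p = project_to_common_subspace(source_rgb, target_rgb)
after = mean_mismatch(src_p, tgt_p)
print(f"mean mismatch before: {before:.4f}, after: {after:.2e}")
```

In this toy setting the shared projection makes the two domain means coincide up to floating-point error. A learned projection, as described in the abstract, would additionally have to keep the projected features useful for classification, which this sketch does not attempt.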

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/131316