Multi-Scale Frequency Reconstruction for Guided Depth Map Super-Resolution via Deep Residual Network

Zuo, Y; Wu, Q; Fang, Y; An, P; Huang, L; Chen, Z

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30, (2), pp. 297-306
Issue Date: 2020-02-01

Closed Access

Filename	Description	Size	Format
08598786.pdf	Published version	7.58 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Zuo, Y
dc.contributor.author	Wu, Q https://orcid.org/0000-0001-5641-2483
dc.contributor.author	Fang, Y
dc.contributor.author	An, P
dc.contributor.author	Huang, L
dc.contributor.author	Chen, Z
dc.date.accessioned	2020-04-30T06:40:25Z
dc.date.available	2020-04-30T06:40:25Z
dc.date.issued	2020-02-01
dc.identifier.citation	IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30, (2), pp. 297-306
dc.identifier.issn	1051-8215
dc.identifier.issn	1558-2205
dc.identifier.uri	http://hdl.handle.net/10453/140365
dc.description.abstract	Depth maps obtained by consumer-level sensors are always noisy in the low-resolution (LR) domain. Existing methods for guided depth super-resolution, which are based on pre-defined local and global models (e.g., the joint bilateral filter and the Markov random field), perform well in general cases. However, such model-based methods may fail to describe the potential relationship between RGB-D image pairs. To solve this problem, this paper proposes a data-driven approach based on a deep convolutional neural network with global and local residual learning. It progressively upsamples the LR depth map at multiple scales, guided by the high-resolution intensity image. Global residual learning is adopted to learn the difference between the ground truth and the coarsely upsampled depth map, and local residual learning is introduced in each scale-dependent reconstruction sub-network. This scheme can restore the depth structure from coarse to fine via multi-scale frequency synthesis. In addition, batch normalization layers are used to improve the performance of depth map denoising. The method is evaluated in noise-free and noisy cases, and a comprehensive comparison against 17 state-of-the-art methods is carried out. The experimental results show that the proposed method achieves faster convergence as well as improved performance in both the qualitative and quantitative evaluations.
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	IEEE Transactions on Circuits and Systems for Video Technology
dc.relation.isbasedon	10.1109/TCSVT.2018.2890271
dc.rights	info:eu-repo/semantics/closedAccess
dc.rights	© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	en_US
dc.subject	0801 Artificial Intelligence and Image Processing, 0906 Electrical and Electronic Engineering
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Multi-Scale Frequency Reconstruction for Guided Depth Map Super-Resolution via Deep Residual Network
dc.type	Journal Article
utslib.citation.volume	30
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0906 Electrical and Electronic Engineering
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - INEXT - Innovation in IT Services and Applications
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney
utslib.copyright.status	closed_access	*
pubs.consider-herdc	true
dc.date.updated	2020-04-30T06:40:17Z
pubs.issue	2
pubs.publication-status	Published
pubs.volume	30
utslib.start-page	297
utslib.citation.issue	2

Abstract:

Depth maps obtained by consumer-level sensors are always noisy in the low-resolution (LR) domain. Existing methods for guided depth super-resolution, which are based on pre-defined local and global models (e.g., the joint bilateral filter and the Markov random field), perform well in general cases. However, such model-based methods may fail to describe the potential relationship between RGB-D image pairs. To solve this problem, this paper proposes a data-driven approach based on a deep convolutional neural network with global and local residual learning. It progressively upsamples the LR depth map at multiple scales, guided by the high-resolution intensity image. Global residual learning is adopted to learn the difference between the ground truth and the coarsely upsampled depth map, and local residual learning is introduced in each scale-dependent reconstruction sub-network. This scheme can restore the depth structure from coarse to fine via multi-scale frequency synthesis. In addition, batch normalization layers are used to improve the performance of depth map denoising. The method is evaluated in noise-free and noisy cases, and a comprehensive comparison against 17 state-of-the-art methods is carried out. The experimental results show that the proposed method achieves faster convergence as well as improved performance in both the qualitative and quantitative evaluations.
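
The coarse-to-fine residual idea in the abstract can be illustrated with a small sketch. This is not the authors' network: the function names (upsample2x, progressive_sr, zero_residual), the nearest-neighbour upsampler, the fixed factor of 2 per scale, and the placeholder residual predictor are all assumptions made for illustration. At each scale the current depth map is coarsely upsampled, a predictor that sees the intensity image at that scale returns a residual, and the two are summed.

```python
from typing import Callable, List

Grid = List[List[float]]  # a depth map or intensity image as nested lists


def upsample2x(depth: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling (assumed stand-in for the coarse upsampler)."""
    out: Grid = []
    for row in depth:
        wide = [v for v in row for _ in range(2)]  # repeat every column twice
        out.append(wide)
        out.append(list(wide))                     # repeat every row twice
    return out


def add(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two equally sized grids."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def progressive_sr(
    lr_depth: Grid,
    guides: List[Grid],
    predict_residual: Callable[[Grid, Grid], Grid],
) -> Grid:
    """Refine the depth map scale by scale.

    guides[k] is the intensity image at the resolution reached after
    k + 1 doublings. predict_residual is a hypothetical placeholder for a
    learned sub-network: it maps (coarse depth, guide) to a residual.
    """
    depth = lr_depth
    for guide in guides:
        coarse = upsample2x(depth)                  # coarse estimate at this scale
        residual = predict_residual(coarse, guide)  # missing detail to add back
        depth = add(coarse, residual)               # coarse + residual
    return depth


def zero_residual(coarse: Grid, guide: Grid) -> Grid:
    """Trivial predictor: adds no detail, so the output is plain upsampling."""
    return [[0.0 for _ in row] for row in coarse]


lr = [[1.0, 2.0], [3.0, 4.0]]
guides = [
    [[0.0] * 4 for _ in range(4)],  # intensity image at 2x
    [[0.0] * 8 for _ in range(8)],  # intensity image at 4x
]
sr = progressive_sr(lr, guides, zero_residual)
print(len(sr), len(sr[0]))
```

With the zero predictor the result is simply the 4x nearest-neighbour upsampling of the input, which makes the role of the residual explicit: a trained predictor only has to supply the difference between that coarse estimate and the ground truth, not the full depth map.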

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/140365