Recovering dense 3D motion and shape information from RGB-D data

Wang, Yucheng

Recovering dense 3D motion and shape information from RGB-D data

Wang, Yucheng

Permalink

Publication Type:: Thesis
Issue Date:: 2017

Open Access

Copyright Clearance Process

Recently Added
In Progress
Open Access

This item is open access.

Adobe PDF

Download contents and abstractAdobe PDF (91.37 kB)

Adobe PDF

Download thesisAdobe PDF (4.76 MB)

View statistics

Full metadata record

Field	Value	Language
dc.contributor.author	Wang, Yucheng
dc.date.accessioned	2017-06-05T01:47:23Z
dc.date.available	2017-06-05T01:47:23Z
dc.date.issued	2017
dc.identifier.uri	http://hdl.handle.net/10453/102739
dc.description	University of Technology Sydney. Faculty of Engineering and Information Technology.	en_AU
dc.description.abstract	3D motion and 3D shape information are essential to many research fields, such as computer vision, computer graphics, and augmented reality. Thus, 3D motion estimation and 3D shape recovery are two important topics in these research communities. RGB-D cameras have become more accessible in recent few years. They are popular for good mobility, low cost, and high frame rate. However, these RGB-D cameras generate low-resolution and low-accuracy depth images due to chip size limitations and ambient illumination perturbation. Thus, obtaining high-resolution and high-accuracy 3D information based on RGB-D data is an important task. This research investigates 3D motion estimation and 3D shape recovery solutions for RGB-D cameras. Thus, within this thesis, various methods are developed and presented to address the following research challenges: fusing passive stereo vision and active depth acquisition; 3D motion estimation based on RGB-D data; depth super-resolution based on RGB-D video with large displacement 3D motion. In Chapter 3, a framework is presented to acquire depth images by fusing active depth acquisition and passive stereo vision. Active depth acquisition and passive stereo vision have their limitations in some aspects, but their range-sensing characteristics are complementary. Thus, combining both approaches can produce more accurate results than using either one only. Unlike previous fusion methods, the noisy depth observation from active depth acquisition is initially taken as a prior knowledge of the scene structure, which improves the accuracy of the fused depth images. Chapter 4 details a method for 3D scene ow estimation based on RGB-D data. The accuracy of scene ow estimation is limited by two issues: occlusions and large displacement motions. To handle occlusions, the occlusion status is modelled, and the scene ow and occluded regions are jointly estimated. To deal with large displacement motions, an over-parameterised scene ow representation is employed to model both the rotation and translation components of the scene ow. In Chapter 5, a depth super-resolution framework is presented for RGB-D video sequences with large 3D motion. To handle large 3D motion, our framework has two stages: motion compensation and fusion. A superpixel-based motion estimation approach is proposed for efficient motion compensation. The fusion task is modelled as a regression problem, and a specific deep convolutional neural network (CNN) is designed that can learns the mapping function between depth image observations and the fused depth image given a large amount of training data.	en_AU
dc.format	Thesis (PhD)
dc.language.iso	en_AU	en_AU
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/102739/2/02whole.pdf
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.rights	au.edu.uts.lib/ppc
dc.subject	Computer vision.	en
dc.subject	Computer graphics.	en
dc.subject	Augmented reality.	en
dc.subject	3D motion and 3D shape information.	en
dc.subject	Convolutional neural network (CNN)	en
dc.subject	RGB-D cameras.	en
dc.subject	High-resolution and high-accuracy 3D information.	en
dc.subject	Occlusions.	en
dc.subject	Superpixel-based motion estimation approach.	en
dc.title	Recovering dense 3D motion and shape information from RGB-D data	en_AU
dc.type	Thesis	en_AU
utslib.copyright.status	open_access

Abstract:

3D motion and 3D shape information are essential to many research fields, such as computer vision, computer graphics, and augmented reality. Thus, 3D motion estimation and 3D shape recovery are two important topics in these research communities. RGB-D cameras have become more accessible in recent few years. They are popular for good mobility, low cost, and high frame rate. However, these RGB-D cameras generate low-resolution and low-accuracy depth images due to chip size limitations and ambient illumination perturbation. Thus, obtaining high-resolution and high-accuracy 3D information based on RGB-D data is an important task. This research investigates 3D motion estimation and 3D shape recovery solutions for RGB-D cameras. Thus, within this thesis, various methods are developed and presented to address the following research challenges: fusing passive stereo vision and active depth acquisition; 3D motion estimation based on RGB-D data; depth super-resolution based on RGB-D video with large displacement 3D motion. In Chapter 3, a framework is presented to acquire depth images by fusing active depth acquisition and passive stereo vision. Active depth acquisition and passive stereo vision have their limitations in some aspects, but their range-sensing characteristics are complementary. Thus, combining both approaches can produce more accurate results than using either one only. Unlike previous fusion methods, the noisy depth observation from active depth acquisition is initially taken as a prior knowledge of the scene structure, which improves the accuracy of the fused depth images. Chapter 4 details a method for 3D scene ow estimation based on RGB-D data. The accuracy of scene ow estimation is limited by two issues: occlusions and large displacement motions. To handle occlusions, the occlusion status is modelled, and the scene ow and occluded regions are jointly estimated. To deal with large displacement motions, an over-parameterised scene ow representation is employed to model both the rotation and translation components of the scene ow. In Chapter 5, a depth super-resolution framework is presented for RGB-D video sequences with large 3D motion. To handle large 3D motion, our framework has two stages: motion compensation and fusion. A superpixel-based motion estimation approach is proposed for efficient motion compensation. The fusion task is modelled as a regression problem, and a specific deep convolutional neural network (CNN) is designed that can learns the mapping function between depth image observations and the fused depth image given a large amount of training data.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/102739