Depth estimation for light field images and light field image synthesis

Publication Type:
Thesis
Issue Date:
2024
Light field imaging technology captures both the pixel intensity and the direction of the incident light in a single exposure. This additional dimensionality allows generating images at different focal lengths, extending the depth of field, and estimating scene depth from a single capture. Depth estimation for light field images is fundamental to many light field applications, such as light field image compression, reconstructing views from a sparse set of perspective views, and 3D reconstruction.

This thesis presents an algorithm that improves depth map estimation for light field images using depth from defocus. Previous depth map estimation approaches for light field images fail to capture sharp transitions around object boundaries due to occlusions, making many current approaches unreliable at depth discontinuities. This is especially problematic for light field images because pixels do not exhibit photo-consistency in the presence of occlusions. In this work, small pixel patches are compared across the focal stack images using defocus cues, allowing the algorithm to generate sharper depth boundaries. A frequency-domain image-similarity analysis then generates the depth map. Processing in the frequency domain reduces the per-pixel errors that arise when directly comparing RGB images, making the algorithm more resilient to noise.

The depth estimation techniques explored for light field images form the building blocks for solving the inverse problem of light field image reconstruction and synthesis. The ability to convert 2D RGB images to 4D light field images will change how we perceive traditional photography. This thesis presents a light field synthesis algorithm that uses the focal stack images and the all-in-focus image to synthesize a 9 × 9 sub-aperture view light field image.
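The patch-based, frequency-domain depth-from-defocus idea described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes a grayscale focal stack and all-in-focus image, and assigns each patch the index of the focal-stack slice whose Fourier magnitude spectrum best matches the all-in-focus patch. The function name and parameters are hypothetical.

```python
import numpy as np

def depth_from_defocus(focal_stack, all_in_focus, patch=5):
    """Estimate a coarse depth map from a focal stack.

    focal_stack: (S, H, W) array of slices focused at S depth levels.
    all_in_focus: (H, W) reference image that is sharp everywhere.
    Returns an (H // patch, W // patch) map of slice indices: each
    patch gets the index of the slice whose frequency content best
    matches the all-in-focus patch (i.e. where it is in focus).
    """
    S, H, W = focal_stack.shape
    depth = np.zeros((H // patch, W // patch), dtype=int)
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            # Compare magnitude spectra rather than raw RGB values;
            # this is less sensitive to individual noisy pixels.
            ref = np.abs(np.fft.fft2(all_in_focus[i:i + patch, j:j + patch]))
            errs = [np.sum((np.abs(np.fft.fft2(s[i:i + patch, j:j + patch])) - ref) ** 2)
                    for s in focal_stack]
            depth[i // patch, j // patch] = int(np.argmin(errs))
    return depth
```

The small patch size is what preserves sharp depth boundaries: a patch straddling an occlusion edge contaminates fewer depth estimates than a large window would.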
Using the depth map estimated with the presented depth-from-defocus approach, the sub-aperture views and their corresponding depth maps are synthesized from the all-in-focus image by mimicking the apparent shift of the central view according to the depth values. Occluded regions in the synthesized sub-aperture views are filled with information recovered from the focal stack images. Experimental results demonstrate that the presented algorithm outperforms state-of-the-art depth estimation techniques for light field images, particularly on noisy images. Results also show that if the depth levels in the image are known, a high-accuracy light field image can be synthesized from just five focal stack images.
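The depth-dependent shifting of the central view can be illustrated with a short sketch. This is a hypothetical simplification of the synthesis step, not the thesis code: it assumes a per-pixel depth map acting as disparity, warps the central view toward one sub-aperture position (du, dv), and marks vacated pixels as holes to be filled later from the focal stack. All names and the `scale` parameter are assumptions for illustration.

```python
import numpy as np

def synthesize_view(center, depth, du, dv, scale=1.0):
    """Warp the central view to the sub-aperture position (du, dv).

    Each pixel is shifted proportionally to its depth value, mimicking
    the apparent parallax between sub-aperture views. Pixels that no
    source pixel maps onto (disocclusions) are marked with -1 so they
    can be filled from the focal stack in a later step.
    """
    H, W = center.shape
    view = np.full((H, W), -1.0)  # -1 flags occlusion holes
    for y in range(H):
        for x in range(W):
            d = depth[y, x] * scale  # disparity for this pixel
            ty = int(round(y + dv * d))
            tx = int(round(x + du * d))
            if 0 <= ty < H and 0 <= tx < W:
                view[ty, tx] = center[y, x]
    return view
```

Running this for all 81 (du, dv) offsets of a 9 × 9 grid yields the full set of sub-aperture views; the hole-filling pass then recovers the occluded regions from the focal stack.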