A new context-based method for restoring occluded text in natural scene images

Publication Type:
Conference Proceeding
Document Analysis Systems, 2020, 12116 LNCS, pp. 466-480
Text recognition in natural scene images is an active research area because of its important real-world applications, including multimedia search and retrieval and scene understanding through computer vision. Portions of text in such images are often missed because of occlusion by objects in the background. This paper therefore presents a method for restoring occluded text to improve text recognition performance. The proposed method uses the Google Vision API to obtain labels for input images, and PixelLink-E2E methods to detect text and obtain recognition results. Using these results, the method generates candidate words based on distance measures, employing lexicons created through natural scene text recognition. The semantic similarity between the labels and the recognition results is then extracted, which yields a Global Context Score (GCS). Next, the Natural Language Processing (NLP) system known as BERT is used to extract the semantics between candidate words, which yields a Local Context Score (LCS). The global and local context scores are fused to estimate a ranking for each candidate word, and the highest-ranked word is taken as the correction for the text that is occluded in the image. Experimental results on a dataset assembled from standard natural scene datasets and our resources show that our approach helps to improve the text recognition performance significantly.
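The candidate generation and ranking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the distance measure or the fusion rule, so Levenshtein edit distance and a weighted sum with weight `alpha` are assumptions made here for illustration. The function names, the toy lexicon, and the GCS and LCS values are hypothetical. In the actual method the scores would come from the label similarity and BERT stages.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one table row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def generate_candidates(partial: str, lexicon: List[str],
                        max_dist: int = 2) -> List[str]:
    """Lexicon words within max_dist edits of the partial recognition result.

    The threshold max_dist is an assumed parameter, not taken from the paper.
    """
    return [w for w in lexicon if edit_distance(partial, w) <= max_dist]


def rank_candidates(gcs: Dict[str, float], lcs: Dict[str, float],
                    alpha: float = 0.5) -> List[str]:
    """Fuse the two scores with a weighted sum (an assumed fusion rule)
    and return the candidates ordered from best to worst."""
    fused = {w: alpha * gcs[w] + (1.0 - alpha) * lcs[w] for w in gcs}
    return sorted(fused, key=fused.get, reverse=True)


# Toy usage: "coffe" stands in for a partially occluded word.
lexicon = ["coffee", "coffin", "toffee", "office", "table"]
candidates = generate_candidates("coffe", lexicon)

# Hypothetical context scores for each candidate.
gcs = {"coffee": 0.9, "coffin": 0.1, "toffee": 0.4}
lcs = {"coffee": 0.8, "coffin": 0.2, "toffee": 0.3}
ranking = rank_candidates(gcs, lcs)
print(candidates, ranking[0])
```

With these toy values the top-ranked word is "coffee". In practice the weight `alpha` would need to be tuned on held-out data, since the relative reliability of the global and local context is not stated in the abstract.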