A study on automated handwriting understanding

Handwriting is a sequence of graphical symbols drawn by hand with a pen or other writing instrument to represent linguistic constructs for communication and knowledge storage. These graphical marks have a deep orthographic relation to the phonology of a spoken language. To a machine, however, handwriting is nothing but a pattern, so a computer reads a manuscript by recognizing that pattern. Automatic recognition of character patterns from an optically scanned document image is called OCR (Optical Character Recognition). Today, computer vision is no longer limited to recognizing patterns and objects; it tries to endow the machine with human-like intelligence. The main goal of this research is to use computer vision to bridge the gap between pattern recognition and human perception of handwriting. In this thesis, we focus on understanding handwriting, which goes beyond recognizing characters by OCR. To this end, we examine the implicit information in handwriting to understand some of its inherent characteristics. We concentrate on three aspects: first, understanding how handwritten information is generated by the writer; second, understanding writing strokes with regard to the quality of handwriting; third, understanding the handwritten word entities that reveal the content. The thesis accordingly contains three parts. Despite past research on writer inspection, it is hard to find an empirical study of intra-variable handwriting, although such variation should be an important concern. The first part of this thesis addresses this concern. It also inspects the writer with respect to some unconventional aspects, e.g., writing variability over struck-out texts, multiple scripts, etc.
The second part makes a pioneering contribution to understanding writing stroke information in multiple facets, such as the legibility, aesthetics, difficulty, and idiosyncrasy of strokes. The third part of the thesis attempts to comprehend the content of a handwritten document using computer vision, without the aid of a transcription engine or natural language processing, which, to our knowledge, is the earliest attempt of its kind. This research adapts traditional machine learning approaches as well as state-of-the-art deep learning approaches, and proposes new techniques to automate the process of handwriting understanding. The experiments produced encouraging results, which support the applicability of the proposed research. This study has an impact on the general image processing, pattern recognition, machine learning, and deep learning domains, especially on document image processing and handwriting processing. Moreover, this research contributes to forensics for questioned document examination, biometrics for behavioral analysis through handwriting, library and archival science for e-archiving of manuscripts, and data science. We believe this study has pushed the frontiers of handwriting-related research.