Computer Vision News - January 2023

Congrats, Doctor Mohamed!

Optical and Handwritten Text Recognition (OCR/HTR) systems now achieve good results with recent deep learning tools. However, these systems can still fail in some scenarios, for instance when the image quality is degraded or when the script is low-resource (that is, it lacks labeled training data). To address this, the thesis proposes novel deep learning models and training strategies. These contributions fall into two main lines of research: document image enhancement and HTR in low-resource scenarios.

For document image enhancement, the problem was treated as an image-to-image translation task, and several deep learning tools were employed as solutions. First, a conditional Generative Adversarial Network (cGAN), composed of a generator and a discriminator, was used to generate clean images by conditioning on the degraded ones. During training, the generator produces an enhanced image and passes it to the discriminator, which decides whether it is real (originally clean) or fake (cleaned by the generator). This training used an adversarial loss in a min-max game. Second, an evolved approach added another component during training that reads the text of the enhanced images and, through a CTC loss, forces the generator to produce more readable text while enhancing the image. This improved the results, producing images that are as clean and readable as possible. Finally, a model based on vision transformers instead of convolutional layers was proposed, which further improved the results. An example of degraded images enhanced by our models is presented in Fig. 1.

Mohamed Ali Souibgui has recently completed his PhD at the Universitat Autònoma de Barcelona. He has been working as a researcher at the Computer Vision Center of Barcelona.
His research mainly focuses on the enhancement and recognition of historical document images, especially low-resource manuscripts, i.e., scripts with few available datasets, such as ciphered manuscripts. The thesis was carried out under the European project DECRYPT.
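
The adversarial training and the CTC-guided variant described in the enhancement paragraph can be written compactly. What follows is a sketch using the standard conditional GAN objective, not necessarily the exact formulation used in the thesis. The notation is assumed here for illustration: G is the generator, D the discriminator, x a degraded image, y its clean counterpart, t the ground-truth text, R the text-reading component, and λ a weighting factor.

```latex
% Standard conditional GAN objective (assumed form; the thesis may include further terms)
\min_G \max_D \; \mathcal{L}_{\mathrm{adv}}(G, D) =
  \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]

% Hypothetical generator objective with the added readability term
\mathcal{L}_G = \mathcal{L}_{\mathrm{adv}}(G, D)
  + \lambda \, \mathcal{L}_{\mathrm{CTC}}\big(R(G(x)),\, t\big)
```

In this reading, the discriminator maximizes its ability to tell originally clean images from generated ones while the generator minimizes the same quantity, and the CTC term penalizes enhanced images whose text the reader cannot recognize.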
