Computer Vision News - November 2019

We Tried for You

Deep Learning OCR Using Tesseract with Python

by Amnon Geifman

When training a network on the MNIST dataset, we usually obtain high accuracy because the images are aligned and properly prepared. However, moving from an artificial dataset to a robust, real-world application can be quite challenging. Luckily, there are now a few open source libraries that perform OCR (Optical Character Recognition) very well. Today we will go through one such library, called Tesseract. We will start by setting up the environment for such a project and then see exactly how to manipulate images to turn them into Python string objects.

Tesseract was developed as proprietary software by Hewlett-Packard Labs. Later, Google took over development and many open source contributors joined. In the latest version, character recognition is based on deep learning techniques, specifically on LSTM (long short-term memory), which is a widely used RNN model. The official version of Tesseract OCR allows developers to build their own applications using a C or C++ API. Over time, many versions and wrappers have been developed. Today we will use the Tesser-Ocr wrapper in Python.

Setting the Environment

One of the main challenges in using Tesseract with Python is setting up the environment. I recommend using a virtual environment for this project, and if you do so, I highly recommend doing it with Conda. The installation is done in two steps: first, set up the required packages using pip, then install the Python wrapper that exposes the OCR package. We start with pip. I assume you have pip on your system path (if you don't, put it there!), so just pip install the following on the command line: