Scene Text Recognition
Introduction
- Definition: Text, a fundamental means of communicating information, appears throughout natural scenes: street signs, product labels, license plates, and so on. Automatically reading text in natural scene images is an important machine learning task that is gaining attention because of its many applications. For example, access to text in images can help visually impaired people understand their surroundings. To enable autonomous driving, a system must accurately detect and recognize every road sign. Indexing text in images would enable search and retrieval across billions of consumer photos on the internet.
- Applications: Indexing multimedia archives, recognizing signs in driver-assistance systems, providing scene information to visually impaired people, and identifying vehicles by reading their license plates.
- Scope: No scope decided yet.
- Tools: OpenCV, Tesseract, PaddleOCR
Models
Semantic Reasoning Networks
Towards Accurate Scene Text Recognition with Semantic Reasoning Networks. arXiv, 2020.
Differentiable Binarization
Real-time Scene Text Detection with Differentiable Binarization. arXiv, 2019.
CRAFT
Character Region Awareness for Text Detection. arXiv, 2019.
EAST
EAST: An Efficient and Accurate Scene Text Detector. arXiv, 2017.
Process flow
Step 1: Collect Images
Fetch images from a database, scrape them from the internet, or use public datasets. Set up the database connection and load the data into the Python environment.
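The database route can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration only: the in-memory database stands in for the real one, and the `images` table, its columns, and the helper `fetch_image_records` are hypothetical names, not part of any existing project schema.

```python
import sqlite3


def fetch_image_records(conn, limit=100):
    """Return (image_id, file_path) rows from a hypothetical `images` table."""
    cursor = conn.execute(
        "SELECT image_id, file_path FROM images ORDER BY image_id LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


# In-memory database used only for illustration; a real project would
# connect to its own database here instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (image_id INTEGER PRIMARY KEY, file_path TEXT)")
conn.executemany(
    "INSERT INTO images (image_id, file_path) VALUES (?, ?)",
    [(1, "signs/stop_01.jpg"), (2, "plates/car_17.png")],
)
conn.commit()

records = fetch_image_records(conn)
```

Passing the limit as a query parameter rather than formatting it into the SQL string keeps the query safe if the value ever comes from user input.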
Step 2: Data Preparation
Explore and validate the data, then create a preprocessing strategy. Clean the data and make it ready for processing.
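One simple validation pass is to confirm that each collected file really is an image before it reaches the model. The sketch below checks only the PNG and JPEG file signatures using the standard library. The helper names are hypothetical, and a real pipeline would likely also decode each image and check its dimensions.

```python
import os
import tempfile

# File signatures (magic bytes) for two common image formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}


def sniff_image_format(path):
    """Return 'png' or 'jpeg' if the file header matches, else None."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    for name, signature in SIGNATURES.items():
        if header.startswith(signature):
            return name
    return None


def split_valid_invalid(paths):
    """Partition paths into files with a known image header and the rest."""
    valid, invalid = [], []
    for path in paths:
        is_image = os.path.isfile(path) and sniff_image_format(path) is not None
        (valid if is_image else invalid).append(path)
    return valid, invalid


# Demo with throwaway files: one PNG-like header, one plain-text file.
workdir = tempfile.mkdtemp()
good_path = os.path.join(workdir, "sign.png")
bad_path = os.path.join(workdir, "notes.txt")
with open(good_path, "wb") as handle:
    handle.write(SIGNATURES["png"] + b"\x00" * 16)
with open(bad_path, "wb") as handle:
    handle.write(b"not an image")

valid_paths, invalid_paths = split_valid_invalid([good_path, bad_path])
```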
Step 3: Model Building
Apply different kinds of detection, recognition, and single-shot models to the images. Track progress and experiments. Validate the final set of models and select or assemble the final model.
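Selecting among recognition models needs a shared metric. One common choice, assumed here rather than prescribed by this page, is the character error rate (CER): total edit distance divided by total reference length. The model names and strings below are toy values for illustration, not experimental results.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(references, hypotheses):
    """Total edit distance divided by total reference length."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars if total_chars else 0.0


# Toy ground truth and predictions from two hypothetical models.
ground_truth = ["STOP", "EXIT 12"]
predictions = {
    "model_a": ["STOP", "EXIT 12"],
    "model_b": ["ST0P", "EXIT 1Z"],
}
scores = {
    name: character_error_rate(ground_truth, preds)
    for name, preds in predictions.items()
}
best_model = min(scores, key=scores.get)  # lowest CER wins
```

Logging the `scores` dictionary per experiment gives a simple record to compare runs against.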
Step 4: User Acceptance Testing (UAT)
Wrap the model inference engine in an API for client testing.
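A minimal sketch of such a wrapper, written as a plain WSGI application so it needs only the standard library. The `/predict` route and the `recognize_text` function are hypothetical. The latter is a stand-in for whichever model was selected in Step 3, and a real service would likely use a web framework and add authentication and input checks.

```python
import io
import json


def recognize_text(image_bytes):
    """Hypothetical stand-in for the real model inference call."""
    return {"text": "", "num_bytes": len(image_bytes)}


def app(environ, start_response):
    """WSGI app: POST raw image bytes to /predict, receive JSON back."""
    is_predict = (
        environ.get("REQUEST_METHOD") == "POST"
        and environ.get("PATH_INFO") == "/predict"
    )
    if not is_predict:
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [json.dumps({"error": "use POST /predict"}).encode("utf-8")]

    length = int(environ.get("CONTENT_LENGTH") or 0)
    image_bytes = environ["wsgi.input"].read(length)
    body = json.dumps(recognize_text(image_bytes)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# In-process smoke test: call the app directly with a fake request.
captured = {}
environ = {
    "REQUEST_METHOD": "POST",
    "PATH_INFO": "/predict",
    "CONTENT_LENGTH": "3",
    "wsgi.input": io.BytesIO(b"abc"),
}
response = app(environ, lambda status, headers: captured.update(status=status))
```

For a quick local trial, the app can be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`.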
Step 5: Deployment
Deploy the model to the cloud or to edge devices, as required.
Step 6: Documentation
Prepare the documentation and transfer all assets to the client.
Use Cases
Scene Text Detection with EAST and Tesseract
Detect text in images and videos using the EAST model, then read the characters using Tesseract. Check out this notion.
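Between the two stages, the detector's raw output has to be turned into boxes that can be cropped and passed to the recognizer. The sketch below assumes EAST's rotated-box output layout: a score map plus five geometry channels (distances to the top, right, bottom, and left edges, then an angle) at one quarter of the input resolution. It uses a common axis-aligned approximation and is not necessarily the decoding used in the linked notebook. The function name `decode_east` is hypothetical, and plain nested lists stand in for the network's output arrays.

```python
import math


def decode_east(scores, geometry, min_confidence=0.5, stride=4.0):
    """Turn EAST score/geometry maps into axis-aligned candidate boxes.

    scores:   2D list [row][col] of text probabilities.
    geometry: five 2D lists: distances to the top, right, bottom and left
              box edges, then the rotation angle in radians.
    Returns a list of (start_x, start_y, end_x, end_y, score) tuples.
    """
    top, right, bottom, left, angles = geometry
    boxes = []
    for y, row in enumerate(scores):
        for x, score in enumerate(row):
            if score < min_confidence:
                continue
            # Each output cell maps back to a stride x stride input patch.
            offset_x, offset_y = x * stride, y * stride
            cos_a = math.cos(angles[y][x])
            sin_a = math.sin(angles[y][x])
            height = top[y][x] + bottom[y][x]
            width = right[y][x] + left[y][x]
            end_x = int(offset_x + cos_a * right[y][x] + sin_a * bottom[y][x])
            end_y = int(offset_y - sin_a * right[y][x] + cos_a * bottom[y][x])
            boxes.append(
                (int(end_x - width), int(end_y - height), end_x, end_y, score)
            )
    return boxes
```

Neighbouring cells usually produce overlapping boxes for the same word, so a non-maximum suppression step would normally follow before each remaining box is clipped to the image bounds, cropped, and handed to Tesseract.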
Scene Text Recognition with DeepText
Detect and recognize text in images with an end-to-end model named DeepText. Check out this notion.
Automatic License Plate Recognition
Read the characters on the license plate image using Tesseract OCR. Check out this notion.
Keras OCR Toolkit Experiment
Keras OCR is a deep learning based toolkit for text recognition in images. Check out this notion.
OCR Experiments
Experiments with three OCR tools: Tesseract OCR, EasyOCR, and Arabic OCR. Check out this and this notion.
PaddleOCR Experiments
Experiments with state-of-the-art, lightweight, multilingual OCR. Check out this notion.