Revise/improve Text & Digit Recognition Samples

Student: Zihao Mu

Mentor: Vladimir Tyan

Link to accomplished work:

Merged PR: opencv/pull/17675
Multiple text recognition models for OpenCV DNN: Shared model link
Train your own text recognition model for OpenCV: deep-text-recognition-benchmark
Detailed Tutorial: OpenCV OCR Tutorial

Introduction

Hi, I'm Zihao Mu! I was a student developer for OpenCV in GSoC 2020. The goal of this project was to improve the text and digit recognition samples in OpenCV. In this deep learning era, we can implement more efficient text recognition methods. The project consists of two main parts:

First Part: Digit Recognition through live camera
Digit Detector: Connected Component Analysis
Digit Recognizer: LeNet-5 pre-trained on the MNIST dataset
Second Part: Text Recognition through live camera
Text Detector: EAST
Text Recognizer: Multiple text recognition models based on deep learning

My Journey

First Period

Implemented opencv/sample/cpp/digits_lenet.cpp based on Connected Component Analysis and LeNet-5. Found stable preprocessing methods and added rotation prediction for digit ROIs (regions of interest).
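To illustrate the detection step, here is a minimal sketch of Connected Component Analysis on a binarized image. It is not the code from the sample: it is written in plain Python for readability, and the function name `find_components` and the `min_area` filter are assumptions made for this example. In practice OpenCV provides this through `cv2.connectedComponentsWithStats`. Each bounding box found this way would then be cropped, resized, and passed to the digit recognizer.

```python
from collections import deque


def find_components(binary, min_area=1):
    """Return bounding boxes (x, y, w, h) of 4-connected foreground regions.

    `binary` is a list of rows holding 0 (background) or 1 (foreground).
    `min_area` is a hypothetical noise filter, not a parameter of the sample.
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            min_r = max_r = r
            min_c = max_c = c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                min_r, max_r = min(min_r, y), max(max_r, y)
                min_c, max_c = min(min_c, x), max(max_c, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Drop regions that are too small to be a digit.
            if area >= min_area:
                boxes.append((min_c, min_r, max_c - min_c + 1, max_r - min_r + 1))
    # Sort by x so that digits are visited left to right.
    return sorted(boxes)


image = [
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0],
]
boxes = find_components(image, min_area=2)
print(boxes)
```

The single isolated pixel in the bottom-left corner is discarded by the area filter, which is one simple way to make the detection less sensitive to noise.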

Second Period

Extended opencv/sample/dnn/text_detection.cpp and opencv/sample/dnn/text_detection.py so that they not only detect text but also recognize it. Based on this GitHub project, multiple text recognition models were trained, and all of them can be loaded and run correctly by the OpenCV DNN module.
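The recognition models listed in the benchmark all end in a CTC layer, so the network output is a sequence of per-time-step class scores that still has to be decoded into a string. Below is a minimal sketch of greedy CTC decoding in plain Python. It assumes that the blank symbol is class 0 and that the alphabet is digits followed by lowercase letters; both are assumptions made for this example and may differ from a given model.

```python
# Hypothetical alphabet for this example; class 0 is assumed to be the CTC blank,
# so class k (k >= 1) maps to ALPHABET[k - 1].
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
BLANK = 0


def ctc_greedy_decode(scores, alphabet=ALPHABET):
    """Decode a list of per-time-step score lists with greedy CTC.

    At each step take the best class, collapse consecutive repeats,
    and drop blanks.
    """
    decoded = []
    previous = BLANK
    for step in scores:
        best = max(range(len(step)), key=step.__getitem__)
        if best != BLANK and best != previous:
            decoded.append(alphabet[best - 1])
        previous = best
    return "".join(decoded)


def one_hot(index, size):
    """Build a fake score vector where `index` is the winning class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Fake network output for the word "see": None stands for the blank symbol.
path = ["s", "s", None, "e", None, "e", "e"]
num_classes = len(ALPHABET) + 1
scores = [
    one_hot(BLANK if ch is None else ALPHABET.index(ch) + 1, num_classes)
    for ch in path
]
text = ctc_greedy_decode(scores)
print(text)
```

Note that the blank between the two "e" groups is what allows a doubled letter to survive the collapse of repeated classes.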

Third Period

Provided a detailed tutorial covering how to train your own text recognition model and how to convert the model so that it can be called by OpenCV DNN.

Benchmarks for text recognition models

The accuracy of each model on different text recognition datasets is shown in the table below:

Model name	IIIT5k (%)	SVT (%)	ICDAR03 (%)	ICDAR13 (%)	ICDAR15 (%)	SVTP (%)	CUTE80 (%)	Average accuracy (%)	Parameters (x10^6)
DenseNet-CTC	72.267	67.39	82.81	80	48.38	49.45	42.50	63.26	0.24
DenseNet-BiLSTM-CTC	73.76	72.33	86.15	83.15	50.67	57.984	49.826	67.69	3.63
VGG-CTC	75.96	75.42	85.92	83.54	54.89	57.52	50.17	69.06	5.57
CRNN_VGG-BiLSTM-CTC	82.63	82.07	92.96	88.867	66.28	71.01	62.37	78.03	8.45
ResNet-CTC	84.00	84.08	92.39	88.96	67.74	74.73	67.60	79.93	44.28
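The average column appears to be the unweighted mean of the seven per-dataset accuracies. The source does not state this, so it is an inference from the numbers. The short check below recomputes the mean from the table values and compares it with the reported average.

```python
# Per-dataset accuracies copied from the table, in column order:
# IIIT5k, SVT, ICDAR03, ICDAR13, ICDAR15, SVTP, CUTE80.
results = {
    "DenseNet-CTC": [72.267, 67.39, 82.81, 80, 48.38, 49.45, 42.50],
    "DenseNet-BiLSTM-CTC": [73.76, 72.33, 86.15, 83.15, 50.67, 57.984, 49.826],
    "VGG-CTC": [75.96, 75.42, 85.92, 83.54, 54.89, 57.52, 50.17],
    "CRNN_VGG-BiLSTM-CTC": [82.63, 82.07, 92.96, 88.867, 66.28, 71.01, 62.37],
    "ResNet-CTC": [84.00, 84.08, 92.39, 88.96, 67.74, 74.73, 67.60],
}

# Average accuracy as reported in the table.
reported = {
    "DenseNet-CTC": 63.26,
    "DenseNet-BiLSTM-CTC": 67.69,
    "VGG-CTC": 69.06,
    "CRNN_VGG-BiLSTM-CTC": 78.03,
    "ResNet-CTC": 79.93,
}

# Unweighted mean over the seven datasets for each model.
averages = {name: sum(values) / len(values) for name, values in results.items()}

for name, mean in averages.items():
    print(f"{name}: computed {mean:.3f}, reported {reported[name]}")
```

Because the mean is unweighted, small datasets such as CUTE80 influence the average as much as large ones such as IIIT5k.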