Fraunhofer-Gesellschaft
2019
Conference Paper
Title

Efficient, lexicon-free OCR using deep learning

Abstract
Contrary to popular belief, Optical Character Recognition (OCR) remains a challenging problem when text occurs in unconstrained environments, like natural scenes, due to geometrical distortions, complex backgrounds, and diverse fonts. In this paper, we present a segmentation-free OCR system that combines deep learning methods, synthetic training data generation, and data augmentation techniques. We render synthetic training data using large text corpora and over 2000 fonts. To simulate text occurring in complex natural scenes, we augment extracted samples with geometric distortions and with a proposed data augmentation technique - alpha-compositing with background textures. Our models employ a convolutional neural network encoder to extract features from text images. Inspired by the recent progress in neural machine translation and language modeling, we examine the capabilities of both recurrent and convolutional neural networks in modeling the interactions between input elements. The proposed OCR system surpasses the accuracy of leading commercial and open-source engines on distorted text samples.
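The alpha-compositing augmentation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `alpha_composite`, the grayscale float representation in [0, 1], and the per-pixel alpha mask are all assumptions made for illustration, and the blend is the standard "over" formula.

```python
import numpy as np


def alpha_composite(text_img, alpha, background):
    """Blend a rendered text image over a background texture.

    Hypothetical helper for illustration only (not taken from the paper).
    All inputs are arrays of the same shape with values in [0, 1];
    alpha = 1 keeps the text pixel, alpha = 0 keeps the background pixel.
    """
    text_img = np.asarray(text_img, dtype=np.float64)
    alpha = np.asarray(alpha, dtype=np.float64)
    background = np.asarray(background, dtype=np.float64)
    if not (text_img.shape == alpha.shape == background.shape):
        raise ValueError("text_img, alpha and background must share one shape")
    # Standard "over" operator: out = alpha * foreground + (1 - alpha) * background
    return alpha * text_img + (1.0 - alpha) * background


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    height, width = 32, 128
    text = np.zeros((height, width))           # dark "ink" everywhere (assumed)
    mask = np.zeros((height, width))
    mask[8:24, 16:112] = 0.8                   # semi-opaque text region (assumed)
    texture = rng.random((height, width))      # stand-in for a background texture
    sample = alpha_composite(text, mask, texture)
    print(sample.shape, float(sample.min()), float(sample.max()))
```

In a training pipeline, a blend like this would presumably be applied to each rendered sample with a randomly chosen texture, alongside the geometric distortions the abstract describes.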
Author(s)
Namysl, Marcin  
Konya, Iuliu
Mainwork
15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019. Proceedings  
Conference
International Conference on Document Analysis and Recognition (ICDAR) 2019  
Open Access
DOI
10.1109/ICDAR.2019.00055
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  