Fraunhofer-Gesellschaft
2011
Conference Paper
Title

Character enhancement for historical newspapers printed using hot metal typesetting

Abstract
We propose a new method for effectively removing printing artifacts in historical newspapers that are caused by problems in hot metal typesetting, a printing technique widely used in the late 19th and early 20th centuries. Such artifacts typically appear as thin lines between single characters or glyphs and are in most cases connected to one of the neighboring characters. The quality of optical character recognition (OCR) is heavily influenced by this type of printing artifact. The proposed method is based on the detection of (near) vertical segments by means of directional single-connected chains (DSCC). To allow the robust processing of complex decorative fonts such as Fraktur, a set of rules is introduced. This allows us to successfully process prints exhibiting artifacts with a stroke width even higher than that of most thin character stems. We evaluate our approach on a dataset consisting of old newspaper excerpts printed using Fraktur fonts. The recognition results on the enhanced images using two independent OCR engines (ABBYY FineReader and Tesseract) show significant improvements over the originals.
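The idea of finding (near) vertical segments through run-length chains can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binarized image given as nested lists (1 = ink), uses 8-connectivity between runs in adjacent rows, and ends a chain wherever runs branch or merge. The function names and the `min_height` and `max_width` thresholds are hypothetical, and the Fraktur-specific rules mentioned in the abstract are not reproduced.

```python
from typing import Dict, List, Tuple

Run = Tuple[int, int, int]  # (row, first_col, last_col), last_col inclusive


def horizontal_runs(image: List[List[int]]) -> List[List[Run]]:
    """Return, for each row, the maximal horizontal runs of ink pixels."""
    rows = []
    for y, line in enumerate(image):
        runs, start = [], None
        for x, px in enumerate(line):
            if px and start is None:
                start = x
            elif not px and start is not None:
                runs.append((y, start, x - 1))
                start = None
        if start is not None:
            runs.append((y, start, len(line) - 1))
        rows.append(runs)
    return rows


def touches(a: Run, b: Run) -> bool:
    """True if two runs in adjacent rows are 8-connected (an assumption)."""
    return a[1] <= b[2] + 1 and b[1] <= a[2] + 1


def vertical_chains(image: List[List[int]]) -> List[List[Run]]:
    """Group runs into chains in which every link is one-to-one."""
    rows = horizontal_runs(image)
    succ: Dict[Run, List[Run]] = {}
    pred: Dict[Run, List[Run]] = {}
    for y in range(len(rows) - 1):
        for a in rows[y]:
            for b in rows[y + 1]:
                if touches(a, b):
                    succ.setdefault(a, []).append(b)
                    pred.setdefault(b, []).append(a)

    def single_link(a: Run, b: Run) -> bool:
        # a has exactly one run below it, and b exactly one run above it
        return len(succ.get(a, [])) == 1 and len(pred.get(b, [])) == 1

    chains = []
    for runs in rows:
        for r in runs:
            above = pred.get(r, [])
            if len(above) == 1 and single_link(above[0], r):
                continue  # r continues a chain started in an earlier row
            chain = [r]
            while True:
                below = succ.get(chain[-1], [])
                if len(below) == 1 and single_link(chain[-1], below[0]):
                    chain.append(below[0])
                else:
                    break
            chains.append(chain)
    return chains


def thin_vertical_segments(image, min_height=5, max_width=2):
    """Keep chains that are tall and narrow (hypothetical thresholds)."""
    return [
        c for c in vertical_chains(image)
        if len(c) >= min_height
        and all(end - start + 1 <= max_width for _, start, end in c)
    ]


# A one-pixel-wide line (column 1) next to a three-pixel-wide stem (columns 4-6).
page = [[0, 1, 0, 0, 1, 1, 1] for _ in range(6)]
candidates = thin_vertical_segments(page)
```

Here `candidates` holds only the chain for the thin line at column 1, because the wider stem exceeds `max_width`. Fixed thresholds like these would not suffice for the case the abstract highlights, where an artifact is thicker than a thin character stem; that is the situation the paper's rule set is meant to address.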
Author(s)
Konya, Iuliu
Eickeler, Stefan  
Seibert, Christoph  
Mainwork
International Conference on Document Analysis and Recognition, ICDAR 2011. Vol.2  
Conference
International Conference on Document Analysis and Recognition (ICDAR) 2011  
Open Access
DOI
10.24406/publica-r-373127
10.1109/ICDAR.2011.190
File(s)
001.pdf (757.52 KB)
Rights
Under Copyright
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS  
Keyword(s)
  • OCR
  • retro-digitization
  • historical documents
  • hot metal typesetting
