
Machine learning for document structure recognition

Paaß, Gerhard; Konya, Iuliu

Preprint urn:nbn:de:0011-n-1945657 (4.4 MByte PDF)
MD5 Fingerprint: 8031e6a937c6ee2823deea336f372f4b
The original publication is available at
Created on: 16.2.2012

Mehler, A.:
Modeling, learning, and processing of text-technological data structures
Berlin: Springer, 2011 (Studies in computational intelligence 370)
ISBN: 3-642-22612-4
ISBN: 978-3-642-22612-0
ISBN: 978-3-642-22613-7
ISSN: 1860-949X
Book article, electronic publication
Fraunhofer IAIS
machine learning; digitizing paper documents; document structure recognition; rule-based; document segmentation

The backbone of the information age is digital information, which can be searched, accessed, and transferred instantaneously. The digitization of paper documents is therefore of great interest. This chapter describes approaches to document structure recognition, which detects the hierarchy of physical components in document images, such as pages, paragraphs, and figures, and transforms it into a hierarchy of logical components, such as titles, authors, and sections. This structural information improves readability and is useful for indexing and retrieving the information contained in documents. First, we present a rule-based system that segments the document image and estimates the logical role of the resulting zones. It is used extensively for processing newspaper collections, where it shows world-class performance. In the second part, we introduce several machine learning approaches that explore large numbers of interrelated features. They can be adapted to geometrical models of the document structure, which may be set up as a linear sequence or a general graph. These advanced models require far more computational resources but perform better than simpler alternatives and might be used in the future.
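The step from physical zones to logical roles can be illustrated with a minimal rule-based sketch. It is not the system described in the chapter: the `Zone` structure, the `label_zones` function, the role names, and all thresholds are hypothetical and chosen only for illustration. It assumes that segmentation has already produced zones with their text, vertical position, and dominant font size.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A physical zone found by page segmentation (hypothetical structure)."""
    text: str
    top: float        # distance from the page top, as a fraction of page height
    font_size: float  # dominant font size in points


def label_zones(zones):
    """Return one logical role per zone, in top-to-bottom reading order.

    All thresholds are illustrative assumptions, not values from the chapter.
    """
    if not zones:
        return []
    # Treat the median font size as the size of ordinary body text.
    sizes = sorted(z.font_size for z in zones)
    body_size = sizes[len(sizes) // 2]

    roles = []
    for zone in sorted(zones, key=lambda z: z.top):
        previous = roles[-1] if roles else None
        if ("title" not in roles and zone.top < 0.3
                and zone.font_size >= 1.5 * body_size):
            role = "title"            # first large zone near the page top
        elif previous == "title" and len(zone.text.split()) <= 8:
            role = "author"           # short zone directly below the title
        elif zone.font_size >= 1.15 * body_size:
            role = "section_heading"  # noticeably larger than body text
        else:
            role = "paragraph"
        roles.append(role)
    return roles


if __name__ == "__main__":
    page = [
        Zone("Machine learning for document structure recognition", 0.05, 18),
        Zone("Gerhard Paaß, Iuliu Konya", 0.12, 10),
        Zone("1 Introduction", 0.20, 12),
        Zone("The backbone of the information age is digital information.", 0.25, 10),
    ]
    for zone, role in zip(page, label_zones(page)):
        print(f"{role:16} {zone.text}")
```

Hand-written rules of this kind look at each zone largely in isolation. The machine learning approaches mentioned in the abstract instead model dependencies between neighbouring zones, for example along a linear sequence or a general graph.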