Fraunhofer-Gesellschaft
2008
Conference Paper
Title

A robust front page detection algorithm for large periodical collections

Abstract
Large-scale digitization projects aimed at periodicals often receive streams of completely unlabeled document images as input. In such situations, the quality of the automatic segmentation of the document stream into issues heavily influences the overall output quality of a document image analysis system. As a solution to the issue segmentation problem, this paper introduces a robust, two-step front page detection algorithm. First, the salient connected components of the periodical's front page are described using a multi-dimensional Gaussian distribution based on discrete cosine transform (DCT) features. Second, a graph model is computed by applying Delaunay triangulation to the selected set of components. A specialized, error-tolerant graph matching algorithm then computes the distance score between the model and each candidate page. Experiments on a large, real-world newspaper data set demonstrate the generality and effectiveness of the proposed method.
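As a rough illustration of the first step only, the sketch below computes low-frequency DCT coefficients for small image blocks and scores a candidate against a Gaussian fitted to training examples. It is not the authors' implementation: the block size, the choice of the k x k low-frequency corner as the feature vector, the diagonal covariance, and all function names (`dct2`, `dct_features`, `fit_diagonal_gaussian`, `mahalanobis_sq`) are assumptions made for this example. The Delaunay graph model and the error-tolerant graph matching of the second step are not covered.

```python
import math


def dct2(block):
    """Unnormalized 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)
    # cos_table[u][x] = cos(pi * (2x + 1) * u / (2n))
    cos_table = [
        [math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
        for u in range(n)
    ]
    return [
        [
            sum(
                block[x][y] * cos_table[u][x] * cos_table[v][y]
                for x in range(n)
                for y in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


def dct_features(block, k=2):
    """Flatten the k x k low-frequency corner of the DCT (assumed feature choice)."""
    coeffs = dct2(block)
    return [coeffs[u][v] for u in range(k) for v in range(k)]


def fit_diagonal_gaussian(samples, eps=1e-6):
    """Per-dimension mean and variance; eps keeps every variance positive."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    var = [
        sum((s[i] - mean[i]) ** 2 for s in samples) / n + eps
        for i in range(dim)
    ]
    return mean, var


def mahalanobis_sq(x, mean, var):
    """Squared Mahalanobis distance under a diagonal covariance."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


# Hypothetical training data: one vertical-stripe pattern at three intensities,
# standing in for the same salient component seen on several front pages.
stripes = [[1, 0, 1, 0] for _ in range(4)]
training = [
    dct_features([[scale * p for p in row] for row in stripes])
    for scale in (0.9, 1.0, 1.1)
]
mean, var = fit_diagonal_gaussian(training)

# A slightly brighter copy of the pattern, and its transpose (horizontal stripes).
similar = [[1.05 * p for p in row] for row in stripes]
different = [list(col) for col in zip(*stripes)]

d_same = mahalanobis_sq(dct_features(similar), mean, var)
d_diff = mahalanobis_sq(dct_features(different), mean, var)
print(d_same, d_diff)
```

In this toy setup the brighter copy of the trained pattern scores much closer to the model than the transposed pattern does. A full covariance matrix, a different feature selection, or a different normalization would be equally plausible readings of the abstract.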
Author(s)
Konya, Iuliu
Seibert, Christoph
Glahn, Sebastian
Eickeler, Stefan
Published in
19th International Conference on Pattern Recognition, ICPR 2008. Proceedings
Conference
International Conference on Pattern Recognition (ICPR) 2008
DOI
10.1109/ICPR.2008.4760966
Language
English
Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS