2026
Conference Paper
Title
Legend-Informed Symbol Recognition in Engineering Diagrams with Self-supervised Learning
Abstract
Engineering diagrams are vital documents in many industries. Because these diagrams have historically been stored as image data, they must be converted into modern formats for further use and adaptation. Therefore, research towards automated digitization has gained traction. To recognize symbols in the diagrams, recent studies rely on supervised learning, but large labeled datasets are difficult to acquire in industry settings. In this paper, we present a self-supervised approach to the automated recognition of engineering diagram symbols. We validate the method on diagrams from the building sector, where they are used for technical plant planning, installation, and monitoring. The method makes use of diagram legends, which show prototypical examples of the symbols occurring in the diagram. As the legend entries are unique, they can be used to learn embeddings through contrastive learning for a self-supervised classification of diagram symbols. The method circumvents most of the labeling effort: first, all symbols are extracted from the set of diagrams with a symbol region detector trained on a synthetic dataset. Then, we train a symbol encoder by contrasting the symbols found inside the legends with each other. The encoder is subsequently used in a matching procedure that classifies unknown diagram symbols by comparing them to the legend examples. Furthermore, the matching procedure can recognize when symbols do not appear in the legend at all. Generalizing beyond variations in diagram drawing style, this matching procedure achieves over 80% accuracy. The results demonstrate the potential of legends for engineering diagram digitization without the need to invest in labeled datasets.
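The matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, the fixed rejection threshold, and the names `match_to_legend` and `threshold` are assumptions made here for illustration. The abstract states only that unknown symbols are compared to legend examples and that symbols absent from the legend can be recognized as such.

```python
import numpy as np

def match_to_legend(symbol_embs, legend_embs, threshold=0.5):
    """Assign each symbol to its most similar legend entry.

    Hypothetical helper: cosine similarity and a fixed threshold are
    assumptions, not details taken from the paper.
    Returns (labels, best_sims); a label of -1 means "not in legend".
    """
    # L2-normalize so the dot product equals cosine similarity.
    s = symbol_embs / np.linalg.norm(symbol_embs, axis=1, keepdims=True)
    g = legend_embs / np.linalg.norm(legend_embs, axis=1, keepdims=True)
    sims = s @ g.T                      # shape: (n_symbols, n_legend)
    best = sims.argmax(axis=1)          # index of closest legend entry
    best_sims = sims[np.arange(len(best)), best]
    # Reject symbols whose best match is below the threshold.
    labels = np.where(best_sims >= threshold, best, -1)
    return labels, best_sims

# Toy embeddings standing in for encoder outputs.
legend = np.array([[1.0, 0.0], [0.0, 1.0]])
symbols = np.array([[0.9, 0.1], [0.1, 0.8], [-1.0, -1.0]])
labels, best_sims = match_to_legend(symbols, legend)
```

In this toy case the first two symbols are assigned to legend entries 0 and 1, while the third is dissimilar to both entries and is marked as not appearing in the legend. In practice the embeddings would come from the trained symbol encoder, and the threshold would need to be tuned on held-out data.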
Author(s)