Publication

Knowledge-Distillation-Based Label Smoothing for Fine-Grained Open-Set Vehicle Recognition

2024 , Wolf, Stefan , Loran, Dennis , Beyerer, Jürgen

Fine-grained vehicle classification is the task of estimating the make and model of a vehicle from an image. It provides a useful tool for security authorities to find suspects in surveillance footage. However, most research on fine-grained vehicle classification focuses only on a closed-set scenario, which assumes that all possible classes are included in the training data. This is unrealistic for real-world surveillance applications, where the images fed into the classifier can show arbitrary vehicle models and the large number of commercially available models makes learning them all impossible. We therefore investigate fine-grained vehicle classification in an open-set recognition scenario, which includes unknown vehicle models in the test set and expects these samples to be rejected. Our experiments highlight the importance of label smoothing for open-set recognition performance. However, uniform label smoothing ignores the varying semantic distances between vehicle models, which lead to largely different confusion probabilities. We therefore propose a knowledge-distillation-based label smoothing approach that accounts for these semantic similarities and thereby improves both closed-set classification and open-set recognition performance.
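The abstract gives no implementation details, so the following is a minimal sketch of the core idea under our own assumptions (PyTorch, a pretrained teacher network, and illustrative values for the smoothing weight eps and the distillation temperature): classic label smoothing spreads the off-target mass uniformly, while the distillation variant spreads it according to the teacher's softened predictions, so semantically similar vehicle models receive more probability mass.

```python
import torch
import torch.nn.functional as F

def uniform_smooth_targets(labels, num_classes, eps=0.1):
    """Classic label smoothing: off-target mass is spread uniformly."""
    targets = torch.full((labels.size(0), num_classes), eps / (num_classes - 1))
    return targets.scatter(1, labels.unsqueeze(1), 1.0 - eps)

def kd_smooth_targets(teacher_logits, labels, eps=0.1, temperature=4.0):
    """Distillation-based smoothing: off-target mass follows the teacher's
    softened predictions, so semantically close classes get more of it."""
    probs = F.softmax(teacher_logits / temperature, dim=1)
    probs = probs.scatter(1, labels.unsqueeze(1), 0.0)   # drop the true class
    probs = probs / probs.sum(dim=1, keepdim=True)       # renormalize the rest
    targets = eps * probs
    return targets.scatter(1, labels.unsqueeze(1), 1.0 - eps)

def soft_cross_entropy(student_logits, targets):
    """Cross-entropy against soft targets, used as the training loss."""
    return -(targets * F.log_softmax(student_logits, dim=1)).sum(dim=1).mean()
```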

Publication

12.2 W ZGP OPO pumped by a Q-Switched Tm3+:Ho3+-codoped fiber laser

2024 , Forster, Patrick , Lautenschläger, Jan , Lorenz, Dominik , Schneider, Julian , Panitzek, Dieter , Romano, Clément , Eichhorn, Marc , Kieleck, Christelle

We present our latest results in power scaling of Midwave-Infrared (MWIR) Optical Parametric Oscillators (OPOs) based on a Zinc Germanium Phosphide (ZGP) crystal, utilizing a single-oscillator fiber laser as pump source. To obtain a compact and complexity-reduced pump source emitting at ≥ 2.09 μm, a Q-switched Tm3+:Ho3+-codoped fiber laser was developed. Based on this pump source at an emission wavelength of 2.1 μm, we achieved an MWIR output power of 12.2 W with pulse energies of up to 270 μJ and a conversion efficiency exceeding 43%. This result exceeds the published power records of ZGP-based OPOs pumped by 2 μm Q-switched fiber lasers by 50% and sets a new benchmark for average power scaling and pulse energy of Q-switched pump sources.
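As a rough consistency check of the reported figures (our arithmetic, and only valid if average power, pulse energy, and efficiency refer to the same operating point, which the abstract does not state):

$$f_\mathrm{rep} \approx \frac{P_\mathrm{avg}}{E_\mathrm{pulse}} = \frac{12.2\,\mathrm{W}}{270\,\mu\mathrm{J}} \approx 45\,\mathrm{kHz}, \qquad P_\mathrm{pump} \approx \frac{P_\mathrm{avg}}{\eta} < \frac{12.2\,\mathrm{W}}{0.43} \approx 28\,\mathrm{W}.$$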

Publication

DustNet: Attention to Dust

2024 , Michel, Andreas , Weinmann, Martin , Schenkel, Fabian , Gomez, Tomas , Falvey, Mark , Schmitz, Rainer , Middelmann, Wolfgang , Hinz, Stefan

Detecting airborne dust in common RGB images is hard. Nevertheless, monitoring airborne dust can contribute greatly to climate protection, environmentally friendly construction, research, and numerous other domains. To develop an efficient and robust airborne dust monitoring algorithm, several challenges have to be overcome: airborne dust may be opaque as well as translucent, can vary heavily in density, and has fuzzy boundaries. Dust may also be hard to distinguish from other atmospheric phenomena such as fog or clouds. To meet the demand for a performant and reliable approach to monitoring airborne dust, we propose DustNet, a dust density estimation neural network. DustNet exploits attention- and convolution-based feature pyramid structures to combine features from multiple resolution and semantic levels. Furthermore, DustNet utilizes highly aggregated global information features as an adaptive kernel to enrich high-resolution features. In addition to the fusion of local and global features, we also present multiple approaches for the fusion of temporal features from consecutive images. To validate our approach, we compare the results achieved by DustNet with those of methods originating from the crowd-counting and monocular depth estimation domains on an airborne dust density dataset. DustNet outperforms these approaches, achieving 2.5% higher accuracy in localizing dust and a 14.4% lower mean absolute error than the second-best method.
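The abstract describes the adaptive-kernel idea only at a high level; the sketch below shows one plausible reading in PyTorch (module name, channel-attention form, and wiring are our assumptions, not the paper's architecture): globally pooled context is turned into a per-channel kernel that re-weights the high-resolution features.

```python
import torch
import torch.nn as nn

class GlobalKernelFusion(nn.Module):
    """Use aggregated global context as an adaptive per-channel kernel
    that enriches high-resolution features (illustrative sketch)."""
    def __init__(self, channels):
        super().__init__()
        self.to_kernel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),           # highly aggregated global info
            nn.Conv2d(channels, channels, 1),  # map to per-channel weights
            nn.Sigmoid(),
        )

    def forward(self, high_res_feat, global_feat):
        kernel = self.to_kernel(global_feat)   # shape (B, C, 1, 1)
        return high_res_feat * kernel          # broadcast over H and W

x_hi = torch.randn(2, 64, 128, 128)  # high-resolution pyramid features
x_lo = torch.randn(2, 64, 8, 8)      # coarse features carrying global context
fused = GlobalKernelFusion(64)(x_hi, x_lo)
```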

Publication

UPAR Challenge 2024: Pedestrian Attribute Recognition and Attribute-Based Person Retrieval - Dataset, Design, and Results

2024 , Cormier, Mickael , Specker, Andreas , Cezar Silveira Jacques Junior, Julio , Moritz, Lennart , Metzler, Jürgen , Moeslund, Thomas B. , Nasrollahi, Kamal , Escalera, Sergio , Beyerer, Jürgen

Attribute-based person retrieval enables individuals to be searched for and retrieved using their soft biometric features, for instance, gender, accessories, and clothing colors. It has numerous practical use cases, such as surveillance, retail, or smart cities. Notably, attribute-based person retrieval empowers law enforcement agencies to efficiently comb through vast volumes of surveillance footage from extensive multi-camera networks, facilitating the swift localization of missing persons or criminals. For real-world application, however, attribute-based person retrieval must generalize to multiple indoor and outdoor settings with their respective challenges. For its second edition, the WACV 2024 Pedestrian Attribute Recognition and Attribute-based Person Retrieval Challenge (UPAR-Challenge) aimed once again to spotlight the current challenges and limitations of existing methods in bridging domain gaps in real-world surveillance contexts. As in the first edition, two tracks were offered: pedestrian attribute recognition and attribute-based person retrieval. The UPAR-Challenge 2024 dataset extends the UPAR dataset with harmonized annotations for the MEVID dataset, which serves as a novel test domain; to this end, 1.1M additional annotations were manually labeled and validated. Each track evaluates the robustness of the competing methods to domain shifts by training and evaluating on data from entirely different domains. The challenge attracted 82 registered participants, which the organizers consider a success. While ten competing teams surpassed the baseline on track 1, no team managed to outperform the baseline on track 2, emphasizing the task's difficulty. This work describes the challenge design, the adopted dataset, the obtained results, and future directions on the topic. The UPAR-Challenge dataset is available on GitHub: https://github.com/speckean/upar_challenge.
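To make the retrieval track concrete, here is a minimal illustration of attribute-based person retrieval (the attribute names, log-likelihood scoring, and data are our own toy assumptions, not the challenge's official baseline or metric): each gallery detection is scored by how well its predicted attribute probabilities match a binary query.

```python
import numpy as np

def retrieve(gallery_probs, query, attr_names, top_k=10):
    """Rank gallery detections by log-likelihood of matching the query."""
    score = np.zeros(len(gallery_probs))
    for name, wanted in query.items():
        p = gallery_probs[:, attr_names.index(name)]
        score += np.log(p + 1e-8) if wanted else np.log(1.0 - p + 1e-8)
    return np.argsort(-score)[:top_k]

attr_names = ["female", "backpack", "upper_red"]       # hypothetical attributes
gallery_probs = np.random.rand(1000, len(attr_names))  # one row per detection
ranking = retrieve(gallery_probs, {"female": True, "upper_red": True}, attr_names)
```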

Publication

Domänenadaptation für feingranulare Fahrzeugklassifikation mittels Domain-Adversarial-Learning [Domain Adaptation for Fine-Grained Vehicle Classification via Domain-Adversarial Learning]

2024 , Wolf, Stefan , Beyerer, Jürgen

Publication

Security Fence Inspection at Airports Using Object Detection

2024 , Friederich, Nils , Specker, Andreas , Beyerer, Jürgen

To ensure the security of airports, it is essential to protect the airside from unauthorized access. For this purpose, security fences are commonly used, but they require regular inspection to detect damage. Due to the growing shortage of human specialists and the large manual effort involved, automated methods are needed; the aim is to inspect the fence for damage automatically with the help of an autonomous robot. In this work, we explore object detection methods to address the fence inspection task and localize various types of damage. In addition to evaluating four State-of-the-Art (SOTA) object detection models, we analyze the impact of several design criteria aimed at adapting to the task-specific challenges, including contrast adjustment, hyperparameter optimization, and the use of modern backbones. The experimental results indicate that our optimized You Only Look Once v5 (YOLOv5) model achieves the highest accuracy of the four methods, with an increase of 6.9 percentage points in Average Precision (AP) over the baseline. Moreover, we show the real-time capability of the model. The trained models are published on GitHub: https://github.com/IN-Friederich/airport_fence_inspection.
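The paper lists contrast adjustment as one design criterion without specifying it here; a common choice, shown below purely as a hedged example, is CLAHE on the lightness channel (the exact method and parameters in the paper may differ):

```python
import cv2

def enhance_contrast(image_bgr, clip_limit=2.0, tile_grid=(8, 8)):
    """Apply CLAHE to the L channel to boost local contrast, e.g., to make
    thin fence wires and damage stand out (illustrative preprocessing)."""
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid)
    enhanced = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)

# usage: detector_input = enhance_contrast(cv2.imread("fence.jpg"))
```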

Publication

Particle-Specific Deflection Windows for Optical Sorting by Uncertainty Quantification

2024 , Reith-Braun, Marcel , Liang, Kevin , Pfaff, Florian , Maier, Georg , Gruna, Robin , Bauer, Albert , Kruggel-Emden, Harald , Längle, Thomas , Beyerer, Jürgen , Hanebeck, Uwe D.

In current state-of-the-art sensor-based sorting systems, the length of the deflection windows, i.e., the period of nozzle activation and the number of nozzles to be activated, is commonly determined solely by the size of the particles. However, this means the sorting process does not account for discrepancies between actual and presumed particle motion, nor for situations where the available information does not allow precise determination of nozzle activations. To achieve a desired sorting accuracy in practice, one is therefore usually forced to enlarge the deflection window, which increases the number of falsely co-deflected particles and the compressed-air consumption. In this paper, we propose incorporating the prediction uncertainty of each individual particle's motion into the determination of the deflection windows. The method builds on the predictive tracking approach for optical sorting, which tracks the particles as they move toward the nozzle array based on images from an area-scan camera. Given the motion information extracted by the tracker, we propose an approximation for the distribution of a particle's arrival time and location at the nozzle array, assuming nearly constant-velocity or nearly constant-acceleration particle motion. By evaluating the quantile functions of both distributions, we obtain a confidence interval for the arrival time and location based on the prediction uncertainty, which we then combine with the particle size to form the final deflection window. We apply our method to a real sorting task using a pilot-scale chute sorter. Our results from extensive sorting trials show that sorting accuracy improves remarkably compared with state-of-the-art industrial sorters, and improves even further relative to plain predictive tracking, while offering the potential to reduce compressed-air consumption.
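As a simplified illustration of the window construction (a Gaussian first-order sketch under the nearly-constant-velocity model; the paper derives the arrival-time and location distributions more carefully, and all numbers below are made up):

```python
import numpy as np
from scipy.stats import norm

def deflection_window(y, v, var_y, var_v, d_nozzle, particle_size, alpha=0.05):
    """Confidence interval for arrival time at the nozzle array, widened by
    the particle size, for a particle at position y with velocity v."""
    d = d_nozzle - y                  # remaining distance to the nozzle line
    t_mean = d / v                    # arrival time under constant velocity
    # First-order (delta-method) variance of t = d / v.
    t_std = np.sqrt(var_y / v**2 + (d**2 / v**4) * var_v)
    # Quantile function of the arrival-time distribution -> confidence interval.
    t_lo = norm.ppf(alpha / 2, loc=t_mean, scale=t_std)
    t_hi = norm.ppf(1 - alpha / 2, loc=t_mean, scale=t_std)
    # Keep the valve open until the whole particle has passed.
    return t_lo, t_hi + particle_size / v

print(deflection_window(y=0.10, v=1.5, var_y=1e-6, var_v=1e-4,
                        d_nozzle=0.30, particle_size=0.005))
```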

Publication

Enhancing Skeleton-Based Action Recognition in Real-World Scenarios Through Realistic Data Augmentation

2024 , Cormier, Mickael , Schmid, Yannik , Beyerer, Jürgen

Skeleton-based action recognition is a prominent research area that provides a concise representation of human motion. However, real-world scenarios pose challenges to the reliability of human pose estimation, which is fundamental to such recognition. The existing literature mainly focuses on laboratory experiments with near-perfect skeletons and fails to address the complexities of the real world. To address this, we propose simple yet highly effective data augmentation techniques, based on observations of erroneous human pose estimation, which enhance state-of-the-art methods for real-world skeleton-based action recognition. These techniques yield significant improvements (up to +4.63 points in accuracy) on the widely used UAV-Human dataset, a benchmark for evaluating real-world action recognition. Experimental results demonstrate the effectiveness of our augmentation techniques in compensating for erroneous and noisy pose estimation, leading to significant improvements in action recognition accuracy. By bridging the gap between laboratory experiments and real-world scenarios, our work paves the way for more reliable and practical skeleton-based action recognition systems. To facilitate reproducibility and further development, the Skelbumentations library is released at https://github.com/MickaelCormier/Skelbumentations. The library implements our augmentation techniques, enabling researchers and practitioners to easily augment skeleton sequences and improve the performance of skeleton-based action recognition models in real-world applications.
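The Skelbumentations library linked above contains the actual transforms; the snippet below merely illustrates the flavor of such realistic augmentations (our simplified stand-ins, not the library's API): zeroing joints mimics missed detections, and coordinate jitter mimics noisy pose estimates.

```python
import numpy as np

def drop_keypoints(seq, p=0.1, rng=None):
    """Simulate missed detections by zeroing random joints per frame."""
    rng = rng or np.random.default_rng()
    t, j, _ = seq.shape                      # frames, joints, coordinates
    mask = rng.random((t, j, 1)) < p
    return np.where(mask, 0.0, seq)

def jitter_keypoints(seq, sigma=0.01, rng=None):
    """Simulate noisy pose estimation with Gaussian coordinate jitter."""
    rng = rng or np.random.default_rng()
    return seq + rng.normal(0.0, sigma, seq.shape)

seq = np.random.rand(64, 17, 2)              # e.g., 64 frames of COCO joints
augmented = jitter_keypoints(drop_keypoints(seq))
```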

Publication

Cognitive User Modeling for Adaptivity in Serious Games

2024 , Streicher, Alexander , Bauer, Kolja

Accurate user models that capture information such as needs and knowledge levels are a central part of adaptive e-learning systems, all the more important in a post-pandemic world with more individualized learning. In this article, we report on the application of a Bayesian cognitive state modeling approach to adaptive educational serious games. Adaptivity needs information about the user as control variables, e.g., high or low cognitive load. Typically, this information is encoded in user models. One approach to building user models is to use tools from the cognitive sciences such as Bayesian cognitive state modeling. However, cognitive modeling tools for adaptivity are scarce and can be difficult to implement. The main research question of this work is how to apply cognitive modeling tools to serious games to control adaptivity. The contribution of this article is a concept for implementing cognitive modeling in adaptive serious games. Our approach makes use of standardized Experience API (xAPI) tracking data to facilitate applicability. We investigate how to compute quantitative measures of user performance to control adaptive responses. The implemented system has been evaluated in a user study with a serious game for image interpretation. The study results show a moderate correlation between self-assessed and computed variables.
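The article's concrete model is not reproduced here; as a hedged illustration of the general mechanism, the toy example below performs a Bayesian update of a binary cognitive-load belief from success/failure outcomes, as they could be extracted from xAPI statements (the binary latent state and the likelihood values are our assumptions):

```python
def update_load_belief(prior_high, correct,
                       p_correct_given_high=0.4, p_correct_given_low=0.8):
    """One Bayes step for P(high cognitive load) given one observed outcome."""
    like_high = p_correct_given_high if correct else 1 - p_correct_given_high
    like_low = p_correct_given_low if correct else 1 - p_correct_given_low
    evidence = like_high * prior_high + like_low * (1 - prior_high)
    return like_high * prior_high / evidence

belief = 0.5
for outcome in [False, False, True]:   # e.g., xAPI "result.success" values
    belief = update_load_belief(belief, outcome)
print(f"P(high load) = {belief:.2f}")  # 0.82: rises after repeated failures
```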

Publication

Fusion between Event-Based and Line-Scan Camera for Sensor Based Sorting

2024 , Bäcker, Paul , Terzer, Nick , Maier, Georg , Hanebeck, Uwe D. , Längle, Thomas , Beyerer, Jürgen

In sensor-based sorting systems, there is usually a time delay between the detection and the separation of the material stream. This delay is required to process the sensor data, i.e., to identify the objects that should be ejected. During this blind phase, the material stream continues to move. Most current systems assume homogeneous movement for all objects and time the actuation accordingly. In many cases, however, this assumption does not hold, for example when unknown foreign materials of varying density and shape are present, leading to inaccurately timed activation of the separation actuators and, in turn, lower sorting quality. Minimizing the blind phase by reducing the distance between the sensor and the actuator is limited by the processing time of the detection stage and may lead to interference between actuation and sensing. In this work, we address these issues by placing an event-based camera between the sensor and actuator stages to track objects during the blind phase with minimal latency and small temporal increments between tracking steps. In our proposed setup, the event-based camera is used exclusively for tracking, while an RGB line-scan camera is used for classification. We propose and evaluate several approaches to combining the information of the two cameras. We benchmark our approach against the traditional method of using a fixed temporal offset by comparing simulated valve activations. Our method shows a drastic improvement in accuracy for our example application, raising the percentage of correctly deflected objects to 99.2%, compared with 78.57% without tracking.
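To illustrate the difference between the two activation schemes (a constant-velocity toy model with invented numbers; the paper's fusion approaches are more elaborate):

```python
def fixed_delay_activation(t_detect, delay):
    """Traditional scheme: homogeneous motion assumed, constant delay."""
    return t_detect + delay

def tracked_activation(track, nozzle_line_y):
    """Predict the crossing time of the nozzle line from the latest
    event-camera track state (position and velocity along the chute)."""
    return track["t"] + (nozzle_line_y - track["y"]) / track["v_y"]

track = {"t": 0.120, "y": 0.25, "v_y": 1.8}        # s, m, m/s (illustrative)
print(fixed_delay_activation(0.100, 0.090))        # 0.19 s, motion-agnostic
print(round(tracked_activation(track, 0.40), 3))   # 0.203 s, per-object timing
```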