2025
Conference Paper
Title

P2PNeXt: Advancing Crowd Counting and Localization Using an Enhanced P2PNet Architecture

Abstract
Accurate crowd counting and localization are essential for ensuring public safety and managing risks in densely populated areas, such as during large events or in urban environments. They enable authorities to monitor and manage large gatherings effectively, thereby preventing overcrowding and potential accidents. In emergency situations, accurate crowd data can facilitate quicker and more efficient responses by enabling the identification of high-density areas that may require immediate attention. From the computer vision perspective, these are crucial capabilities, demanding both precision in object counting and accurate spatial localization of individuals. In this study, we propose an enhancement to P2PNet, a point-based framework for crowd counting, by integrating a modern neural network architecture, ConvNeXt, as the backbone. We explored two primary directions for the backbone integration: utilizing a feature pyramid to combine various feature maps, and employing a single feature map from ConvNeXt, bypassing the feature pyramid. Initial experiments indicated that the single-feature-map approach, particularly with the very first feature map, yielded superior results. However, through a few critical modifications to the feature pyramid module (bilinear interpolation for upsampling, batch normalization across convolutions, and the inclusion of ReLU in the decoder), the feature pyramid approach ultimately outperformed the single-feature-map method. The revised feature pyramid, especially the first feature map output from the decoder module, achieved the best results across multiple datasets. In this way, our research contributes to the broader understanding of risk assessment and management, offering a robust solution for precise crowd density estimation and localization.
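To make the decoder modifications named in the abstract more concrete, the following is a minimal, dependency-free sketch of one top-down merge step in a feature pyramid: a coarse map is upsampled with bilinear interpolation, added to a finer map, and passed through ReLU. It is an illustration only, not the authors' implementation. The function names (`bilinear_upsample`, `relu`, `merge_top_down`), the single-channel 2D maps, and the corner-aligned sampling convention are all assumptions, and the convolutions and batch normalization mentioned in the abstract are omitted for brevity.

```python
from typing import List

Grid = List[List[float]]  # hypothetical single-channel feature map


def bilinear_upsample(grid: Grid, factor: int = 2) -> Grid:
    """Upsample a 2D map by an integer factor with bilinear interpolation.

    Assumption: corner-aligned sampling, so the four corner values of the
    input are preserved exactly in the output.
    """
    h, w = len(grid), len(grid[0])
    out_h, out_w = h * factor, w * factor
    out: Grid = []
    for i in range(out_h):
        # Fractional source row for output row i.
        y = i * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            # Fractional source column for output column j.
            x = j * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = grid[y0][x0] * (1 - dx) + grid[y0][x1] * dx
            bottom = grid[y1][x0] * (1 - dx) + grid[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


def relu(grid: Grid) -> Grid:
    """Element-wise ReLU: negative activations are clamped to zero."""
    return [[max(0.0, v) for v in row] for row in grid]


def merge_top_down(coarse: Grid, fine: Grid) -> Grid:
    """Upsample the coarse map to the fine map's size, add, apply ReLU.

    Assumption: the fine map is an integer multiple of the coarse map in size.
    """
    factor = len(fine) // len(coarse)
    up = bilinear_upsample(coarse, factor)
    summed = [[u + f for u, f in zip(u_row, f_row)]
              for u_row, f_row in zip(up, fine)]
    return relu(summed)


if __name__ == "__main__":
    coarse = [[0.0, 2.0], [4.0, 6.0]]
    fine = [[-1.0] * 4 for _ in range(4)]
    for row in merge_top_down(coarse, fine):
        print([round(v, 3) for v in row])
```

In a real network these operations would act on multi-channel tensors inside a deep learning framework; the sketch only shows how bilinear upsampling yields smooth intermediate values between coarse activations before the merge.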
Author(s)
Golda, Thomas
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Sänger, Jann
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Hildenbrand, John
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Metzler, Jürgen  
Fraunhofer-Institut für Optronik, Systemtechnik und Bildauswertung IOSB  
Mainwork
35th European Safety and Reliability Conference (ESREL 2025) and the 33rd Society for Risk Analysis Europe Conference (SRA-E 2025). Proceedings  
Conference
European Safety and Reliability Conference 2025  
Society for Risk Analysis Europe (SRA Conference) 2025  
Open Access
Rights
Use according to copyright law
DOI
10.3850/978-981-94-3281-3_ESREL-SRA-E2025-P4779-cd
10.24406/publica-5925
Language
English
Keyword(s)
  • Crowd Counting
  • Computer Vision
  • Machine Learning
  • ConvNeXt
  • P2PNet
  • Point-Based Framework
  • Public Safety
