Fraunhofer-Gesellschaft
July 10, 2023
Conference Paper
Title

Benchmarking the Benchmarks

Abstract
Over the years, security researchers have developed a broad spectrum of automatic code scanners that aim to find security vulnerabilities in applications. Security benchmarks are commonly used to evaluate novel scanners or program analysis techniques. Each benchmark consists of multiple positive test cases that reflect typical implementations of vulnerabilities, as well as negative test cases that reflect secure implementations without security flaws. Based on this ground truth, researchers can demonstrate the recall and precision of their novel contributions.
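To illustrate the evaluation setup the abstract describes, the following sketch shows how a benchmark's ground truth of positive and negative test cases yields recall and precision figures for a scanner. The test case names and the scanner output are hypothetical, not taken from the paper or from any real benchmark:

```python
# Toy ground truth: True marks a positive test case (contains a
# vulnerability), False marks a negative one (secure implementation).
ground_truth = {
    "cmd_injection_tc1": True,
    "cmd_injection_tc2": False,
    "sql_injection_tc1": True,
    "sql_injection_tc2": False,
}

# Hypothetical scanner output: the set of test cases it flagged as vulnerable.
scanner_findings = {"cmd_injection_tc1", "sql_injection_tc2"}

# Count true positives, false positives, and false negatives against the ground truth.
tp = sum(1 for tc, vuln in ground_truth.items() if vuln and tc in scanner_findings)
fp = sum(1 for tc, vuln in ground_truth.items() if not vuln and tc in scanner_findings)
fn = sum(1 for tc, vuln in ground_truth.items() if vuln and tc not in scanner_findings)

precision = tp / (tp + fp)  # fraction of flagged cases that are truly vulnerable
recall = tp / (tp + fn)     # fraction of vulnerable cases the scanner found

print(f"precision={precision}, recall={recall}")  # precision=0.5, recall=0.5
```

The paper's core observation follows directly from this setup: if a positive test case is in fact unexploitable, or a negative one is in fact exploitable, the ground truth itself is wrong and the computed precision and recall are misleading.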
However, as we found, existing security benchmarks are often underspecified with respect to their underlying assumptions and threat models. This may lead to misleading evaluation results when testing code scanners, since it requires the scanner to follow unclear and sometimes even contradictory assumptions.
To help improve the quality of benchmarks, we propose SecExploitLang, a specification language that allows the authors of benchmarks to specify security assumptions along with their test cases. We further present Exploiter, a tool that can automatically generate exploit code based on a test case and its SecExploitLang specification to demonstrate the correctness of the test case.
We created SecExploitLang specifications for two common security benchmarks and used Exploiter to evaluate the adequacy of their test case implementations. Our results show clear shortcomings in both benchmarks, i.e., a significant number of positive test cases turn out to be unexploitable, and even some negative test case implementations turn out to be exploitable. As we explain, the reasons for this include implementation defects as well as design flaws, which impact the meaningfulness of evaluations based on these benchmarks. Our work highlights the importance of thorough benchmark design and evaluation, and the concepts and tools we propose facilitate this task.
Author(s)
Miltenberger, Marc  
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Arzt, Steven  
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Holzinger, Philipp  
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Näumann, Julius  
Fraunhofer-Institut für Sichere Informationstechnologie SIT  
Mainwork
ASIA CCS 2023, ACM Asia Conference on Computer and Communications Security. Proceedings  
Conference
Asia Conference on Computer and Communications Security 2023  
Open Access
DOI
10.1145/3579856.3582830
Language
English