Generator-based Fuzzing with Input Features

Kraus, Roman; Nguyen, Hoang Lam; Schneider, Martin A.

doi:10.1145/3643659.3643925

September 2024

Conference Paper

Abstract

Generator-based fuzzing is a capable technique for testing the semantic processing stages of a system under test (SUT). The idea is to use format-specific input generators, which can guarantee that inputs will be syntactically valid. One open question, however, is how to create inputs with generator-based fuzzing whose content exhibits particular qualities (or input features). This is a limitation, as previous research suggests that input features are important for triggering otherwise rarely reached functionality of an SUT. We propose an approach to identify input features for rarely visited code by performing sequential pattern mining on the tree model of generated inputs. These features are regenerated by splicing (i.e., inserting) them into the model of newly generated inputs. We evaluate our approach on Ant, Maven, Closure and Rhino. The results indicate an increased diversity in the exploration of rarely executed code in most benchmarks. Significant improvements in valid rare branch hits were observed in half of the SUTs. JavaScript benchmarks tend to benefit more in terms of overall coverage, but no statistically significant difference was found.
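The two steps named in the abstract, mining patterns from the tree model of inputs and splicing a feature into a newly generated input, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the function names, the Ant-like labels, and the use of contiguous label n-grams as a simplified stand-in for sequential pattern mining are all assumptions made for illustration.

```python
import random
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree model of a generated input (e.g. an XML element)."""
    label: str
    children: list = field(default_factory=list)


def preorder(node):
    """Flatten a tree into its pre-order sequence of node labels."""
    labels = [node.label]
    for child in node.children:
        labels.extend(preorder(child))
    return labels


def mine_patterns(trees, length, min_support):
    """Return contiguous label n-grams found in at least min_support trees.

    Simplification: a full sequential pattern miner would also consider
    subsequences with gaps; only contiguous runs are counted here.
    """
    support = Counter()
    for tree in trees:
        seq = preorder(tree)
        grams = {tuple(seq[i:i + length]) for i in range(len(seq) - length + 1)}
        support.update(grams)  # each pattern counts once per tree
    return {pattern for pattern, n in support.items() if n >= min_support}


def splice(target, feature, rng):
    """Insert the feature subtree as a child of a randomly chosen node."""
    nodes, stack = [], [target]
    while stack:
        current = stack.pop()
        nodes.append(current)
        stack.extend(current.children)
    parent = rng.choice(nodes)
    parent.children.insert(rng.randrange(len(parent.children) + 1), feature)
    return target


# Assumed example: two inputs that (hypothetically) reached a rare branch.
rare_inputs = [
    Node("project", [Node("target", [Node("javac")])]),
    Node("project", [Node("target", [Node("javac"), Node("echo")])]),
]
patterns = mine_patterns(rare_inputs, length=2, min_support=2)

# Splice one mined feature, rebuilt as a subtree, into a fresh input.
fresh = Node("project", [Node("property")])
splice(fresh, Node("target", [Node("javac")]), random.Random(0))
```

Note that this sketch picks the splice location at random and does not check that the result is still syntactically valid; a real generator-based fuzzer would presumably restrict insertion points to those its input format allows.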

Author(s)

Kraus, Roman

Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS

Nguyen, Hoang Lam

Humboldt-Universität zu Berlin

Schneider, Martin A.

Fraunhofer-Institut für Offene Kommunikationssysteme FOKUS

Mainwork

17th ACM/IEEE International Workshop on Search-Based and Fuzz Testing, SBFT 2024. Proceedings

Conference

International Workshop on Search-Based and Fuzz Testing 2024
