Fuzzing is an automated testing technique that generates a lot of testcases and monitors for exceptions to test a program. Recently, fuzzing research using machine learning has been actively proposed to solve various problems in the fuzzing process, b...
Fuzzing is an automated testing technique that generates a lot of testcases and monitors for exceptions to test a program. Recently, fuzzing research using machine learning has been actively proposed to solve various problems in the fuzzing process, but a comprehensive evaluation of fuzzing research using machine learning is lacking. In this paper, we analyze recent research that applies machine learning to scheduling techniques for fuzzing, categorizing them into reinforcement learning-based and supervised learning-based fuzzers. We evaluated the coverage performance of the analyzed machine learning-based fuzzers against real-world programs with four different file formats and bug detection performance against the LAVA-M dataset. The results showed that AFL-HIER, which applied seed clustering and seed scheduling with reinforcement learning outperformed in coverage and bug detection. In the case of supervised learning, it showed high coverage on tcpdumps with high code complexity, and its superior bug detection performance when applied to hybrid fuzzing. This research shows that performance of machine learning-based fuzzer is better when both machine learning and additional fuzzing techniques are used to optimize the fuzzing process. Future research is needed on practical and robust machine learning-based fuzzing techniques that can be effectively applied to programs that handle various input formats.