DOI: https://doi.org/10.15368/theses.2022.69
Available at: https://digitalcommons.calpoly.edu/theses/2476
Date of Award
6-2022
Degree Name
MS in Computer Science
Department/Program
Computer Science
College
College of Engineering
Advisor
Phoenix Fang
Advisor Department
Computer Science
Advisor College
College of Engineering
Abstract
Fuzzing is the process of finding security vulnerabilities in code by creating inputs that will activate the exploits. Grammar-based fuzzing uses a grammar, which represents the syntax of all inputs a target program will accept, allowing the fuzzer to create well-formed complex inputs. This thesis conducts an in-depth study on two blackbox grammar-based fuzzing methods, GLADE and Learn&Fuzz, on their performance and usability to the average user. The blackbox fuzzer Radamsa was also used to compare fuzzing effectiveness. From our results in fuzzing PDF objects, GLADE beats both Radamsa and Learn&Fuzz in terms of coverage and pass rate. XML inputs were also tested, but the results only show that XML is a relatively simple input as the coverage results were mostly the same. For the XML pass rate, GLADE beats all of them except for the SampleSpace generation method of Learn&Fuzz. In addition, this thesis discusses interesting problems that occur when using machine learning for fuzzing. With experience from the study, this thesis proposes an improvement to GLADE’s user-friendliness through the use of a configuration file. This thesis also proposes a theoretical improvement to machine learning fuzzing through supplementary examples created by GLADE.