Fuzzing at scale
August 12th, 2011 | Published in Google Online Security
One of the exciting things about working on security at Google is that you have a lot of compute horsepower available if you need it. This is very useful if you’re looking to fuzz something, and especially if you’re going to use modern fuzzing techniques.
Using these techniques and large amounts of compute power, we’ve found hundreds of bugs in our own code, including Chrome components such as WebKit and the PDF viewer. We recently decided to apply the same techniques to fuzz Adobe’s Flash Player, which we include with Chrome in partnership with Adobe.
A good overview of some modern techniques can be read in this presentation. For the purposes of fuzzing Flash, we mainly relied on “corpus distillation”. This is a technique whereby you locate a large number of sample files for the format at hand (SWF in this case). You then see which areas of code are reached by each of the sample files. Finally, you run an algorithm to generate a minimal set of sample files that achieves the code coverage of the full set. This calculated set of files is a great basis for fuzzing: a manageable number of files that exercise lots of unusual code paths.
What does corpus distillation look like at Google scale? Turns out we have a large index of the web, so we cranked through 20 terabytes of SWF file downloads followed by 1 week of run time on 2,000 CPU cores to calculate the minimal set of about 20,000 files. Finally, those same 2,000 cores plus 3 more weeks of runtime were put to good work mutating the files in the minimal set (bitflipping, etc.) and generating crash cases. These crash cases included an interesting range of vulnerability categories, including buffer overflows, integer overflows, use-after-frees and object type confusions.
The initial run of the ongoing effort resulted in about 400 unique crash signatures, which were logged as 106 individual security bugs following Adobe's initial triage. As these bugs were resolved, many were identified as duplicates that weren't caught during the initial triage. A unique crash signature does not always indicate a unique bug. Since Adobe has access to symbols and sources, they were able to group similar crashes to perform root cause analysis reducing the actual number of changes to the code. No analysis was performed to determine how many of the identified crashes were actually exploitable. However, each crash was treated as though it were potentially exploitable and addressed by Adobe. In the final analysis, the Flash Player update Adobe shipped earlier this week contained about 80 code changes to fix these bugs.
Commandeering massive resource to improve security is rewarding on its own, but the real highlight of this exercise has been Adobe’s response. The Flash patch earlier this week fixes these bugs and incorporates UIPI protections for the Flash Player sandbox in Chrome which Justin Schuh contributed assistance on developing. Fixing so many issues in such a short time frame shows a real commitment to security from Adobe, for which we are grateful.