Thursday, February 04, 2010

A Maze of Twisty Fuzzers All Alike

Funny how a single innocent tweet can stir the pot. Not that I'm disappointed or that I mind, because the pot definitely needed to be stirred, but that certainly wasn't my intent on Monday. Really. But let's back up.

So I gave a short, not terribly technical presentation on Open Source fuzzing tools right before lunch at a conference on vulnerability discovery hosted by CERT at their office in Arlington. It was good to be back there, as I'd been to the SEI offices in 2006 when I was working with them on the disclosure of some SCADA vulns.

Unfortunately, I didn't get to stick around the whole day (I missed Jared DeMott's presentation) and I was in and out during a conference call, but there were interesting talks by CERT, CERT-FI, Secunia, and Codenomicon.

But most interesting, and what led to my innocent tweet, was a talk by Microsoft on how they use fuzzing and what results they have seen from different tools and approaches.

The conclusion I found surprising was that the use of "smart fuzzers" had a lower ROI than the use of "dumb fuzzers" and their whitebox fuzzing platform called SAGE. Their point was that the time it takes to define, model, and implement the protocol in a smart fuzzer is in most cases better spent having less skilled engineers run dumb fuzzers or whitebox tools.

They mentioned a talk at Blue Hat Security Briefings (I don't think this is the actual talk, but I don't have time to look for it) where they presented the bug results from a previously untested application that was tested by an internally written smart fuzzer, Peach (the dumb fuzzer?), and SAGE. They mentioned an interesting technique of taking "major hashes" and "minor hashes" of the stack traces to isolate unique bugs. This is interesting because the primary focus has been on reducing the number of unique test cases, but another approach is to de-duplicate the results, which may end up being more efficient. Of course this assumes the ability to instrument the target, which may not always be possible, for example with embedded systems.
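The idea is easy to sketch: hash the top few stack frames for a "major hash" and the full stack for a "minor hash", then treat crashes that share a major hash as the same underlying bug. Here is a minimal illustration in Python; it is not Microsoft's implementation, and the frame names, depth, and hash choice are all assumptions on my part.

import hashlib

MAJOR_DEPTH = 3  # assumed depth for the "major hash"; real tools pick their own

def stack_hashes(frames):
    # Return (major, minor) hashes for a list of symbolized frame names.
    def digest(frame_list):
        return hashlib.sha1("|".join(frame_list).encode()).hexdigest()[:16]
    return digest(frames[:MAJOR_DEPTH]), digest(frames)

def bucket_crashes(crashes):
    # Group crash reports that share a major hash into one logical bug.
    buckets = {}
    for crash_id, frames in crashes.items():
        major, minor = stack_hashes(frames)
        buckets.setdefault(major, []).append((crash_id, minor))
    return buckets

# Two crashes with the same top frames collapse into one bucket,
# even though the deeper frames (and so the minor hashes) differ.
crashes = {
    "case-001": ["memcpy", "parse_header", "parse_packet", "dispatch", "main"],
    "case-002": ["memcpy", "parse_header", "parse_packet", "handle_conn", "main"],
}
print(bucket_crashes(crashes))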

So Dale picked up on this and tried to apply it to the world of SCADA:
We have two security vendors that are trying to sell products to the control system market: Wurldtech with their Achilles platform and Mu Dynamics with their Mu Test Suite. [FD: Wurldtech is a past Digital Bond client and advertiser] One of the features of these products is they both send a large number of malformed packets at an interface – typically crashing protocol stacks that have ignored negative testing.
Mu responded in the comments on the blog post, and Wurldtech (far more defensively) on their own blog:
In fact, our CTO Dr. Kube even gave a presentation at Cansecwest almost 2 years ago called “Fuzzing WTF” which was our first attempt to re-educate the community. To bolster the impact, we invited our friends at Codenomicon to help as they also were frustrated with the community discourse. The presentation can be found here.
Well I guess the "re-education" (which sounds vaguely Maoist; I guess some of us need to be sent to a Wurldtech Fuzzing Re-education program) hasn't exactly worked, although a satisfied Wurldtech customer did chime in on the Digital Bond blog. I actually agree that better descriptions of fuzzing tool capabilities are needed, and that was the entire point of my talk. I did a survey of the features available in several dozen fuzzing tools and fuzzing frameworks that could be used for this kind of testing.

I didn't spend as much time on the actual message generation as I should have, and I was only focusing on Free and Open Source tools, but I identified a number of attributes for comparison such as target, execution mode, language, transport, template (generation, data model, built-in functions), fault payloads, debugging & instrumentation, and session handling. I'm not sure I completely hit my target, but one of my goals was to develop some criteria to help folks make better choices about which Open Source tools could be used to most efficiently conduct robustness testing of their target. One of my conclusions (which I was pleased to hear echoed in the Microsoft talk) is that no single tool is best, no single approach is adequate, and there are different types of fuzzing users that will require different feature sets. A QA engineer (who may have little to no security expertise) requires different features from those required by a pen-tester (or perhaps a security analyst working a compliance-based engagement), which are still different from those of a hard core security researcher.
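For what it's worth, the attributes above lend themselves to a simple data structure when building such a comparison. The sketch below is just one possible way to record a tool and filter on the features a particular kind of user cares about; the field names and the example entry are mine, not drawn from any actual tool.

# Illustrative only: a record of one (hypothetical) fuzzer's feature set,
# using the comparison attributes from the talk as dictionary keys.
example_profile = {
    "name": "hypothetical-fuzzer",
    "target": "network service",        # network service, file parser, API, ...
    "execution_mode": "framework",      # standalone tool vs. framework/library
    "language": "Python",               # implementation / extension language
    "transports": ["TCP", "UDP"],
    "template": {"generation": True, "data_model": True, "built_in_functions": False},
    "fault_payloads": ["string overflows", "format strings", "integer boundaries"],
    "instrumentation": ["process monitor"],
    "session_handling": True,
}

profiles = [example_profile]

# Different users filter on different attributes; e.g. a QA engineer who
# needs session handling in the tool itself:
qa_needs = {"session_handling": True}
print([p["name"] for p in profiles
       if all(p.get(k) == v for k, v in qa_needs.items())])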

And the same applies to commercial tools you are paying tens of thousands of dollars for. One size does not fit all, regardless of the marketing (or mathematical) claims of the vendor. It would definitely be good to see a bakeoff of the leading commercial and Open Source fuzzing/protocol robustness tools, similar to what Jeff Mercer has been doing for webapp scanners, but I'm not optimistic that we will see that on the commercial side: the tools are too expensive, and the primary customers for these tools (large vendors) are not going to disclose enough details about the vulnerabilities discovered to provide a rich enough data set for comparison.

It won't be me, but perhaps some aspiring young hacker will take the time to do a thorough comparison of the coverage of the tools that are out there against a reference implementation -- instead of writing yet another incomplete, poorly documented Open Source fuzzer or fuzzing framework.

2 comments:

Ari Takanen said...

Charlie Miller already has done a comparison of fuzzers, presented at CanSecWest, and also featured in more detail in our book (authored by Takanen, DeMott, Miller). www.fuzz-test.com

Also the coverage white paper by Codenomicon: Fuzzing Challenges: Metrics and Coverage answers a lot of the issues you bring up.

Matt Franz said...

I looked at the ToC on Amazon; perhaps somebody could send me a free copy ;)

In general most security books are obsolete by the time they are printed.

- mdf