
February 10, 2015

Bi-level TIFFs and the tale of the unexpectedly early patch

Today's release of MS15-016 (CVE-2015-0061) fixes another in the series of browser memory disclosure bugs found with afl-fuzz - this time, related to the handling of bi-level (1-bpp) TIFFs in Internet Explorer (yup, MSIE displays TIFFs!). You can check out a simple proof-of-concept here, or simply enjoy this screenshot of eight successive renderings of the same TIFF file:
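
For those who prefer bytes to pictures, here is a minimal sketch (Python, standard library only) of what a well-formed bi-level TIFF looks like on disk. To be clear, this is merely an illustration of the file layout being mutated - not the actual proof-of-concept - and all field values are arbitrary:

  import struct

  def entry(tag, typ, count, value):
      # One 12-byte IFD entry: tag, data type (3 = SHORT, 4 = LONG), count, value.
      return struct.pack('<HHII', tag, typ, count, value)

  pixels = bytes([0b10101010] * 8)          # 8x8 pixels, 1 bit each, one strip

  ifd_offset = 8 + len(pixels)              # header (8 bytes) + pixel data
  entries = [
      entry(256, 3, 1, 8),                  # ImageWidth = 8
      entry(257, 3, 1, 8),                  # ImageLength = 8
      entry(258, 3, 1, 1),                  # BitsPerSample = 1 (bi-level)
      entry(259, 3, 1, 1),                  # Compression = none
      entry(262, 3, 1, 0),                  # PhotometricInterpretation = WhiteIsZero
      entry(273, 4, 1, 8),                  # StripOffsets -> pixel data
      entry(278, 3, 1, 8),                  # RowsPerStrip = 8
      entry(279, 4, 1, len(pixels)),        # StripByteCounts - length fields like
                                            # this are classic targets for mutation
  ]

  data = (b'II' + struct.pack('<HI', 42, ifd_offset) +   # little-endian TIFF header
          pixels +
          struct.pack('<H', len(entries)) + b''.join(entries) +
          struct.pack('<I', 0))                          # no next IFD

  with open('bilevel.tif', 'wb') as f:
      f.write(data)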

The vulnerability is conceptually similar to other previously-identified problems with GIF and JPEG handling in popular browsers (example 1, example 2), with the SOS handling bug in libjpeg, or the DHT bug in libjpeg-turbo (details here) - so I will try not to repeat the same points in this post.

Instead, I wanted to take note of what really sets this bug apart: Microsoft has addressed it in precisely 60 days, counting from my initial e-mail to the availability of a patch! This struck me as a big deal: although vulnerability research is not my full-time job, I do have a decent sample size - and I don't think I have seen this happen for any of the few dozen MSIE bugs that I reported to MSRC over the past few years. The average patch time always seemed to be closer to 6+ months - coupled with the somewhat odd practice of withholding attribution in security bulletins and engaging in seemingly punitive PR outreach if the reporter ever went public before that.

I am very excited and hopeful that rapid patching is the new norm - and huge thanks to MSRC folks if so :-)

February 04, 2015

Symbolic execution in vuln research

There is no serious disagreement that symbolic execution has remarkable potential for programmatically detecting broad classes of security vulnerabilities in modern software. Fuzzing, in comparison, is an extremely crude tool: it's the banging-two-rocks-together way of doing business, as contrasted with brain surgery.
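
To make the contrast tangible, consider a toy example: a branch guarded by a nontrivial arithmetic condition on a 32-bit input. Random bit-flipping has roughly a 1-in-2^32 chance of ever taking the branch, while a constraint solver recovers a satisfying input instantly. A minimal sketch, assuming the z3 SMT solver and its Python bindings are installed (the branch condition is made up for illustration):

  from z3 import BitVec, Solver, sat

  x = BitVec('x', 32)                 # symbolic stand-in for a 32-bit input field

  s = Solver()
  # Path condition for a hypothetical vulnerable branch:
  #   if (x * 7 + 3 == 0x0defaced) { /* boom */ }
  s.add(x * 7 + 3 == 0x0defaced)

  if s.check() == sat:
      print('input reaching the branch:', s.model()[x])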

Because of this, it comes as no surprise that for the past decade or so, the topic of symbolic execution and related techniques has been a mainstay of almost every self-respecting security conference around the globe. The tone of such presentations is often lofty: the slides and research papers are frequently accompanied by claims of extraordinary results and proclamations of the imminent demise of less sophisticated tools.

Yet, despite the crippling and obvious limitations of fuzzing and the virtues of symbolic execution, there is one jarring discord: I'm fairly certain that probably around 70% of all remote code execution vulnerabilities disclosed in the past few years trace back to fairly "dumb" fuzzing tools, with the pattern showing little change over time. The remaining 30% is attributable almost exclusively to manual work - be it systematic code reviews, or just aimlessly poking the application in hopes of seeing it come apart. When you dig through public bug trackers, vendor advisories, and CVE assignments, the mark left by symbolic execution can be seen only with a magnifying glass.

This is an odd discrepancy, and one that is sometimes blamed on practitioners being backward, stubborn, and ignorant. This may be true, but only to a very limited extent; ultimately, most geeks are quick to embrace the tools that serve them well. I think that the disconnect has its roots elsewhere:
  1. The code behind many of the most-cited, seminal publications on security-themed symbolic execution remains non-public; this is particularly true for Mayhem and SAGE. Implementation secrecy is fairly atypical in the security community, is usually viewed with distrust, and makes it difficult to independently evaluate, replicate, or build on top of the published results.

  2. The research often fails to fully acknowledge the limitations of the underlying methods - while seemingly being designed to work around these flaws. For example, the famed Mayhem experiment helped identify thousands of bugs, but most of them seemed to be remarkably trivial and affected only very obscure, seldom-used software packages with no significance to security. It is likely that the framework struggled with more practical issues in higher-value targets - a prospect that, especially if not addressed head-on, can lead to cynical responses and discourage further research.

  3. Any published comparisons to more established vulnerability-hunting techniques are almost always retrospective; for example, after the discovery of Heartbleed, several teams have claimed that their tools would have found the bug. But analyses that look at ways to reach an already-known fault condition are very susceptible to cognitive bias. Perhaps more importantly, it is always tempting to ask why the tools are not tasked with producing a steady stream of similarly high-impact, headline-grabbing bugs.

The uses of symbolic execution, concolic execution, static analysis, and other emerging technologies to spot substantial vulnerabilities in complex, unstructured, and non-annotated code are still in their infancy. The techniques suffer from many performance trade-offs and failure modes, and while there is no doubt that they will shape the future of infosec, thoughtful introspection will probably get us there sooner than bold claims with little or no follow-through. We need to move toward open-source frameworks, verifiable results, and solutions that work effortlessly and reliably for everyone, against almost any target. That's the domain where the traditional tools truly shine, and that's why they scale so well.

Ultimately, the key to winning the hearts and minds of practitioners is very simple: you need to show them how the proposed approach finds new, interesting bugs in the software they care about.

February 03, 2015

afl-fuzz: black-box binary fuzzing, perf improvements, and more

I've written quite a few posts about afl-fuzz recently, mostly focusing on individual, newly-shipped features (say, the fork server, the crash explorer, or the grammar reconstruction logic). But this probably gets boring for people not interested in the tool, and doesn't necessarily add up to a coherent picture for those who are.

To cut down on AFL-themed posts, I decided to write up a technical summary of all the internals and maintain it as a part of the AFL home page. The document covers quite a few different things, including:
  • The newly-added support for guided fuzzing of black-box, closed-source binaries (yes, it finally happened! - see the quick usage sketch after this list),

  • Info about effector maps - a new feature that offers significant performance improvements for many types of fuzzing jobs,

  • Some hard data comparing the efficiency of evolutionary fuzzing and AFL-style instrumentation versus more traditional tools,

  • Discussion of many other details that have not been documented in depth until now - queue culling, file minimization, etc.
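
If you just want to kick the tires on the black-box mode mentioned in the first bullet, the rough workflow - assuming a recent AFL checkout that ships with the qemu_mode/ directory - looks more or less like this:

  # Build the QEMU-based instrumentation once:
  $ cd qemu_mode && ./build_qemu_support.sh

  # ...then fuzz a binary-only target by adding -Q to the usual invocation
  # (@@ expands to the path of the current test case):
  $ ./afl-fuzz -Q -i testcases/ -o findings/ -- /path/to/target @@
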
I'll try to show a bit more restraint with AFL-related news on this blog from now on, so if you want to stay in the loop on key developments, consider signing up for the afl-users@ mailing list.