[This post was conceived and co-authored by Andrew Ruef, Ph.D. student at the University of Maryland, working with me. –Mike]
As researchers, we are often asked to look into a crystal ball. We try to anticipate future problems so that work we begin now will help address those problems before they become acute. Sometimes, a researcher guesses the problem and its possible solution, but chooses not to pursue it. In a sense, she has found, and discarded, an idea ahead of its time.
Recently, a friend of Andrew’s pointed him to a 20-year-old email exchange on the “firewalls” mailing list that blithely suggests, and discards, problems and solutions that are now quite relevant, and on the cutting edge of software security research. The situation is both entertaining and instructive, especially in that the ideas are quite squarely in the domain of programming languages research, but were not considered by PL researchers at the time (as far as we know).
Problem: Exploits via buffer overflow
This 1995 e-mail exchange on the “firewalls” mailing list identifies the problem presented by buffer overflow-based exploits:
Someone can spend hours to develop an exploit for a particular target
platform, OS version, and CPU type. That exploit might need to use different
offsets on other versions or machine types, and it is clearly just a
mechanical process. It is almost getting boring or something.
Note this statement of boredom is being made just one year before the publication of Smashing the Stack for Fun and Profit, a watershed publication that systematizes the construction of stack smashing attacks. This statement also anticipates research that automates the discovery and construction of exploits. The author continues:
I just decided it would be pretty funny if we had a C compiler which would
randomize the location of variables on the stack for each compile. It
would also be nice if it could randomly stick in some unused short ints
into the stack in various places (especially around big strings used
by sprintf, strcpy, and friends 🙂 ).
This would offer some protection against the next 10 years of cloned
stack-o-rama exploits. Raise the level of effort and all that. 🙂
It doesn’t change the price of the first exploit but it might prevent
anybody from making a profit on the volume!
Offhand solutions: 20 years of research
The amazing thing about this email exchange is the number of ideas that are offered almost tongue-in-cheek, but which in the 20 years since have turned into serious topics of study and, in some cases, mainstream deployment.
Stack randomization (2010 – active research)
The initial post quoted above suggests one possible defense — randomizing stack layouts. Some on the email list dismiss the idea (more on this below), but by 2010, Michael Franz makes the case that the idea should be revisited. In the last five years, his group and others have made a compelling case for automated software diversity, which includes the randomization of stack layouts.
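To make the idea concrete, here is a minimal sketch (ours, not the actual transformation from the diversity work) of the kind of layout a stack-randomizing compiler might emit. RANDOM_PAD is a hypothetical placeholder for a value the compiler would choose randomly per build or per run:

    /* Sketch only: a randomly sized pad shifts the buffer's distance from the
       return address, so a hard-coded exploit offset stops working across builds. */
    #include <string.h>

    #define RANDOM_PAD 24  /* hypothetical: chosen randomly by a diversifying compiler */

    void handle_request(const char *input) {
        volatile char pad[RANDOM_PAD];  /* inserted padding; never used directly */
        char buf[64];
        (void)pad;
        strcpy(buf, input);  /* still a bug: randomization only changes where the overflow lands */
    }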
Safe Stacks (2006 – now a compiler extension)
Others on the list respond with their own crazy ideas, such as the use of two stacks:
As long as we’re talking ‘blue sky’ solutions, why not do it _right_??
It’s well-known that the way you insure the integrity of code is keep it
separate from data. A single stack, containing both user-data, *and*
program ‘state’ information (function return addresses, etc.), fails to
maintain this separation. Solution: _two_ stacks. One for return-
addresses and processor ‘context’ info *only*, the other for all user-data.
Running with two stacks was seriously explored by the XFI system (which provided other protections, too) in 2006. The idea has matured to the point that it is now in clang, the production-quality front-end of LLVM, as SafeStack.
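As a rough illustration of why separating the stacks helps (our sketch, not clang's actual code generation): with SafeStack, enabled via clang -fsanitize=safe-stack, address-taken buffers like the one below are moved to a separate "unsafe" stack, while return addresses and register spills stay on the ordinary stack, out of the overflow's reach.

    /* Sketch: under -fsanitize=safe-stack, 'name' lives on the unsafe stack, so
       overflowing it cannot corrupt the return address on the regular stack. */
    #include <stdio.h>
    #include <string.h>

    void greet(const char *who) {
        char name[32];  /* address-taken local: a candidate for the unsafe stack */
        strncpy(name, who, sizeof(name) - 1);
        name[sizeof(name) - 1] = '\0';
        printf("hello, %s\n", name);
    }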
Bounds Checking (1995 – going mainstream)
The author who proposed separate stacks also proposed preventing out-of-bounds accesses in the first place:
Another solution is to add ‘run-time array bounds checking’ to the compiler,
a la Pascal. Of course, then you have to deal with what happens when a
‘array bounds exception’ occurs. Default procedure, in Pascal for example,
is to announce it and _abort_ the program (wouldn’t *this* make for wonderful
‘denial-of-service’ attacks?!!). The way you prevent those aborts is to do
your _own_ checking *first*, and never reference past the end of an array.
Thus, you have _double_ checks on everything.
Jones and Kelly proposed backward compatible bounds checking for C that same year (1995), which also saw the release of the type-safe (and bounds-checked) Java programming language. Jones and Kelly’s solution was far too slow to be practical, but the benefits were deemed promising enough that a long line of research on type-safe (or memory-safe) C followed, including languages such as CCured and Cyclone in the early 2000s, and Softbound/CETS in 2009-2010. Hardware support for runtime bounds checking (based on work by Nagarakatte and others involved with Softbound) is slated to be released as an Intel ISA extension called MPX.
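To connect the quote to what these systems do, here is our sketch of the check a bounds-checking compiler conceptually inserts before an indexed access. Real systems (Jones and Kelly, Softbound, MPX) track base/bound metadata per pointer rather than passing an explicit length, but the failure behavior is the "announce it and abort" the 1995 author worried about:

    /* Sketch of a compiler-inserted bounds check; real bounds checkers derive
       'len' from per-pointer metadata instead of an extra argument. */
    #include <stdio.h>
    #include <stdlib.h>

    int read_elem(const int *arr, size_t len, size_t i) {
        if (i >= len) {  /* the inserted check */
            fprintf(stderr, "array bounds exception\n");
            abort();     /* Pascal-style: announce it and abort */
        }
        return arr[i];
    }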
Stack Canaries (1998 – now mainstream)
Another author further riffs on these ideas, proposing what is now known as stack canaries:
I’ll do you one better on that. How about an integer or similiar type
allocated immediately after the buffer, whose value is set by the compiler
(“randomly”) and which is checked after the reads. If the buffer’s been over
written, it’s unlikely that the value will match what is put in. Note
that this doesn’t stop people on the machine from recompiling and noting
differences in the binaries to determine what value is being used, but it
will protect against most external attacks.
The paper that fleshes out the idea of stack canaries was published in 1998 by Cowan et al. Support for stack canaries would be added to mainstream compilers soon afterward (the original paper presented a gcc patch), with small improvements such as choosing the canary value randomly at run time rather than fixing it in the binary.
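Written out by hand, the mechanism looks roughly like the sketch below. This is a sketch only: with gcc or clang's -fstack-protector, the guard slot is placed by the code generator between the locals and the saved return address, not declared as an ordinary local as we do here.

    /* Sketch of the canary logic compilers emit under -fstack-protector.
       __stack_chk_guard and __stack_chk_fail are the usual runtime symbols on
       Linux; the variable ordering here is only illustrative. */
    #include <string.h>

    extern unsigned long __stack_chk_guard;  /* process-wide random canary value */
    extern void __stack_chk_fail(void);      /* reports the smash and aborts */

    void copy_input(const char *src) {
        unsigned long canary = __stack_chk_guard;
        char buf[64];
        strcpy(buf, src);                    /* a linear overflow clobbers the canary
                                                before it reaches the return address */
        if (canary != __stack_chk_guard)     /* re-check on the way out */
            __stack_chk_fail();
    }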
Ideas not mentioned
While incredibly prescient, the authors of the firewalls thread missed some important defenses against buffer overflow exploits. They did not identify the idea of Address Space Layout Randomization (ASLR), first proposed in 2001, though their idea of stack randomization gets at the same underlying concept.[ref]While ASLR is of questionable value for 32-bit address spaces, it is now in common use and seems effective on modern (64-bit) processors.[/ref] They also did not identify Control Flow Integrity (CFI), first proposed in 2005,[ref]Techniques for restricting a program’s control flow were also proposed prior to 2005, e.g., program shepherding in 2002. CFI is implemented via (re)compilation and defines its policy in terms of a control-flow graph, rather than (say) in terms of legal entry points to libraries.[/ref] which (like the other ideas proposed) compiles the program differently to limit the possible effects of an attack. Finding the right balance of precision and performance is still a research question for CFI, but it is making its way into the mainstream. More ideas explored in the last 20 years are nicely described in Szekeres et al.’s 2013 paper, SoK: Eternal War in Memory.
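For a feel of the CFI mechanism, here is our sketch of the guard placed before an indirect call. Production implementations (e.g., clang's -fsanitize=cfi) inline a check against a statically computed set of allowed targets rather than calling a helper; is_allowed_target below is a hypothetical stand-in for that check.

    /* Sketch of a CFI check: the indirect call proceeds only if the target is in
       the set the control-flow graph permits for this call site. */
    #include <stdlib.h>

    typedef void (*handler_fn)(int);

    extern int is_allowed_target(handler_fn f);  /* hypothetical: the inlined set-membership test */

    void dispatch(handler_fn f, int arg) {
        if (!is_allowed_target(f))  /* inserted before the indirect transfer */
            abort();
        f(arg);                     /* reached only for CFG-permitted targets */
    }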
How and why?
Seeing this email exchange, we were led to wonder: what has changed since 1995 to turn what seemed at the time (to the authors) to be impractical ideas into practical ones?
The people of 1995 were particularly concerned with complicating the compiler, as it’s part of the trusted computing base:
But when someone suggests fixing the problem by opening up and
hacking *another* large code-generating program, my worry alarm goes off.
Perhaps I’m too paranoid or too cranky, but I’m certainly not going to
accept on faith that hacking a compiler in the way described is a safe and
reliable way to proceed, without a lot of respected people walking through
the compiler and pronouncing it sane and safe on all applicable platforms.
This is a reasonable concern. In the last 20 years, the state of both the art and the practice of compiler construction has advanced quite a bit. There is now a production-quality compiler, called CompCert,[ref]Xavier Leroy, the leader of the CompCert project, just won the POPL 2006 Most Influential Paper Award for authoring the first paper on the work.[/ref] that has been formally verified to be correct; extensions to this compiler should cause fewer concerns. Also, the emergence of LLVM, which provides a robust, modular infrastructure on which we can create exciting compiler extensions, gives greater confidence that those extensions are correct, and creates a quicker path to adoption. The Vellvm project looks to add verifiability to aspects of LLVM, which should also address “worry alarms.”
It also seems like there is an undercurrent in this discussion of “these runtime checks are expensive and wacky, why don’t we just get the code right the first time?” (which is part of the punchline of some CS stand-up comedy). Maybe it took a long time before the idea of “just write the C code right the first time” started to be replaced with some brutal realism about how hard it is to write C code right the first time. Maybe that realism is what we needed to start thinking about runtime defenses as something more than jokes told on a mailing list.
Why firewalls?
Another question that may come to mind is: Why were these ideas occurring on the “firewalls” list, and not within the programming languages community? The proposals are germane to language and compiler design, after all. The author closing out the email thread wondered the same thing:
All this said, methinks this is getting a bit far afield from _firewalls_.
and further discussion is probably more appropriate in one of the ‘comp.lang.*’
newsgroups.
One possibility is that the firewalls community was closer to the exploits in the wild than the language community: they could identify the properties of the systems that the exploits relied on, and could conceive, at a high level, how those systems could be changed to break those dependencies while preserving the original semantics of the system.
Today, many of the developers of the next generation of runtime security systems either are, or closely collaborate with, leading security experts who understand how modern exploits work. Indeed, as Mike has written before, there can be great synergy when applying programming languages ideas to problems identified by other fields, such as cryptography and machine learning.
The Future
Reading this, you might be inspired to ponder which technologies seem like impossible long shots today but could become relevant in 20 years. We’d love to see your predictions in the comments!
Here are some of our off-the-top-of-our-heads predictions:
- Verification for the rest of us. Verification has long existed as, charitably, a pipe dream, for many reasons: verification tools are slow and hard to use, it can be difficult to figure out how to phrase the properties you want to prove, and there always seems to be a gap between the proof and the implementation. However, systems like Frama-C, F*, Bedrock, and Dafny seem to be closing these gaps, both in terms of usability and by strongly connecting the proof with the implementation. Projects like Ironclad and seL4, and funding efforts like HACMS, are pushing the state of the art. (See the sketch after this list for a small taste.)
- Code injection will end. We have been plagued by low-level security vulnerabilities that allow an adversary to inject a whole new program into a vulnerable one, usually lumped together in the literature as buffer overflows. Technologies like those mentioned here have been closing the door on code injection attacks; e.g., the prevalence of stack-based attacks has greatly diminished in recent years. This trend will continue. This does not end the security story, but the concerns 20 years in the future should be very different from the concerns of today.
- Information disclosure will remain. Much more difficult than proving basic safety properties about C programs is proving properties about how programs control and release the information they are charged with, such as (variants of) non-interference. Timing- and resource-usage-based information leaks have received relatively little attention in research (which motivated the recent DARPA STAC program). Perhaps, if cloud computing continues to be adopted, disclosures of small secrets or keys will be all that attackers really need.
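As a small taste of the “verification for the rest of us” bullet above, here is our sketch of a Frama-C/ACSL contract for a bounds-safe read. The function and names are hypothetical, but the annotation style is the one such tools check directly against the implementation:

    /* Sketch: an ACSL contract (as checked by Frama-C) saying the read below
       stays in bounds and returns exactly the requested element. */
    /*@ requires \valid_read(a + (0 .. n-1));
        requires i < n;
        assigns \nothing;
        ensures \result == a[i];
    */
    int read_at(const int *a, unsigned n, unsigned i) {
        return a[i];
    }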
two small corrections: the mentioned ASLR bruteforce paper is wrong, most of its statements are flat out false and don’t support the paper’s claims. as for CFI, as far as i know, it was first described in 2003 by yours truly: https://pax.grsecurity.net/docs/pax-future.txt (and interestingly enough, the Gleipnir project guys had been aware of PaX at the time yet made no mention of this obvious prior art). i’d also say that there’s no longer a concern with performance vs. precision, we can have both nowadays.
Very interesting! Can you elaborate on the falseness of the Shacham et al. paper’s claims, and the reasons you think the precision/performance problem for CFI is solved (or point me to longer-form posts that do)? For the latter, my impression has been that there is a progression of papers that one-up each other: propose a weaker property that provides good performance, but then (the following year) show attacks that exploit the added weakness.
it’s probably worth an entire blog to describe all its failings (though http://seclists.org/dailydave/2004/q4/104 is a good start), but if you read the paper, think about why they could avoid the stack entropy and whether that’s a generic (‘always present’) condition or not and why there’s no elaboration on it in the paper. then all the speculation on brute force prevention shows a complete misunderstanding of how cpu exceptions and signal delivery work, etc.
as for practical CFI, there’s my work (RAP) that i presented on at H2HC’15 but there’re also academic papers now that achieve not too bad performance on a subset of indirect control flows (e.g., VTrust or ShrinkWrap).
Thanks for the pointers. I’m not familiar with the academic papers you mention, but on a quick glance it seems that they don’t cover the whole CFG; i.e., they weaken precision in some sense by focusing only on VTables. This more limited focus probably improves performance. But perhaps they do it in a way that’s (finally) sufficient in a practical sense; I don’t know. I’ll also check out your H2HC’15 work. BTW, it was interesting to read over your 2003 “future.txt” post — I do see the connection to CFI ideas, e.g., the enforcement mechanism in c.2. Didn’t know about it before; thanks for sharing!
Great ideas, including those in PaX, come up again and again. For example, there’s some nice work from 2000 called CFCSS (control flow checking by software signatures). I think that your mention of “as far as you knew” highlights the fundamental issue. A lot of great ideas continually resurface because people didn’t know that the prior work existed.
Did you know that I also invented control-flow integrity? When I told my Ph.D advisor of my brilliant idea they disabused me of my ignorance.
Nice! Looking forward to learning more about things I didn’t realize were already known! 🙂
i had actually known about CFCSS and many other schemes that went back to the 90’s and even earlier (not surprisingly the reliability guys had an earlier exposure to the side effects of random memory corruption than the security guys with targeted attacks) and no, i don’t consider it CFI exactly because it had no attacker model that we have for CFI (e.g., CFCSS does the signature check *after* the control transfer, not *before*, it’s obviously circumventible by a targeted attack). a cousin? perhaps. prior art? not so much.
You’re right about CFCSS not being CFI. I mentioned it because, as you pointed it, it’s a cousin and many of the right ideas (though not the whole package as you say) were floating around at the time. Productizing an exploit mitigation is hard.
I had to think about the obviousness of integrity checking at the destination as being broken. In their case you are certainly right. This leads my mind to a hypothetical system where I want to toy with the idea of source-checking (at the target), rather than target-checking (at the source).
Some ideas from Vx32/NaCl about masking the targets of all indirect control flows to be at a certain alignment come to mind. You could further mask and displace the targets (a kind of hash) so that you could locate indirect targets in one area of a binary. This would probably suck for the icache, but it seems like a good first step in checking at the destination rather than at the source. What do you think of this type of approach? Is it something that, during the design of your PaX approach, you considered and dismissed?
Hey Pax Team. Long time fan.
I definitely would have cited your pax-futures doc in the CFI paper, if I had known about it. At the time, I was familiar with your NX and ASLR techniques; sorry to have missed your futures doc.
It’s a shame that PaX doesn’t get more credit for its seminal security mitigations. Kudos on your work!
P.S. Also thought Hovav’s paper vastly undersold ASLR, when it came out.
I’d be really interested in hearing opinions on http://threatspec.org (FOSS project I started) from this type of community. Not so much academic as it is JFDI.
Thanks for mentioning the Eternal War in Memory and VTrust. (There’s a typo: it’s Eternal, not External.)
For VTrust, we think the main advantage is its very low complexity (i.e., per-unit compilation, support for shared libraries, no whole-program analysis needed), combined with low performance overhead and good precision for virtual calls. VTrust will ideally be combined with another strong solution for the remaining indirect calls. MCFI (and follow ups) are great regarding precision and reasonable regarding performance but require some more heavy-weight analysis and changes to the loader to do the online set merging.
It’s been interesting how far CFI and related indirect CF-transfer checks have been pushed in recent years and I’m looking forward to see the next wave of improvements.
Michael, regarding your one-upping comment: I think CFI development went down a path of “we need less performance overhead and can sacrifice precision” at one point. The attack papers then showed that we cannot lower precision without significantly reducing the security guarantees.
PaX Team, I’ll follow Ulfar and thank you for the work you did. I agree that it’s somewhat underappreciated in academia. The RAP presentation is interesting and I’ve been eagerly waiting for a paper that describes it in more detail (or the sources). Are there any write-ups about the heap protection? Just by looking at the presentation I cannot be sure of all the gory details! 🙂
Nice read indeed! Obligatory self-advertisement, but the historical overview of memory errors in http://vvdveen.com/publications/RAID2012.pdf seems worth mentioning here. Then there is also http://vvdveen.com/graphs/webpage.html with more detailed graphs (unfinished though, and not up to date).
The graphs are awesome! I would have one feature request: can you also attribute the other bugs to different bug classes (instead of just total)? 🙂
Thanks 🙂 RE request: I’d have to look at the remaining CVE’s and figure out which keywords I can use to classify one as type x. That is certainly possible, yes, but the question is whether I can find the time… This project has been on my todo list for some years now, maybe it’s time to pick it up again.
Great overview! It’s interesting to see how the limitations of the “firewall approach” were known to so many so early on. In the future, I hope that we can also look beyond research that narrowly focuses on hardening C and C++ programs against memory-based attacks.
The SoK paper, “Eternal War in Memory,” is misnamed: The war will end when we finally build secure systems that are memory safe, using conventional, typed languages that already exist today, or the languages of the future that they inspire (such as Rust, F* and others). Further, we don’t even need static type systems, assuming that you don’t mind a little more space/time overhead. The problem is C and C++, and we can and should solve it culturally, without more security or language technology.
I’m tired of hearing the knee-jerk responses that seem all too commonplace. Here are a few that I hear all of the time: (1) “Legacy software” and (2) “programmers can’t use real types”, or “they don’t want to do so” and/or (3) “memory safe languages without static types are too slow.”
To (1), I would point out to all of the security research mentioned above and say, aren’t we falling victim to the sunk-cost fallacy? And if we haven’t yet reached that point as a CS research community, then when will we have finally crossed that line?
To (2), I would just say: “Yes, they can,” especially if you devote time to learning about types in undergrad instead of baroque pointer-based reasoning in C and friends, and “Yes, they should.” I’d prefer to interact with a type checker than with a giant testing suite running in Valgrind, and I think many others feel the same. I agree with the forward-looking sentiment above under “Verification for the rest of us”. But I would add that we already have quite enough of this today in simple type systems, like that of ML and friends.
To (3), many (or most?) memory-safe languages don’t require static typing. If you want safety and you don’t care as much about a lean program, then use one of those languages. If you want both safety and leanness, use a language like Rust, or some variant of ML.
Finally, the best reason to say goodbye to C and C++ is so that those of us who care about software security can actually study interesting, open problems, like the post mentions under the heading “Information disclosure will remain”. Agreed! And I would add, studying the leaking of information is easier when the model is high-level, and not burdened by the out-dated baggage of languages without real abstractions, or the ability to create and enforce them.
Types and verification are the answer, and they will save us from another decade or two of boring research about how to keep hobbling along with our broken language designs from the 1970s.