Security Improvements to the Whidbey Compiler

I’ve been away from an Internet connection over the last
few days. After a conference in the Netherlands, I visited my sister in Germany.
She’s stationed at Spangdahlem Air Base, which happened to be where the Air
Force started using Xbox Live. The German countryside is amazing. I spent most of
my time on the air base, but had time for a trip to Aachen.

Anyways, I was invited to speak about some Visual C++
security features at the Netherlands Unix Users Group’s autumn conference. When
I’m not working on language design, one of the things I like doing is looking at
security problems and doing something about them. When I started in Visual C++,
Louis Lafreniere and Phil Lucido had created a feature known as “security
checks” (commonly referenced as the “/GS switch”). This debuted in the Visual
C++ 2002 release. I wrote an article that described how the feature worked in
that release.

At the end of 2001, I had a number of discussions with
Louis and Phil about how we could improve security checks. As buffer overruns
were found, we saw that it was often possible to circumvent the security checks
architecture. There were a few cases where the VC 2002 implementation would have
prevented arbitrary code from running, but it was clearly possible to do better
in general. The discussion between Louis, Phil, and me led to a number of ideas,
some of which were introduced to Visual C++ 2003. The main thing that VC 2003
did was sort the local variables so that buffers were allocated in memory
addresses higher than other local variables. This prevents local variables from
being overwritten by a buffer overrun, thus avoiding attacks like pointer
subterfuge and v-table hijacking.
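
To make the effect of sorting concrete, here is a minimal sketch of a toy frame
model. It is not compiler output: the Slot type, the clobbered function, and the
slot names and sizes used with it are all made up for illustration. Slots are
listed from low address to high address, and an overrun is assumed to write
upward from the start of the buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One named region of a simulated stack frame (hypothetical model).
struct Slot {
    std::string name;
    std::size_t size;  // size in bytes
};

// `frame` lists slots from LOW address to HIGH address. Writing `bytesWritten`
// bytes starting at the slot named `buffer` can only damage slots that come
// after the buffer in the list. Returns the names of the damaged slots.
std::vector<std::string> clobbered(const std::vector<Slot>& frame,
                                   const std::string& buffer,
                                   std::size_t bytesWritten) {
    std::vector<std::string> hit;
    std::size_t i = 0;
    while (i < frame.size() && frame[i].name != buffer) ++i;
    assert(i < frame.size() && "buffer must be a slot in the frame");

    if (bytesWritten <= frame[i].size) return hit;  // the write fit in the buffer
    std::size_t excess = bytesWritten - frame[i].size;
    for (++i; i < frame.size() && excess > 0; ++i) {
        hit.push_back(frame[i].name);
        excess -= std::min(excess, frame[i].size);
    }
    return hit;
}
```

With a 16-byte buffer placed below a 4-byte function pointer, a 20-byte write
damages the pointer. With the buffer sorted above the pointer, the same write
reaches the cookie instead, which is exactly what the security check looks at.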

Our discussions also showed that the security checks
architecture was unable to prevent attacks that exploited exception handling
(such as Code Red). This is because a security check that determines whether a
cookie changed happens at the end of a function call in the function epilog.
Exceptions allow a program to choose control flow that avoids returning from a
function. What makes exception handling exploitable is that exception
information is placed on the stack (this is done for historical reasons and
performance). I’ll spend more time talking about exception handling later (as it
will be useful for understanding /EHs and /EHa). If a buffer overrun is able to
overwrite exception handling information on the stack, that EH info need not
belong to the vulnerable function; it can be somewhere earlier in the call
stack. As many system libraries make use of
exception handling, nearly every program will have some exception records in the
call stack.
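
The control flow problem can be sketched in a few lines of C++. This is only an
illustration of why a check placed at the end of a function is skipped when an
exception leaves the function; it uses ordinary C++ exceptions and a flag as a
stand-in for the cookie comparison, and the function names are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in for a function protected by a security check. `checked` plays the
// role of the cookie comparison that the compiler places in the epilog.
void vulnerable(bool raise, bool& checked) {
    // ... function body, where a buffer overrun could have happened ...
    if (raise) throw std::runtime_error("simulated fault");
    checked = true;  // reached only when the function returns normally
}

// Calls vulnerable() and reports whether the epilog check ran.
bool epilogCheckRan(bool raise) {
    bool checked = false;
    try {
        vulnerable(raise, checked);
    } catch (const std::runtime_error&) {
        // Control arrives here without passing through the epilog check.
    }
    return checked;
}
```

On a normal return the check runs. When the function is left by an exception,
the check never runs, so whatever the handler dispatch reads from the stack is
used without the cookie ever being compared.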

One thing that makes exceptions unique on Windows is that
the operating system provides the infrastructure to make exceptions work. For us
to make code resilient to attacks against exception handling we needed support
from the operating system. Bryan Tuttle, a build engineer in Windows, suggested
creating tables of exception thunks that Windows could use to validate the EH
record. This suggestion was developed into the feature known as “Safe
Exceptions”. Many people were involved to make this work, including Richard
Shupak, Dan Spalding, Louis Lafreniere, Phil Lucido, and Bryan Tuttle. This
feature debuted with the improvements to /GS in VC 2003. The operating system
infrastructure for safe exceptions was introduced in Windows Server 2003.

The VC 2003 release had a short development cycle, so not
all of our ideas to improve /GS were implemented. The Whidbey product cycle gave
us the opportunity to do more. The biggest improvement to /GS is the protection
for vulnerable function parameters. To understand this, it’s helpful to see what
the stack layout looks like before this change. From high memory to low memory,
this is what shows up in a function activation record:

      Function arguments
      Return address
      Frame pointer
      Cookie
      EH record
      Local buffers
      Local variables
      Callee save registers

If a buffer overrun occurs, it is possible to overwrite the
EH record, the cookie, the frame pointer, the return address, the function
arguments, and function activation records earlier in the call stack. The EH
record is protected by safe exceptions, the cookie can only be exploited by
using a value that matches the value in the __security_cookie variable, the
frame pointer is only useful after the function returns, and the return address
is only exploitable at the function return after the security check has already
taken place. Thus, during the execution of the function these parts of the
function activation record are not exploitable; however, the function arguments
are used by the code in the function. VC 2003 did nothing to protect the
function arguments.

The Whidbey compiler addresses this by
identifying vulnerable arguments and copying those arguments to memory addresses
lower than the local buffers. This is done in the function prolog. The code of
the function then makes use of the copy of the function argument rather than the
original argument. We often refer to this as parameter shadowing. This yields
the following stack layout:

      Function arguments
      Return address
      Frame pointer
      Cookie
      EH record
      Local buffers
      Local variables and copies of vulnerable parameters
      Callee save registers

Why only copy some parameters? Making copies of parameters
has a performance impact. Just as only vulnerable functions have cookies and
code injected for the security check, only vulnerable parameters in vulnerable
functions will be copied. What makes an argument vulnerable? Basically, the set
is defined inductively: pointers, and structures that contain vulnerable
members. The actual implementation of this feature does a more in-depth analysis that
includes other factors to identify vulnerable parameters. Of course, there are
parameters that are vulnerable but cannot be moved, such as non-POD C++ objects.
Ultimately, the choice of copying vulnerable parameters is heuristic with a goal
of balancing performance with real security mitigation benefits.
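
As a rough sketch of the inductive definition only (the real analysis considers
more factors, as noted above), the rule can be written as a recursive predicate
over a toy type description. The Type struct and isVulnerable function are
hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <vector>

// Toy description of a parameter's type (hypothetical, for illustration).
struct Type {
    enum Kind { Scalar, Pointer, Struct } kind;
    std::vector<Type> members;  // used only when kind == Struct
};

// Base case: a pointer is vulnerable.
// Inductive case: a structure is vulnerable if any member is vulnerable.
bool isVulnerable(const Type& t) {
    switch (t.kind) {
        case Type::Pointer:
            return true;
        case Type::Struct:
            for (const Type& member : t.members)
                if (isVulnerable(member)) return true;
            return false;
        case Type::Scalar:
            return false;
    }
    return false;
}
```

Under this simplified rule, a structure holding only integers is left alone,
while a structure that nests another structure with a pointer in it is a
candidate for copying.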

This improvement makes it more difficult to use out
parameters and pass-by-reference variables to circumvent the security checks
architecture. For example, in VC 2002 an out parameter that was changed by a
buffer overrun to point to the __security_cookie variable would make it possible
for an attacker to get a predictable cookie value, thus preventing the security
check in the function epilog from triggering. This then opens the possibility
for easier arbitrary code exploits such as stack smashing. The Whidbey compiler
will not make use of the original out parameter, so that approach to
circumventing the security checks architecture will not work.
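
The out parameter attack and the effect of the copy can be modeled with a small
simulation. This is a sketch under stated assumptions, not how the compiler
lays out or checks a real frame: the Frame struct, the cookie value, and the
overrunGoesUndetected function are invented for the example, and the overrun is
simulated by assigning to the fields that sit above the buffer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the interesting parts of one activation record,
// listed from low address to high address. The buffer itself is omitted; it
// would sit between shadowOut and cookie.
struct Frame {
    std::uint32_t* shadowOut;    // copy of the parameter, below the buffer
    std::uint32_t  cookie;       // frame copy of the cookie, above the buffer
    std::uint32_t* originalOut;  // the parameter as passed, above the cookie
};

// Returns true when the epilog check passes, meaning the overrun is NOT caught.
bool overrunGoesUndetected(bool useShadow) {
    std::uint32_t globalCookie = 0x12345678u;  // stand-in for __security_cookie
    std::uint32_t result = 1;                  // where the caller wants the output

    // Prolog: store the cookie and copy the vulnerable parameter.
    Frame f{&result, globalCookie, &result};

    // Simulated overrun: everything above the buffer is attacker controlled.
    // The attacker writes the value the function is known to store (0 here)
    // over the frame cookie and points the out parameter at the global cookie.
    f.cookie = 0;
    f.originalOut = &globalCookie;

    // Body: the function stores a status of 0 through its out parameter.
    std::uint32_t* out = useShadow ? f.shadowOut : f.originalOut;
    *out = 0;

    // Epilog: compare the frame cookie with the global cookie.
    return f.cookie == globalCookie;
}
```

Without the copy, the store through the corrupted pointer makes the global
cookie equal to the attacker's value, so the comparison passes. With the copy,
the store goes to the intended destination, the global cookie is untouched, and
the comparison fails.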

The recently announced service pack to Windows XP and
Windows Server 2003 will be built with a compiler that includes these Whidbey
improvements. This will help improve Windows’s resilience in the event of a
buffer overrun, and will hopefully mitigate the harm a buffer overrun can incur.
It is our hope that a service pack to the VC 2003 compiler will include these
updates, and we are currently investigating how that may be possible.

Another change we are making in Whidbey is that /GS will be
the default behavior in the compiler. This follows from the trustworthy
computing pillars that software should be secure by default. All Microsoft
software is built with /GS. By making this the default, turning off security
checks requires an explicit action that can be found by grep of build logs. That
makes audits of code bases easier. The Visual C++ team has spent an enormous
amount of effort to make /GS useful for retail software, and the fact that
Windows, Office, SQL Server, and Visual Studio, among other products, build with
it shows that it is an effective feature.
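
As a sketch of what such an audit could look like without grep, the function
below scans the text of a build log for the explicit opt-out, /GS-. The function
name and the shape of the log are assumptions for this example, and a plain
substring match like this one can report false positives if the same characters
appear inside some other token.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Returns the 1-based line numbers of build-log lines that mention "/GS-".
std::vector<int> findGsDisabled(const std::string& log) {
    std::vector<int> hits;
    std::istringstream in(log);
    std::string line;
    for (int lineNumber = 1; std::getline(in, line); ++lineNumber) {
        if (line.find("/GS-") != std::string::npos) hits.push_back(lineNumber);
    }
    return hits;
}
```

Because the checks are on by default, a log line that never mentions /GS at all
is still protected; only lines reported by the scan need a closer look.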

I have left out a number of details. There are a number of
other improvements to /GS that we are making. I’ve been saying this for a year
now, but I do hope to have time to write a revision to the white paper I wrote
in February 2002. In the meantime, know that /GS is improving with each release
of Visual C++.