Annotations - yet more help finding buffer overflows

Last time I talked about how we used template overloads to help automatically transform safe calls to strcpy into strcpy_s. But not all calls to strcpy are safe, of course. Consider this code:

void GetIntegratedCutlery(char *out)
{
      strcpy(out, “spork”);
}

In Visual Studio 2005, even with template overloads enabled, this will give you a deprecation warning telling you it isn’t safe. strcpy isn’t told how big out is going to be.

Let’s imagine an improved version of the function

void GetIntegratedCutleryEx(char *out, size_t size)
{
      strcpy_s(out, size, “spork”);
}

This function is now at least safe – strcpy_s has required us to tell strcpy how large the buffer out is.

A criticism I’ve heard of this kind of fix is that we have just ‘moved’ the problem, since now we have to ensure that the value passed to size is correct. This criticism has some validity – nothing about this code ensures that size has a correct value. One can argue that developers, especially maintenance ones, are more likely to get size correct when the have to write it explicitly. But this isn’t the strongest argument for this change. The best argument is that the buffer size is now explicit in the code and can be reasoned about by the toolset (as well as by developers themselves).

You may have noticed that the standard library headers have gotten much bigger in Visual C++ 2005. Compare the declaration of strstr from VC++ 2003:

_CRTIMP char * __cdecl strstr(const char *, const char *);

with the one from VC++ 2005:

_CRTIMP __checkReturn _CONST_RETURN char * __cdecl strstr(__in_z const char * _Str, __in_z const char * _SubStr);

There are three major changes here:

  • _CONST_RETURN – this was added to bring us closer into conformance with the C++ standard.

  • Parameter names (_Str, _SubStr) – these were added to improve the intellisense user experience in the IDE, as well as for readability

  • Code annotations (__checkReturn, __in_z) – these annotations allow analysis tools to understand the intent of code better, and detect more issues

Annotations are the important ones for our purposes. __in_z actually tells the compiler quite a lot about the parameters of a function [see sal.h for the full definition], but at a high level it says “this is a input string (__in) that is null terminated (_z)”.

Annotations are used to find problems when you throw the /analyze switch on the compiler. For example, consider this function:

#include <stdlib.h>
#include <string.h>
#include <wchar.h>

wchar_t *AllocateAndFillW(size_t n, wchar_t c) throw(...)
{
      wchar_t *retVal=(wchar_t *)malloc(n*sizeof(wchar_t));   

      wmemset(retVal, c, n*sizeof(wchar_t)); /* line 9 */

return retVal;
}

You can probably spot the bug on line 9 easily, but /analyze can do it for you:

f:\an.cpp(9) : warning C6383: buffer overrun due to conversion of an element count into a byte count: an element count is expected for parameter '3' in call to'wmemset'

[Note that /analyze is only supported in the Enterprise (team development) versions of the product]

This happens because we’ve added annotations to malloc and wmemset in the CRT headers. /analyze uses those to see the problem. It sees that malloc returns __bcount_opt(_Size) a writable block of size bytes (bcount) which may be null (_opt). It sees that wmemset takes __out_ecount_full(_N) a writable buffer of _N elements. It then does the math and notices that the *sizeof(wchar_t) on line 9 is wrong and reports the problem. It found a buffer overrun for you.

Returning to our original example, there is no way the compiler can tell us whether this function has an overrun because it doesn’t know about the relationship between out and size

void GetIntegratedCutleryEx(char *out, size_t size)
{
      strcpy_s(out, size, “spork”);
}

However, we can teach it about this relationship:

void GetIntegratedCutleryEx(__out_ecount_z(size) char *out, __in size_t size)
{
      strcpy_s(out, size, “spork”);
}

And then when a user writes a bad piece of code:

void DeluxeExtendedCutleryEnumerationProviderManager(void)
{
      char b[3];

      GetIntegratedCutleryEx(b, _countof(b)+1);
}

You get a nice error

f:\an.cpp(23) : warning C6386: buffer overrun: accessing 'argument 1', the writable size is '3' bytes, but '4' bytes may be written: Lines: 21, 23

You can see how the deprecation warnings we add in the libraries (that encourage you to convert from strcpy to strcpy_s), and the annotations we added in the headers (which ensure you pass our functions the correct size) can be augmented by the annotations you write in your own code to further reduce the chance you’ll write errors (especially buffer overrun errors) in your code.

More next time. Looking forward to hearing from you.

Martyn