Using MAP files - part 1
Back in February, the Doctor talked about manually unwinding stacks. MAP files are your best tool for resolving the code addresses you find on the stack when you do this. I was recently working on an issue that required converting a lot of unresolved stacks and trying to pin down what was happening on several devices. Doing this reminded me of all the tricks, and the Doctor and I decided we should document them for everyone.
In part one we’re going to talk about what a MAP file is. In part two, we’ll talk about how you can use this information for resolving stacks.
Today, symbol files contain a lot of information, including extras like source line information. To get pretty much ALL of that data in human readable form, you would need both MAP and COD files. Most people don’t want COD files getting out because they contain their source code. MAP files are just the basics: offsets for functions, globals, and other data. That’s plenty of information for what we need to do, which is resolve a stack, and you should already have them in your flat release directory.
I'm going to use COREDLL as an example (I’ve randomly trimmed this a lot). Highlighted text will be discussed after the example:
Coredll <<<< the module name
Timestamp is d600142f <<<< timestamp when it was built
Preferred load address is 10000000 <<<< Where the module wants to load. Don’t trust this!
>>>> Begin information about the sections in this module.
Start Length Name Class
0001:00000000 00005c44H .rdata CODE
0001:00005c44 00000024H .rdata$debug CODE
0001:00005c68 000003c4H .rdata$r CODE
0001:0000602c 00065338H .text CODE
0001:0006c1f0 0000c36eH .edata CODE
0002:00000000 00000004H .CRT$XCA DATA
0002:00000004 00000004H .CRT$XCAA DATA
0002:00000028 000009acH .data DATA
0002:000009e0 00000384H .bss DATA
0003:00000000 00005310H .pdata DATA
0004:00000000 000002b0H .rsrc$01 DATA
>>>> Begin actual symbolic information
Address Publics by Value Rva+Base Lib:Object
0000:00000000 ___safe_se_handler_count 00000000 <absolute>
0000:00000000 ___safe_se_handler_table 00000000 <absolute>
0001:00000030 ??_7exception@std@@6B@ 10001030 coredll_ALL:stdexcpt.obj
0001:00000174 ??_C@_17LDADEION@?$AAI?$AAM?$AAE?$AA?$AA@ 10001174 coredll_ALL:Imm.obj
0001:0000017c ??_C@_13COJANIEC@?$AA0?$AA?$AA@ 1000117c coredll_ALL:Imm.obj
0001:00000180 ??_C@_15KNBIKKIN@?$AA?$CF?$AAd?$AA?$AA@ 10001180 coredll_ALL:Imm.obj
0001:00000434 ??_7logic_error@std@@6B@ 10001434 coredll_ALL:string.obj
>>>> Ok, enough of that. It’s ugly and not relevant to us. Further in, we start seeing the following:
0001:00000960 cszTimeZones 10001960 coredll_ALL:time.obj
0001:00000978 NormalYearDaysBeforeMonth 10001978 coredll_ALL:time.obj
0001:00000994 LeapYearDaysBeforeMonth 10001994 coredll_ALL:time.obj
0001:000009b0 NormalYearDayToMonth 100019b0 coredll_ALL:time.obj
0001:00000b20 LeapYearDayToMonth 10001b20 coredll_ALL:time.obj
<<<< Those are global variables. I’ll tell you how I knew that soon.
>>>> Now things get interesting. Notice that in the 4th column there is an “f”? That stands for function. (that’s how I knew the ones above were globals... no “f”)
0001:0000602c mbstowcs 1000702c f coredll_ALL:coredll.obj
0001:00006114 wcstombs 10007114 f coredll_ALL:coredll.obj
0001:00006210 RegisterDlgClass 10007210 f coredll_ALL:coredll.obj
0001:000062a4 CoreDllInit 100072a4 f coredll_ALL:coredll.obj
>>>> More functions, but they look kinda funny. That’s because they are “decorated” or “mangled”. More on this in part two.
0001:00006338 ??0exception@std@@QAA@XZ 10007338 f coredll_ALL:stdexcpt.obj
0001:00006354 ??0exception@std@@QAA@PBD@Z 10007354 f coredll_ALL:stdexcpt.obj
0001:000063a0 ??0exception@std@@QAA@ABV01@@Z 100073a0 f coredll_ALL:stdexcpt.obj
0001:00006404 ??1exception@std@@UAA@XZ 10007404 f coredll_ALL:stdexcpt.obj
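The symbol lines in the example above can be picked apart mechanically. The following is a minimal sketch in Python, not a complete MAP parser: the names `SYMBOL_LINE` and `parse_symbol_line` are made up for illustration, and the column layout is assumed to match the excerpt shown here. It also tolerates an `i` flag after the `f`, which is an assumption that goes beyond the trimmed listing.

```python
import re

# Matches lines shaped like the excerpt above, for example:
#   0001:00006114 wcstombs 10007114 f coredll_ALL:coredll.obj
# The optional "f" marks a function; a line without it is data (a global).
# Assumption beyond the excerpt: an "i" flag may follow the "f".
SYMBOL_LINE = re.compile(
    r"^\s*(?P<seg>[0-9a-fA-F]{4}):(?P<off>[0-9a-fA-F]{8})\s+"
    r"(?P<name>\S+)\s+(?P<rva_base>[0-9a-fA-F]{8})\s+"
    r"(?:(?P<flag>f)\s+)?(?:i\s+)?(?P<obj>\S+)\s*$"
)

def parse_symbol_line(line):
    """Return a dict describing one symbol line, or None if it is not one."""
    m = SYMBOL_LINE.match(line)
    if m is None:
        return None
    return {
        "segment": int(m.group("seg"), 16),
        "offset": int(m.group("off"), 16),
        "name": m.group("name"),
        "rva_plus_base": int(m.group("rva_base"), 16),
        "is_function": m.group("flag") is not None,
        "object": m.group("obj"),
    }

func = parse_symbol_line("0001:00006114 wcstombs 10007114 f coredll_ALL:coredll.obj")
glob = parse_symbol_line("0001:00000960 cszTimeZones 10001960 coredll_ALL:time.obj")
header = parse_symbol_line("Address Publics by Value Rva+Base Lib:Object")
print(func["name"], func["is_function"])   # wcstombs True
print(glob["name"], glob["is_function"])   # cszTimeZones False
print(header)                              # None
```

Header and section lines simply fail to match and come back as `None`, so the same function can be run over every line of a MAP file.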
I said above that you should not trust the “Preferred load address”. There is a simple reason for this. It is preferred. If I do a findstr on MAP files in one of my flat release directories for the string “Preferred load address is 10000000”, I get back 582 hits. We know that all those files aren’t loading at the same address. Some binaries will be rebased to a specific pre-determined address, others will be dynamically rebased when they are loaded. In part two, I will talk about ways to determine the true load address for a module.
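The findstr check can be approximated in a script. This is only a sketch: `preferred_load_address` and `tally_preferred_addresses` are hypothetical helper names, and the header wording is assumed to be exactly the "Preferred load address is ..." line shown in the example.

```python
import re
from collections import Counter

# Header line as it appears in the example: "Preferred load address is 10000000"
PREFERRED = re.compile(r"Preferred load address is ([0-9a-fA-F]+)")

def preferred_load_address(map_text):
    """Return the preferred load address from a MAP file's text, or None."""
    m = PREFERRED.search(map_text)
    return int(m.group(1), 16) if m else None

def tally_preferred_addresses(map_texts):
    """Count how many MAP files claim each preferred load address."""
    addresses = (preferred_load_address(text) for text in map_texts)
    return Counter(addr for addr in addresses if addr is not None)

sample = "Coredll\nTimestamp is d600142f\nPreferred load address is 10000000\n"
counts = tally_preferred_addresses([sample, sample, "not a MAP file"])
print(hex(preferred_load_address(sample)))  # 0x10000000
print(counts[0x10000000])                   # 2
```

To run this over a real flat release directory, you could feed it the text of each `*.map` file. A large count for a single address shows the same thing the findstr hits do: the preferred address says nothing about where a module really loads.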
Above you will also notice the addresses in the first column that look like this: 0001:00005c44. This is a segmented address. Without going into too much explanation, a segmented address is an offset relative to a segment, written in the form segment:offset. In this case, the address represents an offset of 0x5c44 bytes into segment number 0x0001. So, if segment 0x0001 began at 0x01000000, this would represent an address of 0x01005c44.
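As a small illustration of the segment:offset arithmetic, here is a Python sketch. The function names and the `hypothetical_starts` table are made up for the example; the segment start of 0x01000000 is the same "what if" value used in the paragraph above, not a real load address.

```python
def parse_segmented(addr):
    """Split 'ssss:oooooooo' into (segment, offset) integers."""
    seg_text, off_text = addr.split(":")
    return int(seg_text, 16), int(off_text, 16)

def resolve_segmented(addr, segment_starts):
    """Add the offset to the start address of the segment it belongs to."""
    segment, offset = parse_segmented(addr)
    return segment_starts[segment] + offset

# Hypothetical: pretend segment 0x0001 begins at 0x01000000.
hypothetical_starts = {0x0001: 0x01000000}
resolved = resolve_segmented("0001:00005c44", hypothetical_starts)
print(format(resolved, "#010x"))  # 0x01005c44
```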
Another term above that should be defined is RVA+Base. RVA is the Relative Virtual Address. Base is the Preferred Load Address I already told you about. So, let’s look at the following line:
Address Publics by Value Rva+Base Lib:Object
0001:00006114 wcstombs 10007114 f coredll_ALL:coredll.obj
Here, the RVA+Base is 0x10007114. We know from the top of the file that the Preferred load address is 0x10000000. Subtracting that, we are left with 0x7114. This is the RVA: the offset from the base of the module to the beginning of the wcstombs function. We can learn something else here as well. The segmented address is 0001:00006114. If we subtract 0x6114 from 0x7114 we get 0x1000. Why does this difference exist? Well, 0001:00006114 is segmented. It is an offset into segment 0x0001. We now know that segment 0x0001 begins at 0x1000, which is 4 KB past the base of the module.
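The same subtraction can be written out as a short Python sketch. The helper names are hypothetical; the numbers are taken directly from the wcstombs and mbstowcs lines in the example.

```python
def rva_from_map(rva_plus_base, preferred_base):
    """Strip the preferred load address to get the RVA."""
    return rva_plus_base - preferred_base

def segment_start_rva(rva, segment_offset):
    """Infer where a segment begins from one symbol's RVA and its offset in that segment."""
    return rva - segment_offset

PREFERRED_BASE = 0x10000000  # "Preferred load address" from the top of the MAP file

wcstombs_rva = rva_from_map(0x10007114, PREFERRED_BASE)
segment_1_start = segment_start_rva(wcstombs_rva, 0x6114)
print(hex(wcstombs_rva))     # 0x7114
print(hex(segment_1_start))  # 0x1000

# Cross-check with another symbol in segment 0x0001 (mbstowcs at 0001:0000602c).
mbstowcs_rva = rva_from_map(0x1000702C, PREFERRED_BASE)
print(hex(segment_start_rva(mbstowcs_rva, 0x602C)))  # 0x1000
```

Every symbol in segment 0x0001 of the example gives the same segment start, which is a quick sanity check that the arithmetic is being applied to the right columns.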
Ok. So now you’ve seen a MAP file, and have a couple of clues about what’s in it. In part two, we will talk about using this information to resolve call stacks.
Comments
Anonymous
November 04, 2006
Good article, Where is the next part?
Anonymous
November 21, 2006
Also see http://blogs.msdn.com/sloh/archive/2005/02/28/381706.aspx Though I never talked about global vars at all. I always meant to post about how to go the other way (figure out where in RAM a particular symbol was, which is more useful for global variables) but never got around to it. Sue
Anonymous
July 02, 2007
Where can I find part2, thanks!
Anonymous
January 18, 2010
Maybe this guy quit from M.S, because the E-mail address not work; but thanks to him too;
Anonymous
June 21, 2011
can u tell how to get the actual address of the global variables and is it possible to set the values of variables based on this address