Examine the layout of managed strings in memory
Suppose you wrote some C# code like this:
var str1 = "ThisIsAString";
var str2 = "ThisIsAnotherString";
As you’d expect, each string is stored in the resulting built binary and also in memory when the binary is loaded, resulting in 2 separate strings.
Now suppose you wrote this code instead:
var str1 = "ThisIsAString";
var str2 = "ThisIsAString";
The strings are identical: would there be a copy of each?
Here’s one way to find out:
Start VS 2010
File->New->Project->C#-> Windows Console Application. Paste in the code sample below.
Hit F11 to step into the program, then Ctrl-F11 (or right-click->Go to disassembly) to see the code being executed.
Open a couple of windows:
Debug->Windows->Registers
Debug->Windows->Memory 1
Right-click the Memory window and choose to show the memory contents as 4-byte integers.
Step to the line indicated (after register eax has been set to the string’s address).
Then drag the EAX value (0x023c96e4 in my sample) to the Memory window.
You can actually see (and optionally change!) the string “ThisIsAString” in memory as UTF-16 Unicode: each character takes 2 bytes, and for these ASCII characters every other byte is 0.
System.String is a CLR class. For every CLR object, the first integer is the ClassId (a pointer to the type’s method table): System.String has ClassId = 51acfb08 in this process. The ClassId could be different when you restart the process.
The next integer (0x0000000d = 13) is the length of the string: 13 Unicode characters, which occupy 26 bytes at 2 bytes per character. Following that, you can see “ThisIsAString” in memory.
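To cross-check what the Memory window shows, here is a minimal sketch that prints the length field and the character bytes from managed code. The class name StringLayoutSketch and the helper Utf16Bytes are made up for this illustration. The sketch only reproduces the character data, not the ClassId that precedes it in memory.

```csharp
using System;
using System.Text;

public static class StringLayoutSketch
{
    // Hypothetical helper: returns the little-endian UTF-16 bytes of the
    // characters, i.e. the data that follows the length field in memory.
    public static byte[] Utf16Bytes(string s)
    {
        return Encoding.Unicode.GetBytes(s);
    }

    public static void Main()
    {
        string s = "ThisIsAString";
        byte[] bytes = Utf16Bytes(s);

        // Prints "Length in characters: 13 (0x0000000d)"
        Console.WriteLine("Length in characters: {0} (0x{0:x8})", s.Length);

        // Prints "Length in bytes:      26"
        Console.WriteLine("Length in bytes:      {0}", bytes.Length);

        // Hex dump of the character bytes; every second byte is 00
        Console.WriteLine(BitConverter.ToString(bytes));
    }
}
```

The hex dump should line up with what follows the length in the Memory window, although the window groups the bytes into 4-byte integers, so each group reads in reversed byte order.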
So now you know how to examine the code that gets executed when handling strings.
The variable “str2” is assigned the same string value. Is that value stored twice in memory? From the assembly code, we see that register EAX is set to the same value (03392088h) for “str2” as “str1”, so we know the two identical strings have been coalesced.
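The same conclusion can be checked without the debugger by comparing object references. This is a rough sketch rather than a replacement for looking at the disassembly. The class name CoalesceSketch and the helper SameObject are invented for the example, and the string built from a char array is added purely as a contrast case.

```csharp
using System;

public static class CoalesceSketch
{
    // Hypothetical helper: true only if both variables point at the same object.
    public static bool SameObject(string a, string b)
    {
        return object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        var str1 = "ThisIsAString";
        var str2 = "ThisIsAString";

        // Built at run time from a char array, so it is a separate object
        // even though it holds the same characters.
        var built = new string("ThisIsAString".ToCharArray());

        Console.WriteLine("str1 and str2 same object:  {0}", SameObject(str1, str2));
        Console.WriteLine("str1 and built same object: {0}", SameObject(str1, built));
        Console.WriteLine("str1 == built (by value):   {0}", str1 == built);

        // string.Intern returns the pooled instance for these characters.
        Console.WriteLine("str1 and Intern(built):     {0}", SameObject(str1, string.Intern(built)));
    }
}
```

If the two literals really are coalesced, the first line reports True, while the run-time copy is a different object that still compares equal by value.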
Another way to see the code: open a Visual Studio Command Prompt and use ILDasm:
ildasm "C:\Users\calvinh\AppData\Local\Temporary Projects\ConsoleApplication1\bin\Debug\ConsoleApplication1.exe"
Exercises:
1. Try running this with Project->Properties->Build->Platform Target x64
2. Try using a Release build, rather than Debug (the default)
<Code>
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.Reflection;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var str1 = "ThisIsAString";
            var str2 = "ThisIsAString"; // 2nd copy to see compiler coalesce
            var str3 = "";
            var str4 = new String('a', 0);
            var str5 = "" + "";
            var str6 = "This" + "IsAString";
        }
    }
}
</Code>
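The sample also declares str3 through str6, which you can step through in the same way. As a quicker first look, the sketch below compares their references from managed code. The class name SampleStringsSketch and the helper Same are assumptions made for this illustration. The C# compiler folds constant expressions such as "This" + "IsAString" into a single literal, so I would expect the first two comparisons to report True. What the empty-string comparisons print depends on the runtime, so try them and see.

```csharp
using System;

public static class SampleStringsSketch
{
    // Hypothetical helper: reference comparison, not value comparison.
    public static bool Same(string a, string b)
    {
        return object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        var str1 = "ThisIsAString";
        var str3 = "";
        var str4 = new String('a', 0);      // zero repetitions of 'a'
        var str5 = "" + "";                 // constant expression
        var str6 = "This" + "IsAString";    // constant expression

        Console.WriteLine("str6 same object as str1:         {0}", Same(str6, str1));
        Console.WriteLine("str5 same object as str3:         {0}", Same(str5, str3));
        Console.WriteLine("str4 same object as str3:         {0}", Same(str4, str3));
        Console.WriteLine("str3 same object as String.Empty: {0}", Same(str3, String.Empty));
    }
}
```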