Devenv stealth feature, Poor Man's Profiler

This feature is unsupported. It is undocumented. If you call help desk about it nice people will say “huh?”  If it breaks, people will nod knowingly and say “I told you so”. In later versions it may simply disappear.  It could decay right now as I speak and explode! (maybe not.)

 

One more disclaimer ... it is Native Only, it does not work for managed debugging.

 

The @clk 'pseudo register' has been around for a while.  If you have VS7.0 or greater you can use this.  It is the poor mans profiler.  Stick @clk in the watch window and you get a tick count.  Each time you step it will be updated.  Put @clk=0 in the next slot of the watch window and the first @clk will show the time (including debugger overhead) since the last stop.

 

But wait there's more!  With the advent of tracepoints you can get @clk on steroids.  Set a bp, then right click on the red bp circle and go to "When hit".  In the dialog check the "Print message" check box and fill in this: {@clk},{@clk=0} $FUNCTION.  Now open the output window and run.  You will get a list of tickcount deltas between each tracepoint hit. 

 

Here is an example I took for devenv rendering a callstack with 8 frames:

 

406298197,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

111466,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

6228,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

4657,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

5933,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

8049,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

5844,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

92580,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long,g unsigned int)

 

You'll notice the first tick count is meaningless, the rest give relative times.  Here we can see the time between the 1st and 2nd and the time between the 7th and 8th calls took substantially more time.  Looking at the callstack frames generated by those two calls each required function evaluation.  I went to the callstack window and turned off 'Show parameter values'.  The refresh of the callstack without function evaluation gave me some new data:

 

732630352,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

3452,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

5125,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

5117,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

2648,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

5406,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

4818,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

5432,0 CDebugStackFrame::GetArgs(unsigned short * *, unsigned long, unsigned int)

 

Note the $FUNCTION is optional.  Here is what I get when I remove that bit:

With Function evaluation:

1122488927,0

82541,0

4880,0

5737,0

2480,0

6899,0

4696,0

83810,0

 

Without function evaluation:

8114431,0

5796,0

4868,0

4858,0

2352,0

5077,0

4676,0

5413,0

 

So, how long is a 'tickcount'? You need to ask the hardware (QueryPerformanceFrequency) as the QueryPerformanceCounter api this uses is hardware dependent.

 

Also as a side note the numbers above are from a debug build of devenv running a checked version of the clr.  I've had testers mock me for how slow my machine appears with a checked clr, so don't bother reading too much into those values.