Session memory – who’s this guy named Max and what’s he doing with my memory?

SQL Server MVP Jonathan Kehayias (blog) emailed me a question last week when he noticed that the total memory used by the buffers for an event session was larger than the value he specified for the MAX_MEMORY option in the CREATE EVENT SESSION DDL. The answer seems like an excellent subject to kick off my new “401 – Internals” tag, which identifies posts where I pull back the curtain a bit and let you peek at what’s going on inside the Extended Events engine.

In a previous post (Option Trading: Getting the most out of the event session options) I explained that we use a set of buffers to store event data before we write that data to asynchronous targets. The MAX_MEMORY option, along with MEMORY_PARTITION_MODE, determines how big each buffer will be. Theoretically, that means I can predict the size of each buffer using the following formula:

max memory / # of buffers = buffer size

If it was that simple I wouldn’t be writing this post.

I’ll take “boundary” for 64K, Alex

For a number of reasons that are beyond the scope of this post, we create event buffers in 64K chunks. The result is that the buffer size indicated by the formula above is rounded up to the next 64K boundary, and that rounded size is used to create the buffers. If you think visually, this means that a graph of your max_memory option against the resulting buffer size looks like a set of stairs rather than a smooth line. You can see this behavior in the output of sys.dm_xe_sessions, specifically the fields related to buffer sizes, over a range of different memory inputs:

Note: This test was run on a 2-core machine using per_cpu partitioning, which results in 5 buffers. (See my previous post referenced above for the math behind the buffer count.)

| input_memory_kb | total_regular_buffers | regular_buffer_size | total_buffer_size |
|-----------------|-----------------------|---------------------|-------------------|
| 637             | 5                     | 130867              | 654335            |
| 638             | 5                     | 130867              | 654335            |
| 639             | 5                     | 130867              | 654335            |
| 640             | 5                     | 196403              | 982015            |
| 641             | 5                     | 196403              | 982015            |
| 642             | 5                     | 196403              | 982015            |

This is just a segment of the results, showing one of the “jumps” between buffer boundaries, at inputs of 639 KB and 640 KB. You can verify the 64K boundary by doing the math on the regular_buffer_size field, which is reported in bytes:

196403 – 130867 = 65536 bytes

65536 / 1024 = 64 KB
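To make the rounding rule concrete, here is a small sketch of the arithmetic (Python used only for the calculation; the behavior of stepping up even at an exact 64K multiple is inferred from the jump at 640 KB in the output above, and note that dm_xe_sessions reports slightly less than these raw sizes because of management overhead, as clarified later in this post):

```python
CHUNK = 64 * 1024  # event buffers are created in 64K chunks

def raw_buffer_size(max_memory_kb, buffer_count):
    """Raw per-buffer allocation: max_memory divided across the buffers,
    then rounded up to the next 64K boundary. An exact multiple still
    steps up to the next boundary (inferred from the jump at 640 KB)."""
    per_buffer = max_memory_kb * 1024 / buffer_count
    if per_buffer < CHUNK:
        # below the 64K-per-buffer minimum; session creation fails
        raise ValueError("max_memory too small for this buffer count")
    return (int(per_buffer // CHUNK) + 1) * CHUNK

# 2-core machine with per_cpu partitioning -> 5 buffers (table above)
print(raw_buffer_size(639, 5))  # 131072 (2 x 64K)
print(raw_buffer_size(640, 5))  # 196608 (3 x 64K) -- the jump
```

The same function reproduces the NONE-mode range table further down: with 3 buffers, the steps land on multiples of 192 KB.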

The point at which regular_buffer_size jumps from one 64K boundary to the next depends on the number of buffers being created, and the number of buffers depends on the partition mode you choose. If you choose any partition mode other than NONE, the number of buffers will depend on your hardware configuration. (Again, see the earlier post referenced above.) With the default partition mode of NONE you always get three buffers, regardless of machine configuration, so I generated a “range table” for max_memory settings between 1 KB and 4096 KB as an example.

| start_memory_range_kb | end_memory_range_kb | total_regular_buffers | regular_buffer_size | total_buffer_size |
|-----------------------|---------------------|-----------------------|---------------------|-------------------|
| 1                     | 191                 | NULL                  | NULL                | NULL              |
| 192                   | 383                 | 3                     | 130867              | 392601            |
| 384                   | 575                 | 3                     | 196403              | 589209            |
| 576                   | 767                 | 3                     | 261939              | 785817            |
| 768                   | 959                 | 3                     | 327475              | 982425            |
| 960                   | 1151                | 3                     | 393011              | 1179033           |
| 1152                  | 1343                | 3                     | 458547              | 1375641           |
| 1344                  | 1535                | 3                     | 524083              | 1572249           |
| 1536                  | 1727                | 3                     | 589619              | 1768857           |
| 1728                  | 1919                | 3                     | 655155              | 1965465           |
| 1920                  | 2111                | 3                     | 720691              | 2162073           |
| 2112                  | 2303                | 3                     | 786227              | 2358681           |
| 2304                  | 2495                | 3                     | 851763              | 2555289           |
| 2496                  | 2687                | 3                     | 917299              | 2751897           |
| 2688                  | 2879                | 3                     | 982835              | 2948505           |
| 2880                  | 3071                | 3                     | 1048371             | 3145113           |
| 3072                  | 3263                | 3                     | 1113907             | 3341721           |
| 3264                  | 3455                | 3                     | 1179443             | 3538329           |
| 3456                  | 3647                | 3                     | 1244979             | 3734937           |
| 3648                  | 3839                | 3                     | 1310515             | 3931545           |
| 3840                  | 4031                | 3                     | 1376051             | 4128153           |
| 4032                  | 4096                | 3                     | 1441587             | 4324761           |

As you can see, there are 21 “steps” within this range. max_memory values below 192 KB fall below the 64K-per-buffer limit, so they generate an error when you attempt to specify them.

Clarification: I got another question about this topic last week, this time from a student in one of Jonathan Kehayias' classes. The student noticed that if you take the size of the buffer (regular_buffer_size) and divide it by 64 KB, you don't get the nice even boundary that I am claiming above, and wondered, "What gives?" A reasonable question.

What gives is that regular_buffer_size reports the usable buffer space; that is, the space left after we subtract a small amount of memory for the overhead of managing the buffers. You'll notice that if you take any regular_buffer_size and divide it by 64 KB you'll have the same fraction left over every time: 0.99687194824219. That fraction is the overhead.
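You can check this arithmetic yourself. A quick sketch (Python, operating only on regular_buffer_size numbers taken from the tables above) shows that every reported size sits the same distance short of a 64K boundary, which works out to 205 bytes of per-buffer overhead:

```python
CHUNK = 64 * 1024

# regular_buffer_size values taken from the tables above (bytes)
sizes = [130867, 196403, 261939, 327475, 393011, 1441587]

for s in sizes:
    leftover = (s % CHUNK) / CHUNK  # fraction past the last full 64K chunk
    overhead = CHUNK - (s % CHUNK)  # distance to the next 64K boundary
    print(leftover, overhead)
# every size leaves the same fraction (0.9968719482421875),
# i.e. 205 bytes of management overhead per buffer
```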

Max approximates True as memory approaches 64K

The upshot of this is that the max_memory option is not a contract for the maximum memory that will be used for the session buffers. (Those of you who read Take it to the Max (and beyond) know that max_memory really refers only to the event session buffer memory.) It's more of an estimate: the total buffer size is the nearest 64K multiple above max_memory divided by the buffer count, times the number of buffers you have. The largest delta between your max_memory setting and the true total buffer size occurs right after you cross a 64K boundary. For example, if you set max_memory = 576 KB (see the 576 KB row in the range table), your actual total buffer size will be closer to 767 KB in a non-partitioned event session. You get “stepped up” once for every 192 KB block of max_memory, which isn't likely to cause a problem on most machines.

Things get more interesting when you consider a partitioned event session on a computer that has a large number of logical CPUs or NUMA nodes. Since each buffer gets “stepped up” when you cross a boundary, the delta can get much larger because it's multiplied by the number of buffers. For example, a machine with 64 logical CPUs will have 160 buffers using per_cpu partitioning, or, if you have 8 NUMA nodes configured on that machine, 24 buffers using per_node. If you've just crossed a 64K boundary and been “stepped up” to the next buffer size, you'll end up with a total buffer size approximately 10240 KB or 1536 KB, respectively (64K * number of buffers), larger than the max_memory value you might think you're getting. Using per_cpu partitioning on a large machine has the most impact because of the large number of buffers created. If the amount of memory being used in these ranges matters on your system, this is worth paying attention to when you configure your event sessions.
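The worst-case deltas described above are easy to work out. Here's a quick sketch (Python for the arithmetic only; the buffer counts come from the paragraph above, and the round-up-to-the-next-64K rule is inferred from the tables; the DMV will report slightly less than these raw totals because of the per-buffer overhead):

```python
CHUNK = 64 * 1024  # buffers are allocated in 64K chunks

def total_buffer_bytes(max_memory_kb, buffer_count):
    # per-buffer allocation rounds up to the next 64K boundary (inferred rule)
    per_buffer = (int(max_memory_kb * 1024 / buffer_count // CHUNK) + 1) * CHUNK
    return per_buffer * buffer_count

# NONE mode (3 buffers): ask for 576 KB, the session actually allocates ~768 KB
print(total_buffer_bytes(576, 3) // 1024)  # 768

# Worst-case extra memory right after a boundary: one 64K step per buffer
print(160 * 64)  # 10240 KB -- per_cpu partitioning, 64 logical CPUs (160 buffers)
print(24 * 64)   # 1536 KB  -- per_node partitioning, 8 NUMA nodes (24 buffers)
```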

The DMV sys.dm_xe_sessions is the tool to use to identify the exact buffer size for your sessions. In addition to the regular buffers (read: event session buffers), you'll also see details for the large buffers if you have configured MAX_EVENT_SIZE. The “buffer steps” for any given hardware configuration should be static within each partition mode, so if you want a handy reference for configuring your event sessions, you can use the following code to generate a range table, like the one above, that applies to your specific machine and chosen partition mode.

DECLARE @buf_size_output table (
    input_memory_kb bigint,
    total_regular_buffers bigint,
    regular_buffer_size bigint,
    total_buffer_size bigint)
DECLARE @buf_size int, @part_mode varchar(8), @session nvarchar(max)
SET @buf_size = 1          -- Set to the beginning of your max_memory range (KB)
SET @part_mode = 'per_cpu' -- Set to the partition mode for the table you want to generate

WHILE @buf_size <= 4096    -- Set to the end of your max_memory range (KB)
BEGIN
    BEGIN TRY
        IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'buffer_size_test')
            DROP EVENT SESSION buffer_size_test ON SERVER

        SET @session = 'create event session buffer_size_test on server
            add event sql_statement_completed
            add target ring_buffer
            with (max_memory = ' + CAST(@buf_size AS nvarchar(10)) + ' KB, memory_partition_mode = ' + @part_mode + ')'

        EXEC sp_executesql @session

        SET @session = 'alter event session buffer_size_test on server state = start'

        EXEC sp_executesql @session

        INSERT @buf_size_output (input_memory_kb, total_regular_buffers, regular_buffer_size, total_buffer_size)
        SELECT @buf_size, total_regular_buffers, regular_buffer_size, total_buffer_size
        FROM sys.dm_xe_sessions
        WHERE name = 'buffer_size_test'
    END TRY
    BEGIN CATCH
        -- max_memory values that are too small fail to create a session;
        -- record them with NULL buffer details
        INSERT @buf_size_output (input_memory_kb)
        SELECT @buf_size
    END CATCH

    SET @buf_size = @buf_size + 1
END

IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'buffer_size_test')
    DROP EVENT SESSION buffer_size_test ON SERVER

SELECT MIN(input_memory_kb) AS start_memory_range_kb,
       MAX(input_memory_kb) AS end_memory_range_kb,
       total_regular_buffers,
       regular_buffer_size,
       total_buffer_size
FROM @buf_size_output
GROUP BY total_regular_buffers, regular_buffer_size, total_buffer_size
ORDER BY start_memory_range_kb

Thanks to Jonathan for an interesting question and a chance to explore some of the details of Extended Event internals.

- Mike