SQL Server MVP Jonathan Kehayias (blog) emailed me a question last week when he noticed that the total memory used by the buffers for an event session was larger than the value he specified for the MAX_MEMORY option in the CREATE EVENT SESSION DDL. The answer here seems like an excellent subject for me to kick off my new “401 – Internals” tag that identifies posts where I pull back the curtains a bit and let you peek into what’s going on inside the extended events engine.
In a previous post (Option Trading: Getting the most out of the event session options) I explained that we use a set of buffers to store event data before it is written to the asynchronous targets. The MAX_MEMORY option, along with MEMORY_PARTITION_MODE, determines how big each buffer will be. Theoretically, that means I can predict the size of each buffer using the following formula:
max memory / # of buffers = buffer size
If it were that simple, I wouldn’t be writing this post.
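Just to put numbers on that naive formula before we break it: the sketch below assumes a non-partitioned session, which uses three buffers (the Option Trading post walks through where the buffer counts come from), and simply spreads max_memory evenly across them.
-- Naive estimate only: spread max_memory evenly across the buffers.
-- Assumes a non-partitioned session with 3 buffers (per the Option Trading post).
DECLARE @max_memory_kb int = 4096, @buffer_count int = 3
SELECT @max_memory_kb / @buffer_count AS naive_buffer_size_kb  -- 1365 KB per buffer, give or take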
I’ll take “boundary” for 64K, Alex
For a number of reasons that are beyond the scope of this post, we create event buffers in 64K chunks. The result is that the buffer size indicated by the formula above is rounded up to the next 64K boundary, and that is the size used to create the buffers. If you think visually, this means that a graph of your max_memory option against the actual resulting buffer size looks like a set of stairs rather than a smooth line. You can see this behavior by looking at the output of sys.dm_xe_sessions, specifically the buffer-size columns, over a range of different memory inputs:
[Table: max_memory input (KB) against total_regular_buffers, regular_buffer_size, and total_buffer_size as reported by sys.dm_xe_sessions.]
Max approximates True as memory approaches 64K
The upshot is that the max_memory option is not a contract for the maximum memory that will be used for the session buffers. (Those of you who read Take it to the Max (and beyond) know that max_memory refers only to the event session buffer memory.) It is better read as an estimate of the total buffer size, rounded up to the next multiple of 64K times the number of buffers you have. The maximum delta between your max_memory setting and the true total buffer size occurs right after you break through a 64K boundary; for example, if you set max_memory = 576 KB (see the green line in the table), your actual total buffer size will be closer to 767 KB in a non-partitioned event session. You get “stepped up” for every 191 KB block of initial max_memory, which isn’t likely to cause a problem for most machines.
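If you want to see the stair-step numerically, here is a rough model of the round-up in T-SQL. It assumes a non-partitioned session with 3 buffers and ignores the engine’s small per-buffer overhead, so the real values reported by the DMV (like the 767 KB above) land just under the exact 64K multiples this produces, and the step boundaries shift down slightly.
-- Rough model of the 64K stair-step for a non-partitioned session (3 buffers assumed).
-- Per-buffer overhead is ignored, so actual DMV values sit a little below these figures.
DECLARE @buffers int = 3
SELECT  mm.max_memory_kb,
        CAST(CEILING((mm.max_memory_kb / CAST(@buffers AS float)) / 64) * 64 AS int)            AS approx_buffer_size_kb,
        CAST(CEILING((mm.max_memory_kb / CAST(@buffers AS float)) / 64) * 64 AS int) * @buffers AS approx_total_kb
FROM (VALUES (128), (200), (400), (600), (1024)) AS mm(max_memory_kb)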
Things get more interesting when you consider a partitioned event session on a computer with a large number of logical CPUs or NUMA nodes. Since each buffer gets “stepped up” when you break a boundary, the delta can get much larger because it’s multiplied by the number of buffers. For example, a machine with 64 logical CPUs will have 160 buffers using per_cpu partitioning, and if you have 8 NUMA nodes configured on that machine you would have 24 buffers using per_node. If you’ve just broken through a 64K boundary and been “stepped up” to the next buffer size, your total buffer size can be approximately 10240 KB or 1536 KB, respectively (64K * # of buffers), larger than the max_memory value you might think you’re getting. Using per_cpu partitioning on a large machine has the most impact because of the large number of buffers created. If the memory used within these ranges matters on your system, this is worth paying attention to when you configure your event sessions.
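Estimating that worst case is simple arithmetic: roughly one 64K step per buffer. A quick sketch using the buffer counts from the examples above (and 3 for a non-partitioned session):
-- Worst-case memory above max_memory: about one 64 KB step per buffer.
-- Buffer counts taken from the examples above; 'none' assumes a non-partitioned session's 3 buffers.
SELECT  pm.partition_mode,
        pm.buffer_count,
        pm.buffer_count * 64 AS worst_case_extra_kb
FROM (VALUES ('none',     3),
             ('per_node', 24),   -- 8 NUMA nodes
             ('per_cpu',  160))  -- 64 logical CPUs
     AS pm(partition_mode, buffer_count)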
The DMV sys.dm_xe_sessions is the tool to use to identify the exact buffer size for your sessions. In addition to the regular buffers (read: event session buffers), you’ll also see the details for large buffers if you have configured MAX_EVENT_SIZE. The “buffer steps” for any given hardware configuration should be static within each partition mode, so if you want a handy reference for configuring your event sessions, you can use the following code to generate a range table, like the one above, that applies to your specific machine and chosen partition mode.
DECLARE @buf_size_output table (input_memory_kb bigint, total_regular_buffers bigint, regular_buffer_size bigint, total_buffer_size bigint)
DECLARE @buf_size int, @part_mode varchar(8)
SET @buf_size = 1 -- Set to the beginning of your max_memory range (KB)
SET @part_mode = 'per_cpu' -- Set to the partition mode for the table you want to generate
WHILE @buf_size <= 4096 -- Set to the end of your max_memory range (KB)
BEGIN
BEGIN TRY
IF EXISTS (SELECT * from sys.server_event_sessions WHERE name = 'buffer_size_test')
DROP EVENT SESSION buffer_size_test ON SERVER
DECLARE @session nvarchar(max)
SET @session = 'create event session buffer_size_test on server
add event sql_statement_completed
add target ring_buffer
with (max_memory = ' + CAST(@buf_size as nvarchar(4)) + ' KB, memory_partition_mode = ' + @part_mode + ')'
EXEC sp_executesql @session
SET @session = 'alter event session buffer_size_test on server
state = start'
EXEC sp_executesql @session
INSERT @buf_size_output (input_memory_kb, total_regular_buffers, regular_buffer_size, total_buffer_size)
SELECT @buf_size, total_regular_buffers, regular_buffer_size, total_buffer_size
FROM sys.dm_xe_sessions WHERE name = 'buffer_size_test'
END TRY
BEGIN CATCH
-- If this max_memory value is rejected by the DDL, record the input with no buffer details
INSERT @buf_size_output (input_memory_kb) VALUES (@buf_size)
END CATCH
SET @buf_size = @buf_size + 1
END
IF EXISTS (SELECT * from sys.server_event_sessions WHERE name = 'buffer_size_test')
DROP EVENT SESSION buffer_size_test ON SERVER
SELECT * FROM @buf_size_output
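If you just want to check a session that is already running rather than generate the whole range table, you can query the DMV directly; substitute your own session name. The large-buffer columns only come into play when MAX_EVENT_SIZE has caused large buffers to be created.
-- Buffer details for a running session; swap in the name of your event session.
-- The default system_health session is a convenient target that exists on every instance.
SELECT  name,
        total_regular_buffers,
        regular_buffer_size,   -- size of each regular buffer, in bytes
        total_large_buffers,   -- populated when MAX_EVENT_SIZE results in large buffers
        large_buffer_size,
        total_buffer_size      -- total buffer memory for the session, in bytes
FROM sys.dm_xe_sessions
WHERE name = 'system_health'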