Bagikan melalui


How It Works (It Just Runs Faster): Auto Soft NUMA…

Back in June, Bob Dorr gave you the thinking behind why SQL Server 2016 It Just Runs Faster due to changes in our SQL Server and NUMA configurations. Bob blogged about a new SQL Server 2016 feature called Auto Soft NUMA. As I’ve hit the road and talked more about SQL Server 2016: It Just Runs Faster, I’ve received some questions about how Auto Soft NUMA works and how to recognize how SQL Server detects a NUMA configuration and makes decisions to partition it even further.

The best way to give you the inside details on how this work is to use an example. I have a machine I use from time to time in our labs at Microsoft that has the following specifications:

4 physical NUMA nodes – 1 socket per NODE

18 cores per socket (or NODE)

Hyper threading Enabled = 36 logical processors per socket (or NODE)

144 total logical processors as exposed by the Windows OS and seen by SQL Server.

The SQL ERRORLOG provides some very interesting information about what it detects from “asking” Windows OS about the machine configuration and understanding the licensing of SQL Server. (Standard Edition and CAL based licensing can restrict how many processors SQL Server can use.  Standard is discussed later in this post. You can find out more on CAL licensing at https://www.microsoft.com/en-us/sql-server/sql-server-2016-pricing). For this machine, the ERRORLOG reports:

SQL Server detected 4 sockets with 18 cores per socket and 36 logical processors per socket, 144 total logical processors; using 144 logical processors based on SQL Server licensing. This is an informational message; no user action is required.

I’m running SQL Server 2016 Enterprise Edition (Core Licensing) and so I get full access to the compute resources from this machine. To understand how Auto Soft NUMA will partition this configuration into soft nodes, let’s look at an ERRORLOG entry showing the node to CPU mapping with Auto Soft NUMA disabled. Auto Soft NUMA is enabled by default in SQL Server 2016 but can be disabled by executing ALTER SERVER CONFIGURATION SET SOFTNUMA OFF and then restarting the SQL Server instance (Note: there is a fix in SQL Server 2016 CU1 you need to apply when using this command: https://support.microsoft.com/kb/3158710).

In the ERRORLOG you may have seen entries that look like this:

Node configuration: node 0: CPU mask: 0x0000000fffffffff:0 Active CPU mask: 0x0000000fffffffff:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 1: CPU mask: 0x0000000fffffffff:1 Active CPU mask: 0x0000000fffffffff:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 2: CPU mask: 0x0000000fffffffff:2 Active CPU mask: 0x0000000fffffffff:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 3: CPU mask: 0x0000000fffffffff:3 Active CPU mask: 0x0000000fffffffff:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.

For this machine, this tells us how SQL Server has mapped CPU to NODEs and which schedulers are active to be used (using AFFINITY can affect what is active and specifying affinity settings would change the Active CPU mask). For my machine with Auto Soft NUMA disabled, I see 4 nodes and all the logical processors active. But how do I know which ones? Let’s look at the first line to break this down

Node configuration: node 0: CPU mask: 0x0000000fffffffff:0 Active CPU mask: 0x0000000fffffffff:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.

CPUs are numbered from 0 to N and you read this ordering from right to left. So the value 0x0000000fffffffff (Note the :0 is the processor group number) is ‭111111111111111111111111111111111111‬in binary.  Let’s actually use the calculator program in Windows 10 to look at this a different way. Use the Programmer mode of the calculator and copy in the value in Hex but use the “Bit toggling keypad” option

image

Notice the numbers underneath the bits. They can help you see the CPU numbers (starting with 0) up to 63. Start at the bottom right corner and read right to left and up. This tells us that for Node 0 on Processor Group 0, CPUs 0 – 35 (36 logical CPUs) belong to Node 0.  Since each node here is on its own processor group, its CPU numbers start with 0. So you can break out the same CPU numbers for each node in the ERRORLOG list. What about DMVs? Do they show this same mapping?

Take a look at the results of this query:

select node_id, memory_node_id, online_scheduler_mask, cpu_affinity_mask
from sys.dm_os_nodes
where node_id != 64

image

As with the ERRORLOG, there are 4 nodes. The online_scheduler_mask and cpu_affinity_mask should line up with what the ERRORLOG showed above (online_scheduler_mask = Active). Except this time, we show you the mask in decimal. So take that decimal number and plug it into the calc app and look at the values in binary. It looks like same as when you did it with 0000000fffffffff

BTW. Your NUMA system may not have different processor groups for each node. Here is another example from another two node system I have with one processor group

Node configuration: node 0: CPU mask: 0x0000000000000fff:0 Active CPU mask: 0x0000000000000fff:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 1: CPU mask: 0x0000000000fff000:0 Active CPU mask: 0x0000000000fff000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.

This is a two node system with 6 cores per node (with HT 12 logical CPUs per node). So for Node 0, 0x0000000000000fff is CPU 0-11 but what about the 2nd line?. What does 0000000000fff000 mean? Look at this in calc and you can see:

image

The bits start with 1 at position 12 and go to 23. This means Node 1 has CPUs 12 – 23 mapped into that node. So pretty simple. Break out the bits and look for the 1s. Then find their ordinal position to know what CPU numbers they correspond to.

So back to our original server. Let’s turn Auto Soft NUMA back on and look at the ERRORLOG

First, we see a message like this (Note: We made a fix in 2016 SP1 because this used to say 8 logical processors)

Automatic soft-NUMA was enabled because SQL Server has detected hardware NUMA nodes with greater than 8 physical cores.

And the node mapping has apparently gone crazy

Node configuration: node 0: CPU mask: 0x0000000000015555:0 Active CPU mask: 0x0000000000015555:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 1: CPU mask: 0x0000000555540000:0 Active CPU mask: 0x0000000555540000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 2: CPU mask: 0x000000000002aaaa:0 Active CPU mask: 0x000000000002aaaa:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 3: CPU mask: 0x0000000aaaa80000:0 Active CPU mask: 0x0000000aaaa80000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 4: CPU mask: 0x0000000000015555:1 Active CPU mask: 0x0000000000015555:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 5: CPU mask: 0x0000000555540000:1 Active CPU mask: 0x0000000555540000:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 6: CPU mask: 0x000000000002aaaa:1 Active CPU mask: 0x000000000002aaaa:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 7: CPU mask: 0x0000000aaaa80000:1 Active CPU mask: 0x0000000aaaa80000:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 8: CPU mask: 0x0000000000015555:2 Active CPU mask: 0x0000000000015555:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 9: CPU mask: 0x0000000555540000:2 Active CPU mask: 0x0000000555540000:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 10: CPU mask: 0x000000000002aaaa:2 Active CPU mask: 0x000000000002aaaa:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 11: CPU mask: 0x0000000aaaa80000:2 Active CPU mask: 0x0000000aaaa80000:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 12: CPU mask: 0x0000000000015555:3 Active CPU mask: 0x0000000000015555:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 13: CPU mask: 0x0000000555540000:3 Active CPU mask: 0x0000000555540000:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 14: CPU mask: 0x000000000002aaaa:3 Active CPU mask: 0x000000000002aaaa:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 15: CPU mask: 0x0000000aaaa80000:3 Active CPU mask: 0x0000000aaaa80000:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.

Wow. Why did SQL Server take this 4 node system and now split into 16 nodes? Well the principle behind Auto Soft NUMA is that we want to try and get as close to 8 CPUs per node because traditionally 8 CPUs per NODE is a scalability threshold. Let’s take the first four Soft Nodes and see how SQL Server split this up:

Node configuration: node 0: CPU mask: 0x0000000000015555:0 Active CPU mask: 0x0000000000015555:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 1: CPU mask: 0x0000000555540000:0 Active CPU mask: 0x0000000555540000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 2: CPU mask: 0x000000000002aaaa:0 Active CPU mask: 0x000000000002aaaa:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 3: CPU mask: 0x0000000aaaa80000:0 Active CPU mask: 0x0000000aaaa80000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.

The CPU mask technique I showed you earlier can be mapped out using calculator. Using the same techniques as I did above, lets look at the first node 0 CPU mask of 0000000000015555

image

You can see there are 9 CPUs that have values of 1 but they are not consecutive. In fact, CPU 0, 2, 4, 6, 8, 10, 12, 14, and 16 are the active CPUs for this new soft NODE.  If you look at Node 1 you will see that CPUs 18, 20, 22, 24, 26, 28, 30, 32, and 34 are active. What about CPU 1, 3, …17? They belong to Node 2 and Node 3 has CPU 19, 21, … 35. So in total the first 4 soft nodes have 36 logical CPUs but each node alternates the CPU mapped to the perspective soft node. Why? This is because we want to spread out the logical CPUs for each core (Ex. CPU 0 and 1 belong to a physical core) across the soft nodes. In addition, we want to avoid putting CPUs from the same core close to each other so Node 1 starts with CPU 18 not 1. Now if you have read anything about NUMA, one of the key concepts is to schedule work on CPUs on the local node to allocate memory in a fast manner (aka. avoid remote or foreign memory access). Therefore, all of these soft nodes are still mapped to their corresponding physical node. You can see this mapping from sys.dm_os_nodes. Let’s run the same query as we did when soft NUMA was OFF:

image

You can see that node_id 0-3 all map to memory_node_id (physical node) 0. So all of the CPUs mapped to these soft nodes still are mapped to their corresponding physical node.

What about Standard Edition?

With all the buzz of our announcement on SQL Server 2016 SP1 and new features that are enabled on Standard Edition, I thought it would be interesting for you to see how Standard Edition looks on this machine and what is possible.

Standard Edition is limited to 24 cores (but HT is “free” so 48 logical processors). I installed a Standard Edition instance on this same machine and with Auto Soft NUMA ON the ERRORLOG looks like this:

SQL Server detected 4 sockets with 18 cores per socket and 36 logical processors per socket, 144 total logical processors; using 48 logical processors based on SQL Server licensing. This is an informational message; no user action is required.

You can see even though all computing resources are detected only 48 logical processors can be used.

Auto Soft NUMA is still applicable but since we can only use 48 logical processors, only the first 48 CPUs are active. The mapping looks like this

Node configuration: node 0: CPU mask: 0x0000000000015555:0 Active CPU mask: 0x0000000000015555:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 1: CPU mask: 0x0000000555540000:0 Active CPU mask: 0x0000000555540000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 2: CPU mask: 0x000000000002aaaa:0 Active CPU mask: 0x000000000002aaaa:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 3: CPU mask: 0x0000000aaaa80000:0 Active CPU mask: 0x0000000aaaa80000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 4: CPU mask: 0x0000000000015555:1 Active CPU mask: 0x0000000000000555:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 5: CPU mask: 0x0000000555540000:1 Active CPU mask: 0x0000000000000000:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 6: CPU mask: 0x000000000002aaaa:1 Active CPU mask: 0x0000000000000aaa:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 7: CPU mask: 0x0000000aaaa80000:1 Active CPU mask: 0x0000000000000000:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 8: CPU mask: 0x0000000000015555:2 Active CPU mask: 0x0000000000000000:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 9: CPU mask: 0x0000000555540000:2 Active CPU mask: 0x0000000000000000:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 10: CPU mask: 0x000000000002aaaa:2 Active CPU mask: 0x0000000000000000:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 11: CPU mask: 0x0000000aaaa80000:2 Active CPU mask: 0x0000000000000000:2. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 12: CPU mask: 0x0000000000015555:3 Active CPU mask: 0x0000000000000000:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 13: CPU mask: 0x0000000555540000:3 Active CPU mask: 0x0000000000000000:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 14: CPU mask: 0x000000000002aaaa:3 Active CPU mask: 0x0000000000000000:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 15: CPU mask: 0x0000000aaaa80000:3 Active CPU mask: 0x0000000000000000:3. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.

Everything looks the same as with Enterprise Edition up to Node 4. But Node 4-6 looks like this:

Node configuration: node 4: CPU mask: 0x0000000000015555:1 Active CPU mask: 0x0000000000000555:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 5: CPU mask: 0x0000000555540000:1 Active CPU mask: 0x0000000000000000:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 6: CPU mask: 0x000000000002aaaa:1 Active CPU mask: 0x0000000000000aaa:1. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.

 

The first four nodes make up 36 logical CPUs. But since we are capped at 48, we need to map 12 more CPUs but keep our rules as we did before (I.e. spread out CPUs from same core, …). So Node 4 has 6 CPUs making up CPU 0, 2, 4, 6, 8, and 10 from Processor Group 1.  Node 5 is empty and Node 6 has CPU 1, 3, 5, 7, 9, and 11. One of the reasons that all the other nodes are available but not used is to support dynamic affinity. You are free to spread out your affinity for SQL Server among any of the CPUs on any of these nodes provided you don’t exceed the 48 logical processor restriction.

 

What does this look like in a Virtual Machine?

By their nature, virtual machines don’t have a concept of logical or physical processors. All CPUs are considered “physical processors” within the context of the Guest OS. Hyper-V calls these vCPUs. So the configuration of the virtual machine you setup and the capabilities of your hypervisor will dictate what the guest OS and SQL Server will detect. Let’s use an example with Microsoft Hyper-V on Windows Server 2012 R2 to see what this can look like. I created a new virtual Machine with the Guest OS of Windows Server 2012 R2 and SQL Server 2016 SP1 Enterprise Edition (Core Licensing). Using Hyper-V Configuration settings (under NUMA configuration) I assigned 2 NUMA nodes and 64 processors to my VM. (Windows Server 2012 R2 supports a max of 64 vCPUs per VM. Windows Server 2016 supports up to 240 vCPUs per VM).

First you can see that Windows presents all processors as “physical” processors to SQL Server:

SQL Server detected 2 sockets with 32 cores per socket and 32 logical processors per socket, 64 total logical processors; using 64 logical processors based on SQL Server licensing. This is an informational message; no user action is required.

SQL Server detects more than 8 physical processors per NODE so Auto Soft NUMA is enabled

Automatic soft-NUMA was enabled because SQL Server has detected hardware NUMA nodes with greater than 8 physical cores

And here is the CPU to node mapping

Node configuration: node 0: CPU mask: 0x00000000000000ff:0 Active CPU mask: 0x00000000000000ff:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 1: CPU mask: 0x000000000000ff00:0 Active CPU mask: 0x000000000000ff00:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 2: CPU mask: 0x0000000000ff0000:0 Active CPU mask: 0x0000000000ff0000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 3: CPU mask: 0x00000000ff000000:0 Active CPU mask: 0x00000000ff000000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 4: CPU mask: 0x000000ff00000000:0 Active CPU mask: 0x000000ff00000000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 5: CPU mask: 0x0000ff0000000000:0 Active CPU mask: 0x0000ff0000000000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 6: CPU mask: 0x00ff000000000000:0 Active CPU mask: 0x00ff000000000000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
Node configuration: node 7: CPU mask: 0xff00000000000000:0 Active CPU mask: 0xff00000000000000:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.

Since we have 64 total processors this breaks down very nicely into 8 soft nodes each with 8 processors. We also ensure we keep each soft node mapped to the corresponding physical node so soft nodes 0-3 map to memory node 0 and soft nodes 4-7 map to memory node 1.

 

Does any of this Matter?

As Bob Dorr talked about NUMA in his post back in June, partitioning is a common scheme to achieve better scalability. The concept of NUMA itself is a partitioning methodology.  SQL Server in many parts of its core engine, partitions key data structures and makes decisions per NODE so by splitting up physical nodes into more virtual nodes, we can achieve better scalability. While I don’t have specific demos to show you this, we have seen customers get a boost in performance with no application changes simply by upgrading to SQL Server 2016 and allowing SQL Server to apply the Auto Soft NUMA configuration. The entire intention of this feature is that “It Just Works” and can help achieve better scalability on high-dense multi-core systems

 

Bob Ward
Principal Architect
Microsoft
Tiger Team, Data Group

Comments

  • Anonymous
    December 01, 2016
    "because traditionally 8 CPUs per NODE is a scalability threshold"Still True with 2016?
    • Anonymous
      December 02, 2016
      Yes
  • Anonymous
    March 06, 2017
    When soft numa is enabled in this case (4 sockets with 18 cores per socket and 36 logical processors per socket, 144 total logical processors), you will get odd number of online_scheduler_count (equal 9)Is it ok from performance perspective ?Which option is best:1. use default (enable)2. disable soft numa3. change anfinity mask toALTER SERVER CONFIGURATION SET PROCESS AFFINITY CPU = 0 TO 15,18 TO 33,64 TO 79,82 TO 97,128 TO 143,146 TO 161,192 TO 207,210 TO 225here you get even number online_scheduler_count equal 8, but you "lose" 8 cores (16 logical cpu)Check:SELECT node_id, node_state_desc, memory_node_id, processor_group, online_scheduler_countFROM sys.dm_os_nodesWHERE node_id != 64