Windows Server and Processor Cores…
Recently someone asked me for my thoughts on how Windows Server handles processor cores. With newer processors available with more than 2 or 4 cores each, it seemed like a good time to revisit this topic. If you have a system with multiple processor sockets and a few new processors with 2, 4, 6, 10, or more cores each…what should you expect Windows to do?
With the use of multi-core processors becoming more prevalent not only in servers but also in desktops and maybe even your next cell phone or TV remote…it makes sense to review how Windows Server makes use of processors…since that is where you’re more likely to see higher densities of cores on a single physical processor.
Windows Server 2008 and 2008 R2 count processors for licensing purposes by processor socket (physical processor). For example, say the edition of Windows Server you have indicates that it supports up to 4 physical processors. If you have dual-core processors in four available processor sockets, that would provide 8 logical processors for the OS. If the processors also support Simultaneous Multithreading (SMT), known to some as Hyper-Threading Technology (HTT) after Intel's implementation, and the system has this option enabled, then the total logical processor count may be 16. 16 processors as compared to 8…seems like a no-brainer to have twice as many, right?
Don’t confuse processor cores with the extra logical processors (LPs) available with SMT enabled. Some configurations with SMT might present one or more extra LPs per core. Physical cores and LPs from SMT are two different things, and the performance you get from them may differ from what you expect. Additional processor cores are practically just like additional physical processors without requiring extra sockets for them on the system board.
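As a rough illustration of the arithmetic in the four-socket example, the LP count is simply sockets times cores per socket times hardware threads per core. The function name and parameters below are made up for this sketch, and nothing in it queries real hardware.

```python
def logical_processors(sockets, cores_per_socket, threads_per_core=1):
    """Return the logical processor (LP) count the OS would see.

    Hypothetical helper for illustration only; it does not inspect hardware.
    threads_per_core is 1 with SMT disabled, 2 for one extra SMT LP per core.
    """
    return sockets * cores_per_socket * threads_per_core


# Four sockets of dual-core processors, as in the example in the text.
print(logical_processors(4, 2))                      # SMT disabled: 8
print(logical_processors(4, 2, threads_per_core=2))  # SMT enabled: 16
```

Keep in mind that the count says nothing about performance: 16 LPs that come from SMT are not the same thing as 16 physical cores.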
The way I like to think of SMT is to picture one of those old pizza shops where the pizza maker is tossing the dough in the air in preparation to make a pizza. You know…the old-fashioned way. Instead of being able to toss just one pizza at a time…imagine the same person tossing and spinning two of them simultaneously. Performance of this person compared to two separate people performing the same task may not be equivalent and may be somewhere in-between. For SMT-compatible processors that provide an extra LP per physical core, SMT allows a processor core to run one additional concurrent thread per SMT LP exposed, but those threads share on-chip resources like cache. If the shared resources on the chip become bottlenecks for threads executing simultaneously on the same processor, then SMT may not improve performance and might even limit it.
For purposes of illustration, assume you have a single-core processor that supports SMT and provides a single SMT LP. In that configuration, both LPs share resources on the chip. An SMT processor typically will not provide the same performance as two single-threaded processors but may provide better performance than a single processor. With expected performance of an SMT LP being somewhere in-between, the performance gains achieved in an SMT configuration will vary by application. While I truly believe that SMT on today’s hardware is better implemented and performs better than in years past, I don’t factor SMT into sizing a system. SMT can be a good performance benefit to have on hand if you need it, but I’ve not seen the performance to be that much greater. I’ve consistently thought of SMT as yielding more compute power rather than helping with I/O. You can search the net and find a variety of opinions on this topic. You may form your own opinion. There are also some applications that suggest or require disabling SMT because of its impact on the application. The advice I’ve consistently given has been to size systems according to physical processors and cores. Then use Performance Monitor to determine if SMT provides additional gain. And, of course, if an application says don’t use SMT…the vendor may have a reason.
The number of possible LPs prior to Windows Server 2008 R2 was based on the number of bits in the OS architecture. For instance, a 32-bit OS could use 32 LPs; a 64-bit OS could use 64 LPs. This is confirmed by Mark Russinovich’s presentation on R2’s kernel changes (available on the Microsoft Download Center). Windows Server 2008 R2 extends this limit by allowing up to 4 groups of up to 64 processors each. Doing the math, that translates to a maximum of 256 LPs for Windows Server 2008 R2. That alone would be enough for me to jump to R2 if I were an administrator using very expensive hardware with lots of processor cores…especially for virtualization.
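To make these limits concrete, here is a small sketch of the arithmetic. The constant and function names are my own for illustration and are not Windows API names.

```python
import math

LPS_PER_GROUP = 64        # maximum processors in one group on Windows Server 2008 R2
MAX_GROUPS_2008_R2 = 4    # maximum number of groups on Windows Server 2008 R2


def max_lps_before_r2(os_bits):
    """Before R2, the LP limit matched the OS bitness (32 or 64)."""
    return os_bits


def max_lps_2008_r2():
    """R2 limit: up to 4 groups of up to 64 processors each."""
    return MAX_GROUPS_2008_R2 * LPS_PER_GROUP


def groups_needed(lp_count):
    """Minimum number of 64-LP groups needed to hold lp_count LPs."""
    if not 1 <= lp_count <= max_lps_2008_r2():
        raise ValueError("lp_count must be between 1 and 256")
    return math.ceil(lp_count / LPS_PER_GROUP)


print(max_lps_before_r2(32), max_lps_before_r2(64))  # 32 64
print(max_lps_2008_r2())                             # 256
print(groups_needed(72))                             # 2
```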
The Windows Server 2008 R2 kernel establishes processor groups (K-Groups) at boot time; they are not customizable by an administrator after startup. However, according to KB2506384, there is a way to manually adjust K-Group assignments to your liking for the next boot of the OS. K-Groups may contain one or more NUMA nodes. Windows attempts to place all processors from a given NUMA node in the same group where possible. Systems with fewer than 64 LPs will have only a single group. From a scheduling standpoint, threads are assigned to only one group at a time. Also, an interrupt may target only the processors of a single group.
What happens when a physical processor has multiple cores or a given core has multiple logical processors?
The answer to this question depends on whether you're using Windows Server 2008 R2 RTM or a build with applicable updates that alter the default behavior. Using the RTM version of Windows Server 2008 R2, the kernel attempts to place all cores of a given physical processor in the same group whenever possible. If using processors where the number of cores per chip isn’t a power of 2 (so that 64 is not evenly divisible by it), then the cores of a physical processor may be split between groups. For example, if using 12 processors with 6 cores each, the total number of processor cores would be 72. This would result in one group of 64 processors, and a second group of 8. The eleventh physical processor would have 4 cores in the first group, with the remaining two cores in the second group along with all six cores of processor 12. For some applications, uneven groups can be problematic. Additionally, minor hardware differences between seemingly identical systems could result in one with a {64,8} grouping and another with {8,64}.
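The 12-processor example can be reproduced with a very simplified model that fills groups in socket order, 64 cores at a time. This is only a sketch that matches the numbers in the example; it is not the kernel's actual algorithm, it ignores NUMA nodes, and the function name is invented for illustration.

```python
from collections import Counter


def sequential_groups(sockets, cores_per_socket, group_size=64):
    """Assign cores to groups in socket order, starting a new group
    whenever the current one reaches group_size.

    Returns a list of groups; each group is a list of socket numbers
    (1-based), one entry per core placed in that group.
    Simplified model for illustration, not the real kernel logic.
    """
    groups = [[]]
    for socket in range(1, sockets + 1):
        for _ in range(cores_per_socket):
            if len(groups[-1]) == group_size:
                groups.append([])
            groups[-1].append(socket)
    return groups


groups = sequential_groups(sockets=12, cores_per_socket=6)
print([len(g) for g in groups])            # [64, 8]

# How many cores of each socket ended up in each group?
per_group = [Counter(g) for g in groups]
print(per_group[0][11], per_group[1][11])  # 4 2  (socket 11 is split)
print(per_group[1][12])                    # 6    (socket 12 stays whole)
```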
If using Windows Server 2008 R2 with KB2510206 (or a future service pack containing this update), the kernel will attempt to balance processors amongst groups. With the preceding example of 72 cores, the two resulting groups would each contain 36. The update provides predictability and balance without requiring manual K-Group specification as per KB2506384.
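The balanced result is easy to check with a little arithmetic. The sketch below only splits an LP count evenly across the fewest groups that can hold it; I am assuming nothing about how the update actually weighs NUMA nodes or sockets, and the function name is hypothetical.

```python
import math


def balanced_group_sizes(total_lps, group_size=64):
    """Split total_lps as evenly as possible across the fewest groups
    that can hold them. Arithmetic illustration only."""
    if total_lps < 1:
        raise ValueError("total_lps must be at least 1")
    group_count = math.ceil(total_lps / group_size)
    base, extra = divmod(total_lps, group_count)
    # The first `extra` groups take one additional LP each.
    return [base + 1 if i < extra else base for i in range(group_count)]


print(balanced_group_sizes(72))  # [36, 36]
print(balanced_group_sizes(64))  # [64]
```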
If using Windows Server 2008 with more than 64 cores, you would not be able to utilize extra cores above that limit even though they may exist. Windows Server 2008 R2 can utilize processor groups and allow use of these additional cores up to the maximum of 256. This isn’t the only reason to consider moving to Windows Server 2008 R2…there certainly are many more.
Hyper-V has limits as to the number of LPs it supports as a virtualization host. As a result, if using Hyper-V it is possible that the system may be limited in the number of LPs that can be used even though the OS version, Service Pack, or other updates may support more. It is important to be aware of these limits when planning or ordering systems to be used for virtualization. Based on published documents found on TechNet about the upcoming Windows Server 2012 release, indications are that Hyper-V limits may rise significantly. Windows Server 2012 was not a released product at the time this post was published, so for more information see the link provided in the additional references section.
Below is a chart of Windows Server versions and the maximum number of LPs supported.
Windows Server version | Maximum LPs supported |
Windows Server 2008 w/ Hyper-V | 16 LP |
Windows Server 2008 Service Pack 2 w/ Hyper-V | 24 LP |
Windows Server 2008 R2 w/ Hyper-V | 64 LP |
Therefore, if you have a computer with Windows Server 2008 R2 installed with updates that normally could support up to 256 LPs, the same system would only utilize the first 64 LPs with Hyper-V enabled. All remaining LPs would be ignored. The primary OS runs in the parent partition and does not recognize any LPs above 64 because the hypervisor does not present any additional LPs to the system. In a configuration like this, you would want to make certain that SMT is disabled so that the 64 usable LPs cover as many physical cores as possible, rather than have many physical cores ignored and unused. Further, on such a configuration, you may receive a warning from a Best Practices Analyzer scan indicating the number of LPs available exceeds the number supported by Hyper-V.
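Here is a small sketch of why disabling SMT can matter on a Hyper-V host with lots of cores. The limits come from the chart above; everything else is an assumption for illustration, in particular that the LPs of one core are numbered consecutively, so that the first N LPs cover only N divided by threads-per-core physical cores. The names are hypothetical, and a tool like CoreInfo would show the real mapping on your hardware.

```python
import math

# Hyper-V host LP limits, taken from the chart in this post.
HYPER_V_LP_LIMIT = {
    "Windows Server 2008": 16,
    "Windows Server 2008 SP2": 24,
    "Windows Server 2008 R2": 64,
}


def usable_lps(present_lps, host_os):
    """LPs the host can actually use once Hyper-V is enabled."""
    return min(present_lps, HYPER_V_LP_LIMIT[host_os])


def cores_in_use(physical_cores, threads_per_core, host_os):
    """Physical cores covered by the usable LPs.

    Assumption (not from the post): SMT siblings of a core are
    enumerated next to each other, so the first N LPs cover
    ceil(N / threads_per_core) cores.
    """
    lps = usable_lps(physical_cores * threads_per_core, host_os)
    return math.ceil(lps / threads_per_core)


host = "Windows Server 2008 R2"
print(usable_lps(128, host))      # 64 of 128 LPs are used
print(cores_in_use(64, 2, host))  # SMT enabled: 32 cores in use
print(cores_in_use(64, 1, host))  # SMT disabled: 64 cores in use
```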
Additional references
2510206 Performance issues when more than 64 logical processors are used in Windows Server 2008 R2
https://support.microsoft.com/kb/2510206/EN-US
2546706 A Windows Server 2008 R2-based computer that has some NUMA-based processors and more than 256 logical processors runs in SMP mode as a 64-processor system and may experience decreased performance
https://support.microsoft.com/kb/2546706/EN-US
2517752 "0x0000000A" Stop error occurs during the shutdown process on a computer that is running Windows Server 2008 and that has more than 64 processors installed
https://support.microsoft.com/kb/2517752/EN-US
Sysinternals CoreInfo tool can show logical processor to physical processor mapping
https://technet.microsoft.com/en-us/sysinternals/cc835722
Hyper-V: The number of logical processors in use must not exceed the supported maximum
https://technet.microsoft.com/en-us/library/ee941148(v=WS.10).aspx
Requirements and Limits for Virtual Machines and Hyper-V in Windows Server 2008 R2
https://technet.microsoft.com/en-us/library/ee405267(WS.10).aspx
Competitive advantages of Windows Server 2012 Release Candidate