Part 2: Resource Governor CPU Demo on multiple CPUs
In some cases CPU demo described in the previous post may behave not as expected.
One such case is doing the demo on multiple CPUs. Even though it appears as Resource Governor is "not working", it is "by design". Let's take a look at what happens in this case.
What is our promise about CPU bandwidth distribution?
When you specify percentage on the pool we will ensure average CPU bandwidth distribution on per scheduler basis for CPU bound workloads. Note, emphasis: average, per scheduler, CPU bound. What does this mean?
For example, you have a 2 CPU machine, and SQL Server runs on both of the CPUs. If you haven't set affinity mask as was indicated in the Setup step of the previous post,
select scheduler_id, cpu_id, status, is_online from sys.dm_os_schedulers
will get you something like:
scheduler_id cpu_id status is_online
------------ ----------- ------------------------------------------------------------ ---------
0 0 VISIBLE ONLINE 1
1 1 VISIBLE ONLINE 1
257 0 HIDDEN ONLINE 1
255 0 VISIBLE ONLINE (DAC) 1
Important item to note here is that we have 2 online schedulers to service normal user requests (scheduler_id < 255). You can compare it with the case when affinity mask is set to 1. You will still have 2 schedulers (with scheduler_id < 255), but one of them will be offline.
Now, for simplicity, let's configure 2 groups and 2 pools as follows:
CREATE RESOURCE POOL pSlow
WITH (MAX_CPU_PERCENT = 30)
CREATE RESOURCE POOL pFast
WITH (MAX_CPU_PERCENT = 70)
CREATE WORKLOAD GROUP gSlow
USING pSlow
CREATE WORKLOAD GROUP gFast
USING pFast
GO
And corresponding classifier function as
CREATE FUNCTION f1()
RETURNS SYSNAME WITH SCHEMABINDING
BEGIN
DECLARE @val sysname
if 'UserSlow' = SUSER_SNAME()
SET @val = 'gSlow';
else if 'UserFast' = SUSER_SNAME()
SET @val = 'gFast';
return @val;
END
GO
Create logins, assign function and make changes effective
CREATE LOGIN UserFast WITH PASSWORD = 'UserFastPwd', CHECK_POLICY = OFF
CREATE LOGIN UserSlow WITH PASSWORD = 'UserSlowPwd', CHECK_POLICY = OFF
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.f1)
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
Add group counters for CPU Usage % and run 1 CPU intensive query UserSlow, then 1 with UserFast. You will observe a picture similar to this:
But how can explain that the resource governor does not obey 70:30 distribution we specified for the resource pools? The answer is quite simple: the queries landed on different schedulers (= different CPUs) and thus, there is nothing to share between them and each of them is taking a full CPU. How can you verify that?
First of all, verify the classification
-- verify the classification
select
sess.session_id, sess.group_id, CONVERT(NCHAR(20), grps.name) as group_name
from
sys.dm_exec_sessions as sess join
sys.dm_resource_governor_workload_groups as grps on sess.group_id = grps.group_id
where
session_id > 50
GO
Where you will see something like
session_id group_id group_name
---------- ----------- --------------------
52 2 default
51 265 gSlow
54 266 gFast
Everything is normal here. Now, let's take a look at how currently executing requests landed on schedulers:
select
r.session_id,
CONVERT(NCHAR(20), wg.name) as group_name,
t.scheduler_id,
r.status
from sys.dm_exec_requests r
join sys.dm_os_tasks t on r.task_address = t.task_address
join sys.dm_resource_governor_workload_groups wg on r.group_id = wg.group_id
where
r.session_id > 50
GO
Which will get
session_id group_name scheduler_id status
---------- -------------------- ------------ ------------------------------
54 gFast 0 runnable
52 default 0 running
51 gSlow 1 running
What you can see here is that gFast and gSlow is working on different schedulers, thus confirming our earlier statement. Thus Red and Blue line corresponds to utilization of 1 full CPU or 50%, since it is normalized to number of CPUs available on the box.
Is this a problem with Resource Governor in general? Certainly not if you have more executing requests than you have CPUs or if queries are running in parallel. But you should keep this in mind if you are trying to give 90% of CPU bandwidth to a single query running serially compared to a parallel query or number of serial queries in another group/pool.
A simple way of thinking about this is: is my workload capable of taking as much CPU as I told it to? Remember, when you set MAX_CPU_PERCENT on the pool, you allowing workloads belonging to the pool to take this much bandwidth on each scheduler. In case of the demo, you are working with a single query workload which produces this initially unexpected result.
Comments
Anonymous
January 01, 2003
If you are looking for some demos to play and learn about Resource Governor, you should have a look toAnonymous
May 30, 2012Manage a single database spread over 200+ cpu cores?....