Questions on Page Life Expectancy and other memory related counters.

chrisrdba 381 Reputation points
2022-10-28T16:12:50.083+00:00

Greetings. My company recently acquired a new monitoring product that alerts when PLE is < 300 for 15 minutes.

Immediately, PLE appeared to be a problem on all 3 boxes where this product lived on a regular basis. Obviously this didn't seem right, especially since I had no users complaining at all about performance.

During my investigation, I found a couple articles from gurus suggesting the PLE isn't the end all be all many of us grew up believing it was. These articles are definitely in the minority compared to what you'll read from googling the topic, but they can be found here, here, and here.

All said I decided to do an experiment -- I began collecting 5 recommended memory counters every 5 minutes on 2 of the 3 boxes I have this new monitoring product installed on. It stands to reason that if PLE was low, the rest of the counters would show memory pressure as well. Of course I realize each counter is unique, but with 4 others counters counting, something else should show some ugliness.

As expected, NONE of the other counters revealed an issue, while almost half of the PLE counters showed below the magic 300 mark (in many cases under 50).

All said, should PLE simply be ignored? Do these other counters seem to be more valid that PLE based on the reasons provided in the articles supplied? Do these other counters seem like reasonable replacements for PLE?

Most articles that reference other counters say something along the lines of "if this number is bad, refer to PLE", but my small experiment seems to have blown a hole wide open in ever needing to do that. Granted this is a very small scale experiment, but I have no reason to think things would change given a longer test time.

If anyone wants to give this a shot, below is the code:

--create tables  
CREATE TABLE [dbo].[osSysMemory](  
 [dateAdded] [datetime] NOT NULL,  
 [total_physical_memory_kb] [bigint] NOT NULL,  
 [available_physical_memory_kb] [bigint] NOT NULL,  
 [system_memory_state_desc] [nvarchar](256) NOT NULL  
) ON [PRIMARY]  
GO  
  
CREATE TABLE [dbo].[osProcessMemory](  
 [dateAdded] [datetime] NOT NULL,  
 [physical_memory_in_use_kb] [bigint] NOT NULL,  
 [process_physical_memory_low] [bit] NOT NULL,  
 [process_virtual_memory_low] [bit] NOT NULL  
) ON [PRIMARY]  
GO  
  
CREATE TABLE [dbo].[osPerfCountersMemory](  
 [dateAdded] [datetime] NOT NULL,  
 [instance_name] [nchar](128) NOT NULL,  
 [object_name] [nchar](128) NOT NULL,  
 [counter_name] [nchar](128) NOT NULL,  
 [cntr_value] [bigint] NOT NULL  
) ON [PRIMARY]  
GO  
ALTER TABLE [dbo].[osSysMemory] ADD  CONSTRAINT [DF_osSysMemory_dateAdded]  DEFAULT (getdate()) FOR [dateAdded]  
ALTER TABLE [dbo].[osProcessMemory] ADD  CONSTRAINT [DF_osProcessMemory_dateAdded]  DEFAULT (getdate()) FOR [dateAdded]  
ALTER TABLE [dbo].[osPerfCountersMemory] ADD  CONSTRAINT [DF_osPerfCountersMemory_dateAdded]  DEFAULT (getdate()) FOR [dateAdded]  
  
--code for job to populate tables every 5 minutes  
insert into osSysMemory (dateAdded, total_physical_memory_kb,available_physical_memory_kb, system_memory_state_desc)  
select getdate(), dosm.total_physical_memory_kb,  
       dosm.available_physical_memory_kb,   
       dosm.system_memory_state_desc  
FROM sys.dm_os_sys_memory dosm;  
  
insert into osProcessMemory (dateAdded, physical_memory_in_use_kb, process_physical_memory_low, process_virtual_memory_low)  
SELECT getdate(),  dopm.physical_memory_in_use_kb,   
       dopm.process_physical_memory_low,   
       dopm.process_virtual_memory_low  
FROM sys.dm_os_process_memory dopm;  
  
insert into OSPerfCountersMemory (instance_name, object_name, counter_name, cntr_value)  
--should always be > 90, if not may have RAM issues  
 SELECT instance_name,'SQLServer:Buffer Manager', 'Buffer cache hit ratio',(a.cntr_value * 1.0 / b.cntr_value) * 100.0   
FROM sys.dm_os_performance_counters  a  
JOIN  (SELECT cntr_value, OBJECT_NAME   
    FROM sys.dm_os_performance_counters    
    WHERE counter_name = 'Buffer cache hit ratio base'  
        AND OBJECT_NAME LIKE '%:Buffer Manager%') b ON  a.OBJECT_NAME = b.OBJECT_NAME  
WHERE a.counter_name = 'Buffer cache hit ratio'  
AND a.OBJECT_NAME LIKE '%:Buffer Manager%'  
  
insert into OSPerfCountersMemory (instance_name, object_name, counter_name, cntr_value)  
--should never go below 300 (this is really outdated, but no clear replacement, hence all this buffoonery)  
SELECT instance_name, object_name, counter_name, cntr_value  
FROM sys.dm_os_performance_counters  
WHERE [counter_name] = 'Page life expectancy'  
  
insert into OSPerfCountersMemory (instance_name, object_name, counter_name, cntr_value)  
--if above 0 may have RAM issues  
SELECT instance_name, object_name, counter_name, cntr_value  
FROM sys.dm_os_performance_counters  
WHERE [object_name] = 'SQLServer:Memory Manager'  
AND [counter_name] = 'Memory Grants Pending'  
   
--queries to look at results  
--almost half of the records for PLE returned  
SELECT *  
  FROM [perf].[dbo].[osPerfCountersMemory]  
  where counter_name = 'page life expectancy'  
  and OBJECT_NAME = 'SQLServer:Buffer Node'  
  and cntr_value <= 300  
  
--0  
SELECT *  
  FROM [perf].[dbo].[osPerfCountersMemory]  
  where counter_name = 'Memory Grants Pending'  
    and cntr_value > 0  
  
SELECT *  
  FROM [perf].[dbo].[osPerfCountersMemory]  
  where counter_name = 'Buffer cache hit ratio'  
  and cntr_value <= 90  
  
--0 (this guy is close as we hit "steady" instead of "high")  
  SELECT *  
  FROM [perf].[dbo].[osSysMemory]  
  where system_memory_state_desc not in ('Physical memory usage is steady', 'Available physical memory is high')  
  
 --0  
SELECT *  
  FROM [perf].[dbo].[osProcessMemory]  
  where process_physical_memory_low = 1  
  or process_virtual_memory_low = 1  
  
    
  
  
  
SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
13,325 questions
0 comments No comments
{count} votes

Accepted answer
  1. Erland Sommarskog 106.5K Reputation points
    2022-10-28T21:29:27.74+00:00

    During my investigation, I found a couple articles from gurus suggesting the PLE isn't the end all be all many of us grew up believing it was. These articles are definitely in the minority compared to what you'll read from googling the topic, but they can be found here, here, and here.

    To be honest, you have read articles by Brent Ozar and Paul Randall - and then you ask in this forum if they are correct or not?

    You really think that this forum is full of people that are smarter than these guys? I, for one, am certainly not going to pretend I am. So if they say that PLE should be ignored, there is no reason for disbelief.

    And furthermore, you seem to have to done a good job to back up what they are saying with your own test. I did not try it myself, but I have no reason to doubt your findings.

    Overall, I'm a little skeptic to all these monitoring software that sounds the alarms for various things. You should always ask yourself when an alarm goes off, is there a real issue? (And from that point of view, the alarm source are the users. When they sound the alarm, you know it's for real. :-)


0 additional answers

Sort by: Most helpful