Hi All,
We have a QA server on which a bunch of databases are hosted.
It hosts like 20-25 databases. All these databases are not active all the team.
At set of databases indicates each environment like QA1 , QA2 , QA3 etc... In each set , there is 1 big db which is around 4.15TB and others are 500GB each.
Multiple testing team uses multiple environments. Sometimes only 1-2 environments are performing testing.
During the time, the workload queries gets executed fine. But whenever, multiple teams starts testing 4-5 environments
at same time , the performance is very slow. a query which takes 2-3 mins to finish would take more than 30 mins.
During these times, testing complains a lot of query performance issues. We dont find any blocking and mostly see PAGEIOLatches on all databases.
The server has 16 CPUs and 256GB RAM. Max server memory set to 80% i.e 205GB.
Questions
Q) This is kind of non-technical question. More of dealing with a pressure situation when servers are running slow.
I usually check number of connections to the users which shows the activity in each database and share it with users.
Sometimes we try bumping the max server memory and gets relief but still we see slowness in the execution of app queries.
I typically use, sp_whoisactive to see current acitivity. for blocking and waits...
All db are isolated to its own separate disks, autogrowths are set to 500mb fixed growth, tempdb has it own separate disks, instant file initialization is on, presize data and log files etc...
Really sometimes I am not finding answers on how can explain it to the end users/testers that multiple environments are running and that's reason for slowness.
They just keep sending escalation mails that db performance is slow. How can we tell / convey them in layman terms.
Most of the business users are non-technical folks.
What information do we need to gather so that we can show them or tell them so they understand?
for instance, If we show the I/O warnings from errorlog , then they will says its something wrong with sql server.
some people just look at PLE and say, there is no memory.
Even if we provide the list of long running queries, high CPU queries no action taken. They say, it was running fine on development server but on this server it takes long long time.
Lots of email noise goes on.. can't say in words. This continues especially when multiple releases happen in same month.
I feel somewhere I am not able explain. How to deal in such situations. How to set the clear expectations and at the same time I feel somewhere I am missing a systematic approach.
Please share your thoughts. Looking for some candid and detailed explanation.
Q) For such server, how much memory is needed is actually needed?
Lets take an example )
If I have a 4TB database(db1) and another 1TB whole set of databases(db2,db3,db4) which makes 1 env.
How does the calculation go for how much memory we need for the SQL VM?
What factors to be considered in determining that this server must need this much minimum memory and this much disk throughput.
Thanks,
Sam