
Poor database performance

Sam 1,476 Reputation points
2020-12-31T07:26:24.173+00:00

Hi All,

We are seeing serious I/O contention issues on one of our databases, "MarketingDB".
Looking for suggestions on how we can improve I/O demands for this database. We have enabled instant file initialization for the data files to get some performance gain.
All data files are on one drive and the log file is on a separate drive. We have excluded the SQL Server folders from antivirus scans.
The drives are formatted with a 64 KB block size.
We tried archiving some data but reverted. Since the database is big, the deletes were taking a long time and filling up the transaction log, and we have a lot of LOB data types in the tables.
We tried batch deletes as well, but saw no major improvement.
The database files are not equi-sized. Can anyone share thoughts on equi-sizing all the data files? We have a dedicated data file for index data.
The "Marketing_data" data file becomes a hot spot very frequently. What is the approach to distributing the data within that file into separate filegroups, and how should the tables be distributed?
Should we collect table sizes and put all the big tables into one data file and all the small tables into a separate data file on a separate drive? What are the best practices? Partitioning? Please share your thoughts.
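For reference, the batch-delete pattern we tried looked roughly like this (a sketch only; the table name, column name, and batch size are made up for illustration). Deleting in small batches keeps each transaction short, and in FULL recovery a log backup between batches lets the log truncate instead of filling:

```sql
-- Hypothetical archive purge: delete MarketingEvents rows older than 3 years
-- in 5,000-row batches so each delete is a short transaction.
DECLARE @rows int = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (5000) FROM dbo.MarketingEvents
    WHERE EventDate < DATEADD(YEAR, -3, GETDATE());

    SET @rows = @@ROWCOUNT;

    -- In FULL recovery, back up the log between batches so it can truncate:
    -- BACKUP LOG MarketingDB TO DISK = N'...\MarketingDB.trn';
END;
```

Even with batching, LOB-heavy deletes remained slow for us, as noted above.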
(Attached screenshot: markettingdb.jpg)

Autogrowth settings

====================

Data files: autogrow by 100 MB, unrestricted growth.
Log file: autogrow by 1 GB, unrestricted growth.

Best Regards,
Sam

SQL Server | Other

Answer accepted by question author
  1. David Browne 111 Reputation points Microsoft Employee
    2021-01-09T17:43:56.187+00:00

    You're getting about 7ms/read on a 6TB database with lots of IO, which is reasonable. Do you have an actual problem or just noticing the IO wait time? This might be acceptable, but you're probably a bit under-sized from an IO and VM point-of-view. Nothing wrong with that so long as your performance is acceptable and you can perform maintenance tasks in a reasonable time frame. But everything is harder and takes longer when you're running on under-sized infrastructure.

    Using Premium SSDs. It is an Azure Win server 2019 VM.

    How many disks and what SKU? Are they in a Storage Space (eg managed by the Azure SQL Server resource blade) or mounted individually?

    how we can improve I/O demands for this database.

    There's nothing you can do with the file and filegroup design that will make a difference. And moving tables between filegroups won't help either.

    From an infrastructure point of view, add more disks and/or increase the VM size to provide more cache memory and possibly space for buffer pool extensions.

    Each P30 disk you add gives you 5,000 IOPS and 200 MB/sec of additional throughput, in addition to 1 TB of usable space. So storage provisioning in Azure isn't just about the size of your database. You may need 20TB of space to get the performance you want on a 5TB database.
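As a back-of-the-envelope check, you can work out how many P30 disks a striped set would need from the per-disk limits quoted above (the target numbers here are made-up examples):

```sql
-- Disks needed to hit a hypothetical target of 18,000 IOPS and 500 MB/sec,
-- given 5,000 IOPS and 200 MB/sec per P30 disk. Provision the larger count.
DECLARE @target_iops int = 18000, @target_mbps int = 500;

SELECT CEILING(1.0 * @target_iops / 5000) AS disks_by_iops,  -- 4
       CEILING(1.0 * @target_mbps / 200)  AS disks_by_mbps;  -- 3
-- 4 x P30 = 20,000 IOPS, 800 MB/sec, and 4 TB of space.
```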

    But before you do that, evaluate whether some of your large tables can be compressed using PAGE or Columnstore compression, or your queries can be optimized to not have to read so much data. As I might have said once or twice before, turn on the Query Store. You will want to look at queries driving IO, both Logical Reads and Physical Reads (including Read-Ahead Reads).
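If Query Store is not already on, enabling it and pulling the top I/O consumers looks roughly like this (a sketch; the TOP (10) cutoff and ordering by physical reads are arbitrary choices):

```sql
ALTER DATABASE MarketingDB SET QUERY_STORE = ON;
ALTER DATABASE MarketingDB SET QUERY_STORE (OPERATION_MODE = READ_WRITE);

-- Top 10 queries by average physical reads across captured intervals.
SELECT TOP (10)
    qt.query_sql_text,
    SUM(rs.count_executions)      AS executions,
    AVG(rs.avg_physical_io_reads) AS avg_physical_reads,
    AVG(rs.avg_logical_io_reads)  AS avg_logical_reads
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
  ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
  ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
  ON p.plan_id = rs.plan_id
GROUP BY qt.query_sql_text
ORDER BY avg_physical_reads DESC;
```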

    2 people found this answer helpful.

3 additional answers

  1. Erland Sommarskog 133.7K Reputation points MVP Volunteer Moderator
    2021-01-11T22:10:15.11+00:00

    Yes, Erland. It's an Azure VM. Mainly I am looking for possible tuning opportunities to reduce I/O: things like partitioning, archiving, or distributing data across multiple equi-sized data files, etc.

    First of all, it is worth repeating what David said: Bigger disks will give you better throughput. That is how it works in Azure.

    When it comes to improving performance, David gave some tips, but we cannot really answer out of the blue. You need to analyse what is behind this high I/O. You also need to determine whether it is reads or writes you want to trim down. Adding indexes can reduce reads significantly - but it makes writes more expensive.

    David suggested that archiving data could help. Ideally, it should not matter. It is going to help if there are a lot of scans that trawl through that old data without finding anything. With the old data out of the way, those scans will be faster. But they will still be scans. That said, archiving can still be a good idea, because it will reduce your backup times, and your backups will be easier to manage.

    Filegroups or partitioning - I agree with David. That's a non-starter.

    David also gave another great tip: Query Store. My experience is that in the end what pays off is to fix bad queries - by rewriting them or adding indexes. When I work with a new client that has performance issues, I always ask them to enable Query Store, extract the data, and send it to me.

    1 person found this answer helpful.

  2. Erland Sommarskog 133.7K Reputation points MVP Volunteer Moderator
    2021-01-05T22:19:03.457+00:00

    That data does not look too good. However, those numbers are cumulative since the machine was last rebooted. What I usually do is schedule this query to run every 10-20 seconds and then compute the delta. Then I can look at io_stall_write_ms / num_of_writes to get the average stall time per write.
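A sketch of that snapshot-and-delta approach (the snapshot table name is hypothetical; the DMV columns are real): persist the cumulative counters on a schedule, then diff consecutive rows to get per-interval write stalls.

```sql
-- Run on a schedule (e.g. every 15 seconds via a SQL Agent job):
INSERT INTO dbo.io_stats_snapshot
    (sample_time, database_id, file_id, num_of_writes, io_stall_write_ms)
SELECT SYSDATETIME(), database_id, file_id, num_of_writes, io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL);

-- Per-interval average write stall: delta stall / delta writes.
SELECT s.sample_time, s.database_id, s.file_id,
       (s.io_stall_write_ms - prev.io_stall_write_ms)
         / NULLIF(s.num_of_writes - prev.num_of_writes, 0) AS avg_write_stall_ms
FROM dbo.io_stats_snapshot AS s
CROSS APPLY (SELECT TOP (1) p.num_of_writes, p.io_stall_write_ms
             FROM dbo.io_stats_snapshot AS p
             WHERE p.database_id = s.database_id
               AND p.file_id = s.file_id
               AND p.sample_time < s.sample_time
             ORDER BY p.sample_time DESC) AS prev;
```

Plotting avg_write_stall_ms over sample_time shows when the peaks occur, not just how bad the worst one was.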

    This is likely to reveal even higher peaks, but you will also see when they occur. I would guess the slowest response time occurs during disk rebuilds.

    You said these were local disks. What sort of? Spinning? SSD? NVMe? USB? :-)

    1 person found this answer helpful.

  3. AmeliaGu-MSFT 14,011 Reputation points Microsoft External Staff
    2021-01-01T03:16:13.12+00:00

    Hi @Sam ,
    Could you please share us more information about the I/O Performance?
    You can use sys.dm_io_virtual_file_stats which returns I/O statistics for data and log files to monitor I/O bottlenecks:

    SELECT
        DB_NAME(a.database_id) AS database_name,
        b.physical_name,
        a.*
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS a
    INNER JOIN sys.master_files AS b
        ON a.database_id = b.database_id
        AND a.file_id = b.file_id
    ORDER BY database_name;
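The raw counters in that DMV are cumulative, so they are easier to read as average latencies. A variant that divides total stall time by operation count per file (all columns are real columns of sys.dm_io_virtual_file_stats):

```sql
-- Average read/write latency per database file since the last restart.
SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.physical_name,
    vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
    vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON vfs.database_id = mf.database_id
 AND vfs.file_id = mf.file_id
ORDER BY avg_read_latency_ms DESC;
```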
    

    You also can use Performance Monitor to capture statistics about I/O usage. The following counters are helpful to see if there is a bottleneck:

    • Avg. Disk sec/Read: values consistently above 20 ms indicate a problem.
    • Avg. Disk sec/Write: values consistently above 20 ms indicate a problem.
    • Avg. Disk sec/Transfer: if the values of this counter are consistently above 15-20 ms, you need to look into the issue further.

    In addition, missing indexes, poorly written queries, fragmentation, or out-of-date statistics can also cause slow I/O performance.
    For more information, please refer to "Slow I/O - SQL Server and disk I/O performance" and "I/O troubleshooting", which could help.
    Best Regards,
    Amelia



