SQL Tempdb taking up space

Question

Sandro Alves 51

Hi friends,

We have two synchronous and one asynchronous SQL Server 2016 servers with Always On.

Our tempdb consists of six files on one volume on one disk and another six files on another disk, for a total of 12 files.

During the early morning routines, a curious scenario happened twice that called into question the concept of tempdb management.

During these routines, one of the disks occupied all its space and the DBA team reported that one of the JOBs had failed due to disk failure.

In light of this, questions arose:

We understand that SQL Server spreads any activity across the 12 tempdb files, allocating work randomly. However, we are unsure about the following, for example:

If Activity 1, whatever it is, starts using tempdb_01, will it also use another tempdb file at the same time? Or will it stay on that one file only?

If an activity keeps working in one tempdb file, how does it know that the disk is filling up and that it needs to use another one? I believe that SQL Server does not have this visibility.

Thanks.

4 answers

Answer 1

Erland Sommarskog 121.4K MVP Volunteer Moderator

There is only one single tempdb, but a database can be split up on many filegroups, and a filegroup can consist of several files. For tempdb, you typically only have a single filegroup, but it is recommended to have many files.

When there are multiple files in a filegroup, SQL Server will allocate space in these files in a round-robin fashion. That is, if you create a temp table and fill it up with lots of data, the first extent of eight pages will be on, say, file 2. The next extent will be on file 3, and so on. If you have 12 files, the 13th extent will again land in file 2. This presumes that all files are of the same size, which is also highly recommended.
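To check whether the data files really are the same size, a query along these lines can help. It is a minimal sketch that only reads the catalog view tempdb.sys.database_files; the column aliases are my own.

```sql
-- List every tempdb file with its current size, growth step and size cap.
-- size, growth and max_size are stored as counts of 8 KB pages (128 pages = 1 MB).
SELECT  df.file_id,
        df.name                  AS logical_name,
        df.type_desc,            -- ROWS = data file, LOG = log file
        df.physical_name,
        df.size / 128            AS current_size_mb,
        CASE WHEN df.is_percent_growth = 1
             THEN CONCAT(df.growth, ' %')
             ELSE CONCAT(df.growth / 128, ' MB')
        END                      AS growth_step,
        CASE df.max_size
             WHEN -1        THEN 'unlimited'
             WHEN 0         THEN 'no growth allowed'
             WHEN 268435456 THEN 'unlimited (2 TB log file cap)'
             ELSE CONCAT(df.max_size / 128, ' MB')
        END                      AS max_size
FROM    tempdb.sys.database_files AS df
ORDER BY df.type_desc, df.file_id;
```

If the ROWS files show different values in current_size_mb, the round-robin allocation described above will not be even.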

On top of this comes the log file. You can have multiple log files but, unlike with data files, there is little point in this, so typically you only have one log file. Since tempdb is in simple recovery, the transaction log for tempdb is frequently truncated, but if there is a long-running transaction, it can still grow and eventually eat up all available disk space.
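To see whether the tempdb log is the file under pressure, and which open transaction may be holding it, something like the following could be run while the problem is occurring. This is a sketch using standard DMVs; it only reports state and changes nothing.

```sql
USE tempdb;

-- How large is the tempdb log, and how much of it is in use right now?
SELECT  total_log_size_in_bytes / 1048576.0 AS log_size_mb,
        used_log_space_in_bytes / 1048576.0 AS log_used_mb,
        used_log_space_in_percent
FROM    sys.dm_db_log_space_usage;

-- Why can the log not be truncated at the moment (e.g. ACTIVE_TRANSACTION)?
SELECT  name, log_reuse_wait_desc
FROM    sys.databases
WHERE   name = N'tempdb';

-- Open transactions that have written to the tempdb log, oldest first.
SELECT  st.session_id,
        es.login_name,
        es.program_name,
        dt.database_transaction_begin_time,
        dt.database_transaction_log_bytes_used / 1048576.0 AS log_used_mb
FROM    sys.dm_tran_database_transactions AS dt
JOIN    sys.dm_tran_session_transactions  AS st ON st.transaction_id = dt.transaction_id
LEFT JOIN sys.dm_exec_sessions            AS es ON es.session_id = st.session_id
WHERE   dt.database_id = DB_ID(N'tempdb')
  AND   dt.database_transaction_begin_time IS NOT NULL   -- NULL means nothing logged yet
ORDER BY dt.database_transaction_begin_time;
```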

Sandro Alves 51 Reputation points

2023-02-05T15:30:40.83+00:00

Hi,

grateful for your explanation.

So even if I have all this division of files, if a transaction remains active it won't have time for the logs to truncate.

In this case, as I only have one (tempdb.log), probably the file that grew was it?

Is there any strategy to prevent this from happening?

I remember that this happened occasionally, it doesn't happen often.

See how our file division is structured:

tempdb.ldf (disk 1)

tempdb.mdf (disk 1)

temp03.ndf (disk 1)

temp05.ndf (disk 1)

temp07.ndf (disk 1)

temp09.ndf (disk 1)

temp11.ndf (disk 1)

temp02.ndf (disk 2)

temp04.ndf (disk 2)

temp06.ndf (disk 2)

temp08.ndf (disk 2)

temp10.ndf (disk 2)

temp12.ndf (disk 2)

[Screenshot: tempdb file sizes on disk 1]

[Screenshot: tempdb file sizes on disk 2]

Tks.
Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator

2023-02-05T16:52:56.84+00:00

In this case, as I only have one (tempdb.log), probably the file that grew was it?

Which file grew and overflowed the disk should be clear from the error message from SQL Server. If you don't have this error message, we can only speculate.

Is there any strategy to prevent this from happening?

The only general strategy I can give is to see your local hardware dealer to get more disk.

If that is not an option for you, you need to get an understanding of which operation led to the file(s) growing. Autogrow events are captured by the default trace. More detailed information, including the statement that triggered the autogrow, can be captured by an extended events session.
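Both approaches can be sketched roughly as follows. The first query reads autogrow events for tempdb from the default trace, assuming the default trace is enabled; it only reads the current trace file and the rollover files after it. The second part creates an extended events session; the session name tempdb_file_growth and the file name are hypothetical choices of mine.

```sql
-- 1) Autogrow events for tempdb from the default trace.
DECLARE @path nvarchar(260);
SELECT @path = path FROM sys.traces WHERE is_default = 1;

SELECT  t.StartTime,
        CASE t.EventClass WHEN 92 THEN 'Data File Auto Grow'
                          WHEN 93 THEN 'Log File Auto Grow' END AS event_name,
        t.FileName               AS logical_file_name,
        t.IntegerData / 128      AS growth_mb,      -- IntegerData = 8 KB pages added
        t.Duration / 1000        AS duration_ms,    -- Duration is in microseconds
        t.ApplicationName,
        t.LoginName,
        t.SPID
FROM    sys.fn_trace_gettable(@path, DEFAULT) AS t
WHERE   t.EventClass IN (92, 93)
  AND   t.DatabaseName = N'tempdb'
ORDER BY t.StartTime DESC;

-- 2) Extended events session that also records the statement text.
CREATE EVENT SESSION [tempdb_file_growth] ON SERVER
ADD EVENT sqlserver.database_file_size_change
(
    ACTION (sqlserver.session_id, sqlserver.client_app_name, sqlserver.sql_text)
    WHERE ([database_id] = (2))          -- tempdb always has database_id 2
)
ADD TARGET package0.event_file
(
    SET filename = N'tempdb_file_growth.xel', max_file_size = (50)   -- MB per file
)
WITH (STARTUP_STATE = ON);

ALTER EVENT SESSION [tempdb_file_growth] ON SERVER STATE = START;
```

With the session running during the early-morning routines, the .xel file should show which session and statement were active each time a tempdb file grew.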

It is not at all unlikely that you find that the operation is part of the normal workload for the application, and it is not desirable to change it. In that case, you may end up at the local hardware dealer anyway.

I note from your screenshots that the various .mdf and .ndf files have quite varying sizes. If this is caught in flight, there can be some variation even if the files were originally created with the same size. But that variation should not be this big. I think you should review the sizes of the files and modify them so that they are all of the same size.
Sandro Alves 51 Reputation points

2023-02-05T18:31:59.13+00:00
Hi,

the files are configured like this:

I understand that increasing the disk size is an option when some activity requires the space. But I don't feel safe picking a new size without any rationale and counting on luck.

I say that because I can increase it, but I won't know whether some later activity will need more space than what was added, and then we will have a stoppage again.

We can see that the 70 GB disk leaves a margin if we take into account that there are 6 files with a maximum size of 8 GB each. But tempdb.log is unlimited; that is, it can grow as much as it needs.

New workloads are constantly being added to SQL Server, and this has caused a lot of concern about disk consumption.

So I think like this:

It would be interesting to monitor the growth and shrinkage of these files individually to confirm they are doing the job we expect. I believe so, otherwise I would have constant disk overflow events and I don't.

Monitoring workloads to understand which one is causing it is a good option. We already know that it happened during a specific workload that failed, but I don't know if we have more information to help us identify which of the tempdb files consumed the space.

Other questions:

When this overflow occurs or when space is close to running out, what is the correct procedure to free up disk space without losing the workload that is running?

Last time the DBA team ran a shrink during the routines and another DBA said it couldn't be done as it would compromise all running workloads.

Thanks.

Answer 3

Erland Sommarskog 121.4K MVP Volunteer Moderator

To repeat what I said in my last post: make sure that all data files for tempdb have the same size. Furthermore, remove the max size setting. It does not serve you. If you add more disk, you will still face error 1105 when you have consumed 12*8 = 96 GB of tempdb space. 96 GB of tempdb space may sound like a lot, and indeed if your actual database is only 5-10 GB it is. But if your production database is 5-10 TB, it is not.

It would be interesting to monitor the growth and shrinkage of these files

The only time these files shrink is when you restart SQL Server, in which case they return to their configured sizes. But once the system is running, they can only grow. SQL Server will not shrink them on its own initiative.
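One way to see how far each file has grown beyond its configured size is to compare two catalog views. This is a sketch that relies on the commonly documented behaviour that, for tempdb, sys.master_files holds the configured startup size while tempdb.sys.database_files holds the current size.

```sql
-- Configured (startup) size versus current size for each tempdb file.
-- Sizes are stored as counts of 8 KB pages (128 pages = 1 MB).
SELECT  mf.name                    AS logical_name,
        mf.type_desc,
        mf.size / 128              AS configured_mb,   -- size after the next restart
        df.size / 128              AS current_mb,      -- size right now
        (df.size - mf.size) / 128  AS grown_since_restart_mb
FROM    sys.master_files          AS mf
JOIN    tempdb.sys.database_files AS df ON df.file_id = mf.file_id
WHERE   mf.database_id = DB_ID(N'tempdb')
ORDER BY mf.type_desc, mf.file_id;
```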

When this overflow occurs or when space is close to running out, what is the correct procedure to free up disk space without losing the workload that is running?

If you want to save the workload from crashing, all you can do is to find more disk space for data or log, depending on what is about to run out.

Last time the DBA team ran a shrink during the routines

That's pointless in most cases. If the file has grown to a certain size, it was because that space was needed. The required space is not going to be less if you shrink a file. All that happens is that the file will grow again.

But if the tempdb files compete with other files on the disk, it may be necessary to shrink files once the operation has crashed. On the other hand, if the disk is dedicated to tempdb, why bother?
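Whether tempdb shares its volumes with anything else can be checked roughly as follows. This is a sketch using sys.dm_os_volume_stats; the column aliases are my own.

```sql
-- Per volume: space allocated to tempdb files versus volume size and free space.
SELECT  vs.volume_mount_point,
        COUNT(*)                            AS tempdb_files,
        SUM(CAST(df.size AS bigint)) / 128  AS tempdb_allocated_mb,
        MAX(vs.total_bytes)     / 1048576   AS volume_size_mb,
        MAX(vs.available_bytes) / 1048576   AS volume_free_mb
FROM    tempdb.sys.database_files AS df
CROSS APPLY sys.dm_os_volume_stats(DB_ID(N'tempdb'), df.file_id) AS vs
GROUP BY vs.volume_mount_point;
```

If volume_size_mb minus volume_free_mb is much larger than tempdb_allocated_mb, something other than tempdb is also using that volume.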

Sandro Alves 51 Reputation points

2023-02-05T20:00:46.8866667+00:00
Hi,

sorry for my confusion in understanding, because I'm still learning.

I didn't understand what you mentioned again about the sizes.

Did I show that they are the same size or not? They are configured with the 8192 MB limit. The only one that has no limit is templog.ldf. Is that why you suggested not limiting the others?

Actually there is:

(7 files on dedicated disk on disk 1 - 70GB)

tempdb.mdf

tempdb.ldf

temp11.ndf

temp3.ndf

temp5.ndf

temp7.ndf

temp9.ndf

(6 files on dedicated disk on disk 2 - 70GB)

temp2.ndf

temp4.ndf

temp6.ndf

temp8.ndf

temp10.ndf

temp12.ndf

These disks always keep plenty of free disk space, using a maximum of 30% of the disk.

The graph represents free disk space.

Our databases are big, yes: approximately 2 TB in total.

If these files only shrink when SQL Server is restarted, I understand that our tempdb usage is very small because, as I commented, this has only happened twice.

Looking at the history of disk consumption, I see that usage really does not vary; it stays constant.

We only restart SQL Server when installing OS updates. We did that yesterday, but I don't see disk space being freed at startup as you mentioned.

I don't know if it has anything to do with the fact that we use Always On. I see that 20% was released after the reboot but, following your explanation, I understand that it should have freed more space.

Regarding the shrink, I think the same. It doesn't make sense because the file will grow again; that's logical. But the last few times, this was the action taken to return disk usage to its normal level of 30%.

In summary, my concerns to raise with the DBAs are:

Whether the configuration is correct.

How to measure how much I need to increase the disk, which honestly I don't see the need for, since consumption has always remained at a maximum of 30%.

And how the DBAs can help me investigate the process that ran, to prevent this consumption from happening again, and what I can do to help prevent it if it does. You've already shared that the DBAs need to investigate the work, but what are they going to tell me after that analysis? "Administrator, increase the disk"? I can increase it, but why?

Thanks.
Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator

2023-02-05T20:32:36.42+00:00

I can see that there is a long comment, but it is marked as deleted. Was that intentional? I ask because the anti-spam software on this platform is sometimes a little overly aggressive. If it was not intentional, I have the moderator powers to undelete it. But I would obviously not do that without your consent.
Sandro Alves 51 Reputation points

2023-02-06T01:32:48.0433333+00:00

Hi,

Yes, I responded to your last comment with a long text.

If you can get it back, thank you.

Thanks.
Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator

2023-02-07T22:35:09.28+00:00
Did I show that they are the same size or not? They are configured with the 8192 MB limit. The only one that has no limit is templog.ldf. Is that why you suggested not limiting the others?

Yes, I suggest that you remove the 8192 limit. And, no, they are not of the same size. I would suggest that you run something like:

ALTER DATABASE tempdb MODIFY FILE (NAME = 'tempdev', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp2', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp2', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp3', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp4', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp5', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp6', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp7', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp8', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp9', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp10', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp11', SIZE = 2048 MB, MAXSIZE = UNLIMITED), ALTER DATABASE tempdb MODIFY FILE (NAME = 'temp12', SIZE = 2048 MB, MAXSIZE = UNLIMITED)

Although, this gives you only 24 GB of tempdb, which is quite humble for 2TB.

And how the DBAs can help me investigate the process that ran, to prevent this consumption from happening again, and what I can do to help prevent it if it does.

Even 140 GB is not startling for 2 TB of data. I would say that if these events where you outrun the tempdb disks are recurring... yeah, you can track down what happened and adjust that process so it does not happen again. But then it happens with another operation, and then with a third. In the end, which is cheaper: having the DBA team try to optimise, or getting more disk space? (But, OK, if you find that you need 50 TB of tempdb space for that 2 TB, something might be broken.)

Answer 4

LiHongMSFT-4306 31,566

Hi @Sandro Alves

You can use sys.dm_db_file_space_usage to see how the space in tempdb is being used.

By monitoring this, you can tell what kind of object is using the tempdb space: user objects (user_object_reserved_page_count), internal objects (internal_object_reserved_page_count), or the version store (version_store_reserved_page_count).
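A minimal sketch of such monitoring is shown below. The first query totals the three categories from sys.dm_db_file_space_usage. The second is an addition of mine that uses sys.dm_db_session_space_usage to show which sessions hold the most tempdb space.

```sql
USE tempdb;

-- What kind of object is using tempdb space right now (128 pages = 1 MB)?
SELECT  SUM(user_object_reserved_page_count)     / 128 AS user_objects_mb,
        SUM(internal_object_reserved_page_count) / 128 AS internal_objects_mb,
        SUM(version_store_reserved_page_count)   / 128 AS version_store_mb,
        SUM(unallocated_extent_page_count)       / 128 AS free_mb
FROM    sys.dm_db_file_space_usage;

-- Which sessions hold the most tempdb space?
-- Note: pages allocated by tasks that are still running are reported in
-- sys.dm_db_task_space_usage and are not included here until the task completes.
SELECT TOP (10)
        su.session_id,
        es.login_name,
        es.program_name,
        (su.user_objects_alloc_page_count
           - su.user_objects_dealloc_page_count)     / 128 AS user_objects_mb,
        (su.internal_objects_alloc_page_count
           - su.internal_objects_dealloc_page_count) / 128 AS internal_objects_mb
FROM    sys.dm_db_session_space_usage AS su
JOIN    sys.dm_exec_sessions          AS es ON es.session_id = su.session_id
WHERE   su.database_id = 2            -- tempdb
ORDER BY (su.user_objects_alloc_page_count - su.user_objects_dealloc_page_count)
       + (su.internal_objects_alloc_page_count - su.internal_objects_dealloc_page_count) DESC;
```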

Another question is what the best initial tempdb size is. The appropriate size of tempdb in a production environment depends on a variety of factors, so there is no fixed answer to this question. These factors include the existing workloads and the SQL Server features used, so every SQL Server instance will be different. If a new feature that uses tempdb is added to the same SQL Server instance, its space usage will also change. It is recommended that you analyze your existing workload by performing the following tasks in a SQL Server test environment:

1. Turn on autogrowth for tempdb.
2. Simulate individual queries or work tasks while monitoring tempdb space usage.
3. Simulate performing some system maintenance operations, such as rebuilding indexes, while monitoring tempdb space.
4. Use the tempdb space usage values from steps 2 and 3 to predict how much space will be used under the total workload, and adjust this value for the planned concurrency. For example, if a task uses 10 GB of tempdb space, and in a production environment there may be up to 4 such tasks running at the same time, reserve at least 40 GB of space.
5. According to the value obtained in step 4, set the initial size of tempdb in the production environment. Autogrowth is also turned on.
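The measuring part of these steps could be sketched as follows. The table dbo.tempdb_usage_sample is hypothetical and should live in a utility database, not in tempdb. The INSERT is meant to be run repeatedly (for example from an Agent job every minute) while the test workload runs. The concurrency factor of 4 is the example figure from step 4.

```sql
-- Hypothetical sampling table for tempdb usage during the test runs.
CREATE TABLE dbo.tempdb_usage_sample
(
    sample_time         datetime2(0) NOT NULL DEFAULT SYSDATETIME(),
    user_objects_mb     bigint       NOT NULL,
    internal_objects_mb bigint       NOT NULL,
    version_store_mb    bigint       NOT NULL
);

-- Take one sample; schedule this statement to run at a regular interval.
INSERT INTO dbo.tempdb_usage_sample (user_objects_mb, internal_objects_mb, version_store_mb)
SELECT  SUM(user_object_reserved_page_count)     / 128,
        SUM(internal_object_reserved_page_count) / 128,
        SUM(version_store_reserved_page_count)   / 128
FROM    tempdb.sys.dm_db_file_space_usage;

-- After the test: peak usage for the task, scaled by the planned concurrency.
DECLARE @concurrency int = 4;   -- assumed number of such tasks running at once

SELECT  MAX(user_objects_mb + internal_objects_mb + version_store_mb)                AS peak_mb,
        MAX(user_objects_mb + internal_objects_mb + version_store_mb) * @concurrency AS suggested_initial_mb
FROM    dbo.tempdb_usage_sample;
```

This only measures data-file usage; the tempdb log would need to be sized separately.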

Also, the number of tempdb files and the size settings should not only meet the needs of user tasks, but also consider performance optimization.

Best regards,

Cosmog Hong

Sandro Alves 51 Reputation points

2023-02-07T16:25:11.7233333+00:00

Hi,

I hope my comment is accepted as the last one I wrote didn't show up.

So this one is just a test.

Thanks.
Sandro Alves 51 Reputation points

2023-02-07T16:36:31.02+00:00

Hi,

I understand your opinion, and I've always heard the same in conversations with DBAs: it all depends on the workload.

As I said before, we don't normally have problems with our tempdb; everything works fine. However, during some routine it consumed all the disk space, which, based on our whole conversation, can happen because "it depends on the workload".

To have this view of consumption, we would have to make these estimates in the staging environment, evaluating each workload before taking it to production.

We know that we don't have control over what the developers build, so predicting whether a specific routine will cause out-of-the-ordinary consumption is hard for us administrators. I understand that, and that's why we always add a margin on top to avoid these surprises, even though I don't believe this is the best way of working.

I apologize for my insistence on the subject. There are good practices that we must follow, and I always believed that we follow them, such as in the configuration I showed earlier.

When you question the configuration, I get worried, because I understood that our environment had always been configured according to good practices, yet you are advising me to change it and to look at the day-to-day tasks in order to estimate a better size.

I reiterate that growth in our tempdb is set to automatic; only the .ndf files are limited. Is this a problem?

I believe that the file that grew was tempdb.log, which is configured for automatic growth. Some routine that I couldn't anticipate made it grow beyond what was expected and consume the entire disk.

In summary to conclude:

When you talk about tempdb, in my view tempdb is made up of files (tempdb.mdf, tempdb.ldf and the .ndf files).

So I understood that SQL Server uses the tempdb files, I mean the .ndf files, intelligently; that is, it uses them randomly according to need, because tempdb.ldf and tempdb.mdf are unique. I don't know if I'm making myself clear, but I'll try.

Since tempdb.ldf and tempdb.mdf are on the same disk as some of the .ndf files, they compete for space. Although I have estimated the space for the .ndf files at 6 files x 8 GB, tempdb.log and tempdb.mdf also consume space on the same disk. The other disk only has .ndf files. I don't know if this separation strategy is the best one. For example, dedicating an exclusive, very large disk to tempdb.log and tempdb.mdf doesn't seem to make sense, since I can simply increase the space where they already are, alongside the .ndf files.

The number of tempdb files was chosen with performance in mind, as it is considered good practice to even separate them on different disks for better I/O management. Does that make sense to you?

From your point of view, since tempdb.mdf and tempdb.ldf sit together with the .ndf files, would separating them make sense, so that tempdb.ldf, which can grow automatically, does not compete with the .ndf files, which are capped?

Or is all this irrelevant, and the real answer is that when the disk fills up we add more space and rely on luck, adding more and more as new jobs are created?

Thanks.
