
SQL Server database corruption issue

Lincoln Brill 0 Reputation points
2025-09-16T13:50:05.4166667+00:00

A production database is corrupted, as confirmed by running the "dbcc checkdb" command. The corruption started last Wednesday night, which required me to restore the database on Sunday night from the most recent clean .bak file and the subsequent .trn log files. After the restore I successfully ran a manual backup, followed by an index rebuild and a backup using the normal SQL maintenance plan commands. On Monday the restored database was used as normal, and then the evening maintenance plan reported corruption again.

How do I figure out the root cause of the corruption?

SQL Server Database Engine

Answer recommended by moderator
  1. Erland Sommarskog 133.7K Reputation points MVP Volunteer Moderator
    2025-09-16T21:12:37.7433333+00:00

    Corruption is always due to issues outside SQL Server, that is, hardware errors or errors somewhere in the I/O subsystem. The root cause could be a bad memory stick, but that often manifests itself in unexpected crashes as well. The most likely culprit is the I/O subsystem, which in today's virtualised world is very complex, with many components in both hardware and software.

    Tracking down the exact component can be very tedious, since you basically have to replace the components one by one and then run long enough each time to feel sure that corruption is no longer occurring.

    The alternative is to move to new hardware directly. That is a discussion you will have to have with your infrastructure folks.

    Staying in production with what you have now is not really an option.


1 additional answer

  1. K Durga Prasanna 30 Reputation points Microsoft External Staff
    2025-10-30T11:21:00.1933333+00:00

    Hi @Lincoln Brill

    Welcome to Microsoft Q&A Forum.

    The recurring corruption indicates an underlying issue with the I/O subsystem or hardware. To identify the root cause, start by running DBCC CHECKDB to determine which pages are affected. Then review the SQL Server error logs and the Windows Event Logs for I/O-related errors, specifically errors 823, 824, and 825, which signal problems with reading from or writing to disk.
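
    As a rough sketch of these first steps, the T-SQL below runs the consistency check, lists the pages SQL Server has recorded as suspect, and searches the current error log. The database name YourDatabase is a placeholder, and the search strings are an assumption about how the messages appear in your error log, so adjust them if nothing is returned.

    ```sql
    -- Placeholder name: replace YourDatabase with the affected database.
    -- Full consistency check; hide informational messages, report every error.
    DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;

    -- Pages SQL Server has recorded as suspect.
    -- event_type 1-3 covers 823/824 errors, bad checksums, and torn pages.
    SELECT DB_NAME(database_id) AS database_name,
           file_id,
           page_id,
           event_type,
           error_count,
           last_update_date
    FROM msdb.dbo.suspect_pages
    ORDER BY last_update_date DESC;

    -- Search the current SQL Server error log (log 0, log type 1).
    EXEC sys.sp_readerrorlog 0, 1, N'Error: 823';
    EXEC sys.sp_readerrorlog 0, 1, N'Error: 824';
    -- Error 825 is a read-retry warning; search for its message text instead.
    EXEC sys.sp_readerrorlog 0, 1, N'succeeded after failing';
    ```

    The timestamps in suspect_pages and in the error log are worth comparing with the Windows System event log for the same period, since storage or controller errors are often reported there at around the same time.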

    Next, run comprehensive hardware diagnostics on the server, including memory tests and disk health checks (e.g., SMART status). To simulate SQL Server I/O patterns and stress-test the storage system, use the SQLIOSim utility, which can help reproduce potential hardware faults under load.  

    If the corruption does not recur when the same database is restored and used on a different server or storage system, this strongly suggests that the original hardware stack is at fault. In that case, the long-term resolution is to repair or replace the failing component, whether it is a disk, controller, memory module, or virtualized storage layer, or to migrate the database to more reliable infrastructure.
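
    A minimal sketch of such a test restore on a different server is shown below. All paths, the logical file names, and the YourDatabase_Test name are hypothetical; run RESTORE FILELISTONLY first to see the real logical names.

    ```sql
    -- Hypothetical paths and names throughout; adjust to your environment.
    -- List the logical file names stored in the backup.
    RESTORE FILELISTONLY
    FROM DISK = N'D:\Backups\YourDatabase.bak';

    -- Check that the backup is readable. WITH CHECKSUM also validates checksums,
    -- but it fails if the backup was not taken WITH CHECKSUM; omit it in that case.
    RESTORE VERIFYONLY
    FROM DISK = N'D:\Backups\YourDatabase.bak'
    WITH CHECKSUM;

    -- Restore under a different name onto the other server's storage.
    RESTORE DATABASE YourDatabase_Test
    FROM DISK = N'D:\Backups\YourDatabase.bak'
    WITH MOVE N'YourDatabase'     TO N'E:\Data\YourDatabase_Test.mdf',
         MOVE N'YourDatabase_log' TO N'F:\Log\YourDatabase_Test.ldf',
         RECOVERY;

    -- Check the restored copy right away, then again after it has carried some load.
    DBCC CHECKDB (N'YourDatabase_Test') WITH NO_INFOMSGS, ALL_ERRORMSGS;
    ```

    If the copy stays clean under a comparable workload while the original server keeps producing errors, that points to the original hardware or storage path rather than to the database itself.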

    I hope this information is helpful. If you have any further questions, please let us know and we can assist you further.

     

    Warm Regards.

