SQL Server Linux: fsync and Buffered I/O
An example of the error: The application does a buffered write and receives success. (This means data can be stored in file system cache, non-stable media. ) An fsync/fdatasync is used to make sure the data is stored in stable media. The stable media write(s) occurs during the sync operation and can fail for any number of reasons (out of disk space, lost connectivity to SAN, etc.) reporting an EIO error.
The issue, as described in the link, is that the sync returns the error but may clear the state of the cached pages. The next sync returns ESUCCESS, meaning the the write(s), which were in cache, do not flush to stable media but applications were told they did.
Assume the database application opens the backup file, allowing file system caching (~_O_DIRECT.) SQL Server does NOT do this, in fact I made sure SQL Server on Linux does not allow this to occur.
· Backup is started using buffered I/O
· Writes occur to file system cache as the backup is progressing
· An external call to sync occurs causing file system cache writes to take place
· The sync encounters a write error on a buffer belonging to the backup. The sync was called by an application outside database application so the database application is unaware of the failure.
· At the end of backup the database application issues fdatasync and instead of returning EIO, ESUCCESS is returned.
· A successful fdatasync indicates a hardened to stable media backup and the database application then truncates the inactive part of the transaction log.
The database no longer has the proper log records to perform transaction operations or recovery and the backup file, once thought to be valid, is invalid.
This issue does NOT impact SQL Server database, log and backup files. SQL Server opens these file types using O_DIRECT to bypass file system cache. Even when SQL Server on Linux is running in Forced Flush mode the files are opened with O_DIRECT, avoiding the issue.
Bob Dorr - Principal Software Engineer SQL Server