Oops Recovery with Temporal Tables
Reviewed by: Kun Cheng, John Hoang, Sanjay Mishra, Borko Novakovic, Denzil Ribeiro, Murshed Zaman
Have you ever got that sinking feeling after hitting the Execute button in SSMS, thinking “I should not have done that”? DML statements with the missing WHERE clause, DROP statements accidentally targeting slightly mistyped (but existing) tables or databases, RESTORE statements overwriting databases with new data that haven’t been backed up, are all examples of actions prompting an “Oops…” (or worse) shortly thereafter. “Oops recovery” is the term that became popular to describe the process of fixing the consequences.
For most of these scenarios, the usual, and often the only, recovery mechanism is to restore the database from backup to a point in time just before the “oops”, known as point-in-time recovery (PITR). Even though PITR remains the most general and the most effective recovery mechanism, it does have some drawbacks and limitations: the recovery process requires a full database restore, taking the time proportional to the size of the database; a sequence of restores may be needed if multiple “oops” transactions have occurred; in the general case, there will be difficulties reconciling recovered data with data modified after the “oops” point in time, etc. Nevertheless, PITR remains the most widely applicable recovery method for SQL Server databases, both on-premises and in the cloud.
In this article, we would like to discuss another option that became recently available, that can greatly simplify some recovery scenarios. We will discuss recovering from an “oops” using temporal tables, available in SQL Server 2016 and in Azure SQL Database. The linked documentation provides a detailed description of this new feature. As a quick introduction, a temporal table keeps a record of all data changes by saving versions of rows in a separate history table, with a validity period attached to each version. T-SQL language has been extended to simplify querying of current and historical data. In terms of performance overhead, there is none for INSERT and SELECT statements on the current data. The overhead of other statements is similar to that incurred by maintaining an additional index, and is in any case less than the overhead of other solutions for keeping history, such as triggers or CDC.
From the outset, we need to say that this method is applicable to a rather narrow subset of scenarios, considering all the possibilities for an “oops”. It also requires advance preparation, i.e. modifying table schema to add temporal period columns, and explicitly enabling system versioning. But for those specific scenarios, it allows a much simpler recovery process than PITR, and is therefore worth considering in detail.
To explain the recovery process with a temporal table, let’s consider an example (based on a true story).
In this example, an application uses a SQL Server table as a queue. Messages are generated by multiple application processes, and for each message, a row is inserted into this table. A separate process retrieves these messages from the queue, i.e. executes a single row SELECT statement against the table, processes message payload, and then deletes the processed message row from the table. (As an aside, this is not the most optimal way to implement a queue using a table. It would be better to dequeue messages using a DELETE statement with an OUTPUT clause.)
A new code release introduces a bug where rows are selected, and then deleted immediately, without calling the code to process the message. This is not noticed until 7000 messages are deleted without having been processed, while the rest of the workload continues to run and modify data in the database.
To recover from this using the traditional point-in-time recovery, it would have been necessary to perform 7000 precise point-in-time restores, which is not feasible for most applications. Another possible option is to reconstruct the data from the transaction log, however there are no documented or supported ways to do that, and it may be extremely complex or not even possible in the general case.
Now let’s consider what would happen if the queue table in this example were a temporal table.
CREATE TABLE dbo.WorkQueue
(
WorkQueueId int NOT NULL,
MessagePayload nvarchar(max) NOT NULL,
SysStartDateTime datetime2 GENERATED ALWAYS AS ROW START HIDDEN NOT NULL,
SysEndDateTime datetime2 GENERATED ALWAYS AS ROW END HIDDEN NOT NULL,
PERIOD FOR SYSTEM_TIME (SysStartDateTime, SysEndDateTime),
CONSTRAINT PK_WorkQueue PRIMARY KEY (WorkQueueId)
)
;
Compared to a regular table, there are two new columns added here, SysStartDateTime and SysEndDateTime. The values for these columns are automatically generated by SQL Server when rows are inserted, updated, and deleted. Note that in this example, the columns are hidden. While making these columns hidden is optional, it may be useful to avoid impacting the application. If the application code does not follow the best practice of always explicitly listing column names, e.g. if it uses SELECT *, or INSERT statements without target column list, then the addition of new columns can break it. Making the period columns hidden avoids this problem. Note that even if the columns are hidden, it is still possible to see column values if they are explicitly included in the SELECT column list.
If you are familiar with temporal tables, you may have noticed that something is missing from the CREATE TABLE statement above. Specifically, this statement only creates the current table, and there is no mention of the history table. The history table is where older row versions are saved when UPDATE, DELETE, and MERGE statements modify rows in the current table.
In this example, we intentionally do not create the history table right away. If we did, it would start immediately accumulating row versions. For a queue table, it means that every message placed on the queue would effectively remain in the database, potentially using a significant amount of storage. Whether that makes sense depends on the specific application context. If the choice is to have the system versioning enabled at all times, then using a Clustered Columnstore Index for the system table would reduce storage overhead. An example is provided in Creating a temporal table with a user-defined history table.
In this example, we assume that the oops protection provided by temporal tables is only needed during some critical phases of application lifecycle, i.e. during a new code release.
Just prior to the release, we go ahead and enable protection by turning on system versioning:
ALTER TABLE dbo.WorkQueue SET
(SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.WorkQueueHistory))
;
Note that the dbo.WorkQueueHistory history table referenced in the above statement does not have to be created explicitly. It will be automatically created by SQL Server as a part of the ALTER TABLE statement above, based on the schema of the dbo.WorkQueue table.
From this point on, whenever a row in the dbo.WorkQueue table is updated or deleted, the version of the row as it existed immediately prior to the update or delete will be inserted in the dbo.WorkQueueHistory table.
Next, the application code (with the disastrous bug) is deployed. Before allowing the application to use the database, we note the exact time of the release (as an aside, this is a best practice for any critical change):
SELECT SYSUTCDATETIME() AS ReleaseDateTime;
-- 2016-11-03 17:07:21.5027748
The application is brought online, messages are added to the queue, and, due to the bug, are deleted right away:
INSERT INTO dbo.WorkQueue (WorkQueueId, MessagePayload)
VALUES (1, 'Message1');
DELETE dbo.WorkQueue
WHERE WorkQueueId = 1;
INSERT INTO dbo.WorkQueue (WorkQueueId, MessagePayload)
VALUES (2, 'Message2');
DELETE dbo.WorkQueue
WHERE WorkQueueId = 2;
INSERT INTO dbo.WorkQueue (WorkQueueId, MessagePayload)
VALUES (3, 'Message3');
DELETE dbo.WorkQueue
WHERE WorkQueueId = 3;
A few minutes later, or a few hours later if we are unlucky, the oops moment arrives, and the problem is noticed. The application is taken offline, and developers start working on a fix. At this point, we note the time when the “bad” workload stops:
SELECT SYSUTCDATETIME() AS WorkloadStopDateTime;
-- 2016-11-03 17:07:40.0709518
In the meantime, the queue table is empty:
SELECT WorkQueueId,
MessagePayload
FROM dbo.WorkQueue
;
(0 row(s) affected)
The immediate question is whether anything can be done to bring back message data that wasn’t processed while the code with the bug was deployed. Luckily, because we enabled system versioning prior to the release, we can indeed do it, using a single TSQL statement:
WITH LatestVersion AS
(
SELECT WorkQueueId,
MessagePayload,
ROW_NUMBER() OVER (PARTITION BY WorkQueueId ORDER BY SysEndDateTime DESC) AS VersionNumber
FROM dbo.WorkQueueHistory
WHERE -- restrict to rows created after the release ...
SysStartDateTime >= '2016-11-03 17:07:21.5027748'
AND
-- ... and rows deleted before the fix went in
SysEndDateTime < '2016-11-03 17:07:40.0709518'
)
INSERT INTO dbo.WorkQueue
(
WorkQueueId,
MessagePayload
)
SELECT WorkQueueId,
MessagePayload
FROM LatestVersion
WHERE VersionNumber = 1
;
In this statement, we start with a query to retrieve all row versions from the dbo.WorkQueueHistory history table that have their validity period starting on or after the release time, and ending before the application went offline. This is the query within the LatestVersion CTE. In that query, we use the ROW_NUMBER() window function to number row versions for each PK value (i.e. for each WorkQueueId) in chronologically descending order, so that the latest version becomes version number one. In our specific example, there happens to be only one version, because only one DML statement (the erroneous DELETE) affected each row. In a more general case though (e.g. if a row was updated before having been deleted), multiple versions could exist, therefore we need to determine which version is the latest, which is achieved by numbering versions using ROW_NUMBER(). Then, we restrict this result set to filter out all versions but the latest, which is the version of the row just before it was erroneously deleted. Then, we insert these deleted rows back into the dbo.WorkQueue table, effectively recovering from the oops.
We can see that the unprocessed messages are back in the queue table:
SELECT WorkQueueId,
MessagePayload
FROM dbo.WorkQueue
;
(3 row(s) affected)
There are two important caveats to note here.
If you are familiar with temporal tables, you may be wondering why we used the history table directly, instead of using one of the new FOR SYSTEM_TIME temporal clauses in the FROM clause of the query. The reason is that FOR SYSTEM_TIME filters out row versions with the same validity start and end times, as noted here. In our scenario, the INSERT and DELETE statements can happen so quickly one after the other, that the system clock resolution is insufficient for the timestamps to be different. In that case, had we used FOR SYSTEM_TIME, some of the row versions from the history table that we need for data recovery would be filtered out.
Careful readers may also notice that while the recovery statement above works as expected for the current example, it may be problematic in a more general case, specifically when multiple versions of a given row exist in the history table. That will happen if a row in the current table is updated at least once after having been inserted. In that case, there is no guarantee that the validity periods of multiple versions of the same row will be different. For example, if all changes happen in a transaction, then the validity period for every version created in the transaction will be the same, as documented here. This can also happen if, as mentioned earlier, the system clock does not change between very short DML statements. But in the above statement, we order row versions using the SysEndDateTime period column to determine the latest version! Therefore, if there is more than one row with the same SysEndDateTime, the result of the sort will be non-deterministic, and the row version inserted into the current table may not be the latest.
In most cases, there is a simple solution for this problem. Instead of using a period column to determine the latest row version, we can add a rowversion column to the current table, and use it for ordering versions in the history table. A rowversion column is automatically generated and incremented on each update, therefore the latest version of a row in the history table will have the highest rowversion value for the PK value of the row. We say “in most cases”, because today rowversion columns are not supported in memory-optimized tables.
Once the fixed code is deployed, and the application is working again, we turn off system versioning to avoid accumulating queue row versions in the database, until the next release:
ALTER TABLE dbo.WorkQueue SET (SYSTEM_VERSIONING = OFF);
In this simple example, we have shown how using a temporal table provides a simple way to recover from accidental data loss. The approach is also documented more generically in Repairing Row-Level Data Corruption. This is not by any means a universally applicable method; after all, for most databases, it would not be practical to make every table in the database a temporal table, not to mention that many data loss scenarios are much more complex than this example. However, if you have some tables containing critically important data, and/or data that is at high risk of accidental and erroneous changes, then using a temporal table provides a simple way to recover from an oops, keeping the database online, and avoiding a much heavier PITR process.