Using Checkpoints in Packages
Integration Services can restart failed packages from the point of failure, instead of rerunning the whole package. If a package is configured to use checkpoints, information about package execution is written to a checkpoint file. When the failed package is rerun, the checkpoint file is used to restart the package from the point of failure. If the package runs successfully, the checkpoint file is deleted, and then re-created the next time that the package is run.
Using checkpoints in a package can provide the following benefits.
Avoid repeating the downloading and uploading of large files. For example, if a package downloads multiple large files by using a separate FTP task for each download, and one download fails, the restarted package downloads only the file that failed.
Avoid repeating the loading of large amounts of data. For example, if a package performs bulk inserts into dimension tables in a data warehouse by using a separate Bulk Insert task for each dimension, and the insertion fails for one dimension table, the restarted package reloads only that dimension.
Avoid repeating the aggregation of values. For example, if a package computes many aggregates, such as averages and sums, by using a separate Data Flow task for each aggregation, and one aggregation fails, the restarted package recomputes only that aggregation.
If a package is configured to use checkpoints, Integration Services captures the restart point in the checkpoint file. The type of container that fails and the implementation of features such as transactions affect the restart point that is recorded in the checkpoint file. The current values of variables are also captured in the checkpoint file. However, the values of variables that have the Object data type are not saved in checkpoint files.
If the package is restarted, Integration Services does not reload the package configurations. Instead, the package uses the configuration information that was written to the checkpoint file. This ensures that, when the package runs again, the package uses the same configurations as when the package failed.
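The restart behavior described above can be illustrated with a small, self-contained model. This is only a conceptual sketch in Python, not how Integration Services is implemented: the function name `run_with_checkpoint`, the JSON file layout, and the rule "persist only simple values" (standing in for the exclusion of Object-typed variables) are all assumptions made for illustration.

```python
import json
import os

# Stand-in for "not the Object data type" (an assumption of this sketch).
SIMPLE_TYPES = (str, int, float, bool)


def run_with_checkpoint(tasks, checkpoint_path, variables):
    """Run (name, func) pairs in order; return True if all succeed."""
    completed = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            saved = json.load(f)
        completed = saved["completed"]
        variables.update(saved["variables"])  # restore captured values

    def save():
        # Only simple values are persisted, loosely mirroring the rule
        # that Object-typed variables are not saved.
        simple = {k: v for k, v in variables.items()
                  if isinstance(v, SIMPLE_TYPES)}
        with open(checkpoint_path, "w") as f:
            json.dump({"completed": completed, "variables": simple}, f)

    for name, func in tasks:
        if name in completed:
            continue  # finished in an earlier run, so it is not repeated
        try:
            func(variables)
        except Exception:
            save()  # record the restart point, then stop
            return False
        completed.append(name)
        save()

    # Success: delete the file so that the next run starts from the top.
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)
    return True
```

Calling the function twice with the same task list and path models a failed run followed by a restart: the second call skips every task that the first call recorded as completed.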
Defining Restart Points
The following Integration Services components are the atomic units of work that you can restart by using checkpoints:
Task: The task host container, which encapsulates a single task, is the smallest atomic unit of work that can be restarted.
Note
Because the Data Flow task, which includes all its contents, is an atomic unit of work, you cannot restart a package in the middle of the data flow. To avoid rerunning the whole data flow, you might design the package to include multiple Data Flow tasks. This way, when the package restarts, only the Data Flow tasks that failed will be run again.
Transacted container: A transacted container is also an atomic unit of work that can be restarted. If a package is stopped while a transacted container is running, the transaction ends, and any work performed by the transaction is rolled back. However, the checkpoint file does not contain information about the work completed by the child containers, and both the transacted container and its child containers run again when the package restarts.
To minimize the possible conflicts between checkpoints and transactions, Integration Services does not save checkpoint information about what happens inside a container when one of the following conditions is true:
The value of the TransactionOption property of the container is Required.
—or—
The value of the TransactionOption property of the container is Supported, but the parent container owns, or is enrolled in, a transaction.
Note
Using checkpoints and transactions in the same package could cause unexpected results. For example, when a package fails and restarts from a checkpoint, the package might repeat a transaction that has already been successfully committed.
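The two conditions above reduce to a simple predicate. The following Python sketch is illustrative only; the function name `saves_inner_checkpoints` and the boolean parameter are hypothetical, and the strings are the documented values of the TransactionOption property.

```python
def saves_inner_checkpoints(transaction_option, parent_in_transaction):
    """Return True if checkpoint information is saved for work done
    inside a container, according to the two conditions described above.

    transaction_option: "NotSupported", "Supported", or "Required".
    parent_in_transaction: True if the parent container owns, or is
        enrolled in, a transaction.
    """
    if transaction_option == "Required":
        return False
    if transaction_option == "Supported" and parent_in_transaction:
        return False
    return True
```

When the predicate returns False, the container is treated as a single unit of work, so it and all of its children run again after a restart.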
Foreach Loop container: The Foreach Loop container is another atomic unit of work that can be restarted. However, the checkpoint file does not contain information about the work completed by the child containers, and the Foreach Loop container and its child containers run again when the package restarts.
Configuring a Package to Restart
The checkpoint file includes the execution results of all completed units of work (as described earlier in this topic), the current values of system and user-defined variables, and package configuration information. The file also includes the unique identifier of the package. To successfully restart a package, the package identifier in the checkpoint file must match the identifier of the package; otherwise the restart fails. This prevents a package from using a checkpoint file written by a different package version. If the package runs successfully after it is restarted, the checkpoint file is deleted.
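The identifier check can be pictured as a guard that runs before any restart. This Python sketch is a model, not the product's code: the function name `check_restart_allowed` is hypothetical, and comparing the identifiers as parsed GUIDs (so that case and surrounding braces do not matter) is an assumption of the sketch.

```python
import uuid


def check_restart_allowed(package_id, checkpoint_package_id):
    """Raise ValueError unless both identifiers name the same package.

    Parsing with uuid.UUID normalizes case and optional braces; that
    normalization is an assumption of this sketch.
    """
    if uuid.UUID(package_id) != uuid.UUID(checkpoint_package_id):
        raise ValueError(
            "checkpoint file was written by a different package")
```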
The following table lists the package properties that you set to implement checkpoints.
Property | Description |
---|---|
CheckpointFileName | Specifies the name of the checkpoint file. |
CheckpointUsage | Specifies whether checkpoints are used. |
SaveCheckpoints | Indicates whether the package saves checkpoints. This property must be set to True to restart a package from a point of failure. |
Additionally, you must set the FailPackageOnFailure property to True for all the containers in the package that you want to identify as restart points.
You can use the ForceExecutionResult property to test the use of checkpoints in a package. By setting ForceExecutionResult of a task or container to Failure, you can simulate a failure at run time. When you rerun the package, the failed tasks and containers run again.
Setting the CheckpointUsage Property
The following table lists the values for the CheckpointUsage property.
Value | Description |
---|---|
Never | Specifies that the checkpoint file is not used and that the package runs from the start of the package workflow. |
Always | Specifies that the checkpoint file is always used and that the package restarts from the point of the previous execution failure. If the checkpoint file is not found, the package fails. |
IfExists | Specifies that the checkpoint file is used if it exists. If the checkpoint file exists, the package restarts from the point of the previous execution failure; otherwise, it runs from the start of the package workflow. |
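The three values can be read as a small decision function over two inputs: the CheckpointUsage setting and whether the checkpoint file exists. The Python sketch below models only the table above; the function name `resolve_start` and the return strings are hypothetical.

```python
def resolve_start(checkpoint_usage, checkpoint_file_exists):
    """Return "start" or "restart"; raise if the package would fail."""
    if checkpoint_usage == "Never":
        return "start"  # the checkpoint file is ignored
    if checkpoint_usage == "IfExists":
        return "restart" if checkpoint_file_exists else "start"
    if checkpoint_usage == "Always":
        if not checkpoint_file_exists:
            # With Always, a missing checkpoint file fails the package.
            raise FileNotFoundError("checkpoint file not found")
        return "restart"
    raise ValueError(f"unknown CheckpointUsage value: {checkpoint_usage}")
```

One consequence worth noting: with Always, the first run of a package has no checkpoint file to read, which is why IfExists is the value that lets the same settings serve both first runs and restarts.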
Note
The /CheckPointing on option of dtexec is equivalent to setting the SaveCheckpoints property of the package to True, and the CheckpointUsage property to Always. For more information, see dtexec Utility.
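As an illustration, a script that launches packages might assemble the dtexec arguments as follows. This is a hedged sketch: the helper name `build_dtexec_args` and the example paths are hypothetical, and it assumes the package is stored in the file system (loaded with the /File option) and that the checkpoint file name is supplied with the /CheckFile option.

```python
def build_dtexec_args(package_path, checkpoint_path):
    """Build a dtexec argument list that turns checkpointing on.

    Both paths are supplied by the caller; nothing is executed here.
    """
    return [
        "dtexec",
        "/File", package_path,          # package stored in the file system
        "/CheckPointing", "on",         # enable checkpoint use for this run
        "/CheckFile", checkpoint_path,  # checkpoint file to read and write
    ]


# Example with hypothetical paths:
args = build_dtexec_args(r"C:\Packages\LoadDW.dtsx",
                         r"C:\Checkpoints\LoadDW.chk")
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where dtexec is installed and on the path.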
Choosing a Location for Checkpoint Files
In a failover cluster where you have Integration Services installed on multiple nodes on the cluster, you can save checkpoint files to a shared location. Then, if a failover occurs, you can restart a package that was interrupted from the last checkpoint on a different node in the cluster.
Securing Checkpoint Files
Package-level protection does not extend to checkpoint files, so you must secure these files separately. Checkpoint data can be stored only in the file system, and you should use an operating system access control list (ACL) to help secure the location or folder where you store the file. It is important to secure checkpoint files because they contain information about the package state, such as the current values of variables. For example, a variable might contain a recordset with many rows of private data such as telephone numbers. For more information, see Controlling Access to Files Used by Packages.
To configure the checkpoint properties