Federation Metadata in SQL Azure Part 3 – Monitoring Ongoing Federation Operations

In part 2 we talked about federation metadata history views. Federation history views only report operation that completed. For monitoring ongoing operations federations provide a separate set of dynamic management views under sys.dm_federation_operation*. All federation operations such as CREATE, ALTER or DROP consist of a set of steps that are executed async. With all async commands, a sync part of the command sets up and kicks off the operation first. Once the sync part is done, the control is returned to the executor of the TSQL. Then, SQL Azure in the background executes the async steps in the background. The initial sync part of these commands also set up the data for monitoring these async federation operations in the sys.dm_federation_operation* views. The views report metadata about the async operation such as the start date and time or the operation type that is running (ex: SPLIT or DROP etc) as well as the current progress of the operation.

Sys.dm_federation_operations represent the operations themselves whereas the sys.dm_federation_operation_members represents all members participating in the operation. Members are detailed as source or destination depending on the operation. In a SPLIT operation there are 2 destinations and 1 source for example.

image

For operations that are sync in nature, Sys.dm_exec_requests or sys.dm_exec_sessions provide great set of information. However with async operation such as federation operations these DMVs provide great information to let you explore the state after operations have been kicked off. Here are a few useful queries that can help you monitor your federations;

 -- see how long a repartitioning operation has been active select datediff(ss,start_date,getutcdate()) as total_seconds, percent_complete, * from sys.dm_federation_operations GO  -- display members with active federation repartitioning operations SELECT fmc.member_id,    cast(fmc.range_low as nvarchar) range_low,    cast(fmc.range_high as nvarchar) range_high,    fops.federation_operation_type FROM sys.federations f  JOIN sys.federation_member_distributions fmc  ON f.federation_id=fmc.federation_id  LEFT OUTER JOIN (  SELECT fo.federation_id, fom.member_id,      fo.federation_operation_type, fom.member_type  FROM sys.dm_federation_operation_members fom   JOIN sys.dm_federation_operations fo  ON fo.federation_operation_id = fom.federation_operation_id      AND fo.federation_operation_type='ALTER FEDERATION SPLIT'      AND fom.member_type='SOURCE') fops ON f.federation_id=fops.federation_id AND fmc.member_id=fops.member_id  ORDER BY f.name, fmc.range_low, fmc.range_high GO

Federation operations are built with resiliency in mind. They have built in retry logic and in cases of unexpected events in the system such as node failovers or excessive performance issues, SQL Azure continues to retry, resume or restart the operations. However in the case of an unlikely failure, there is the additional sys.dm_federation_operation_error* views. They exactly mirror the sys.dm_federation_operation* DMVs but provide additional detail on the operation errors. It is important to note that these views should be empty under normal conditions however they are provided for additional system troubleshooting for support and engineers.

image 

Enjoy.

-Cihan