Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Day 16 of my virtual advent calendar, about stuff I like in SQL Server 2008..
One of the main hurdles in refreshing a data warehouse is to work out what’s changed since the last refresh. If you’re lucky the source will have a last modified date and if you’re really lucky this is actually populated and working. There is a capability in SQL Server 2008 to help as well, Changed Data Capture (CDC), but be aware this is an enterprise edition only feature.
What it does is to create a separate set of tables in the same database (but different schema) as the source. It needs to be turned on first like this..
USE AdventureWorks
EXEC sys.sp_cdc_enable_db
BTW there is a new column in the sys.databases dmv called is_cdc_enabled so you can check if CDC is enabled with…
SELECT name, is_cdc_enabled FROM sys.databases
Now you can enable a table for CDC..
EXEC sys.sp_cdc_enable_table
@Source_Schema = N'HumanResources',
@Source_Name = N'Employee',
@Captured_Column_List =
N'EmployeeID,NationalIDNumber, ManagerID, Title, BirthDate, MaritalStatus, Gender, HireDate, SalariedFlag, VacationHours, SickLeaveHours, CurrentFlag',
@Supports_Net_Changes = 1,
@Role_Name = 'CDC_Admin'
Hopefully all but the last 2 arguments are pretty obvious. Supports Net changes is a bit flag which either shows you only the overall change between two periods as opposed all of the changes . The role name is a database role which runs the code and a new one is created if you specify a new name. Also you can specify other parameters here like @filegroup.
Note you don’t have to include the captured column list (it will track all columns by default, but if you do you must include the primary key or another argument @index_name to refer to a unique index on the table if @supports_net_changes is set to 1.
You can now check this has been enabled..
EXECUTE sys.sp_cdc_help_change_data_capture
@source_schema = N'HumanResources',
@source_name = N'Employee';
Behind the scenes a new system table has been created to track changes, [cdc].[HumanResources_Employee_CT] which has got all the columns we specified we wanted to track plus 5 extra ones ..
[__$start_lsn]
,[__$end_lsn]
,[__$seqval]
,[__$operation]
,[__$update_mask]
so lets make a simple change and see what happens in this table
UPDATE HumanResources.Employee
SET Title = 'DBA' where Title = 'Database Administrator'
and then query the change table like this..
SELECT TOP 1000 [__$start_lsn]
,[__$end_lsn]
,[__$seqval]
,[__$operation]
,[__$update_mask]
,[EmployeeID]
,[Title]
FROM [AdventureWorks].[cdc].[HumanResources_Employee_CT]
to see what was captured.
The LSN columns show the log sequence number of the commit for the change and the -$seqval orders the sequence of changes that can occur in the same transaction, so a possible lower level of detail. _$operation shows what happened to the row, 1 = delete, 2 = insert, 3 = update (before change), and 4 = update (after change).
To help you use this table there are functions to get the min and max LSN’s in the table.
You should also notice that you have got 2 new SQL Agent jobs to capture and cleanup the tables. Rather than modify these jobs directly there are also specific functions to manage them.
I have skimmed over this to give you an idea of the various parts of CDC, so you should check books on line for a fuller explanation here, and there is also a CDC TechNet virtual lab here, which you can try, before you have a go in your own environment.
Technorati Tags: sql server 2008,change data capture,cdc
Comments
Anonymous
December 16, 2008
PingBack from http://bestwebhostingservices.com/andrew-fryers-blog-sql-server-advent-calendar-16-%e2%80%93-change-data/Anonymous
December 17, 2008
Pour tous ceux qui ont gardé une âme d'enfant ! SQL Server Advent Calendar SQL Server Advent