The Oracle CDC Databases

Important

Change Data Capture for Oracle by Attunity is now deprecated. For details, refer to the announcement.

An Oracle CDC Instance is associated with a SQL Server database by the same name on the target SQL Server instance. This database is called the Oracle CDC database (or the CDC database).

The CDC database is created and configured using the Oracle CDC Designer Console and it contains the following elements:

  • A cdc schema created by enabling the database for SQL Server CDC.

  • A set of cdc.xdbcdc_xxxx tables used by the Oracle CDC Instance.

  • A set of empty mirror tables with the definitions of the captured tables in the source Oracle database.

  • A set of change tables and change access functions that are generated by the SQL Server CDC mechanism and are identical to those used in the regular, non-Oracle, SQL Server CDC.

The cdc schema is initially accessible only to members of the db_owner fixed database role. Access to the change tables and change functions is determined by the same security model as the SQL Server CDC. For more information about the security model, see Security Model.

Creating the CDC Database

In most cases, the CDC database is created using the CDC Designer Console, but it can also be created with a CDC deployment script that is generated using the CDC Designer Console. The SQL Server system administrator can change the database settings if necessary (for items such as storage, security, or availability).

For more information about using the CDC Designer Console to create the database tables and the necessary scripts, see Use the New Instance Wizard.

CDC Database User Roles

When a CDC database is created and enabled for CDC, a database user called cdc_service is created in the CDC database and is associated with the SQL Server login that the Oracle CDC Service was configured with. This user is made a member of the db_datareader, db_datawriter, and db_ddladmin database roles. If the SQL Server login is also associated with the dbo user, the cdc_service user is not created.

This role assignment allows the Oracle CDC Service to update the tables under the cdc schema with captured data and with control information.

When a CDC database is created and the CDC source Oracle tables are set up, the CDC database owner can grant SELECT permission on the mirror tables and define SQL Server CDC gating roles to control who accesses the change data.

Mirror Tables

For each captured table, <schema-name>.<table-name>, in the Oracle source database, a similar empty table is created in the CDC Database, with the same schema and table name. Oracle source tables with the schema name cdc (not case sensitive) cannot be captured because the cdc schema in SQL Server is reserved for the SQL Server CDC.

The mirror tables are empty; no data is stored in them. They are used to enable the standard SQL Server CDC infrastructure that is used by the Oracle CDC Instance. To prevent data from being inserted into or updated in the mirror tables, all UPDATE, DELETE, and INSERT operations are denied for PUBLIC, which ensures that the tables cannot be modified.

Access to Change Data

Under the SQL Server security model, a user who wants to access the change data that is associated with a capture instance must be granted SELECT access to all the captured columns of the associated mirror table. Access permissions on the original Oracle tables do not provide access to the change tables in SQL Server. For information on the SQL Server security model, see Security Model.

In addition, if a gating role is specified when the capture instance is created, the caller must also be a member of the specified gating role. Other general change data capture functions for accessing metadata are accessible to all database users through the PUBLIC role, although access to the returned metadata is usually gated by using select access to the underlying source tables, and by membership in any defined gating roles.
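The two conditions described above (SELECT access to every captured column, plus membership in the gating role when one is defined) can be modeled as a small check. This is only an illustrative sketch, not how SQL Server evaluates permissions; the function name and the set-based inputs are hypothetical, and privileged roles that may bypass these checks are deliberately ignored.

```python
from typing import Optional, Set


def can_read_change_data(
    granted_select_columns: Set[str],
    captured_columns: Set[str],
    user_roles: Set[str],
    gating_role: Optional[str] = None,
) -> bool:
    """Hypothetical model of the access rule for one capture instance."""
    # Rule 1: SELECT access is required on every captured column
    # of the associated mirror table.
    has_column_access = captured_columns <= granted_select_columns
    # Rule 2: if a gating role was specified when the capture instance
    # was created, the caller must also be a member of that role.
    passes_gate = gating_role is None or gating_role in user_roles
    return has_column_access and passes_gate
```

For example, a user with SELECT on only some of the captured columns is rejected in this model even if the user belongs to the gating role.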

Change data can be read by calling the table-valued functions that the SQL Server CDC component generates when a capture instance is created. For more information about these functions, see Change Data Capture Functions (Transact-SQL).

Accessing CDC data through the Integration Services CDC Source component is subject to the same rules.

The CDC Database Tables

This section describes the following tables in the CDC database.

Change Tables (_CT)

The change tables are created from the mirror tables. They contain the change data that is captured from the Oracle database. The tables are named according to the following convention:

[cdc].[<capture-instance>_CT]

When capture is initially enabled for table <schema-name>.<table-name>, the default capture instance name is <schema-name>_<table-name>. For example, the default capture instance name for the Oracle HR.EMPLOYEES table is HR_EMPLOYEES and the associated change table is [cdc].[HR_EMPLOYEES_CT].

The change tables are written to by the Oracle CDC Instance. They are read using special table-valued functions generated by SQL Server when the capture instance is created, for example, fn_cdc_get_all_changes_HR_EMPLOYEES. For more information about these CDC functions, see Change Data Capture Functions (Transact-SQL).
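The naming conventions in this section can be summarized in a few lines of Python. This is a minimal sketch with hypothetical helper names; it only reproduces the default names described here and the HR.EMPLOYEES example, and it rejects the cdc schema because Oracle tables in a schema named cdc cannot be captured.

```python
def default_capture_instance(schema_name: str, table_name: str) -> str:
    """Return the default capture instance name, <schema-name>_<table-name>."""
    # The cdc schema (not case sensitive) is reserved for SQL Server CDC.
    if schema_name.lower() == "cdc":
        raise ValueError("tables in the 'cdc' schema cannot be captured")
    return f"{schema_name}_{table_name}"


def change_table_name(capture_instance: str) -> str:
    """Return the change table name, [cdc].[<capture-instance>_CT]."""
    return f"[cdc].[{capture_instance}_CT]"


def all_changes_function_name(capture_instance: str) -> str:
    """Return the function name, following the fn_cdc_get_all_changes_HR_EMPLOYEES example."""
    return f"fn_cdc_get_all_changes_{capture_instance}"


instance = default_capture_instance("HR", "EMPLOYEES")
print(instance, change_table_name(instance), all_changes_function_name(instance))
```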

cdc.lsn_time_mapping

The [cdc].[lsn_time_mapping] table is generated by the SQL Server CDC component. Its use with Oracle CDC differs from its normal use.

For the Oracle CDC, the LSN values stored in this table are based on the Oracle System Change Number (SCN) value associated with the change. The first 6 bytes of the LSN value are the original Oracle SCN.

Also, when using the Oracle CDC, the time columns (tran_begin_time and tran_end_time) store the UTC time of the change, whereas the regular SQL Server CDC stores the local time. This ensures that daylight saving time changes do not affect the data stored in lsn_time_mapping.
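As an illustration of the SCN-based LSN layout described above, the following sketch extracts the SCN from an LSN value. It assumes the LSN is the usual 10-byte binary(10) value and that the 6-byte SCN prefix is stored in big-endian order; the byte order is an assumption, not something this section states, and the function name is hypothetical.

```python
def scn_from_lsn(lsn: bytes) -> int:
    """Return the Oracle SCN carried in the first 6 bytes of an LSN value."""
    if len(lsn) != 10:
        raise ValueError("expected a 10-byte LSN value")
    # Assumption: the SCN prefix is big-endian, so that byte-wise
    # LSN ordering matches SCN ordering.
    return int.from_bytes(lsn[:6], byteorder="big")


# Example LSN made up for illustration: 6 SCN bytes followed by 4 other bytes.
example_lsn = bytes.fromhex("00000012D687" + "00000001")
print(scn_from_lsn(example_lsn))  # 1234567
```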

cdc.xdbcdc_config

This table contains the configuration data for the Oracle CDC Instance. It is updated using the CDC Designer Console. This table has only one row.

The following table describes the cdc.xdbcdc_config table columns.

Item Description
version This keeps track of the version of the CDC instance configuration. It is updated each time that the table is updated and each time a new capture instance is added or an existing capture instance is removed.
connect_string An Oracle connection string. A basic example is:

<server>:<port>/<instance> (for example, erp.contoso.com:1521/orcl).

The connection string can also specify an Oracle Net connect descriptor, for example, (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp) (HOST=erp.contoso.com) (PORT=1521)) (CONNECT_DATA=(SERVICE_NAME=orcl))).

If using a directory server or tnsnames, the connect string can be the name of the connection.

For detailed information about Oracle database connection strings for the Oracle Instant Client that is used by the Oracle CDC Service, see https://go.microsoft.com/fwlink/?LinkId=231153.
use_windows_authentication A Boolean value that can be:

0: An Oracle user name and password are provided for authentication (the default)

1: Windows authentication is used to connect to the Oracle database. You can use this option only if the Oracle database is configured to work with Windows authentication.
username The name of the log-mining Oracle database user. This is mandatory only if use_windows_authentication = 0.
password The password for the log-mining Oracle database user. This is mandatory only if use_windows_authentication = 0.
transaction_staging_timeout The time, in seconds, that an uncommitted Oracle transaction is kept in memory before being written to the cdc.xdbcdc_staged_transactions table. The default is 120 seconds.
memory_limit The limit on the amount of memory, in MB, that can be used for caching data in memory. A lower setting causes more transactions to be written to the cdc.xdbcdc_staged_transactions table. The default is 50 MB.
options A list of options in the form name[=value][; ]. It is used for specifying secondary options (for example, tracing or tuning). See the table below for a description of the available options.
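The name[=value][; ] form of the options column can be read with a short parser. The sketch below is one plausible reading of that format, not the parser used by the Oracle CDC Service; the function name is hypothetical, and representing an option given without a value as an empty string is an assumption.

```python
from typing import Dict


def parse_options(options: str) -> Dict[str, str]:
    """Parse a string of the form name[=value][; ] into a dictionary."""
    parsed: Dict[str, str] = {}
    for item in options.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';' or empty segments
        name, separator, value = item.partition("=")
        # Assumption: an option without '=value' maps to an empty string.
        parsed[name.strip()] = value.strip() if separator else ""
    return parsed


print(parse_options("trace=on; source_prefetch_size=500;"))
```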

The following table describes the available options.

Name Default Min Max Static Description
trace False - - False The available values are True, False, on, and off.
cdc_update_state_interval 10 1 120 False The size (in Kbytes) of memory chunks allocated for a transaction (a transaction can allocate more than one chunk). See the memory_limit column in cdc.xdbcdc_config table.
target_max_batched_transactions 100 1 1000 True The maximum number of Oracle transactions that can be processed as one transaction in SQL Server CT tables update.
target_idle_lsn_update_interval 10 0 1 False The interval (in seconds) for updating the lsn_time_mapping table when the captured tables have no activity.
trace_retention_period 24 1 24*31 False The amount of time (in hours) to keep messages in the trace table.
sql_reconnect_interval 2 2 3600 False The amount of time (in seconds) to wait before reconnecting to SQL Server. This interval is used in addition to SQL Server client's connect timeout.
sql_reconnect_limit -1 -1 -1 False The maximum number of SQL Server reconnections. The default -1 means that the process tries to reconnect until it stops.
cdc_restart_limit 6 -1 3600 False In most cases, the CDC service automatically restarts a CDC instance that ended abnormally. This property defines the number of failures per hour after which the service stops restarting the instance. The value -1 means that the instance is always restarted. The service resumes restarting the instance after any update to the configuration table.
cdc_memory_report 0 0 1000 False When the value of this parameter changes, the CDC Instance writes its memory report to the trace table.
target_command_timeout 600 1 3600 False The command timeout when working with SQL Server.
source_character_set - - - True Can be set to a specific Oracle encoding to be used instead of the Oracle database codepage. This can be useful when the actual encoding of the character data differs from the one indicated by the Oracle database codepage.
source_error_retry_interval 30 1 3600 False The interval to wait before retrying after errors such as a connection error or a temporary lack of synchronization between system tables.
source_prefetch_size 100 1 10000 True Size of the prefetch batch.
source_max_tables_in_query 100 1 10000 True Maximum number of tables in WHERE clause before switching to reading the Oracle log without table filtering.
source_read_retry_interval 2 1 3600 False The amount of time the source waits before trying to read the Oracle transaction logs on EOF again.
source_reconnect_interval 30 1 3600 False How long (in seconds) to wait before trying to re-connect to the source database.
source_reconnect_limit -1 -1 False The maximum number of source database reconnections. The default -1 means that the process tries to reconnect until it is stopped.
source_command_timeout 30 1 3600 False The command timeout when working with Oracle.
source_connection_timeout 30 1 3600 False The connection timeout when working with Oracle.
trace_data_errors True - - False Boolean. True indicates to log data conversion and truncation errors.
CDC_stop_on_breaking_schema_changes False - - False Boolean. True means stop when a breaking schema change is detected. False means drop the mirror table and capture instance.
source_oracle_home - - False Can be set to a specific Oracle Home path or an Oracle Home Name that the CDC instance will use to connect to Oracle.
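The Default, Min, and Max columns lend themselves to a simple range check before a value is written to the options column. The sketch below copies a few rows from the table above; the helper name is hypothetical, and whether the Oracle CDC Service itself rejects out-of-range values in this way is an assumption.

```python
from typing import Dict, Optional, Tuple

# (default, min, max) copied from a few rows of the options table above.
OPTION_RANGES: Dict[str, Tuple[int, int, int]] = {
    "target_max_batched_transactions": (100, 1, 1000),
    "sql_reconnect_interval": (2, 2, 3600),
    "source_prefetch_size": (100, 1, 10000),
    "source_reconnect_interval": (30, 1, 3600),
}


def effective_value(name: str, raw_value: Optional[str] = None) -> int:
    """Return the default if no value is given, else the validated integer."""
    default, low, high = OPTION_RANGES[name]
    if raw_value is None:
        return default
    value = int(raw_value)
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside the range {low}..{high}")
    return value


print(effective_value("source_prefetch_size"))          # default from the table
print(effective_value("sql_reconnect_interval", "10"))  # explicit value in range
```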

cdc.xdbcdc_state

This table contains information about the persisted state of the Oracle CDC Instance. The capture state is used in recovery and fail-over scenarios and for health monitoring.

The following table describes the cdc.xdbcdc_state table columns.

Item Description
status The current status code for the current Oracle CDC Instance. The status describes the current state for the CDC.
sub_status A second level status that provides additional information about the current status.
active A Boolean value that can be:

0: The Oracle CDC Instance process is not active.

1: The Oracle CDC Instance process is active.
error A Boolean value that can be:

0: The Oracle CDC Instance process is not in an error state.

1: The Oracle CDC Instance is in an error state.
status_message A string that provides a description of the error or status.
timestamp The timestamp with the time (UTC) that the capture state was last updated.
active_capture_node The name of the host (the host can be a node on a cluster) that is currently running the Oracle CDC Service and the Oracle CDC Instance (which is processing the Oracle transaction logs).
last_transaction_timestamp A timestamp with the time (UTC) when the last transaction was written to the change tables.
last_change_timestamp A timestamp with the time (UTC) when the most recent change record was read from the source Oracle transaction log. This timestamp helps to identify the current latency of the CDC process.
transaction_log_head_cn The most recent change number (CN) read from the Oracle transaction log.
transaction_log_tail_cn The change number (CN) on the Oracle transaction log where the Oracle CDC Instance repositions to in case of a restart or recovery.
current_cn The most recent change number (CN) known to be in the source database.
software_version The internal version of the Oracle CDC Service.
completed_transactions The number of transactions processed since the CDC was last reset.
written_changes The number of change records written to the SQL Server change tables.
read_changes The number of change records read from the source Oracle transaction log.
staged_transactions The number of currently active transactions that are staged in the cdc.xdbcdc_staged_transactions table.
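For health monitoring, the columns above can be combined into a short summary. The sketch below works on a dictionary that stands in for the single row of cdc.xdbcdc_state; how the row is fetched is out of scope, the function name is hypothetical, and treating "active and not in error" as healthy is an assumption made for illustration. Timestamps are assumed to be timezone-aware UTC values.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def summarize_state(row: Dict[str, Any], now: Optional[datetime] = None) -> Dict[str, Any]:
    """Summarize a cdc.xdbcdc_state row given as a dictionary."""
    now = now or datetime.now(timezone.utc)
    # last_change_timestamp is when the most recent change record was read
    # from the Oracle transaction log, so its age hints at current latency.
    seconds_since_read = (now - row["last_change_timestamp"]).total_seconds()
    return {
        "healthy": bool(row["active"]) and not bool(row["error"]),
        "seconds_since_last_change_read": seconds_since_read,
        "status_message": row["status_message"],
    }


example_row = {
    "active": 1,
    "error": 0,
    "status_message": "example message",
    "last_change_timestamp": datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
}
print(summarize_state(example_row, now=datetime(2024, 1, 1, 12, 0, 30, tzinfo=timezone.utc)))
```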

cdc.xdbcdc_trace

This table contains information about the operation of the CDC instance. Information stored in this table includes error records, notable status changes, and trace records. Error information is also written to the Windows event log to ensure that the information is available if the cdc.xdbcdc_trace table is unavailable.

The following table describes the cdc.xdbcdc_trace table columns.

Item Description
timestamp The exact UTC timestamp when the trace record was written.
type Contains one of the following values: ERROR, INFO, or TRACE.
node The name of the node on which the record was written.
status The status code that is used by the state table.
sub_status The sub-status code that is used by the state table.
status_message The status message that is used by the state table.
data Additional data for cases when the error or trace record contains a payload (for example, a corrupted log record).

cdc.xdbcdc_staged_transactions

This table stores change records for large or long-running transactions until the transaction commit or rollback event is captured. The Oracle CDC Service orders captured log records by transaction commit time and then by chronological order for each transaction. Log records for the same transaction are stored in memory until the transaction ends and then are written to the target change table or discarded (in case of a rollback). Because there is a limited amount of memory available, large transactions are written into the cdc.xdbcdc_staged_transactions table until the transaction is complete. Transactions are also written to the staging table when they run for a long time. Therefore, when the Oracle CDC Instance is restarted, the old changes do not need to be re-read from the Oracle transaction logs.
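The buffering behavior described above can be pictured with a toy model. This is a simplified sketch and not the Oracle CDC Service's implementation: the class and method names are hypothetical, memory is counted in change records rather than MB, and the choice to stage the largest buffered transaction first when the limit is exceeded is an assumption.

```python
from typing import Dict, List


class StagingModel:
    """Toy model of in-memory buffering with spill to a staging store."""

    def __init__(self, memory_limit: int, staging_timeout: float) -> None:
        self.memory_limit = memory_limit        # max buffered records (stand-in for MB)
        self.staging_timeout = staging_timeout  # seconds before an open transaction is staged
        self.in_memory: Dict[str, List[str]] = {}
        self.staged: Dict[str, List[str]] = {}  # stand-in for cdc.xdbcdc_staged_transactions
        self.started: Dict[str, float] = {}

    def _spill(self, tx_id: str) -> None:
        # Move the buffered changes of one transaction to the staging store.
        self.staged.setdefault(tx_id, []).extend(self.in_memory.pop(tx_id, []))

    def add_change(self, tx_id: str, change: str, now: float) -> None:
        self.started.setdefault(tx_id, now)
        self.in_memory.setdefault(tx_id, []).append(change)
        # Stage transactions that have been open longer than the timeout.
        for open_tx in list(self.in_memory):
            if now - self.started[open_tx] >= self.staging_timeout:
                self._spill(open_tx)
        # While over the limit, stage the largest buffered transaction (assumption).
        while sum(len(c) for c in self.in_memory.values()) > self.memory_limit:
            largest = max(self.in_memory, key=lambda t: len(self.in_memory[t]))
            self._spill(largest)

    def commit(self, tx_id: str) -> List[str]:
        # On commit, staged and in-memory changes are emitted in their original order.
        changes = self.staged.pop(tx_id, []) + self.in_memory.pop(tx_id, [])
        self.started.pop(tx_id, None)
        return changes

    def rollback(self, tx_id: str) -> None:
        # On rollback, everything captured for the transaction is discarded.
        self.staged.pop(tx_id, None)
        self.in_memory.pop(tx_id, None)
        self.started.pop(tx_id, None)
```

In this model, a transaction that exceeds the memory limit or the timeout keeps accumulating in the staging store, and a commit returns its staged changes followed by any changes still in memory.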

The following table describes the cdc.xdbcdc_staged_transactions table columns.

Item Description
transaction_id The unique transaction identifier of the transaction being staged.
seq_num The sequence number of the cdc.xdbcdc_staged_transactions row within the current transaction (starting with 0).
data_start_cn The change number (CN) for the first change in the data in this row.
data_end_cn The change number (CN) for the last change in the data in this row.
data The staged changes for the transaction in the form of a BLOB.

See Also

Change Data Capture Designer for Oracle by Attunity