The Oracle CDC Databases
Important
Change Data Capture for Oracle by Attunity is deprecated now. For details, refer to the announcement.
An Oracle CDC Instance is associated with a SQL Server database by the same name on the target SQL Server instance. This database is called the Oracle CDC database (or the CDC database).
The CDC database is created and configured using the Oracle CDC Designer Console and it contains the following elements:
A
cdc
schema created by enabling the database for SQL Server CDC.A set of cdc.xdbcdc_xxxx tables used by the Oracle CDC Instance.
A set of empty mirror tables with the definitions of the captured tables in tuphe Source Oracle database.
A set of change tables and change access functions that are generated by the SQL Server CDC mechanism and are identical to those used in the regular, non-Oracle, SQL Server CDC.
The cdc
schema is initially accessible only to the members of the dbowner fixed database role. Access to the change tables and change functions is determined by the same security model as the SQL Server CDC. For more information about the security model, see Security Model.
Creating the CDC Database
In most cases, the CDC database is created using the CDC Designer Console, but it can also be created with a CDC deployment script that is generated using the CDC Designer Console. The SQL Server system administrator can change the database settings if necessary (for items such as for storage, security, or availability).
For more information about using the CDC Designer Console to create the database tables and the necessary scripts, see Use the New Instance Wizard.
CDC Database User Roles
When a CDC Database is created and enabled for CDC, a database user called cdc_service is created in the CDC database and is associated with the SQL Server login that the Oracle CDC Service was configured with. This user is made a member of the db_datareader, db_datawriter, and db_ddladmin database roles. If the SQL Server login is also the associated with the dbo
user then the cdc_service is not created.
This role assignment allows the Oracle CDC Service to update the tables under the cdc
schema with captured data and with control information.
When a CDC database is created and CDC source Oracle tables are set up, the CDC database owner can grant SELECT permission of mirror tables and define SQL Server CDC gating roles to control who accesses the change data.
Mirror Tables
For each captured table, <schema-name>.<table-name>, in the Oracle source database, a similar empty table is created in the CDC Database, with the same schema and table name. Oracle source tables with the schema name cdc
(not case sensitive) cannot be captured because the cdc
schema in SQL Server is reserved for the SQL Server CDC.
The mirror tables are empty; no data is stored in them. They are used to enable the standard SQL Server CDC infrastructure that is used by the Oracle CDC Instance. To prevent data from being inserted or updated into the mirror tables, all UPDATE, DELETE, and INSERT operations are denied for PUBLIC. This ensures that they cannot be modified.
Access to Change Data
Because of the SQL Server security model used to gain access to the change data that is associated with a capture instance, the user must be granted select
access to all the captured columns of the associated mirror table (access permissions to the original Oracle tables do not provide access to the change tables in SQL Server). For information on the SQL Server security model, see Security Model.
In addition, if a gating role is specified when the capture instance is created, the caller must also be a member of the specified gating role. Other general change data capture functions for accessing metadata are accessible to all database users through the PUBLIC role, although access to the returned metadata is usually gated by using select access to the underlying source tables, and by membership in any defined gating roles.
Change data may be read by calling special table-based functions generated by the SQL Server CDC component when a capture instance is created. For more information about this function, see Change Data Capture Functions (Transact-SQL).
Accessing CDC data through the Integration Services CDC Source component is subject to the same rules.
The CDC Database Tables
This section describes the following tables in the CDC database.
Change Tables (_CT)
The change tables are created from the mirror tables. They contain the change data that is captured from the Oracle database. The tables are named according to the following convention:
[cdc].[<capture-instance>_CT]
When capture is initially enabled for table <schema-name>.<table-name>
, the default capture instance name is <schema-name>_<table-name>
. For example, the default capture instance name for the Oracle HR.EMPLOYEES table is HR_EMPLOYEES and the associated change table is [cdc]. [HR_EMPLOYEES_CT].
The capture tables are written to by the Oracle CDC Instance. They are read using special table-valued functions generated by SQL Server when the capture instance is created. For example, fn_cdc_get_all_changes_HR_EMPLOYEES
. For more information about these CDC functions see Change Data Capture Functions (Transact-SQL).
cdc.lsn_time_mapping
The [cdc].[lsn_time_mapping] table is generated by the SQL Server CDC component. Its use in the case of Oracle CDC is different than its normal use.
For the Oracle CDC, the LSN values stored in this table are based on the Oracle System Change Number (SCN) value associated with the change. The first 6 bytes of the LSN value is the original Oracle SCN number.
Also when using the Oracle CDC, the time columns (tran_begin_time
and tran_end_time
) store the UTC time of the change rather than the local time as it does with the regular SQL Server CDC. This ensures that daylight savings time changes do not impact the data stored in the lsn_time_mapping.
cdc.xdbcdc_config
This table contains the configuration data for the Oracle CDC Instance. It is updated using the CDC Designer Console. This table has only one row.
The following table describes the cdc.xdbcdc_config table columns.
Item | Description |
---|---|
version | This keeps track of the version of the CDC instance configuration. It is updated each time that the table is updated and each time a new capture instance is added or an existing capture instance is removed. |
connect_string | An Oracle connection string. A basic example is:<server>:<port>/<instance> (for example, erp.contoso.com:1521/orcl ).The connection string can also specify an Oracle Net connect descriptor, for example, (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp) (HOST=erp.contoso.com) (PORT=1521)) (CONNECT_DATA=(SERVICE_NAME=orcl))) .If using a directory server or tnsnames, the connect string can be the name of the connection. For more information about Oracle connection strings, see https://go.microsoft.com/fwlink/?LinkId=231153 for detailed information on Oracle database connection strings for the Oracle Instant Client that is used by the Oracle CDC Service. |
use_windows_authentication | A Boolean value that can be: 0: An Oracle user name and password are provided for authentication (the default) 1: Windows authentication is used to connect to the Oracle database. You can use this option only if the Oracle database is configured to work with Windows authentication. |
username | The name of the log-mining Oracle database user. This is mandatory only if use_windows_authentication = 0. |
password | The password for the log-mining Oracle database user. This is mandatory only if use_windows_authentication = 0. |
transaction_staging_timeout | The time, in seconds, that an uncommitted Oracle transaction is kept in memory before being written to the cdc.xdbcdc_staged_transactions table. The default is 120 seconds. |
memory_limit | The limit on the amount of memory, in Mb, that can be used for caching data in memory. A lower setting causes more transaction to be written to the cdc.xdbcdc_staged_transactions table. The default is 50 Mb. |
options | A list of options in the form of name[=value][; ] - it is used for specifying secondary options (for example, tracing, tuning). See the table below for a description of the available options. |
The following table describes the available options.
Name | Default | Min | Max | Static | Description |
---|---|---|---|---|---|
trace | False | - | - | False | The available values are: True False on off |
cdc_update_state_interval | 10 | 1 | 120 | False | The size (in Kbytes) of memory chunks allocated for a transaction (a transaction can allocate more than one chunk). See the memory_limit column in cdc.xdbcdc_config table. |
target_max_batched_transactions | 100 | 1 | 1000 | True | The maximum number of Oracle transactions that can be processed as one transaction in SQL Server CT tables update. |
target_idle_lsn_update_interval | 10 | 0 | 1 | False | The interval (in seconds) for updating the lsn_time_mapping table when the captured tables have no activity. |
trace_retention_period | 24 | 1 | 24*31 | False | The amount of time (in hours to keep messages in the trace table). |
sql_reconnect_interval | 2 | 2 | 3600 | False | The amount of time (in seconds) to wait before reconnecting to SQL Server. This interval is used in addition to SQL Server client's connect timeout. |
sql_reconnect_limit | -1 | -1 | -1 | False | The maximum number of SQL Server reconnections. The default -1 means that the process tries to reconnect until it stops. |
cdc_restart_limit | 6 | -1 | 3600 | False | In most cases, the CDC service restarts an abnormally ended CDC instance automatically. This property defines after how many failures per hour the service stops to restart the instance. The value -1 means that the instance should be always restarted. The Service returns to restart the instance after any update of the configuration table. |
cdc_memory_report | 0 | 0 | 1000 | False | If the value of the parameter was changed, the CDC Instance prints its memory report on the trace table. |
target_command_timeout | 600 | 1 | 3600 | False | Command timeout working with SQL Server. |
source_character_set | - | - | - | True | Can be set to a specific Oracle encoding to be used instead of the Oracle database codepage. This may be of use when the actual encoding the character data is using is different than the one expressed by the Oracle database codepage. |
source_error_retry_interval | 30 | 1 | 3600 | False | Used before retry on several errors such as a connection error or temporary lack of synchronization between system tables. |
source_prefetch_size | 100 | 1 | 10000 | True | Size of the prefetch batch. |
source_max_tables_in_query | 100 | 1 | 10000 | True | Maximum number of tables in WHERE clause before switching to reading the Oracle log without table filtering. |
source_read_retry_interval | 2 | 1 | 3600 | False | The amount of time the source waits before trying to read the Oracle transaction logs on EOF again. |
source_reconnect_interval | 30 | 1 | 3600 | False | How long (in seconds) to wait before trying to re-connect to the source database. |
source_reconnect_limit | -1 | -1 | False | The maximum number of the source database reconnections. The default -1 means that the process tries to reconnect until it is stopped. | |
source_command_timeout | 30 | 1 | 3600 | False | Connection timeout working with Oracle. |
source_connection_timeout | 30 | 1 | 3600 | False | Connection timeout working with SQL Server. |
trace_data_errors | True | - | - | False | Boolean. True indicates to log data conversion and truncation errors. |
CDC_stop_on_breaking_schema_changes | False | - | - | False | Boolean. True indicates to stop when breaking schema change is detected. False indicates to drop the mirror table and capture instance. |
source_oracle_home | - | - | False | Can be set to a specific Oracle Home path or an Oracle Home Name that the CDC instance will use to connect to Oracle. |
cdc.xdbcdc_state
This table contains information about the persisted state of the Oracle CDC Instance. The capture state is used in recovery and fail-over scenarios and for health monitoring.
The following table describes the cdc.xdbcdc_state table columns.
Item | Description |
---|---|
status | The current status code for the current Oracle CDC Instance. The status describes the current state for the CDC. |
sub_status | A second level status that provides additional information about the current status. |
active | A Boolean value that can be: 0: The Oracle CDC Instance process is not active. 1: The Oracle CDC Instance process is active. |
error | A Boolean value that can be: 0: The Oracle CDC Instance process is not in an error state. 1: The Oracle CDC Instance is in an error state. |
status_message | A string that provides a description of the error or status. |
timestamp | The timestamp with the time (UTC) that the capture state was last updated. |
active_capture_node | The name of the host (the host can be a node on a cluster) that is currently running the Oracle CDC Service and the Oracle CDC Instance (which is processing the Oracle transaction logs). |
last_transaction_timestamp | A timestamp with the time (UTC) when the last transaction that was written to the change tables. |
last_change_timestamp | A timestamp with the time (UTC) when the most recent change record was read from the source Oracle transaction log. This timestamp helps to identify the current latency of the CDC process. |
transaction_log_head_cn | The most recent change number (CN) read from the Oracle transaction log. |
transaction_log_tail_cn | The change number (CN) on the Oracle transaction log where the Oracle CDC Instance repositions to in case of a restart or recovery. |
current_cn | The most recent change number (CN) known to be in the source database. |
software_version | The internal version of the Oracle CDC Service. |
completed_transactions | The number of transactions processed since the CDC was last reset. |
written_changes | The number of change records written to the SQL Server change tables. |
read_changes | The number of change records read from the source Oracle transaction log. |
staged_transactions | The number of currently active transactions that are staged in the cdc.xdbcdc_staged_transactions table. |
cdc.xdbcdc_trace
This table contains information about the operation of the CDC instance. Information stored in this table includes error records, notable status changes, and trace records. Error information is also written to the Windows event log to ensure that the information is available if the cdc.xcbcdc_trace table is unavailable.
The following table describes the cdc.xdbcdc_trace table columns.
Item | Description |
---|---|
timestamp | The exact UTC timestamp when the trace record was written. |
type | Contains one of the following values. ERROR INFO TRACE |
node | The name of the node on which the record was written. |
status | The status code that is used by the state table. |
sub_status | The sub-status code that is used by the state table. |
status_message | The status message that is used by the state table. |
data | Additional data for cases when the error or trace record contains a payload (for example, a corrupted log record). |
cdc.xdbcdc_staged_transactions
This table stores change records for large or long-running transactions until the transaction commit or rollback event is captured. The Oracle CDC Service orders captured log records by transaction commit time and then by chronological order for each transaction. Log records for the same transaction are stored in memory until the transaction ends and then are written to the target change table or discarded (in case of a rollback). Because there is a limited amount of memory available, large transactions are written into the cdc.xdbcdc_staged_transactions table until the transaction is complete. Transactions are also written to the staging table when they run for a long time. Therefore, when the Oracle CDC Instance is restarted, the old changes do not need to be re-read from the Oracle transaction logs.
The following table describes the cdc.xdbcdc_staged_transactions table columns.
Item | Description |
---|---|
transaction_id | The unique transaction identifier of the transaction being staged. |
seq_num | The number of xcbcdc_staged_transactions row for the current transaction (starting with 0). |
data_start_cn | The change number (CN) for the first change in the data in this row. |
data_end_cn | The change number (CN) for the last change in the data in this row. |
data | The staged changes for the transaction in the form of a BLOB. |