Implementing CDC process between a Microsoft SQL Azure Database and Google Cloud Platform

Jonathan Mackey 20 Reputation points
2024-09-05T10:58:51.61+00:00

I'm trying to implement CDC between a Microsoft SQL Azure Database and Google Cloud Platform. Essentially, we host some of our software in Azure, but the main group's data storage is in GCP.

My issue is that we use HADR geo-replication currently between our primary SQL Azure DB and a secondary SQL Azure DB. We don't want to remove this, but it's causing an issue. Part of the Documentation is asking to turn on Snapshot isolation. But this is impossible as the DB is involved in a mirroring session or availability group.

Is there anything I can do to force this, or can I reasonably go ahead with the CDC process without having the snapshot in place?

Azure SQL Database
{count} votes

Accepted answer
  1. Vinodh247 18,271 Reputation points
    2024-09-05T13:51:37.89+00:00

    Hi Jonathan Mackey,

    Thanks for reaching out to Microsoft Q&A.

    You are right that enabling snapshot Isolation is required for CDC to work, but the fact that your db is part of a geo replication setup is preventing you from enabling it. Unfortunately, it's not possible to force snapshot Isolation when the db is involved in a georeplication or availability group.

    listing out few potential workarounds/alternatives:

    1. Use a Readonly Secondary Replica: While Snapshot Isolation can't be enabled on the primary database in a geo-replicated setup, you could try to use the secondary replica (read-only) for CDC if the business use case allows. This would prevent changes to the main database schema and leave HADR untouched. However, you'd have to account for the fact that the data might be slightly behind real-time due to replication delays.
    2. Offload to a Staging Database: You could consider replicating the data from the primary SQL AzureDB to a separate, standalone AzureSQLdb or even an ADF pipeline. On this secondary database, you'd have the flexibility to enable Snapshot Isolation and implement CDC without disrupting the geo-replication.
    3. Using a Log Based CDC: Consider using a log-based CDC solution that operates independently of snapshot isolation. Log-based CDC captures changes directly from the transaction log, which can be more efficient and less intrusive than traditional methods. Solutions like Striim can facilitate this by continuously collecting changes and delivering them to your GCP environment without requiring a snapshot isolation.
    4. Alternative CDC Methods: If you are open to 3rd party tools, there are solutions available that can help with CDC without requiring changes to your existing Azure SQL db configuration. For instance, tools that integrate with both Azure and GCP can help manage data flows without needing to alter your current setup significantly.
    5. Use ADF or Change Tracking: If CDC doesn't fit your setup, ADF pipeline could be used to extract data from the source database in near real-time and move it to Google Cloud.

    Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.

    1 person found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.