Data Latency between primary database and named replicas in Hyperscale database

Simon Goldshmid 20 Reputation points
2023-06-01T17:16:27.26+00:00

Before I ask a question I want to describe our use case.

  1. Multiple clients are inserting/updating the data in the primary hyperscale database.
  2. After data is inserted/modified those clients immediately sending service bus messages to several topics.
  3. Several function apps are subscribed to those topics and start a heavy-lifting analysis of the inserted/updated data.
  4. Right now those function apps are working from the primary database.
  5. We want to move the read-only portion of analysis from main database to a named replica to decrease the load on the main database.

Since there is an unpredictable latency between the data placed in main database and replicated into named replica, there is a big probability that the function apps will start the analysis before the data is replicated to the read-only replica. This is unacceptable.

Now is the question:

Is it possible from the function app to know when the data is fully replicated?

We can't tolerate latencies greater than 1.5 seconds.

Thank you,

Azure SQL Database
0 comments No comments
{count} votes

Accepted answer
  1. Alberto Morillo 32,886 Reputation points MVP
    2023-06-01T20:05:03.4233333+00:00

    Typical data latency for small transactions is in tens of milliseconds, however there is no upper bound on data latency. Let me share this paragraph of the documentation with you:

    Data latency from the time a transaction is committed on the primary to the time it is readable on a secondary depends on current log generation rate, transaction size, load on the replica, and other factors. Typical data latency for small transactions is in tens of milliseconds, however there is no upper bound on data latency. Data on a given secondary replica is always transactionally consistent, thus larger transactions take longer to propagate. However, at a given point in time data latency and database state may be different for different secondary replicas. Workloads that need to read committed data immediately should run on the primary replica.

    Source is here.

    1 person found this answer helpful.

2 additional answers

Sort by: Most helpful
  1. Oury Ba-MSFT 16,076 Reputation points Microsoft Employee
    2023-06-01T19:55:32.02+00:00

    Simon Goldshmid

    Thank you for reaching out.

    Is it possible from the function app to know when the data is fully replicated?

    To ensure that your function apps only start the analysis when the data is fully replicated to the read-only replica, you can use the sys.dm_hadr_database_replica_states dynamic management view to monitor the replication status of the database. This view provides information about the synchronization state of the database on each replica.

    Keep in mind that the replication latency can vary depending on the size of the database and the network conditions. You may need to adjust the frequency of your monitoring to ensure that the analysis is not started before the data is fully replicated.

    How much delay is there going to be between the primary and secondary compute replicas?

    Data latency from the time a transaction is committed on the primary to the time it is readable on a secondary depends on current log generation rate, transaction size, load on the replica, and other factors. Typical data latency for small transactions is in tens of milliseconds, however there is no upper bound on data latency. Data on a given secondary replica is always transactionally consistent, thus larger transactions take longer to propagate. However, at a given point in time data latency and database state may be different for different secondary replicas. Workloads that need to read committed data immediately should run on the primary replica.

    Regards,

    Oury

    1 person found this answer helpful.

  2. Simon Goldshmid 20 Reputation points
    2023-06-02T11:16:53+00:00

    When I tried to execute "SELECT * FROM sys.dm_hadr_database_replica_states" against our Hyperscale SQL Server 12.0.2000.8 database I received "Invalid object name 'sys.dm_hadr_database_replica_states" error.

    I tried to execute it against our work database and against master database and I got the same error. Do I need to have a special version of the database to have this view?