Azure SQL Database Hyperscale — pre-commit self-check without staging (ADLS raw → Main)

Anonymous
2025-09-01T13:06:41.09+00:00

Our project lands per-table CDC rows in ADLS raw. We have to load directly into Hyperscale main using Databricks with the Microsoft JDBC driver. Healthcare data; correctness is the top priority. Typical rate ~3k events/sec with occasional peaks.

Scenario: For each small, deterministic slice (by key or commit timestamp), one writer session does: BEGIN TRAN → apply I/U/D to the main table → capture affected rows with T-SQL OUTPUT into a session-local temp table → compute count and a row fingerprint on that temp data → if it matches the precomputed values from ADLS, COMMIT; otherwise ROLLBACK. No permanent staging table is used.
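A minimal sketch of the "count plus row fingerprint" step, for illustration only. It assumes the writer reads the OUTPUT rows back from the temp table and fingerprints them client-side with the same function used on the ADLS slice. The names `slice_fingerprint`, `row_digest`, and `should_commit` are hypothetical, and the canonical encoding is one possible choice, not something the thread specifies. Both sides must normalize values (decimals, timestamps, trailing spaces) identically, or matching rows will produce different digests.

```python
import hashlib
from typing import Any, Iterable, Sequence, Tuple

MODULUS = 1 << 256  # per-row digests are summed modulo 2**256


def _encode_value(value: Any) -> bytes:
    """Encode one column value unambiguously: type tag, length prefix, payload."""
    if value is None:
        tag, payload = b"N", b""
    elif isinstance(value, bool):
        tag, payload = b"B", (b"1" if value else b"0")
    elif isinstance(value, int):
        tag, payload = b"I", str(value).encode("ascii")
    elif isinstance(value, bytes):
        tag, payload = b"X", value
    else:
        # Assumption: everything else is compared by its string form.
        tag, payload = b"S", str(value).encode("utf-8")
    return tag + len(payload).to_bytes(4, "big") + payload


def row_digest(row: Sequence[Any]) -> int:
    """SHA-256 over the encoded columns of one row, returned as an integer."""
    h = hashlib.sha256()
    for value in row:
        h.update(_encode_value(value))
    return int.from_bytes(h.digest(), "big")


def slice_fingerprint(rows: Iterable[Sequence[Any]]) -> Tuple[int, str]:
    """Return (row count, order-independent fingerprint) for a slice."""
    count = 0
    total = 0
    for row in rows:
        total = (total + row_digest(row)) % MODULUS
        count += 1
    return count, format(total, "064x")


def should_commit(expected: Tuple[int, str], actual: Tuple[int, str]) -> bool:
    """COMMIT only when both the count and the fingerprint match."""
    return expected == actual
```

Summing the per-row digests makes the result independent of the order in which OUTPUT returns rows, and, unlike XOR, a row that appears twice does not cancel itself out.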

Question: Is this “pre-commit self-check inside one transaction” pattern fully supported and appropriate for Hyperscale when we keep each transaction short? Which isolation level do you recommend for the writer session (default READ COMMITTED vs READ_COMMITTED_SNAPSHOT), and are there any engine caveats we have to consider for this approach?

Azure SQL Database

1 answer

  1. Smaran Thoomu 35,375 Reputation points Microsoft External Staff Moderator
    2025-09-01T14:42:00.08+00:00

    Hi @Anonymous
    Yes, the pattern you describe - using a single transaction with OUTPUT into a temp table, validating counts/fingerprints, then COMMIT or ROLLBACK - is fully supported on Hyperscale. The engine maintains ACID guarantees regardless of transaction length; keeping the batches short limits how long locks are held and how much log each transaction generates, and the rollback path is safe.
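The control flow can be sketched roughly as follows. This is an illustrative outline, not a definitive implementation: it assumes a Python DB-API 2.0 connection with autocommit turned off (the thread itself uses the Microsoft JDBC driver, where the same commit/rollback calls exist), and `apply_slice_with_self_check`, `apply_changes`, and `fingerprint` are hypothetical names. `apply_changes` stands for whatever runs the I/U/D statements with OUTPUT and returns the captured rows.

```python
from typing import Any, Callable, Sequence, Tuple

Fingerprint = Tuple[int, str]


def apply_slice_with_self_check(
    conn: Any,
    apply_changes: Callable[[Any], Sequence[Sequence[Any]]],
    fingerprint: Callable[[Sequence[Sequence[Any]]], Fingerprint],
    expected: Fingerprint,
) -> bool:
    """Apply one slice, verify it, then COMMIT on match or ROLLBACK otherwise.

    Returns True if the slice was committed, False if it was rolled back
    because the captured rows did not match the expected fingerprint.
    """
    cursor = conn.cursor()
    try:
        # Run the DML inside the open transaction and collect affected rows.
        captured = apply_changes(cursor)
        actual = fingerprint(captured)
        if actual == expected:
            conn.commit()
            return True
        conn.rollback()
        return False
    except Exception:
        # Any error before the decision point also discards the slice.
        conn.rollback()
        raise
    finally:
        cursor.close()
```

A caller would typically log the expected and actual values on a `False` return and route the slice to a retry or quarantine path, since nothing was written to the main table.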

    For isolation:

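    • READ COMMITTED is the default level, and because READ_COMMITTED_SNAPSHOT (RCSI) is on by default in Azure SQL Database, including Hyperscale, readers use row versions instead of blocking on the writer's locks. This is generally the right balance.
    • SERIALIZABLE is only needed if you must stop concurrent sessions from inserting or changing rows in key ranges your transaction has read; it comes with higher contention.
    • SNAPSHOT is optional: it gives the transaction a consistent view of rows it does not modify, but it is not usually required for this write-check pattern and can fail with update conflicts when another session changes the same rows.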
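To make the distinction concrete: READ_COMMITTED_SNAPSHOT is a database option, not a level that a session can pick with SET TRANSACTION ISOLATION LEVEL. The small helper below is a sketch with hypothetical names (`isolation_statement`, `RCSI_CHECK_SQL`) that builds the session statement and rejects anything that is not a T-SQL session isolation level; the query string can be used to confirm that RCSI is on for the current database.

```python
# Session-level isolation levels accepted by SET TRANSACTION ISOLATION LEVEL.
ISOLATION_LEVELS = (
    "READ UNCOMMITTED",
    "READ COMMITTED",
    "REPEATABLE READ",
    "SNAPSHOT",
    "SERIALIZABLE",
)

# Returns 1 when READ_COMMITTED_SNAPSHOT is enabled for the current database.
RCSI_CHECK_SQL = (
    "SELECT is_read_committed_snapshot_on "
    "FROM sys.databases WHERE name = DB_NAME();"
)


def isolation_statement(level: str = "READ COMMITTED") -> str:
    """Build the T-SQL statement that sets the writer session's isolation level."""
    normalized = " ".join(level.upper().split())
    if normalized not in ISOLATION_LEVELS:
        raise ValueError(f"not a session isolation level: {level!r}")
    return f"SET TRANSACTION ISOLATION LEVEL {normalized};"
```

With the default recommended above, the writer can skip the SET statement entirely and only run the RCSI check once at startup.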

    Points to watch:

    • Keep transactions small to avoid version store or log pressure.
    • OUTPUT into temp tables is fine; just monitor tempdb if you capture large sets.
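    • Rollback is fast: Accelerated Database Recovery is always enabled in Azure SQL Database, so rollback time does not grow with the size of the transaction, and in practice short slices won’t hit issues.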

    So, yes - the “pre-commit self-check” approach is appropriate on Hyperscale, with default READ COMMITTED recommended.

    I hope this information helps. Please do let us know if you have any further queries.
