Read-only replica for SQL DWH and AAS suitability

Vadim K 21 Reputation points
2021-07-12T08:33:11.4+00:00

Hi folks!

We're using a Dedicated SQL pool (formerly SQL DW) for a reporting system, mainly for Power BI reports that use DirectQuery against data source views.
Our current tier is the minimum level, Gen2: DW100c, and during the busy hours of the workday it is usually loaded at 50-100% because of frequent read queries and inserts.

1st problem: performance issues caused by "heavy" read queries.
Some reports can't use aggregated data because the data analysts need to see all the raw records (e.g. the full stats for a certain client), and each of these reports slows down the whole server for 5-10 minutes. The data volume is not that big yet, but the views contain many joins and a lot of business logic.
The easiest workaround would be a read-only replica of the DWH, which should in theory solve this issue, but I can't find such a built-in option or any info on how to set it up.

2nd problem: running costs.
We're approaching $1000/month, which seems too high for 100 DWUs (2x-3x compared to our other production RDBMSs).
There are regular, frequent spikes of 100% resource/DWU usage, and such high utilization leads to high costs under the vCore-based pricing model.
We aren't really using any Synapse or Analysis Services features, because all the incoming data is already structured relational data (either from other relational DBs or from back-end services). Also, we have only 5-7 DWH users, who build their reports by importing data into Power BI. So there is no need for a Synapse pool, big-data handling/transformation, and so on.
Maybe for our scenario it would be better to switch to some provisioned tier?


Accepted answer
  1. HimanshuSinha-msft 19,491 Reputation points Microsoft Employee Moderator
    2021-07-14T03:53:48.903+00:00

Hello @Vadim K ,
Thanks for the question, and for using the Microsoft Q&A platform.
Let me start with a disclaimer: it's very difficult to answer a question like this without looking at the model and the data; I'm sure you understand that.

Can you move your Power BI reports from DirectQuery to Import mode?
The advantage is that you can move the join and view logic into Power BI; the downside is that the reports will no longer be real-time and will need some updates, since you will have to wait for the data refresh.

It sounds like you are aware that the raw data is large, and that may be the reason for the spikes.
The question is whether the raw data changes every day. If not, you could move that piece of the data to a blob and then pull the blob data into the Power BI report.

Lots of joins
Could you explore denormalizing the data and see if that helps? You could create a new schema, say reporting, and put the denormalized data in it. The reports would then read the records from there.
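A minimal sketch of that idea, using CTAS (CREATE TABLE AS SELECT), which is the idiomatic way to build such tables in a dedicated SQL pool. The table and column names (dbo.clients, dbo.orders, client_id, etc.) are hypothetical placeholders for your own schema:

```sql
-- Sketch only: a denormalized reporting table built from hypothetical sources.
CREATE SCHEMA reporting;
GO

CREATE TABLE reporting.client_flat
WITH
(
    DISTRIBUTION = HASH(client_id),   -- co-locate each client's rows on one distribution
    CLUSTERED COLUMNSTORE INDEX       -- the default; good for large scans
)
AS
SELECT c.client_id, c.client_name, o.order_id, o.amount, o.created_at
FROM dbo.clients AS c
JOIN dbo.orders  AS o ON o.client_id = c.client_id;
```

The join cost is then paid once at load time instead of on every report refresh.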

Read-only replica
I don't think SQL DWH offers this at this time. I would ask you to file a feature request here.

The product group does monitor these requests and can plan an implementation in the future. Once you log the feature request, you will also be notified about its status. We expect you to keep using this forum and to encourage others to do the same; you can always help other community members by answering their queries.

Could you look at Azure Data Factory / Mapping Data Flows?
You can copy the data to a different SQL DB using ADF / a data flow. The data flow can help with the transformation (which should reduce the spikes you are seeing).

Costs
It sounds to me like you are not really using the DWH features, and maybe it isn't the right option at this time. Read more here:

Although both Azure SQL DB and Azure Synapse are PaaS options on the Microsoft Azure platform, their original purposes are slightly different: Azure SQL DB is well suited to OLTP workloads, and Azure Synapse to OLAP.

Please do let me know how it goes.
Thanks,
Himanshu
Please do consider clicking "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members.


1 additional answer

  1. Vadim K 21 Reputation points
    2021-07-15T10:38:19.743+00:00

    @HimanshuSinha-msft
    Hi Himanshu!
    Thanks for the reply.

Our BI reports use the "Import" data load method, but the DB overload was still significant, so we've already switched them to pre-calculated data tables (with denormalized data). Currently this manual "view materialization" is done by a stored procedure that runs nightly. So far this workaround works well, but I'm looking for a better solution.
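As an aside on that workaround: if the nightly stored procedure mostly aggregates, the dedicated SQL pool's built-in materialized views (which the engine keeps up to date automatically as the base tables change) might replace it. A minimal sketch, assuming hypothetical table and column names (dbo.client_events, client_id, amount):

```sql
-- Sketch only: hypothetical names; materialized views in a dedicated SQL pool
-- require a GROUP BY and COUNT_BIG(*) alongside other aggregates.
CREATE MATERIALIZED VIEW reporting.mv_client_stats
WITH (DISTRIBUTION = HASH(client_id))
AS
SELECT
    client_id,
    COUNT_BIG(*) AS event_count,
    SUM(amount)  AS total_amount
FROM dbo.client_events
GROUP BY client_id;
```

Unlike the nightly job, the view is refreshed incrementally by the engine, at the cost of some overhead on inserts into the base table.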

As for Azure Data Factory - I tried it from the very beginning, but it doesn't fit our data flow/workflow. We have too-frequent changes in the data models and a quite complex sync/transform process. Besides, some stored procedures need table-valued parameters, and the SQL pool has restrictions on user-defined types, MERGE, etc.

Yes, our use case for the DWH is not what Synapse is designed for. We're going to try switching to Azure SQL DB for a while and check the performance/costs.
What would be an appropriate choice there, given that we currently have Gen2: DW100c?
We don't store sensitive/personal info in the DWH (so no need for the DC-series) and don't use OLAP/cubes (so no need for the Fsv2/M-series). Thus Gen5 with 4-8 vCores looks like the right option.
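For reference, that kind of provisioned Gen5 database can be created with plain T-SQL against the logical server's master database; a sketch, where "ReportingDB" and the 250 GB max size are hypothetical choices (GP_Gen5_4 = General Purpose tier, Gen5 hardware, 4 vCores):

```sql
-- Sketch only: run against the master database of the Azure SQL logical server.
CREATE DATABASE ReportingDB
(
    EDITION = 'GeneralPurpose',
    SERVICE_OBJECTIVE = 'GP_Gen5_4',
    MAXSIZE = 250 GB
);
```

The service objective can be changed later with ALTER DATABASE, so starting at 4 vCores and scaling up only if needed keeps the cost comparison with DW100c fair.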

    Best regards,
    Vadim.

