Merge multiple Azure databases into one Azure SQL database and sync them weekly.

Shruti Vyas 1 Reputation point
2022-11-25T09:54:11.687+00:00

Hello All,

I have 100-150 Azure databases with same table schema. There are 300-400 tables in each database. Separate reports are enabled on all these databases.

Now I want to merge these database into a centralized database and generate some Power BI reports from this.

The approach I am thinking is -

  • There will be multiple (30-35) instances of Azure data factory pipeline and each instance will be responsible to merge data from 10-15 databases.
  • These pipeline will be scheduled to run weekly.

Can anyone please guide is the above approach will be feasible in this scenario? Or there could anything else we can go for.

Thanks in Advance.

Azure SQL Database
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,346 questions
Azure Databricks
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
1,910 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,487 questions
{count} votes

1 answer

Sort by: Most helpful
  1. HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
    2022-11-28T22:11:24.713+00:00

    Hello @Shruti Vyas ,
    Thanks for the question and using MS Q&A platform.
    As we understand the ask here is merge tables from different database and run a report from centralized DB , please do let us know if its not accurate.

    What is the end goal

    1. Run reports ? You can do it in many different ways .
      Use PowerBI
      You can use PowerBI only and pull today fiom the databases and directly show the data in PowerBI.

    Use Blob

    You can write data to blob from the different database using ADF . Once the blob ( csv,txt etc ) are written you can merge all the files & create one file which you can then read fron PowerBI .
    Since mainataining data is blob is cheaper then SQL , it will turn out to be cheaper solution
    .

    1. Centrealized the DB

    You can use ADF for this . Since there are so many database , I suggest to enable log analytics .
    One think which is worth considering is that how will maintain the primary key in the centrealized database ., since the data will be duplicates from different database.schema.table you may have repeatation .

    Please do let me if you have any queries.
    Thanks
    Himanshu


    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
      • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    0 comments No comments