I'm accepting my own comment as answer to close this question.
Which is the best option for data warehouse / Power BI solution?
Hello,
I am building a data warehouse in Azure from several external data sources. Data can be imported via pipelines in Data Factory (or Azure Synapse). Data is to be stored in Azure. Reporting through Power BI is required.
There seem to be a multitude of methods to achieve this. I could store the data in an Azure SQL database, or use blob storage. I could use an Azure Data Lake. I could use Azure Analysis Service or Azure Synapse Analytics to integrate with Power BI.
For example, as I understand it, I could do something this:
External DBs-->Azure Data Factory-->Azure Blob Storage/Azure Data Lake-->Azure Synapse Analytics-->Power BI Reports
But it might be better (or worse) to do this:
External DBs-->Azure Data Factory-->Azure SQL Database-->Azure Analytics Service-->Power BI Reports
I am looking for advice on why I would use one particular approach rather than the other.
Thanks.
5 additional answers
Sort by: Most helpful
-
HimanshuSinha-msft 19,476 Reputation points Microsoft Employee
2021-10-19T21:36:07.29+00:00 Hello @System2000
Thanks for the ask and using Microsoft Q&A platform .
I am in agreement with you as there are lot of option and you may choose one over the other . I am not sure about which third party External DBs you are fering to here . PowerBI by iteslf supports lot of external datasource and I think you should check that out and if the you Ext DB are supported may be you can achieved lot by using PowerBI itself .
https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-data-sources.
At this time PowerBI does support many of the AAS feature and unless there is any feature which is avaible in AAS only and you are planning to use that , I will suggest you stick to PowerBI .Please do let me know how it goes .
Thanks
Himanshu-------------------------------------------------------------------------------------------------------------------------
- Please don't forget to click on or upvote button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
- Want a reminder to come back and check responses? Here is how to subscribe to a notification
- If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
-
System2000 236 Reputation points
2021-10-20T09:01:01.673+00:00 Hi @HimanshuSinha-msft ,
Thanks for your answer, but I think you've misinterpreted my question. I am mainly interested in the part in the middle, between ingestion of data and Power BI reporting.
I can handle the ingestion through Data Factory pipelines (or Azure Synapse Analytics pipelines), but I have multiple options on whether to use Data Lake, Azure SQL databases, blob storage, etc - and secondly I have several ways I can get the data to Power BI e.g. through Azure Synapse Analytics, AAS, etc. I don't know which options to use, or why.
This is my current preferred approach:
Several External MS SQL DBs --> ADF --> CSV/blob --> Data Lake --> Azure Synapse Analytics --> Power BIWould you say I should be doing it differently? e.g. you might say "you don't need blob or Data Lake, just use an Azure SQL DB, and then you could use AAS instead of Synapse." Or would you say my approach is fine? Cost of Azure services is not an issue, unless significantly different.
Thanks again.
-
System2000 236 Reputation points
2021-10-20T13:15:08.05+00:00 Hi @HimanshuSinha-msft ,
As a follow up to my previous post, here are some examples of data warehouse architecture that I found on microsoft.com:
https://learn.microsoft.com/en-us/azure/architecture/example-scenario/data/data-warehouse
There are probably more examples, but in each of the above the architecture is different. One is using blob storage, another is using data lake, another is using Azure Synapse Analytics as storage. I'd like to commit to an architecture, but it's difficult to know which.
P.S. I appreciate your help.
-
System2000 236 Reputation points
2021-10-26T15:53:53.403+00:00 Ok. I decided how I'm going to do it. I don't know if this would count as an answer, so I'm leaving it open until someone either agrees or disagrees, or enough time passes without reply.
- I'm going to use Azure Synapse pipelines to ingest data from SQL Server databases and other data sources (utilising integration runtimes for on-prem DBs).
- I'm going to store this data in an Azure Data Lake Gen 2. I believe I will need to convert SQL Server data tables to CSV in the ingestion process in order to save the data in the data lake. This will likely be the staging area for the data.
- I'm going to employ an Azure Synapse SQL pool, although I'm not sure whether serverless or dedicated yet
--- if dedicated pool, then I will manipulate data from the data lake through a Synapse pipeline data flow and save it in the data warehouse for reporting.
---- if serverless, I will create a logical data warehouse and use views to serve the report data - I will link a Power BI workspace(s) to the Azure Synapse workspace, create Power BI reports through Power BI desktop and publish them to Power BI online. From there, the reports can be shared to the organisation.
Does this sound like a good plan? Anyone?