Why SQL Server (and when) instead of Databricks or Data Lake

Avik Mukherjee 1 Reputation point
2022-11-29T11:29:09.547+00:00

First, apologies if my question is not placed in the correct forum; I am really unsure which forum I should raise it in.

I am a SQL developer (and I do a little bit of administration too), currently trying to learn topics like Data Lake, Databricks, Data Factory, etc. on Azure.
A question I constantly get, and for which I too am curious to find an appropriate answer, is: why is SQL Server still around when Azure Databricks/Data Lake allow both structured and unstructured data? Azure Databricks/Data Lake also come with many options to visualize data, create UI, etc., while in SQL DB this is still limited. Cost-wise as well, they seem more manageable compared to the price of an Enterprise license for 32 or more cores of SQL Server. You can write SQL, Python, etc. on top of big data volumes in Databricks/Data Lake.
On the other hand, I still see Microsoft publishing SQL Server 2022, which means they are definitely investing money in it.
My question is: why? What could be a good reason (or scenario) to convince clients that they should use SQL Server, even though we see so many opportunities in Databricks etc.?

Please don't tell me "it depends on situations" ---- those situations are exactly what I am asking about. One of my clients still has lots of SQL DBs (mostly on-prem and some in Azure SQL) and also some applications in Databricks. They are questioning why those applications (running SQL DB) shouldn't move to a data lake etc.

Any expert opinion on when you would advise your customer to stay on SQL Server and when you would suggest they move to Data Lake/Databricks, etc.? I personally have no bias toward either, but I need to know the appropriate case study/situation to design the right solution architecture model.

Thank you so much in advance... if you think this exact question has already been answered correctly somewhere, please share the link so I can read it as well.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
SQL Server Other

4 answers

  1. Bjoern Peters 8,921 Reputation points
    2022-11-29T12:52:51.923+00:00

    Hi @Avik Mukherjee

    Think about the history of SQL Server... in simple terms... SQL Server always was, and still is, a data store for all kinds of business data!
    Yes, for sure, all those excellent services in Azure are fantastic, do a lot of great work, and offer a huge number of features for working with your business data... BUT they are built on top of this underlying data storage...

    You still have to store that data somewhere somehow... all those "new" services were built as a "workaround" for former SQL Server (on-premise) solutions like SSIS, SSAS, or SSRS.

    SSIS is doing all the magic with your data... loading, transforming, merging, deleting, extracting, and filtering thousands of CSV, Parquet, or whatever files into a data "storage."
    Some years ago, the only way to move this service into the cloud was to deploy an additional VM in Azure and run the service there... later, an equivalent to run that magic was introduced as a managed service: Databricks and/or ADF. Data Lake now combines several kinds of storage you had on-premises with some enhanced and optimized feature sets.

    And so on...

    And not every customer wants to have its data in the cloud... so you still need a solution to "store" that data on-premise!

    So you are asking, "When to choose what?"

    First, does your customer need/want a cloud solution?
    If yes, then go ahead with the needs/requirements/pain/budget of your customer...

    In my area, this is what happens often:
    A lot of my customers have a business need. The business defines, for example, a new application/feature... that needs information from one data storage...
    So they decide on one particular application, and that application provider says... "I only support SQL Server 2019 and not any cloud services"...

    And many customers are not those big companies that have such specific requirements for several business use cases that require an individual solution. They need it right now and out of the box. And historically, they are not that flexible to migrate all of those (already existing) on-premise processes to the cloud!

    Here (in Germany), companies are slow in adopting the cloud... newly founded companies might build everything in the cloud, but not all of the old, established ones. Maybe some of the more prominent enterprises are moving slowly into those services by migrating their processes step by step and learning as they go.

    They are questioning why those applications (running SQL DB) shouldn't move to a data lake etc.

    Why should they? => Many applications used in business are not able to run in the cloud! Why change all the processes around those applications? What about the network traffic (performance/costs)? Have you ever tried to connect, for example, BIZ BOOK, to Databricks/Data Lake? That won't work...

    So it still comes to "it depends on situations"... it depends on your customer, on your customer's environment, on your customer's requirements, on your customer's budget, and if your customer is open to new things...

    4 people found this answer helpful.

  2. Tom Phillips 17,771 Reputation points
    2022-11-29T16:05:44.527+00:00

    Your question is more like "when do I use a hammer, a screwdriver, or a drill". Those are all different tools for different jobs. There is some overlap, but they each have their own focus and specialty.

    2 people found this answer helpful.

  3. PandaPan-MSFT 1,931 Reputation points
    2022-11-30T02:05:14.083+00:00

    Hi @Avik Mukherjee ,
    I think the others answered really well; just in case, here are tables that show some of the differences. The pictures are from a Chinese platform, so I can't post the link directly:
    265561-image.png
    265571-image.png

    2 people found this answer helpful.

  4. Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator
    2022-11-29T22:17:32.313+00:00

    In addition to the other posts: Databricks and the like are good for data warehouses, not least when you have unstructured data in files.

    But you would not implement an order-entry system or any other type of OLTP system in Databricks.
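
To make the OLTP point concrete, here is a minimal sketch of what an order-entry write looks like: many small transactions, each of which must either fully succeed or leave no trace. It is not from the answer above. It uses Python's built-in sqlite3 module purely as a runnable stand-in for a relational engine such as SQL Server, and the `stock` and `orders` tables and the `place_order` helper are hypothetical names chosen for illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with hypothetical stock and orders tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock ("
        " product_id INTEGER PRIMARY KEY,"
        " on_hand INTEGER NOT NULL CHECK (on_hand >= 0))"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " product_id INTEGER NOT NULL REFERENCES stock(product_id),"
        " quantity INTEGER NOT NULL CHECK (quantity > 0))"
    )
    conn.execute("INSERT INTO stock (product_id, on_hand) VALUES (1, 5)")
    conn.commit()
    return conn


def place_order(conn, product_id, quantity):
    """Record an order and reduce stock in one transaction.

    Returns True if both statements were committed, False if a constraint
    failed and the whole transaction was rolled back.
    """
    try:
        # The connection context manager commits on success and rolls back
        # if an exception escapes the block.
        with conn:
            conn.execute(
                "INSERT INTO orders (product_id, quantity) VALUES (?, ?)",
                (product_id, quantity),
            )
            conn.execute(
                "UPDATE stock SET on_hand = on_hand - ? WHERE product_id = ?",
                (quantity, product_id),
            )
        return True
    except sqlite3.IntegrityError:
        # For example, the CHECK constraint rejected a negative stock level.
        return False


if __name__ == "__main__":
    db = make_demo_db()
    print(place_order(db, 1, 3))  # enough stock for this order
    print(place_order(db, 1, 4))  # would drive stock negative, so it is rolled back
    print(db.execute("SELECT COUNT(*) FROM orders").fetchone()[0])
```

The second call fails on the stock update, and the order row inserted just before it is rolled back with it. That row-level, all-or-nothing behavior under many concurrent small writes is the workload an OLTP database is built for, whereas a Spark-based platform is oriented toward large analytical scans.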

    1 person found this answer helpful.
