Share via

Blob Data Loading Into Azure Sql Database - Best Approach

jase jackson USA 201 Reputation points
2020-08-06T10:31:42.507+00:00

Hi

Use Case: daily load of blob mutiple csv data into azure sql database (80+ files / daily size approx 50mb- 80mb)

What are the pros and cons to using either

(a) T-Sql to Bulk Insert the data from the blob csv files into the azure sql tables or
(b) Using ADF Copy Activity to loop through the blob files and load into the azure sql database.

Both appear to offer the same capability.

Particularly interested if there are any key considerations/challenges on best practice, azure costing (is one approach significantly cheaper), any future deprecation, particular technical benefits (parallel processing/speed).

Thanks and happy to claryfy where necessary.

Jase

Azure SQL Database
Azure Data Factory
Azure Data Factory

An Azure service for ingesting, preparing, and transforming data at scale.


2 answers

Sort by: Most helpful
  1. KalyanChanumolu-MSFT 8,361 Reputation points
    2020-08-07T10:16:04.207+00:00

    @jase jackson USA Thank you for your interest in Azure SQL Database and Azure Data Factory

    As you rightly pointed out, both solutions allow you to load data into Azure SQL database.

    From your requirement, it appears that you need to schedule this activity daily.

    If you use T-SQL, you will have to depend on an external scheduling solution like an Azure function app or Azure Run Book to run the T-SQL at the scheduled time.

    ADF appears to be a more convenient solution because it offers both scheduling and ingestion capabilities. You can also enhance it for better error handling and logging capabilities in future.

    Please let us know if you have further questions.

    --

    If an answer is helpful, please "Accept answer" or "Up-Vote" for the same which might be beneficial to other community members reading this thread.

    Was this answer helpful?

    1 person found this answer helpful.

  2. KalyanChanumolu-MSFT 8,361 Reputation points
    2020-08-18T15:52:37.707+00:00

    @jase jackson USA

    We cannot really answer this question without looking at your files, the logic in your stored procedure to read and load the files into tables etc.
    If you need the exact DTUs the operation consumes, I would recommend you enable Query performance Insights on your database and run some tests.

    If you have too many records in your files you might run into SQL Azure throttling issues if your database is not adequately scaled. Please test before you go to production.

    If you have a VM already running on Azure, you could schedule the execution of the stored procedure from the VM using PowerShell and not use ADF altogether.

    ADF is not very expensive. Please look at this example to get an overview.

    If it is within your budget, I would still recommend using ADF because you get better control of the execution, ability to resume from last failed run, better logging etc.

    Please let us know if you have further questions.

    ----------

    If an answer is helpful, please "Accept answer" or "Up-Vote" for the same which might be beneficial to other community members reading this thread.

    Was this answer helpful?


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.