Microsoft Tools choice for row by row processing

na 146 Reputation points
2022-08-03T08:48:20.67+00:00

We have a business process in which we must process the records one by one.

  • Get the record from a database.
  • Check whether it exists in another system via an API.
  • If the record exists, do an update via the API.
  • If the record does not exist, do an insert via the API.
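
The steps above can be sketched in plain Python as follows. This is only a minimal illustration, not the pipeline actually in use: the `exists`, `update`, and `insert` callables are hypothetical stand-ins for the real API calls, and the records are assumed to be dictionaries already fetched from the database.

```python
from typing import Any, Callable, Dict, Iterable, Mapping

Record = Mapping[str, Any]


def sync_records(
    records: Iterable[Record],
    exists: Callable[[Record], bool],   # hypothetical: "does it exist?" API call
    update: Callable[[Record], None],   # hypothetical: update API call
    insert: Callable[[Record], None],   # hypothetical: insert API call
) -> Dict[str, int]:
    """Process records one by one: update if present remotely, else insert."""
    counts = {"updated": 0, "inserted": 0, "failed": 0}
    for record in records:
        try:
            if exists(record):
                update(record)
                counts["updated"] += 1
            else:
                insert(record)
                counts["inserted"] += 1
        except Exception:
            # Broad catch so one bad record does not stop the run;
            # a real job would log the error and probably retry.
            counts["failed"] += 1
    return counts


if __name__ == "__main__":
    # In-memory stand-in for the other system, keyed by record id.
    remote = {1: "old name"}

    def put(record: Record) -> None:
        remote[record["id"]] = record["name"]

    rows = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
    print(sync_records(rows, lambda r: r["id"] in remote, put, put))
```

Because the API calls are passed in as functions, the same loop can run in a notebook, a scheduled script, or a function app without changes.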

We are doing this using Data Factory and it works, but it feels cumbersome. It's more suited to simpler movements of lots of data.
We considered and tried Logic Apps, but again, they feel convoluted and can become complex and cumbersome very quickly. They are also not very fast.

Are Azure Functions a better fit for such a pipeline?
Ideally we would do this in Python and simply run a Python notebook, but I know of no such facility in Azure outside of Databricks, which again needs ADF.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.


1 answer

  1. MartinJaffer-MSFT 26,161 Reputation points
    2022-08-03T20:16:14.73+00:00

    Hello @na and welcome to Microsoft Q&A.

    As I understand it, you are looking for a platform/service to run Python, fetch records from a database, and for each record fetched, interact with an API.

    You have not mentioned which flavor of database (Azure SQL, on-prem SQL, NoSQL, Cosmos, Oracle, etc.). Some services have built-in integrations with certain database types.

    You have mentioned Databricks, but I question whether it really needs ADF. I am mostly confident you can do your entire ask in Databricks alone, without ADF. Databricks does have some database integrations.

    Azure Synapse also runs Python, and has integrations with multiple Azure database products. Depending upon your database, it might even be hosted in Synapse! Synapse is like ADF and Databricks put together, with other stuff too.

    If you want something more bare-bones, you could try Azure Batch, or even get your own VM. With those, you bring your own code and whatever else it needs to run.

    As to whether to use Azure Functions, I do have some concerns, mainly about how long you expect the execution to take and which trigger method you would use. There are some timeout durations to worry about, mostly on the Consumption plan and with the HTTP trigger. Durable Functions may help. Azure Functions is outside my area of expertise, so it may still be a valid option.
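
One common way to live with an execution time limit is to process records only until a time budget runs out and hand the rest to the next run. The sketch below is a generic Python illustration of that idea, not an Azure Functions API: `process_within_budget`, `handle`, and the budget value are all hypothetical, and the right budget depends on the hosting plan and trigger in use.

```python
import time
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def process_within_budget(
    items: Iterable[T],
    handle: Callable[[T], None],          # hypothetical per-record work
    budget_seconds: float,                # assumed limit, set below the real timeout
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, List[T]]:
    """Handle items until the budget is spent; return (done, leftovers)."""
    deadline = clock() + budget_seconds
    iterator = iter(items)
    done = 0
    for item in iterator:
        handle(item)
        done += 1
        # Checking after each item guarantees at least one item per run.
        if clock() >= deadline:
            break
    # Whatever was not reached is returned so a later run can pick it up.
    return done, list(iterator)


if __name__ == "__main__":
    done, leftovers = process_within_budget(range(10), print, budget_seconds=60)
    print(f"processed {done}, {len(leftovers)} left for the next run")
```

The leftovers would typically be written to a queue or marked as pending in the database, so that the next invocation continues where this one stopped.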

    I'm sure there are other options as well.
