Best suited Azure solution for Data Integration

monika 1 Reputation point
2021-03-10T01:49:58.133+00:00

I have a requirement to integrate data from various source systems into a Master Data Management (MDM) system. The requirements, in short:

The target system is an MDM (Master Data Management) system, which needs data in XML format.
Multiple source systems supply different sets of attributes (that is, data columns).
The source data arrives in varied forms: CSV files, API responses, SAP extracts landed in Azure Data Lake, and so on.
Some source systems want to send data through an API, others through sFTP, and so on.
My suggested solution is: keep all the source files (in their source format) in ADL >> perform the data transformation (in ADF / Synapse / ADL, I don't know which) >> use an Azure Function to convert the transformed data into XML >> send to the target.
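For illustration, the "convert the transformed data into XML" step could look roughly like the Python sketch below. It uses only the standard library and assumes the transformed data is available as CSV text. The function name `rows_to_xml`, the `Records`/`Record` element names, and the sample columns are hypothetical placeholders, not the MDM's actual schema, and the column names are assumed to be valid XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_xml(csv_text, root_tag="Records", row_tag="Record"):
    """Convert CSV text into a flat XML string, one element per row.

    Assumption: every CSV column name is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Each column becomes a child element holding the cell value.
            field = ET.SubElement(record, column)
            field.text = value
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data standing in for one transformed source file.
sample = "CustomerId,Name\n1,Contoso\n2,Fabrikam\n"
xml_text = rows_to_xml(sample)
print(xml_text)
```

The same function body could sit inside an Azure Function or a Synapse/Databricks notebook cell. A real MDM schema with nesting, attributes, or namespaces would need a mapping layer on top of this.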

I know there are many gaps in this solution. I don't know the data handling capacity of Logic Apps or Functions. Can we use Logic App connectors to fetch the bulk data and place it in data storage?

Please share the approach you think is best suited to this problem statement.

TIA..

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
Azure Logic Apps
An Azure service that automates the access and use of data across clouds without writing code.

2 answers

  1. John Aherne 81 Reputation points
    2021-03-10T05:07:04.38+00:00

    Logic Apps cannot handle large amounts of data. I am not sure about Function Apps.

    Your best bet would be Azure Data Factory Copy activity which can write out to XML, but I guess it depends on how complex the schema is.


  2. MartinJaffer-MSFT 24,091 Reputation points Microsoft Employee
    2021-03-11T18:51:21.6+00:00

    Hello @monika and welcome to Microsoft Q&A.

    In my opinion, you should look at Synapse, rather than Data Factory.

    Synapse includes most Data Factory features and also has notebooks for running custom PySpark or SQL code. This is useful for transformations that Data Factory alone cannot handle. Databricks offers similar notebooks and is worth looking into. Synapse is also closely integrated with Azure Data Lake Storage Gen2.

    The Spark clusters in Databricks and the Spark pools in Synapse are meant for larger data volumes than Function Apps or Logic Apps.
