ADF and external source data handling

azure_learner 200 Reputation points
2024-08-14T11:52:39.4333333+00:00

The business has more than 60 packages (the sources are ERP and other applications) that must be refactored into ADF pipelines and moved to Azure Storage services. The issue is that all of these packages have complex multi-step logic, with dependencies on other packages and sometimes on other sources. Also, most of their code is written in VB scripts, which are not supported in the ADF custom activity. What would be a good approach for handling this? What is a neat and practical way to deal with it? Please help with detailed guidelines and options.

Azure Synapse Analytics
Azure Data Factory

1 answer

  1. Bhargava-MSFT 30,576 Reputation points Microsoft Employee
    2024-08-15T19:11:19.9633333+00:00

    Hello azure_learner,

    Refactoring more than 60 packages with complex multi-step logic and dependencies into Azure Data Factory pipelines can indeed be challenging, especially when dealing with VB scripts that are not supported in ADF custom activities.

    Here are some guidelines and options.

    Identify the dependencies between the packages and sources: Before refactoring, map out which packages depend on which other packages and external sources. This determines the order in which the packages should be refactored and ensures the dependencies are handled correctly; a minimal sketch of ordering the work follows below.
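
    As a minimal sketch in Python, once you have catalogued each package and what it depends on, a topological sort gives a safe refactoring order. The package names here are hypothetical placeholders; graphlib is in the standard library from Python 3.9.

    ```python
    from graphlib import TopologicalSorter  # Python 3.9+

    # package -> set of packages it depends on (hypothetical names)
    dependencies = {
        "LoadCustomers": set(),
        "LoadOrders": {"LoadCustomers"},
        "BuildSalesMart": {"LoadOrders", "LoadCustomers"},
    }

    # static_order() raises graphlib.CycleError on circular dependencies,
    # which is worth discovering before any refactoring starts.
    refactor_order = list(TopologicalSorter(dependencies).static_order())
    print(refactor_order)  # ['LoadCustomers', 'LoadOrders', 'BuildSalesMart']
    ```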

    Break down the packages into smaller, more manageable tasks: Instead of trying to refactor an entire package at once, break it down into smaller, more manageable tasks. This makes it easier to isolate the dependencies and to ensure each task is handled properly.

    Use ADF Data Flow for complex transformations: ADF Mapping Data Flows provide a visual interface for building complex data transformations and are a good option for logic that cannot easily be expressed with ordinary pipeline activities.
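
    If you manage factory resources programmatically, a pipeline that runs an existing Data Flow can be registered with the azure-mgmt-datafactory SDK. This is only a sketch: the resource names are hypothetical, and the model names should be checked against the SDK version you use.

    ```python
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        PipelineResource, ExecuteDataFlowActivity, DataFlowReference,
    )

    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # Reference a Data Flow already authored in the factory (e.g. via the UI).
    activity = ExecuteDataFlowActivity(
        name="TransformOrders",
        data_flow=DataFlowReference(reference_name="OrdersDataFlow"),
    )
    client.pipelines.create_or_update(
        "my-resource-group", "my-data-factory", "OrdersPipeline",
        PipelineResource(activities=[activity]),
    )
    ```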

    Use Azure Functions for custom logic: If the VB scripts contain logic that cannot easily be translated into pipeline activities, consider porting it to Azure Functions. Functions can be invoked from ADF pipelines (via the Azure Function activity) to perform the custom logic and transformations.
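
    As a minimal sketch, an HTTP-triggered Python function (v1 programming model) could reimplement a fragment of the VB script logic. The ADF Azure Function activity expects a JSON response, which is returned here; the business rule shown is a hypothetical example.

    ```python
    import json
    import azure.functions as func

    def main(req: func.HttpRequest) -> func.HttpResponse:
        payload = req.get_json()
        # Hypothetical rule ported from a VB script: derive a customer
        # tier from total sales.
        sales = float(payload.get("total_sales", 0))
        tier = "gold" if sales >= 100_000 else "standard"
        return func.HttpResponse(
            json.dumps({"tier": tier}),
            mimetype="application/json",
        )
    ```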

    Use Azure Batch for parallel processing: If the packages require parallel processing, consider Azure Batch. It runs compute-intensive workloads across a pool of compute nodes and can be integrated with ADF pipelines.
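
    A minimal sketch of fanning work out as parallel Batch tasks with the azure-batch SDK, assuming a pool and job already exist; the account, job, and script names are hypothetical placeholders.

    ```python
    from azure.batch import BatchServiceClient
    from azure.batch.batch_auth import SharedKeyCredentials
    import azure.batch.models as batchmodels

    credentials = SharedKeyCredentials("mybatchaccount", "<account-key>")
    client = BatchServiceClient(
        credentials, batch_url="https://mybatchaccount.region.batch.azure.com"
    )

    # One task per source partition; Batch schedules them across the pool.
    tasks = [
        batchmodels.TaskAddParameter(
            id=f"process-partition-{i}",
            command_line=f"python process_partition.py --partition {i}",
        )
        for i in range(10)
    ]
    client.task.add_collection("refactor-job", tasks)
    ```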

    Use Azure DevOps for version control and deployment: To manage the refactoring effort and ensure changes are properly versioned and deployed, consider Azure DevOps. It provides version control, continuous integration, and continuous deployment, and integrates with ADF (for example, through the factory's Git integration).
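
    As one hedged example, a release step in an Azure DevOps pipeline could run a small Python script that pushes a version-controlled pipeline definition into the target factory via the azure-mgmt-datafactory SDK. The file path and resource names below are hypothetical.

    ```python
    import json
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import PipelineResource

    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # A pipeline definition exported from the factory and kept in the repo.
    with open("pipelines/OrdersPipeline.json") as f:
        definition = json.load(f)

    # from_dict is msrest's generic deserializer; validate the shape of the
    # "properties" node against the SDK version you use.
    client.pipelines.create_or_update(
        "my-resource-group", "my-data-factory", "OrdersPipeline",
        PipelineResource.from_dict(definition["properties"]),
    )
    ```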

    Test and validate the refactored packages: After refactoring the packages into ADF pipelines, test and validate them to confirm they behave like the originals. ADF's pipeline debug runs and monitoring tools can help here.
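
    A minimal sketch of a smoke test with the azure-mgmt-datafactory SDK: trigger a run, poll until it completes, and fail loudly if it did not succeed. Resource names are hypothetical.

    ```python
    import time
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    run = client.pipelines.create_run(
        "my-resource-group", "my-data-factory", "OrdersPipeline"
    )
    while True:
        status = client.pipeline_runs.get(
            "my-resource-group", "my-data-factory", run.run_id
        )
        if status.status not in ("Queued", "InProgress"):
            break
        time.sleep(15)

    assert status.status == "Succeeded", f"Pipeline run failed: {status.message}"
    ```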

    I hope this helps. Please let me know if you have any further questions.

    1 person found this answer helpful.
