Azure Data Factory - Using Two Data Factories

Shivoy Thakral 20 Reputation points
2024-03-13T23:48:24.3033333+00:00

Hi everyone. I have two Data Factories, D1 and D2. D1 is exposed publicly and handles data ingress from external sources into centralised storage. D2 works on the centralised data, copying it to ADLS and processing it further. Because only D1 can push/pull data from outside sources, we had to refactor all our pipelines for these hops. I now have four instances of D2, one per environment (Development through Production), and they all share the single D1. Because of this we had to recreate pipelines with different linked services and datasets so that each copies data and writes logs for the correct environment. The problem is that, for the shared ADF, we now have to update all the recreated pipelines whenever something changes in the Development instance. Is there a better way to deal with this manual effort? We could not figure out a way to find and replace these JSONs so that one definition can serve all environments. I cannot see a way to make changes in only one instance and then deploy them across the corresponding recreated pipelines in the upper environments. Let me know if there is a better approach; I am open to any thoughts for better clarity. A solution would be really impactful here.

Azure Data Factory

1 answer

  1. Arjun Karthikeyan S 155 Reputation points
    2024-03-14T03:13:22.4733333+00:00

    Hi Shivoy Thakral, good day,

    Here are a few suggestions that may make this easier to manage:

    - Define parameters in your ADF pipelines. You can parameterise linked services, datasets, and activity settings, then keep a single pipeline template that uses those parameters, so the same pipeline structure can be deployed across environments (see the linked service sketch after this list).
    - Use dynamic content in your pipelines, and keep environment-specific configuration such as linked service credentials and paths in a centralised store like Azure Key Vault, referencing it from the pipeline. An update then only needs to be made in one place and flows through to every environment.
    - Automate your deployments by establishing a CI/CD pipeline for your ADF, for example with Azure DevOps. This streamlines deployment changes and reduces errors (see the per-environment parameters sketch after this list).
    - Consider Infrastructure as Code for managing your ADF resources, for example with Azure Resource Manager templates.

    Together these give you a more efficient and scalable approach for your pipelines.
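
    To illustrate the first two points, here is a minimal sketch of a parameterised ADLS Gen2 linked service that reads its account key from Key Vault. The linked service names, the storage account parameter, and the secret name are placeholders for illustration, not your actual resources; a single definition like this can serve every D2 instance, with only the parameter and secret values differing per environment.

    ```json
    {
      "name": "LS_AdlsGen2",
      "properties": {
        "type": "AzureBlobFS",
        "parameters": {
          "storageAccountName": { "type": "String" }
        },
        "typeProperties": {
          "url": "@{concat('https://', linkedService().storageAccountName, '.dfs.core.windows.net')}",
          "accountKey": {
            "type": "AzureKeyVaultSecret",
            "store": {
              "referenceName": "LS_KeyVault",
              "type": "LinkedServiceReference"
            },
            "secretName": "adls-account-key"
          }
        }
      }
    }
    ```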
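
    For the CI/CD point, a common pattern is to develop only in the Development factory, publish and export its ARM template, and then deploy that same template to each upper environment with a different parameters file, so the recreated pipelines never need hand-editing again. A hedged sketch of what a per-environment parameters file might look like (the factory and Key Vault names are hypothetical):

    ```json
    {
      "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "factoryName": { "value": "adf-d2-prod" },
        "LS_KeyVault_properties_typeProperties_baseUrl": {
          "value": "https://kv-d2-prod.vault.azure.net/"
        }
      }
    }
    ```

    You would keep one such file per environment and have a release pipeline (for example in Azure DevOps) deploy the same factory template with the matching parameters file.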

    Thank You.

    If the answer is helpful, please click "Accept Answer" and kindly upvote it. If you have an extra question about the answer, please click "Comment".

