What is an effective DevOps code merge and branching strategy for Azure Data Factory?

Smriti, Smriti 65 Reputation points
2024-07-25T18:51:23.5566667+00:00

I have Data Factory instances in the dev, stg, and prod environments. The dev ADF is connected to a Git repo. The branching strategy for a regular feature is: feature branch --> stg branch (collaboration) --> publish, which generates ARM templates in the stg_adf_publish branch --> deploy to stg ADF for QA --> after successful QA, deploy to prod ADF --> then a PR from stg to main is done. The main branch acts as the golden copy / latest prod blueprint.

Now I have another ADF in the stg environment for handling quick/hot fixes, called the release ADF. It is connected to the same repo, with main as the collaboration branch and prd_adf_publish as the publish branch. The current strategy is:

A hot fix for the prod ADF is made by creating a fix branch from main. Once development is done, a PR is created to main, and publishing from main updates live mode. QA tests the fix in this ADF, and once QA is done, the code is deployed to the prod ADF through the prd_adf_publish branch.

Once the prod fix is deployed, the fix is synced from main to the stg branch, and the feature and fix are deployed again to the normal stg ADF so that QA can test the feature in the presence of the fix.

This approach has some issues, for which I am looking for expert advice and solutions:

  1. We are not able to follow the standard practice of updating the main branch after prod deployment. For fix deployments we deviate from the process, because we update the main branch just before prod deployment. Please help in setting up the right branching for this use case.
  2. Each ADF has some managed private endpoints (MPEs) and managed integration runtimes whose names are environment-specific. For example, dev has an MPE named dev-mpe-1, stg has stg-mpe-1, and prod has prd-mpe-1, each with properties specific to its environment. If I am in the stg branch in the dev ADF and create a PR to main (where main already has stg-mpe-1 from the stg release factory), the PR will merge dev-mpe-1 into main, and publishing from either branch will include both the dev and stg MPEs in the ARM templates. Please suggest a solution for preventing cross-environment parameter issues.
  3. In stg, publish_config.json will contain the stg_adf_publish branch, while in main it will contain the prd_adf_publish branch. When merging code between these branches, I want to preserve the publish_config.json contents of each branch. I don't want the values to change, otherwise the publish branch will change in the dev and release factories. How can I achieve this?

Thanks in advance for patiently reading this issue.

Waiting for a response and help in this matter.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Chandra Boorla 8,310 Reputation points Microsoft Vendor
    2024-07-26T06:35:21.8933333+00:00

    Hi @Smriti, Smriti

    Thanks for the question and for using the MS Q&A platform.

    As I understand it, you have three Azure Data Factory instances: Dev ADF, Stg ADF, and Prod ADF. The Dev ADF is connected to a Git repository with the following branching strategy for feature development: Feature branch → Stg branch (collaboration) → Publish to stg_adf_publish → Deploy to Stg ADF for QA → Deploy to Prod ADF after QA → PR from Stg to Main (golden copy). For hot fixes, you use a separate Release ADF in the Stg environment, connected to the same Git repository, with Main as the collaboration branch and prd_adf_publish as the publish branch.

    Based on the above scenario, the following approach might help:

    • To keep things consistent and avoid mistakes, follow this branching strategy. For feature development, create a feature branch and then merge it into the stg branch. Publish the changes to the stg_adf_publish branch and deploy to the staging ADF for QA. After successful QA, create a pull request from the stg branch to the main branch and then deploy to the production ADF through the prd_adf_publish branch. For hot fixes, create a fix branch from the main branch, merge it into the stg branch, perform QA in the staging ADF, and then merge the stg branch back into the main branch before deploying to production through the prd_adf_publish branch. This ensures that the main branch is always updated after successful QA in the staging environment.
    • To prevent issues with parameters across different environments, you should use parameter files or environment-specific configuration files. Create separate parameter files for each environment, such as parameters.dev.json, parameters.stg.json, and parameters.prod.json. During deployment, reference these files to ensure the correct parameters are used. Additionally, use environment-specific branches for configuration files and make sure that merges do not overwrite environment-specific settings by using merge strategies that preserve these files.
    • To maintain the integrity of publish_config.json across environments, keep a separate version of the file in each collaboration branch (stg and main), since each factory reads its publish branch from the copy in its own collaboration branch. Use a script or automation tool to ensure the correct publish_config.json is in place during deployments. Implement merge strategies that prevent publish_config.json from being overwritten during merges, for example by using Git attributes to define how specific files should be merged.
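
    A merge rule may not be applied on every merge path, so a small check that runs in a PR validation or release pipeline can act as a safety net for the second and third points. The following is only a minimal Python sketch under stated assumptions: the branch-to-publish-branch mapping and the dev-/stg-/prd- name prefixes are taken from the question, the parameter file names are the ones suggested above, and the function names are hypothetical. It relies on the publishBranch key that ADF reads from publish_config.json in the root of the collaboration branch.

```python
import json
from pathlib import Path

# Assumption: collaboration branch -> expected publish branch, as described in the question.
EXPECTED_PUBLISH_BRANCH = {
    "stg": "stg_adf_publish",
    "main": "prd_adf_publish",
}

# Assumption: one parameter file per environment, named as suggested in the answer.
PARAMETER_FILES = {
    "dev": "parameters.dev.json",
    "stg": "parameters.stg.json",
    "prod": "parameters.prod.json",
}

# Assumption: environment-specific resources carry these name prefixes
# (dev-mpe-1, stg-mpe-1, prd-mpe-1 in the question).
ENVIRONMENT_PREFIXES = ("dev-", "stg-", "prd-")


def parameter_file_for(environment: str) -> str:
    """Return the parameter file to pass to the deployment for this environment."""
    if environment not in PARAMETER_FILES:
        raise ValueError(f"Unknown environment: {environment!r}")
    return PARAMETER_FILES[environment]


def check_publish_config(repo_root: Path, collaboration_branch: str) -> None:
    """Raise ValueError if publish_config.json names the wrong publish branch."""
    expected = EXPECTED_PUBLISH_BRANCH[collaboration_branch]
    config_path = repo_root / "publish_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    actual = config.get("publishBranch")
    if actual != expected:
        raise ValueError(
            f"publish_config.json on {collaboration_branch!r} points at "
            f"{actual!r}, expected {expected!r}"
        )


def foreign_resources(resource_names: list[str], allowed_prefix: str) -> list[str]:
    """Return resource names that carry another environment's prefix."""
    other_prefixes = tuple(p for p in ENVIRONMENT_PREFIXES if p != allowed_prefix)
    return [name for name in resource_names if name.startswith(other_prefixes)]
```

    For example, a pipeline step on a PR into main could call check_publish_config(Path("."), "main") and fail the build if the merge would switch the publish branch, while foreign_resources(["dev-mpe-1", "stg-mpe-1"], "stg-") would flag dev-mpe-1 as a resource that should not reach a branch used by the stg release factory. How the resource names are collected from the repo is left open here, since that depends on your folder layout.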

    For additional information please refer to the following links:

    https://learn.microsoft.com/en-us/answers/questions/1463197/best-practices-to-organize-the-branchs-about-adf-i

    https://kunaldaskd.medium.com/seamless-integration-and-deployment-of-azure-data-factory-by-using-azure-devops-775832a94f6e

    I hope this information helps! Please do let us know if you have any further questions.


    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.

