Advanced Feature Branch Development

Brian Jones 31 Reputation points
2021-09-27T18:13:40.187+00:00

The CICD Documentation discusses the use of feature branches for development. However, all branches in the repo operate on the same Azure Data Factory and connected services.

We are using feature branches in our ADF development to allow multiple concurrent development activities. The thought was for each of the feature branches to operate independently of each other as they became “ready for prod”.

In our environment, we have dedicated resource groups for DEV, QA, PROD, and each resource group has the following services:

  • Data Factory V2
  • Function App
  • Storage Account Gen2
  • Key Vault
  • Analysis Service
  • SQL Database

Additionally, we have Power BI connected to our Tabular Model in AAS.

The problem we are facing is the singular ADF Dev instance and the fact that all of the changes being made appear in the same “environment”.

As an example, I could have 2 different feature branches each operating on different aspects of the Tabular Data Model:

  • FeatureBranch-A is working with the Customers entity and requires changes to the ADF Pipelines/Data Flows, Tabular Model, and Power BI Reports
  • FeatureBranch-B is working with the Orders entity and also requires changes to the ADF Pipelines/Data Flows, Tabular Model, and Power BI Reports

In the single Dev environment, I can isolate all of the changes in their respective services using feature branches. However, when “debugging” the ADF artifacts, both ADF feature branches are connected to the same resources (SQL DB, AAS, Power BI, etc.). Ideally, the changes being made across each of the feature branches would be independent of and isolated from each other. Essentially, I'm finding that ADF is causing my development to become single threaded: I can't test the changes independently of each other because every branch uses the same linked services.

What I’ve tried…

Attempt # 1:
I spun up a completely new Resource Group (DEV02) with the full complement of services, each containing DEV02 in their names. I then connected the DEV02 ADF instance to my DEV ADF Azure DevOps Git repo, but with my DEV02-specific feature branch as the collaboration branch. I thought I had this working, but then realized that while Live mode observes the DEV02-specific linked services, switching to the Azure DevOps Git repo reverts all of the linked services to those used by DEV (based on what is in the Git repo). I don't want to change any of the values, as they would be merged up to the master branch along with my changes and negatively impact the DEV environment. I suppose this could be 'easily' changed back, but it seems to be a bit of a land mine.

Attempt # 2:
The ADF Linked Services are using the Key Vault to get the:

I was looking to parameterize the pertinent values in the linked services, but the Azure Function does not support this option and is integral in our process. Also, this was a significant overhead change to all existing artifacts to ensure the proper value was being provided for the parameter.

Further Consideration # 1:
I’m considering using a global parameter for Attempt # 2. However, I believe that will spill into my deployment process, and without Azure Function support I’m not sure it will be worthwhile.

Further Consideration # 2:
I’ve also considered just setting up the DEV02 ADF on its own Repo, but then I have to manually migrate the changes from DEV02 to DEV. This seems extremely fragile.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Brian Jones 31 Reputation points
    2021-10-01T17:00:02.7+00:00

    @MartinJaffer-MSFT - Sorry... my response was too long for the reply, so I put it here. :)

    cc: @MarkKromer-MSFT , @Abhishek Narain | MSFT

    I'm not attempting to execute via the Trigger now functionality and am aware of the difference in behavior.

    Unfortunately, the issue I am trying to explain is difficult to convey purely via text; it works better in a conversational format.
    Furthermore, only our DEV instance(s) are currently connected to Git Repos. All other environments are built using the CICD Pipeline and operate in Live mode. These environments do not suffer the same challenges as DEV as they are operating on published artifacts and configuration.

    My challenge remains at the DEV level and specifically when we have conflicting changes that need to be performed on independent tracks.

    We are using key vault where possible to store credentials and connection strings.

    Let's also make the following assumptions:

    • Azure Key Vault is used where possible
    • 3 developers on the team (A, B, and C)
    • changes being performed affect linked service assets
      -- Azure SQL Database
      -- Azure Function
      -- Azure Storage

    Developer A is working on changes in the DEV environment and using the /dev branch to support a bug fix currently in PROD.
    Developer B is working on changes in the DEV environment and using the /fb1 branch to add new functionality for an upcoming release.
    Developer C is working on changes in the DEV environment and using the /fb2 branch to modify the behavior of existing functionality that has the potential to break the way the existing behavior operates.

    If Developers A, B, and C are all working on the same ADF instance they likely can do so without significant issue. However, they will likely run into problems when they need to make changes to tables or stored procedures in Azure SQL or any of the Azure services used by the DEV Data Factory.

    137015-dev-sameresources.png

    One way to avoid these conflicts is to create completely separate ADF and Azure service instances. You can still point the "new" ADF instance to the feature branch within the same repo, but now you run into a problem with the linked service configuration files: they come from the repo, so the ADF-FB1 instance is still connected to the same linked services as ADF-DEV.

    137052-dev-sameresources2branches.png

    Ideally, you would be able to manually edit the configuration files in the linkedService folder to modify:

    • azureFunctionLinkedService.json -> functionAppUrl (does not support dynamic content)
    • dataLakeStorageLinkedService.json -> url
    • keyVaultLinkedService.json -> baseUrl

    This would allow ADF-FB1 to be connected to the FB1 specific Azure instances.
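The manual edit described above could be scripted. The following is only a rough sketch, not something ADF provides: the function name, the override table, and the example.net URLs are all made up for illustration, and it assumes each property sits under properties.typeProperties in the linked service file, as it does in the files our factory exports.

```python
import json
from pathlib import Path

# Hypothetical FB1 endpoints: file name -> (property under typeProperties, new value).
# The URLs are placeholders, not real resources.
FB1_OVERRIDES = {
    "azureFunctionLinkedService.json": ("functionAppUrl", "https://func-fb1.example.net"),
    "dataLakeStorageLinkedService.json": ("url", "https://stfb1.example.net"),
    "keyVaultLinkedService.json": ("baseUrl", "https://kv-fb1.example.net/"),
}


def retarget_linked_services(folder, overrides):
    """Rewrite the listed properties in place and return the names of changed files."""
    changed = []
    for file_name, (prop, value) in overrides.items():
        path = Path(folder) / file_name
        if not path.exists():
            continue  # skip linked services this branch does not define
        doc = json.loads(path.read_text(encoding="utf-8"))
        # Assumption: the property lives under properties.typeProperties.
        type_props = doc.setdefault("properties", {}).setdefault("typeProperties", {})
        if type_props.get(prop) != value:
            type_props[prop] = value
            path.write_text(json.dumps(doc, indent=4) + "\n", encoding="utf-8")
            changed.append(file_name)
    return changed
```

Running `retarget_linked_services("linkedService", FB1_OVERRIDES)` on the /fb1 branch would point those three files at the FB1 instances. It does nothing about the merge problem: the rewritten files still travel with the branch.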

    137053-dev-differentresources.png

    The challenge here is with the merging of changes across the branches. When you merge from /fb1 into /dev, you will overwrite the linked service configs and now DEV will point to the FB1 service instances.

    Potential solutions:

    1. Adding parameterization support for the Azure Function URL would allow us to parameterize the URL and would likely eliminate the need to alter the json config files.
    2. Git magic that would prevent certain commits, files, folders from being included in a merge would prevent the linkedService changes from being promoted when not desired/intended. This would likely need to be done explicitly as sometimes you want them and sometimes you don't.
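For option 2, one explicit, per-merge approach might look like the sketch below. It uses only standard git commands (merge --no-commit, checkout HEAD -- path, commit), but the helper names and the linkedService folder default are assumptions for illustration. It is not a complete solution: the run stops if the merge reports conflicts or if there is nothing to merge, and files that exist only on the source branch are not removed.

```python
import subprocess


def merge_commands(source_branch, protected_path="linkedService"):
    """Return git commands that merge source_branch but keep the current
    branch's version of every file it already has under protected_path."""
    return [
        # Stage the merge result without committing it yet.
        ["git", "merge", "--no-commit", "--no-ff", source_branch],
        # Restore the protected folder from the current branch (HEAD).
        ["git", "checkout", "HEAD", "--", protected_path],
        # Conclude the merge with the restored files.
        ["git", "commit", "-m", f"Merge {source_branch}, keeping local {protected_path}"],
    ]


def run_merge(source_branch, repo_dir=".", protected_path="linkedService"):
    """Run the commands in order; raises CalledProcessError on the first failure,
    for example when the merge step hits conflicts."""
    for cmd in merge_commands(source_branch, protected_path):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

With /dev checked out, `run_merge("fb1")` would bring in the pipeline and data flow changes while DEV keeps its own linked service files. When the linked service changes are actually wanted, a plain merge is used instead, which keeps the choice explicit.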
