Using a cache between dependent stages in Azure DevOps

Lilac 80 Reputation points
2023-08-29T15:12:45.3+00:00

I have two stages in a pipeline: Test and Build (Build depends on Test). I am currently installing Poetry and the other dependencies in both the Test and Build stages. I want to reduce the time it takes the CI pipeline to run. I came across a feature called caching, which should eliminate the need to run the same scripts again in the Build stage. Could someone please help me by rewriting the attached YAML? I am very new to this.

pr: 
    - main

pool:
    vmImage: ubuntu-latest

variables:
    python_version: "3.10"
    package_name: "hatutu"
    notebooks_location: "notebooks"

stages:
- stage: Test
  jobs:
      - job: Test
        displayName: Test
        steps:
            - task: UsePythonVersion@0
              inputs:
                  versionSpec: '$(python_version)'
              displayName: 'Use Python $(python_version)'

            - script: |
                  curl -sSL https://install.python-poetry.org | python3 - --version=1.5.1
                  echo "##vso[task.setvariable variable=PATH]${PATH}:${HOME}/.poetry/bin"
              displayName: 'Install Poetry'

            - script: make install
              displayName: 'Install dependencies'

            - script: |
                  echo "y
                  $(WORKSPACE_REGION_URL)
                  $(DATABRICKS_PAT)
                  $(EXISTING_CLUSTER_ID)
                  $(WORKSPACE_ORG_ID)
                  15001" | poetry run databricks-connect configure
              displayName: 'Configure DBConnect'

            - script: make ci-test
              displayName: 'Run tests'

            - task: PublishTestResults@2
              condition: succeededOrFailed()
              inputs:
                  testResultsFiles: 'test-output/test-*.xml'
              displayName: 'Publish test results'

            - task: PublishCodeCoverageResults@1
              condition: succeededOrFailed()
              inputs:
                  codeCoverageTool: 'cobertura'
                  summaryFileLocation: 'test-output/coverage.xml'
                  pathToSources: '$(package_name)'

            - script: make ci-lint-flake8
              condition: succeededOrFailed()
              displayName: 'Run Flake8 linter'

            - script: make ci-lint-mypy
              condition: succeededOrFailed()
              displayName: 'Run mypy linter'

            - task: PublishTestResults@2
              condition: succeededOrFailed()
              inputs:
                  testResultsFiles: 'test-output/lint-*.xml'
              displayName: 'Publish linting results'

- stage: Build
  dependsOn: Test
  condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
  jobs:
      - job: Build
        displayName: Build
        steps:
            - task: UsePythonVersion@0
              inputs:
                  versionSpec: '$(python_version)'
              displayName: 'Use Python $(python_version)'

            - script: |
                  curl -sSL https://install.python-poetry.org | python3 - --version=1.5.1
                  echo "##vso[task.setvariable variable=PATH]${PATH}:${HOME}/.poetry/bin"
              displayName: 'Install Poetry'

            - script: make install
              displayName: 'Install dependencies'

            - script: make build
              displayName: "Build Package"

            - script: |
                  poetry config repositories.azure https://pkgs.dev.azure.com/$(ORGANIZATION)/_packaging/$(ARTIFACT_FEED)/pypi/upload
                  poetry config http-basic.azure $(AZD_USER) $(AZD_PAT)
                  poetry publish -r azure
              displayName: Publish to $(ARTIFACT_FEED)

            - task: PublishBuildArtifacts@1
              displayName: 'Deploy Notebooks'
              inputs:
                  PathtoPublish: $(notebooks_location)
                  ArtifactName: hatutu-notebooks
Accepted answer
Amira Bedhiafi 33,071 Reputation points Volunteer Moderator
2023-08-30T11:22:51.5766667+00:00

    You can use the Azure DevOps pipeline caching capability (the Cache@2 task) so that the Build stage does not re-download every dependency the Test stage already fetched. Note that caching cannot skip the install steps entirely: each job runs on a fresh agent, so Poetry and the dependencies still have to be installed in both stages, but the installs become much faster because packages are restored from the cache instead of being downloaded again.

    You can update your YAML by incorporating caching for Poetry's download cache:

    pr:
        - main

    pool:
        vmImage: ubuntu-latest

    variables:
        python_version: "3.10"
        package_name: "hatutu"
        notebooks_location: "notebooks"
        POETRY_CACHE_DIR: $(Pipeline.Workspace)/.poetry_cache
        cache_key: 'poetry | "$(Agent.OS)" | "$(python_version)" | poetry.lock'

    stages:
    - stage: Test
      jobs:
          - job: Test
            displayName: Test
            steps:
                - task: UsePythonVersion@0
                  inputs:
                      versionSpec: '$(python_version)'
                  displayName: 'Use Python $(python_version)'

                # Restore the cache. A single Cache@2 task is enough: it
                # restores here and saves back automatically in a post-job
                # step when the job succeeds, so no separate "save" task
                # is needed.
                - task: Cache@2
                  inputs:
                      key: '$(cache_key)'
                      path: $(POETRY_CACHE_DIR)
                      cacheHitVar: CACHE_RESTORED
                  displayName: 'Cache Poetry dependencies'

                - script: |
                      curl -sSL https://install.python-poetry.org | python3 - --version=1.5.1
                      echo "##vso[task.setvariable variable=PATH]${PATH}:${HOME}/.local/bin"
                  displayName: 'Install Poetry'

                - script: make install
                  displayName: 'Install dependencies'

                # ... [Other steps remain unchanged]

    - stage: Build
      dependsOn: Test
      condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
      jobs:
          - job: Build
            displayName: Build
            steps:
                - task: UsePythonVersion@0
                  inputs:
                      versionSpec: '$(python_version)'
                  displayName: 'Use Python $(python_version)'

                # Restore the same cache in the Build stage
                - task: Cache@2
                  inputs:
                      key: '$(cache_key)'
                      path: $(POETRY_CACHE_DIR)
                  displayName: 'Restore Poetry dependencies cache'

                # Poetry and the dependencies must still be installed here
                # (this job runs on a fresh agent), but Poetry now resolves
                # packages from the cache instead of re-downloading them
                - script: |
                      curl -sSL https://install.python-poetry.org | python3 - --version=1.5.1
                      echo "##vso[task.setvariable variable=PATH]${PATH}:${HOME}/.local/bin"
                  displayName: 'Install Poetry'

                - script: make install
                  displayName: 'Install dependencies'

                - script: make build
                  displayName: "Build Package"

                # ... [Other steps remain unchanged]

    Changes I made:

    • Introduced a cache_key variable so both stages compute the same key from the OS, the Python version, and the contents of poetry.lock; changing poetry.lock automatically invalidates the cache.
    • Introduced a POETRY_CACHE_DIR variable so Poetry writes its download cache to $(Pipeline.Workspace)/.poetry_cache, the path the Cache@2 task saves and restores (Poetry reads this environment variable natively).
    • Introduced one Cache@2 task per stage. A single task both restores the cache where it appears and saves it back in an automatic post-job step, so no separate "save" task is required.
    • Kept the Poetry and dependency install steps in the Build stage: each job starts on a fresh agent, so they still run, but they hit the cache instead of the network.
    • Corrected the PATH update to $HOME/.local/bin, which is where install.python-poetry.org places the poetry executable.
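
    If you want the Build stage to skip `make install` entirely on a warm cache, rather than just speed it up, one option is to cache the virtual environment itself and gate the install step on the task's cacheHitVar output. This is a sketch only: it assumes your Makefile runs poetry install with `virtualenvs.in-project` enabled, so the virtualenv lands in .venv at the repository root (adjust the path if yours differs).

    ```yaml
    # Sketch: cache the virtualenv itself, assuming it is created
    # at .venv in the repository root (virtualenvs.in-project true)
    - task: Cache@2
      inputs:
          key: 'venv | "$(Agent.OS)" | "$(python_version)" | poetry.lock'
          path: $(System.DefaultWorkingDirectory)/.venv
          cacheHitVar: VENV_RESTORED
      displayName: 'Cache virtualenv'

    # Runs only when the virtualenv was not restored from the cache
    - script: make install
      condition: ne(variables.VENV_RESTORED, 'true')
      displayName: 'Install dependencies'
    ```

    Caching the virtualenv is faster on a hit but more fragile than caching only Poetry's downloads: the key must capture everything the environment depends on (here the OS, the Python version, and poetry.lock), otherwise a stale environment can be restored.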

0 additional answers
