Azure DevOps Services
Pipeline caching can help reduce build time by allowing the outputs or downloaded dependencies from one run to be reused in later runs, reducing or avoiding the cost of recreating or redownloading the same files. Caching is especially useful when the same dependencies are downloaded over and over at the start of each run, which is often a time-consuming process involving hundreds or thousands of network calls.
Caching can be effective at improving build time provided the time to restore and save the cache is less than the time to produce the output again from scratch. Because of this, caching may not be effective in all scenarios and may actually have a negative impact on build time.
Caching is currently supported in CI and deployment jobs, but not classic release jobs.
When to use artifacts versus caching
Pipeline caching and pipeline artifacts perform similar functions but are designed for different scenarios and shouldn't be used interchangeably.
Use pipeline artifacts when you need to take specific files produced in one job and share them with other jobs (and these other jobs will likely fail without them).
Use pipeline caching when you want to improve build time by reusing files from previous runs (and not having these files won't impact the job's ability to run).
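As a rough illustration of the difference, the sketch below uses both in one pipeline. The job names, the artifact name (drop), the cache folder, and the build.sh and test.sh scripts are placeholders chosen for this example, not part of any real project.

```yaml
jobs:
- job: build
  steps:
  # Cache: an optimization only. The job still works on a cache miss.
  - task: Cache@2
    inputs:
      key: 'deps | "$(Agent.OS)" | yarn.lock'
      path: $(Pipeline.Workspace)/.yarn
    displayName: Restore dependency cache
  - script: ./build.sh
  # Artifact: a required hand-off of build output to the next job.
  - publish: $(Build.ArtifactStagingDirectory)
    artifact: drop

- job: test
  dependsOn: build
  steps:
  # Without this artifact the test job has nothing to run against.
  - download: current
    artifact: drop
  - script: ./test.sh
```

If the cache step were removed, the build would only get slower; if the artifact steps were removed, the test job would fail.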
Pipeline caching and pipeline artifacts are free for all tiers (free and paid). See Artifacts storage consumption for more details.
Cache task: how it works
Caching is added to a pipeline using the Cache task. This task works like any other task and is added to the
steps section of a job.
When a cache step is encountered during a run, the task restores the cache based on the provided inputs. If no cache is found, the step completes and the next step in the job is run.
After all steps in the job have run and assuming a successful job status, a special "Post-job: Cache" step is automatically added and triggered for each "restore cache" step that wasn't skipped. This step is responsible for saving the cache.
Caches are immutable, meaning that once a cache is created, its contents cannot be changed.
Configure the Cache task
The Cache task has two required arguments: key and path:
- path: the path of the folder to cache. Can be an absolute or a relative path. Relative paths are resolved against
You can use predefined variables to store the path to the folder you want to cache; however, wildcards are not supported.
- key: should be set to the identifier for the cache you want to restore or save. Keys are composed of a combination of string values, file paths, or file patterns, where each segment is separated by a
Fixed value (like the name of the cache or a tool name) or taken from an environment variable (like the current OS or current job name)
Path to a specific file whose contents will be hashed. This file must exist at the time the task is run. Keep in mind that any key segment that "looks like a file path" will be treated like a file path. In particular, this includes segments containing a period (.). This could result in the task failing when this "file" doesn't exist.
To avoid a path-like string segment from being treated like a file path, wrap it with double quotes, for example:
"my.key" | $(Agent.OS) | key.file
Comma-separated list of glob-style wildcard patterns that must match at least one file. For example:
**/yarn.lock: all yarn.lock files under the sources directory
*/asset.json, !bin/**: all asset.json files located in a directory under the sources directory, except under the bin directory
The contents of any file identified by a file path or file pattern is hashed to produce a dynamic cache key. This is useful when your project has file(s) that uniquely identify what is being cached. For example, files like
Pipfile.lock are commonly referenced in a cache key since they all represent a unique set of dependencies.
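The three kinds of key segment described above can be combined in a single key. The following is a minimal sketch; the string tools.v1 and the cache folder are made up for illustration, and the file pattern reuses the exclusion style shown earlier.

```yaml
steps:
- task: Cache@2
  inputs:
    # Segment 1: a fixed string. It contains a period, so it is wrapped in
    #            double quotes to keep it from being treated as a file path.
    # Segment 2: a predefined variable holding the agent's operating system.
    # Segment 3: a file pattern. The contents of every matching file are
    #            hashed into the key; files under bin are excluded.
    key: '"tools.v1" | "$(Agent.OS)" | **/yarn.lock, !bin/**'
    path: $(Pipeline.Workspace)/.yarn
  displayName: Cache with string, variable, and file-pattern segments
```

Because the third segment is a hash of file contents, any change to a matching yarn.lock file produces a different key and therefore a new cache.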
Relative file paths or file patterns are resolved against
Here's an example showing how to cache dependencies installed by Yarn:
variables:
  YARN_CACHE_FOLDER: $(Pipeline.Workspace)/.yarn

steps:
- task: Cache@2
  inputs:
    key: '"yarn" | "$(Agent.OS)" | yarn.lock'
    restoreKeys: |
      "yarn" | "$(Agent.OS)"
      "yarn"
    path: $(YARN_CACHE_FOLDER)
  displayName: Cache Yarn packages

- script: yarn --frozen-lockfile
In this example, the cache key contains three parts: a static string ("yarn"), the OS the job is running on since this cache is unique per operating system, and the hash of the
yarn.lock file that uniquely identifies the set of dependencies in the cache.
On the first run after the task is added, the cache step will report a "cache miss" since the cache identified by this key doesn't exist. After the last step, a cache will be created from the files in
$(Pipeline.Workspace)/.yarn and uploaded. On the next run, the cache step will report a "cache hit" and the contents of the cache will be downloaded and restored.
Pipeline.Workspace is the local path on the agent running your pipeline where all directories are created. This variable has the same value as
restoreKeys can be used to query against multiple exact keys or key prefixes. It provides a fallback for the case where the key doesn't yield a hit. A restore key searches for a key by prefix and yields the most recently created cache entry as a result. This is useful if the pipeline is unable to find an exact match but wants to use a partial cache hit instead. To specify multiple restore keys, put each one on its own line (see the example for more details). Restore keys are tried in order from top to bottom.
Required software on self-hosted agent
|Archive software / Platform||Windows||Linux||Mac|
The above executables need to be in a folder listed in the PATH environment variable. Hosted agents come with this software included, so this requirement applies only to self-hosted agents.
Here's an example of how to use restore keys with Yarn:
variables:
  YARN_CACHE_FOLDER: $(Pipeline.Workspace)/.yarn

steps:
- task: Cache@2
  inputs:
    key: '"yarn" | "$(Agent.OS)" | yarn.lock'
    restoreKeys: |
      yarn | "$(Agent.OS)"
      yarn
    path: $(YARN_CACHE_FOLDER)
  displayName: Cache Yarn packages

- script: yarn --frozen-lockfile
In this example, the cache task first checks whether the key exists in the cache. If it doesn't, the task tries the first restore key, yarn | $(Agent.OS). This searches for all keys that either exactly match that restore key or have it as a prefix. A prefix hit can happen if a cache was saved with a different yarn.lock hash segment.
For example, if the key yarn | $(Agent.OS) | old-yarn.lock was in the cache, where old-yarn.lock yielded a different hash than yarn.lock, the restore key yields a partial hit.
If there's a miss on the first restore key, the task uses the next restore key, yarn, which matches any key that starts with yarn. For prefix hits, the result is the most recently created cache entry.
A pipeline can have one or more caching task(s). There is no limit on the caching storage capacity, and jobs and tasks from the same pipeline can access and share the same cache.
Cache isolation and security
To ensure isolation between caches from different pipelines and different branches, every cache belongs to a logical container called a scope. Scopes provide a security boundary that ensures a job from one pipeline cannot access the caches from a different pipeline, and a job building a PR has read access to the caches for the PR's target branch (for the same pipeline), but cannot write (create) caches in the target branch's scope.
When a cache step is encountered during a run, the cache identified by the key is requested from the server. The server then looks for a cache with this key from the scopes visible to the job, and returns the cache (if available). On cache save (at the end of the job), a cache is written to the scope representing the pipeline and branch. See below for more details.
CI, manual, and scheduled runs
|main branch (default branch)||Yes||No|
Pull request runs
|Intermediate branch (such as
|main branch (default branch)||Yes||No|
Pull request fork runs
|Intermediate branch (such as
|main branch (default branch)||Yes||No|
Because caches are already scoped to a project, pipeline, and branch, there is no need to include any project, pipeline, or branch identifiers in the cache key.
Conditioning on cache restoration
In some scenarios, the successful restoration of the cache should cause a different set of steps to be run. For example, a step that installs dependencies can be skipped if the cache was restored. This is possible using the
cacheHitVar task input. Setting this input to the name of an environment variable will cause the variable to be set to
true when there's a cache hit,
inexact on a restore key cache hit, otherwise it will be set to
false. This variable can then be referenced in a step condition or from within a script.
In the following example, the
install-deps.sh step is skipped when the cache is restored:
steps:
- task: Cache@2
  inputs:
    key: mykey | mylockfile
    restoreKeys: mykey
    path: $(Pipeline.Workspace)/mycache
    cacheHitVar: CACHE_RESTORED

- script: install-deps.sh
  condition: ne(variables.CACHE_RESTORED, 'true')

- script: build.sh
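The variable can also be read from within a script, which makes it possible to treat an exact hit and a restore-key hit differently. The sketch below assumes the variable is exposed to the script as an environment variable of the same name, and reuses the placeholder install-deps.sh script from the example above.

```yaml
steps:
- task: Cache@2
  inputs:
    key: mykey | mylockfile
    restoreKeys: mykey
    path: $(Pipeline.Workspace)/mycache
    cacheHitVar: CACHE_RESTORED

- bash: |
    case "$CACHE_RESTORED" in
      true)
        # Exact key match: the cache corresponds to the current lock file.
        echo "Exact cache hit, skipping dependency install"
        ;;
      inexact)
        # Restore-key match: contents may be stale, so top them up.
        echo "Partial cache hit, refreshing dependencies"
        ./install-deps.sh
        ;;
      *)
        # Cache miss (value is false): install from scratch.
        ./install-deps.sh
        ;;
    esac
  displayName: Install dependencies if needed
```

Whether a partial hit warrants a full reinstall depends on the package manager; many of them only download what is missing from the restored folder.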
For Ruby projects using Bundler, override the
BUNDLE_PATH environment variable used by Bundler to set the path Bundler will look for Gems in.
variables:
  BUNDLE_PATH: $(Pipeline.Workspace)/.bundle

steps:
- task: Cache@2
  displayName: Bundler caching
  inputs:
    key: 'gems | "$(Agent.OS)" | Gemfile.lock'
    path: $(BUNDLE_PATH)
    restoreKeys: |
      gems | "$(Agent.OS)"
      gems
Ccache is a compiler cache for C/C++. To use Ccache in your pipeline make sure
Ccache is installed, and optionally added to your
PATH (see Ccache run modes). Set the
CCACHE_DIR environment variable to a path under
$(Pipeline.Workspace) and cache this directory.
variables:
  CCACHE_DIR: $(Pipeline.Workspace)/ccache

steps:
- bash: |
    sudo apt-get install ccache -y
    echo "##vso[task.prependpath]/usr/lib/ccache"
  displayName: Install ccache and update PATH to use linked versions of gcc, cc, etc

- task: Cache@2
  displayName: Ccache caching
  inputs:
    key: 'ccache | "$(Agent.OS)" | $(Build.SourceVersion)'
    path: $(CCACHE_DIR)
    restoreKeys: |
      ccache | "$(Agent.OS)"
See Ccache configuration settings for more details.
Caching Docker images can significantly reduce the time it takes to run your pipeline.
variables:
  repository: 'myDockerImage'
  dockerfilePath: '$(Build.SourcesDirectory)/app/Dockerfile'
  tag: '$(Build.BuildId)'

pool:
  vmImage: 'ubuntu-latest'

steps:
- task: Cache@2
  displayName: Cache task
  inputs:
    key: 'docker | "$(Agent.OS)" | cache'
    path: $(Pipeline.Workspace)/docker
    cacheHitVar: CACHE_RESTORED  # Variable to set to 'true' when the cache is restored

- script: |
    docker load -i $(Pipeline.Workspace)/docker/cache.tar
  displayName: Docker restore
  condition: and(not(canceled()), eq(variables.CACHE_RESTORED, 'true'))

- task: Docker@2
  displayName: 'Build Docker'
  inputs:
    command: 'build'
    repository: '$(repository)'
    dockerfile: '$(dockerfilePath)'
    tags: |
      '$(tag)'

- script: |
    mkdir -p $(Pipeline.Workspace)/docker
    docker save -o $(Pipeline.Workspace)/docker/cache.tar $(repository):$(tag)
  displayName: Docker save
  condition: and(not(canceled()), not(failed()), ne(variables.CACHE_RESTORED, 'true'))
- key: (required) - a unique identifier for the cache.
- path: (required) - path of the folder or file that you want to cache.
For Golang projects, you can specify the packages to be downloaded in the go.mod file. If your
GOCACHE variable isn't already set, set it to where you want the cache to be downloaded.
variables:
  GO_CACHE_DIR: $(Pipeline.Workspace)/.cache/go-build/

steps:
- task: Cache@2
  inputs:
    key: 'go | "$(Agent.OS)" | go.mod'
    restoreKeys: |
      go | "$(Agent.OS)"
    path: $(GO_CACHE_DIR)
  displayName: Cache GO packages
Using Gradle's built-in caching support can have a significant impact on build time. To enable the build cache, set the
GRADLE_USER_HOME environment variable to a path under
$(Pipeline.Workspace) and either run your build with
--build-cache or add
org.gradle.caching=true to your
variables:
  GRADLE_USER_HOME: $(Pipeline.Workspace)/.gradle

steps:
- task: Cache@2
  inputs:
    key: 'gradle | "$(Agent.OS)" | **/build.gradle.kts' # Swap build.gradle.kts for build.gradle when using Groovy
    restoreKeys: |
      gradle | "$(Agent.OS)"
      gradle
    path: $(GRADLE_USER_HOME)
  displayName: Configure gradle caching

- task: Gradle@2
  inputs:
    gradleWrapperFile: 'gradlew'
    tasks: 'build'
    options: '--build-cache'
  displayName: Build

- script: |
    # stop the Gradle daemon to ensure no files are left open (impacting the save cache operation later)
    ./gradlew --stop
  displayName: Gradlew stop
- restoreKeys: The fallback keys if the primary key fails (Optional)
Caches are immutable: once a cache with a particular key is created for a specific scope (branch), the cache cannot be updated. This means that if the key is a fixed value, subsequent builds for the same branch cannot update the cache, even if the contents of the cached folder have changed. If you want to use a fixed key value, you must use the
restoreKeys argument as a fallback option.
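One possible way to apply this, sketched below under assumptions, is to keep the fixed part of the key as a restore key and append a segment that changes on every run. The cache name mycache and its folder are placeholders; $(Build.BuildId) is the predefined build ID variable.

```yaml
steps:
- task: Cache@2
  inputs:
    # The trailing build ID makes the key unique per run, so each
    # successful job saves a fresh cache entry instead of being blocked
    # by an existing immutable one.
    key: 'mycache | "$(Agent.OS)" | $(Build.BuildId)'
    # The fixed prefix restores the most recently created entry.
    restoreKeys: |
      mycache | "$(Agent.OS)"
    path: $(Pipeline.Workspace)/mycache
  displayName: Cache with a fixed prefix and a per-run suffix
```

The trade-off is that every run uploads a new cache, so this pattern only pays off when saving the cache is cheaper than rebuilding its contents.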
Maven has a local repository where it stores downloads and built artifacts. To enable caching of this repository, set the
maven.repo.local option to a path under
$(Pipeline.Workspace) and cache this folder.
variables:
  MAVEN_CACHE_FOLDER: $(Pipeline.Workspace)/.m2/repository
  MAVEN_OPTS: '-Dmaven.repo.local=$(MAVEN_CACHE_FOLDER)'

steps:
- task: Cache@2
  inputs:
    key: 'maven | "$(Agent.OS)" | **/pom.xml'
    restoreKeys: |
      maven | "$(Agent.OS)"
      maven
    path: $(MAVEN_CACHE_FOLDER)
  displayName: Cache Maven local repo

- script: mvn install -B -e
If you're using a Maven task, make sure to also pass the
MAVEN_OPTS variable because it gets overwritten otherwise:
- task: Maven@4
  inputs:
    mavenPomFile: 'pom.xml'
    mavenOptions: '-Xmx3072m $(MAVEN_OPTS)'
If you use
PackageReferences to manage NuGet dependencies directly within your project file and have a
packages.lock.json file, you can enable caching by setting the
NUGET_PACKAGES environment variable to a path under
$(UserProfile) and caching this directory. See Package reference in project files for more details on how to lock dependencies.
If your repository contains multiple packages.lock.json files, you can still use the following example without making any changes. The contents of all the packages.lock.json files are hashed, and if any of the files changes, a new cache key is generated.
variables:
  NUGET_PACKAGES: $(Pipeline.Workspace)/.nuget/packages

steps:
- task: Cache@2
  inputs:
    key: 'nuget | "$(Agent.OS)" | $(Build.SourcesDirectory)/**/packages.lock.json'
    restoreKeys: |
      nuget | "$(Agent.OS)"
      nuget
    path: $(NUGET_PACKAGES)
  displayName: Cache NuGet packages
There are different ways to enable caching in a Node.js project, but the recommended way is to cache npm's shared cache directory. This directory is managed by npm and contains a cached version of all downloaded modules. During install, npm checks this directory first (by default) for modules, which can reduce or eliminate network calls to the public npm registry or to a private registry.
Because the default path to npm's shared cache directory is not the same across all platforms, it's recommended to override the
npm_config_cache environment variable to a path under
$(Pipeline.Workspace). This also ensures the cache is accessible from container and non-container jobs.
variables:
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: Cache@2
  inputs:
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    restoreKeys: |
      npm | "$(Agent.OS)"
    path: $(npm_config_cache)
  displayName: Cache npm

- script: npm ci
If your project doesn't have a
package-lock.json file, reference the
package.json file in the cache key input instead.
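A minimal sketch of that variant follows. It assumes the same cache folder as the example above and swaps npm ci for npm install, since npm ci requires a lock file.

```yaml
variables:
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
- task: Cache@2
  inputs:
    # package.json is hashed instead of package-lock.json, so the key
    # changes whenever the declared dependencies change.
    key: 'npm | "$(Agent.OS)" | package.json'
    restoreKeys: |
      npm | "$(Agent.OS)"
    path: $(npm_config_cache)
  displayName: Cache npm

- script: npm install
```

Because package.json usually declares version ranges rather than exact versions, this key is a looser fingerprint of the installed dependencies than one based on a lock file.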
Because npm ci deletes the node_modules folder to ensure that a consistent, repeatable set of modules is used, you should avoid caching node_modules when calling npm ci.
Like with npm, there are different ways to cache packages installed with Yarn. The recommended way is to cache Yarn's shared cache folder. This directory is managed by Yarn and contains a cached version of all downloaded packages. During install, Yarn checks this directory first (by default) for modules, which can reduce or eliminate network calls to public or private registries.
variables:
  YARN_CACHE_FOLDER: $(Pipeline.Workspace)/.yarn

steps:
- task: Cache@2
  inputs:
    key: 'yarn | "$(Agent.OS)" | yarn.lock'
    restoreKeys: |
      yarn | "$(Agent.OS)"
      yarn
    path: $(YARN_CACHE_FOLDER)
  displayName: Cache Yarn packages

- script: yarn --frozen-lockfile
Set up your pipeline caching with Anaconda environments:
variables:
  CONDA_CACHE_DIR: /usr/share/miniconda/envs

# Add conda to system path
steps:
- script: echo "##vso[task.prependpath]$CONDA/bin"
  displayName: Add conda to PATH

- bash: |
    sudo chown -R $(whoami):$(id -ng) $(CONDA_CACHE_DIR)
  displayName: Fix CONDA_CACHE_DIR directory permissions

- task: Cache@2
  displayName: Use cached Anaconda environment
  inputs:
    key: 'conda | "$(Agent.OS)" | environment.yml'
    restoreKeys: |
      python | "$(Agent.OS)"
      python
    path: $(CONDA_CACHE_DIR)
    cacheHitVar: CONDA_CACHE_RESTORED

- script: conda env create --quiet --file environment.yml
  displayName: Create Anaconda environment
  condition: eq(variables.CONDA_CACHE_RESTORED, 'false')
- task: Cache@2
  displayName: Cache Anaconda
  inputs:
    key: 'conda | "$(Agent.OS)" | environment.yml'
    restoreKeys: |
      python | "$(Agent.OS)"
      python
    path: $(CONDA)/envs
    cacheHitVar: CONDA_CACHE_RESTORED

- script: conda env create --quiet --file environment.yml
  displayName: Create environment
  condition: eq(variables.CONDA_CACHE_RESTORED, 'false')
For PHP projects using Composer, override the
COMPOSER_CACHE_DIR environment variable used by Composer.
variables:
  COMPOSER_CACHE_DIR: $(Pipeline.Workspace)/.composer

steps:
- task: Cache@2
  inputs:
    key: 'composer | "$(Agent.OS)" | composer.lock'
    restoreKeys: |
      composer | "$(Agent.OS)"
      composer
    path: $(COMPOSER_CACHE_DIR)
  displayName: Cache composer

- script: composer install
Known issues and feedback
If you're experiencing issues setting up caching for your pipeline, check the list of open issues in the microsoft/azure-pipelines-tasks repo. If you don't see your issue listed, create a new one and provide the necessary information about your scenario.
Q: Can I clear a cache?
A: Clearing a cache is currently not supported. However, you can add a string literal (such as version2) to your existing cache key to change the key in a way that avoids any hits on existing caches. For example, change the cache key from this:
key: 'yarn | "$(Agent.OS)" | yarn.lock'
to this:
key: 'version2 | yarn | "$(Agent.OS)" | yarn.lock'
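To avoid editing the key by hand each time, the version string can be kept in a pipeline variable. This is only a sketch: CACHE_VERSION is a made-up variable name, and the cache folder matches the Yarn examples above.

```yaml
variables:
  # Hypothetical variable: change its value to stop matching existing caches.
  CACHE_VERSION: version2
  YARN_CACHE_FOLDER: $(Pipeline.Workspace)/.yarn

steps:
- task: Cache@2
  inputs:
    # The version segment is double-quoted so that a value containing a
    # period (for example v2.1) is not mistaken for a file path.
    key: '"$(CACHE_VERSION)" | yarn | "$(Agent.OS)" | yarn.lock'
    path: $(YARN_CACHE_FOLDER)
  displayName: Cache Yarn packages
```

Changing the variable has the same effect as rewriting the key: earlier caches are no longer matched and eventually expire on their own.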
Q: When does a cache expire?
A: Caches expire after seven days of no activity.
Q: When does the cache get uploaded?
A: After the last step of your pipeline a cache will be created from your cache
path and uploaded. See the example for more details.
Q: Is there a limit on the size of a cache?
A: There's no enforced limit on the size of individual caches or the total size of all caches in an organization.