What are volumes?

Volumes are Unity Catalog objects that govern access to non-tabular data. They provide a logical layer over cloud object storage so you can store, organize, and manage files with centralized governance.

For comprehensive documentation on volumes, see What are Unity Catalog volumes?.
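Files in a volume are addressed with a governed path of the form `/Volumes/<catalog>/<schema>/<volume>/<path>`. As a minimal sketch, the helper below assembles such a path; the catalog, schema, and volume names in the example are hypothetical.

```python
def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Return the governed file-system path for a file inside a Unity Catalog volume.

    Volumes are exposed under /Volumes/<catalog>/<schema>/<volume>/... on
    Databricks compute; this helper only builds the path string.
    """
    return "/".join(["/Volumes", catalog, schema, volume, *parts]).rstrip("/")

# Hypothetical names for illustration:
print(volume_path("main", "default", "raw_files", "2024", "events.json"))
# /Volumes/main/default/raw_files/2024/events.json
```

The same path works from SQL, Python, and shell tooling on Databricks compute, because volumes present the object store as an ordinary file system.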

Unity Catalog supports two types of volumes:

  • Managed volumes: Azure Databricks manages the lifecycle and cloud storage location
  • External volumes: You control the cloud storage location and lifecycle
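The distinction between the two types shows up at creation time: an external volume takes an explicit `LOCATION`, while a managed volume does not. As a sketch, the function below builds the corresponding `CREATE VOLUME` SQL statements; the object names and storage URL in the example are hypothetical.

```python
from typing import Optional

def create_volume_sql(catalog: str, schema: str, name: str,
                      location: Optional[str] = None) -> str:
    """Build a CREATE VOLUME statement.

    Without a location, the statement creates a managed volume (Databricks
    manages the storage); with one, it creates an external volume at that
    cloud storage path.
    """
    target = f"{catalog}.{schema}.{name}"
    if location is None:
        return f"CREATE VOLUME {target}"
    return f"CREATE EXTERNAL VOLUME {target} LOCATION '{location}'"

# Hypothetical examples:
print(create_volume_sql("main", "default", "raw"))
# CREATE VOLUME main.default.raw
print(create_volume_sql("main", "default", "landing",
                        "abfss://container@account.dfs.core.windows.net/landing"))
# CREATE EXTERNAL VOLUME main.default.landing LOCATION 'abfss://container@account.dfs.core.windows.net/landing'
```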

What can you do with Unity Catalog volumes?

You can perform file management operations on volumes using multiple interfaces and tools.

You can use volumes with Databricks features that require a file system path. Volumes give you a governed path that works consistently across users and workspaces. For example:

  • Data ingestion: Use volumes as the source location for data ingestion, starting from files in a volume and loading them into tables.
  • Compute log delivery: Configure compute log delivery to write logs into a volume path, so log access is governed by Unity Catalog. See Compute log delivery.
  • File arrival triggers: Use file arrival triggers to start Lakeflow Jobs when new files arrive in a volume. See Trigger jobs when new files arrive.
  • Cluster libraries: Install cluster libraries from a volume (JARs, wheels, requirements.txt), so library access is governed by Unity Catalog. See Install libraries from a volume.
  • Init scripts: Store and run cluster-scoped init scripts from a volume, so access to init scripts is governed by Unity Catalog. See Cluster-scoped init scripts.
  • ML experiment artifacts: Store ML experiment artifacts (models, metrics, and output files) in a volume so access to your MLflow experiment outputs is governed by Unity Catalog. See Organize training runs with MLflow experiments.
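What the scenarios above have in common is that ordinary file APIs work against a volume path, with Unity Catalog governing access. The sketch below writes an artifact with standard Python file operations; the base directory is parameterized so it runs anywhere, and on Databricks compute it would be a volume path such as `/Volumes/main/default/artifacts` (hypothetical names).

```python
import pathlib
import tempfile

def write_artifact(base_dir: str, relative_name: str, data: bytes) -> pathlib.Path:
    """Write a file under base_dir, creating parent directories as needed.

    On Databricks compute, base_dir would be a volume path, e.g.
    /Volumes/main/default/artifacts (hypothetical), so the write is
    governed by Unity Catalog.
    """
    target = pathlib.Path(base_dir) / relative_name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target

# Demonstrated with a temporary directory standing in for a volume path:
with tempfile.TemporaryDirectory() as d:
    p = write_artifact(d, "models/run_01/metrics.json", b'{"loss": 0.12}')
    print(p.read_bytes())  # b'{"loss": 0.12}'
```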