What is DBFS?

The term DBFS is used to describe two parts of the platform:

  • DBFS root
  • DBFS mounts

Storing and accessing data through the DBFS root or DBFS mounts is a deprecated pattern that Databricks does not recommend. For recommendations for working with files, see Work with files on Azure Databricks.

What is the Databricks File System?

The term DBFS comes from Databricks File System, which describes the distributed file system used by Azure Databricks to interact with cloud-based storage.

The underlying technology associated with DBFS is still part of the Azure Databricks platform. For example, dbfs:/ is an optional scheme when interacting with Unity Catalog volumes.
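As a rough illustration of what "optional scheme" means here, the sketch below treats a volume path with and without the `dbfs:` prefix as the same location. It assumes the standard `/Volumes/<catalog>/<schema>/<volume>/<path>` layout. The helper `normalize_volume_path` and the catalog, schema, and volume names are hypothetical and are not part of any Databricks API.

```python
DBFS_SCHEME = "dbfs:"
VOLUMES_ROOT = "/Volumes/"


def normalize_volume_path(path: str) -> str:
    """Return a Unity Catalog volume path without the optional dbfs: scheme.

    Hypothetical helper for illustration only; not a Databricks API.
    """
    # Drop the scheme if the caller supplied it.
    if path.startswith(DBFS_SCHEME):
        path = path[len(DBFS_SCHEME):]
    # Only paths under /Volumes/ are volume paths.
    if not path.startswith(VOLUMES_ROOT):
        raise ValueError(f"not a Unity Catalog volume path: {path!r}")
    return path


# Hypothetical catalog, schema, and volume names.
with_scheme = "dbfs:/Volumes/main/default/landing/events.csv"
without_scheme = "/Volumes/main/default/landing/events.csv"

# Both spellings normalize to the same volume path.
print(normalize_volume_path(with_scheme) == normalize_volume_path(without_scheme))
```

A path that does not sit under `/Volumes/`, such as one in the DBFS root, is rejected by this helper rather than silently accepted.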

Past and current warnings and caveats about DBFS apply only to the DBFS root and DBFS mounts.

How does DBFS work with Unity Catalog?

Databricks recommends using Unity Catalog to manage access to all data.

Unity Catalog adds the concepts of external locations, storage credentials, and volumes to help organizations provide least-privilege access to data in cloud object storage.

Some security configurations provide direct access to both Unity Catalog-managed resources and DBFS, primarily for organizations that have fully or partially migrated to Unity Catalog. See Best practices for DBFS and Unity Catalog.

What is the DBFS root?

The DBFS root is a storage location provisioned during workspace creation in the cloud account containing the Azure Databricks workspace. For details on DBFS root configuration and deployment, see the Azure Databricks quickstart.

Databricks does not recommend storing production data, libraries, or scripts in DBFS root. See Recommendations for working with DBFS root.

To configure customer-managed keys for the storage account that includes the DBFS root, see Customer-managed keys for DBFS root.

To limit network access to the storage account that includes the DBFS root, see Enable firewall support for your workspace storage account.

Mount object storage

Note

DBFS mounts are deprecated. Databricks recommends using Unity Catalog volumes. See What are Unity Catalog volumes?

Mounting object storage to DBFS allows you to access objects in object storage as if they were on the local file system. Mounts store Hadoop configurations necessary for accessing storage. For more information, see Mounting cloud object storage on Azure Databricks.
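To make the idea of a mount concrete, the following is a minimal conceptual sketch of how a path under a mount point can be translated into an object storage URI. It is not how Azure Databricks implements mounts. The mount table, the `resolve_mount_path` helper, and the storage account and container names are all hypothetical; only the general `abfss://<container>@<account>.dfs.core.windows.net/<path>` URI shape is standard.

```python
# Hypothetical mount table: mount point -> cloud object storage URI.
MOUNTS = {
    "/mnt/raw": "abfss://raw@examplestorage.dfs.core.windows.net/ingest",
}


def resolve_mount_path(path: str, mounts: dict[str, str] = MOUNTS) -> str:
    """Translate a path under a mount point into its backing storage URI.

    Conceptual model only; not a Databricks API.
    """
    # Check longer mount points first so nested mounts resolve correctly.
    for mount_point in sorted(mounts, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            return mounts[mount_point] + path[len(mount_point):]
    raise KeyError(f"no mount covers {path!r}")


print(resolve_mount_path("/mnt/raw/2024/events.json"))
# prints abfss://raw@examplestorage.dfs.core.windows.net/ingest/2024/events.json
```

The caller works with a file-system-style path throughout; the storage location behind it stays in the mount table, which is the convenience that mounts provide.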