Share data using the Delta Sharing open sharing protocol (for providers)
This article gives an overview of how data providers can use the Delta Sharing open sharing protocol to share data from a Unity Catalog-enabled Azure Databricks workspace with any user on any computing platform.
Who should use the Delta Sharing open sharing protocol?
There are three ways to share data using Delta Sharing:
The Databricks open sharing protocol, covered in this article, lets you share data that you manage in a Unity Catalog-enabled Databricks workspace with users on any computing platform.
This approach uses the Delta Sharing server that is built into Azure Databricks and is useful when you manage data using Unity Catalog and want to share it with users who don’t use Databricks or don’t have access to a Unity Catalog-enabled Databricks workspace. The integration with Unity Catalog on the provider side simplifies setup and governance for providers.
A customer-managed implementation of the open-source Delta Sharing server lets you share from any platform to any platform, whether Databricks or not.
The Databricks-to-Databricks sharing protocol lets you share data from your Unity Catalog-enabled workspace with users who also have access to a Unity Catalog-enabled Databricks workspace.
For an introduction to Delta Sharing and more information about these three approaches, see the article What is Delta Sharing?
Delta Sharing open sharing workflow
This section provides a high-level overview of the open sharing workflow, with links to detailed documentation for each step.
In the Delta Sharing open sharing model:
The data provider creates a recipient, which is a named object that represents a user or group of users that the data provider wants to share data with.
When the data provider creates the recipient, Azure Databricks generates a token, a credential file that includes the token, and an activation link that the data provider can send to the recipient to access the credential file.
The data provider creates a share, which is a named object that contains a collection of tables registered in a Unity Catalog metastore in the provider’s account.
The data provider sends the activation link to the recipient over a secure channel, along with instructions for using the link to download the credential file. The recipient uses the credential file to establish a secure connection with the data provider and receive the shared data.
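The provider-side steps above can be sketched as a short sequence of Unity Catalog SQL statements. The following Python helper is a minimal illustration, not the documented procedure: the function name, the identifier check, and the example object names are hypothetical, and the final GRANT statement (giving the recipient access to the share) is an assumed step that the list above does not spell out. The helper only builds the statement text and does not connect to a workspace.

```python
import re
from typing import List

# Hypothetical safety check: accept only plain identifiers so that names
# can be interpolated into SQL text without quoting concerns.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _checked(name: str) -> str:
    """Return the name unchanged, or raise if it is not a plain identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def open_sharing_statements(recipient: str, share: str, tables: List[str]) -> List[str]:
    """Build the SQL statements for one recipient, one share, and its tables.

    Each table is expected as a three-part name: catalog.schema.table.
    """
    recipient = _checked(recipient)
    share = _checked(share)
    statements = [
        # Creating the recipient is what generates the token and activation link.
        f"CREATE RECIPIENT IF NOT EXISTS {recipient}",
        f"CREATE SHARE IF NOT EXISTS {share}",
    ]
    for table in tables:
        parts = table.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected catalog.schema.table, got {table!r}")
        qualified = ".".join(_checked(part) for part in parts)
        statements.append(f"ALTER SHARE {share} ADD TABLE {qualified}")
    # Assumed step: grant the recipient read access to the share.
    statements.append(f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient}")
    return statements


if __name__ == "__main__":
    for statement in open_sharing_statements(
        "partner_recipient", "sales_share", ["main.sales.orders"]
    ):
        print(statement)
```

In a Databricks notebook, each returned statement could be passed to `spark.sql(...)` by a user with the required privileges. The activation link would still need to be retrieved and sent to the recipient separately over a secure channel.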
Provider setup and security considerations for open sharing
Good token management is key to sharing data securely when you use the open sharing model:
Data providers on Azure Databricks who intend to use open sharing must configure the default recipient token lifetime when they enable Delta Sharing for their Unity Catalog metastore. Databricks recommends that you configure tokens to expire. See Enable Delta Sharing on a metastore.
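As an illustration of token management, the sketch below reads a credential file's contents and reports whether its token is close to expiring. It assumes the file is JSON with the field names used by the open-source Delta Sharing profile format (`shareCredentialsVersion`, `endpoint`, `bearerToken`, and an optional `expirationTime` in UTC). The function names, the seven-day warning window, and the example values are hypothetical, and a real credential file may contain other fields.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical example contents; endpoint and token are placeholders.
EXAMPLE_PROFILE = json.dumps({
    "shareCredentialsVersion": 1,
    "endpoint": "https://example.invalid/delta-sharing/",
    "bearerToken": "<redacted>",
    "expirationTime": "2030-01-01T00:00:00.0Z",
})


def token_expiry(profile_json: str) -> Optional[datetime]:
    """Return the token expiry as a UTC datetime, or None if none is set."""
    profile = json.loads(profile_json)
    raw = profile.get("expirationTime")
    if raw is None:
        return None
    # Drop fractional seconds and the trailing "Z"; the value is assumed UTC.
    trimmed = raw.split(".")[0].rstrip("Z")
    parsed = datetime.strptime(trimmed, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def needs_attention(
    profile_json: str,
    now: datetime,
    warn_before: timedelta = timedelta(days=7),
) -> bool:
    """Flag tokens that never expire or that expire within the warning window."""
    expiry = token_expiry(profile_json)
    if expiry is None:
        # A token without an expiry goes against the recommendation to expire tokens.
        return True
    return expiry - now <= warn_before


if __name__ == "__main__":
    print(token_expiry(EXAMPLE_PROFILE))
    print(needs_attention(EXAMPLE_PROFILE, datetime.now(timezone.utc)))
```

A check like this is only a convenience. The token lifetime itself is enforced by the metastore setting described above, not by anything in the credential file.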