Important
This feature is in Public Preview.
Important
Shallow clone support differs for Unity Catalog managed and external tables. For managed tables use Databricks Runtime 13.3 and above, and for external tables use Databricks Runtime 14.2 and above.
You can only clone Unity Catalog managed tables to Unity Catalog managed tables and Unity Catalog external tables to Unity Catalog external tables. VACUUM behavior differs between managed and external tables. See Use VACUUM with Unity Catalog shallow clones.
You can use shallow clone to create new Unity Catalog tables from existing Unity Catalog tables. Shallow clone support for Unity Catalog allows you to create tables with access control privileges independent from their parent tables without needing to copy underlying data files.
For information about cloning a table, see Clone a table on Azure Databricks.
Create a Unity Catalog managed shallow clone
Create a shallow clone of a managed table in Unity Catalog.
CREATE TABLE <catalog-name>.<schema-name>.<target-table-name>
SHALLOW CLONE <catalog-name>.<schema-name>.<source-table-name>
To create a managed shallow clone on Unity Catalog, you must have the following privileges on the source and target resources.
| Resource | Permissions required |
|---|---|
| Source schema | USE SCHEMA |
| Source catalog | USE CATALOG |
| Target schema | USE SCHEMA, CREATE TABLE |
| Target catalog | USE CATALOG |
As with other CREATE TABLE statements, the user who creates a shallow clone owns the target table. The owner of a cloned target table controls access to that table independently of the source table, so the owner of a clone can differ from the owner of its source table.
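As a rough illustration of the privilege table above, the following sketch grants the listed privileges to a principal and then creates the clone. The catalogs `src_cat` and `tgt_cat`, the schemas `sales` and `dev`, the table names, and the group `analysts` are all hypothetical placeholders; the statements assume a workspace with Unity Catalog enabled and a caller who is allowed to grant these privileges.

```sql
-- Hypothetical names throughout: src_cat, tgt_cat, sales, dev, analysts.

-- Source side: traverse the catalog and schema that hold the source table.
GRANT USE CATALOG ON CATALOG src_cat TO `analysts`;
GRANT USE SCHEMA ON SCHEMA src_cat.sales TO `analysts`;

-- Target side: traverse the catalog, and create tables in the schema.
GRANT USE CATALOG ON CATALOG tgt_cat TO `analysts`;
GRANT USE SCHEMA, CREATE TABLE ON SCHEMA tgt_cat.dev TO `analysts`;

-- Run as a member of `analysts`: that user becomes the owner of the clone.
CREATE TABLE tgt_cat.dev.orders_clone
SHALLOW CLONE src_cat.sales.orders;

-- Inspect the table details, including its owner.
DESCRIBE TABLE EXTENDED tgt_cat.dev.orders_clone;
```

Because ownership is assigned at creation time, access to `orders_clone` can then be granted or revoked without touching the grants on the source table.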
Create a Unity Catalog external shallow clone
Create a Unity Catalog external shallow clone by specifying an external location.
CREATE TABLE <catalog-name>.<schema-name>.<target-table-name>
SHALLOW CLONE <catalog-name>.<schema-name>.<source-table-name>
LOCATION 's3://<bucket-name>/<path-name>/<target-table-name>'
To create an external shallow clone on Unity Catalog, you must have the following privileges on the source and target resources.
| Resource | Permissions required |
|---|---|
| Source schema | USE SCHEMA |
| Source catalog | USE CATALOG |
| Target schema | USE SCHEMA, CREATE TABLE |
| Target catalog | USE CATALOG |
| Target external location | CREATE EXTERNAL TABLE |
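Compared with a managed clone, the only additional requirement in the table above is the privilege on the target external location. A minimal sketch, assuming an external location named `clone_location` (hypothetical) already exists and covers the target path, and that `analysts` is a hypothetical group:

```sql
-- Hypothetical names: clone_location, analysts.

-- Allow the principal to create external tables under this location.
GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION clone_location TO `analysts`;

-- Review which privileges are currently granted on the location.
SHOW GRANTS ON EXTERNAL LOCATION clone_location;
```

The catalog and schema privileges listed in the table are still required in addition to this grant.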
Work with shallow cloned tables in standard access mode
To query a shallow clone in standard access mode (formerly shared access mode), you must have the following privileges on the table and containing resources.
| Resource | Permissions required |
|---|---|
| Catalog | USE CATALOG |
| Schema | USE SCHEMA |
| Table | SELECT |
You must also have MODIFY permissions on the target of the clone operation to complete the following actions.
- Insert records
- Delete records
- Update records
- MERGE
- CREATE TABLE
- DROP TABLE
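The privileges in this section can be sketched as follows. In standard access mode every grant targets the clone and its containing resources, not the source. The names `tgt_cat`, `dev`, `orders_clone`, and `analysts`, and the columns `status` and `order_date`, are hypothetical.

```sql
-- Hypothetical names: tgt_cat, dev, orders_clone, analysts, status, order_date.

-- Read access to the clone.
GRANT USE CATALOG ON CATALOG tgt_cat TO `analysts`;
GRANT USE SCHEMA ON SCHEMA tgt_cat.dev TO `analysts`;
GRANT SELECT ON TABLE tgt_cat.dev.orders_clone TO `analysts`;

-- Write access to the clone (needed for inserts, deletes, updates, MERGE).
GRANT MODIFY ON TABLE tgt_cat.dev.orders_clone TO `analysts`;

-- Queries a member of `analysts` could then run on standard access mode compute.
SELECT COUNT(*) FROM tgt_cat.dev.orders_clone;

UPDATE tgt_cat.dev.orders_clone
SET status = 'archived'
WHERE order_date < '2020-01-01';
```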
Work with shallow cloned tables in dedicated access mode
When working with Unity Catalog shallow clones in dedicated access mode (formerly single user access mode), you must have permissions on the resources for the cloned table source as well as the target table.
This means that for simple queries, in addition to the required permissions on the target table, you must have USE CATALOG and USE SCHEMA permissions on the source catalog and schema and SELECT permissions on the source table. For any query that updates or inserts records in the target table, you must also have MODIFY permissions on the source table.
Databricks recommends working with Unity Catalog clones on compute with standard access mode as this allows independent evolution of permissions for Unity Catalog shallow clone targets and their source tables.
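For comparison with standard access mode, the source-side privileges described in this section might be granted as shown in this sketch. The names `src_cat`, `sales`, `orders`, and `analysts` are hypothetical, and the grants on the clone itself are assumed to be in place already.

```sql
-- Hypothetical names: src_cat, sales, orders, analysts.

-- Needed on dedicated access mode compute even for read-only queries on the clone.
GRANT USE CATALOG ON CATALOG src_cat TO `analysts`;
GRANT USE SCHEMA ON SCHEMA src_cat.sales TO `analysts`;
GRANT SELECT ON TABLE src_cat.sales.orders TO `analysts`;

-- Needed only if the principal also updates or inserts records in the clone.
GRANT MODIFY ON TABLE src_cat.sales.orders TO `analysts`;
```

Granting MODIFY on the source widens access to the source table itself, which is one reason permissions evolve less independently in this mode.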
Use VACUUM with Unity Catalog shallow clones
When you use Unity Catalog tables for the source and target of a shallow clone operation, Unity Catalog manages the underlying data files to improve reliability for the source and target of the clone operation. Running VACUUM on the source of a shallow clone does not break the cloned table.
Normally, when VACUUM determines which files are still valid for a given retention threshold, it considers only the metadata of the current table. Shallow clone support for Unity Catalog, however, tracks the relationships between all cloned tables and the source data files, so the set of valid files expands to include every data file needed to answer queries against the source table or any of its shallow clones.
This means that for Unity Catalog shallow clone VACUUM semantics, a valid data file is any file within the specified retention threshold for the source table or any cloned table. Managed tables and external tables have slightly different semantics.
This enhanced tracking of metadata changes how VACUUM operations impact underlying data files for the Delta tables, with the following semantics.
- For managed tables, VACUUM operations against either the source or target of a shallow clone operation might delete data files from the source table.
- For external tables, VACUUM operations only remove data files from the source table when run against the source table.
- Only data files not considered valid for the source table or any shallow clone against the source are removed.
- If multiple shallow clones are defined against a single source table, running VACUUM on any of the cloned tables does not remove valid data files for other cloned tables.
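To make these semantics concrete, here is a minimal sketch of vacuuming a source table and one of its clones. The table names are hypothetical, and 168 hours corresponds to a 7-day retention threshold.

```sql
-- Hypothetical names: src_cat.sales.orders (source), tgt_cat.dev.orders_clone (clone).

-- Preview the files that would be removed, without deleting anything.
VACUUM src_cat.sales.orders RETAIN 168 HOURS DRY RUN;

-- Vacuum the source. Files still referenced by any shallow clone count as
-- valid and are kept, so the clone remains queryable afterwards.
VACUUM src_cat.sales.orders RETAIN 168 HOURS;

-- Vacuum the clone. For managed tables this might also delete files from the
-- source table, but only files that neither the source nor any clone needs.
VACUUM tgt_cat.dev.orders_clone RETAIN 168 HOURS;
```

Running the DRY RUN variant first is a cheap way to check what a given retention threshold would remove before committing to it.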
Note
Databricks recommends never running VACUUM with a retention setting of less than 7 days to avoid corrupting ongoing long-running transactions. If you need to run VACUUM with a lower retention threshold, make sure you understand how VACUUM on shallow clones in Unity Catalog differs from how VACUUM interacts with other cloned tables on Azure Databricks. For more information, see Clone a table on Azure Databricks.
Also, even if a shallow cloned table is dropped, you might need SELECT access to that shallow cloned table to run VACUUM on the base table. Databricks reads the shallow clone's Delta log to verify which base table data files the clone still references before vacuuming them. Databricks maintains this link for 7 days after a shallow cloned table is dropped to support the UNDROP operation. In standard access mode, however, this permission is not required.
Drop the base table for a shallow clone
If a shallow clone’s base table is dropped, the clone becomes unusable. By default, Databricks blocks you from dropping a base table if it still has shallow clones referencing it.
To override this protection, use the DROP TABLE ... FORCE syntax. If you use FORCE:
- The base table is dropped immediately.
- All referencing shallow clones become broken and:
  - Fail on operations that require reading data or metadata (for example, SELECT, INSERT, UPDATE, DESCRIBE HISTORY, CLONE).
  - Are still visible via metadata-level operations (for example, SHOW TABLES, DROP TABLE) to allow cleanup.
This behavior applies only to Unity Catalog managed tables. For more information, see DROP TABLE.
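The sequence described in this section might look like the sketch below. The table names are hypothetical, and the trailing position of the FORCE keyword is an assumption based on the "DROP TABLE ... FORCE" form mentioned above; check the DROP TABLE reference for the exact syntax.

```sql
-- Hypothetical names: src_cat.sales.orders (base table), tgt_cat.dev.orders_clone (clone).

-- Blocked by default while shallow clones still reference the base table.
DROP TABLE src_cat.sales.orders;

-- Override the protection (keyword placement assumed, see the note above).
DROP TABLE src_cat.sales.orders FORCE;

-- The broken clone is still listed, so it can be found and cleaned up.
SHOW TABLES IN tgt_cat.dev;
DROP TABLE tgt_cat.dev.orders_clone;
```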
Limitations
- Shallow clones on external tables must be external tables. Shallow clones on managed tables must be managed tables.
- You cannot use REPLACE or CREATE OR REPLACE to overwrite an existing shallow clone. Instead, DROP the shallow clone and run a new CREATE statement.
- You cannot share shallow clones using Delta Sharing.
- You cannot nest shallow clones, meaning you cannot make a shallow clone from a shallow clone.
- For managed tables, dropping the source table breaks the target table for shallow clones. The underlying data files for external tables are not removed by DROP TABLE operations, and so shallow clones of external tables are not impacted by dropping the source.
- Unity Catalog allows users to UNDROP managed tables for around 7 days after a DROP TABLE command. In Databricks Runtime 13.3 LTS and above, managed shallow clones of a dropped source table continue to work for the 7-day period during which Unity Catalog supports UNDROP. If the source table is not restored within that window, the shallow clone stops functioning when the source data files are deleted during garbage collection.
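Two of the limitations above have straightforward workarounds, sketched here with hypothetical table names: recreating a clone instead of replacing it, and restoring a dropped managed source table within the UNDROP window.

```sql
-- Hypothetical names: src_cat.sales.orders (source), tgt_cat.dev.orders_clone (clone).

-- CREATE OR REPLACE is not supported for an existing shallow clone,
-- so drop it and create it again from the source.
DROP TABLE IF EXISTS tgt_cat.dev.orders_clone;

CREATE TABLE tgt_cat.dev.orders_clone
SHALLOW CLONE src_cat.sales.orders;

-- If a managed source table was dropped, restore it within roughly 7 days
-- so that its shallow clones keep working after garbage collection.
UNDROP TABLE src_cat.sales.orders;
```

Note that dropping and recreating the clone produces a new table, so any grants made on the old clone would likely need to be reapplied.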