Κοινή χρήση μέσω


SYNC

Applies to: check marked yes Databricks SQL check marked yes Databricks Runtime check marked yes Unity Catalog only

Use the SYNC command to upgrade external tables in Hive Metastore to external tables in Unity Catalog. You can also use SYNC to upgrade Hive managed tables that are stored outside of Databricks workspace storage (sometimes called DBFS root) to external tables in Unity Catalog. You cannot use it to upgrade Hive managed tables stored in workspace storage. To upgrade those tables, use CREATE TABLE CLONE.

You can use SYNC to create new tables in Unity Catalog from existing Hive Metastore tables as well as update the Unity Catalog tables when the source tables in Hive Metastore are modified.

The SYNC command can be run at a schema level using the SYNC SCHEMA syntax or for an individual table using the SYNC TABLE syntax.

The command performs a write operation (ALTER TABLE) to each source table it upgrades to add some additional table properties for its bookkeeping. In case of Delta tables, to perform the write operation the cluster or SQL Warehouse that runs the command must have write access to the table location.

In Databricks Runtime 12.2 LTS or above, this behavior can be turned off by setting the Spark configuration spark.databricks.sync.command.disableSourceTableWrites to true before running the SYNC command. When set to true, SYNC does not add new table properties and therefore might not detect if the table has previously been upgraded to Unity Catalog. In that case, it exclusively relies on the table name to determine if the table has been previously upgraded to Unity Catalog. If the source table has been renamed since the last SYNC command, the user needs to manually rename the destination table before re-running the SYNC command when the config is true.

Important

When a SYNC command is executed, the SET TBLPROPERTIES operation adds a table property that indicates the target Unity Catalog external table reference. This operation computes a new Delta snapshot and adds a new entry to the table Delta log, writing to the target table path in cloud storage.

Syntax

SYNC { SCHEMA target_schema [AS EXTERNAL] FROM source_schema |
       TABLE target_table [AS EXTERNAL] FROM source_table }
  [SET OWNER principal]
  [DRY RUN]

Parameters

  • SCHEMA

    SYNC all the tables within a schema.

    • target_schema

      An existing schema in Unity Catalog within which the user is authorized to create tables.

    • source_schema

      An existing schema in the hive_metastore catalog, which is owned by the user.

  • TABLE

    SYNC an individual table.

    • target_table

      A new or existing table in Unity Catalog in a schema within which the user is authorized to create tables. If the table already exists it is replaced to match source_table, and the user must also own the table. If the table does not exist it will be created.

    • source_table

      An existing table in hive_metastore which the user owns.

  • principal

    Optionally set the owner of the upgraded tables in Unity Catalog to principal. The default owner is the current user.

  • AS EXTERNAL

    SYNC a Hive managed table or schema that is stored outside of Databricks workspace storage (sometimes called DBFS root) to external tables in Unity Catalog. You cannot use AS EXTERNAL to upgrade Hive managed tables stored in workspace storage.

  • DRY RUN

    When specified checks whether the source_table or tables within source_schema can be upgraded without actually creating or upgrading the target tables. The command returns DRY_RUN_SUCCESS if a table can be upgraded.

  • AS EXTERNAL Starting Databricks Runtime 13.2 and above, this optional clause can be added to specify that managed tables in hive metastore are upgraded as external tables in Unity Catalog. When used with SYNC SCHEMA, it applies to all the tables, including managed tables in the source_schema.

Returns

A report with the following columns:

  • source_schema STRING

    The name of the source schema. The schema is NULL if the source is an unsupported temporary view.

  • source_name STRING NOT NULL

    The name of the source table.

  • source_type STRING NOT NULL

    The type of the table: MANAGED or EXTERNAL

  • target_catalog STRING NOT NULL

    The target catalog in Unity Catalog where the table is synced.

  • target_schema STRING NOT NULL

    The target schema in Unity Catalog where the table is synced.

  • target_name STRING NOT NULL

    The name of the table in Unity Catalog to which the source table is synced. This name matches the source table name.

  • status_code STRING NOT NULL

    A status code for the result of the SYNC command for the source table.

  • description STRING

    A descriptive message about the status of the sync command for the source table.

Common status codes returned by SYNC

The SYNC command provides a unique status_code field in the output for each table to be upgraded to the Unity Catalog representing the status of the upgrade. Some common status codes along with the recommendations to address them are:

  • DRY_RUN_SUCCESS: Dry run successful.

    The table can be upgraded to Unity Catalog using the SYNC command.

  • DBFS_ROOT_LOCATION: Table located in the Databricks Filesystem root.

    The table is located in the Databricks Filesystem root location. This is not supported in Unity Catalog. Copy the table data to the Unity Catalog location using a CREATE TABLE command with the DEEP CLONE option.

  • EXTERNAL_TABLE_IN_MANAGED_LOCATION: External table path cannot be under managed storage.

    The path given for the external table is within Unity Catalog manage storage. If the table needs to be under the managed storage, upgrade the table as a managed table using a CREATE TABLE command with the DEEP CLONE option or move the table location out of Unity Catalog managed storage.

  • HIVE_SERDE: The table is not eligible for an upgrade from Hive Metastore to Unity Catalog. Reason: Hive SerDe Table.

    Hive SerDe tables are not supported by Unity Catalog. Change the tables into Delta format and issue the SYNC command to upgrade.

  • INVALID_DATASOURCE_FORMAT: Datasource format not specified or is not supported.

    Use one of the supported data source formats: Delta, Parquet, CSV, JSON, ORC, TEXT

  • LOCATION_OVERLAP: Input path overlaps with other external tables.

    The table location overlaps with other external tables. Use a different location for the table or remove the overlapping external tables.

  • MULTIPLE_EXT_LOCATIONS: Input path contains other external locations.

    There are more than one external locations which are subdirectories of the provided table path. Check if the external locations within the table location are necessary.

  • MULTIPLE_TARGET_TABLE: A different synced table already exists. Only one target table per source table is allowed.

    The source table was already synced to a different target table previously which is not allowed. To force the SYNC to a different table, remove the table property upgraded_to from the source table or remove the previously synced table from Unity Catalog if it is not needed anymore.

  • NOT_EXTERNAL: Table is not eligible for upgrade from Hive Metastore to Unity Catalog. Reason: Not an external table.

    SYNC command only supports migrating external tables to Unity Catalog. For managed tables, create a managed table in Unity Catalog using a CREATE TABLE command with the DEEP CLONE option. Alternatively, use the AS EXTERNAL clause with the SYNC command to create an external table in Unity Catalog.

  • READ_ONLY_CATALOG: Data inside a Delta sharing catalog is read-only and cannot be modified or deleted.

    The chosen catalog is a delta sharing catalog which is read-only. Tables within a read-only catalog cannot be updated using the SYNC command.

  • SUCCESS: Table successfully synced.

  • TABLE_ALREADY_EXISTS: Target table already exists.

    A table with the same name as the chosen table already exists in Unity Catalog. Rename or remove the existing table in Unity Catalog and rerun the SYNC command.

  • TEMP_TABLE_NOT_SUPPORTED: Temporary tables or views are not supported.

    Temporary tables or views cannot be upgraded to Unity Catalog. To use temporary tables or views, recreate them in Unity Catalog using the SHOW CREATE TABLE command in Unity Catalog.

  • TIMEOUT: Sync task timed out.

    The sync command task took more than 300 seconds to complete. Increase spark.databricks.sync.command.task.timeout to a higher value in seconds. Default value is 300. If the error persists contact support.

  • VIEWS_NOT_SUPPORTED: Views are not supported.

    Recreate the views manually using SHOW CREATE TABLE command in Unity Catalog.

Examples

-- Sync an existing hive metastore table hive_metastore.default.my_tbl to a Unity Catalog
-- table named main.default.my_tbl.
> SYNC TABLE main.default.my_tbl FROM hive_metastore.default.my_tbl;
  source_schema source_name source_type target_catalog target_schema target_name status_code description
  ------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
  default       my_tbl      external    main           default       my_tbl      SUCCESS     Table main.default.my_tbl synced.

 -- Sync an existing managed hive metastore table hive_metastore.default.my_tbl to an external table named main.default.my_tbl in Unity Catalog.
 > SYNC TABLE main.default.my_tbl AS EXTERNAL FROM hive_metastore.default.my_tbl;
  source_schema source_name source_type target_catalog target_schema target_name status_code description
  ------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
  default       my_tbl      managed    main           default       my_tbl      SUCCESS     Table main.default.my_tbl synced.

-- SYNC a table in DRY RUN mode to evaluate the upgradability of the hive metastore table.
> SYNC TABLE main.default.my_tbl FROM hive_metastore.default.my_tbl DRY RUN;
  source_schema source_name source_type target_catalog target_schema target_name status_code     description
  ------------- ----------- ----------- -------------- ------------- ----------- --------------- ---------------------------------
  default       my_tbl      external    main           default       my_tbl      DRY_RUN_SUCCESS

-- SYNC all the eligible tables in schema hive_metastore.mydb to a Unity Catalog schema main.my_db_uc.
-- The upgraded tables in main.my_db_uc will be owned by alf@melmak.et
> SYNC SCHEMA main.my_db_uc FROM hive_metastore.my_db SET OWNER `alf@melmak.et`;
  source_schema source_name source_type target_catalog target_schema target_name status_code description
  ------------- ----------- ----------- -------------- ------------- ----------- ----------- ---------------------------------
  ...

-- DRY RUN mode of SYNC SCHEMA to evaluate all the tables in a schema
-- hive_metastore.mydb for upgrading to Unity Catalog.
> SYNC SCHEMA main.my_db_uc FROM hive_metastore.my_db DRY RUN;
  source_schema source_name source_type target_catalog target_schema target_name status_code     description
  ------------- ----------- ----------- -------------- ------------- ----------- -----------     ---------------------------------
  ...

-- Sync all tables including managed tables in a schema hive_metastore.mydb
-- as external tables in Unity Catalog.
> SYNC SCHEMA main.my_db_uc AS EXTERNAL FROM hive_metastore.my_db;
  source_schema source_name source_type target_catalog target_schema target_name status_code     description
  ------------- ----------- ----------- -------------- ------------- ----------- -----------     ---------------------------------
  ...