Configure Delta Lake catalog
Important
This feature is currently in preview. The Supplemental Terms of Use for Microsoft Azure Previews include more legal terms that apply to Azure features that are in beta, in preview, or otherwise not yet released into general availability. For information about this specific preview, see Azure HDInsight on AKS preview information. For questions or feature suggestions, please submit a request on AskHDInsight with the details and follow us for more updates on Azure HDInsight Community.
This article provides an overview of how to configure a Delta Lake catalog in your Trino cluster with HDInsight on AKS. You can add a new catalog by updating your cluster ARM template. The exception is the Hive catalog, which you can also add during Trino cluster creation in the Azure portal.
Prerequisites
Steps to configure Delta Lake catalog
Update your cluster ARM template to add a new Delta Lake catalog config file. Define this configuration in the serviceConfigsProfiles section, under the clusterProfile property of the ARM template.

Property | Value | Description
fileName | delta.properties | Name of the catalog file. If the file is called delta.properties, delta becomes the catalog name.
connector.name | delta-lake | The type of the catalog. For Delta Lake, the catalog type must be delta-lake.
delta.register-table-procedure.enabled | true | Required to allow external tables to be registered.

See Trino documentation for other Delta Lake configuration options.
"serviceConfigsProfiles": [
  {
    "serviceName": "trino",
    "configs": [
      {
        "component": "catalogs",
        "files": [
          {
            "fileName": "delta.properties",
            "values": {
              "connector.name": "delta-lake",
              "delta.register-table-procedure.enabled": "true"
            }
          }
        ]
...
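If you maintain the template programmatically, the catalog file entry above can be merged into an existing serviceConfigsProfiles list. The following is a minimal sketch, assuming the template has already been parsed from JSON into Python lists and dicts. The helper names add_catalog_file and catalog_name are hypothetical and are not part of any Azure SDK; only the property names come from the fragment above.

```python
import copy

# Values taken from the table above.
DELTA_VALUES = {
    "connector.name": "delta-lake",
    "delta.register-table-procedure.enabled": "true",
}


def catalog_name(file_name):
    """Derive the catalog name from the file name: delta.properties -> delta."""
    suffix = ".properties"
    return file_name[: -len(suffix)] if file_name.endswith(suffix) else file_name


def add_catalog_file(service_configs_profiles, file_name, values):
    """Return a copy of serviceConfigsProfiles with the catalog file added.

    Hypothetical helper: the input is the parsed "serviceConfigsProfiles" list.
    An existing file with the same name is replaced, so repeated calls are safe.
    """
    profiles = copy.deepcopy(service_configs_profiles)

    # Find or create the "trino" service entry.
    trino = next((p for p in profiles if p.get("serviceName") == "trino"), None)
    if trino is None:
        trino = {"serviceName": "trino", "configs": []}
        profiles.append(trino)

    # Find or create the "catalogs" component inside it.
    configs = trino.setdefault("configs", [])
    catalogs = next((c for c in configs if c.get("component") == "catalogs"), None)
    if catalogs is None:
        catalogs = {"component": "catalogs", "files": []}
        configs.append(catalogs)

    # Drop any previous copy of this file, then append the new definition.
    files = [f for f in catalogs.get("files", []) if f.get("fileName") != file_name]
    files.append({"fileName": file_name, "values": dict(values)})
    catalogs["files"] = files
    return profiles
```

For example, add_catalog_file([], "delta.properties", DELTA_VALUES) produces the same structure as the fragment above, and catalog_name("delta.properties") gives the catalog name that Trino exposes.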
Configure a Hive metastore for table definitions and locations if you don't have a metastore already configured.
Configure the Hive metastore for the Delta catalog.
The catalogOptions section of the ARM template defines the Hive metastore connection details. It can set up:
- Metastore config.
- Metastore instance.
- Link from the catalog to the metastore (catalogName).
Add this catalogOptions configuration under the trinoProfile property of your cluster ARM template:

Note
If Hive catalog options are already present, duplicate your Hive config and specify the delta catalog name.
"trinoProfile": {
  "catalogOptions": {
    "hive": [
      {
        "catalogName": "delta",
        "metastoreDbConnectionURL": "jdbc:sqlserver://{{DATABASE_SERVER}}.database.windows.net:1433;database={{DATABASE_NAME}};encrypt=true;trustServerCertificate=true;loginTimeout=30;",
        "metastoreDbConnectionUserName": "{{DATABASE_USER_NAME}}",
        "metastoreDbConnectionPasswordSecret": "hms-db-pwd-ref",
        "metastoreWarehouseDir": "abfss://{{AZURE_STORAGE_CONTAINER}}@{{AZURE_STORAGE_ACCOUNT_NAME}}.dfs.core.windows.net/"
      }
    ]
  }
}
...
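When Hive catalog options already exist, the duplication described in the note can be scripted. The sketch below assumes the trinoProfile object has been parsed into a Python dict and that the existing entry's catalogName is known (the default "hive" here is only an illustrative assumption). The function name add_delta_metastore_link is hypothetical.

```python
import copy


def add_delta_metastore_link(trino_profile, source_catalog="hive", new_catalog="delta"):
    """Return a copy of trinoProfile with the Hive entry duplicated for Delta.

    Hypothetical helper: copies the catalogOptions.hive entry whose catalogName
    equals source_catalog and changes only catalogName to new_catalog.
    """
    profile = copy.deepcopy(trino_profile)
    hive_entries = profile.setdefault("catalogOptions", {}).setdefault("hive", [])

    # Nothing to do if the Delta catalog is already linked to a metastore.
    if any(e.get("catalogName") == new_catalog for e in hive_entries):
        return profile

    source = next(
        (e for e in hive_entries if e.get("catalogName") == source_catalog), None
    )
    if source is None:
        raise ValueError(f"no catalogOptions.hive entry named {source_catalog!r}")

    # Same metastore connection details, different catalog name.
    entry = copy.deepcopy(source)
    entry["catalogName"] = new_catalog
    hive_entries.append(entry)
    return profile
```

The copy keeps the metastore connection settings identical, so both catalogs point at the same metastore; only catalogName differs.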
Assign the Storage Blob Data Owner role to your cluster's user-assigned MSI in the storage account containing the Delta tables. Learn how to assign a role.
- The user-assigned MSI name is listed in the msiResourceId property in the cluster's resource JSON.
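To read the MSI name out of msiResourceId, note that an Azure resource ID for a user-assigned managed identity ends in .../userAssignedIdentities/<name>. The sketch below extracts that last segment. The function name msi_name_from_resource_id and the sample ID are hypothetical, for illustration only.

```python
def msi_name_from_resource_id(msi_resource_id):
    """Return the identity name from a user-assigned MSI resource ID.

    Hypothetical helper: expects an ID that ends with
    .../providers/Microsoft.ManagedIdentity/userAssignedIdentities/<name>.
    """
    parts = [p for p in msi_resource_id.split("/") if p]
    if len(parts) < 2 or parts[-2].lower() != "userassignedidentities":
        raise ValueError("not a user-assigned managed identity resource ID")
    return parts[-1]


# Illustrative resource ID; the subscription, group, and name are placeholders.
sample_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/example-rg"
    "/providers/Microsoft.ManagedIdentity/userAssignedIdentities/example-msi"
)
print(msi_name_from_resource_id(sample_id))
```

The returned name is the identity to look for when you assign the Storage Blob Data Owner role on the storage account.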
Deploy the updated ARM template to reflect the changes in your cluster. Learn how to deploy an ARM template.
Once successfully deployed, you can see the "delta" catalog in your Trino cluster.
Next steps