What’s coming?

Learn about features and behavioral changes in upcoming Azure Databricks releases.

Statistics management enabled by default with predictive optimization

Starting January 21, Databricks will begin enabling statistics management for all accounts with predictive optimization enabled. Statistics management expands existing predictive optimization functionality by adding statistics collection on write and automatically running ANALYZE commands for Unity Catalog managed tables. For more information on predictive optimization, see Predictive optimization for Unity Catalog managed tables.

Behavior change when dataset definitions are removed from a Delta Live Tables pipeline

An upcoming release of Delta Live Tables will change the behavior when a materialized view or streaming table is removed from a pipeline. With this change, the removed materialized view or streaming table will not be deleted automatically when the next pipeline update runs. Instead, you will be able to use the DROP MATERIALIZED VIEW command to delete a materialized view or the DROP TABLE command to delete a streaming table. After dropping an object, running a pipeline update will not recover the object automatically. A new object is created if a materialized view or streaming table with the same definition is re-added to the pipeline. You can, however, recover an object using the UNDROP command.
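As a minimal sketch of the mapping described above, the following Python helper returns the DROP statement that matches the type of the removed object. The helper name, the object-type labels, and the three-level table names are hypothetical and used only for illustration; the helper builds SQL text and does not run it.

```python
# Hypothetical helper: maps the type of a removed pipeline object to the
# DROP command named in the text above. It only builds the statement text.
DROP_COMMANDS = {
    "materialized_view": "DROP MATERIALIZED VIEW",
    "streaming_table": "DROP TABLE",
}


def drop_statement(object_type: str, full_name: str) -> str:
    """Return the SQL statement that deletes an object removed from a pipeline."""
    try:
        command = DROP_COMMANDS[object_type]
    except KeyError:
        raise ValueError(f"unsupported object type: {object_type!r}") from None
    return f"{command} {full_name}"


if __name__ == "__main__":
    # Placeholder object names, not real tables.
    print(drop_statement("materialized_view", "main.sales.daily_totals"))
    print(drop_statement("streaming_table", "main.sales.raw_events"))
```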

Behavior change for working with variant data type

Azure Databricks is blocking support for using fields with the variant data type in comparisons performed as part of the following operators and clauses:

  • DISTINCT
  • INTERSECT
  • EXCEPT
  • UNION
  • DISTRIBUTE BY

The same holds for these DataFrame functions:

  • df.dropDuplicates()
  • df.repartition()

Azure Databricks does not support these operators and functions for variant data type comparisons because they produce undefined results.

These expressions will be blocked when using variant types in Databricks Runtime 16.1 and above. Maintenance releases will block support in Databricks Runtime 15.3 and above.

If you use the VARIANT type in your Azure Databricks workloads or tables, take the following recommended actions:

  1. Find the queries that use the variant type with any of the listed operators, clauses, or DataFrame functions.
  2. Update these queries using recommended patterns that explicitly cast variant values to non-variant types.
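One way to approach the first step is a text scan of saved queries and notebook source. The sketch below is a heuristic under stated assumptions: the function name and patterns are hypothetical, and the scan only reports that a listed operator or function appears in the text. It cannot tell whether the columns involved are actually of variant type, so each match still needs a manual review.

```python
import re

# Operators, clauses, and DataFrame functions listed above.
# SQL keywords are matched case-insensitively; Python method names are not.
_PATTERNS = {
    "DISTINCT": (r"\bDISTINCT\b", re.IGNORECASE),
    "INTERSECT": (r"\bINTERSECT\b", re.IGNORECASE),
    "EXCEPT": (r"\bEXCEPT\b", re.IGNORECASE),
    "UNION": (r"\bUNION\b", re.IGNORECASE),
    "DISTRIBUTE BY": (r"\bDISTRIBUTE\s+BY\b", re.IGNORECASE),
    "dropDuplicates": (r"\.dropDuplicates\s*\(", 0),
    "repartition": (r"\.repartition\s*\(", 0),
}
_COMPILED = {name: re.compile(pattern, flags) for name, (pattern, flags) in _PATTERNS.items()}


def find_flagged_constructs(source_text: str) -> list[str]:
    """Return the names of listed constructs that appear in the given text."""
    return [name for name, regex in _COMPILED.items() if regex.search(source_text)]


if __name__ == "__main__":
    query = "SELECT variant_expr FROM a EXCEPT SELECT variant_expr FROM b"
    print(find_flagged_constructs(query))
```

Because the scan is purely textual, it also flags matches inside comments and string literals, and queries that use these constructs on non-variant columns.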

The following table has examples of existing unintended functionality and recommended workarounds:

Unintended use | Recommended use
SELECT distinct(variant_expr) FROM ... | SELECT distinct(variant_expr?::string) FROM ...
SELECT variant_expr FROM ... EXCEPT SELECT variant_expr FROM ... | SELECT variant_expr?::string FROM ... EXCEPT SELECT variant_expr?::string FROM ...

Note

For any fields you plan to use for comparison or distinct operations, Databricks recommends extracting these fields from the variant column and storing them using non-variant types.

See Query variant data. Contact your Databricks account representative if you require additional support or guidance.

Update to Databricks Marketplace and Partner Connect UI

We are simplifying the sidebar by merging Partner Connect and Marketplace into a single Marketplace link. The new Marketplace link will be higher on the sidebar.

Workspace files will be enabled for all Azure Databricks workspaces on Feb 1, 2025

Databricks will enable workspace files for all Azure Databricks workspaces on February 1, 2025. This change unblocks workspace users from using new workspace file features. After February 1, 2025, you won’t be able to disable workspace files using the enableWorkspaceFilesystem property with the Azure Databricks PATCH workspace-conf/setstatus REST API. For more details on workspace files, see What are workspace files?.
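To check how the enableWorkspaceFilesystem property is currently set before the change, a request like the following sketch may help. It assumes that the workspace configuration endpoint is available at /api/2.0/workspace-conf, that it accepts a keys query parameter, and that the request is authenticated with a bearer token; verify these details against the REST API reference for your workspace. The function name, workspace URL, and token are hypothetical placeholders, and the function only builds the request without sending it.

```python
import urllib.parse
import urllib.request


def build_workspace_files_status_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the enableWorkspaceFilesystem setting.

    Assumption: the workspace configuration API lives at /api/2.0/workspace-conf
    and takes a `keys` query parameter.
    """
    query = urllib.parse.urlencode({"keys": "enableWorkspaceFilesystem"})
    url = f"{workspace_url.rstrip('/')}/api/2.0/workspace-conf?{query}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Bearer {token}"},
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own workspace URL and token.
    request = build_workspace_files_status_request(
        "https://adb-1234567890123456.7.azuredatabricks.net", "<personal-access-token>"
    )
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request) and parse the JSON body.
```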

Tables are shared with history by default in Delta Sharing

Databricks plans to change the default setting for tables shared using Delta Sharing to include history by default. Previously, history sharing was disabled by default. Sharing table history improves read performance and provides automatic support for advanced Delta optimizations.

Predictive optimization enabled by default on all new Azure Databricks accounts

On November 11, Databricks will enable predictive optimization as the default for all new Azure Databricks accounts. Previously, it was disabled by default and could be enabled by your account administrator. When predictive optimization is enabled, Azure Databricks automatically runs maintenance operations for Unity Catalog managed tables. For more information on predictive optimization, see Predictive optimization for Unity Catalog managed tables.

Reduced cost and more control over performance vs. cost for your serverless compute for workflows workloads

In addition to the currently supported automatic performance optimizations, enhancements to the serverless compute for workflows optimization features will give you more control over whether workloads are optimized for performance or cost. To learn more, see Cost savings on serverless compute for Notebooks, Jobs, and Pipelines.

Changes to legacy dashboard version support

Databricks recommends using AI/BI dashboards (formerly Lakeview dashboards). Earlier versions of dashboards, previously referred to as Databricks SQL dashboards, are now called legacy dashboards. Databricks does not recommend creating new legacy dashboards. AI/BI dashboards offer improved features compared to the legacy version, including AI-assisted authoring, draft and published modes, and cross-filtering.

End of support timeline for legacy dashboards

  • April 7, 2025: Official support for the legacy version of dashboards will end. Only critical security issues and service outages will be addressed.
  • November 3, 2025: Databricks will begin archiving legacy dashboards that have not been accessed in the past six months. Archived dashboards will no longer be accessible, and the archival process will occur on a rolling basis. Access to actively used dashboards will remain unchanged.

Databricks will work with customers to develop migration plans for active legacy dashboards after November 3, 2025.

To help transition to AI/BI dashboards, upgrade tools are available in both the user interface and the API. For instructions on how to use the built-in migration tool in the UI, see Clone a legacy dashboard to an AI/BI dashboard. For tutorials about creating and managing dashboards using the REST API, see Use Azure Databricks APIs to manage dashboards.

Changes to serverless compute workload attribution

Currently, your billable usage system table might include serverless SKU billing records with null values for run_as, job_id, job_run_id, and notebook_id. These records represent costs associated with shared resources that are not directly attributable to any particular workload.

To help simplify cost reporting, Databricks will soon attribute these shared costs to the specific workloads that incurred them. You will no longer see billing records with null values in workload identifier fields. As you increase your usage of serverless compute and add more workloads, the proportion of these shared costs on your bill will decrease as they are shared across more workloads.
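To gauge how many of your current records fall into this unattributed category, you could filter exported billing rows on the four workload identifier fields. The sketch below assumes the rows have been exported as flat dictionaries keyed by the field names used above; the function name and the sample rows are hypothetical, and the actual system table may nest these fields differently.

```python
from typing import Any, Iterable, Mapping

# Workload identifier fields named in the text above.
WORKLOAD_ID_FIELDS = ("run_as", "job_id", "job_run_id", "notebook_id")


def unattributed_records(rows: Iterable[Mapping[str, Any]]) -> list[Mapping[str, Any]]:
    """Return the rows in which every workload identifier field is null (None)."""
    return [
        row
        for row in rows
        if all(row.get(field) is None for field in WORKLOAD_ID_FIELDS)
    ]


if __name__ == "__main__":
    # Made-up sample rows for illustration only.
    sample = [
        {"run_as": None, "job_id": None, "job_run_id": None, "notebook_id": None},
        {"run_as": "user@example.com", "job_id": "42", "job_run_id": "7", "notebook_id": None},
    ]
    print(len(unattributed_records(sample)))
```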

For more information on monitoring serverless compute costs, see Monitor the cost of serverless compute.

The sourceIPAddress field in audit logs will no longer include a port number

Due to a bug, certain authorization and authentication audit logs include a port number in addition to the IP in the sourceIPAddress field (for example, "sourceIPAddress":"10.2.91.100:0"). The port number, which is logged as 0, does not provide any real value and is inconsistent with the rest of the Databricks audit logs. To enhance the consistency of audit logs, Databricks plans to change the format of the IP address for these audit log events. This change will gradually roll out starting in early August 2024.

If the audit log contains a sourceIPAddress of 0.0.0.0, Databricks might stop logging it.
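While the change rolls out, downstream tooling that parses audit logs may see both formats. The following sketch shows one way to normalize the value so that either format yields a bare IP address. The function name is hypothetical, and the logic assumes the only variation is an optional ":<port>" suffix as in the example above; values in any other format are returned unchanged.

```python
import ipaddress


def normalize_source_ip(value: str) -> str:
    """Return the IP address without a trailing ':<port>' suffix, if one is present."""
    try:
        ipaddress.ip_address(value)
        return value  # already a bare IPv4 or IPv6 address
    except ValueError:
        pass

    # Split on the last colon and accept the prefix only if it is a valid IP
    # and the suffix is numeric.
    host, separator, port = value.rpartition(":")
    if separator and port.isdigit():
        try:
            ipaddress.ip_address(host)
            return host
        except ValueError:
            pass
    return value  # unrecognized format: leave unchanged


if __name__ == "__main__":
    print(normalize_source_ip("10.2.91.100:0"))
    print(normalize_source_ip("10.2.91.100"))
```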

Legacy Git integration is EOL on January 31

After January 31, 2024, Databricks will remove legacy notebook Git integrations. This feature has been in legacy status for more than two years, and a deprecation notice has been displayed in the product UI since November 2023.

For details on migrating to Databricks Git folders (formerly Repos) from legacy Git integration, see Switching to Databricks Repos from Legacy Git integration. If this removal impacts you and you need an extension, contact your Databricks account team.

JDK 8 and JDK 11 will be unsupported

Azure Databricks plans to remove JDK 8 support with the next major Databricks Runtime version, when Spark 4.0 releases. Azure Databricks plans to remove JDK 11 support with the next LTS version of Databricks Runtime 14.x.

Automatic enablement of Unity Catalog for new workspaces

Databricks has begun to enable Unity Catalog automatically for new workspaces. This removes the need for account admins to configure Unity Catalog after a workspace is created. Rollout is proceeding gradually across accounts.

sqlite-jdbc upgrade

Databricks plans to upgrade the sqlite-jdbc version from 3.8.11.2 to 3.42.0.0 in all Databricks Runtime maintenance releases. The APIs of version 3.42.0.0 are not fully compatible with 3.8.11.2. Confirm that the methods and return types your code uses are compatible with version 3.42.0.0.

If you are using sqlite-jdbc in your code, check the sqlite-jdbc compatibility report.