Learn about features and behavioral changes in upcoming Azure Databricks releases.
Starting January 21, Databricks will begin enabling statistics management for all accounts with predictive optimization enabled. Statistics management expands existing predictive optimization functionality by adding statistics collection on write and automatically running `ANALYZE` commands for Unity Catalog managed tables. For more information on predictive optimization, see Predictive optimization for Unity Catalog managed tables.
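The `ANALYZE` commands that statistics management automates can also be run manually today. A minimal sketch, using a hypothetical Unity Catalog managed table name:

```sql
-- Collect table-level statistics for a Unity Catalog managed table.
-- `main.sales.orders` is a hypothetical three-level table name.
ANALYZE TABLE main.sales.orders COMPUTE STATISTICS;

-- Column-level statistics additionally help the optimizer plan joins and filters.
ANALYZE TABLE main.sales.orders COMPUTE STATISTICS FOR ALL COLUMNS;
```

With statistics management enabled, Databricks schedules this collection for you instead of requiring the manual commands above.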
An upcoming release of Delta Live Tables will change the behavior when a materialized view or streaming table is removed from a pipeline. With this change, the removed materialized view or streaming table will not be deleted automatically when the next pipeline update runs. Instead, you will be able to use the `DROP MATERIALIZED VIEW` command to delete a materialized view or the `DROP TABLE` command to delete a streaming table. After dropping an object, running a pipeline update will not recover the object automatically. A new object is created if a materialized view or streaming table with the same definition is re-added to the pipeline. You can, however, recover a dropped object using the `UNDROP` command.
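Under the new behavior, deleting these objects becomes an explicit step. A minimal sketch, assuming a materialized view named `mv_daily_sales` and a streaming table named `st_events` (both hypothetical names):

```sql
-- Explicitly delete objects that were removed from the pipeline definition.
DROP MATERIALIZED VIEW mv_daily_sales;
DROP TABLE st_events;

-- A dropped table can be restored within the retention window.
UNDROP TABLE st_events;
```

If an object with the same definition is later re-added to the pipeline, a new object is created rather than the old one being recovered.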
Azure Databricks is blocking support for using fields with the variant data type in comparisons performed as part of the following operators and clauses:

- `DISTINCT`
- `INTERSECT`
- `EXCEPT`
- `UNION`
- `DISTRIBUTE BY`

The same restriction applies to these DataFrame functions:

- `df.dropDuplicates()`
- `df.repartition()`
Azure Databricks does not support these operators and functions for variant data type comparisons because such comparisons produce undefined results. These expressions are blocked when you use variant types in Databricks Runtime 16.1 and above. Maintenance releases will also block these expressions in Databricks Runtime 15.3 and above.
If you use the `VARIANT` type in your Azure Databricks workloads or tables, review the following examples of unintended functionality and the recommended workarounds:
| Unintended use | Recommended use |
|---|---|
| `SELECT distinct(variant_expr) FROM ...` | `SELECT distinct(variant_expr?::string) FROM ...` |
| `SELECT variant_expr FROM ... EXCEPT SELECT variant_expr FROM ...` | `SELECT variant_expr?::string FROM ... EXCEPT SELECT variant_expr?::string FROM ...` |
Note
For any fields you plan to use for comparison or distinct operations, Databricks recommends extracting these fields from the variant column and storing them using non-variant types.
See Query variant data. Contact your Databricks account representative if you require additional support or guidance.
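The extraction recommended in the note above might look like the following sketch, assuming a source table `events_raw` with a variant column `raw` that contains a `device_id` field (all hypothetical names):

```sql
-- Materialize comparison keys as typed, non-variant columns instead of
-- comparing variant values directly.
CREATE OR REPLACE TABLE events_typed AS
SELECT
  raw:device_id::string AS device_id,  -- extracted, typed copy of the field
  raw                                  -- original variant column retained
FROM events_raw;

-- DISTINCT on the extracted string column is fully supported.
SELECT DISTINCT device_id FROM events_typed;
```

Storing the extracted field as a string (or another concrete type) keeps `DISTINCT`, set operators, and deduplication well-defined while the full variant payload remains available.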
We are simplifying the sidebar by merging Partner Connect and Marketplace into a single Marketplace link, which will appear higher in the sidebar.
Databricks will enable workspace files for all Azure Databricks workspaces on February 1, 2025. This change unblocks workspace users from using new workspace file features. After February 1, 2025, you won't be able to disable workspace files using the `enableWorkspaceFilesystem` property with the Azure Databricks `PATCH workspace-conf/setstatus` REST API. For more details on workspace files, see What are workspace files?.
Databricks plans to change the default setting for tables shared using Delta Sharing to include history by default. Previously, history sharing was disabled by default. Sharing table history improves read performance and provides automatic support for advanced Delta optimizations.
On November 11, Databricks will enable predictive optimization as the default for all new Azure Databricks accounts. Previously, it was disabled by default and could be enabled by your account administrator. When predictive optimization is enabled, Azure Databricks automatically runs maintenance operations for Unity Catalog managed tables. For more information on predictive optimization, see Predictive optimization for Unity Catalog managed tables.
In addition to the currently supported automatic performance optimizations, enhancements to the serverless compute for workflows optimization features will give you more control over whether workloads are optimized for performance or cost. To learn more, see Cost savings on serverless compute for Notebooks, Jobs, and Pipelines.
Databricks recommends using AI/BI dashboards (formerly Lakeview dashboards). Earlier versions of dashboards, previously referred to as Databricks SQL dashboards, are now called legacy dashboards. Databricks does not recommend creating new legacy dashboards. AI/BI dashboards offer improved features compared to legacy dashboards, including AI-assisted authoring, draft and published modes, and cross-filtering.
Databricks will work with customers to develop migration plans for active legacy dashboards after November 3, 2025.
To help you transition to AI/BI dashboards, upgrade tools are available in both the user interface and the API. For instructions on how to use the built-in migration tool in the UI, see Clone a legacy dashboard to an AI/BI dashboard. For tutorials about creating and managing dashboards using the REST API, see Use Azure Databricks APIs to manage dashboards.
Currently, your billable usage system table might include serverless SKU billing records with null values for `run_as`, `job_id`, `job_run_id`, and `notebook_id`. These records represent costs associated with shared resources that are not directly attributable to any particular workload.
To help simplify cost reporting, Databricks will soon attribute these shared costs to the specific workloads that incurred them. You will no longer see billing records with null values in workload identifier fields. As you increase your usage of serverless compute and add more workloads, the proportion of these shared costs on your bill will decrease as they are shared across more workloads.
For more information on monitoring serverless compute costs, see Monitor the cost of serverless compute.
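To gauge how much of your current serverless spend falls into these shared-cost records, a query along the following lines might help. This is a sketch assuming the documented `system.billing.usage` schema; verify the column names against your workspace's system tables:

```sql
-- Sum serverless usage that is not yet attributed to a specific workload,
-- i.e. records whose workload identifier fields are all null.
SELECT
  sku_name,
  SUM(usage_quantity) AS unattributed_usage
FROM system.billing.usage
WHERE sku_name LIKE '%SERVERLESS%'
  AND identity_metadata.run_as IS NULL
  AND usage_metadata.job_id IS NULL
  AND usage_metadata.job_run_id IS NULL
  AND usage_metadata.notebook_id IS NULL
GROUP BY sku_name;
```

After the attribution change rolls out, this query should return progressively less usage as shared costs are assigned to specific workloads.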
Due to a bug, certain authorization and authentication audit logs include a port number in addition to the IP address in the `sourceIPAddress` field (for example, `"sourceIPAddress":"10.2.91.100:0"`). The port number, which is logged as `0`, does not provide any real value and is inconsistent with the rest of the Databricks audit logs. To improve the consistency of audit logs, Databricks plans to change the format of the IP address for these audit log events. This change will roll out gradually, starting in early August 2024.

If an audit log contains a `sourceIPAddress` of `0.0.0.0`, Databricks might stop logging it.
After January 31, 2024, Databricks will remove legacy notebook Git integrations. This feature has been in legacy status for more than two years, and a deprecation notice has been displayed in the product UI since November 2023.
For details on migrating to Databricks Git folders (formerly Repos) from legacy Git integration, see Switching to Databricks Repos from Legacy Git integration. If this removal impacts you and you need an extension, contact your Databricks account team.
Azure Databricks plans to remove JDK 8 support with the next major Databricks Runtime version, when Spark 4.0 releases. Azure Databricks plans to remove JDK 11 support with the next LTS version of Databricks Runtime 14.x.
Databricks has begun to enable Unity Catalog automatically for new workspaces. This removes the need for account admins to configure Unity Catalog after a workspace is created. Rollout is proceeding gradually across accounts.
Databricks Runtime plans to upgrade the sqlite-jdbc version from 3.8.11.2 to 3.42.0.0 in all Databricks Runtime maintenance releases. The APIs of version 3.42.0.0 are not fully compatible with those of 3.8.11.2. Confirm that the methods and return types your code relies on are compatible with version 3.42.0.0.
If you are using sqlite-jdbc in your code, check the sqlite-jdbc compatibility report.