January 2020
These features and Azure Databricks platform improvements were released in January 2020.
Note
Releases are staged. Your Azure Databricks account may not be updated until up to a week after the initial release date.
This month saw the release of Azure Databricks platform versions 3.9 and 3.11. There was no release of versions 3.10 or 3.8. Version 3.7 was a stability and bug-fix-only release.
Coming soon: workspace, pool, and cluster tags propagate to DBU usage details and Azure VMs for better cost management reporting
On February 10th, we will release tag propagation to Azure Databricks usage details and Azure VMs. The new tag propagation feature combines Azure Databricks workspace tags (that is, resource group tags), pool tags, and cluster tags and propagates them to the Databricks DBU usage details and Azure VMs as resource tags. You will be able to see the combined tag information in the Azure Cost Management portal and in usage detail exports, giving you better visibility into Azure Databricks usage (total cost of ownership) and accurate attribution to business units and teams.
Azure Databricks and Azure Lighthouse can now live in the same subscription
January 29, 2020
All existing Azure Databricks workspaces have migrated from using Managed Locks to Deny Assignments, and all new workspaces are created with Deny Assignments. This does not change any existing behavior, and the level of security remains the same. While you can now onboard subscriptions that use Azure Databricks to Azure Lighthouse, users in the managing tenant can’t launch Azure Databricks workspaces on a delegated subscription at this time.
Databricks Runtime 6.3 for Genomics GA
January 22, 2020
Databricks Runtime 6.3 for Genomics is built on top of Databricks Runtime 6.3. It includes many improvements and upgrades from Databricks Runtime 6.2 for Genomics.
The key features are:
- Support for Delta tables as input to the joint genotyping pipeline
- Automatic annotation parsing when reading VCFs
- Improved multiallelic variant splitter
- Faster linear and logistic regression functions
Databricks Runtime 6.3 ML GA
January 22, 2020
Databricks Runtime 6.3 ML GA brings many library upgrades, including:
- PyTorch: 1.3.0 to 1.3.1
- torchvision: 0.4.1 to 0.4.2
- MLflow: 1.4.0 to 1.5.0
- Hyperopt: 0.2.1 to 0.2.2
For details, see the complete Databricks Runtime 6.3 for ML (EoS) release notes.
Databricks Runtime 6.3 GA
January 22, 2020
Databricks Runtime 6.3 GA brings new features, improvements, and many bug fixes.
This release introduces improved concurrency. The key features are:
- Improved concurrency for all Delta Lake operations
- Improved support for file compaction
- Improved performance for insert-only merge
For details, see the complete Databricks Runtime 6.3 (EoS) release notes.
Disk caching enabled by default
January 7-14, 2020: Version 3.9
Disk caching is now enabled by default on Lsv2 series instances for all supported Databricks Runtime releases. See Selecting instance types to use disk caching.
Cluster standard autoscaling step is now configurable
January 7-14, 2020: Version 3.9
By default, the first step of standard autoscaling adds 8 nodes. You can now set the size of this first step in the cluster Spark configuration. See Compute configuration reference.
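As a rough illustration, the step could be set through the `spark_conf` field of a cluster specification. This is a sketch only: the property name `spark.databricks.autoscaling.standardFirstStepUp` is an assumption that does not appear in this note and should be checked against the Compute configuration reference, and the cluster name, runtime version string, node type, and worker counts are placeholder values.

```python
# Sketch of a cluster specification that overrides the first autoscaling step.
# ASSUMPTION: the Spark property name below is not stated in this release note;
# confirm it in the Compute configuration reference before relying on it.
FIRST_STEP_PROPERTY = "spark.databricks.autoscaling.standardFirstStepUp"


def cluster_spec_with_first_step(first_step_nodes: int) -> dict:
    """Return a cluster spec whose first scale-up step adds `first_step_nodes` nodes."""
    if first_step_nodes < 1:
        raise ValueError("the first autoscaling step must add at least one node")
    return {
        "cluster_name": "autoscaling-example",      # placeholder name
        "spark_version": "6.3.x-scala2.11",         # placeholder runtime version
        "node_type_id": "Standard_DS3_v2",          # placeholder VM size
        "autoscale": {"min_workers": 2, "max_workers": 20},
        # Spark configuration values are passed as strings.
        "spark_conf": {FIRST_STEP_PROPERTY: str(first_step_nodes)},
    }


spec = cluster_spec_with_first_step(4)
```

The same key and value could equally be entered by hand in the Spark configuration field of the cluster settings; the dictionary form is shown only because it is easy to validate.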
SCIM API supports pagination for Get Users and Get Groups (Public Preview)
January 7-14, 2020: Version 3.9
The SCIM API now supports pagination for Get Users and Get Groups. When you specify the startIndex and count query parameters, SCIM returns a subset of users or groups. The startIndex parameter is the 1-based index of the first result. The count parameter is the maximum number of users or groups to return. This ensures scalability for the SCIM client and simplifies SCIM calls for Azure Databricks admins. See Groups API.
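The paging behavior described above can be sketched as a small loop. This is a minimal illustration, not the official client: `page_url` and `iter_resources` are hypothetical helper names, the `fetch` callable stands in for whatever authenticated HTTP GET you use, and the `Resources` and `totalResults` response fields are assumed from the SCIM 2.0 list response format rather than stated in this note.

```python
from typing import Callable, Dict, Iterator
from urllib.parse import urlencode


def page_url(base_url: str, start_index: int, count: int) -> str:
    """Build a list URL with a 1-based startIndex and a maximum page size."""
    if start_index < 1:
        raise ValueError("startIndex is 1-based and must be >= 1")
    query = urlencode({"startIndex": start_index, "count": count})
    return f"{base_url}?{query}"


def iter_resources(
    fetch: Callable[[str], Dict], base_url: str, count: int = 100
) -> Iterator[Dict]:
    """Yield every user or group by requesting successive pages.

    `fetch` takes a URL and returns the decoded JSON response as a dict.
    ASSUMPTION: the response carries `Resources` and `totalResults`.
    """
    start_index = 1
    while True:
        page = fetch(page_url(base_url, start_index, count))
        resources = page.get("Resources", [])
        yield from resources
        # Advance by the number of items actually returned, not by `count`,
        # in case the server returns fewer items than requested.
        start_index += len(resources)
        if not resources or start_index > page.get("totalResults", 0):
            break
```

In practice `fetch` would wrap an authenticated request against the workspace SCIM endpoint for users or groups; passing it in as a parameter keeps the paging logic independent of any particular HTTP library.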
File browser swimlane widths increased to 240px
January 7-14, 2020: Version 3.9
The increased width reduces the need to mouse over objects to see the full filename.
Databricks Runtime 3.5 LTS support ends
January 2, 2020
Support for Databricks Runtime 3.5 LTS (Long Term Support) ended on January 2. See Databricks support lifecycles.