

Databricks Runtime 15.2

The following release notes provide information about Databricks Runtime 15.2, powered by Apache Spark 3.5.0.

Databricks released this version in May 2024.

Tip

To see release notes for Databricks Runtime versions that have reached end-of-support (EoS), see End-of-support Databricks Runtime release notes. The EoS Databricks Runtime versions have been retired and might not be updated.

Behavioral changes

Vacuum cleans up COPY INTO metadata files

Running VACUUM on a table written with COPY INTO now cleans up unreferenced metadata associated with tracking ingested files. There is no impact on the operational semantics of COPY INTO.

Lakehouse Federation is generally available (GA)

In Databricks Runtime 15.2 and later, Lakehouse Federation connectors across the following database types are generally available (GA):

  • MySQL
  • PostgreSQL
  • Amazon Redshift
  • Snowflake
  • Microsoft SQL Server
  • Azure Synapse (SQL Data Warehouse)
  • Databricks

This release also introduces the following improvements:

  • Support for single sign-on (SSO) authentication in the Snowflake and Microsoft SQL Server connectors.

  • Azure Private Link support for the SQL Server connector from serverless compute environments. See Step 3: Create private endpoint rules.

  • Support for additional pushdowns (string, math, and miscellaneous functions).

  • Improved pushdown success rate across different query shapes.

  • Additional pushdown debugging capabilities:

    • The EXPLAIN FORMATTED output displays the pushed-down query text.
    • The query profile UI displays the pushed-down query text, federated node identifiers, and JDBC query execution times (in verbose mode). See View system-generated federated queries.

BY POSITION for column mapping using COPY INTO with headerless CSV files

In Databricks Runtime 15.2 and later, you can use the BY POSITION keywords (or the alternative syntax ( col_name [ , <col_name> ... ] )) with COPY INTO for headerless CSV files to simplify mapping source columns to target table columns. See Parameters.
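As a minimal sketch of the syntax described above, the following Python helper builds a COPY INTO statement that maps headerless CSV columns by position. The helper name, table name, and source path are hypothetical and used only for illustration. The 'header' = 'false' format option is an assumption about how the headerless file is declared.

```python
def copy_into_by_position(target_table: str, source_path: str) -> str:
    """Build a COPY INTO statement that maps CSV columns to table columns by position."""
    return (
        f"COPY INTO {target_table} BY POSITION\n"
        f"FROM '{source_path}'\n"
        "FILEFORMAT = CSV\n"
        "FORMAT_OPTIONS ('header' = 'false')"
    )


# Hypothetical table name and path, for illustration only.
statement = copy_into_by_position(
    "main.sales.orders", "/Volumes/main/sales/landing/orders/"
)
print(statement)
```

In a notebook, the resulting string could be passed to spark.sql(statement).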

Reduce memory consumption when Spark tasks fail with a Resubmitted error

In Databricks Runtime 15.2 and later, the return value of the Spark TaskInfo.accumulables() method is empty when tasks fail with a Resubmitted error. Previously, the method returned the values of an earlier successful task attempt. This behavior change affects the following consumers:

  • Spark tasks that use the EventLoggingListener class.
  • Custom Spark Listeners.

To restore the previous behavior, set spark.scheduler.dropTaskInfoAccumulablesOnTaskCompletion.enabled to false.

Viewing adaptive query execution plan versions is disabled

To reduce memory consumption, viewing adaptive query execution (AQE) plan versions is now disabled by default in the Spark UI. To enable viewing AQE plan versions in the Spark UI, set spark.databricks.sql.aqe.showPlanChangesInUI.enabled to true.

Limit on retained queries is lowered to reduce Spark UI memory usage

In Databricks Runtime 15.2 and later, to reduce the memory consumed by the Spark UI in Azure Databricks compute, the limit on the number of queries visible in the UI is lowered from 1000 to 100. To change the limit, set a new value using the spark.sql.ui.retainedExecutions Spark configuration.
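The settings named in this section and the two sections before it can be collected in one place. The sketch below assumes that you apply them through the compute's Spark configuration, written as one space-separated key and value per line. The helper name is hypothetical, and the value 1000 is simply the previous limit mentioned above.

```python
# Settings named in these release notes, with values that restore the
# behavior from before Databricks Runtime 15.2.
PRE_15_2_BEHAVIOR = {
    "spark.scheduler.dropTaskInfoAccumulablesOnTaskCompletion.enabled": "false",
    "spark.databricks.sql.aqe.showPlanChangesInUI.enabled": "true",
    "spark.sql.ui.retainedExecutions": "1000",
}


def to_spark_config_lines(settings: dict) -> str:
    """Render settings as one 'key value' pair per line."""
    return "\n".join(f"{key} {value}" for key, value in settings.items())


config_text = to_spark_config_lines(PRE_15_2_BEHAVIOR)
print(config_text)
```

Restoring any of these values gives up the corresponding memory savings, so it is probably best to change only the settings you actually need.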

DESCRIBE HISTORY now shows clustering columns for tables that use liquid clustering

When you run a DESCRIBE HISTORY query, the operationParameters column shows a clusterBy field by default for CREATE OR REPLACE and OPTIMIZE operations. For a Delta table that uses liquid clustering, the clusterBy field is populated with the table’s clustering columns. If the table does not use liquid clustering, the field is empty.

New features and improvements

Support for primary and foreign keys is GA

Support for primary and foreign keys in Databricks Runtime is generally available. The GA release includes the following changes to the privileges required to use primary and foreign keys:

  • To define a foreign key, you must have the SELECT privilege on the table with the primary key that the foreign key refers to. You do not need to own the table with the primary key, which was previously required.
  • Dropping a primary key using the CASCADE clause does not require privileges on the tables that define foreign keys that reference the primary key. Previously, you needed to own the referencing tables.
  • Dropping a table that includes constraints now requires the same privileges as dropping tables that do not include constraints.

To learn how to use primary and foreign keys with tables or views, see CONSTRAINT clause, ADD CONSTRAINT clause, and DROP CONSTRAINT clause.
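To make the privilege changes above more concrete, here is a rough sketch that builds the kinds of statements they apply to: adding a primary key, adding a foreign key that refers to it, and dropping the primary key with CASCADE. The helper, table names, and column names are hypothetical. The sketch also assumes the primary key column is already declared NOT NULL.

```python
def key_constraint_statements(parent: str, child: str, key: str, fk_column: str) -> list:
    """Build example DDL for a primary key, a referencing foreign key, and a cascading drop."""
    return [
        # Assumes `key` is already a NOT NULL column on the parent table.
        f"ALTER TABLE {parent} ADD CONSTRAINT {parent}_pk PRIMARY KEY ({key})",
        # Under the GA privilege model, this needs SELECT on the parent table.
        f"ALTER TABLE {child} ADD CONSTRAINT {child}_{parent}_fk "
        f"FOREIGN KEY ({fk_column}) REFERENCES {parent} ({key})",
        # CASCADE also drops the foreign keys that reference the primary key.
        f"ALTER TABLE {parent} DROP PRIMARY KEY CASCADE",
    ]


# Hypothetical tables and columns, for illustration only.
statements = key_constraint_statements("customers", "orders", "id", "customer_id")
for sql in statements:
    print(sql)
```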

Liquid clustering is GA

Support for liquid clustering is now generally available using Databricks Runtime 15.2 and above. See Use liquid clustering for Delta tables.

Type widening is in Public Preview

You can now enable type widening on tables backed by Delta Lake. Tables with type widening enabled allow changing the type of columns to a wider data type without rewriting underlying data files. See Type widening.
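A minimal sketch of how this might look in practice: enable the feature through a table property, then change a column to a wider type. The helper name, table name, and column are hypothetical. The property name delta.enableTypeWidening is taken from the Delta Lake type widening documentation, so verify it against the linked page for your runtime.

```python
def type_widening_statements(table: str, column: str, wider_type: str) -> list:
    """Build the two statements: enable type widening, then widen one column."""
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true')",
        f"ALTER TABLE {table} ALTER COLUMN {column} TYPE {wider_type}",
    ]


# Hypothetical example: widen an INT column to BIGINT.
widening_sql = type_widening_statements("main.sales.orders", "quantity", "BIGINT")
for sql in widening_sql:
    print(sql)
```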

Schema evolution clause added to SQL merge syntax

You can now add the WITH SCHEMA EVOLUTION clause to a SQL merge statement to enable schema evolution for the operation. See Schema evolution syntax for merge.
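As an illustration, the sketch below builds a merge statement that includes the clause. The helper, table names, and key column are hypothetical. UPDATE SET * and INSERT * are used here because they carry over every source column, including any the target does not have yet.

```python
def merge_with_schema_evolution(target: str, source: str, key: str) -> str:
    """Build a MERGE statement with schema evolution enabled for this operation."""
    return (
        f"MERGE WITH SCHEMA EVOLUTION INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON t.{key} = s.{key}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


# Hypothetical tables and key, for illustration only.
merge_sql = merge_with_schema_evolution("main.sales.orders", "main.sales.orders_updates", "order_id")
print(merge_sql)
```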

PySpark custom data sources are available in Public Preview

You can create a PySpark DataSource using the Python (PySpark) DataSource API, which enables reading from custom data sources and writing to custom data sinks in Apache Spark using Python. See PySpark custom data sources.

applyInPandas and mapInPandas now available on Unity Catalog compute with shared access mode

As part of a Databricks Runtime 14.3 LTS maintenance release, applyInPandas and mapInPandas UDF types are now supported on shared access mode compute running Databricks Runtime 14.3 and above.

Use dbutils.widgets.getAll() to get all widgets in a notebook

Use dbutils.widgets.getAll() to get all widget values in a notebook. This is especially helpful when passing multiple widget values to a Spark SQL query.
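One way to use this, sketched under the assumption that getAll() returns a mapping of widget names to their values, is to pass the result as named parameters to spark.sql. The helper name and the example query are hypothetical. The spark and dbutils objects are the ones a Databricks notebook provides.

```python
def query_with_widgets(spark, dbutils, sql_text: str):
    """Run sql_text, binding the notebook's widget values as named parameters."""
    # Assumed shape of the result: {"widget_name": "widget_value", ...}
    widget_values = dbutils.widgets.getAll()
    # Named parameter markers such as :year in sql_text are bound from the mapping.
    return spark.sql(sql_text, args=widget_values)


# In a notebook with a widget named "year" (hypothetical), you might call:
# df = query_with_widgets(spark, dbutils, "SELECT * FROM main.sales.orders WHERE order_year = :year")
```

Binding the values as parameters avoids building the query through string concatenation.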

Vacuum inventory support

You can now specify an inventory of files to consider when running the VACUUM command on a Delta table. See the OSS Delta docs.
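A rough sketch of what such a command might look like, assuming the USING INVENTORY clause described in the open source Delta Lake documentation. The helper and table names are hypothetical, and the required schema of the inventory is defined in the Delta docs, so check them before relying on this form.

```python
def vacuum_with_inventory(table: str, inventory_query: str) -> str:
    """Build a VACUUM statement that considers only files listed by inventory_query."""
    return f"VACUUM {table} USING INVENTORY ({inventory_query})"


# Hypothetical table names, for illustration only.
vacuum_sql = vacuum_with_inventory(
    "main.sales.orders", "SELECT * FROM main.ops.orders_file_inventory"
)
print(vacuum_sql)
```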

Support for Zstandard compression functions

You can now use the zstd_compress, zstd_decompress, and try_zstd_decompress functions to compress and decompress BINARY data.
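The query below is a small sketch of a compression round trip using these functions with their single-argument forms. The column aliases are hypothetical, and the second expression assumes that the try_ variant returns NULL instead of raising an error when the input is not valid Zstandard data.

```python
# Round trip: STRING -> BINARY -> compressed BINARY -> BINARY -> STRING.
ZSTD_ROUND_TRIP_SQL = """
SELECT
  CAST(zstd_decompress(zstd_compress(CAST('hello' AS BINARY))) AS STRING) AS round_trip,
  try_zstd_decompress(CAST('not zstd data' AS BINARY)) AS invalid_input
""".strip()

print(ZSTD_ROUND_TRIP_SQL)
# In a notebook you could run: spark.sql(ZSTD_ROUND_TRIP_SQL).show()
```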

Bug fixes

Query plans in the SQL UI now correctly display PhotonWriteStage

Previously, when displayed in the SQL UI, write commands in query plans incorrectly showed PhotonWriteStage as an operator. With this release, the UI is updated to show PhotonWriteStage as a stage. This is a UI change only and does not affect how queries are run.

Ray is updated to fix issues with starting Ray clusters

This release includes a patched version of Ray that fixes a breaking change that prevented Ray clusters from starting with Databricks Runtime for Machine Learning. This change ensures that Ray functionality is identical to that in Databricks Runtime versions earlier than 15.2.

GraphFrames is updated to fix incorrect results with Spark 3.5

This release includes an update to the GraphFrames package to fix issues that cause incorrect results for some algorithms with GraphFrames and Spark 3.5.

Corrected error class for DataFrame.sort() and DataFrame.sortWithinPartitions() functions

This release includes an update to the PySpark DataFrame.sort() and DataFrame.sortWithinPartitions() functions to ensure the ZERO_INDEX error class is thrown when 0 is passed as the index argument. Previously, the error class INDEX_NOT_POSITIVE was thrown.

ipywidgets is downgraded from 8.0.4 to 7.7.2

To fix errors introduced by an upgrade of ipywidgets to 8.0.4 in Databricks Runtime 15.0, ipywidgets is downgraded to 7.7.2 in Databricks Runtime 15.2. This is the same version included in previous Databricks Runtime versions.

Library upgrades

  • Upgraded Python libraries:
    • GitPython from 3.1.42 to 3.1.43
    • google-api-core from 2.17.1 to 2.18.0
    • google-auth from 2.28.1 to 2.29.0
    • google-cloud-storage from 2.15.0 to 2.16.0
    • googleapis-common-protos from 1.62.0 to 1.63.0
    • ipywidgets from 8.0.4 to 7.7.2
    • mlflow-skinny from 2.11.1 to 2.11.3
    • s3transfer from 0.10.0 to 0.10.1
    • sqlparse from 0.4.4 to 0.5.0
    • typing_extensions from 4.7.1 to 4.10.0
  • Upgraded R libraries:
  • Upgraded Java libraries:
    • com.amazonaws.aws-java-sdk-autoscaling from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cloudformation from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cloudfront from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cloudhsm from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cloudsearch from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cloudtrail from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cloudwatch from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cloudwatchmetrics from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-codedeploy from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cognitoidentity from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-cognitosync from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-config from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-core from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-datapipeline from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-directconnect from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-directory from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-dynamodb from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-ec2 from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-ecs from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-efs from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-elasticache from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-elasticbeanstalk from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-elasticloadbalancing from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-elastictranscoder from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-emr from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-glacier from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-glue from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-iam from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-importexport from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-kinesis from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-kms from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-lambda from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-logs from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-machinelearning from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-opsworks from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-rds from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-redshift from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-route53 from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-s3 from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-ses from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-simpledb from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-simpleworkflow from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-sns from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-sqs from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-ssm from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-storagegateway from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-sts from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-support from 1.12.390 to 1.12.610
    • com.amazonaws.aws-java-sdk-workspaces from 1.12.390 to 1.12.610
    • com.amazonaws.jmespath-java from 1.12.390 to 1.12.610

Apache Spark

Databricks Runtime 15.2 includes Apache Spark 3.5.0. This release includes all Spark fixes and improvements included in Databricks Runtime 15.1, as well as the following additional bug fixes and improvements made to Spark:

  • [SPARK-47941] [SC-163568] [SS] [Connect] Propagate ForeachBatch worker initialization errors to users for PySpark
  • [SPARK-47412] [SC-163455][SQL] Add Collation Support for LPad/RPad.
  • [SPARK-47907] [SC-163408][SQL] Put bang under a config
  • [SPARK-46820] [SC-157093][PYTHON] Fix error message regression by restoring new_msg
  • [SPARK-47602] [SPARK-47577][SPARK-47598][SPARK-47577]Core/MLLib/Resource managers: structured logging migration
  • [SPARK-47890] [SC-163324][CONNECT][PYTHON] Add variant functions to Scala and Python.
  • [SPARK-47894] [SC-163086][CORE][WEBUI] Add Environment page to Master UI
  • [SPARK-47805] [SC-163459][SS] Implementing TTL for MapState
  • [SPARK-47900] [SC-163326] Fix check for implicit (UTF8_BINARY) collation
  • [SPARK-47902] [SC-163316][SQL]Making Compute Current Time* expressions foldable
  • [SPARK-47845] [SC-163315][SQL][PYTHON][CONNECT] Support Column type in split function for scala and python
  • [SPARK-47754] [SC-162144][SQL] Postgres: Support reading multidimensional arrays
  • [SPARK-47416] [SC-163001][SQL] Add new functions to CollationBenchmark #90339
  • [SPARK-47839] [SC-163075][SQL] Fix aggregate bug in RewriteWithExpression
  • [SPARK-47821] [SC-162967][SQL] Implement is_variant_null expression
  • [SPARK-47883] [SC-163184][SQL] Make CollectTailExec.doExecute lazy with RowQueue
  • [SPARK-47390] [SC-163306][SQL] PostgresDialect distinguishes TIMESTAMP from TIMESTAMP_TZ
  • [SPARK-47924] [SC-163282][CORE] Add a DEBUG log to DiskStore.moveFileToBlock
  • [SPARK-47897] [SC-163183][SQL][3.5] Fix ExpressionSet performance regression in scala 2.12
  • [SPARK-47565] [SC-161786][PYTHON] PySpark worker pool crash resilience
  • [SPARK-47885] [SC-162989][PYTHON][CONNECT] Make pyspark.resource compatible with pyspark-connect
  • [SPARK-47887] [SC-163122][CONNECT] Remove unused import spark/connect/common.proto from spark/connect/relations.proto
  • [SPARK-47751] [SC-161991][PYTHON][CONNECT] Make pyspark.worker_utils compatible with pyspark-connect
  • [SPARK-47691] [SC-161760][SQL] Postgres: Support multi dimensional array on the write side
  • [SPARK-47617] [SC-162513][SQL] Add TPC-DS testing infrastructure for collations
  • [SPARK-47356] [SC-162858][SQL] Add support for ConcatWs & Elt (all collations)
  • [SPARK-47543] [SC-161234][CONNECT][PYTHON] Inferring dict as MapType from Pandas DataFrame to allow DataFrame creation
  • [SPARK-47863] [SC-162974][SQL] Fix startsWith & endsWith collation-aware implementation for ICU
  • [SPARK-47867] [SC-162966][SQL] Support variant in JSON scan.
  • [SPARK-47366] [SC-162475][SQL][PYTHON] Add VariantVal for PySpark
  • [SPARK-47803] [SC-162726][SQL] Support cast to variant.
  • [SPARK-47769] [SC-162841][SQL] Add schema_of_variant_agg expression.
  • [SPARK-47420] [SC-162842][SQL] Fix test output
  • [SPARK-47430] [SC-161178][SQL] Support GROUP BY for MapType
  • [SPARK-47357] [SC-162751][SQL] Add support for Upper, Lower, InitCap (all collations)
  • [SPARK-47788] [SC-162729][SS] Ensure the same hash partitioning for streaming stateful ops
  • [SPARK-47776] [SC-162291][SS] Disallow binary inequality collation be used in key schema of stateful operator
  • [SPARK-47673] [SC-162824][SS] Implementing TTL for ListState
  • [SPARK-47818] [SC-162845][CONNECT] Introduce plan cache in SparkConnectPlanner to improve performance of Analyze requests
  • [SPARK-47694] [SC-162783][CONNECT] Make max message size configurable on the client side
  • [SPARK-47274] Revert “[SC-162479][PYTHON][SQL] Provide more usef…
  • [SPARK-47616] [SC-161193][SQL] Add User Document for Mapping Spark SQL Data Types from MySQL
  • [SPARK-47862] [SC-162837][PYTHON][CONNECT]Fix generation of proto files
  • [SPARK-47849] [SC-162724][PYTHON][CONNECT] Change release script to release pyspark-connect
  • [SPARK-47410] [SC-162518][SQL] Refactor UTF8String and CollationFactory
  • [SPARK-47807] [SC-162505][PYTHON][ML] Make pyspark.ml compatible with pyspark-connect
  • [SPARK-47707] [SC-161768][SQL] Special handling of JSON type for MySQL Connector/J 5.x
  • [SPARK-47765] Revert “[SC-162636][SQL] Add SET COLLATION to pars…
  • [SPARK-47081] [SC-162151][CONNECT][FOLLOW] Improving the usability of the Progress Handler
  • [SPARK-47289] [SC-161877][SQL] Allow extensions to log extended information in explain plan
  • [SPARK-47274] [SC-162479][PYTHON][SQL] Provide more useful context for PySpark DataFrame API errors
  • [SPARK-47765] [SC-162636][SQL] Add SET COLLATION to parser rules
  • [SPARK-47828] [SC-162722][CONNECT][PYTHON] DataFrameWriterV2.overwrite fails with invalid plan
  • [SPARK-47812] [SC-162696][CONNECT] Support Serialization of SparkSession for ForEachBatch worker
  • [SPARK-47253] [SC-162698][CORE] Allow LiveEventBus to stop without the completely draining of event queue
  • [SPARK-47827] [SC-162625][PYTHON] Missing warnings for deprecated features
  • [SPARK-47733] [SC-162628][SS] Add custom metrics for transformWithState operator part of query progress
  • [SPARK-47784] [SC-162623][SS] Merge TTLMode and TimeoutMode into a single TimeMode.
  • [SPARK-47775] [SC-162319][SQL] Support remaining scalar types in the variant spec.
  • [SPARK-47736] [SC-162503][SQL] Add support for AbstractArrayType
  • [SPARK-47081] [SC-161758][CONNECT] Support Query Execution Progress
  • [SPARK-47682] [SC-162138][SQL] Support cast from variant.
  • [SPARK-47802] [SC-162478][SQL] Revert () from meaning struct() back to meaning *
  • [SPARK-47680] [SC-162318][SQL] Add variant_explode expression.
  • [SPARK-47809] [SC-162511][SQL] checkExceptionInExpression should check error for each codegen mode
  • [SPARK-41811] [SC-162470][PYTHON][CONNECT] Implement SQLStringFormatter with WithRelations
  • [SPARK-47693] [SC-162326][SQL] Add optimization for lowercase comparison of UTF8String used in UTF8_BINARY_LCASE collation
  • [SPARK-47541] [SC-162006][SQL] Collated strings in complex types supporting operations reverse, array_join, concat, map
  • [SPARK-46812] [SC-161535][CONNECT][PYTHON] Make mapInPandas / mapInArrow support ResourceProfile
  • [SPARK-47727] [SC-161982][PYTHON] Make SparkConf to root level to for both SparkSession and SparkContext
  • [SPARK-47406] [SC-159376][SQL] Handle TIMESTAMP and DATETIME in MYSQLDialect
  • [SPARK-47081] Revert “[SC-161758][CONNECT] Support Query Executi…
  • [SPARK-47681] [SC-162043][SQL] Add schema_of_variant expression.
  • [SPARK-47783] [SC-162222] Add some missing SQLSTATEs and clean up the YY000 to use…
  • [SPARK-47634] [SC-161558][SQL] Add legacy support for disabling map key normalization
  • [SPARK-47746] [SC-162022] Implement ordinal-based range encoding in the RocksDBStateEncoder
  • [SPARK-47285] [SC-158340][SQL] AdaptiveSparkPlanExec should always use the context.session
  • [SPARK-47643] [SC-161534][SS][PYTHON] Add pyspark test for python streaming source
  • [SPARK-47582] [SC-161943][SQL] Migrate Catalyst logInfo with variables to structured logging framework
  • [SPARK-47558] [SC-162007][SS] State TTL support for ValueState
  • [SPARK-47358] [SC-160912][SQL][COLLATION] Improve repeat expression support to return correct datatype
  • [SPARK-47504] [SC-162044][SQL] Resolve AbstractDataType simpleStrings for StringTypeCollated
  • [SPARK-47719] Revert “[SC-161909][SQL] Change spark.sql.legacy.t…
  • [SPARK-47657] [SC-162010][SQL] Implement collation filter push down support per file source
  • [SPARK-47081] [SC-161758][CONNECT] Support Query Execution Progress
  • [SPARK-47744] [SC-161999] Add support for negative-valued bytes in range encoder
  • [SPARK-47713] [SC-162009][SQL][CONNECT] Fix a self-join failure
  • [SPARK-47310] [SC-161930][SS] Add micro-benchmark for merge operations for multiple values in value portion of state store
  • [SPARK-47700] [SC-161774][SQL] Fix formatting of error messages with treeNode
  • [SPARK-47752] [SC-161993][PS][CONNECT] Make pyspark.pandas compatible with pyspark-connect
  • [SPARK-47575] [SC-161402][SPARK-47576][SPARK-47654] Implement logWarning/logInfo API in structured logging framework
  • [SPARK-47107] [SC-161201][SS][PYTHON] Implement partition reader for python streaming data source
  • [SPARK-47553] [SC-161772][SS] Add Java support for transformWithState operator APIs
  • [SPARK-47719] [SC-161909][SQL] Change spark.sql.legacy.timeParserPolicy default to CORRECTED
  • [SPARK-47655] [SC-161761][SS] Integrate timer with Initial State handling for state-v2
  • [SPARK-47665] [SC-161550][SQL] Use SMALLINT to Write ShortType to MYSQL
  • [SPARK-47210] [SC-161777][SQL] Addition of implicit casting without indeterminate support
  • [SPARK-47653] [SC-161767][SS] Add support for negative numeric types and range scan key encoder
  • [SPARK-46743] [SC-160777][SQL] Count bug after constant folding
  • [SPARK-47525] [SC-154568][SQL] Support subquery correlation joining on map attributes
  • [SPARK-46366] [SC-151277][SQL] Use WITH expression in BETWEEN to avoid duplicate expressions
  • [SPARK-47563] [SC-161183][SQL] Add map normalization on creation
  • [SPARK-42040] [SC-161171][SQL] SPJ: Introduce a new API for V2 input partition to report partition statistics
  • [SPARK-47679] [SC-161549][SQL] Use HiveConf.getConfVars or Hive conf names directly
  • [SPARK-47685] [SC-161566][SQL] Restore the support for Stream type in Dataset#groupBy
  • [SPARK-47646] [SC-161352][SQL] Make try_to_number return NULL for malformed input
  • [SPARK-47366] [SC-161324][PYTHON] Add pyspark and dataframe parse_json aliases
  • [SPARK-47491] [SC-161176][CORE] Add slf4j-api jar to the class path first before the others of jars directory
  • [SPARK-47270] [SC-158741][SQL] Dataset.isEmpty projects CommandResults locally
  • [SPARK-47364] [SC-158927][CORE] Make PluginEndpoint warn when plugins reply for one-way message
  • [SPARK-47280] [SC-158350][SQL] Remove timezone limitation for ORACLE TIMESTAMP WITH TIMEZONE
  • [SPARK-47551] [SC-161542][SQL] Add variant_get expression.
  • [SPARK-47559] [SC-161255][SQL] Codegen Support for variant parse_json
  • [SPARK-47572] [SC-161351][SQL] Enforce Window partitionSpec is orderable.
  • [SPARK-47546] [SC-161241][SQL] Improve validation when reading Variant from Parquet
  • [SPARK-47543] [SC-161234][CONNECT][PYTHON] Inferring dict as MapType from Pandas DataFrame to allow DataFrame creation
  • [SPARK-47485] [SC-161194][SQL][PYTHON][CONNECT] Create column with collations in dataframe API
  • [SPARK-47641] [SC-161376][SQL] Improve the performance for UnaryMinus and Abs
  • [SPARK-47631] [SC-161325][SQL] Remove unused SQLConf.parquetOutputCommitterClass method
  • [SPARK-47674] [SC-161504][CORE] Enable spark.metrics.appStatusSource.enabled by default
  • [SPARK-47273] [SC-161162][SS][PYTHON] implement Python data stream writer interface.
  • [SPARK-47637] [SC-161408][SQL] Use errorCapturingIdentifier in more places
  • [SPARK-47497] Revert “Revert “[SC-160724][SQL] Make to_csv support the output of array/struct/map/binary as pretty strings””
  • [SPARK-47492] [SC-161316][SQL] Widen whitespace rules in lexer
  • [SPARK-47664] [SC-161475][PYTHON][CONNECT] Validate the column name with cached schema
  • [SPARK-47638] [SC-161339][PS][CONNECT] Skip column name validation in PS
  • [SPARK-47363] [SC-161247][SS] Initial State without state reader implementation for State API v2.
  • [SPARK-47447] [SC-160448][SQL] Allow reading Parquet TimestampLTZ as TimestampNTZ
  • [SPARK-47497] Revert “[SC-160724][SQL] Make to_csv support the output of array/struct/map/binary as pretty strings”
  • [SPARK-47434] [SC-160122][WEBUI] Fix statistics link in StreamingQueryPage
  • [SPARK-46761] [SC-159045][SQL] Quoted strings in a JSON path should support ? characters
  • [SPARK-46915] [SC-155729][SQL] Simplify UnaryMinus Abs and align error class
  • [SPARK-47431] [SC-160919][SQL] Add session level default Collation
  • [SPARK-47620] [SC-161242][PYTHON][CONNECT] Add a helper function to sort columns
  • [SPARK-47570] [SC-161165][SS] Integrate range scan encoder changes with timer implementation
  • [SPARK-47497] [SC-160724][SQL] Make to_csv support the output of array/struct/map/binary as pretty strings
  • [SPARK-47562] [SC-161166][CONNECT] Factor literal handling out of plan.py
  • [SPARK-47509] [SC-160902][SQL] Block subquery expressions in lambda and higher-order functions
  • [SPARK-47539] [SC-160750][SQL] Make the return value of method castToString be Any => UTF8String
  • [SPARK-47372] [SC-160905][SS] Add support for range scan based key state encoder for use with state store provider
  • [SPARK-47517] [SC-160642][CORE][SQL] Prefer Utils.bytesToString for size display
  • [SPARK-47243] [SC-158059][SS] Correct the package name of StateMetadataSource.scala
  • [SPARK-47367] [SC-160913][PYTHON][CONNECT] Support Python data sources with Spark Connect
  • [SPARK-47521] [SC-160666][CORE] Use Utils.tryWithResource during reading shuffle data from external storage
  • [SPARK-47474] [SC-160522][CORE] Revert SPARK-47461 and add some comments
  • [SPARK-47560] [SC-160914][PYTHON][CONNECT] Avoid RPC to validate column name with cached schema
  • [SPARK-47451] [SC-160749][SQL] Support to_json(variant).
  • [SPARK-47528] [SC-160727][SQL] Add UserDefinedType support to DataTypeUtils.canWrite
  • [SPARK-44708] Revert “[SC-160734][PYTHON] Migrate test_reset_index assert_eq to use assertDataFrameEqual”
  • [SPARK-47506] [SC-160740][SQL] Add support to all file source formats for collated data types
  • [SPARK-47256] [SC-160784][SQL] Assign names to error classes _LEGACY_ERROR_TEMP_102[4-7]
  • [SPARK-47495] [SC-160720][CORE] Fix primary resource jar added to spark.jars twice under k8s cluster mode
  • [SPARK-47398] [SC-160572][SQL] Extract a trait for InMemoryTableScanExec to allow for extending functionality
  • [SPARK-47479] [SC-160623][SQL] Optimize cannot write data to relations with multiple paths error log
  • [SPARK-47483] [SC-160629][SQL] Add support for aggregation and join operations on arrays of collated strings
  • [SPARK-47458] [SC-160237][CORE] Fix the problem with calculating the maximum concurrent tasks for the barrier stage
  • [SPARK-47534] [SC-160737][SQL] Move o.a.s.variant to o.a.s.types.variant
  • [SPARK-47396] [SC-159312][SQL] Add a general mapping for TIME WITHOUT TIME ZONE to TimestampNTZType
  • [SPARK-44708] [SC-160734][PYTHON] Migrate test_reset_index assert_eq to use assertDataFrameEqual
  • [SPARK-47309] [SC-157733][SC-160398][SQL] XML: Add schema inference tests for value tags
  • [SPARK-47007] [SC-160630][SQL] Add the MapSort expression
  • [SPARK-47523] [SC-160645][SQL] Replace deprecated JsonParser#getCurrentName with JsonParser#currentName
  • [SPARK-47440] [SC-160635][SQL] Fix pushing unsupported syntax to MsSqlServer
  • [SPARK-47512] [SC-160617][SS] Tag operation type used with RocksDB state store instance lock acquisition/release
  • [SPARK-47346] [SC-159425][PYTHON] Make daemon mode configurable when creating Python planner workers
  • [SPARK-47446] [SC-160163][CORE] Make BlockManager warn before removeBlockInternal
  • [SPARK-46526] [SC-156099][SQL] Support LIMIT over correlated subqueries where predicates only reference outer table
  • [SPARK-47461] [SC-160297][CORE] Remove private function totalRunningTasksPerResourceProfile from ExecutorAllocationManager
  • [SPARK-47422] [SC-160219][SQL] Support collated strings in array operations
  • [SPARK-47500] [SC-160627][PYTHON][CONNECT] Factor column name handling out of plan.py
  • [SPARK-47383] [SC-160144][CORE] Support spark.shutdown.timeout config
  • [SPARK-47342] [SC-159049]Revert “[SQL] Support TimestampNTZ for DB2 TIMESTAMP WITH TIME ZONE”
  • [SPARK-47486] [SC-160491][CONNECT] Remove unused private ArrowDeserializers.getString method
  • [SPARK-47233] [SC-154486][CONNECT][SS][2/2] Client & Server logic for Client side streaming query listener
  • [SPARK-47487] [SC-160534][SQL] Simplify code in AnsiTypeCoercion
  • [SPARK-47443] [SC-160459][SQL] Window Aggregate support for collations
  • [SPARK-47296] [SC-160457][SQL][COLLATION] Fail unsupported functions for non-binary collations
  • [SPARK-47380] [SC-160164][CONNECT] Ensure on the server side that the SparkSession is the same
  • [SPARK-47327] [SC-160069][SQL] Move sort keys concurrency test to CollationFactorySuite
  • [SPARK-47494] [SC-160495][Doc] Add migration doc for the behavior change of Parquet timestamp inference since Spark 3.3
  • [SPARK-47449] [SC-160372][SS] Refactor and split list/timer unit tests
  • [SPARK-46473] [SC-155663][SQL] Reuse getPartitionedFile method
  • [SPARK-47423] [SC-160068][SQL] Collations - Set operation support for strings with collations
  • [SPARK-47439] [SC-160115][PYTHON] Document Python Data Source API in API reference page
  • [SPARK-47457] [SC-160234][SQL] Fix IsolatedClientLoader.supportsHadoopShadedClient to handle Hadoop 3.4+
  • [SPARK-47366] [SC-159348][SQL] Implement parse_json.
  • [SPARK-46331] [SC-152982][SQL] Removing CodegenFallback from subset of DateTime expressions and version() expression
  • [SPARK-47395] [SC-159404] Add collate and collation to other APIs
  • [SPARK-47437] [SC-160117][PYTHON][CONNECT] Correct the error class for DataFrame.sort*
  • [SPARK-47174] [SC-154483][CONNECT][SS][1/2] Server side SparkConnectListenerBusListener for Client side streaming query listener
  • [SPARK-47324] [SC-158720][SQL] Add missing timestamp conversion for JDBC nested types
  • [SPARK-46962] [SC-158834][SS][PYTHON] Add interface for python streaming data source API and implement python worker to run python streaming data source
  • [SPARK-45827] [SC-158498][SQL] Move data type checks to CreatableRelationProvider
  • [SPARK-47342] [SC-158874][SQL] Support TimestampNTZ for DB2 TIMESTAMP WITH TIME ZONE
  • [SPARK-47399] [SC-159378][SQL] Disable generated columns on expressions with collations
  • [SPARK-47146] [SC-158247][CORE] Possible thread leak when doing sort merge join
  • [SPARK-46913] [SC-159149][SS] Add support for processing/event time based timers with transformWithState operator
  • [SPARK-47375] [SC-159063][SQL] Add guidelines for timestamp mapping in JdbcDialect#getCatalystType
  • [SPARK-47394] [SC-159282][SQL] Support TIMESTAMP WITH TIME ZONE for H2Dialect
  • [SPARK-45827] Revert “[SC-158498][SQL] Move data type checks to …
  • [SPARK-47208] [SC-159279][CORE] Allow overriding base overhead memory
  • [SPARK-42627] [SC-158021][SPARK-26494][SQL] Support Oracle TIMESTAMP WITH LOCAL TIME ZONE
  • [SPARK-47055] [SC-156916][PYTHON] Upgrade MyPy 1.8.0
  • [SPARK-46906] [SC-157205][SS] Add a check for stateful operator change for streaming
  • [SPARK-47391] [SC-159283][SQL] Remove the test case workaround for JDK 8
  • [SPARK-47272] [SC-158960][SS] Add MapState implementation for State API v2.
  • [SPARK-47375] [SC-159278][Doc][FollowUp] Fix a mistake in JDBC’s preferTimestampNTZ option doc
  • [SPARK-42328] [SC-157363][SQL] Remove _LEGACY_ERROR_TEMP_1175 from error classes
  • [SPARK-47375] [SC-159261][Doc][FollowUp] Correct the preferTimestampNTZ option description in JDBC doc
  • [SPARK-47344] [SC-159146] Extend INVALID_IDENTIFIER error beyond catching ‘-‘ in an unquoted identifier and fix “IS ! NULL” et al.
  • [SPARK-47340] [SC-159039][SQL] Change “collate” in StringType typename to lowercase
  • [SPARK-47087] [SC-157077][SQL] Raise Spark’s exception with an error class in config value check
  • [SPARK-47327] [SC-158824][SQL] Fix thread safety issue in ICU Collator
  • [SPARK-47082] [SC-157058][SQL] Fix out-of-bounds error condition
  • [SPARK-47331] [SC-158719][SS] Serialization using case classes/primitives/POJO based on SQL encoder for Arbitrary State API v2.
  • [SPARK-47250] [SC-158840][SS] Add additional validations and NERF changes for RocksDB state provider and use of column families
  • [SPARK-47328] [SC-158745][SQL] Rename UCS_BASIC collation to UTF8_BINARY
  • [SPARK-47207] [SC-157845][CORE] Support spark.driver.timeout and DriverTimeoutPlugin
  • [SPARK-47370] [SC-158956][Doc] Add migration doc: TimestampNTZ type inference on Parquet files
  • [SPARK-47309] [SC-158827][SQL][XML] Add schema inference unit tests
  • [SPARK-47295] [SC-158850][SQL] Added ICU StringSearch for the startsWith and endsWith functions
  • [SPARK-47343] [SC-158851][SQL] Fix NPE when sqlString variable value is null string in execute immediate
  • [SPARK-46293] [SC-150117][CONNECT][PYTHON] Use protobuf transitive dependency
  • [SPARK-46795] [SC-154143][SQL] Replace UnsupportedOperationException by SparkUnsupportedOperationException in sql/core
  • [SPARK-46087] [SC-149023][PYTHON] Sync PySpark dependencies in docs and dev requirements
  • [SPARK-47169] [SC-158848][SQL] Disable bucketing on collated columns
  • [SPARK-42332] [SC-153996][SQL] Changing the require to a SparkException in ComplexTypeMergingExpression
  • [SPARK-45827] [SC-158498][SQL] Move data type checks to CreatableRelationProvider
  • [SPARK-47341] [SC-158825][Connect] Replace commands with relations in a few tests in SparkConnectClientSuite
  • [SPARK-43255] [SC-158026][SQL] Replace the error class _LEGACY_ERROR_TEMP_2020 by an internal error
  • [SPARK-47248] [SC-158494][SQL][COLLATION] Improved string function support: contains
  • [SPARK-47334] [SC-158716][SQL] Make withColumnRenamed reuse the implementation of withColumnsRenamed
  • [SPARK-46442] [SC-153168][SQL] DS V2 supports push down PERCENTILE_CONT and PERCENTILE_DISC
  • [SPARK-47313] [SC-158747][SQL] Added scala.MatchError handling inside QueryExecution.toInternalError
  • [SPARK-45827] [SC-158732][SQL] Add variant singleton type for Java
  • [SPARK-47337] [SC-158743][SQL][DOCKER] Upgrade DB2 docker image version to 11.5.8.0
  • [SPARK-47302] [SC-158609][SQL] Collate keyword as identifier
  • [SPARK-46817] [SC-154196][CORE] Fix spark-daemon.sh usage by adding decommission command
  • [SPARK-46739] [SC-153553][SQL] Add the error class UNSUPPORTED_CALL
  • [SPARK-47102] [SC-158253][SQL] Add the COLLATION_ENABLED config flag
  • [SPARK-46774] [SC-153925][SQL][AVRO] Use mapreduce.output.fileoutputformat.compress instead of deprecated mapred.output.compress in Avro write jobs
  • [SPARK-45245] [SC-146961][PYTHON][CONNECT] PythonWorkerFactory: Timeout if worker does not connect back.
  • [SPARK-46835] [SC-158355][SQL][Collations] Join support for non-binary collations
  • [SPARK-47131] [SC-158154][SQL][COLLATION] String function support: contains, startswith, endswith
  • [SPARK-46077] [SC-157839][SQL] Consider the type generated by TimestampNTZConverter in JdbcDialect.compileValue.
  • [SPARK-47311] [SC-158465][SQL][PYTHON] Suppress Python exceptions where PySpark is not in the Python path
  • [SPARK-47319] [SC-158599][SQL] Improve missingInput calculation
  • [SPARK-47316] [SC-158606][SQL] Fix TimestampNTZ in Postgres Array
  • [SPARK-47268] [SC-158158][SQL][Collations] Support for repartition with collations
  • [SPARK-47191] [SC-157831][SQL] Avoid unnecessary relation lookup when uncaching table/view
  • [SPARK-47168] [SC-158257][SQL] Disable parquet filter pushdown when working with non default collated strings
  • [SPARK-47236] [SC-158015][CORE] Fix deleteRecursivelyUsingJavaIO to skip non-existing file input
  • [SPARK-47238] [SC-158466][SQL] Reduce executor memory usage by making generated code in WSCG a broadcast variable
  • [SPARK-47249] [SC-158133][CONNECT] Fix bug where all connect executions are considered abandoned regardless of their actual status
  • [SPARK-47202] [SC-157828][PYTHON] Fix typo breaking datetimes with tzinfo
  • [SPARK-46834] [SC-158139][SQL][Collations] Support for aggregates
  • [SPARK-47277] [SC-158351][3.5] PySpark util function assertDataFrameEqual should not support streaming DF
  • [SPARK-47155] [SC-158473][PYTHON] Fix Error Class Issue
  • [SPARK-47245] [SC-158163][SQL] Improve error code for INVALID_PARTITION_COLUMN_DATA_TYPE
  • [SPARK-39771] [SC-158425][CORE] Add a warning msg in Dependency when a too large number of shuffle blocks is to be created.
  • [SPARK-47277] [SC-158329] PySpark util function assertDataFrameEqual should not support streaming DF
  • [SPARK-47293] [SC-158356][CORE] Build batchSchema with sparkSchema instead of append one by one
  • [SPARK-46732] [SC-153517][CONNECT] Make Subquery/Broadcast thread work with Connect’s artifact management
  • [SPARK-44746] [SC-158332][PYTHON] Add more Python UDTF documentation for functions that accept input tables
  • [SPARK-47120] [SC-157517][SQL] Null comparison push down data filter from subquery produces in NPE in Parquet filter
  • [SPARK-47251] [SC-158291][PYTHON] Block invalid types from the args argument for sql command
  • [SPARK-47251] Revert “[SC-158121][PYTHON] Block invalid types from the args argument for sql command”
  • [SPARK-47015] [SC-157900][SQL] Disable partitioning on collated columns
  • [SPARK-46846] [SC-154308][CORE] Make WorkerResourceInfo extend Serializable explicitly
  • [SPARK-46641] [SC-156314][SS] Add maxBytesPerTrigger threshold
  • [SPARK-47244] [SC-158122][CONNECT] SparkConnectPlanner make internal functions private
  • [SPARK-47266] [SC-158146][CONNECT] Make ProtoUtils.abbreviate return the same type as the input
  • [SPARK-46961] [SC-158183][SS] Using ProcessorContext to store and retrieve handle
  • [SPARK-46862] [SC-154548][SQL] Disable CSV column pruning in the multi-line mode
  • [SPARK-46950] [SC-155803][CORE][SQL] Align not available codec error-class
  • [SPARK-46368] [SC-153236][CORE] Support readyz in REST Submission API
  • [SPARK-46806] [SC-154108][PYTHON] Improve error message for spark.table when argument type is wrong
  • [SPARK-47211] [SC-158008][CONNECT][PYTHON] Fix ignored PySpark Connect string collation
  • [SPARK-46552] [SC-151366][SQL] Replace UnsupportedOperationException by SparkUnsupportedOperationException in catalyst
  • [SPARK-47147] [SC-157842][PYTHON][SQL] Fix PySpark collated string conversion error
  • [SPARK-47144] [SC-157826][CONNECT][SQL][PYTHON] Fix Spark Connect collation error by adding collateId protobuf field
  • [SPARK-46575] [SC-153200][SQL][HIVE] Make HiveThriftServer2.startWithContext DevelopApi retriable and fix flakiness of ThriftServerWithSparkContextInHttpSuite
  • [SPARK-46696] [SC-153832][CORE] In ResourceProfileManager, function calls should occur after variable declarations
  • [SPARK-47214] [SC-157862][Python] Create UDTF API for ‘analyze’ method to differentiate constant NULL arguments and other types of arguments
  • [SPARK-46766] [SC-153909][SQL][AVRO] ZSTD Buffer Pool Support For AVRO datasource
  • [SPARK-47192] [SC-157819] Convert some _LEGACY_ERROR_TEMP_0035 errors
  • [SPARK-46928] [SC-157341][SS] Add support for ListState in Arbitrary State API v2.
  • [SPARK-46881] [SC-154612][CORE] Support spark.deploy.workerSelectionPolicy
  • [SPARK-46800] [SC-154107][CORE] Support spark.deploy.spreadOutDrivers
  • [SPARK-45484] [SC-146014][SQL] Fix the bug that uses incorrect parquet compression codec lz4raw
  • [SPARK-46791] [SC-154018][SQL] Support Java Set in JavaTypeInference
  • [SPARK-46332] [SC-150224][SQL] Migrate CatalogNotFoundException to the error class CATALOG_NOT_FOUND
  • [SPARK-47164] [SC-157616][SQL] Make Default Value From Wider Type Narrow Literal of v2 behave the same as v1
  • [SPARK-46664] [SC-153181][CORE] Improve Master to recover quickly in case of zero workers and apps
  • [SPARK-46759] [SC-153839][SQL][AVRO] Codec xz and zstandard support compression level for avro files

Databricks ODBC/JDBC driver support

Databricks supports ODBC/JDBC drivers released in the past 2 years. Download the most recently released drivers and upgrade (download ODBC, download JDBC).

System environment

  • Operating System: Ubuntu 22.04.4 LTS
  • Java: Zulu 8.74.0.17-CA-linux64
  • Scala: 2.12.15
  • Python: 3.11.0
  • R: 4.3.2
  • Delta Lake: 3.2.0

Installed Python libraries

Library Version Library Version Library Version
asttokens 2.0.5 astunparse 1.6.3 azure-core 1.30.1
azure-storage-blob 12.19.1 azure-storage-file-datalake 12.14.0 backcall 0.2.0
black 23.3.0 blinker 1.4 boto3 1.34.39
botocore 1.34.39 cachetools 5.3.3 certifi 2023.7.22
cffi 1.15.1 chardet 4.0.0 charset-normalizer 2.0.4
click 8.0.4 cloudpickle 2.2.1 comm 0.1.2
contourpy 1.0.5 cryptography 41.0.3 cycler 0.11.0
Cython 0.29.32 databricks-sdk 0.20.0 dbus-python 1.2.18
debugpy 1.6.7 decorator 5.1.1 distlib 0.3.8
entrypoints 0.4 executing 0.8.3 facets-overview 1.1.1
filelock 3.13.1 fonttools 4.25.0 gitdb 4.0.11
GitPython 3.1.43 google-api-core 2.18.0 google-auth 2.29.0
google-cloud-core 2.4.1 google-cloud-storage 2.16.0 google-crc32c 1.5.0
google-resumable-media 2.7.0 googleapis-common-protos 1.63.0 grpcio 1.60.0
grpcio-status 1.60.0 httplib2 0.20.2 idna 3.4
importlib-metadata 6.0.0 ipyflow-core 0.0.198 ipykernel 6.25.1
ipython 8.15.0 ipython-genutils 0.2.0 ipywidgets 7.7.2
isodate 0.6.1 jedi 0.18.1 jeepney 0.7.1
jmespath 0.10.0 joblib 1.2.0 jupyter_client 7.4.9
jupyter_core 5.3.0 keyring 23.5.0 kiwisolver 1.4.4
launchpadlib 1.10.16 lazr.restfulclient 0.14.4 lazr.uri 1.0.6
matplotlib 3.7.2 matplotlib-inline 0.1.6 mlflow-skinny 2.11.3
more-itertools 8.10.0 mypy-extensions 0.4.3 nest-asyncio 1.5.6
numpy 1.23.5 oauthlib 3.2.0 packaging 23.2
pandas 1.5.3 parso 0.8.3 pathspec 0.10.3
patsy 0.5.3 pexpect 4.8.0 pickleshare 0.7.5
Pillow 9.4.0 pip 23.2.1 platformdirs 3.10.0
plotly 5.9.0 prompt-toolkit 3.0.36 proto-plus 1.23.0
protobuf 4.24.1 psutil 5.9.0 psycopg2 2.9.3
ptyprocess 0.7.0 pure-eval 0.2.2 pyarrow 14.0.1
pyasn1 0.4.8 pyasn1-modules 0.2.8 pyccolo 0.0.52
pycparser 2.21 pydantic 1.10.6 Pygments 2.15.1
PyGObject 3.42.1 PyJWT 2.3.0 pyodbc 4.0.38
pyparsing 3.0.9 python-dateutil 2.8.2 python-lsp-jsonrpc 1.1.1
pytz 2022.7 PyYAML 6.0 pyzmq 23.2.0
requests 2.31.0 rsa 4.9 s3transfer 0.10.1
scikit-learn 1.3.0 scipy 1.11.1 seaborn 0.12.2
SecretStorage 3.3.1 setuptools 68.0.0 six 1.16.0
smmap 5.0.1 sqlparse 0.5.0 ssh-import-id 5.11
stack-data 0.2.0 statsmodels 0.14.0 tenacity 8.2.2
threadpoolctl 2.2.0 tokenize-rt 4.2.1 tornado 6.3.2
traitlets 5.7.1 typing_extensions 4.10.0 tzdata 2022.1
ujson 5.4.0 unattended-upgrades 0.1 urllib3 1.26.16
virtualenv 20.24.2 wadllib 1.3.6 wcwidth 0.2.5
wheel 0.38.4 zipp 3.11.0
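
The table above can be used to check whether a Python environment matches the versions listed for this runtime. The following is a minimal sketch, not an official Databricks tool: the helper names `parse_library_rows` and `find_mismatches` are hypothetical, and it assumes each data row alternates library name and version, as in the table (the header row is left out). It uses only the standard-library `importlib.metadata` module to look up installed versions.

```python
from importlib import metadata


def parse_library_rows(rows):
    """Turn 'name version name version ...' rows into a {name: version} dict."""
    versions = {}
    for row in rows:
        fields = row.split()
        # Fields alternate name, version; zip drops a trailing unpaired field.
        for name, version in zip(fields[0::2], fields[1::2]):
            versions[name] = version
    return versions


def find_mismatches(expected, installed_version=metadata.version):
    """Return {name: (expected, actual)} for packages whose version differs.

    `actual` is None when the package is not installed in this environment.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = installed_version(name)
        except metadata.PackageNotFoundError:
            have = None
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


# A few rows copied from the table above (header row omitted).
rows = [
    "numpy 1.23.5 oauthlib 3.2.0 packaging 23.2",
    "pandas 1.5.3 parso 0.8.3 pathspec 0.10.3",
    "wheel 0.38.4 zipp 3.11.0",
]
expected = parse_library_rows(rows)

# Compare against the current environment and report any differences.
for name, (want, have) in sorted(find_mismatches(expected).items()):
    print(f"{name}: listed {want}, installed {have}")
```

Some entries in the table, such as dbus-python or unattended-upgrades, are likely provided by the operating system rather than pip, so a local environment may legitimately report them as missing.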

Installed R libraries

R libraries are installed from the Posit Package Manager CRAN snapshot.

Library Version Library Version Library Version
arrow 14.0.0.2 askpass 1.2.0 assertthat 0.2.1
backports 1.4.1 base 4.3.2 base64enc 0.1-3
bigD 0.2.0 bit 4.0.5 bit64 4.0.5
bitops 1.0-7 blob 1.2.4 boot 1.3-28
brew 1.0-10 brio 1.1.4 broom 1.0.5
bslib 0.6.1 cachem 1.0.8 callr 3.7.3
caret 6.0-94 cellranger 1.1.0 chron 2.3-61
class 7.3-22 cli 3.6.2 clipr 0.8.0
clock 0.7.0 cluster 2.1.4 codetools 0.2-19
colorspace 2.1-0 commonmark 1.9.1 compiler 4.3.2
config 0.3.2 conflicted 1.2.0 cpp11 0.4.7
crayon 1.5.2 credentials 2.0.1 curl 5.2.0
data.table 1.15.0 datasets 4.3.2 DBI 1.2.1
dbplyr 2.4.0 desc 1.4.3 devtools 2.4.5
diagram 1.6.5 diffobj 0.3.5 digest 0.6.34
downlit 0.4.3 dplyr 1.1.4 dtplyr 1.3.1
e1071 1.7-14 ellipsis 0.3.2 evaluate 0.23
fansi 1.0.6 farver 2.1.1 fastmap 1.1.1
fontawesome 0.5.2 forcats 1.0.0 foreach 1.5.2
foreign 0.8-85 forge 0.2.0 fs 1.6.3
future 1.33.1 future.apply 1.11.1 gargle 1.5.2
generics 0.1.3 gert 2.0.1 ggplot2 3.4.4
gh 1.4.0 git2r 0.33.0 gitcreds 0.1.2
glmnet 4.1-8 globals 0.16.2 glue 1.7.0
googledrive 2.1.1 googlesheets4 1.1.1 gower 1.0.1
graphics 4.3.2 grDevices 4.3.2 grid 4.3.2
gridExtra 2.3 gsubfn 0.7 gt 0.10.1
gtable 0.3.4 hardhat 1.3.1 haven 2.5.4
highr 0.10 hms 1.1.3 htmltools 0.5.7
htmlwidgets 1.6.4 httpuv 1.6.14 httr 1.4.7
httr2 1.0.0 ids 1.0.1 ini 0.3.1
ipred 0.9-14 isoband 0.2.7 iterators 1.0.14
jquerylib 0.1.4 jsonlite 1.8.8 juicyjuice 0.1.0
KernSmooth 2.23-21 knitr 1.45 labeling 0.4.3
later 1.3.2 lattice 0.21-8 lava 1.7.3
lifecycle 1.0.4 listenv 0.9.1 lubridate 1.9.3
magrittr 2.0.3 markdown 1.12 MASS 7.3-60
Matrix 1.5-4.1 memoise 2.0.1 methods 4.3.2
mgcv 1.8-42 mime 0.12 miniUI 0.1.1.1
mlflow 2.10.0 ModelMetrics 1.2.2.2 modelr 0.1.11
munsell 0.5.0 nlme 3.1-163 nnet 7.3-19
numDeriv 2016.8-1.1 openssl 2.1.1 parallel 4.3.2
parallelly 1.36.0 pillar 1.9.0 pkgbuild 1.4.3
pkgconfig 2.0.3 pkgdown 2.0.7 pkgload 1.3.4
plogr 0.2.0 plyr 1.8.9 praise 1.0.0
prettyunits 1.2.0 pROC 1.18.5 processx 3.8.3
prodlim 2023.08.28 profvis 0.3.8 progress 1.2.3
progressr 0.14.0 promises 1.2.1 proto 1.0.0
proxy 0.4-27 ps 1.7.6 purrr 1.0.2
R6 2.5.1 ragg 1.2.7 randomForest 4.7-1.1
rappdirs 0.3.3 rcmdcheck 1.4.0 RColorBrewer 1.1-3
Rcpp 1.0.12 RcppEigen 0.3.3.9.4 reactable 0.4.4
reactR 0.5.0 readr 2.1.5 readxl 1.4.3
recipes 1.0.9 rematch 2.0.0 rematch2 2.1.2
remotes 2.4.2.1 reprex 2.1.0 reshape2 1.4.4
rlang 1.1.3 rmarkdown 2.25 RODBC 1.3-23
roxygen2 7.3.1 rpart 4.1.21 rprojroot 2.0.4
Rserve 1.8-13 RSQLite 2.3.5 rstudioapi 0.15.0
rversions 2.1.2 rvest 1.0.3 sass 0.4.8
scales 1.3.0 selectr 0.4-2 sessioninfo 1.2.2
shape 1.4.6 shiny 1.8.0 sourcetools 0.1.7-1
sparklyr 1.8.4 spatial 7.3-15 splines 4.3.2
sqldf 0.4-11 SQUAREM 2021.1 stats 4.3.2
stats4 4.3.2 stringi 1.8.3 stringr 1.5.1
survival 3.5-5 swagger 3.33.1 sys 3.4.2
systemfonts 1.0.5 tcltk 4.3.2 testthat 3.2.1
textshaping 0.3.7 tibble 3.2.1 tidyr 1.3.1
tidyselect 1.2.0 tidyverse 2.0.0 timechange 0.3.0
timeDate 4032.109 tinytex 0.49 tools 4.3.2
tzdb 0.4.0 urlchecker 1.0.1 usethis 2.2.2
utf8 1.2.4 utils 4.3.2 uuid 1.2-0
V8 4.4.1 vctrs 0.6.5 viridisLite 0.4.2
vroom 1.6.5 waldo 0.5.2 whisker 0.4.1
withr 3.0.0 xfun 0.41 xml2 1.3.6
xopen 1.0.0 xtable 1.8-4 yaml 2.3.8
zeallot 0.1.0 zip 2.3.1

Installed Java and Scala libraries (Scala 2.12 cluster version)

Group ID Artifact ID Version
antlr antlr 2.7.7
com.amazonaws amazon-kinesis-client 1.12.0
com.amazonaws aws-java-sdk-autoscaling 1.12.610
com.amazonaws aws-java-sdk-cloudformation 1.12.610
com.amazonaws aws-java-sdk-cloudfront 1.12.610
com.amazonaws aws-java-sdk-cloudhsm 1.12.610
com.amazonaws aws-java-sdk-cloudsearch 1.12.610
com.amazonaws aws-java-sdk-cloudtrail 1.12.610
com.amazonaws aws-java-sdk-cloudwatch 1.12.610
com.amazonaws aws-java-sdk-cloudwatchmetrics 1.12.610
com.amazonaws aws-java-sdk-codedeploy 1.12.610
com.amazonaws aws-java-sdk-cognitoidentity 1.12.610
com.amazonaws aws-java-sdk-cognitosync 1.12.610
com.amazonaws aws-java-sdk-config 1.12.610
com.amazonaws aws-java-sdk-core 1.12.610
com.amazonaws aws-java-sdk-datapipeline 1.12.610
com.amazonaws aws-java-sdk-directconnect 1.12.610
com.amazonaws aws-java-sdk-directory 1.12.610
com.amazonaws aws-java-sdk-dynamodb 1.12.610
com.amazonaws aws-java-sdk-ec2 1.12.610
com.amazonaws aws-java-sdk-ecs 1.12.610
com.amazonaws aws-java-sdk-efs 1.12.610
com.amazonaws aws-java-sdk-elasticache 1.12.610
com.amazonaws aws-java-sdk-elasticbeanstalk 1.12.610
com.amazonaws aws-java-sdk-elasticloadbalancing 1.12.610
com.amazonaws aws-java-sdk-elastictranscoder 1.12.610
com.amazonaws aws-java-sdk-emr 1.12.610
com.amazonaws aws-java-sdk-glacier 1.12.610
com.amazonaws aws-java-sdk-glue 1.12.610
com.amazonaws aws-java-sdk-iam 1.12.610
com.amazonaws aws-java-sdk-importexport 1.12.610
com.amazonaws aws-java-sdk-kinesis 1.12.610
com.amazonaws aws-java-sdk-kms 1.12.610
com.amazonaws aws-java-sdk-lambda 1.12.610
com.amazonaws aws-java-sdk-logs 1.12.610
com.amazonaws aws-java-sdk-machinelearning 1.12.610
com.amazonaws aws-java-sdk-opsworks 1.12.610
com.amazonaws aws-java-sdk-rds 1.12.610
com.amazonaws aws-java-sdk-redshift 1.12.610
com.amazonaws aws-java-sdk-route53 1.12.610
com.amazonaws aws-java-sdk-s3 1.12.610
com.amazonaws aws-java-sdk-ses 1.12.610
com.amazonaws aws-java-sdk-simpledb 1.12.610
com.amazonaws aws-java-sdk-simpleworkflow 1.12.610
com.amazonaws aws-java-sdk-sns 1.12.610
com.amazonaws aws-java-sdk-sqs 1.12.610
com.amazonaws aws-java-sdk-ssm 1.12.610
com.amazonaws aws-java-sdk-storagegateway 1.12.610
com.amazonaws aws-java-sdk-sts 1.12.610
com.amazonaws aws-java-sdk-support 1.12.610
com.amazonaws aws-java-sdk-swf-libraries 1.11.22
com.amazonaws aws-java-sdk-workspaces 1.12.610
com.amazonaws jmespath-java 1.12.610
com.clearspring.analytics stream 2.9.6
com.databricks Rserve 1.8-3
com.databricks databricks-sdk-java 0.17.1
com.databricks jets3t 0.7.1-0
com.databricks.scalapb compilerplugin_2.12 0.4.15-10
com.databricks.scalapb scalapb-runtime_2.12 0.4.15-10
com.esotericsoftware kryo-shaded 4.0.2
com.esotericsoftware minlog 1.3.0
com.fasterxml classmate 1.3.4
com.fasterxml.jackson.core jackson-annotations 2.15.2
com.fasterxml.jackson.core jackson-core 2.15.2
com.fasterxml.jackson.core jackson-databind 2.15.2
com.fasterxml.jackson.dataformat jackson-dataformat-cbor 2.15.2
com.fasterxml.jackson.dataformat jackson-dataformat-yaml 2.15.2
com.fasterxml.jackson.datatype jackson-datatype-joda 2.15.2
com.fasterxml.jackson.datatype jackson-datatype-jsr310 2.16.0
com.fasterxml.jackson.module jackson-module-paranamer 2.15.2
com.fasterxml.jackson.module jackson-module-scala_2.12 2.15.2
com.github.ben-manes.caffeine caffeine 2.9.3
com.github.fommil jniloader 1.1
com.github.fommil.netlib native_ref-java 1.1
com.github.fommil.netlib native_ref-java 1.1-natives
com.github.fommil.netlib native_system-java 1.1
com.github.fommil.netlib native_system-java 1.1-natives
com.github.fommil.netlib netlib-native_ref-linux-x86_64 1.1-natives
com.github.fommil.netlib netlib-native_system-linux-x86_64 1.1-natives
com.github.luben zstd-jni 1.5.5-4
com.github.wendykierp JTransforms 3.1
com.google.code.findbugs jsr305 3.0.0
com.google.code.gson gson 2.10.1
com.google.crypto.tink tink 1.9.0
com.google.errorprone error_prone_annotations 2.10.0
com.google.flatbuffers flatbuffers-java 23.5.26
com.google.guava guava 15.0
com.google.protobuf protobuf-java 2.6.1
com.helger profiler 1.1.1
com.ibm.icu icu4j 72.1
com.jcraft jsch 0.1.55
com.jolbox bonecp 0.8.0.RELEASE
com.lihaoyi sourcecode_2.12 0.1.9
com.microsoft.azure azure-data-lake-store-sdk 2.3.9
com.microsoft.sqlserver mssql-jdbc 11.2.2.jre8
com.ning compress-lzf 1.1.2
com.sun.mail javax.mail 1.5.2
com.sun.xml.bind jaxb-core 2.2.11
com.sun.xml.bind jaxb-impl 2.2.11
com.tdunning json 1.8
com.thoughtworks.paranamer paranamer 2.8
com.trueaccord.lenses lenses_2.12 0.4.12
com.twitter chill-java 0.10.0
com.twitter chill_2.12 0.10.0
com.twitter util-app_2.12 7.1.0
com.twitter util-core_2.12 7.1.0
com.twitter util-function_2.12 7.1.0
com.twitter util-jvm_2.12 7.1.0
com.twitter util-lint_2.12 7.1.0
com.twitter util-registry_2.12 7.1.0
com.twitter util-stats_2.12 7.1.0
com.typesafe config 1.4.3
com.typesafe.scala-logging scala-logging_2.12 3.7.2
com.uber h3 3.7.3
com.univocity univocity-parsers 2.9.1
com.zaxxer HikariCP 4.0.3
commons-cli commons-cli 1.5.0
commons-codec commons-codec 1.16.0
commons-collections commons-collections 3.2.2
commons-dbcp commons-dbcp 1.4
commons-fileupload commons-fileupload 1.5
commons-httpclient commons-httpclient 3.1
commons-io commons-io 2.13.0
commons-lang commons-lang 2.6
commons-logging commons-logging 1.1.3
commons-pool commons-pool 1.5.4
dev.ludovic.netlib arpack 3.0.3
dev.ludovic.netlib blas 3.0.3
dev.ludovic.netlib lapack 3.0.3
info.ganglia.gmetric4j gmetric4j 1.0.10
io.airlift aircompressor 0.25
io.delta delta-sharing-client_2.12 1.0.5
io.dropwizard.metrics metrics-annotation 4.2.19
io.dropwizard.metrics metrics-core 4.2.19
io.dropwizard.metrics metrics-graphite 4.2.19
io.dropwizard.metrics metrics-healthchecks 4.2.19
io.dropwizard.metrics metrics-jetty9 4.2.19
io.dropwizard.metrics metrics-jmx 4.2.19
io.dropwizard.metrics metrics-json 4.2.19
io.dropwizard.metrics metrics-jvm 4.2.19
io.dropwizard.metrics metrics-servlets 4.2.19
io.netty netty-all 4.1.96.Final
io.netty netty-buffer 4.1.96.Final
io.netty netty-codec 4.1.96.Final
io.netty netty-codec-http 4.1.96.Final
io.netty netty-codec-http2 4.1.96.Final
io.netty netty-codec-socks 4.1.96.Final
io.netty netty-common 4.1.96.Final
io.netty netty-handler 4.1.96.Final
io.netty netty-handler-proxy 4.1.96.Final
io.netty netty-resolver 4.1.96.Final
io.netty netty-tcnative-boringssl-static 2.0.61.Final-linux-aarch_64
io.netty netty-tcnative-boringssl-static 2.0.61.Final-linux-x86_64
io.netty netty-tcnative-boringssl-static 2.0.61.Final-osx-aarch_64
io.netty netty-tcnative-boringssl-static 2.0.61.Final-osx-x86_64
io.netty netty-tcnative-boringssl-static 2.0.61.Final-windows-x86_64
io.netty netty-tcnative-classes 2.0.61.Final
io.netty netty-transport 4.1.96.Final
io.netty netty-transport-classes-epoll 4.1.96.Final
io.netty netty-transport-classes-kqueue 4.1.96.Final
io.netty netty-transport-native-epoll 4.1.96.Final
io.netty netty-transport-native-epoll 4.1.96.Final-linux-aarch_64
io.netty netty-transport-native-epoll 4.1.96.Final-linux-x86_64
io.netty netty-transport-native-kqueue 4.1.96.Final-osx-aarch_64
io.netty netty-transport-native-kqueue 4.1.96.Final-osx-x86_64
io.netty netty-transport-native-unix-common 4.1.96.Final
io.prometheus simpleclient 0.7.0
io.prometheus simpleclient_common 0.7.0
io.prometheus simpleclient_dropwizard 0.7.0
io.prometheus simpleclient_pushgateway 0.7.0
io.prometheus simpleclient_servlet 0.7.0
io.prometheus.jmx collector 0.12.0
jakarta.annotation jakarta.annotation-api 1.3.5
jakarta.servlet jakarta.servlet-api 4.0.3
jakarta.validation jakarta.validation-api 2.0.2
jakarta.ws.rs jakarta.ws.rs-api 2.1.6
javax.activation activation 1.1.1
javax.el javax.el-api 2.2.4
javax.jdo jdo-api 3.0.1
javax.transaction jta 1.1
javax.transaction transaction-api 1.1
javax.xml.bind jaxb-api 2.2.11
javolution javolution 5.5.1
jline jline 2.14.6
joda-time joda-time 2.12.1
net.java.dev.jna jna 5.8.0
net.razorvine pickle 1.3
net.sf.jpam jpam 1.1
net.sf.opencsv opencsv 2.3
net.sf.supercsv super-csv 2.2.0
net.snowflake snowflake-ingest-sdk 0.9.6
net.sourceforge.f2j arpack_combined_all 0.1
org.acplt.remotetea remotetea-oncrpc 1.1.2
org.antlr ST4 4.0.4
org.antlr antlr-runtime 3.5.2
org.antlr antlr4-runtime 4.9.3
org.antlr stringtemplate 3.2.1
org.apache.ant ant 1.10.11
org.apache.ant ant-jsch 1.10.11
org.apache.ant ant-launcher 1.10.11
org.apache.arrow arrow-format 15.0.0
org.apache.arrow arrow-memory-core 15.0.0
org.apache.arrow arrow-memory-netty 15.0.0
org.apache.arrow arrow-vector 15.0.0
org.apache.avro avro 1.11.3
org.apache.avro avro-ipc 1.11.3
org.apache.avro avro-mapred 1.11.3
org.apache.commons commons-collections4 4.4
org.apache.commons commons-compress 1.23.0
org.apache.commons commons-crypto 1.1.0
org.apache.commons commons-lang3 3.12.0
org.apache.commons commons-math3 3.6.1
org.apache.commons commons-text 1.10.0
org.apache.curator curator-client 2.13.0
org.apache.curator curator-framework 2.13.0
org.apache.curator curator-recipes 2.13.0
org.apache.datasketches datasketches-java 3.1.0
org.apache.datasketches datasketches-memory 2.0.0
org.apache.derby derby 10.14.2.0
org.apache.hadoop hadoop-client-runtime 3.3.6
org.apache.hive hive-beeline 2.3.9
org.apache.hive hive-cli 2.3.9
org.apache.hive hive-jdbc 2.3.9
org.apache.hive hive-llap-client 2.3.9
org.apache.hive hive-llap-common 2.3.9
org.apache.hive hive-serde 2.3.9
org.apache.hive hive-shims 2.3.9
org.apache.hive hive-storage-api 2.8.1
org.apache.hive.shims hive-shims-0.23 2.3.9
org.apache.hive.shims hive-shims-common 2.3.9
org.apache.hive.shims hive-shims-scheduler 2.3.9
org.apache.httpcomponents httpclient 4.5.14
org.apache.httpcomponents httpcore 4.4.16
org.apache.ivy ivy 2.5.1
org.apache.logging.log4j log4j-1.2-api 2.22.1
org.apache.logging.log4j log4j-api 2.22.1
org.apache.logging.log4j log4j-core 2.22.1
org.apache.logging.log4j log4j-layout-template-json 2.22.1
org.apache.logging.log4j log4j-slf4j2-impl 2.22.1
org.apache.orc orc-core 1.9.2-shaded-protobuf
org.apache.orc orc-mapreduce 1.9.2-shaded-protobuf
org.apache.orc orc-shims 1.9.2
org.apache.thrift libfb303 0.9.3
org.apache.thrift libthrift 0.12.0
org.apache.ws.xmlschema xmlschema-core 2.3.0
org.apache.xbean xbean-asm9-shaded 4.23
org.apache.yetus audience-annotations 0.13.0
org.apache.zookeeper zookeeper 3.6.3
org.apache.zookeeper zookeeper-jute 3.6.3
org.checkerframework checker-qual 3.31.0
org.codehaus.jackson jackson-core-asl 1.9.13
org.codehaus.jackson jackson-mapper-asl 1.9.13
org.codehaus.janino commons-compiler 3.0.16
org.codehaus.janino janino 3.0.16
org.datanucleus datanucleus-api-jdo 4.2.4
org.datanucleus datanucleus-core 4.1.17
org.datanucleus datanucleus-rdbms 4.1.19
org.datanucleus javax.jdo 3.2.0-m3
org.eclipse.collections eclipse-collections 11.1.0
org.eclipse.collections eclipse-collections-api 11.1.0
org.eclipse.jetty jetty-client 9.4.52.v20230823
org.eclipse.jetty jetty-continuation 9.4.52.v20230823
org.eclipse.jetty jetty-http 9.4.52.v20230823
org.eclipse.jetty jetty-io 9.4.52.v20230823
org.eclipse.jetty jetty-jndi 9.4.52.v20230823
org.eclipse.jetty jetty-plus 9.4.52.v20230823
org.eclipse.jetty jetty-proxy 9.4.52.v20230823
org.eclipse.jetty jetty-security 9.4.52.v20230823
org.eclipse.jetty jetty-server 9.4.52.v20230823
org.eclipse.jetty jetty-servlet 9.4.52.v20230823
org.eclipse.jetty jetty-servlets 9.4.52.v20230823
org.eclipse.jetty jetty-util 9.4.52.v20230823
org.eclipse.jetty jetty-util-ajax 9.4.52.v20230823
org.eclipse.jetty jetty-webapp 9.4.52.v20230823
org.eclipse.jetty jetty-xml 9.4.52.v20230823
org.eclipse.jetty.websocket websocket-api 9.4.52.v20230823
org.eclipse.jetty.websocket websocket-client 9.4.52.v20230823
org.eclipse.jetty.websocket websocket-common 9.4.52.v20230823
org.eclipse.jetty.websocket websocket-server 9.4.52.v20230823
org.eclipse.jetty.websocket websocket-servlet 9.4.52.v20230823
org.fusesource.leveldbjni leveldbjni-all 1.8
org.glassfish.hk2 hk2-api 2.6.1
org.glassfish.hk2 hk2-locator 2.6.1
org.glassfish.hk2 hk2-utils 2.6.1
org.glassfish.hk2 osgi-resource-locator 1.0.3
org.glassfish.hk2.external aopalliance-repackaged 2.6.1
org.glassfish.hk2.external jakarta.inject 2.6.1
org.glassfish.jersey.containers jersey-container-servlet 2.40
org.glassfish.jersey.containers jersey-container-servlet-core 2.40
org.glassfish.jersey.core jersey-client 2.40
org.glassfish.jersey.core jersey-common 2.40
org.glassfish.jersey.core jersey-server 2.40
org.glassfish.jersey.inject jersey-hk2 2.40
org.hibernate.validator hibernate-validator 6.1.7.Final
org.ini4j ini4j 0.5.4
org.javassist javassist 3.29.2-GA
org.jboss.logging jboss-logging 3.3.2.Final
org.jdbi jdbi 2.63.1
org.jetbrains annotations 17.0.0
org.joda joda-convert 1.7
org.jodd jodd-core 3.5.2
org.json4s json4s-ast_2.12 3.7.0-M11
org.json4s json4s-core_2.12 3.7.0-M11
org.json4s json4s-jackson_2.12 3.7.0-M11
org.json4s json4s-scalap_2.12 3.7.0-M11
org.lz4 lz4-java 1.8.0
org.mlflow mlflow-spark_2.12 2.9.1
org.objenesis objenesis 2.5.1
org.postgresql postgresql 42.6.1
org.roaringbitmap RoaringBitmap 0.9.45-databricks
org.roaringbitmap shims 0.9.45-databricks
org.rocksdb rocksdbjni 8.3.2
org.rosuda.REngine REngine 2.1.0
org.scala-lang scala-compiler_2.12 2.12.15
org.scala-lang scala-library_2.12 2.12.15
org.scala-lang scala-reflect_2.12 2.12.15
org.scala-lang.modules scala-collection-compat_2.12 2.11.0
org.scala-lang.modules scala-parser-combinators_2.12 1.1.2
org.scala-lang.modules scala-xml_2.12 1.2.0
org.scala-sbt test-interface 1.0
org.scalacheck scalacheck_2.12 1.14.2
org.scalactic scalactic_2.12 3.2.15
org.scalanlp breeze-macros_2.12 2.1.0
org.scalanlp breeze_2.12 2.1.0
org.scalatest scalatest-compatible 3.2.15
org.scalatest scalatest-core_2.12 3.2.15
org.scalatest scalatest-diagrams_2.12 3.2.15
org.scalatest scalatest-featurespec_2.12 3.2.15
org.scalatest scalatest-flatspec_2.12 3.2.15
org.scalatest scalatest-freespec_2.12 3.2.15
org.scalatest scalatest-funspec_2.12 3.2.15
org.scalatest scalatest-funsuite_2.12 3.2.15
org.scalatest scalatest-matchers-core_2.12 3.2.15
org.scalatest scalatest-mustmatchers_2.12 3.2.15
org.scalatest scalatest-propspec_2.12 3.2.15
org.scalatest scalatest-refspec_2.12 3.2.15
org.scalatest scalatest-shouldmatchers_2.12 3.2.15
org.scalatest scalatest-wordspec_2.12 3.2.15
org.scalatest scalatest_2.12 3.2.15
org.slf4j jcl-over-slf4j 2.0.7
org.slf4j jul-to-slf4j 2.0.7
org.slf4j slf4j-api 2.0.7
org.slf4j slf4j-simple 1.7.25
org.threeten threeten-extra 1.7.1
org.tukaani xz 1.9
org.typelevel algebra_2.12 2.0.1
org.typelevel cats-kernel_2.12 2.1.1
org.typelevel spire-macros_2.12 0.17.0
org.typelevel spire-platform_2.12 0.17.0
org.typelevel spire-util_2.12 0.17.0
org.typelevel spire_2.12 0.17.0
org.wildfly.openssl wildfly-openssl 1.1.3.Final
org.xerial sqlite-jdbc 3.42.0.0
org.xerial.snappy snappy-java 1.1.10.3
org.yaml snakeyaml 2.0
oro oro 2.0.8
pl.edu.icm JLargeArrays 1.5
software.amazon.cryptools AmazonCorrettoCryptoProvider 1.6.1-linux-x86_64
software.amazon.ion ion-java 1.0.2
stax stax-api 1.0.1
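
When building a JAR against this runtime, it can help to express rows from the table above as `groupId:artifactId:version` coordinates so that a build declares the same versions the cluster provides. The following is a minimal sketch under that assumption; the helper name `to_maven_coordinate` is hypothetical. It treats the third column as an opaque version string, so cells that appear to carry a platform or classifier suffix (for example `4.1.96.Final-linux-x86_64`) are passed through unchanged rather than split.

```python
def to_maven_coordinate(row):
    """Convert a 'Group ID  Artifact ID  Version' row to group:artifact:version."""
    fields = row.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 whitespace-separated fields, got: {row!r}")
    group_id, artifact_id, version = fields
    return f"{group_id}:{artifact_id}:{version}"


# A few rows copied from the table above (header row omitted).
rows = [
    "org.apache.avro avro 1.11.3",
    "com.fasterxml.jackson.core jackson-databind 2.15.2",
    "org.scala-lang scala-library_2.12 2.12.15",
]

for row in rows:
    print(to_maven_coordinate(row))
```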