Databricks Runtime 9.0 (EoS)

Article
09/03/2024

Note

Support for this Databricks Runtime version has ended. For the end-of-support date, see End-of-support history. For all supported Databricks Runtime versions, see Databricks Runtime release notes versions and compatibility.

The following release notes provide information about Databricks Runtime 9.0 and Databricks Runtime 9.0 Photon, powered by Apache Spark 3.1.2. Databricks released this version in August 2021. Photon is in Public Preview.

Correction

A previous version of these release notes incorrectly stated that Apache Parquet dependencies were upgraded from 1.10 to 1.12. In fact, Parquet dependencies remain at version 1.10. The incorrect release note has been removed.

New features and improvements

New API for summary statistics of datasets (Public Preview)
Easier external data source configuration for the Azure Synapse connector
Optionally limit the session to a specified duration for the Amazon Redshift connector
Auto Loader
SQL
R support
Avoid redo by specifying initial state for Structured Streaming stateful processing
A low-shuffle implementation of the Delta MERGE INTO command is now available (Public Preview)

New API for summary statistics of datasets (Public Preview)

The new dbutils.data.summarize command in Databricks Utilities allows you to launch a Spark job that automatically computes summary statistics on the columns of a Spark DataFrame and then displays the results interactively. This function is available in Scala and Python. See Data utility (dbutils.data).

Easier external data source configuration for the Azure Synapse connector

The new externalDataSource option in the Azure Synapse connector (see Query data in Azure Synapse Analytics) allows you to use a pre-provisioned external data source to read from an Azure Synapse database. This option removes the need for the CONTROL permission that was previously required.

When setting externalDataSource, the external data source and the tempDir option used to configure temporary storage must reference the same container in the storage account.
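As a rough sketch of how these options might be assembled in Python: the helper names and all placeholder values below are hypothetical, and comparing the two storage locations by URL authority is only one assumed way to check the same-container requirement. The option keys and the com.databricks.spark.sqldw format name are those of the connector; authentication options are left out.

```python
from urllib.parse import urlparse


def same_container(temp_dir: str, data_source_location: str) -> bool:
    """True when both abfss:// URLs name the same container and storage account."""
    # For abfss://<container>@<account>.dfs.core.windows.net/<dir>, the
    # netloc part is "<container>@<account>.dfs.core.windows.net".
    return urlparse(temp_dir).netloc == urlparse(data_source_location).netloc


def synapse_read_options(url, temp_dir, table, external_data_source):
    """Build reader options; authentication options are intentionally omitted."""
    return {
        "url": url,
        "tempDir": temp_dir,
        "dbTable": table,
        "externalDataSource": external_data_source,
    }


# Hypothetical placeholder values, for illustration only.
temp_dir = "abfss://staging@myaccount.dfs.core.windows.net/tmp"
location = "abfss://staging@myaccount.dfs.core.windows.net"
assert same_container(temp_dir, location)

options = synapse_read_options(
    "jdbc:sqlserver://<connection-string>", temp_dir, "my_table", "my_data_source"
)
# In a notebook, where `spark` is predefined:
# df = spark.read.format("com.databricks.spark.sqldw").options(**options).load()
```

The location passed to same_container is the one the external data source was created with in Synapse; the connector only receives the data source's name.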

Optionally limit the session to a specified duration for the Amazon Redshift connector

The new fs.s3a.assumed.role.session.duration option in the Amazon Redshift connector (see Query Amazon Redshift using Azure Databricks) allows you to optionally set a duration for the session when Redshift accesses the temporary S3 bucket with an assumed role.

Auto Loader

Optimized file listing
Optimized image data storage
Image thumbnails for binary files (Public Preview)
DirectoryRename events enable atomic processing of multiple files

Optimized file listing

Auto Loader optimizations provide performance improvements and cost savings when listing nested directories in cloud storage, including AWS S3, Azure Data Lake Storage Gen2 (ADLS Gen2), and Google Cloud Storage (GCS).

For example, if files were uploaded as /some/path/YYYY/MM/DD/HH/fileName, Auto Loader previously found them by listing all subdirectories in parallel, which caused 365 (days) * 24 (hours) = 8760 LIST API directory calls to the underlying storage for each year directory. By receiving a flattened response from these storage systems, Auto Loader now reduces the number of API calls to the number of files in the storage system divided by the number of results returned by each API call (1000 for S3, 5000 for ADLS Gen2, and 1024 for GCS), which greatly reduces your cloud costs.
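The arithmetic above can be worked through in a few lines of Python. This is only an illustration: the function names are made up, and the figure of 100,000 files is a hypothetical workload, not a number from these release notes.

```python
import math

# Results returned per LIST call, as quoted in the text above.
PAGE_SIZE = {"S3": 1000, "ADLS Gen2": 5000, "GCS": 1024}


def per_directory_calls(days: int = 365, hours: int = 24) -> int:
    """One LIST call per hourly directory under a single year directory."""
    return days * hours


def flattened_calls(num_files: int, store: str) -> int:
    """Calls needed when the store returns a flattened, paginated listing."""
    return math.ceil(num_files / PAGE_SIZE[store])


print(per_directory_calls())  # 8760
for store in PAGE_SIZE:
    # Hypothetical year directory holding 100,000 files.
    print(store, flattened_calls(100_000, store))
```

For the assumed 100,000 files, the flattened listing needs 100 calls on S3, 20 on ADLS Gen2, and 98 on GCS, compared with 8760 directory listings.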

Optimized image data storage

Auto Loader can now auto-detect image data that is being ingested and optimize its storage in Delta tables to improve read and write performance. See Ingest image or binary data to Delta Lake for ML.
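A minimal sketch of reading image files with Auto Loader in Python, assuming the binaryFile format mentioned in these notes. The function name and the input_path argument are hypothetical; spark is passed in explicitly, although it is predefined in a notebook.

```python
# Auto Loader source options for binary (for example, image) files.
IMAGE_SOURCE_OPTIONS = {"cloudFiles.format": "binaryFile"}


def read_image_stream(spark, input_path):
    """Return a streaming DataFrame of binary files found under input_path."""
    return (
        spark.readStream.format("cloudFiles")
        .options(**IMAGE_SOURCE_OPTIONS)
        .load(input_path)
    )
```

The resulting stream would typically be written to a Delta table with writeStream; the storage optimization itself is applied by Auto Loader and needs no extra option in this sketch.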

Image thumbnails for binary files (Public Preview)

Images in binaryFile format loaded or saved as Delta tables using Auto Loader have annotations attached so that the image thumbnails appear when you display the table in an Azure Databricks notebook. For more information, see Images.

`DirectoryRename` events enable atomic processing of multiple files

Auto Loader streams created in Databricks Runtime 9.0 and above on Azure Data Lake Storage Gen2 set up file event notifications to include directory renames and listen to RenameDirectory events. You can use directory renames to make multiple files appear atomically to Auto Loader.

SQL

Exclude columns in SELECT * (Public Preview)
SQL scalar functions (Public Preview)
Reference preceding aliases and columns in FROM subqueries (Public Preview)

Exclude columns in `SELECT *` (Public Preview)

SELECT * now supports an EXCEPT keyword, which allows you to exclude specified top-level columns from the expansion. For example, for a table tbl with schema (a, b, c), SELECT * EXCEPT (b) FROM tbl expands to (a, c).
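The expansion rule can be modeled in plain Python. This is a toy model of the column expansion only, not how the SQL engine implements it, and the function name is made up for illustration.

```python
def expand_star_except(schema, excluded):
    """Columns that SELECT * EXCEPT (...) expands to, kept in table order."""
    return [column for column in schema if column not in excluded]


# Models: SELECT * EXCEPT (b) FROM tbl, where tbl has schema (a, b, c).
print(expand_star_except(["a", "b", "c"], ["b"]))  # ['a', 'c']
```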

SQL scalar functions (Public Preview)

CREATE FUNCTION now supports SQL scalar functions. You can create scalar functions that take a set of arguments and return a single scalar type value. The SQL function body can be any expression. For example:

CREATE FUNCTION square(x DOUBLE) RETURNS DOUBLE RETURN x * x;
SELECT square(2);

For details, see CREATE FUNCTION (SQL and Python).

Reference preceding aliases and columns in `FROM` subqueries (Public Preview)

Subqueries in the FROM clause of a query can now be preceded by the LATERAL keyword, which allows them to reference aliases and columns in the preceding FROM items. For example:

SELECT * FROM t1, LATERAL (SELECT * FROM t2 WHERE t1.c1 = t2.c1)

The LATERAL keyword supports INNER, CROSS, and LEFT (OUTER) JOIN.

See Parameters and Parameters.
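One way to picture the LATERAL example above is as a nested loop in which the subquery is re-evaluated for every row of the preceding FROM item. The following Python model is a sketch of that semantics only; the sample rows and the function name are hypothetical.

```python
# Hypothetical contents of t1 and t2.
t1 = [{"c1": 1, "a": "x"}, {"c1": 2, "a": "y"}]
t2 = [{"c1": 1, "b": 10}, {"c1": 1, "b": 11}, {"c1": 3, "b": 30}]


def lateral_cross_join(left_rows, subquery):
    """Evaluate subquery once per left row and pair that row with each result."""
    result = []
    for left in left_rows:
        for right in subquery(left):
            result.append((left, right))
    return result


# Models: SELECT * FROM t1, LATERAL (SELECT * FROM t2 WHERE t1.c1 = t2.c1)
rows = lateral_cross_join(t1, lambda left: [r for r in t2 if r["c1"] == left["c1"]])
for left, right in rows:
    print(left, right)
```

With this sample data, the t1 row with c1 = 2 produces no output, because the subquery returns no rows for it; only the t1 row with c1 = 1 is paired with its two matching t2 rows.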

R support

Notebook-scoped R libraries (Public Preview)

Notebook-scoped libraries allow you to install libraries and create an environment scoped to a notebook session. These libraries do not affect other notebooks running on the same cluster. The libraries are available both on the driver and worker nodes, so you can reference them in user-defined functions. See Notebook-scoped R libraries.

Warning messages in R notebooks

The default value of the warn option is now set to 1 inside R notebooks. As a result, all warnings are now exposed as part of the command result. To learn more about the warn option, see Options Settings.

Avoid redo by specifying initial state for Structured Streaming stateful processing

You can now specify a user-defined initial state for Structured Streaming stateful processing using the [flat]MapGroupsWithState operator.

See Specify initial state for mapGroupsWithState.

A low-shuffle implementation of the Delta MERGE INTO command is now available (Public Preview)

The Delta MERGE INTO command has a new implementation available which reduces shuffling of unmodified rows. This improves performance of the command and helps to preserve existing clustering on the table, such as Z-ordering. To enable low shuffle merge, set spark.databricks.delta.merge.enableLowShuffle to true. See Low shuffle merge on Azure Databricks.
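A minimal sketch of enabling the setting from Python, assuming it is set at the session level. The helper name is hypothetical; the configuration key is the one given above, and spark is passed in explicitly, although it is predefined in a notebook.

```python
LOW_SHUFFLE_MERGE_KEY = "spark.databricks.delta.merge.enableLowShuffle"


def enable_low_shuffle_merge(spark):
    """Turn on low shuffle merge for the given Spark session."""
    spark.conf.set(LOW_SHUFFLE_MERGE_KEY, "true")
```

In a notebook this amounts to calling enable_low_shuffle_merge(spark) before running MERGE INTO.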

Bug fixes

You can no longer overwrite a view by creating a view with the same name.

Library upgrades

Upgraded Python libraries:
- certifi from 2021.5.30 to 2020.12.5
- chardet from 3.0.4 to 4.0.0
- Cython from 0.29.21 to 0.29.23
- decorator from 4.4.2 to 5.0.6
- ipython from 7.19.0 to 7.22.0
- joblib from 0.17.0 to 1.0.1
- jupyter-client from 6.1.7 to 6.1.12
- jupyter-core from 4.6.3 to 4.7.1
- kiwisolver from 1.3.0 to 1.3.1
- matplotlib from 3.2.2 to 3.4.2
- pandas from 1.1.5 to 1.2.4
- pip from 20.2.4 to 21.0.1
- prompt-toolkit from 3.0.8 to 3.0.17
- protobuf from 3.17.3 to 3.17.2
- ptyprocess from 0.6.0 to 0.7.0
- pyarrow from 1.0.1 to 4.0.0
- Pygments from 2.7.2 to 2.8.1
- pyzmq from 19.0.2 to 20.0.0
- requests from 2.24.0 to 2.25.1
- s3transfer from 0.3.6 to 0.3.7
- scikit-learn from 0.23.2 to 0.24.1
- scipy from 1.5.2 to 1.6.2
- seaborn from 0.10.0 to 0.11.1
- setuptools from 50.3.1 to 52.0.0
- statsmodels from 0.12.0 to 0.12.2
- tornado from 6.0.4 to 6.1
- virtualenv from 20.2.1 to 20.4.1
- wheel from 0.35.1 to 0.36.2
Upgraded R libraries:
- Matrix from 1.3-3 to 1.3-4

Apache Spark

Databricks Runtime 9.0 includes Apache Spark 3.1.2. This release includes all Spark fixes and improvements included in Databricks Runtime 8.4 (EoS), as well as the following additional bug fixes and improvements made to Spark:

[SPARK-35886] [SQL][3.1] PromotePrecision should not overwrite genCode
[SPARK-35879] [CORE][SHUFFLE] Fix performance regression caused by collectFetchRequests
[SPARK-35817] [SQL][3.1] Restore performance of queries against wide Avro tables
[SPARK-35841] [SQL] Casting string to decimal type doesn’t work if the…
[SPARK-35783] [SQL] Set the list of read columns in the task configuration to reduce reading of ORC data
[SPARK-35576] [SQL][3.1] Redact the sensitive info in the result of Set command
[SPARK-35449] [SQL][3.1] Only extract common expressions from CaseWhen values if elseValue is set
[SPARK-35288] [SQL] StaticInvoke should find the method without exact argument classes match
[SPARK-34794] [SQL] Fix lambda variable name issues in nested DataFrame functions
[SPARK-35278] [SQL] Invoke should find the method with correct number of parameters
[SPARK-35226] [SQL] Support refreshKrb5Config option in JDBC datasources
[SPARK-35244] [SQL] Invoke should throw the original exception
[SPARK-35213] [SQL] Keep the correct ordering of nested structs in chained withField operations
[SPARK-35087] [UI] Some columns in table Aggregated Metrics by Executor of stage-detail page shows incorrectly.
[SPARK-35168] [SQL] mapred.reduce.tasks should be shuffle.partitions not adaptive.coalescePartitions.initialPartitionNum
[SPARK-35127] [UI] When we switch between different stage-detail pages, the entry item in the newly-opened page may be blank
[SPARK-35142] [PYTHON][ML] Fix incorrect return type for rawPredictionUDF in OneVsRestModel
[SPARK-35096] [SQL] SchemaPruning should adhere spark.sql.caseSensitive config
[SPARK-34639] [SQL][3.1] RelationalGroupedDataset.alias should not create UnresolvedAlias
[SPARK-35080] [SQL] Only allow a subset of correlated equality predicates when a subquery is aggregated
[SPARK-35117] [UI] Change progress bar back to highlight ratio of tasks in progress
[SPARK-35136] Remove initial null value of LiveStage.info
[SPARK-34834] [NETWORK] Fix a potential Netty memory leak in TransportResponseHandler
[SPARK-35045] [SQL] Add an internal option to control input buffer in univocity
[SPARK-35014] Fix the PhysicalAggregation pattern to not rewrite foldable expressions
[SPARK-35019] [PYTHON][SQL] Fix type hints mismatches in pyspark.sql.*
[SPARK-34926] [SQL][3.1] PartitioningUtils.getPathFragment() should respect partition value is null
[SPARK-34630] [PYTHON] Add typehint for pyspark.version
[SPARK-34963] [SQL] Fix nested column pruning for extracting case-insensitive struct field from array of struct
[SPARK-34988] [CORE][3.1] Upgrade Jetty for CVE-2021-28165
[SPARK-34922] [SQL][3.1] Use a relative cost comparison function in the CBO
[SPARK-34970] [SQL][SECURITY][3.1] Redact map-type options in the output of explain()
[SPARK-34923] [SQL] Metadata output should be empty for more plans
[SPARK-34949] [CORE] Prevent BlockManager reregister when Executor is shutting down
[SPARK-34939] [CORE] Throw fetch failure exception when unable to deserialize broadcasted map statuses
[SPARK-34909] [SQL] Fix conversion of negative to unsigned in conv()
[SPARK-34845] [CORE] ProcfsMetricsGetter shouldn’t return partial procfs metrics
[SPARK-34814] [SQL] LikeSimplification should handle NULL
[SPARK-34876] [SQL] Fill defaultResult of non-nullable aggregates
[SPARK-34829] [SQL] Fix higher order function results
[SPARK-34840] [SHUFFLE] Fixes cases of corruption in merged shuffle …
[SPARK-34833] [SQL] Apply right-padding correctly for correlated subqueries
[SPARK-34630] [PYTHON][SQL] Added typehint for pyspark.sql.Column.contains
[SPARK-34763] [SQL] col(), $"name" and df("name") should handle quoted column names properly
[SPARK-33482] [SPARK-34756] [SQL] Fix FileScan equality check
[SPARK-34790] [CORE] Disable fetching shuffle blocks in batch when io encryption is enabled
[SPARK-34803] [PYSPARK] Pass the raised ImportError if pandas or pyarrow fail to import
[SPARK-34225] [CORE] Don’t encode further when a URI form string is passed to addFile or addJar
[SPARK-34811] [CORE] Redact fs.s3a.access.key like secret and token
[SPARK-34796] [SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
[SPARK-34128] [SQL] Suppress undesirable TTransportException warnings involved in THRIFT-4805
[SPARK-34776] [SQL] Nested column pruning should not prune Window produced attributes
[SPARK-34087] [3.1][SQL] Fix memory leak of ExecutionListenerBus
[SPARK-34772] [SQL] RebaseDateTime loadRebaseRecords should use Spark classloader instead of context
[SPARK-34719] [SQL][3.1] Correctly resolve the view query with duplicated column names
[SPARK-34766] [SQL][3.1] Do not capture maven config for views
[SPARK-34731] [CORE] Avoid ConcurrentModificationException when redacting properties in EventLoggingListener
[SPARK-34737] [SQL][3.1] Cast input float to double in TIMESTAMP_SECONDS
[SPARK-34749] [SQL][3.1] Simplify ResolveCreateNamedStruct
[SPARK-34768] [SQL] Respect the default input buffer size in Univocity
[SPARK-34770] [SQL] InMemoryCatalog.tableExists should not fail if database doesn’t exist
[SPARK-34504] [SQL] Avoid unnecessary resolving of SQL temp views for DDL commands
[SPARK-34727] [SQL] Fix discrepancy in casting float to timestamp
[SPARK-34723] [SQL] Correct parameter type for subexpression elimination under whole-stage
[SPARK-34724] [SQL] Fix Interpreted evaluation by using getMethod instead of getDeclaredMethod
[SPARK-34713] [SQL] Fix group by CreateStruct with ExtractValue
[SPARK-34697] [SQL] Allow DESCRIBE FUNCTION and SHOW FUNCTIONS explain about || (string concatenation operator)
[SPARK-34682] [SQL] Use PrivateMethodTester instead of reflection
[SPARK-34682] [SQL] Fix regression in canonicalization error check in CustomShuffleReaderExec
[SPARK-34681] [SQL] Fix bug for full outer shuffled hash join when building left side with non-equal condition
[SPARK-34545] [SQL] Fix issues with valueCompare feature of pyrolite
[SPARK-34607] [SQL][3.1] Add Utils.isMemberClass to fix a malformed class name error on jdk8u
[SPARK-34596] [SQL] Use Utils.getSimpleName to avoid hitting Malformed class name in NewInstance.doGenCode
[SPARK-34613] [SQL] Fix view does not capture disable hint config
[SPARK-32924] [WEBUI] Make duration column in master UI sorted in the correct order
[SPARK-34482] [SS] Correct the active SparkSession for StreamExecution.logicalPlan
[SPARK-34567] [SQL] CreateTableAsSelect should update metrics too
[SPARK-34599] [SQL] Fix the issue that INSERT INTO OVERWRITE doesn’t support partition columns containing dot for DSv2
[SPARK-34577] [SQL] Fix drop/add columns to a dataset of DESCRIBE NAMESPACE
[SPARK-34584] [SQL] Static partition should also follow StoreAssignmentPolicy when insert into v2 tables
[SPARK-34555] [SQL] Resolve metadata output from DataFrame
[SPARK-34534] Fix blockIds order when use FetchShuffleBlocks to fetch blocks
[SPARK-34547] [SQL] Only use metadata columns for resolution as last resort
[SPARK-34417] [SQL] org.apache.spark.sql.DataFrameNaFunctions.fillMap fails for column name having a dot
[SPARK-34561] [SQL] Fix drop/add columns from/to a dataset of v2 DESCRIBE TABLE
[SPARK-34556] [SQL] Checking duplicate static partition columns should respect case sensitive conf
[SPARK-34392] [SQL] Support ZoneOffset +h:mm in DateTimeUtils.getZoneId
[SPARK-34550] [SQL] Skip InSet null value during push filter to Hive metastore
[SPARK-34543] [SQL] Respect the spark.sql.caseSensitive config while resolving partition spec in v1 SET LOCATION
[SPARK-34436] [SQL] DPP support LIKE ANY/ALL expression
[SPARK-34531] [CORE] Remove Experimental API tag in PrometheusServlet
[SPARK-34497] [SQL] Fix built-in JDBC connection providers to restore JVM security context changes
[SPARK-34515] [SQL] Fix NPE if InSet contains null value during getPartitionsByFilter
[SPARK-34490] [SQL] Analysis should fail if the view refers a dropped table
[SPARK-34473] [SQL] Avoid NPE in DataFrameReader.schema(StructType)
[SPARK-34384] [CORE] Add missing docs for ResourceProfile APIs
[SPARK-34373] [SQL] HiveThriftServer2 startWithContext may hang with a race issue
[SPARK-20977] [CORE] Use a non-final field for the state of CollectionAccumulator
[SPARK-34421] [SQL] Resolve temporary functions and views in views with CTEs
[SPARK-34431] [CORE] Only load hive-site.xml once
[SPARK-34405] [CORE] Fix mean value of timersLabels in the PrometheusServlet class
[SPARK-33438] [SQL] Eagerly init objects with defined SQL Confs for command set -v
[SPARK-34158] Incorrect url of the only developer Matei in pom.xml
[SPARK-34346] [CORE][SQL][3.1] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf regression
[SPARK-34359] [SQL][3.1] Add a legacy config to restore the output schema of SHOW DATABASES
[SPARK-34331] [SQL] Speed up DS v2 metadata col resolution
[SPARK-34318] [SQL][3.1] Dataset.colRegex should work with column names and qualifiers which contain newlines
[SPARK-34326] [CORE][SQL] Fix UTs added in SPARK-31793 depending on the length of temp path
[SPARK-34319] [SQL] Resolve duplicate attributes for FlatMapCoGroupsInPandas/MapInPandas
[SPARK-34310] [CORE][SQL] Replaces map and flatten with flatMap
[SPARK-34083] [SQL][3.1] Using TPCDS original definitions for char/varchar columns
[SPARK-34233] [SQL][3.1] FIX NPE for char padding in binary comparison
[SPARK-34270] [SS] Combine StateStoreMetrics should not override StateStoreCustomMetric
[SPARK-34144] [SQL] Exception thrown when trying to write LocalDate and Instant values to a JDBC relation
[SPARK-34273] [CORE] Do not reregister BlockManager when SparkContext is stopped
[SPARK-34262] [SQL][3.1] Refresh cached data of v1 table in ALTER TABLE .. SET LOCATION
[SPARK-34275] [CORE][SQL][MLLIB] Replaces filter and size with count
[SPARK-34260] [SQL] Fix UnresolvedException when creating temp view twice
[SPARK-33867] [SQL] Instant and LocalDate values aren’t handled when generating SQL queries
[SPARK-34193] [CORE] TorrentBroadcast block manager decommissioning race fix
[SPARK-34221] [WEBUI] Ensure if a stage fails in the UI page, the corresponding error message can be displayed correctly
[SPARK-34236] [SQL] Fix v2 Overwrite w/ null static partition raise Cannot translate expression to source filter: null
[SPARK-34212] [SQL] Fix incorrect decimal reading from Parquet files
[SPARK-34244] [SQL] Remove the Scala function version of regexp_extract_all
[SPARK-34235] [SS] Make spark.sql.hive as a private package
[SPARK-34232] [CORE] Redact SparkListenerEnvironmentUpdate event in log
[SPARK-34229] [SQL] Avro should read decimal values with the file schema
[SPARK-34223] [SQL] FIX NPE for static partition with null in InsertIntoHadoopFsRelationCommand
[SPARK-34192] [SQL] Move char padding to write side and remove length check on read side too
[SPARK-34203] [SQL] Convert null partition values to __HIVE_DEFAULT_PARTITION__ in v1 In-Memory catalog
[SPARK-33726] [SQL] Fix for Duplicate field names during Aggregation
[SPARK-34133] [AVRO] Respect case sensitivity when performing Catalyst-to-Avro field matching
[SPARK-34187] [SS] Use available offset range obtained during polling when checking offset validation
[SPARK-34052] [SQL][3.1] store SQL text for a temp view created using “CACHE TABLE .. AS SELECT …”
[SPARK-34213] [SQL] Refresh cached data of v1 table in LOAD DATA
[SPARK-34191] [PYTHON][SQL] Add typing for udf overload
[SPARK-34200] [SQL] Ambiguous column reference should consider attribute availability
[SPARK-33813] [SQL][3.1] Fix the issue that JDBC source can’t treat MS SQL Server’s spatial types
[SPARK-34178] [SQL] Copy tags for the new node created by MultiInstanceRelation.newInstance
[SPARK-34005] [CORE][3.1] Update peak memory metrics for each Executor on task end
[SPARK-34115] [CORE] Check SPARK_TESTING as lazy val to avoid slowdown
[SPARK-34153] [SQL][3.1][3.0] Remove unused getRawTable() from HiveExternalCatalog.alterPartitions()
[SPARK-34130] [SQL] Improve performance for char varchar padding and length check with StaticInvoke
[SPARK-34027] [SQL][3.1] Refresh cache in ALTER TABLE .. RECOVER PARTITIONS
[SPARK-34151] [SQL] Replaces java.io.File.toURL with java.io.File.toURI.toURL
[SPARK-34140] [SQL][3.1] Move QueryCompilationErrors.scala to org/apache/spark/sql/errors
[SPARK-34080] [ML][PYTHON] Add UnivariateFeatureSelector
[SPARK-33790] [CORE][3.1] Reduce the rpc call of getFileStatus in SingleFileEventLogFileReader
[SPARK-34118] [CORE][SQL][3.1] Replaces filter and check for emptiness with exists or forall
[SPARK-34114] [SQL] should not trim right for read-side char length check and padding
[SPARK-34086] [SQL][3.1] RaiseError generates too much code and may fail codegen in length check for char varchar
[SPARK-34075] [SQL][CORE] Hidden directories are being listed for partition inference
[SPARK-34076] [SQL] SQLContext.dropTempTable fails if cache is non-empty
[SPARK-34084] [SQL][3.1] Fix auto updating of table stats in ALTER TABLE .. ADD PARTITION
[SPARK-34090] [SS] Cache HadoopDelegationTokenManager.isServiceEnabled result used in KafkaTokenUtil.needTokenUpdate
[SPARK-34069] [CORE] Kill barrier tasks should respect SPARK_JOB_INTERRUPT_ON_CANCEL
[SPARK-34091] [SQL] Shuffle batch fetch should be able to disable after it’s been enabled
[SPARK-34059] [SQL][CORE][3.1] Use for/foreach rather than map to make sure execute it eagerly
[SPARK-34002] [SQL] Fix the usage of encoder in ScalaUDF
[SPARK-34060] [SQL][3.1] Fix Hive table caching while updating stats by ALTER TABLE .. DROP PARTITION
[SPARK-31952] [SQL] Fix incorrect memory spill metric when doing Aggregate
[SPARK-33591] [SQL][3.1] Recognize null in partition spec values
[SPARK-34055] [SQL][3.1] Refresh cache in ALTER TABLE .. ADD PARTITION
[SPARK-34039] [SQL][3.1] ReplaceTable should invalidate cache
[SPARK-34003] [SQL] Fix Rule conflicts between PaddingAndLengthCheckForCharVarchar and ResolveAggregateFunctions
[SPARK-33938] [SQL][3.1] Optimize Like Any/All by LikeSimplification
[SPARK-34021] [R] Fix hyper links in SparkR documentation for CRAN submission
[SPARK-34011] [SQL][3.1][3.0] Refresh cache in ALTER TABLE .. RENAME TO PARTITION
[SPARK-33948] [SQL] Fix CodeGen error of MapObjects.doGenCode method in Scala 2.13
[SPARK-33635] [SS] Adjust the order of check in KafkaTokenUtil.needTokenUpdate to remedy perf regression
[SPARK-33029] [CORE][WEBUI] Fix the UI executor page incorrectly marking the driver as excluded
[SPARK-34015] [R] Fixing input timing in gapply
[SPARK-34012] [SQL] Keep behavior consistent when conf spark.sql.legacy.parser.havingWithoutGroupByAsWhere is true with migration guide
[SPARK-33844] [SQL][3.1] InsertIntoHiveDir command should check col name too
[SPARK-33935] [SQL] Fix CBO cost function
[SPARK-33100] [SQL] Ignore a semicolon inside a bracketed comment in spark-sql
[SPARK-34000] [CORE] Fix stageAttemptToNumSpeculativeTasks java.util.NoSuchElementException
[SPARK-33992] [SQL] override transformUpWithNewOutput to add allowInvokingTransformsInAnalyzer
[SPARK-33894] [SQL] Change visibility of private case classes in mllib to avoid runtime compilation errors with Scala 2.13
[SPARK-33950] [SQL][3.1][3.0] Refresh cache in v1 ALTER TABLE .. DROP PARTITION
[SPARK-33980] [SS] Invalidate char/varchar in spark.readStream.schema
[SPARK-33945] [SQL][3.1] Handles a random seed consisting of an expr tree
[SPARK-33398] Fix loading tree models prior to Spark 3.0
[SPARK-33963] [SQL] Canonicalize HiveTableRelation w/o table stats
[SPARK-33906] [WEBUI] Fix the bug of UI Executor page stuck due to undefined peakMemoryMetrics
[SPARK-33944] [SQL] Incorrect logging for warehouse keys in SharedState options
[SPARK-33936] [SQL][3.1] Add the version when connector’s interfaces were added
[SPARK-33916] [CORE] Fix fallback storage offset and improve compression codec test coverage
[SPARK-33899] [SQL][3.1] Fix assert failure in v1 SHOW TABLES/VIEWS on spark_catalog
[SPARK-33901] [SQL] Fix Char and Varchar display error after DDLs
[SPARK-33897] [SQL] Can’t set option ‘cross’ in join method
[SPARK-33907] [SQL][3.1] Only prune columns of JsonToStructs if parsing options is empty
[SPARK-33621] [SPARK-33784] [SQL][3.1] Add a way to inject data source rewrite rules
[SPARK-33900] [WEBUI] Show shuffle read size / records correctly when only remotebytesread is available
[SPARK-33892] [SQL] Display char/varchar in DESC and SHOW CREATE TABLE
[SPARK-33895] [SQL] Char and Varchar fail in MetaOperation of ThriftServer
[SPARK-33659] [SS] Document the current behavior for DataStreamWriter.toTable API
[SPARK-33893] [CORE] Exclude fallback block manager from executorList
[SPARK-33277] [PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends
[SPARK-33889] [SQL][3.1] Fix NPE from SHOW PARTITIONS on V2 tables
[SPARK-33879] [SQL] Char Varchar values fails w/ match error as partition columns
[SPARK-33877] [SQL] SQL reference documents for INSERT w/ a column list
[SPARK-33876] [SQL] Add length-check for reading char/varchar from tables w/ a external location
[SPARK-33846] [SQL] Include Comments for a nested schema in StructType.toDDL
[SPARK-33860] [SQL] Make CatalystTypeConverters.convertToCatalyst match special Array value
[SPARK-33834] [SQL] Verify ALTER TABLE CHANGE COLUMN with Char and Varchar
[SPARK-33853] [SQL] EXPLAIN CODEGEN and BenchmarkQueryTest don’t show subquery code
[SPARK-33836] [SS][PYTHON] Expose DataStreamReader.table and DataStreamWriter.toTable
[SPARK-33829] [SQL][3.1] Renaming v2 tables should recreate the cache
[SPARK-33756] [SQL] Make BytesToBytesMap’s MapIterator idempotent
[SPARK-33850] [SQL] EXPLAIN FORMATTED doesn’t show the plan for subqueries if AQE is enabled
[SPARK-33841] [CORE][3.1] Fix issue with jobs disappearing intermittently from the SHS under high load
[SPARK-33593] [SQL] Vector reader got incorrect data with binary partition value
[SPARK-26341] [WEBUI] Expose executor memory metrics at the stage level, in the Stages tab
[SPARK-33831] [UI] Update to jetty 9.4.34
[SPARK-33822] [SQL] Use the CastSupport.cast method in HashJoin
[SPARK-33774] [UI][CORE] "Back to Master" returns 500 error in Standalone cluster
[SPARK-26199] [SPARK-31517] [R] Fix strategy for handling … names in mutate
[SPARK-33819] [CORE][3.1] SingleFileEventLogFileReader/RollingEventLogFilesFileReader should be package private
[SPARK-33697] [SQL] RemoveRedundantProjects should require column ordering by default
[SPARK-33752] [SQL][3.1] Avoid the getSimpleMessage of AnalysisException adds semicolon repeatedly
[SPARK-33788] [SQL][3.1][3.0][2.4] Throw NoSuchPartitionsException from HiveExternalCatalog.dropPartitions()
[SPARK-33803] [SQL] Sort table properties by key in DESCRIBE TABLE command
[SPARK-33786] [SQL] The storage level for a cache should be respected when a table name is altered
[SPARK-33273] [SQL] Fix a race condition in subquery execution
[SPARK-33653] [SQL][3.1] DSv2: REFRESH TABLE should recache the table itself
[SPARK-33777] [SQL] Sort output of V2 SHOW PARTITIONS
[SPARK-33733] [SQL] PullOutNondeterministic should check and collect deterministic field
[SPARK-33764] [SS] Make state store maintenance interval as SQL config
[SPARK-33729] [SQL] When refreshing cache, Spark should not use cached plan when recaching data
[SPARK-33742] [SQL][3.1] Throw PartitionsAlreadyExistException from HiveExternalCatalog.createPartitions()
[SPARK-33706] [SQL] Require fully specified partition identifier in partitionExists()
[SPARK-33740] [SQL] hadoop configs in hive-site.xml can override pre-existing hadoop ones
[SPARK-33692] [SQL] View should use captured catalog and namespace to lookup function
[SPARK-33669] Wrong error message from YARN application state monitor when sc.stop in yarn client mode
[SPARK-32110] [SQL] normalize special floating numbers in HyperLogLog++
[SPARK-33677] [SQL] Skip LikeSimplification rule if pattern contains any escapeChar
[SPARK-33693] [SQL] deprecate spark.sql.hive.convertCTAS
[SPARK-33641] [SQL] Invalidate new char/varchar types in public APIs that produce incorrect results
[SPARK-32680] [SQL] Don’t Preprocess V2 CTAS with Unresolved Query
[SPARK-33676] [SQL] Require exact matching of partition spec to the schema in V2 ALTER TABLE .. ADD/DROP PARTITION
[SPARK-33670] [SQL] Verify the partition provider is Hive in v1 SHOW TABLE EXTENDED
[SPARK-33663] [SQL] Uncaching should not be called on non-existing temp views
[SPARK-33667] [SQL] Respect the spark.sql.caseSensitive config while resolving partition spec in v1 SHOW PARTITIONS
[SPARK-33652] [SQL] DSv2: DeleteFrom should refresh cache

Maintenance updates

See Databricks Runtime 9.0 maintenance updates.

System environment

Operating System: Ubuntu 20.04.2 LTS
Java: Zulu 8.54.0.21-CA-linux64
Scala: 2.12.10
Python: 3.8.10
R: 4.1.0 (2021-05-18)
Delta Lake 1.0.0

Installed Python libraries

Library	Version	Library	Version	Library	Version
Antergos Linux	2015.10 (ISO-Rolling)	appdirs	1.4.4	backcall	0.2.0
boto3	1.16.7	botocore	1.19.7	certifi	2020.12.5
chardet	4.0.0	cycler	0.10.0	Cython	0.29.23
dbus-python	1.2.16	decorator	5.0.6	distlib	0.3.2
distro-info	0.23ubuntu1	facets-overview	1.0.0	filelock	3.0.12
idna	2.10	ipykernel	5.3.4	ipython	7.22.0
ipython-genutils	0.2.0	jedi	0.17.2	jmespath	0.10.0
joblib	1.0.1	jupyter-client	6.1.12	jupyter-core	4.7.1
kiwisolver	1.3.1	koalas	1.8.1	matplotlib	3.4.2
numpy	1.19.2	pandas	1.2.4	parso	0.7.0
patsy	0.5.1	pexpect	4.8.0	pickleshare	0.7.5
Pillow	8.2.0	pip	21.0.1	plotly	4.14.3
prompt-toolkit	3.0.17	protobuf	3.17.2	psycopg2	2.8.5
ptyprocess	0.7.0	pyarrow	4.0.0	Pygments	2.8.1
PyGObject	3.36.0	pyparsing	2.4.7	python-apt	2.0.0+ubuntu0.20.4.5
python-dateutil	2.8.1	pytz	2020.5	pyzmq	20.0.0
requests	2.25.1	requests-unixsocket	0.2.0	retrying	1.3.3
s3transfer	0.3.7	scikit-learn	0.24.1	scipy	1.6.2
seaborn	0.11.1	setuptools	52.0.0	six	1.15.0
ssh-import-id	5.10	statsmodels	0.12.2	threadpoolctl	2.1.0
tornado	6.1	traitlets	5.0.5	unattended-upgrades	0.1
urllib3	1.25.11	virtualenv	20.4.1	wcwidth	0.2.5
wheel	0.36.2

Installed R libraries

R libraries are installed from the Microsoft CRAN snapshot on 2021-07-28.

Library	Version	Library	Version	Library	Version
askpass	1.1	assertthat	0.2.1	backports	1.2.1
base	4.1.0	base64enc	0.1-3	BH	1.72.0-3
bit	4.0.4	bit64	4.0.5	blob	1.2.1
boot	1.3-28	brew	1.0-6	brio	1.1.0
broom	0.7.2	callr	3.5.1	caret	6.0-86
cellranger	1.1.0	chron	2.3-56	class	7.3-19
cli	2.2.0	clipr	0.7.1	cluster	2.1.2
codetools	0.2-18	colorspace	2.0-0	commonmark	1.7
compiler	4.1.0	config	0.3	covr	3.5.1
cpp11	0.2.4	crayon	1.3.4	credentials	1.3.0
crosstalk	1.1.0.1	curl	4.3	data.table	1.13.4
datasets	4.1.0	DBI	1.1.0	dbplyr	2.0.0
desc	1.2.0	devtools	2.3.2	diffobj	0.3.2
digest	0.6.27	dplyr	1.0.2	DT	0.16
ellipsis	0.3.1	evaluate	0.14	fansi	0.4.1
farver	2.0.3	fastmap	1.0.1	forcats	0.5.0
foreach	1.5.1	foreign	0.8-81	forge	0.2.0
fs	1.5.0	future	1.21.0	generics	0.1.0
gert	1.0.2	ggplot2	3.3.2	gh	1.2.0
gitcreds	0.1.1	glmnet	4.0-2	globals	0.14.0
glue	1.4.2	gower	0.2.2	graphics	4.1.0
grDevices	4.1.0	grid	4.1.0	gridExtra	2.3
gsubfn	0.7	gtable	0.3.0	haven	2.3.1
highr	0.8	hms	0.5.3	htmltools	0.5.0
htmlwidgets	1.5.3	httpuv	1.5.4	httr	1.4.2
hwriter	1.3.2	hwriterPlus	1.0-3	ini	0.3.1
ipred	0.9-9	isoband	0.2.3	iterators	1.0.13
jsonlite	1.7.2	KernSmooth	2.23-20	knitr	1.30
labeling	0.4.2	later	1.1.0.1	lattice	0.20-44
lava	1.6.8.1	lazyeval	0.2.2	lifecycle	0.2.0
listenv	0.8.0	lubridate	1.7.9.2	magrittr	2.0.1
markdown	1.1	MASS	7.3-54	Matrix	1.3-4
memoise	1.1.0	methods	4.1.0	mgcv	1.8-36
mime	0.9	ModelMetrics	1.2.2.2	modelr	0.1.8
munsell	0.5.0	nlme	3.1-152	nnet	7.3-16
numDeriv	2016.8-1.1	openssl	1.4.3	parallel	4.1.0
parallelly	1.22.0	pillar	1.4.7	pkgbuild	1.1.0
pkgconfig	2.0.3	pkgload	1.1.0	plogr	0.2.0
plyr	1.8.6	praise	1.0.0	prettyunits	1.1.1
pROC	1.16.2	processx	3.4.5	prodlim	2019.11.13
progress	1.2.2	promises	1.1.1	proto	1.0.0
ps	1.5.0	purrr	0.3.4	r2d3	0.2.3
R6	2.5.0	randomForest	4.6-14	rappdirs	0.3.1
rcmdcheck	1.3.3	RColorBrewer	1.1-2	Rcpp	1.0.5
readr	1.4.0	readxl	1.3.1	recipes	0.1.15
rematch	1.0.1	rematch2	2.1.2	remotes	2.2.0
reprex	0.3.0	reshape2	1.4.4	rex	1.2.0
rlang	0.4.9	rmarkdown	2.6	RODBC	1.3-17
roxygen2	7.1.1	rpart	4.1-15	rprojroot	2.0.2
Rserve	1.8-8	RSQLite	2.2.1	rstudioapi	0.13
rversions	2.0.2	rvest	0.3.6	scales	1.1.1
selectr	0.4-2	sessioninfo	1.1.1	shape	1.4.5
shiny	1.5.0	sourcetools	0.1.7	sparklyr	1.5.2
SparkR	3.1.1	spatial	7.3-11	splines	4.1.0
sqldf	0.4-11	SQUAREM	2020.5	stats	4.1.0
stats4	4.1.0	stringi	1.5.3	stringr	1.4.0
survival	3.2-11	sys	3.4	tcltk	4.1.0
TeachingDemos	2.10	testthat	3.0.0	tibble	3.0.4
tidyr	1.1.2	tidyselect	1.1.0	tidyverse	1.3.0
timeDate	3043.102	tinytex	0.28	tools	4.1.0
usethis	2.0.0	utf8	1.1.4	utils	4.1.0
uuid	0.1-4	vctrs	0.3.5	viridisLite	0.3.0
waldo	0.2.3	whisker	0.4	withr	2.3.0
xfun	0.19	xml2	1.3.2	xopen	1.0.0
xtable	1.8-4	yaml	2.2.1	zip	2.1.1

Installed Java and Scala libraries (Scala 2.12 cluster version)

Group ID	Artifact ID	Version
antlr	antlr	2.7.7
com.amazonaws	amazon-kinesis-client	1.12.0
com.amazonaws	aws-java-sdk-autoscaling	1.11.655
com.amazonaws	aws-java-sdk-cloudformation	1.11.655
com.amazonaws	aws-java-sdk-cloudfront	1.11.655
com.amazonaws	aws-java-sdk-cloudhsm	1.11.655
com.amazonaws	aws-java-sdk-cloudsearch	1.11.655
com.amazonaws	aws-java-sdk-cloudtrail	1.11.655
com.amazonaws	aws-java-sdk-cloudwatch	1.11.655
com.amazonaws	aws-java-sdk-cloudwatchmetrics	1.11.655
com.amazonaws	aws-java-sdk-codedeploy	1.11.655
com.amazonaws	aws-java-sdk-cognitoidentity	1.11.655
com.amazonaws	aws-java-sdk-cognitosync	1.11.655
com.amazonaws	aws-java-sdk-config	1.11.655
com.amazonaws	aws-java-sdk-core	1.11.655
com.amazonaws	aws-java-sdk-datapipeline	1.11.655
com.amazonaws	aws-java-sdk-directconnect	1.11.655
com.amazonaws	aws-java-sdk-directory	1.11.655
com.amazonaws	aws-java-sdk-dynamodb	1.11.655
com.amazonaws	aws-java-sdk-ec2	1.11.655
com.amazonaws	aws-java-sdk-ecs	1.11.655
com.amazonaws	aws-java-sdk-efs	1.11.655
com.amazonaws	aws-java-sdk-elasticache	1.11.655
com.amazonaws	aws-java-sdk-elasticbeanstalk	1.11.655
com.amazonaws	aws-java-sdk-elasticloadbalancing	1.11.655
com.amazonaws	aws-java-sdk-elastictranscoder	1.11.655
com.amazonaws	aws-java-sdk-emr	1.11.655
com.amazonaws	aws-java-sdk-glacier	1.11.655
com.amazonaws	aws-java-sdk-glue	1.11.655
com.amazonaws	aws-java-sdk-iam	1.11.655
com.amazonaws	aws-java-sdk-importexport	1.11.655
com.amazonaws	aws-java-sdk-kinesis	1.11.655
com.amazonaws	aws-java-sdk-kms	1.11.655
com.amazonaws	aws-java-sdk-lambda	1.11.655
com.amazonaws	aws-java-sdk-logs	1.11.655
com.amazonaws	aws-java-sdk-machinelearning	1.11.655
com.amazonaws	aws-java-sdk-marketplacecommerceanalytics	1.11.655
com.amazonaws	aws-java-sdk-marketplacemeteringservice	1.11.655
com.amazonaws	aws-java-sdk-opsworks	1.11.655
com.amazonaws	aws-java-sdk-rds	1.11.655
com.amazonaws	aws-java-sdk-redshift	1.11.655
com.amazonaws	aws-java-sdk-route53	1.11.655
com.amazonaws	aws-java-sdk-s3	1.11.655
com.amazonaws	aws-java-sdk-ses	1.11.655
com.amazonaws	aws-java-sdk-simpledb	1.11.655
com.amazonaws	aws-java-sdk-simpleworkflow	1.11.655
com.amazonaws	aws-java-sdk-sns	1.11.655
com.amazonaws	aws-java-sdk-sqs	1.11.655
com.amazonaws	aws-java-sdk-ssm	1.11.655
com.amazonaws	aws-java-sdk-storagegateway	1.11.655
com.amazonaws	aws-java-sdk-sts	1.11.655
com.amazonaws	aws-java-sdk-support	1.11.655
com.amazonaws	aws-java-sdk-swf-libraries	1.11.22
com.amazonaws	aws-java-sdk-workspaces	1.11.655
com.amazonaws	jmespath-java	1.11.655
com.chuusai	shapeless_2.12	2.3.3
com.clearspring.analytics	stream	2.9.6
com.databricks	Rserve	1.8-3
com.databricks	jets3t	0.7.1-0
com.databricks.scalapb	compilerplugin_2.12	0.4.15-10
com.databricks.scalapb	scalapb-runtime_2.12	0.4.15-10
com.esotericsoftware	kryo-shaded	4.0.2
com.esotericsoftware	minlog	1.3.0
com.fasterxml	classmate	1.3.4
com.fasterxml.jackson.core	jackson-annotations	2.10.0
com.fasterxml.jackson.core	jackson-core	2.10.0
com.fasterxml.jackson.core	jackson-databind	2.10.0
com.fasterxml.jackson.dataformat	jackson-dataformat-cbor	2.10.0
com.fasterxml.jackson.datatype	jackson-datatype-joda	2.10.0
com.fasterxml.jackson.module	jackson-module-paranamer	2.10.0
com.fasterxml.jackson.module	jackson-module-scala_2.12	2.10.0
com.github.ben-manes.caffeine	caffeine	2.3.4
com.github.fommil	jniloader	1.1
com.github.fommil.netlib	core	1.1.2
com.github.fommil.netlib	native_ref-java	1.1
com.github.fommil.netlib	native_ref-java-natives	1.1
com.github.fommil.netlib	native_system-java	1.1
com.github.fommil.netlib	native_system-java-natives	1.1
com.github.fommil.netlib	netlib-native_ref-linux-x86_64-natives	1.1
com.github.fommil.netlib	netlib-native_system-linux-x86_64-natives	1.1
com.github.joshelser	dropwizard-metrics-hadoop-metrics2-reporter	0.1.2
com.github.luben	zstd-jni	1.4.8-1
com.github.wendykierp	JTransforms	3.1
com.google.code.findbugs	jsr305	3.0.0
com.google.code.gson	gson	2.2.4
com.google.flatbuffers	flatbuffers-java	1.9.0
com.google.guava	guava	15.0
com.google.protobuf	protobuf-java	2.6.1
com.h2database	h2	1.4.195
com.helger	profiler	1.1.1
com.jcraft	jsch	0.1.50
com.jolbox	bonecp	0.8.0.RELEASE
com.lihaoyi	sourcecode_2.12	0.1.9
com.microsoft.azure	azure-data-lake-store-sdk	2.3.9
com.microsoft.sqlserver	mssql-jdbc	9.2.1.jre8
com.ning	compress-lzf	1.0.3
com.sun.mail	javax.mail	1.5.2
com.tdunning	json	1.8
com.thoughtworks.paranamer	paranamer	2.8
com.trueaccord.lenses	lenses_2.12	0.4.12
com.twitter	chill-java	0.9.5
com.twitter	chill_2.12	0.9.5
com.twitter	util-app_2.12	7.1.0
com.twitter	util-core_2.12	7.1.0
com.twitter	util-function_2.12	7.1.0
com.twitter	util-jvm_2.12	7.1.0
com.twitter	util-lint_2.12	7.1.0
com.twitter	util-registry_2.12	7.1.0
com.twitter	util-stats_2.12	7.1.0
com.typesafe	config	1.2.1
com.typesafe.scala-logging	scala-logging_2.12	3.7.2
com.univocity	univocity-parsers	2.9.1
com.zaxxer	HikariCP	3.1.0
commons-beanutils	commons-beanutils	1.9.4
commons-cli	commons-cli	1.2
commons-codec	commons-codec	1.10
commons-collections	commons-collections	3.2.2
commons-configuration	commons-configuration	1.6
commons-dbcp	commons-dbcp	1.4
commons-digester	commons-digester	1.8
commons-fileupload	commons-fileupload	1.3.3
commons-httpclient	commons-httpclient	3.1
commons-io	commons-io	2.4
commons-lang	commons-lang	2.6
commons-logging	commons-logging	1.1.3
commons-net	commons-net	3.1
commons-pool	commons-pool	1.5.4
hive-2.3__hadoop-2.7	jets3t-0.7	liball_deps_2.12
hive-2.3__hadoop-2.7	zookeeper-3.4	liball_deps_2.12
info.ganglia.gmetric4j	gmetric4j	1.0.10
io.airlift	aircompressor	0.10
io.delta	delta-sharing-spark_2.12	0.1.0
io.dropwizard.metrics	metrics-core	4.1.1
io.dropwizard.metrics	metrics-graphite	4.1.1
io.dropwizard.metrics	metrics-healthchecks	4.1.1
io.dropwizard.metrics	metrics-jetty9	4.1.1
io.dropwizard.metrics	metrics-jmx	4.1.1
io.dropwizard.metrics	metrics-json	4.1.1
io.dropwizard.metrics	metrics-jvm	4.1.1
io.dropwizard.metrics	metrics-servlets	4.1.1
io.netty	netty-all	4.1.51.Final
io.prometheus	simpleclient	0.7.0
io.prometheus	simpleclient_common	0.7.0
io.prometheus	simpleclient_dropwizard	0.7.0
io.prometheus	simpleclient_pushgateway	0.7.0
io.prometheus	simpleclient_servlet	0.7.0
io.prometheus.jmx	collector	0.12.0
jakarta.annotation	jakarta.annotation-api	1.3.5
jakarta.validation	jakarta.validation-api	2.0.2
jakarta.ws.rs	jakarta.ws.rs-api	2.1.6
javax.activation	activation	1.1.1
javax.el	javax.el-api	2.2.4
javax.jdo	jdo-api	3.0.1
javax.servlet	javax.servlet-api	3.1.0
javax.servlet.jsp	jsp-api	2.1
javax.transaction	jta	1.1
javax.transaction	transaction-api	1.1
javax.xml.bind	jaxb-api	2.2.2
javax.xml.stream	stax-api	1.0-2
javolution	javolution	5.5.1
jline	jline	2.14.6
joda-time	joda-time	2.10.5
log4j	apache-log4j-extras	1.2.17
log4j	log4j	1.2.17
maven-trees	hive-2.3__hadoop-2.7	liball_deps_2.12
net.java.dev.jna	jna	5.8.0
net.razorvine	pyrolite	4.30
net.sf.jpam	jpam	1.1
net.sf.opencsv	opencsv	2.3
net.sf.supercsv	super-csv	2.2.0
net.snowflake	snowflake-ingest-sdk	0.9.6
net.snowflake	snowflake-jdbc	3.13.3
net.snowflake	spark-snowflake_2.12	2.9.0-spark_3.1
net.sourceforge.f2j	arpack_combined_all	0.1
org.acplt.remotetea	remotetea-oncrpc	1.1.2
org.antlr	ST4	4.0.4
org.antlr	antlr-runtime	3.5.2
org.antlr	antlr4-runtime	4.8-1
org.antlr	stringtemplate	3.2.1
org.apache.ant	ant	1.9.2
org.apache.ant	ant-jsch	1.9.2
org.apache.ant	ant-launcher	1.9.2
org.apache.arrow	arrow-format	2.0.0
org.apache.arrow	arrow-memory-core	2.0.0
org.apache.arrow	arrow-memory-netty	2.0.0
org.apache.arrow	arrow-vector	2.0.0
org.apache.avro	avro	1.8.2
org.apache.avro	avro-ipc	1.8.2
org.apache.avro	avro-mapred-hadoop2	1.8.2
org.apache.commons	commons-compress	1.20
org.apache.commons	commons-crypto	1.1.0
org.apache.commons	commons-lang3	3.10
org.apache.commons	commons-math3	3.4.1
org.apache.commons	commons-text	1.6
org.apache.curator	curator-client	2.7.1
org.apache.curator	curator-framework	2.7.1
org.apache.curator	curator-recipes	2.7.1
org.apache.derby	derby	10.12.1.1
org.apache.directory.api	api-asn1-api	1.0.0-M20
org.apache.directory.api	api-util	1.0.0-M20
org.apache.directory.server	apacheds-i18n	2.0.0-M15
org.apache.directory.server	apacheds-kerberos-codec	2.0.0-M15
org.apache.hadoop	hadoop-annotations	2.7.4
org.apache.hadoop	hadoop-auth	2.7.4
org.apache.hadoop	hadoop-client	2.7.4
org.apache.hadoop	hadoop-common	2.7.4
org.apache.hadoop	hadoop-hdfs	2.7.4
org.apache.hadoop	hadoop-mapreduce-client-app	2.7.4
org.apache.hadoop	hadoop-mapreduce-client-common	2.7.4
org.apache.hadoop	hadoop-mapreduce-client-core	2.7.4
org.apache.hadoop	hadoop-mapreduce-client-jobclient	2.7.4
org.apache.hadoop	hadoop-mapreduce-client-shuffle	2.7.4
org.apache.hadoop	hadoop-yarn-api	2.7.4
org.apache.hadoop	hadoop-yarn-client	2.7.4
org.apache.hadoop	hadoop-yarn-common	2.7.4
org.apache.hadoop	hadoop-yarn-server-common	2.7.4
org.apache.hive	hive-beeline	2.3.7
org.apache.hive	hive-cli	2.3.7
org.apache.hive	hive-jdbc	2.3.7
org.apache.hive	hive-llap-client	2.3.7
org.apache.hive	hive-llap-common	2.3.7
org.apache.hive	hive-serde	2.3.7
org.apache.hive	hive-shims	2.3.7
org.apache.hive	hive-storage-api	2.7.2
org.apache.hive.shims	hive-shims-0.23	2.3.7
org.apache.hive.shims	hive-shims-common	2.3.7
org.apache.hive.shims	hive-shims-scheduler	2.3.7
org.apache.htrace	htrace-core	3.1.0-incubating
org.apache.httpcomponents	httpclient	4.5.6
org.apache.httpcomponents	httpcore	4.4.12
org.apache.ivy	ivy	2.4.0
org.apache.mesos	mesos-shaded-protobuf	1.4.0
org.apache.orc	orc-core	1.5.12
org.apache.orc	orc-mapreduce	1.5.12
org.apache.orc	orc-shims	1.5.12
org.apache.parquet	parquet-column	1.10.1-databricks9
org.apache.parquet	parquet-common	1.10.1-databricks9
org.apache.parquet	parquet-encoding	1.10.1-databricks9
org.apache.parquet	parquet-format	2.4.0
org.apache.parquet	parquet-hadoop	1.10.1-databricks9
org.apache.parquet	parquet-jackson	1.10.1-databricks9
org.apache.thrift	libfb303	0.9.3
org.apache.thrift	libthrift	0.12.0
org.apache.xbean	xbean-asm7-shaded	4.15
org.apache.yetus	audience-annotations	0.5.0
org.apache.zookeeper	zookeeper	3.4.14
org.codehaus.jackson	jackson-core-asl	1.9.13
org.codehaus.jackson	jackson-jaxrs	1.9.13
org.codehaus.jackson	jackson-mapper-asl	1.9.13
org.codehaus.jackson	jackson-xc	1.9.13
org.codehaus.janino	commons-compiler	3.0.16
org.codehaus.janino	janino	3.0.16
org.datanucleus	datanucleus-api-jdo	4.2.4
org.datanucleus	datanucleus-core	4.1.17
org.datanucleus	datanucleus-rdbms	4.1.19
org.datanucleus	javax.jdo	3.2.0-m3
org.eclipse.jetty	jetty-client	9.4.36.v20210114
org.eclipse.jetty	jetty-continuation	9.4.36.v20210114
org.eclipse.jetty	jetty-http	9.4.36.v20210114
org.eclipse.jetty	jetty-io	9.4.36.v20210114
org.eclipse.jetty	jetty-jndi	9.4.36.v20210114
org.eclipse.jetty	jetty-plus	9.4.36.v20210114
org.eclipse.jetty	jetty-proxy	9.4.36.v20210114
org.eclipse.jetty	jetty-security	9.4.36.v20210114
org.eclipse.jetty	jetty-server	9.4.36.v20210114
org.eclipse.jetty	jetty-servlet	9.4.36.v20210114
org.eclipse.jetty	jetty-servlets	9.4.36.v20210114
org.eclipse.jetty	jetty-util	9.4.36.v20210114
org.eclipse.jetty	jetty-util-ajax	9.4.36.v20210114
org.eclipse.jetty	jetty-webapp	9.4.36.v20210114
org.eclipse.jetty	jetty-xml	9.4.36.v20210114
org.fusesource.leveldbjni	leveldbjni-all	1.8
org.glassfish.hk2	hk2-api	2.6.1
org.glassfish.hk2	hk2-locator	2.6.1
org.glassfish.hk2	hk2-utils	2.6.1
org.glassfish.hk2	osgi-resource-locator	1.0.3
org.glassfish.hk2.external	aopalliance-repackaged	2.6.1
org.glassfish.hk2.external	jakarta.inject	2.6.1
org.glassfish.jersey.containers	jersey-container-servlet	2.30
org.glassfish.jersey.containers	jersey-container-servlet-core	2.30
org.glassfish.jersey.core	jersey-client	2.30
org.glassfish.jersey.core	jersey-common	2.30
org.glassfish.jersey.core	jersey-server	2.30
org.glassfish.jersey.inject	jersey-hk2	2.30
org.glassfish.jersey.media	jersey-media-jaxb	2.30
org.hibernate.validator	hibernate-validator	6.1.0.Final
org.javassist	javassist	3.25.0-GA
org.jboss.logging	jboss-logging	3.3.2.Final
org.jdbi	jdbi	2.63.1
org.joda	joda-convert	1.7
org.jodd	jodd-core	3.5.2
org.json4s	json4s-ast_2.12	3.7.0-M5
org.json4s	json4s-core_2.12	3.7.0-M5
org.json4s	json4s-jackson_2.12	3.7.0-M5
org.json4s	json4s-scalap_2.12	3.7.0-M5
org.lz4	lz4-java	1.7.1
org.mariadb.jdbc	mariadb-java-client	2.2.5
org.objenesis	objenesis	2.5.1
org.postgresql	postgresql	42.1.4
org.roaringbitmap	RoaringBitmap	0.9.14
org.roaringbitmap	shims	0.9.14
org.rocksdb	rocksdbjni	6.20.3
org.rosuda.REngine	REngine	2.1.0
org.scala-lang	scala-compiler_2.12	2.12.10
org.scala-lang	scala-library_2.12	2.12.10
org.scala-lang	scala-reflect_2.12	2.12.10
org.scala-lang.modules	scala-collection-compat_2.12	2.1.1
org.scala-lang.modules	scala-parser-combinators_2.12	1.1.2
org.scala-lang.modules	scala-xml_2.12	1.2.0
org.scala-sbt	test-interface	1.0
org.scalacheck	scalacheck_2.12	1.14.2
org.scalactic	scalactic_2.12	3.0.8
org.scalanlp	breeze-macros_2.12	1.0
org.scalanlp	breeze_2.12	1.0
org.scalatest	scalatest_2.12	3.0.8
org.slf4j	jcl-over-slf4j	1.7.30
org.slf4j	jul-to-slf4j	1.7.30
org.slf4j	slf4j-api	1.7.30
org.slf4j	slf4j-log4j12	1.7.30
org.spark-project.spark	unused	1.0.0
org.springframework	spring-core	4.1.4.RELEASE
org.springframework	spring-test	4.1.4.RELEASE
org.threeten	threeten-extra	1.5.0
org.tukaani	xz	1.5
org.typelevel	algebra_2.12	2.0.0-M2
org.typelevel	cats-kernel_2.12	2.0.0-M4
org.typelevel	machinist_2.12	0.6.8
org.typelevel	macro-compat_2.12	1.1.1
org.typelevel	spire-macros_2.12	0.17.0-M1
org.typelevel	spire-platform_2.12	0.17.0-M1
org.typelevel	spire-util_2.12	0.17.0-M1
org.typelevel	spire_2.12	0.17.0-M1
org.wildfly.openssl	wildfly-openssl	1.0.7.Final
org.xerial	sqlite-jdbc	3.8.11.2
org.xerial.snappy	snappy-java	1.1.8.2
org.yaml	snakeyaml	1.24
oro	oro	2.0.8
pl.edu.icm	JLargeArrays	1.5
software.amazon.ion	ion-java	1.0.2
stax	stax-api	1.0.1
xmlenc	xmlenc	0.52
