Databricks Runtime 4.0 (EoS)

Article
09/03/2024

Note

Support for this Databricks Runtime version has ended. For the end-of-support date, see End-of-support history. For all supported Databricks Runtime versions, see Databricks Runtime release notes versions and compatibility.

Databricks released this version in March 2018.

Important

This release was deprecated on November 1, 2018. For more information about the Databricks Runtime deprecation policy and schedule, see Databricks support lifecycles.

The following release notes provide information about Databricks Runtime 4.0, powered by Apache Spark.

Changes and improvements

The JSON data source now tries to auto-detect encoding instead of assuming it to be UTF-8. In cases where the auto-detection fails, users can specify the charset option to enforce a certain encoding. See Charset auto-detection.
Scoring and prediction using Spark MLlib pipelines in Structured Streaming is fully supported.
Databricks ML Model Export is fully supported. With this feature, you can train a Spark MLlib model on Databricks, export it with a function call, and use a Databricks library in the system of your choice to import the model and score new data.
A new Spark data source implementation offers scalable read/write access to Azure Synapse Analytics. See Spark - Synapse Analytics Connector.
The schema of the from_json function is now always converted to a nullable one. In other words, all fields, including nested ones, are nullable. This ensures that the data is compatible with the schema, preventing corruption after writing the data to parquet when a field is missing in the data and the user-provided schema declares the field as non-nullable.
Upgraded some installed Python libraries:
- futures: from 3.1.1 to 3.2.0
- pandas: from 0.18.1 to 0.19.2
- pyarrow: from 0.4.1 to 0.8.0
- setuptools: from 38.2.3 to 38.5.1
- tornado: from 4.5.2 to 4.5.3
Upgraded several installed R libraries. See Installed R Libraries.
Upgraded AWS Java SDK from 1.11.126 to 1.11.253.
Upgraded SQL Server JDBC driver from 6.1.0.jre8 to 6.2.2.jre8.
Upgraded PostgreSQL JDBC driver from 9.4-1204-jdbc41 to 42.1.4.
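
The from_json change listed above (every field, including nested ones, becomes nullable) can be pictured on the dictionary form of a Spark schema, the shape returned by StructType.jsonValue() in PySpark. The following is a plain-Python sketch of that relaxation, not the Spark implementation; the helper name to_nullable and the sample schema are hypothetical.

```python
import copy


def to_nullable(data_type):
    """Return a copy of a JSON-style Spark schema with every field nullable."""
    if not isinstance(data_type, dict):
        # Primitive types are plain strings such as "string" or "long".
        return data_type
    result = copy.deepcopy(data_type)
    kind = result.get("type")
    if kind == "struct":
        for field in result["fields"]:
            field["nullable"] = True
            field["type"] = to_nullable(field["type"])
    elif kind == "array":
        result["containsNull"] = True
        result["elementType"] = to_nullable(result["elementType"])
    elif kind == "map":
        result["valueContainsNull"] = True
        result["keyType"] = to_nullable(result["keyType"])
        result["valueType"] = to_nullable(result["valueType"])
    return result


# Hypothetical user-provided schema that declares non-nullable fields.
schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "long", "nullable": False, "metadata": {}},
        {
            "name": "tags",
            "type": {"type": "array", "elementType": "string", "containsNull": False},
            "nullable": False,
            "metadata": {},
        },
    ],
}

relaxed = to_nullable(schema)
```

With a relaxed schema like this, a record that lacks the id field parses to a null instead of violating a non-nullable declaration, which is the corruption scenario the note describes.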

Apache Spark

Databricks Runtime 4.0 includes Apache Spark 2.3.0.

Core, PySpark, and Spark SQL

Major features

Vectorized ORC Reader: [SPARK-16060]: Adds support for new ORC reader that substantially improves the ORC scan throughput through vectorization (2-5x). To enable the reader, users can set spark.sql.orc.impl to native.
Spark History Server V2: [SPARK-18085]: A new spark history server (SHS) backend that provides better scalability for large-scale applications with a more efficient event storage mechanism.
Data source API V2: [SPARK-15689][SPARK-22386]: An experimental API for plugging in new data sources in Spark. The new API attempts to address several limitations of the V1 API and aims to facilitate development of highly-performant, easy-to-maintain, and extensible external data sources. This API is still undergoing active development and breaking changes should be expected.
PySpark Performance Enhancements: [SPARK-22216][SPARK-21187]: Significant improvements in Python performance and interoperability by fast data serialization and vectorized execution.
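
As a minimal sketch of enabling the vectorized ORC reader named above, the helper below sets spark.sql.orc.impl to native on a session. The function name enable_native_orc_reader is hypothetical, and the code assumes that the spark argument is a SparkSession (anything exposing conf.set and conf.get would work).

```python
def enable_native_orc_reader(spark):
    """Select the vectorized (native) ORC reader for the given session."""
    # Configuration key and value are taken from the release note above.
    spark.conf.set("spark.sql.orc.impl", "native")
    # Read the value back so the caller can confirm what is in effect.
    return spark.conf.get("spark.sql.orc.impl")
```

In a notebook this would typically be called once as enable_native_orc_reader(spark) before reading ORC data.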

Performance and stability

[SPARK-21975]: Histogram support in cost-based optimizer.
[SPARK-20331]: Better support for predicate pushdown for Hive partition pruning.
[SPARK-19112]: Support for ZStandard compression codec.
[SPARK-21113]: Support for read ahead input stream to amortize disk I/O cost in the spill reader.
[SPARK-22510][SPARK-22692][SPARK-21871]: Further stabilize the codegen framework to avoid hitting the 64KB JVM bytecode limit on the Java method and Java compiler constant pool limit.
[SPARK-23207]: Fixed a long-standing bug in Spark where consecutive shuffle+repartition on a DataFrame could lead to incorrect answers in certain rare cases.
[SPARK-22062][SPARK-17788][SPARK-21907]: Fix various causes of OOMs.
[SPARK-22489][SPARK-22916][SPARK-22895][SPARK-20758][SPARK-22266][SPARK-19122][SPARK-22662][SPARK-21652]: Enhancements in rule-based optimizer and planner.

Other notable changes

[SPARK-20236]: Support Hive-style dynamic partition overwrite semantics.
[SPARK-4131]: Support INSERT OVERWRITE DIRECTORY to write data directly into the filesystem from a query.
[SPARK-19285][SPARK-22945][SPARK-21499][SPARK-20586][SPARK-20416][SPARK-20668]: UDF enhancements.
[SPARK-20463][SPARK-19951][SPARK-22934][SPARK-21055][SPARK-17729][SPARK-20962][SPARK-20963][SPARK-20841][SPARK-17642][SPARK-22475]: Improved ANSI SQL compliance and Hive compatibility.
[SPARK-20746]: More comprehensive SQL built-in functions.
[SPARK-21485]: Spark SQL documentation generation for built-in functions.
[SPARK-19810]: Remove support for Scala 2.10.
[SPARK-22324]: Upgrade Arrow to 0.8.0 and Netty to 4.1.17.
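
To illustrate the Hive-style dynamic partition overwrite semantics from [SPARK-20236], here is a toy plain-Python model, not Spark code. It treats a table as a dictionary from partition value to rows; the function name overwrite and the sample data are hypothetical. In Spark 2.3 the behavior is selected with the spark.sql.sources.partitionOverwriteMode setting, which accepts static (the default) or dynamic.

```python
def overwrite(table, new_rows, mode="static"):
    """Simulate an overwrite of a partitioned table.

    table:    {partition_value: [rows]}
    new_rows: iterable of (partition_value, row) pairs being written
    """
    incoming = {}
    for partition, row in new_rows:
        incoming.setdefault(partition, []).append(row)

    if mode == "static":
        # Every existing partition is dropped, even ones with no new data.
        return incoming
    if mode == "dynamic":
        # Only partitions that receive new data are replaced.
        merged = {p: list(rows) for p, rows in table.items()}
        merged.update(incoming)
        return merged
    raise ValueError("mode must be 'static' or 'dynamic'")


table = {"2018-01": ["a"], "2018-02": ["b"]}
static_result = overwrite(table, [("2018-02", "c")], mode="static")
dynamic_result = overwrite(table, [("2018-02", "c")], mode="dynamic")
```

In this model the static result keeps only partition 2018-02, while the dynamic result also preserves the untouched 2018-01 partition.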

Structured Streaming

Continuous Processing

A new execution engine that can execute streaming queries with sub-millisecond end-to-end latency by changing only a single line of user code. To learn more, see the programming guide.
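
The "single line" is the trigger. The sketch below, which assumes a streaming DataFrame and uses the console sink purely for illustration, shows the same query started in micro-batch mode or in continuous mode; the helper name start_query and the one-second intervals are hypothetical choices.

```python
def start_query(stream_df, continuous=False):
    """Start a console sink; only the trigger line differs between modes."""
    writer = stream_df.writeStream.format("console")
    if continuous:
        # Continuous processing; the interval is the checkpoint interval.
        writer = writer.trigger(continuous="1 second")
    else:
        # Micro-batch execution, one batch per second.
        writer = writer.trigger(processingTime="1 second")
    return writer.start()
```

Continuous mode supports only a subset of sources, sinks, and operations, so the programming guide remains the reference for what a given query may use.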

Stream-Stream Joins

Ability to join two streams of data, buffering rows until matching tuples arrive in the other stream. Predicates can be used against event time columns to bound the amount of state that needs to be retained.
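
The buffering and state-bounding idea can be sketched with a toy model in plain Python. This is not the Spark implementation: the class name StreamStreamJoin, the max_gap time-range condition, and the sample events are all hypothetical, and watermark handling is reduced to a single shared value.

```python
class StreamStreamJoin:
    """Toy inner join of two keyed streams with a time-range bound on state."""

    def __init__(self, max_gap):
        self.max_gap = max_gap                 # allowed |left_time - right_time|
        self.state = {"left": {}, "right": {}}  # side -> key -> [(time, value)]
        self.watermark = float("-inf")

    def process(self, side, key, event_time, value):
        """Buffer one row and return the matches it produces."""
        other = "right" if side == "left" else "left"
        matches = []
        for other_time, other_value in self.state[other].get(key, []):
            if abs(event_time - other_time) <= self.max_gap:
                pair = (value, other_value) if side == "left" else (other_value, value)
                matches.append((key, pair))
        self.state[side].setdefault(key, []).append((event_time, value))
        return matches

    def advance_watermark(self, watermark):
        """Drop buffered rows that can no longer match any future row."""
        self.watermark = max(self.watermark, watermark)
        cutoff = self.watermark - self.max_gap
        for side in self.state.values():
            for key in list(side):
                side[key] = [(t, v) for t, v in side[key] if t >= cutoff]
                if not side[key]:
                    del side[key]

    def buffered(self):
        """Number of rows currently held in state."""
        return sum(len(rows) for side in self.state.values() for rows in side.values())


join = StreamStreamJoin(max_gap=10)
first = join.process("left", "ad1", 100, "impression")   # nothing to match yet
second = join.process("right", "ad1", 105, "click")      # matches the buffered row
join.advance_watermark(120)                              # rows older than 110 are evicted
```

Without the event-time condition, neither side could ever be evicted, which is why the note stresses predicates on event-time columns.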

Streaming API V2

An experimental API for plugging in new sources and sinks that works for batch, micro-batch, and continuous execution. This API is still undergoing active development, and breaking changes should be expected.

MLlib

Highlights

ML Prediction now works with Structured Streaming, using updated APIs. Details follow.

New and improved APIs

[SPARK-21866]: Built-in support for reading images into a DataFrame (Scala/Java/Python).
[SPARK-19634]: DataFrame functions for descriptive summary statistics over vector columns (Scala/Java).
[SPARK-14516]: ClusteringEvaluator for tuning clustering algorithms, supporting Cosine silhouette and squared Euclidean silhouette metrics (Scala/Java/Python).
[SPARK-3181]: Robust linear regression with Huber loss (Scala/Java/Python).
[SPARK-13969]: FeatureHasher transformer (Scala/Java/Python).
Multiple column support for several feature transformers:
- [SPARK-13030]: OneHotEncoderEstimator (Scala/Java/Python)
- [SPARK-22397]: QuantileDiscretizer (Scala/Java)
- [SPARK-20542]: Bucketizer (Scala/Java/Python)
[SPARK-21633] and [SPARK-21542]: Improved support for custom pipeline components in Python.
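
For readers unfamiliar with the Huber loss behind the robust linear regression in [SPARK-3181], the following plain-Python sketch shows the standard per-residual form: quadratic for small residuals and linear for large ones. It is an illustration only; Spark's estimator also estimates a scale parameter, which this sketch omits. The function name huber_loss is hypothetical, and the default threshold of 1.35 mirrors the default of the epsilon parameter on Spark's LinearRegression.

```python
def huber_loss(residual, epsilon=1.35):
    """Huber loss for a single residual with threshold epsilon."""
    magnitude = abs(residual)
    if magnitude <= epsilon:
        # Small residuals are penalized quadratically, as in least squares.
        return 0.5 * residual ** 2
    # Large residuals grow only linearly, limiting the pull of outliers.
    return epsilon * (magnitude - 0.5 * epsilon)


small = huber_loss(1.0)
large = huber_loss(3.0, epsilon=1.0)
```

With a threshold of 1.0, a residual of 3.0 costs 2.5 rather than the 4.5 that squared-error loss would assign.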

New features

[SPARK-21087]: CrossValidator and TrainValidationSplit can collect all models when fitting (Scala/Java). This allows you to inspect or save all fitted models.
[SPARK-19357]: Meta-algorithms CrossValidator, TrainValidationSplit, and OneVsRest support a parallelism Param for fitting multiple sub-models in parallel Spark jobs.
[SPARK-17139]: Model summary for multinomial logistic regression (Scala/Java/Python).
[SPARK-18710]: Add offset in GLM.
[SPARK-20199]: Added featureSubsetStrategy Param to GBTClassifier and GBTRegressor. Using this to subsample features can significantly improve training speed; this option has been a key strength of xgboost.

Other notable changes

[SPARK-22156]: Fixed Word2Vec learning rate scaling with num iterations. The new learning rate is set to match the original Word2Vec C code and should give better results from training.
[SPARK-22289]: Add JSON support for Matrix parameters (This fixed a bug for ML persistence with LogisticRegressionModel when using bounds on coefficients.)
[SPARK-22700]: Bucketizer.transform incorrectly drops row containing NaN. When Param handleInvalid was set to “skip,” Bucketizer would drop a row with a valid value in the input column if another (irrelevant) column had a NaN value.
[SPARK-22446]: Catalyst optimizer sometimes caused StringIndexerModel to throw an incorrect “Unseen label” exception when handleInvalid was set to “error.” This could happen for filtered data, due to predicate push-down, causing errors even after invalid rows had already been filtered from the input dataset.
[SPARK-21681]: Fixed an edge case bug in multinomial logistic regression that resulted in incorrect coefficients when some features had zero variance.
Major optimizations:
- [SPARK-22707]: Reduced memory consumption for CrossValidator.
- [SPARK-22949]: Reduced memory consumption for TrainValidationSplit.
- [SPARK-21690]: Imputer should train using a single pass over the data.
- [SPARK-14371]: OnlineLDAOptimizer avoids collecting statistics to the driver for each mini-batch.

SparkR

The main focus of SparkR in the 2.3.0 release was improving the stability of UDFs and adding several new SparkR wrappers around existing APIs:

Major features

Improved function parity between SQL and R
[SPARK-22933]: Structured Streaming APIs for withWatermark, trigger, partitionBy and stream-stream joins.
[SPARK-21266]: SparkR UDF with DDL-formatted schema support.
[SPARK-20726][SPARK-22924][SPARK-22843]: Several new DataFrame API Wrappers.
[SPARK-15767][SPARK-21622][SPARK-20917][SPARK-20307][SPARK-20906]: Several new SparkML API Wrappers.

GraphX

Optimizations

[SPARK-5484]: Pregel now checkpoints periodically to avoid StackOverflowErrors.
[SPARK-21491]: Small performance improvement in several places.

Deprecations

Python

[SPARK-23122]: Deprecate register* for UDFs in SQLContext and Catalog in PySpark.

MLlib

[SPARK-13030]: OneHotEncoder has been deprecated and will be removed in 3.0. It has been replaced by the new OneHotEncoderEstimator. OneHotEncoderEstimator will be renamed to OneHotEncoder in 3.0 (but OneHotEncoderEstimator will be kept as an alias).

Changes of behavior

SparkSQL

[SPARK-22036]: By default, arithmetic operations between decimals return a rounded value if an exact representation is not possible (instead of returning NULL as in prior versions).
[SPARK-22937]: When all inputs are binary, SQL elt() returns an output as binary. Otherwise, it returns as a string. In prior versions, it always returned as a string regardless of input types.
[SPARK-22895]: The Join/Filter’s deterministic predicates that are after the first non-deterministic predicates are also pushed down/through the child operators, if possible. In the prior versions, these filters were not eligible for predicate pushdown.
[SPARK-22771]: When all inputs are binary, functions.concat() returns an output as binary. Otherwise, it returns as a string. In the prior versions, it always returned as a string regardless of input types.
[SPARK-22489]: When either of the join sides is broadcastable, we prefer to broadcast the table that is explicitly specified in a broadcast hint.
[SPARK-22165]: Partition column inference previously found incorrect common type for different inferred types. For example, previously it ended up with double type as the common type for double type and date type. Now it finds the correct common type for such conflicts. For details, see the migration guide.
[SPARK-22100]: The percentile_approx function previously accepted numeric type input and outputted double type results. Now it supports date type, timestamp type and numeric types as input types. The result type is also changed to be the same as the input type, which is more reasonable for percentiles.
[SPARK-21610]: Queries from raw JSON/CSV files are disallowed when the referenced columns only include the internal corrupt record column (named _corrupt_record by default). Instead, you can cache or save the parsed results and then send the same query.
[SPARK-23421]: Since Spark 2.2.1 and 2.3.0, the schema is always inferred at runtime when the data source tables have columns that exist in both the partition schema and the data schema. The inferred schema does not have the partitioned columns. When reading the table, Spark respects the partition values of these overlapping columns instead of the values stored in the data source files. In the 2.2.0 and 2.1.x releases, the inferred schema is partitioned but the data of the table is invisible to users (i.e., the result set is empty).

PySpark

[SPARK-19732]: na.fill() or fillna also accepts boolean and replaces nulls with booleans. In prior Spark versions, PySpark ignored it and returned the original Dataset/DataFrame.
[SPARK-22395]: pandas 0.19.2 or later is required for using pandas-related functionality, such as toPandas, createDataFrame from pandas DataFrame, etc.
[SPARK-22395]: The behavior of timestamp values for pandas-related functionality was changed to respect the session timezone, which was ignored in prior versions.
[SPARK-23328]: df.replace no longer allows omitting value when to_replace is not a dictionary. Previously, value could be omitted in the other cases and defaulted to None, which was counterintuitive and error-prone.

MLlib

Breaking API Changes: The class and trait hierarchy for logistic regression model summaries was changed to be cleaner and better accommodate the addition of the multi-class summary. This is a breaking change for user code that casts a LogisticRegressionTrainingSummary to a BinaryLogisticRegressionTrainingSummary. Users should instead use the model.binarySummary method. See [SPARK-17139] for more detail (note this is an @Experimental API). This does not affect the Python summary method, which will still work correctly for both multinomial and binary cases.
[SPARK-21806]: BinaryClassificationMetrics.pr(): first point (0.0, 1.0) is misleading and has been replaced by (0.0, p) where precision p matches the lowest recall point.
[SPARK-16957]: Decision trees now use weighted midpoints when choosing split values. This may change results from model training.
[SPARK-14657]: RFormula without an intercept now outputs the reference category when encoding string terms, in order to match native R behavior. This may change results from model training.
[SPARK-21027]: The default parallelism used in OneVsRest is now set to 1 (i.e. serial). In 2.2 and earlier versions, the level of parallelism was set to the default threadpool size in Scala. This may change performance.
[SPARK-21523]: Upgraded Breeze to 0.13.2. This included an important bug fix in strong Wolfe line search for L-BFGS.
[SPARK-15526]: The JPMML dependency is now shaded.
Also see the “Bug fixes” section for behavior changes resulting from fixing bugs.
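
The change to the first precision-recall point in [SPARK-21806] can be sketched in plain Python. The code below is an illustration of the described behavior, not the MLlib implementation; the function name pr_curve and the sample data are hypothetical. It builds one (recall, precision) point per distinct score threshold and prepends (0.0, p), where p is the precision at the lowest-recall point, instead of the former (0.0, 1.0).

```python
def pr_curve(scores_and_labels):
    """Return (recall, precision) points, starting with (0.0, p)."""
    ranked = sorted(scores_and_labels, key=lambda pair: pair[0], reverse=True)
    total_positives = sum(1 for _, label in ranked if label == 1)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    points = []
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example that shares this score threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        recall = true_pos / total_positives
        precision = true_pos / (true_pos + false_pos)
        points.append((recall, precision))

    # New behavior: reuse the precision of the lowest-recall point.
    return [(0.0, points[0][1])] + points


# Hypothetical (score, label) pairs; the top score is shared by one
# positive and one negative example.
data = [(0.9, 1), (0.9, 0), (0.6, 1), (0.3, 0)]
curve = pr_curve(data)
```

For this data the curve starts at (0.0, 0.5), whereas the previous behavior would have reported a first point of (0.0, 1.0) that no threshold actually achieves.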

Known issues

[SPARK-23523][SQL]: Incorrect result caused by the rule OptimizeMetadataOnlyQuery.
[SPARK-23406]: Bugs in stream-stream self-joins.

Maintenance updates

See Databricks Runtime 4.0 maintenance updates.

System environment

Operating System: Ubuntu 16.04.4 LTS
Java: 1.8.0_151
Scala: 2.11.8
Python: 2.7.12 (or 3.5.2 if using Python 3)
R: R version 3.4.3 (2017-11-30)
GPU clusters: The following NVIDIA GPU libraries are installed:
- Tesla driver 375.66
- CUDA 8.0
- CUDNN 6.0

Installed Python libraries

Library	Version	Library	Version	Library	Version
ansi2html	1.1.1	argparse	1.2.1	backports-abc	0.5
boto	2.42.0	boto3	1.4.1	botocore	1.4.70
brewer2mpl	1.4.1	certifi	2016.2.28	cffi	1.7.0
chardet	2.3.0	colorama	0.3.7	configobj	5.0.6
cryptography	1.5	cycler	0.10.0	Cython	0.24.1
decorator	4.0.10	docutils	0.14	enum34	1.1.6
et-xmlfile	1.0.1	freetype-py	1.0.2	funcsigs	1.0.2
fusepy	2.0.4	futures	3.2.0	ggplot	0.6.8
html5lib	0.999	idna	2.1	ipaddress	1.0.16
ipython	2.2.0	ipython-genutils	0.1.0	jdcal	1.2
Jinja2	2.8	jmespath	0.9.0	llvmlite	0.13.0
lxml	3.6.4	MarkupSafe	0.23	matplotlib	1.5.3
mpld3	0.2	msgpack-python	0.4.7	ndg-httpsclient	0.3.3
numba	0.28.1	numpy	1.11.1	openpyxl	2.3.2
pandas	0.19.2	pathlib2	2.1.0	patsy	0.4.1
pexpect	4.0.1	pickleshare	0.7.4	Pillow	3.3.1
pip	9.0.1	ply	3.9	prompt-toolkit	1.0.7
psycopg2	2.6.2	ptyprocess	0.5.1	py4j	0.10.3
pyarrow	0.8.0	pyasn1	0.1.9	pycparser	2.14
Pygments	2.1.3	PyGObject	3.20.0	pyOpenSSL	16.0.0
pyparsing	2.2.0	pypng	0.0.18	Python	2.7.12
python-dateutil	2.5.3	python-geohash	0.8.5	pytz	2016.6.1
requests	2.11.1	s3transfer	0.1.9	scikit-learn	0.18.1
scipy	0.18.1	scour	0.32	seaborn	0.7.1
setuptools	38.5.1	simplejson	3.8.2	simples3	1.0
singledispatch	3.4.0.3	six	1.10.0	statsmodels	0.6.1
tornado	4.5.3	traitlets	4.3.0	urllib3	1.19.1
virtualenv	15.0.1	wcwidth	0.1.7	wheel	0.30.0
wsgiref	0.1.2

Installed R libraries

Library	Version	Library	Version	Library	Version
abind	1.4-5	assertthat	0.2.0	backports	1.1.1
base	3.4.3	BH	1.65.0-1	bindr	0.1
bindrcpp	0.2	bit	1.1-12	bit64	0.9-7
bitops	1.0-6	blob	1.1.0	boot	1.3-20
brew	1.0-6	broom	0.4.3	car	2.1-6
caret	6.0-77	chron	2.3-51	class	7.3-14
cluster	2.0.6	codetools	0.2-15	colorspace	1.3-2
commonmark	1.4	compiler	3.4.3	crayon	1.3.4
curl	3.0	CVST	0.2-1	data.table	1.10.4-3
datasets	3.4.3	DBI	0.7	ddalpha	1.3.1
DEoptimR	1.0-8	desc	1.1.1	devtools	1.13.4
dichromat	2.0-0	digest	0.6.12	dimRed	0.1.0
doMC	1.3.4	dplyr	0.7.4	DRR	0.0.2
foreach	1.4.3	foreign	0.8-69	gbm	2.1.3
ggplot2	2.2.1	git2r	0.19.0	glmnet	2.0-13
glue	1.2.0	gower	0.1.2	graphics	3.4.3
grDevices	3.4.3	grid	3.4.3	gsubfn	0.6-6
gtable	0.2.0	h2o	3.16.0.1	httr	1.3.1
hwriter	1.3.2	hwriterPlus	1.0-3	ipred	0.9-6
iterators	1.0.8	jsonlite	1.5	kernlab	0.9-25
KernSmooth	2.23-15	labeling	0.3	lattice	0.20-35
lava	1.5.1	lazyeval	0.2.1	littler	0.3.2
lme4	1.1-14	lubridate	1.7.1	magrittr	1.5
mapproj	1.2-5	maps	3.2.0	MASS	7.3-48
Matrix	1.2-11	MatrixModels	0.4-1	memoise	1.1.0
methods	3.4.3	mgcv	1.8-23	mime	0.5
minqa	1.2.4	mnormt	1.5-5	ModelMetrics	1.1.0
munsell	0.4.3	mvtnorm	1.0-6	nlme	3.1-131
nloptr	1.0.4	nnet	7.3-12	numDeriv	2016.8-1
openssl	0.9.9	parallel	3.4.3	pbkrtest	0.4-7
pkgconfig	2.0.1	pkgKitten	0.1.4	plogr	0.1-1
plyr	1.8.4	praise	1.0.0	pROC	1.10.0
prodlim	1.6.1	proto	1.0.0	psych	1.7.8
purrr	0.2.4	quantreg	5.34	R.methodsS3	1.7.1
R.oo	1.21.0	R.utils	2.6.0	R6	2.2.2
randomForest	4.6-12	RColorBrewer	1.1-2	Rcpp	0.12.14
RcppEigen	0.3.3.3.1	RcppRoll	0.2.2	RCurl	1.95-4.8
recipes	0.1.1	reshape2	1.4.2	rlang	0.1.4
robustbase	0.92-8	RODBC	1.3-15	roxygen2	6.0.1
rpart	4.1-12	rprojroot	1.2	Rserve	1.7-3
RSQLite	2.0	rstudioapi	0.7	scales	0.5.0
sfsmisc	1.1-1	sp	1.2-5	SparkR	2.3.0
SparseM	1.77	spatial	7.3-11	splines	3.4.3
sqldf	0.4-11	statmod	1.4.30	stats	3.4.3
stats4	3.4.3	stringi	1.1.6	stringr	1.2.0
survival	2.41-3	tcltk	3.4.3	TeachingDemos	2.10
testthat	1.0.2	tibble	1.3.4	tidyr	0.7.2
tidyselect	0.2.3	timeDate	3042.101	tools	3.4.3
utils	3.4.3	viridisLite	0.2.0	whisker	0.3-2
withr	2.1.0	xml2	1.1.1

Installed Java and Scala libraries (Scala 2.11 cluster version)

Group ID	Artifact ID	Version
antlr	antlr	2.7.7
com.amazonaws	amazon-kinesis-client	1.7.3
com.amazonaws	aws-java-sdk-autoscaling	1.11.253
com.amazonaws	aws-java-sdk-cloudformation	1.11.253
com.amazonaws	aws-java-sdk-cloudfront	1.11.253
com.amazonaws	aws-java-sdk-cloudhsm	1.11.253
com.amazonaws	aws-java-sdk-cloudsearch	1.11.253
com.amazonaws	aws-java-sdk-cloudtrail	1.11.253
com.amazonaws	aws-java-sdk-cloudwatch	1.11.253
com.amazonaws	aws-java-sdk-cloudwatchmetrics	1.11.253
com.amazonaws	aws-java-sdk-codedeploy	1.11.253
com.amazonaws	aws-java-sdk-cognitoidentity	1.11.253
com.amazonaws	aws-java-sdk-cognitosync	1.11.253
com.amazonaws	aws-java-sdk-config	1.11.253
com.amazonaws	aws-java-sdk-core	1.11.253
com.amazonaws	aws-java-sdk-datapipeline	1.11.253
com.amazonaws	aws-java-sdk-directconnect	1.11.253
com.amazonaws	aws-java-sdk-directory	1.11.253
com.amazonaws	aws-java-sdk-dynamodb	1.11.253
com.amazonaws	aws-java-sdk-ec2	1.11.253
com.amazonaws	aws-java-sdk-ecs	1.11.253
com.amazonaws	aws-java-sdk-efs	1.11.253
com.amazonaws	aws-java-sdk-elasticache	1.11.253
com.amazonaws	aws-java-sdk-elasticbeanstalk	1.11.253
com.amazonaws	aws-java-sdk-elasticloadbalancing	1.11.253
com.amazonaws	aws-java-sdk-elastictranscoder	1.11.253
com.amazonaws	aws-java-sdk-emr	1.11.253
com.amazonaws	aws-java-sdk-glacier	1.11.253
com.amazonaws	aws-java-sdk-iam	1.11.253
com.amazonaws	aws-java-sdk-importexport	1.11.253
com.amazonaws	aws-java-sdk-kinesis	1.11.253
com.amazonaws	aws-java-sdk-kms	1.11.253
com.amazonaws	aws-java-sdk-lambda	1.11.253
com.amazonaws	aws-java-sdk-logs	1.11.253
com.amazonaws	aws-java-sdk-machinelearning	1.11.253
com.amazonaws	aws-java-sdk-opsworks	1.11.253
com.amazonaws	aws-java-sdk-rds	1.11.253
com.amazonaws	aws-java-sdk-redshift	1.11.253
com.amazonaws	aws-java-sdk-route53	1.11.253
com.amazonaws	aws-java-sdk-s3	1.11.253
com.amazonaws	aws-java-sdk-ses	1.11.253
com.amazonaws	aws-java-sdk-simpledb	1.11.253
com.amazonaws	aws-java-sdk-simpleworkflow	1.11.253
com.amazonaws	aws-java-sdk-sns	1.11.253
com.amazonaws	aws-java-sdk-sqs	1.11.253
com.amazonaws	aws-java-sdk-ssm	1.11.253
com.amazonaws	aws-java-sdk-storagegateway	1.11.253
com.amazonaws	aws-java-sdk-sts	1.11.253
com.amazonaws	aws-java-sdk-support	1.11.253
com.amazonaws	aws-java-sdk-swf-libraries	1.11.22
com.amazonaws	aws-java-sdk-workspaces	1.11.253
com.amazonaws	jmespath-java	1.11.253
com.carrotsearch	hppc	0.7.2
com.chuusai	shapeless_2.11	2.3.2
com.clearspring.analytics	stream	2.7.0
com.databricks	Rserve	1.8-3
com.databricks	dbml-local_2.11	0.3.0-db1-spark2.3
com.databricks	dbml-local_2.11-tests	0.3.0-db1-spark2.3
com.databricks	jets3t	0.7.1-0
com.databricks.scalapb	compilerplugin_2.11	0.4.15-9
com.databricks.scalapb	scalapb-runtime_2.11	0.4.15-9
com.esotericsoftware	kryo-shaded	3.0.3
com.esotericsoftware	minlog	1.3.0
com.fasterxml	classmate	1.0.0
com.fasterxml.jackson.core	jackson-annotations	2.6.7
com.fasterxml.jackson.core	jackson-core	2.6.7
com.fasterxml.jackson.core	jackson-databind	2.6.7.1
com.fasterxml.jackson.dataformat	jackson-dataformat-cbor	2.6.7
com.fasterxml.jackson.datatype	jackson-datatype-joda	2.6.7
com.fasterxml.jackson.module	jackson-module-paranamer	2.6.7
com.fasterxml.jackson.module	jackson-module-scala_2.11	2.6.7.1
com.github.fommil	jniloader	1.1
com.github.fommil.netlib	core	1.1.2
com.github.fommil.netlib	native_ref-java	1.1
com.github.fommil.netlib	native_ref-java-natives	1.1
com.github.fommil.netlib	native_system-java	1.1
com.github.fommil.netlib	native_system-java-natives	1.1
com.github.fommil.netlib	netlib-native_ref-linux-x86_64-natives	1.1
com.github.fommil.netlib	netlib-native_system-linux-x86_64-natives	1.1
com.github.luben	zstd-jni	1.3.2-2
com.github.rwl	jtransforms	2.4.0
com.google.code.findbugs	jsr305	2.0.1
com.google.code.gson	gson	2.2.4
com.google.guava	guava	15.0
com.google.protobuf	protobuf-java	2.6.1
com.googlecode.javaewah	JavaEWAH	0.3.2
com.h2database	h2	1.3.174
com.jamesmurty.utils	java-xmlbuilder	1.1
com.jcraft	jsch	0.1.50
com.jolbox	bonecp	0.8.0.RELEASE
com.mchange	c3p0	0.9.5.1
com.mchange	mchange-commons-java	0.2.10
com.microsoft.azure	azure-data-lake-store-sdk	2.0.11
com.microsoft.sqlserver	mssql-jdbc	6.2.2.jre8
com.ning	compress-lzf	1.0.3
com.sun.mail	javax.mail	1.5.2
com.thoughtworks.paranamer	paranamer	2.8
com.trueaccord.lenses	lenses_2.11	0.3
com.twitter	chill-java	0.8.4
com.twitter	chill_2.11	0.8.4
com.twitter	parquet-hadoop-bundle	1.6.0
com.twitter	util-app_2.11	6.23.0
com.twitter	util-core_2.11	6.23.0
com.twitter	util-jvm_2.11	6.23.0
com.typesafe	config	1.2.1
com.typesafe.scala-logging	scala-logging-api_2.11	2.1.2
com.typesafe.scala-logging	scala-logging-slf4j_2.11	2.1.2
com.univocity	univocity-parsers	2.5.9
com.vlkan	flatbuffers	1.2.0-3f79e055
com.zaxxer	HikariCP	2.4.1
commons-beanutils	commons-beanutils	1.7.0
commons-beanutils	commons-beanutils-core	1.8.0
commons-cli	commons-cli	1.2
commons-codec	commons-codec	1.10
commons-collections	commons-collections	3.2.2
commons-configuration	commons-configuration	1.6
commons-dbcp	commons-dbcp	1.4
commons-digester	commons-digester	1.8
commons-httpclient	commons-httpclient	3.1
commons-io	commons-io	2.4
commons-lang	commons-lang	2.6
commons-logging	commons-logging	1.1.3
commons-net	commons-net	2.2
commons-pool	commons-pool	1.5.4
info.ganglia.gmetric4j	gmetric4j	1.0.7
io.airlift	aircompressor	0.8
io.dropwizard.metrics	metrics-core	3.1.5
io.dropwizard.metrics	metrics-ganglia	3.1.5
io.dropwizard.metrics	metrics-graphite	3.1.5
io.dropwizard.metrics	metrics-healthchecks	3.1.5
io.dropwizard.metrics	metrics-jetty9	3.1.5
io.dropwizard.metrics	metrics-json	3.1.5
io.dropwizard.metrics	metrics-jvm	3.1.5
io.dropwizard.metrics	metrics-log4j	3.1.5
io.dropwizard.metrics	metrics-servlets	3.1.5
io.netty	netty	3.9.9.Final
io.netty	netty-all	4.1.17.Final
io.prometheus	simpleclient	0.0.16
io.prometheus	simpleclient_common	0.0.16
io.prometheus	simpleclient_dropwizard	0.0.16
io.prometheus	simpleclient_servlet	0.0.16
io.prometheus.jmx	collector	0.7
javax.activation	activation	1.1.1
javax.annotation	javax.annotation-api	1.2
javax.el	javax.el-api	2.2.4
javax.jdo	jdo-api	3.0.1
javax.servlet	javax.servlet-api	3.1.0
javax.servlet.jsp	jsp-api	2.1
javax.transaction	jta	1.1
javax.validation	validation-api	1.1.0.Final
javax.ws.rs	javax.ws.rs-api	2.0.1
javax.xml.bind	jaxb-api	2.2.2
javax.xml.stream	stax-api	1.0-2
javolution	javolution	5.5.1
jline	jline	2.11
joda-time	joda-time	2.9.3
log4j	apache-log4j-extras	1.2.17
log4j	log4j	1.2.17
net.hydromatic	eigenbase-properties	1.1.5
net.iharder	base64	2.3.8
net.java.dev.jets3t	jets3t	0.9.4
net.razorvine	pyrolite	4.13
net.sf.jpam	jpam	1.1
net.sf.opencsv	opencsv	2.3
net.sf.supercsv	super-csv	2.2.0
net.sourceforge.f2j	arpack_combined_all	0.1
org.acplt	oncrpc	1.0.7
org.antlr	ST4	4.0.4
org.antlr	antlr-runtime	3.4
org.antlr	antlr4-runtime	4.7
org.antlr	stringtemplate	3.2.1
org.apache.ant	ant	1.9.2
org.apache.ant	ant-jsch	1.9.2
org.apache.ant	ant-launcher	1.9.2
org.apache.arrow	arrow-format	0.8.0
org.apache.arrow	arrow-memory	0.8.0
org.apache.arrow	arrow-vector	0.8.0
org.apache.avro	avro	1.7.7
org.apache.avro	avro-ipc	1.7.7
org.apache.avro	avro-ipc-tests	1.7.7
org.apache.avro	avro-mapred-hadoop2	1.7.7
org.apache.calcite	calcite-avatica	1.2.0-incubating
org.apache.calcite	calcite-core	1.2.0-incubating
org.apache.calcite	calcite-linq4j	1.2.0-incubating
org.apache.commons	commons-compress	1.4.1
org.apache.commons	commons-crypto	1.0.0
org.apache.commons	commons-lang3	3.5
org.apache.commons	commons-math3	3.4.1
org.apache.curator	curator-client	2.7.1
org.apache.curator	curator-framework	2.7.1
org.apache.curator	curator-recipes	2.7.1
org.apache.derby	derby	10.12.1.1
org.apache.directory.api	api-asn1-api	1.0.0-M20
org.apache.directory.api	api-util	1.0.0-M20
org.apache.directory.server	apacheds-i18n	2.0.0-M15
org.apache.directory.server	apacheds-kerberos-codec	2.0.0-M15
org.apache.hadoop	hadoop-annotations	2.7.3
org.apache.hadoop	hadoop-auth	2.7.3
org.apache.hadoop	hadoop-client	2.7.3
org.apache.hadoop	hadoop-common	2.7.3
org.apache.hadoop	hadoop-hdfs	2.7.3
org.apache.hadoop	hadoop-mapreduce-client-app	2.7.3
org.apache.hadoop	hadoop-mapreduce-client-common	2.7.3
org.apache.hadoop	hadoop-mapreduce-client-core	2.7.3
org.apache.hadoop	hadoop-mapreduce-client-jobclient	2.7.3
org.apache.hadoop	hadoop-mapreduce-client-shuffle	2.7.3
org.apache.hadoop	hadoop-yarn-api	2.7.3
org.apache.hadoop	hadoop-yarn-client	2.7.3
org.apache.hadoop	hadoop-yarn-common	2.7.3
org.apache.hadoop	hadoop-yarn-server-common	2.7.3
org.apache.htrace	htrace-core	3.1.0-incubating
org.apache.httpcomponents	httpclient	4.5.4
org.apache.httpcomponents	httpcore	4.4.8
org.apache.ivy	ivy	2.4.0
org.apache.orc	orc-core-nohive	1.4.1
org.apache.orc	orc-mapreduce-nohive	1.4.1
org.apache.parquet	parquet-column	1.8.2-databricks1
org.apache.parquet	parquet-common	1.8.2-databricks1
org.apache.parquet	parquet-encoding	1.8.2-databricks1
org.apache.parquet	parquet-format	2.3.1
org.apache.parquet	parquet-hadoop	1.8.2-databricks1
org.apache.parquet	parquet-jackson	1.8.2-databricks1
org.apache.thrift	libfb303	0.9.3
org.apache.thrift	libthrift	0.9.3
org.apache.xbean	xbean-asm5-shaded	4.4
org.apache.zookeeper	zookeeper	3.4.6
org.bouncycastle	bcprov-jdk15on	1.58
org.codehaus.jackson	jackson-core-asl	1.9.13
org.codehaus.jackson	jackson-jaxrs	1.9.13
org.codehaus.jackson	jackson-mapper-asl	1.9.13
org.codehaus.jackson	jackson-xc	1.9.13
org.codehaus.janino	commons-compiler	3.0.8
org.codehaus.janino	janino	3.0.8
org.datanucleus	datanucleus-api-jdo	3.2.6
org.datanucleus	datanucleus-core	3.2.10
org.datanucleus	datanucleus-rdbms	3.2.9
org.eclipse.jetty	jetty-client	9.3.20.v20170531
org.eclipse.jetty	jetty-continuation	9.3.20.v20170531
org.eclipse.jetty	jetty-http	9.3.20.v20170531
org.eclipse.jetty	jetty-io	9.3.20.v20170531
org.eclipse.jetty	jetty-jndi	9.3.20.v20170531
org.eclipse.jetty	jetty-plus	9.3.20.v20170531
org.eclipse.jetty	jetty-proxy	9.3.20.v20170531
org.eclipse.jetty	jetty-security	9.3.20.v20170531
org.eclipse.jetty	jetty-server	9.3.20.v20170531
org.eclipse.jetty	jetty-servlet	9.3.20.v20170531
org.eclipse.jetty	jetty-servlets	9.3.20.v20170531
org.eclipse.jetty	jetty-util	9.3.20.v20170531
org.eclipse.jetty	jetty-webapp	9.3.20.v20170531
org.eclipse.jetty	jetty-xml	9.3.20.v20170531
org.fusesource.leveldbjni	leveldbjni-all	1.8
org.glassfish.hk2	hk2-api	2.4.0-b34
org.glassfish.hk2	hk2-locator	2.4.0-b34
org.glassfish.hk2	hk2-utils	2.4.0-b34
org.glassfish.hk2	osgi-resource-locator	1.0.1
org.glassfish.hk2.external	aopalliance-repackaged	2.4.0-b34
org.glassfish.hk2.external	javax.inject	2.4.0-b34
org.glassfish.jersey.bundles.repackaged	jersey-guava	2.22.2
org.glassfish.jersey.containers	jersey-container-servlet	2.22.2
org.glassfish.jersey.containers	jersey-container-servlet-core	2.22.2
org.glassfish.jersey.core	jersey-client	2.22.2
org.glassfish.jersey.core	jersey-common	2.22.2
org.glassfish.jersey.core	jersey-server	2.22.2
org.glassfish.jersey.media	jersey-media-jaxb	2.22.2
org.hibernate	hibernate-validator	5.1.1.Final
org.iq80.snappy	snappy	0.2
org.javassist	javassist	3.18.1-GA
org.jboss.logging	jboss-logging	3.1.3.GA
org.jdbi	jdbi	2.63.1
org.joda	joda-convert	1.7
org.jodd	jodd-core	3.5.2
org.json4s	json4s-ast_2.11	3.2.11
org.json4s	json4s-core_2.11	3.2.11
org.json4s	json4s-jackson_2.11	3.2.11
org.lz4	lz4-java	1.4.0
org.mariadb.jdbc	mariadb-java-client	2.1.2
org.mockito	mockito-all	1.9.5
org.objenesis	objenesis	2.1
org.postgresql	postgresql	42.1.4
org.roaringbitmap	RoaringBitmap	0.5.11
org.rocksdb	rocksdbjni	5.2.1
org.rosuda.REngine	REngine	2.1.0
org.scala-lang	scala-compiler_2.11	2.11.8
org.scala-lang	scala-library_2.11	2.11.8
org.scala-lang	scala-reflect_2.11	2.11.8
org.scala-lang	scalap_2.11	2.11.8
org.scala-lang.modules	scala-parser-combinators_2.11	1.0.2
org.scala-lang.modules	scala-xml_2.11	1.0.5
org.scala-sbt	test-interface	1.0
org.scalacheck	scalacheck_2.11	1.12.5
org.scalanlp	breeze-macros_2.11	0.13.2
org.scalanlp	breeze_2.11	0.13.2
org.scalatest	scalatest_2.11	2.2.6
org.slf4j	jcl-over-slf4j	1.7.16
org.slf4j	jul-to-slf4j	1.7.16
org.slf4j	slf4j-api	1.7.16
org.slf4j	slf4j-log4j12	1.7.16
org.spark-project.hive	hive-beeline	1.2.1.spark2
org.spark-project.hive	hive-cli	1.2.1.spark2
org.spark-project.hive	hive-exec	1.2.1.spark2
org.spark-project.hive	hive-jdbc	1.2.1.spark2
org.spark-project.hive	hive-metastore	1.2.1.spark2
org.spark-project.spark	unused	1.0.0
org.spire-math	spire-macros_2.11	0.13.0
org.spire-math	spire_2.11	0.13.0
org.springframework	spring-core	4.1.4.RELEASE
org.springframework	spring-test	4.1.4.RELEASE
org.tukaani	xz	1.0
org.typelevel	machinist_2.11	0.6.1
org.typelevel	macro-compat_2.11	1.1.1
org.xerial	sqlite-jdbc	3.8.11.2
org.xerial.snappy	snappy-java	1.1.2.6
org.yaml	snakeyaml	1.16
oro	oro	2.0.8
software.amazon.ion	ion-java	1.0.2
stax	stax-api	1.0.1
xmlenc	xmlenc	0.52
