Databricks Runtime 14.3 LTS for Machine Learning

Databricks Runtime 14.3 LTS for Machine Learning provides a ready-to-go environment for machine learning and data science based on Databricks Runtime 14.3 LTS. Databricks Runtime ML contains many popular machine learning libraries, including TensorFlow, PyTorch, and XGBoost. Databricks Runtime ML includes AutoML, a tool to automatically train machine learning pipelines. Databricks Runtime ML also supports distributed deep learning training using Horovod.

New features and improvements

Databricks Runtime 14.3 LTS ML is built on top of Databricks Runtime 14.3 LTS. For information on what’s new in Databricks Runtime 14.3 LTS, including Apache Spark MLlib and SparkR, see the Databricks Runtime 14.3 LTS release notes.

System environment

The system environment in Databricks Runtime 14.3 LTS ML differs from Databricks Runtime 14.3 LTS as follows:

  • For GPU clusters, Databricks Runtime ML includes the following NVIDIA GPU libraries:
    • CUDA 11.8
    • cuDNN 8.9.0.131-1
    • NCCL 2.15.5
    • TensorRT 8.5.3-1

Databricks Runtime 14.3 LTS ML includes XGBoost 1.7.6, which does not support GPU clusters whose GPUs have compute capability 5.2 or lower.
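
One way to check whether a GPU clears this threshold is to compare its compute capability against 5.2. The following is a minimal sketch, not an official check: it assumes PyTorch (included in Databricks Runtime ML) is importable and uses torch.cuda.get_device_capability to read the capability. The helper names xgboost_gpu_supported and check_local_gpus are illustrative only.

```python
from typing import Tuple

# Highest compute capability that XGBoost 1.7.6 in this runtime does NOT
# support, per the note above (5.2 and below are unsupported).
MAX_UNSUPPORTED_CAPABILITY: Tuple[int, int] = (5, 2)


def xgboost_gpu_supported(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is above 5.2."""
    return tuple(capability) > MAX_UNSUPPORTED_CAPABILITY


def check_local_gpus() -> None:
    """Print the capability of each visible GPU (requires PyTorch with CUDA)."""
    import torch  # imported lazily so the helper above works without PyTorch

    if not torch.cuda.is_available():
        print("No CUDA device visible; nothing to check.")
        return
    for index in range(torch.cuda.device_count()):
        capability = torch.cuda.get_device_capability(index)
        status = "supported" if xgboost_gpu_supported(capability) else "not supported"
        print(f"GPU {index}: compute capability {capability[0]}.{capability[1]} ({status})")
```

Calling check_local_gpus() in a notebook attached to a GPU cluster reports each device on the driver; worker nodes would need the same check run on them.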

Libraries

The following sections list the libraries included in Databricks Runtime 14.3 LTS ML that differ from those included in Databricks Runtime 14.3 LTS.

Top-tier libraries

Databricks Runtime 14.3 LTS ML includes the following top-tier libraries:

Python libraries

Databricks Runtime 14.3 LTS ML uses virtualenv for Python package management and includes many popular ML packages.

In addition to the packages specified in the following sections, Databricks Runtime 14.3 LTS ML also includes the following packages:

  • hyperopt 0.2.7+db4
  • sparkdl 3.0.0_db1
  • automl 1.24.0

To reproduce the Databricks Runtime ML Python environment in your local Python virtual environment, download the requirements-14.3.txt file and run pip install -r requirements-14.3.txt. This command installs all of the open source libraries that Databricks Runtime ML uses, but does not install libraries developed by Databricks, such as databricks-automl, databricks-feature-store, or the Databricks fork of hyperopt.
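
After installing, it can be useful to confirm that the local environment matches the pinned versions. The following is a minimal sketch that assumes the requirements file consists of simple name==version pins; that is an assumption about the file's contents, and any line in another form is skipped. The function names parse_pins and find_mismatches are illustrative.

```python
import re
from importlib import metadata
from typing import Dict, List, Tuple


def parse_pins(lines: List[str]) -> Dict[str, str]:
    """Extract exact `name==version` pins, ignoring comments and other lines."""
    pins: Dict[str, str] = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        match = re.fullmatch(r"([A-Za-z0-9._-]+)==(\S+)", line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def find_mismatches(pins: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (name, pinned, installed) for packages that differ or are missing."""
    mismatches: List[Tuple[str, str, str]] = []
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = "not installed"
        if installed != pinned:
            mismatches.append((name, pinned, installed))
    return mismatches
```

To use it, read requirements-14.3.txt into a list of lines, pass the list to parse_pins, and pass the result to find_mismatches. An empty result means every pinned package is installed at the pinned version. Libraries developed by Databricks are not expected to appear, because the requirements file does not install them.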

Python libraries on CPU clusters

Library Version Library Version Library Version
absl-py 1.0.0 accelerate 0.25.0 aiohttp 3.9.1
aiosignal 1.3.1 anyio 3.5.0 appdirs 1.4.4
argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 astor 0.8.1
asttokens 2.0.5 astunparse 1.6.3 async-timeout 4.0.3
attrs 22.1.0 audioread 3.0.1 azure-core 1.29.1
azure-cosmos 4.3.1 azure-storage-blob 12.19.0 azure-storage-file-datalake 12.14.0
backcall 0.2.0 bcrypt 3.2.0 beautifulsoup4 4.11.1
black 22.6.0 bleach 4.1.0 blinker 1.4
blis 0.7.11 boto3 1.24.28 botocore 1.27.96
cachetools 5.3.2 catalogue 2.0.10 category-encoders 2.6.3
certifi 2022.12.7 cffi 1.15.1 chardet 4.0.0
charset-normalizer 2.0.4 click 8.0.4 cloudpathlib 0.16.0
cloudpickle 2.0.0 cmdstanpy 1.2.0 comm 0.1.2
confection 0.1.4 configparser 5.2.0 contourpy 1.0.5
cryptography 39.0.1 cycler 0.11.0 cymem 2.0.8
Cython 0.29.32 dacite 1.8.1 databricks-automl-runtime 0.2.20
databricks-cli 0.18.0 databricks-feature-engineering 0.2.0 databricks-sdk 0.1.6
dataclasses-json 0.6.3 datasets 2.15.0 dbl-tempo 0.1.26
dbus-python 1.2.18 debugpy 1.6.7 decorator 5.1.1
deepspeed 0.12.4 defusedxml 0.7.1 dill 0.3.6
diskcache 5.6.3 distlib 0.3.7 docstring-to-markdown 0.11
entrypoints 0.4 evaluate 0.4.1 executing 0.8.3
facets-overview 1.1.1 fastjsonschema 2.19.1 fasttext 0.9.2
filelock 3.9.0 Flask 2.2.5 flatbuffers 23.5.26
fonttools 4.25.0 frozenlist 1.4.1 fsspec 2023.6.0
future 0.18.3 gast 0.4.0 gitdb 4.0.11
GitPython 3.1.27 google-api-core 2.15.0 google-auth 2.21.0
google-auth-oauthlib 1.0.0 google-cloud-core 2.4.1 google-cloud-storage 2.11.0
google-crc32c 1.5.0 google-pasta 0.2.0 google-resumable-media 2.7.0
googleapis-common-protos 1.62.0 greenlet 2.0.1 grpcio 1.48.2
grpcio-status 1.48.1 gunicorn 20.1.0 gviz-api 1.10.0
h5py 3.7.0 hjson 3.1.0 holidays 0.38
horovod 0.28.1 htmlmin 0.1.12 httplib2 0.20.2
huggingface-hub 0.19.4 idna 3.4 ImageHash 4.3.1
imbalanced-learn 0.11.0 importlib-metadata 4.11.3 importlib-resources 6.1.1
ipykernel 6.25.0 ipython 8.14.0 ipython-genutils 0.2.0
ipywidgets 7.7.2 isodate 0.6.1 itsdangerous 2.0.1
jedi 0.18.1 jeepney 0.7.1 Jinja2 3.1.2
jmespath 0.10.0 joblib 1.2.0 joblibspark 0.5.1
jsonpatch 1.33 jsonpointer 2.4 jsonschema 4.17.3
jupyter-client 7.3.4 jupyter-server 1.23.4 jupyter_core 5.2.0
jupyterlab-pygments 0.1.2 jupyterlab-widgets 1.0.0 keras 2.14.0
keyring 23.5.0 kiwisolver 1.4.4 langchain 0.0.348
langchain-core 0.0.13 langcodes 3.3.0 langsmith 0.0.79
launchpadlib 1.10.16 lazr.restfulclient 0.14.4 lazr.uri 1.0.6
lazy_loader 0.3 libclang 15.0.6.1 librosa 0.10.1
lightgbm 4.1.0 llvmlite 0.39.1 lxml 4.9.1
Mako 1.2.0 Markdown 3.4.1 MarkupSafe 2.1.1
marshmallow 3.20.2 matplotlib 3.7.0 matplotlib-inline 0.1.6
mccabe 0.7.0 mistune 0.8.4 ml-dtypes 0.2.0
mlflow-skinny 2.9.2 more-itertools 8.10.0 mpmath 1.2.1
msgpack 1.0.7 multidict 6.0.4 multimethod 1.10
multiprocess 0.70.14 murmurhash 1.0.10 mypy-extensions 0.4.3
nbclassic 0.5.2 nbclient 0.5.13 nbconvert 6.5.4
nbformat 5.7.0 nest-asyncio 1.5.6 networkx 2.8.4
ninja 1.11.1.1 nltk 3.7 nodeenv 1.8.0
notebook 6.5.2 notebook_shim 0.2.2 numba 0.56.4
numpy 1.23.5 oauthlib 3.2.0 openai 0.28.1
opt-einsum 3.3.0 packaging 23.2 pandas 1.5.3
pandocfilters 1.5.0 paramiko 2.9.2 parso 0.8.3
pathspec 0.10.3 patsy 0.5.3 petastorm 0.12.1
pexpect 4.8.0 phik 0.12.4 pickleshare 0.7.5
Pillow 9.4.0 pip 22.3.1 platformdirs 2.5.2
plotly 5.9.0 pluggy 1.0.0 pmdarima 2.0.4
pooch 1.4.0 preshed 3.0.9 prometheus-client 0.14.1
prompt-toolkit 3.0.36 prophet 1.1.5 protobuf 4.24.0
psutil 5.9.0 psycopg2 2.9.3 ptyprocess 0.7.0
pure-eval 0.2.2 py-cpuinfo 9.0.0 pyarrow 8.0.0
pyarrow-hotfix 0.5 pyasn1 0.4.8 pyasn1-modules 0.2.8
pybind11 2.11.1 pycparser 2.21 pydantic 1.10.6
pyflakes 3.1.0 Pygments 2.11.2 PyGObject 3.42.1
PyJWT 2.3.0 PyNaCl 1.5.0 pynvml 11.5.0
pyodbc 4.0.32 pyparsing 3.0.9 pyright 1.1.294
pyrsistent 0.18.0 pytesseract 0.3.10 python-dateutil 2.8.2
python-editor 1.0.4 python-lsp-jsonrpc 1.1.1 python-lsp-server 1.8.0
pytoolconfig 1.2.5 pytz 2022.7 PyWavelets 1.4.1
PyYAML 6.0 pyzmq 23.2.0 regex 2022.7.9
requests 2.28.1 requests-oauthlib 1.3.1 responses 0.18.0
rope 1.7.0 rsa 4.9 s3transfer 0.6.2
safetensors 0.4.1 scikit-learn 1.1.1 scipy 1.10.0
seaborn 0.12.2 SecretStorage 3.3.1 Send2Trash 1.8.0
sentence-transformers 2.2.2 sentencepiece 0.1.99 setuptools 65.6.3
shap 0.44.0 simplejson 3.17.6 six 1.16.0
slicer 0.0.7 smart-open 5.2.1 smmap 5.0.0
sniffio 1.2.0 soundfile 0.12.1 soupsieve 2.3.2.post1
soxr 0.3.7 spacy 3.7.2 spacy-legacy 3.0.12
spacy-loggers 1.0.5 spark-tensorflow-distributor 1.0.0 SQLAlchemy 1.4.39
sqlparse 0.4.2 srsly 2.4.8 ssh-import-id 5.11
stack-data 0.2.0 stanio 0.3.0 statsmodels 0.13.5
sympy 1.11.1 tabulate 0.8.10 tangled-up-in-unicode 0.2.0
tenacity 8.1.0 tensorboard 2.14.1 tensorboard-data-server 0.7.2
tensorboard-plugin-profile 2.14.0 tensorflow-cpu 2.14.1 tensorflow-estimator 2.14.0
tensorflow-io-gcs-filesystem 0.35.0 termcolor 2.4.0 terminado 0.17.1
thinc 8.2.2 threadpoolctl 2.2.0 tiktoken 0.5.2
tinycss2 1.2.1 tokenize-rt 4.2.1 tokenizers 0.15.0
tomli 2.0.1 torch 2.0.1+cpu torchvision 0.15.2+cpu
tornado 6.1 tqdm 4.64.1 traitlets 5.7.1
transformers 4.36.1 typeguard 2.13.3 typer 0.9.0
typing-inspect 0.9.0 typing_extensions 4.4.0 ujson 5.4.0
unattended-upgrades 0.1 urllib3 1.26.14 virtualenv 20.16.7
visions 0.7.5 wadllib 1.3.6 wasabi 1.1.2
wcwidth 0.2.5 weasel 0.3.4 webencodings 0.5.1
websocket-client 0.58.0 Werkzeug 2.2.2 whatthepatch 1.0.2
wheel 0.38.4 widgetsnbextension 3.6.1 wordcloud 1.9.3
wrapt 1.14.1 xgboost 1.7.6 xxhash 3.4.1
yapf 0.33.0 yarl 1.9.4 ydata-profiling 4.2.0
zipp 3.11.0

Python libraries on GPU clusters

Library Version Library Version Library Version
absl-py 1.0.0 accelerate 0.25.0 aiohttp 3.9.1
aiosignal 1.3.1 anyio 3.5.0 appdirs 1.4.4
argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 astor 0.8.1
asttokens 2.0.5 astunparse 1.6.3 async-timeout 4.0.3
attrs 22.1.0 audioread 3.0.1 azure-core 1.29.1
azure-cosmos 4.3.1 azure-storage-blob 12.19.0 azure-storage-file-datalake 12.14.0
backcall 0.2.0 bcrypt 3.2.0 beautifulsoup4 4.11.1
black 22.6.0 bleach 4.1.0 blinker 1.4
blis 0.7.11 boto3 1.24.28 botocore 1.27.96
cachetools 5.3.2 catalogue 2.0.10 category-encoders 2.6.3
certifi 2022.12.7 cffi 1.15.1 chardet 4.0.0
charset-normalizer 2.0.4 click 8.0.4 cloudpathlib 0.16.0
cloudpickle 2.0.0 cmake 3.28.1 cmdstanpy 1.2.0
comm 0.1.2 confection 0.1.4 configparser 5.2.0
contourpy 1.0.5 cryptography 39.0.1 cycler 0.11.0
cymem 2.0.8 Cython 0.29.32 dacite 1.8.1
databricks-automl-runtime 0.2.20 databricks-cli 0.18.0 databricks-feature-engineering 0.2.0
databricks-sdk 0.1.6 dataclasses-json 0.6.3 datasets 2.15.0
dbl-tempo 0.1.26 dbus-python 1.2.18 debugpy 1.6.7
decorator 5.1.1 deepspeed 0.12.4 defusedxml 0.7.1
dill 0.3.6 diskcache 5.6.3 distlib 0.3.7
docstring-to-markdown 0.11 einops 0.7.0 entrypoints 0.4
evaluate 0.4.1 executing 0.8.3 facets-overview 1.1.1
fastjsonschema 2.19.1 fasttext 0.9.2 filelock 3.9.0
flash-attn 2.3.6 Flask 2.2.5 flatbuffers 23.5.26
fonttools 4.25.0 frozenlist 1.4.1 fsspec 2023.6.0
future 0.18.3 gast 0.4.0 gitdb 4.0.11
GitPython 3.1.27 google-api-core 2.15.0 google-auth 2.21.0
google-auth-oauthlib 1.0.0 google-cloud-core 2.4.1 google-cloud-storage 2.11.0
google-crc32c 1.5.0 google-pasta 0.2.0 google-resumable-media 2.7.0
googleapis-common-protos 1.62.0 greenlet 2.0.1 grpcio 1.48.2
grpcio-status 1.48.1 gunicorn 20.1.0 gviz-api 1.10.0
h5py 3.7.0 hjson 3.1.0 holidays 0.38
horovod 0.28.1 htmlmin 0.1.12 httplib2 0.20.2
huggingface-hub 0.19.4 idna 3.4 ImageHash 4.3.1
imbalanced-learn 0.11.0 importlib-metadata 4.11.3 importlib-resources 6.1.1
ipykernel 6.25.0 ipython 8.14.0 ipython-genutils 0.2.0
ipywidgets 7.7.2 isodate 0.6.1 itsdangerous 2.0.1
jedi 0.18.1 jeepney 0.7.1 Jinja2 3.1.2
jmespath 0.10.0 joblib 1.2.0 joblibspark 0.5.1
jsonpatch 1.33 jsonpointer 2.4 jsonschema 4.17.3
jupyter-client 7.3.4 jupyter-server 1.23.4 jupyter_core 5.2.0
jupyterlab-pygments 0.1.2 jupyterlab-widgets 1.0.0 keras 2.14.0
keyring 23.5.0 kiwisolver 1.4.4 langchain 0.0.348
langchain-core 0.0.13 langcodes 3.3.0 langsmith 0.0.79
launchpadlib 1.10.16 lazr.restfulclient 0.14.4 lazr.uri 1.0.6
lazy_loader 0.3 libclang 15.0.6.1 librosa 0.10.1
lightgbm 4.1.0 lit 17.0.6 llvmlite 0.39.1
lxml 4.9.1 Mako 1.2.0 Markdown 3.4.1
MarkupSafe 2.1.1 marshmallow 3.20.2 matplotlib 3.7.0
matplotlib-inline 0.1.6 mccabe 0.7.0 mistune 0.8.4
ml-dtypes 0.2.0 mlflow-skinny 2.9.2 more-itertools 8.10.0
mpmath 1.2.1 msgpack 1.0.7 multidict 6.0.4
multimethod 1.10 multiprocess 0.70.14 murmurhash 1.0.10
mypy-extensions 0.4.3 nbclassic 0.5.2 nbclient 0.5.13
nbconvert 6.5.4 nbformat 5.7.0 nest-asyncio 1.5.6
networkx 2.8.4 ninja 1.11.1.1 nltk 3.7
nodeenv 1.8.0 notebook 6.5.2 notebook_shim 0.2.2
numba 0.56.4 numpy 1.23.5 oauthlib 3.2.0
openai 0.28.1 opt-einsum 3.3.0 packaging 23.2
pandas 1.5.3 pandocfilters 1.5.0 paramiko 2.9.2
parso 0.8.3 pathspec 0.10.3 patsy 0.5.3
petastorm 0.12.1 pexpect 4.8.0 phik 0.12.4
pickleshare 0.7.5 Pillow 9.4.0 pip 22.3.1
platformdirs 2.5.2 plotly 5.9.0 pluggy 1.0.0
pmdarima 2.0.4 pooch 1.4.0 preshed 3.0.9
prompt-toolkit 3.0.36 prophet 1.1.5 protobuf 4.24.0
psutil 5.9.0 psycopg2 2.9.3 ptyprocess 0.7.0
pure-eval 0.2.2 py-cpuinfo 9.0.0 pyarrow 8.0.0
pyarrow-hotfix 0.5 pyasn1 0.4.8 pyasn1-modules 0.2.8
pybind11 2.11.1 pycparser 2.21 pydantic 1.10.6
pyflakes 3.1.0 Pygments 2.11.2 PyGObject 3.42.1
PyJWT 2.3.0 PyNaCl 1.5.0 pynvml 11.5.0
pyodbc 4.0.32 pyparsing 3.0.9 pyright 1.1.294
pyrsistent 0.18.0 pytesseract 0.3.10 python-dateutil 2.8.2
python-editor 1.0.4 python-lsp-jsonrpc 1.1.1 python-lsp-server 1.8.0
pytoolconfig 1.2.5 pytz 2022.7 PyWavelets 1.4.1
PyYAML 6.0 pyzmq 23.2.0 regex 2022.7.9
requests 2.28.1 requests-oauthlib 1.3.1 responses 0.18.0
rope 1.7.0 rsa 4.9 s3transfer 0.6.2
safetensors 0.4.1 scikit-learn 1.1.1 scipy 1.10.0
seaborn 0.12.2 SecretStorage 3.3.1 Send2Trash 1.8.0
sentence-transformers 2.2.2 sentencepiece 0.1.99 setuptools 65.6.3
shap 0.44.0 simplejson 3.17.6 six 1.16.0
slicer 0.0.7 smart-open 5.2.1 smmap 5.0.0
sniffio 1.2.0 soundfile 0.12.1 soupsieve 2.3.2.post1
soxr 0.3.7 spacy 3.7.2 spacy-legacy 3.0.12
spacy-loggers 1.0.5 spark-tensorflow-distributor 1.0.0 SQLAlchemy 1.4.39
sqlparse 0.4.2 srsly 2.4.8 ssh-import-id 5.11
stack-data 0.2.0 stanio 0.3.0 statsmodels 0.13.5
sympy 1.11.1 tabulate 0.8.10 tangled-up-in-unicode 0.2.0
tenacity 8.1.0 tensorboard 2.14.1 tensorboard-data-server 0.7.2
tensorboard-plugin-profile 2.14.0 tensorflow 2.14.1 tensorflow-estimator 2.14.0
tensorflow-io-gcs-filesystem 0.35.0 termcolor 2.4.0 terminado 0.17.1
thinc 8.2.2 threadpoolctl 2.2.0 tiktoken 0.5.2
tinycss2 1.2.1 tokenize-rt 4.2.1 tokenizers 0.15.0
tomli 2.0.1 torch 2.0.1+cu118 torchvision 0.15.2+cu118
tornado 6.1 tqdm 4.64.1 traitlets 5.7.1
transformers 4.36.1 triton 2.0.0 typeguard 2.13.3
typer 0.9.0 typing-inspect 0.9.0 typing_extensions 4.4.0
ujson 5.4.0 unattended-upgrades 0.1 urllib3 1.26.14
virtualenv 20.16.7 visions 0.7.5 wadllib 1.3.6
wasabi 1.1.2 wcwidth 0.2.5 weasel 0.3.4
webencodings 0.5.1 websocket-client 0.58.0 Werkzeug 2.2.2
whatthepatch 1.0.2 wheel 0.38.4 widgetsnbextension 3.6.1
wordcloud 1.9.3 wrapt 1.14.1 xgboost 1.7.6
xxhash 3.4.1 yapf 0.33.0 yarl 1.9.4
ydata-profiling 4.2.0 zipp 3.11.0
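
The CPU and GPU tables pin different builds of some packages, for example torch 2.0.1+cpu versus torch 2.0.1+cu118. As an illustration, the part after the + sign (the local version label in PEP 440 terms) can be used to tell the builds apart. This is a sketch only; the helper names local_build_label and is_cuda_build are hypothetical, and the "cu" prefix test is a heuristic based on the version strings listed above.

```python
from typing import Optional


def local_build_label(version: str) -> Optional[str]:
    """Return the local version label after '+', or None if there is none."""
    _, sep, label = version.partition("+")
    return label if sep else None


def is_cuda_build(version: str) -> bool:
    """Heuristic: treat labels starting with 'cu' (such as 'cu118') as CUDA builds."""
    label = local_build_label(version)
    return label is not None and label.startswith("cu")
```

On a cluster, passing torch.__version__ to is_cuda_build gives a quick indication of which of the two environments is active.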

R libraries

The R libraries are identical to the R libraries in Databricks Runtime 14.3 LTS.

Java and Scala libraries (Scala 2.12 cluster)

In addition to the Java and Scala libraries in Databricks Runtime 14.3 LTS, Databricks Runtime 14.3 LTS ML contains the following JARs:

CPU clusters

Group ID                Artifact ID                       Version
com.typesafe.akka       akka-actor_2.12                   2.5.23
ml.dmlc                 xgboost4j-spark_2.12              1.7.3
ml.dmlc                 xgboost4j_2.12                    1.7.3
org.graphframes         graphframes_2.12                  0.8.2-db2-spark3.4
org.mlflow              mlflow-client                     2.9.2
org.scala-lang.modules  scala-java8-compat_2.12           0.8.0
org.tensorflow          spark-tensorflow-connector_2.12   1.15.0

GPU clusters

Group ID                Artifact ID                       Version
com.typesafe.akka       akka-actor_2.12                   2.5.23
ml.dmlc                 xgboost4j-gpu_2.12                1.7.3
ml.dmlc                 xgboost4j-spark-gpu_2.12          1.7.3
org.graphframes         graphframes_2.12                  0.8.2-db2-spark3.4
org.mlflow              mlflow-client                     2.9.2
org.scala-lang.modules  scala-java8-compat_2.12           0.8.0
org.tensorflow          spark-tensorflow-connector_2.12   1.15.0