Databricks Runtime 12.2 LTS for Machine Learning
Databricks Runtime 12.2 LTS for Machine Learning provides a ready-to-go environment for machine learning and data science based on Databricks Runtime 12.2 LTS. Databricks Runtime ML contains many popular machine learning libraries, including TensorFlow, PyTorch, and XGBoost. Databricks Runtime ML includes AutoML, a tool to automatically train machine learning pipelines. Databricks Runtime ML also supports distributed deep learning training using Horovod.
For more information, including instructions for creating a Databricks Runtime ML cluster, see AI and machine learning on Databricks.
New features and improvements
Databricks Runtime 12.2 LTS ML is built on top of Databricks Runtime 12.2 LTS. For information on what's new in Databricks Runtime 12.2 LTS, including Apache Spark MLlib and SparkR, see the Databricks Runtime 12.2 LTS release notes.
Databricks AutoML
You can now use existing feature tables in Feature Store to augment the original input dataset for your AutoML forecasting problems. For details, see Feature Store integration.
For more information about Databricks AutoML, see What is AutoML.
System environment
The system environment in Databricks Runtime 12.2 LTS ML differs from Databricks Runtime 12.2 LTS as follows:
- DBUtils: Databricks Runtime ML does not include the Library utility (dbutils.library) (legacy). Use %pip commands instead. See Notebook-scoped Python libraries.
- For GPU clusters, Databricks Runtime ML includes the following NVIDIA GPU libraries:
  - CUDA 11.3
  - cuDNN 8.0.5.39
  - NCCL 2.9.9
  - TensorRT 7.2.2
Databricks Runtime 12.2 LTS ML includes XGBoost 1.7.2, which does not support GPU clusters with compute capability 5.2 and below.
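The compute capability limit can be checked before picking a GPU node type. The following is a minimal sketch, assuming the limit applies to compute capability 5.2 and below, and assuming the (major, minor) pair has already been obtained, for instance from torch.cuda.get_device_capability() on a GPU cluster. The helper name and the constant are hypothetical and are not part of any Databricks or XGBoost API.

```python
from typing import Tuple

# Assumption: the highest compute capability that the bundled XGBoost 1.7.2
# does NOT support, read as "5.2 and below".
MAX_UNSUPPORTED_CAPABILITY: Tuple[int, int] = (5, 2)


def xgboost_gpu_supported(capability: Tuple[int, int]) -> bool:
    """Return True if a GPU with this (major, minor) capability is above the cutoff.

    Hypothetical helper for illustration only.
    """
    # Tuples compare element by element, so (6, 0) > (5, 2) and (5, 2) > (5, 2) is False.
    return tuple(capability) > MAX_UNSUPPORTED_CAPABILITY


if __name__ == "__main__":
    # Sample capability pairs chosen for illustration.
    for cap in [(3, 7), (5, 2), (6, 0), (7, 5)]:
        print(cap, xgboost_gpu_supported(cap))
```

Comparing tuples rather than floats avoids ordering mistakes with two-digit minor versions.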
Libraries
The following sections list the libraries included in Databricks Runtime 12.2 LTS ML that differ from those included in Databricks Runtime 12.2 LTS.
Top-tier libraries
Databricks Runtime 12.2 LTS ML includes the following top-tier libraries:
- GraphFrames
- Horovod e HorovodRunner
- MLflow
- PyTorch
- spark-tensorflow-connector
- TensorFlow
- TensorBoard
- scikit-learn
Python libraries
Databricks Runtime 12.2 LTS ML uses Virtualenv for Python package management and includes many popular ML packages.
In addition to the packages specified in the following sections, Databricks Runtime 12.2 LTS ML also includes the following packages:
- hyperopt 0.2.7+db3
- sparkdl 2.3.0-db3
- automl 1.16.0
To reproduce the Databricks Runtime ML Python environment in your local Python virtual environment, download the requirements-12.2.txt file and run pip install -r requirements-12.2.txt. This command installs all of the open source libraries that Databricks Runtime ML uses, but does not install libraries developed by Databricks, such as databricks-automl, databricks-feature-store, or the Databricks fork of hyperopt.
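After installing from requirements-12.2.txt, it can be useful to confirm that the local environment matches the pinned versions. The following is a minimal sketch, assuming the file uses plain name==version pins (the actual file layout is not shown here, so this is an assumption). The function names are hypothetical; only the standard library importlib.metadata module is used.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional, Tuple


def parse_pins(lines: Iterable[str]) -> Dict[str, str]:
    """Collect 'name==version' pins, skipping blanks, comments, and other specifiers."""
    pins: Dict[str, str] = {}
    for raw in lines:
        # Drop trailing comments and environment markers such as '; python_version < "3.9"'.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs from its pin to (pinned, installed)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, pinned in pins.items():
        found = installed_version(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


if __name__ == "__main__":
    # Assumes requirements-12.2.txt has been downloaded to the working directory.
    with open("requirements-12.2.txt", encoding="utf-8") as handle:
        for package, (pinned, found) in sorted(find_mismatches(parse_pins(handle)).items()):
            print(f"{package}: pinned {pinned}, installed {found}")
```

Packages developed by Databricks are not covered by the requirements file, so this check only covers the open source pins.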
Python libraries on CPU clusters
Library | Version | Library | Version | Library | Version |
---|---|---|---|---|---|
absl-py | 1.0.0 | argon2-cffi | 21.3.0 | argon2-cffi-bindings | 21.2.0 |
astor | 0.8.1 | asttokens | 2.0.5 | astunparse | 1.6.3 |
attrs | 21.4.0 | azure-core | 1.26.3 | azure-cosmos | 4.2.0 |
backcall | 0.2.0 | backports.entry-points-selectable | 1.2.0 | bcrypt | 3.2.0 |
beautifulsoup4 | 4.11.1 | black | 22.3.0 | bleach | 4.1.0 |
blis | 0.7.9 | boto3 | 1.21.32 | botocore | 1.24.32 |
cachetools | 4.2.2 | catalogue | 2.0.8 | category-encoders | 2.5.1.post0 |
certifi | 2021.10.8 | cffi | 1.15.0 | chardet | 4.0.0 |
charset-normalizer | 2.0.4 | click | 8.0.4 | cloudpickle | 2.0.0 |
cmdstanpy | 1.1.0 | confection | 0.0.4 | configparser | 5.2.0 |
convertdate | 2.4.0 | cryptography | 3.4.8 | cycler | 0.11.0 |
cymem | 2.0.7 | Cython | 0.29.28 | databricks-automl-runtime | 0.2.15 |
databricks-cli | 0.17.4 | databricks-feature-store | 0.10.0 | dbl-tempo | 0.1.12 |
dbus-python | 1.2.16 | debugpy | 1.5.1 | decorator | 5.1.1 |
defusedxml | 0.7.1 | dill | 0.3.4 | diskcache | 5.4.0 |
distlib | 0.3.6 | docstring-to-markdown | 0.11 | entrypoints | 0.4 |
ephem | 4.1.4 | executing | 0.8.3 | facet-overview | 1.0.0 |
fastjsonschema | 2.16.2 | fasttext | 0.9.2 | filelock | 3.6.0 |
Flask | 1.1.2 | flatbuffers | 23.1.21 | fonttools | 4.25.0 |
fsspec | 2022.2.0 | future | 0.18.2 | gast | 0.4.0 |
gitdb | 4.0.10 | GitPython | 3.1.27 | google-auth | 1.33.0 |
google-auth-oauthlib | 0.4.6 | google-pasta | 0.2.0 | grpcio | 1.42.0 |
gunicorn | 20.1.0 | gviz-api | 1.10.0 | h5py | 3.6.0 |
hijri-converter | 2.2.4 | holidays | 0.18 | horovod | 0.27.0 |
htmlmin | 0.1.12 | huggingface-hub | 0.12.0 | idna | 3.3 |
ImageHash | 4.3.1 | imbalanced-learn | 0.10.1 | importlib-metadata | 4.11.3 |
ipykernel | 6.15.3 | ipython | 8.5.0 | ipython-genutils | 0.2.0 |
ipywidgets | 7.7.2 | isodate | 0.6.1 | itsdangerous | 2.0.1 |
jedi | 0.18.1 | Jinja2 | 2.11.3 | jmespath | 0.10.0 |
joblib | 1.1.1 | joblibspark | 0.5.1 | jsonschema | 4.4.0 |
jupyter-client | 6.1.12 | jupyter_core | 4.11.2 | jupyterlab-pygments | 0.1.2 |
jupyterlab-widgets | 1.0.0 | keras | 2.11.0 | kiwisolver | 1.3.2 |
korean-lunar-calendar | 0.3.1 | langcodes | 3.3.0 | libclang | 15.0.6.1 |
lightgbm | 3.3.4 | llvmlite | 0.38.0 | LunarCalendar | 0.0.9 |
Mako | 1.2.0 | Markdown | 3.3.4 | MarkupSafe | 2.0.1 |
matplotlib | 3.5.1 | matplotlib-inline | 0.1.2 | mccabe | 0.7.0 |
mistune | 0.8.4 | mleap | 0.20.0 | mlflow-skinny | 2.1.1 |
multimethod | 1.9.1 | murmurhash | 1.0.9 | mypy-extensions | 0.4.3 |
nbclient | 0.5.13 | nbconvert | 6.4.4 | nbformat | 5.3.0 |
nest-asyncio | 1.5.5 | networkx | 2.7.1 | nltk | 3.7 |
nodeenv | 1.7.0 | notebook | 6.4.8 | numba | 0.55.1 |
numpy | 1.21.5 | oauthlib | 3.2.0 | opt-einsum | 3.3.0 |
packaging | 21.3 | pandas | 1.4.2 | pandas-profiling | 3.6.2 |
pandocfilters | 1.5.0 | paramiko | 2.9.2 | parso | 0.8.3 |
pathspec | 0.9.0 | pathy | 0.10.1 | patsy | 0.5.2 |
petastorm | 0.12.1 | pexpect | 4.8.0 | phik | 0.12.3 |
pickleshare | 0.7.5 | Pillow | 9.0.1 | pip | 21.2.4 |
platformdirs | 2.6.2 | plotly | 5.6.0 | pluggy | 1.0.0 |
pmdarima | 2.0.2 | preshed | 3.0.8 | prometheus-client | 0.13.1 |
prompt-toolkit | 3.0.20 | prophet | 1.1.1 | protobuf | 3.19.4 |
psutil | 5.8.0 | psycopg2 | 2.9.3 | ptyprocess | 0.7.0 |
pure-eval | 0.2.2 | pyarrow | 7.0.0 | pyasn1 | 0.4.8 |
pyasn1-modules | 0.2.8 | pybind11 | 2.10.3 | pycparser | 2.21 |
pydantic | 1.10.2 | pyflakes | 2.5.0 | Pygments | 2.11.2 |
PyGObject | 3.36.0 | PyJWT | 2.6.0 | PyMeeus | 0.5.12 |
PyNaCl | 1.5.0 | pyodbc | 4.0.32 | pyparsing | 3.0.4 |
pyright | 1.1.283 | pyrsistent | 0.18.0 | python-dateutil | 2.8.2 |
python-editor | 1.0.4 | python-lsp-jsonrpc | 1.0.0 | python-lsp-server | 1.6.0 |
pytz | 2021.3 | PyWavelets | 1.3.0 | PyYAML | 6.0 |
pyzmq | 22.3.0 | regex | 2022.3.15 | requests | 2.27.1 |
requests-oauthlib | 1.3.1 | requests-unixsocket | 0.2.0 | rope | 0.22.0 |
rsa | 4.7.2 | s3transfer | 0.5.0 | scikit-learn | 1.0.2 |
scipy | 1.7.3 | seaborn | 0.11.2 | Send2Trash | 1.8.0 |
setuptools | 61.2.0 | setuptools-git | 1.2 | shap | 0.41.0 |
simplejson | 3.17.6 | six | 1.16.0 | slicer | 0.0.7 |
smart-open | 5.2.1 | smmap | 5.0.0 | soupsieve | 2.3.1 |
spacy | 3.4.4 | spacy-legacy | 3.0.12 | spacy-loggers | 1.0.4 |
spark-tensorflow-distributor | 1.0.0 | sqlparse | 0.4.2 | srsly | 2.4.5 |
ssh-import-id | 5.10 | stack-data | 0.2.0 | statsmodels | 0.13.2 |
tabulate | 0.8.9 | tangled-up-in-unicode | 0.2.0 | tenacity | 8.0.1 |
tensorboard | 2.11.2 | tensorboard-data-server | 0.6.1 | tensorboard-plugin-profile | 2.11.1 |
tensorboard-plugin-wit | 1.8.1 | tensorflow-cpu | 2.11.0 | tensorflow-estimator | 2.11.0 |
tensorflow-io-gcs-filesystem | 0.30.0 | termcolor | 2.2.0 | terminado | 0.13.1 |
testpath | 0.5.0 | thinc | 8.1.7 | threadpoolctl | 2.2.0 |
tokenize-rt | 4.2.1 | tokenizers | 0.13.2 | tomli | 1.2.2 |
torch | 1.13.1+cpu | torchvision | 0.14.1+cpu | tornado | 6.1 |
tqdm | 4.64.0 | traitlets | 5.1.1 | transformers | 4.25.1 |
typeguard | 2.13.3 | typer | 0.7.0 | typing_extensions | 4.1.1 |
ujson | 5.1.0 | unattended-upgrades | 0.1 | urllib3 | 1.26.9 |
virtualenv | 20.8.0 | visions | 0.7.5 | wasabi | 0.10.1 |
wcwidth | 0.2.5 | webencodings | 0.5.1 | websocket-client | 0.58.0 |
Werkzeug | 2.0.3 | whatthepatch | 1.0.4 | wheel | 0.37.1 |
widgetsnbextension | 3.6.1 | wrapt | 1.12.1 | xgboost | 1.7.2 |
yapf | 0.31.0 | zipp | 3.7.0 |
Python libraries on GPU clusters
Library | Version | Library | Version | Library | Version |
---|---|---|---|---|---|
absl-py | 1.0.0 | argon2-cffi | 21.3.0 | argon2-cffi-bindings | 21.2.0 |
astor | 0.8.1 | asttokens | 2.0.5 | astunparse | 1.6.3 |
attrs | 21.4.0 | azure-core | 1.26.3 | azure-cosmos | 4.2.0 |
backcall | 0.2.0 | backports.entry-points-selectable | 1.2.0 | bcrypt | 3.2.0 |
beautifulsoup4 | 4.11.1 | black | 22.3.0 | bleach | 4.1.0 |
blis | 0.7.9 | boto3 | 1.21.32 | botocore | 1.24.32 |
cachetools | 4.2.2 | catalogue | 2.0.8 | category-encoders | 2.5.1.post0 |
certifi | 2021.10.8 | cffi | 1.15.0 | chardet | 4.0.0 |
charset-normalizer | 2.0.4 | click | 8.0.4 | cloudpickle | 2.0.0 |
cmdstanpy | 1.1.0 | confection | 0.0.4 | configparser | 5.2.0 |
convertdate | 2.4.0 | cryptography | 3.4.8 | cycler | 0.11.0 |
cymem | 2.0.7 | Cython | 0.29.28 | databricks-automl-runtime | 0.2.15 |
databricks-cli | 0.17.4 | databricks-feature-store | 0.10.0 | dbl-tempo | 0.1.12 |
dbus-python | 1.2.16 | debugpy | 1.5.1 | decorator | 5.1.1 |
defusedxml | 0.7.1 | dill | 0.3.4 | diskcache | 5.4.0 |
distlib | 0.3.6 | docstring-to-markdown | 0.11 | entrypoints | 0.4 |
ephem | 4.1.4 | executing | 0.8.3 | facet-overview | 1.0.0 |
fastjsonschema | 2.16.2 | fasttext | 0.9.2 | filelock | 3.6.0 |
Flask | 1.1.2 | flatbuffers | 23.1.21 | fonttools | 4.25.0 |
fsspec | 2022.2.0 | future | 0.18.2 | gast | 0.4.0 |
gitdb | 4.0.10 | GitPython | 3.1.27 | google-auth | 1.33.0 |
google-auth-oauthlib | 0.4.6 | google-pasta | 0.2.0 | grpcio | 1.42.0 |
gunicorn | 20.1.0 | gviz-api | 1.10.0 | h5py | 3.6.0 |
hijri-converter | 2.2.4 | holidays | 0.18 | horovod | 0.27.0 |
htmlmin | 0.1.12 | huggingface-hub | 0.12.0 | idna | 3.3 |
ImageHash | 4.3.1 | imbalanced-learn | 0.10.1 | importlib-metadata | 4.11.3 |
ipykernel | 6.15.3 | ipython | 8.5.0 | ipython-genutils | 0.2.0 |
ipywidgets | 7.7.2 | isodate | 0.6.1 | itsdangerous | 2.0.1 |
jedi | 0.18.1 | Jinja2 | 2.11.3 | jmespath | 0.10.0 |
joblib | 1.1.1 | joblibspark | 0.5.1 | jsonschema | 4.4.0 |
jupyter-client | 6.1.12 | jupyter_core | 4.11.2 | jupyterlab-pygments | 0.1.2 |
jupyterlab-widgets | 1.0.0 | keras | 2.11.0 | kiwisolver | 1.3.2 |
korean-lunar-calendar | 0.3.1 | langcodes | 3.3.0 | libclang | 15.0.6.1 |
lightgbm | 3.3.4 | llvmlite | 0.38.0 | LunarCalendar | 0.0.9 |
Mako | 1.2.0 | Markdown | 3.3.4 | MarkupSafe | 2.0.1 |
matplotlib | 3.5.1 | matplotlib-inline | 0.1.2 | mccabe | 0.7.0 |
mistune | 0.8.4 | mleap | 0.20.0 | mlflow-skinny | 2.1.1 |
multimethod | 1.9.1 | murmurhash | 1.0.9 | mypy-extensions | 0.4.3 |
nbclient | 0.5.13 | nbconvert | 6.4.4 | nbformat | 5.3.0 |
nest-asyncio | 1.5.5 | networkx | 2.7.1 | nltk | 3.7 |
nodeenv | 1.7.0 | notebook | 6.4.8 | numba | 0.55.1 |
numpy | 1.21.5 | oauthlib | 3.2.0 | opt-einsum | 3.3.0 |
packaging | 21.3 | pandas | 1.4.2 | pandas-profiling | 3.6.2 |
pandocfilters | 1.5.0 | paramiko | 2.9.2 | parso | 0.8.3 |
pathspec | 0.9.0 | pathy | 0.10.1 | patsy | 0.5.2 |
petastorm | 0.12.1 | pexpect | 4.8.0 | phik | 0.12.3 |
pickleshare | 0.7.5 | Pillow | 9.0.1 | pip | 21.2.4 |
platformdirs | 2.6.2 | plotly | 5.6.0 | pluggy | 1.0.0 |
pmdarima | 2.0.2 | preshed | 3.0.8 | prompt-toolkit | 3.0.20 |
prophet | 1.1.1 | protobuf | 3.19.4 | psutil | 5.8.0 |
psycopg2 | 2.9.3 | ptyprocess | 0.7.0 | pure-eval | 0.2.2 |
pyarrow | 7.0.0 | pyasn1 | 0.4.8 | pyasn1-modules | 0.2.8 |
pybind11 | 2.10.3 | pycparser | 2.21 | pydantic | 1.10.2 |
pyflakes | 2.5.0 | Pygments | 2.11.2 | PyGObject | 3.36.0 |
PyJWT | 2.6.0 | PyMeeus | 0.5.12 | PyNaCl | 1.5.0 |
pyodbc | 4.0.32 | pyparsing | 3.0.4 | pyright | 1.1.283 |
pyrsistent | 0.18.0 | python-dateutil | 2.8.2 | python-editor | 1.0.4 |
python-lsp-jsonrpc | 1.0.0 | python-lsp-server | 1.6.0 | pytz | 2021.3 |
PyWavelets | 1.3.0 | PyYAML | 6.0 | pyzmq | 22.3.0 |
regex | 2022.3.15 | requests | 2.27.1 | requests-oauthlib | 1.3.1 |
requests-unixsocket | 0.2.0 | rope | 0.22.0 | rsa | 4.7.2 |
s3transfer | 0.5.0 | scikit-learn | 1.0.2 | scipy | 1.7.3 |
seaborn | 0.11.2 | Send2Trash | 1.8.0 | setuptools | 61.2.0 |
setuptools-git | 1.2 | shap | 0.41.0 | simplejson | 3.17.6 |
six | 1.16.0 | slicer | 0.0.7 | smart-open | 5.2.1 |
smmap | 5.0.0 | soupsieve | 2.3.1 | spacy | 3.4.4 |
spacy-legacy | 3.0.12 | spacy-loggers | 1.0.4 | spark-tensorflow-distributor | 1.0.0 |
sqlparse | 0.4.2 | srsly | 2.4.5 | ssh-import-id | 5.10 |
stack-data | 0.2.0 | statsmodels | 0.13.2 | tabulate | 0.8.9 |
tangled-up-in-unicode | 0.2.0 | tenacity | 8.0.1 | tensorboard | 2.11.2 |
tensorboard-data-server | 0.6.1 | tensorboard-plugin-profile | 2.11.1 | tensorboard-plugin-wit | 1.8.1 |
tensorflow | 2.11.0 | tensorflow-estimator | 2.11.0 | tensorflow-io-gcs-filesystem | 0.30.0 |
termcolor | 2.2.0 | terminado | 0.13.1 | testpath | 0.5.0 |
thinc | 8.1.7 | threadpoolctl | 2.2.0 | tokenize-rt | 4.2.1 |
tokenizers | 0.13.2 | tomli | 1.2.2 | torch | 1.13.1+cu117 |
torchvision | 0.14.1+cu117 | tornado | 6.1 | tqdm | 4.64.0 |
traitlets | 5.1.1 | transformers | 4.25.1 | typeguard | 2.13.3 |
typer | 0.7.0 | typing_extensions | 4.1.1 | ujson | 5.1.0 |
unattended-upgrades | 0.1 | urllib3 | 1.26.9 | virtualenv | 20.8.0 |
visions | 0.7.5 | wasabi | 0.10.1 | wcwidth | 0.2.5 |
webencodings | 0.5.1 | websocket-client | 0.58.0 | Werkzeug | 2.0.3 |
whatthepatch | 1.0.4 | wheel | 0.37.1 | widgetsnbextension | 3.6.1 |
wrapt | 1.12.1 | xgboost | 1.7.2 | yapf | 0.31.0 |
zipp | 3.7.0 |
R libraries
The R libraries are identical to the R libraries in Databricks Runtime 12.2 LTS.
Java and Scala libraries (Scala 2.12 cluster)
In addition to the Java and Scala libraries in Databricks Runtime 12.2 LTS, Databricks Runtime 12.2 LTS ML contains the following JARs:
CPU clusters
Group ID | Artifact ID | Version |
---|---|---|
com.typesafe.akka | akka-actor_2.12 | 2.5.23 |
ml.combust.mleap | mleap-databricks-runtime_2.12 | v0.20.0-db1 |
ml.dmlc | xgboost4j-spark_2.12 | 1.7.3 |
ml.dmlc | xgboost4j_2.12 | 1.7.3 |
org.graphframes | graphframes_2.12 | 0.8.2-db1-spark3.2 |
org.mlflow | mlflow-client | 2.1.1 |
org.scala-lang.modules | scala-java8-compat_2.12 | 0.8.0 |
org.tensorflow | spark-tensorflow-connector_2.12 | 1.15.0 |
GPU clusters
Group ID | Artifact ID | Version |
---|---|---|
com.typesafe.akka | akka-actor_2.12 | 2.5.23 |
ml.combust.mleap | mleap-databricks-runtime_2.12 | v0.20.0-db1 |
ml.dmlc | xgboost4j-gpu_2.12 | 1.7.3 |
ml.dmlc | xgboost4j-spark-gpu_2.12 | 1.7.3 |
org.graphframes | graphframes_2.12 | 0.8.2-db1-spark3.2 |
org.mlflow | mlflow-client | 2.1.1 |
org.scala-lang.modules | scala-java8-compat_2.12 | 0.8.0 |
org.tensorflow | spark-tensorflow-connector_2.12 | 1.15.0 |