What tools does the Azure Data Science Virtual Machine include?

The Data Science Virtual Machine (DSVM) lets you explore data and build machine learning solutions in the cloud. A DSVM comes preconfigured with a complete operating system, security patches, drivers, and popular data science and development software. You can choose the hardware that fits your workload, from lower-cost CPU-centric machines to powerful machines with multiple GPUs, NVMe storage, and large amounts of memory. On machines with GPUs, all drivers are installed, all machine learning frameworks are version-matched for GPU compatibility, and acceleration is enabled in all application software that supports GPUs.

The DSVM comes with widely used data science tools preinstalled. The following sections list them by task.

Build deep learning and machine learning solutions

Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes
CUDA, cuDNN, NVIDIA Driver
CUDA, cuDNN, NVIDIA Driver on the DSVM
Horovod Horovod on the DSVM
NVIDIA System Management Interface (nvidia-smi) nvidia-smi on the DSVM
PyTorch PyTorch on the DSVM
TensorFlow
TensorFlow on the DSVM
Integration with Azure Machine Learning (Python) | (Python SDK, samples) | (Python SDK, samples) | (Python SDK, CLI, samples) | Azure Machine Learning SDK
XGBoost | (CUDA support) | (CUDA support) | (CUDA support) | XGBoost on the DSVM
Vowpal Wabbit
Vowpal Wabbit on the DSVM
Weka
LightGBM
(GPU, MPI support)
H2O
CatBoost
Intel MKL
OpenCV
Dlib
Docker
(Windows containers only)

(Windows containers only)
NCCL
Rattle
PostgreSQL
ONNX Runtime
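
To confirm that the NVIDIA driver is visible on a GPU-based DSVM, you can query nvidia-smi from Python. The following is a minimal sketch, not an official DSVM utility: the helper names `parse_gpu_csv` and `list_gpus` are hypothetical, and the sketch assumes that `nvidia-smi` is on the PATH when a GPU is present. On a CPU-only machine, `list_gpus` returns an empty list.

```python
import shutil
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi, in the order they appear in each CSV row.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_csv(text: str) -> List[Dict[str, str]]:
    """Parse the output of `nvidia-smi --query-gpu=... --format=csv,noheader`."""
    gpus = []
    for line in text.strip().splitlines():
        parts = [part.strip() for part in line.split(",")]
        # Skip rows that don't have exactly one value per requested field.
        if len(parts) == len(QUERY_FIELDS):
            gpus.append(dict(zip(QUERY_FIELDS, parts)))
    return gpus


def list_gpus() -> List[Dict[str, str]]:
    """Return one dict per GPU, or an empty list if nvidia-smi is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # Assumption: no nvidia-smi on the PATH means no usable GPU.
    result = subprocess.run(
        [exe, "--query-gpu=" + ",".join(QUERY_FIELDS), "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in list_gpus():
        print(gpu)
```

Keeping the parsing separate from the subprocess call makes it possible to check the parsing logic on a machine without a GPU.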

Store, retrieve, and manipulate data

Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes
Relational databases | SQL Server 2019 Developer Edition | SQL Server 2019 Developer Edition | SQL Server 2019 Developer Edition | SQL Server on the DSVM
Database tools | SQL Server Management Studio, SQL Server Integration Services, bcp, sqlcmd | SQL Server Management Studio, SQL Server Integration Services, bcp, sqlcmd | SQuirreL SQL (querying tool), bcp, sqlcmd, ODBC/JDBC drivers |
Azure Storage Explorer

Azure CLI


AzCopy

AzCopy on the DSVM
Blob FUSE driver
blobfuse on the DSVM
Azure Cosmos DB Data Migration Tool Azure Cosmos DB on the DSVM
Unix/Linux command-line tools
Apache Spark 3.1 (standalone)

Program in Python, R, Julia, and Node.js

Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes
CRAN-R with popular packages preinstalled
Anaconda Python with popular packages preinstalled
(Miniconda)

(Miniconda)
Julia (Julialang)
JupyterHub (multiuser notebook server)
JupyterLab (multiuser notebook server)
Node.js
Jupyter Notebook Server with the following kernels:

Jupyter Notebook samples
     R R Jupyter Samples
     Python Python Jupyter Samples
     Julia Julia Jupyter Samples
     PySpark pySpark Jupyter Samples

Ubuntu 20.04 DSVM, Windows Server 2019 DSVM, and Windows Server 2022 DSVM have the following Jupyter kernels:

  • Python3.8-default
  • Python3.8-Tensorflow-Pytorch
  • Python3.8-AzureML
  • R
  • Python 3.7 - Spark (local)
  • Julia 1.6.0
  • R Spark – HDInsight
  • Scala Spark – HDInsight
  • Python 3 Spark – HDInsight
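
To see which kernels are registered on a particular machine, you can read the output of `jupyter kernelspec list --json`. The sketch below is illustrative only: the helper names `kernel_display_names` and `installed_kernels` are hypothetical, and it assumes that the `jupyter` command is on the PATH. The names it reports may differ from the display names listed above.

```python
import json
import shutil
import subprocess
from typing import Dict


def kernel_display_names(kernelspec_json: str) -> Dict[str, str]:
    """Map each kernel name to its display name.

    Expects the JSON document printed by `jupyter kernelspec list --json`.
    Falls back to the kernel name when no display name is recorded.
    """
    data = json.loads(kernelspec_json)
    specs = data.get("kernelspecs", {})
    return {
        name: info.get("spec", {}).get("display_name", name)
        for name, info in sorted(specs.items())
    }


def installed_kernels() -> Dict[str, str]:
    """Return the registered kernels, or an empty dict if jupyter is unavailable."""
    exe = shutil.which("jupyter")
    if exe is None:
        return {}
    result = subprocess.run(
        [exe, "kernelspec", "list", "--json"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return {}
    try:
        return kernel_display_names(result.stdout)
    except json.JSONDecodeError:
        return {}  # Output was not valid JSON, for example because of extra warnings.


if __name__ == "__main__":
    for name, display_name in installed_kernels().items():
        print(f"{name}: {display_name}")
```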

Ubuntu 20.04 DSVM, Windows Server 2019 DSVM, and Windows Server 2022 DSVM have the following conda environments:

  • Python3.8-default
  • Python3.8-Tensorflow-Pytorch
  • Python3.8-AzureML
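
To check which conda environments exist on a given DSVM, you can parse the output of `conda env list --json`, which reports the environment paths under an `envs` key. This is a rough sketch rather than a DSVM-provided tool: the helper names `env_names` and `installed_envs` are hypothetical, and the directory names on disk are an assumption that may not match the environment names listed above.

```python
import json
import re
import shutil
import subprocess
from typing import List


def env_names(conda_json: str) -> List[str]:
    """Extract sorted environment names from `conda env list --json` output.

    Each entry in "envs" is a full path; the last path component is used as
    the name. For the base environment this is the install directory name.
    """
    data = json.loads(conda_json)
    names = []
    for path in data.get("envs", []):
        # Split on both separators so Windows and Linux paths are handled.
        names.append(re.split(r"[\\/]", path.rstrip("\\/"))[-1])
    return sorted(names)


def installed_envs() -> List[str]:
    """Return environment names, or an empty list if conda is unavailable."""
    exe = shutil.which("conda")
    if exe is None:
        return []
    result = subprocess.run(
        [exe, "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    try:
        return env_names(result.stdout)
    except json.JSONDecodeError:
        return []


if __name__ == "__main__":
    for name in installed_envs():
        print(name)
```

After you find the environment you want, activate it with `conda activate <name>` before you run scripts that depend on its packages.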

Use your preferred editor or IDE

Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes
Notepad++


Nano


Visual Studio 2019 Community Edition
Visual Studio on the DSVM
Visual Studio Code


Visual Studio Code on the DSVM
PyCharm Community Edition


PyCharm on the DSVM
IntelliJ IDEA
Vim
Emacs
Git and Git Bash


OpenJDK 11


.NET Framework

Azure SDK

Organize & present results

Tool | Windows Server 2019 DSVM | Windows Server 2022 DSVM | Ubuntu 20.04 DSVM | Usage notes
Microsoft 365 (Word, Excel, PowerPoint)
Microsoft Teams
Power BI Desktop
Microsoft Edge Browser