What tools are included on the Azure Data Science Virtual Machine?

The Data Science Virtual Machine (DSVM) is an easy way to explore data and do machine learning in the cloud. Each DSVM image is preconfigured with a complete operating system, security patches, drivers, and popular data science and development software. You can choose the hardware environment, ranging from lower-cost CPU-centric machines to very powerful machines with multiple GPUs, NVMe storage, and large amounts of memory. On machines with GPUs, all drivers are installed, all machine learning frameworks are version-matched for GPU compatibility, and acceleration is enabled in all application software that supports GPUs.

The Data Science Virtual Machine comes with the most useful data-science tools pre-installed.
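As a quick way to confirm the GPU setup described above, the following sketch queries `nvidia-smi` for each GPU's name, driver version, and total memory. It assumes `nvidia-smi` is on the PATH (as it should be on a GPU-enabled DSVM) and returns an empty list otherwise. The function names `parse_gpu_rows` and `list_gpus` are illustrative only and are not part of any DSVM tooling.

```python
import shutil
import subprocess

# Standard nvidia-smi query flags: one CSV row per GPU, no header row.
QUERY = ["--query-gpu=name,driver_version,memory.total", "--format=csv,noheader"]


def parse_gpu_rows(csv_text):
    """Turn nvidia-smi CSV output into a list of dicts, one per GPU."""
    gpus = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) == 3:  # skip anything that is not a well-formed row
            name, driver, memory = fields
            gpus.append(
                {"name": name, "driver_version": driver, "memory_total": memory}
            )
    return gpus


def list_gpus():
    """Return GPU details, or an empty list if nvidia-smi is unavailable."""
    executable = shutil.which("nvidia-smi")
    if executable is None:
        return []
    result = subprocess.run(
        [executable, *QUERY], capture_output=True, text=True, check=False
    )
    return parse_gpu_rows(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    gpus = list_gpus()
    if not gpus:
        print("No NVIDIA GPU detected (or nvidia-smi is not on the PATH).")
    for gpu in gpus:
        print(f"{gpu['name']}: driver {gpu['driver_version']}, {gpu['memory_total']}")
```

On a CPU-only machine the script reports that no GPU was detected rather than failing.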

Build deep learning and machine learning solutions

Tool Windows Server 2019 DSVM Windows Server 2022 DSVM Ubuntu 20.04 DSVM Usage notes
CUDA, cuDNN, NVIDIA Driver
CUDA, cuDNN, NVIDIA Driver on the DSVM
Horovod Horovod on the DSVM
NVIDIA System Management Interface (nvidia-smi) nvidia-smi on the DSVM
PyTorch PyTorch on the DSVM
TensorFlow
TensorFlow on the DSVM
Integration with Azure Machine Learning (Python): Python SDK and samples on the Windows Server 2019 and Windows Server 2022 DSVMs; Python SDK, CLI, and samples on the Ubuntu 20.04 DSVM. Usage notes: Azure Machine Learning SDK
XGBoost (CUDA support on all three DSVM images). Usage notes: XGBoost on the DSVM
Vowpal Wabbit
Vowpal Wabbit on the DSVM
Weka
LightGBM
(GPU, MPI support)
H2O
CatBoost
Intel MKL
OpenCV
Dlib
Docker (Windows containers only on the Windows Server DSVMs)
NCCL
Rattle
PostgreSQL
ONNX Runtime
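To see which of the Python-facing tools listed above are available in the active environment, a minimal sketch like the following can help. The import names in `PYTHON_PACKAGES` are assumptions about how each tool is exposed to Python (for example, OpenCV as `cv2`), and the helper names are illustrative. Results depend on which conda environment is active, so a tool reported as missing may simply live in a different environment.

```python
import importlib.util
import shutil

# Tool name -> assumed Python import name.
PYTHON_PACKAGES = {
    "PyTorch": "torch",
    "TensorFlow": "tensorflow",
    "Horovod": "horovod",
    "XGBoost": "xgboost",
    "LightGBM": "lightgbm",
    "CatBoost": "catboost",
    "H2O": "h2o",
    "OpenCV": "cv2",
    "Dlib": "dlib",
    "ONNX Runtime": "onnxruntime",
}

# Tool name -> assumed command-line executable.
COMMANDS = {"Docker": "docker"}


def is_importable(module_name):
    """Check for a module without actually importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def check_tools():
    """Return a dict mapping each tool name to True if it was found."""
    report = {tool: is_importable(mod) for tool, mod in PYTHON_PACKAGES.items()}
    for tool, command in COMMANDS.items():
        report[tool] = shutil.which(command) is not None
    return report


if __name__ == "__main__":
    for tool, found in sorted(check_tools().items()):
        print(f"{tool:15s} {'found' if found else 'missing'}")
```

Using `find_spec` instead of a real import keeps the check fast, because large frameworks such as TensorFlow are located but not loaded.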

Store, retrieve, and manipulate data

Tool Windows Server 2019 DSVM Windows Server 2022 DSVM Ubuntu 20.04 DSVM Usage notes
Relational databases: SQL Server 2019 Developer Edition (all three DSVM images). Usage notes: SQL Server on the DSVM
Database tools:
  Windows Server 2019 DSVM: SQL Server Management Studio, SQL Server Integration Services, bcp, sqlcmd
  Windows Server 2022 DSVM: SQL Server Management Studio, SQL Server Integration Services, bcp, sqlcmd
  Ubuntu 20.04 DSVM: SQuirreL SQL (querying tool), bcp, sqlcmd, ODBC/JDBC drivers
Azure Storage Explorer

Azure CLI


AzCopy

AzCopy on the DSVM
Blob FUSE driver
blobfuse on the DSVM
Azure Cosmos DB Data Migration Tool Azure Cosmos DB on the DSVM
Unix/Linux command-line tools
Apache Spark 3.1 (standalone)

Program in Python, R, Julia, and Node.js

Tool Windows Server 2019 DSVM Windows Server 2022 DSVM Ubuntu 20.04 DSVM Usage notes
CRAN-R with popular packages pre-installed
Anaconda Python with popular packages pre-installed
(Miniconda)

(Miniconda)
Julia (Julialang)
JupyterHub (multiuser notebook server)
JupyterLab (multiuser notebook server)
Node.js
Jupyter Notebook Server with the following kernels:

Jupyter Notebook samples
     R R Jupyter Samples
     Python Python Jupyter Samples
     Julia Julia Jupyter Samples
     PySpark pySpark Jupyter Samples

The Ubuntu 20.04 DSVM, Windows Server 2019 DSVM, and Windows Server 2022 DSVM have the following Jupyter kernels:

  • Python3.8-default
  • Python3.8-Tensorflow-Pytorch
  • Python3.8-AzureML
  • R
  • Python 3.7 - Spark (local)
  • Julia 1.6.0
  • R Spark – HDInsight
  • Scala Spark – HDInsight
  • Python 3 Spark – HDInsight

The Ubuntu 20.04 DSVM, Windows Server 2019 DSVM, and Windows Server 2022 DSVM have the following conda environments:

  • Python3.8-default 
  • Python3.8-Tensorflow-Pytorch 
  • Python3.8-AzureML 
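To compare a running machine against the kernel and environment lists above, the following sketch reads the JSON output of `jupyter kernelspec list --json` and `conda env list --json`. It assumes both commands are on the PATH. The function names are illustrative, and the names on disk may differ from the display names shown in the lists (for example, the root conda environment appears under its install directory name).

```python
import json
import re
import shutil
import subprocess


def env_names(conda_json):
    """Extract environment directory names from `conda env list --json`."""
    paths = json.loads(conda_json).get("envs", [])
    # Split on both separators so Windows and Linux paths are handled alike.
    return [re.split(r"[/\\]", path.rstrip("/\\"))[-1] for path in paths]


def kernel_display_names(jupyter_json):
    """Map kernelspec name -> display name from `jupyter kernelspec list --json`."""
    specs = json.loads(jupyter_json).get("kernelspecs", {})
    return {
        name: info.get("spec", {}).get("display_name", name)
        for name, info in specs.items()
    }


def run_json(command):
    """Run a command and return its stdout, or None if it is unavailable or fails."""
    executable = shutil.which(command[0])
    if executable is None:
        return None
    result = subprocess.run(
        [executable, *command[1:]], capture_output=True, text=True, check=False
    )
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    conda_output = run_json(["conda", "env", "list", "--json"])
    if conda_output is not None:
        print("conda environments:", env_names(conda_output))

    jupyter_output = run_json(["jupyter", "kernelspec", "list", "--json"])
    if jupyter_output is not None:
        print("Jupyter kernels:", kernel_display_names(jupyter_output))
```

Keeping the parsing separate from the subprocess calls makes it possible to run the parsers on saved output from another machine.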

Use your preferred editor or IDE

Tool Windows Server 2019 DSVM Windows Server 2022 DSVM Ubuntu 20.04 DSVM Usage notes
Notepad++


Nano


Visual Studio 2019 Community Edition
Visual Studio on the DSVM
Visual Studio Code


Visual Studio Code on the DSVM
PyCharm Community Edition


PyCharm on the DSVM
IntelliJ IDEA
Vim
Emacs
Git and Git Bash


OpenJDK 11


.NET Framework

Azure SDK

Organize and present results

Tool Windows Server 2019 DSVM Windows Server 2022 DSVM Ubuntu 20.04 DSVM Usage notes
Microsoft 365 (Word, Excel, PowerPoint)
Microsoft Teams
Power BI Desktop
Microsoft Edge Browser