Databricks 7.3ML vs 7.0ML

Rohit Pillai 1 Reputation point Microsoft Employee


I'm trying to run a distributed training job using Horovod and PyTorch on Azure Databricks, using 2 nodes with 2 GPUs each. When I run my code on version 7.0ML, the code sees each node as having 2 GPUs, which is the expected behavior. However, on 7.3ML, the code sees each node as having only 1 GPU. Why is this happening? Is there some way to change this so that each node shows its original configuration with multiple GPUs?
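
For reference, here is a minimal sketch of how the GPU count seen by each process could be compared across the two runtimes. It assumes the difference might come from the `CUDA_VISIBLE_DEVICES` environment variable being set per process, which is only a guess and not something confirmed for 7.3ML. The helper names `visible_gpu_ids` and `report_gpus` are hypothetical and exist only for this illustration.

```python
import os
import socket


def visible_gpu_ids(env=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        # Variable not set: CUDA applies no restriction for this process.
        return None
    # An empty string yields an empty list, meaning no GPU is visible.
    return [part.strip() for part in raw.split(",") if part.strip()]


def report_gpus():
    """Collect what the current process can see (hypothetical helper)."""
    info = {
        "host": socket.gethostname(),
        "cuda_visible_devices": visible_gpu_ids(),
    }
    try:
        import torch  # optional: only used if PyTorch is installed

        info["torch_device_count"] = torch.cuda.device_count()
    except ImportError:
        info["torch_device_count"] = None
    return info


if __name__ == "__main__":
    print(report_gpus())
```

Calling `report_gpus()` inside the training function on each worker, once on 7.0ML and once on 7.3ML, should show whether the 1-GPU view comes from a per-process device restriction or from the node itself.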

Azure Databricks