Has Anyone Been Able to Successfully Parallelize Their Code on a VM?

Gregory Jacobs 11 Reputation points
2021-04-19T02:31:24.8+00:00

I wrote a script to train a large number of time series models on my Azure VM and it uses the multiprocessing.Pool package to parallelize the code to reduce the training time period. The code runs perfectly on my local machine, but does not parallelize across the available cores when I run the script on my Azure VM. If anyone has run into this problem with the Pool package or has alternate suggestions as to how to parallelize training on a VM, I'd appreciate your insight. Thanks!

Azure Machine Learning
Azure Virtual Machines
Azure Virtual Machines
An Azure service that is used to provision Windows and Linux virtual machines.
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Ramr-msft 17,831 Reputation points
    2021-04-19T14:25:19.597+00:00

    @Gregory Jacobs Thanks for the question. For parallel/distributed training - if deep learning, frameworks like PyTorch and Tensorflow can typically be distributed directly or use something like Horovod to easily scale out on Azure ML Clusters - see an example here.

    Another I've been watching is HyperGBM/Hypernets: DataCanvasIO/Hypernets: A General Automated Machine Learning framework to simplify the development of End-to-end AutoML toolkits in specific domains. (github.com), DataCanvasIO/HyperGBM: A full pipeline AutoML tool for tabular data (github.com).


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.