Error while deploying Whisper model in batch pipeline

Question

Arie Youlus 21

I'm trying to deploy the OpenAI Whisper model with a batch pipeline, following the example notebook: https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/parallel-run/file-dataset-image-inference-mnist.ipynb

I'm using the STANDARD_NC6S_V3 Machine.

I keep getting the following error:

Error '/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11
  File "/mnt/azureml/cr/j/3f034c6e7a1b4166b24b196339e7b655/exe/wd/whisper_transcribe.py", line 3, in <module>
    import whisper
  File "/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/site-packages/whisper/__init__.py", line 8, in <module>
    import torch
  File "/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/site-packages/torch/__init__.py", line 191, in <module>
    _load_global_deps()
  File "/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/site-packages/torch/__init__.py", line 153, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/ctypes/__init__.py", line 382, in __init__
    self._handle = _dlopen(self._name, mode)'

I can't find what causes the error, but I suspect it is related to the machine I'm deploying on, because the error did not appear on a different machine.
Any help would be much appreciated.
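
The path in the error shows libcublas.so.11 being loaded from the pip-installed nvidia/cublas directory inside the conda environment. An undefined versioned symbol like this typically means the dynamic linker paired it with a different copy of libcublasLt.so.11, for example one found earlier on LD_LIBRARY_PATH. That cause is an assumption here, not something the thread confirms. A minimal sketch for listing the copies the linker could pick up (the helper names `ld_library_dirs` and `find_cublas_copies` are hypothetical) might look like this:

```python
import os
from pathlib import Path


def ld_library_dirs(env=None):
    """Return the non-empty directories listed in LD_LIBRARY_PATH, in order."""
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]


def find_cublas_copies(search_dirs, pattern="libcublasLt.so*"):
    """Return every file matching `pattern` in the given directories.

    Directories are scanned in the order given; missing ones are skipped.
    """
    found = []
    for directory in search_dirs:
        path = Path(directory)
        if path.is_dir():
            found.extend(sorted(str(f) for f in path.glob(pattern)))
    return found


if __name__ == "__main__":
    # Print each libcublasLt copy visible through LD_LIBRARY_PATH.
    for lib in find_cublas_copies(ld_library_dirs()):
        print(lib)
```

Running this at the top of the scoring script, before `import whisper`, would show whether the base image exposes its own libcublasLt in addition to the one under site-packages/nvidia/cublas/lib.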

romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator

2022-11-04T05:42:23.303+00:00

@Arie Youlus I think the error is caused by missing dependencies in your environment. It seems to fail at the import statements, probably those in your scoring script.
Are you defining an environment similar to the one in the example notebook?

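from azureml.core import Environment
from azureml.core.runconfig import CondaDependencies, DEFAULT_CPU_IMAGE

# Environment from the example notebook (TensorFlow MNIST scoring on CPU)
batch_conda_deps = CondaDependencies.create(python_version="3.7",
                                            conda_packages=['pip==20.2.4'],
                                            pip_packages=["tensorflow==1.15.2", "pillow", "protobuf==3.20.1",
                                                          "azureml-core", "azureml-dataset-runtime[fuse]"])
batch_env = Environment(name="batch_environment")
batch_env.python.conda_dependencies = batch_conda_deps
batch_env.docker.base_image = DEFAULT_CPU_IMAGE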

Arie Youlus 21 Reputation points

2022-11-04T08:21:24.09+00:00

I edited the dependencies to match the Whisper model's requirements:

from azureml.core import Environment
from azureml.core.runconfig import CondaDependencies, DEFAULT_GPU_IMAGE

batch_conda_deps = CondaDependencies.create(python_version="3.9",
                                            conda_packages=['pip==20.2.4', "ffmpeg"],
                                            pip_packages=["azureml-core", "azureml-dataset-runtime[fuse]",
                                                          "git+https://github.com/openai/whisper.git", "ffmpeg-python"])

batch_env = Environment(name="batch_environment")
batch_env.python.conda_dependencies = batch_conda_deps
batch_env.docker.base_image = DEFAULT_GPU_IMAGE

As I mentioned, the code did run on a different machine.
Arie Youlus 21 Reputation points

2022-11-06T08:39:52.837+00:00

I found out that the error is tied to the DEFAULT_GPU_IMAGE base image: when I switch to DEFAULT_CPU_IMAGE, everything works as expected.
The problem is that I'm using a machine with a GPU, and I can't get the GPU base image to work on it.
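
Since the failure only appears with the GPU base image, one thing that may be worth trying is to control which PyTorch build pip installs instead of letting Whisper pull in the newest release. The sketch below is an assumption and not a fix confirmed in this thread: the helper `build_pip_packages` is hypothetical, and the pin `torch==1.12.1` is only an illustrative choice of an older release.

```python
def build_pip_packages(torch_pin="torch==1.12.1"):
    """Return the pip package list for the batch environment.

    `torch_pin` is an assumed, illustrative pin; pass None to leave
    PyTorch unpinned, which matches the original environment.
    """
    packages = [
        "azureml-core",
        "azureml-dataset-runtime[fuse]",
        "git+https://github.com/openai/whisper.git",
        "ffmpeg-python",
    ]
    if torch_pin:
        # Listing torch explicitly constrains the version pip resolves
        # when it installs Whisper's dependencies.
        packages.insert(0, torch_pin)
    return packages
```

The returned list would be passed as `pip_packages=` to `CondaDependencies.create(...)` in the environment definition, with `DEFAULT_GPU_IMAGE` kept as the base image.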
Arie Youlus 21 Reputation points

2022-11-09T16:32:06.59+00:00

I'm still waiting for help; I can't find what causes the problem.
romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator

2022-11-30T08:31:54.59+00:00

@Arie Youlus I have recently worked with another user reporting the same issue and they were able to resolve the issue with a workaround. Please see this thread for details. Does this help for your scenario?
