Error while deploying Whisper model in batch pipeline
I'm trying to deploy the OpenAI Whisper model with a batch pipeline, following the example notebook: https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/parallel-run/file-dataset-image-inference-mnist.ipynb
I'm using a STANDARD_NC6S_V3 machine.
I keep getting the following error:
- Error '/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: undefined symbol: cublasLtGetStatusString, version libcublasLt.so.11
File "/mnt/azureml/cr/j/3f034c6e7a1b4166b24b196339e7b655/exe/wd/whisper_transcribe.py", line 3, in <module>
File "/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/site-packages/whisper/__init__.py", line 8, in <module>
File "/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/site-packages/torch/__init__.py", line 191, in <module>
File "/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/site-packages/torch/__init__.py", line 153, in _load_global_deps
File "/azureml-envs/azureml_2b0a8ce0115582fe46e2aa65a9665d55/lib/python3.9/ctypes/__init__.py", line 382, in __init__
self._handle = _dlopen(self._name, mode)'
I can't find what causes the error, but I think it has to do with the machine I'm deploying on, because with a different machine, the error did not appear.
Any help would be much appreciated.
@Arie Youlus I think the error is due to missing dependencies in your environment. It seems to fail at the import statements, probably those in your scoring script.
Are you defining an environment similar to the one mentioned in the example notebook?
batch_conda_deps = CondaDependencies.create(python_version="3.7",
    conda_packages=['pip==20.2.4'],
    pip_packages=["tensorflow==1.15.2", "pillow", "protobuf==3.20.1", "azureml-core", "azureml-dataset-runtime[fuse]"])
batch_env = Environment(name="batch_environment")
batch_env.python.conda_dependencies = batch_conda_deps
batch_env.docker.base_image = DEFAULT_CPU_IMAGE
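To confirm that the failure really happens at import time, independent of the pipeline, a small import check can be run inside the same environment on the compute target. This is only a minimal sketch: the helper name check_imports is made up for illustration, and the module list (torch, whisper) is taken from the traceback above.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: None or an error string}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # A failed shared-library load (as in the traceback) surfaces as OSError;
            # a missing package surfaces as ModuleNotFoundError.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Module names taken from the traceback in the question.
    for name, error in check_imports(["torch", "whisper"]).items():
        print(f"{name}: {'ok' if error is None else error}")
```

If torch alone already fails with the same undefined-symbol message, the scoring script and Whisper itself can be ruled out as the cause.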
I have edited the dependencies to match the Whisper model's dependencies:
from azureml.core import Environment
from azureml.core.runconfig import CondaDependencies, DEFAULT_GPU_IMAGE
batch_conda_deps = CondaDependencies.create(python_version="3.9",
    pip_packages=["azureml-core", "azureml-dataset-runtime[fuse]", "git+https://github.com/openai/whisper.git", "ffmpeg-python"])
batch_env = Environment(name="batch_environment")
batch_env.python.conda_dependencies = batch_conda_deps
batch_env.docker.base_image = DEFAULT_GPU_IMAGE
As I mentioned, the code did run with a different machine.
I found out that the problem is related to the DEFAULT_GPU_IMAGE base image: after switching to DEFAULT_CPU_IMAGE, everything works as expected.
The problem is that I'm using a GPU machine and I can't get the GPU base image to work with it.
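One possible explanation, which is an assumption and not something confirmed in this thread, is that the GPU base image ships its own CUDA libraries and the loader picks up a libcublasLt that does not match the libcublas.so.11 installed under site-packages/nvidia/cublas/lib. The sketch below lists every libcublas copy visible from LD_LIBRARY_PATH and from the pip-installed locations seen in the traceback. The function names are hypothetical, and the nvidia/cublas/lib and torch/lib subdirectories are taken from the error path above.

```python
import glob
import os
import sys


def find_cublas_copies(search_dirs):
    """Return sorted real paths of libcublas*.so* files found directly in search_dirs."""
    found = set()
    for directory in search_dirs:
        # Skip empty entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for path in glob.glob(os.path.join(directory, "libcublas*.so*")):
            # Resolve symlinks so the same file is not reported twice.
            found.add(os.path.realpath(path))
    return sorted(found)


def candidate_dirs():
    """Directories to inspect: LD_LIBRARY_PATH plus the pip wheel locations."""
    dirs = os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    for base in sys.path:
        # Layout assumed from the traceback: .../site-packages/nvidia/cublas/lib
        dirs.append(os.path.join(base, "nvidia", "cublas", "lib"))
        dirs.append(os.path.join(base, "torch", "lib"))
    return dirs


if __name__ == "__main__":
    for path in find_cublas_copies(candidate_dirs()):
        print(path)
```

If the output shows libcublasLt in more than one location when run on the GPU image, a version mismatch between the image's CUDA libraries and the pip-installed ones would be worth investigating.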
I'm still waiting for some help; I can't find what causes the problem.
@Arie Youlus I have recently worked with another user reporting the same issue and they were able to resolve the issue with a workaround. Please see this thread for details. Does this help for your scenario?