Unable to Include en_core_web_sm Model in Azure ML Pipeline Environment

Kavinamoole, Abhishek 0 Reputation points
2023-06-01T18:48:24.4766667+00:00

I have built a custom pipeline in Azure Machine Learning that runs a script called data_prep.py. The script needs the en_core_web_sm model from the spaCy library for data cleaning.

I defined an environment YAML file for the pipeline and listed the spacy package as a dependency. However, when I also add the en_core_web_sm model to the YAML file, the environment fails to build.
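For context on the YAML approach: spaCy models are normally installed with pip from a wheel URL on the explosion/spacy-models GitHub releases page, not by bare package name. The sketch below builds such a requirement line, which could be placed under the `pip:` section of a conda environment file. The helper name `model_pip_requirement` and the version `3.5.0` are assumptions for illustration; the model version has to match the installed spaCy minor version, and the environment build must be able to reach GitHub.

```python
def model_pip_requirement(name="en_core_web_sm", version="3.5.0"):
    """Build a pip requirement line that installs a spaCy model wheel by URL.

    The default version is an assumption; use the one matching your spaCy release.
    """
    base = "https://github.com/explosion/spacy-models/releases/download"
    return f"{name} @ {base}/{name}-{version}/{name}-{version}-py3-none-any.whl"


if __name__ == "__main__":
    # Paste the printed line as an entry under `pip:` in the environment YAML.
    print(model_pip_requirement())
```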

I also tried downloading the en_core_web_sm model manually, storing it in a folder, and referencing that folder's path in data_prep.py, but this approach did not work either.
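One possible cause of the manual approach failing is pointing `spacy.load` at the outer folder of the extracted archive. In spaCy v3 model archives, the loadable directory is usually a nested one that contains `config.cfg` and `meta.json`. The following is a minimal sketch, assuming a spaCy v3 model layout; the helper names `find_spacy_model_dir` and `load_local_model` are hypothetical.

```python
from pathlib import Path


def find_spacy_model_dir(root):
    """Return the first directory under `root` holding config.cfg and meta.json."""
    root = Path(root)
    # sorted() keeps the choice deterministic if several candidates exist.
    for cfg in sorted(root.rglob("config.cfg")):
        if (cfg.parent / "meta.json").is_file():
            return cfg.parent
    raise FileNotFoundError(f"No spaCy model directory found under {root}")


def load_local_model(root):
    """Load the model from disk. Assumes spacy is installed in the environment."""
    import spacy  # imported lazily so the path lookup can be checked on its own

    return spacy.load(find_spacy_model_dir(root))
```

In data_prep.py, the folder would be passed relative to the script or as a pipeline input, for example `nlp = load_local_model(Path(__file__).parent / "models")`, where `models` is an assumed folder name.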

I would greatly appreciate any guidance on how to correctly include the en_core_web_sm model in my Azure ML pipeline environment. Is there a recommended way to include spaCy models in Azure ML environments? How can I make sure the en_core_web_sm model is accessible to my pipeline and usable from data_prep.py?

Thank you for your assistance.

Azure Machine Learning

1 answer

  1. Kavinamoole, Abhishek 0 Reputation points
    2023-06-05T04:17:49.0566667+00:00

    Due to restrictions imposed by my organization, the Azure environment cannot connect to the internet to download from GitHub. I tried the previously suggested method but got an SSL error. Are there any alternative approaches? Could you please advise on how to proceed?
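
One offline option, sketched here under assumptions, is to download the model wheel on a machine that does have internet access, upload it alongside the pipeline code, and install it from the local file at the start of the script. The wheel path and the helper names below are hypothetical. `--no-index` stops pip from contacting any package index, and `--no-deps` assumes spacy itself is already installed in the environment.

```python
import subprocess
import sys
from pathlib import Path


def local_wheel_install_cmd(wheel_path):
    """Build a pip command that installs a wheel file without network access."""
    wheel = Path(wheel_path)
    if wheel.suffix != ".whl":
        raise ValueError(f"Expected a .whl file, got: {wheel}")
    # sys.executable ensures pip targets the interpreter running the script.
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--no-deps", str(wheel)]


def install_local_wheel(wheel_path):
    """Run the install and raise CalledProcessError if pip fails."""
    subprocess.run(local_wheel_install_cmd(wheel_path), check=True)
```

After a successful install, `spacy.load("en_core_web_sm")` should resolve the model by name, provided the wheel version is compatible with the installed spaCy version.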

