Does ML Studio/Designer/AutoML support Natural Language Processing?

Micro-Me 1 Reputation point
2021-03-16T14:41:34.633+00:00

Hey everyone,
I can't find this stated explicitly anywhere in Microsoft's documentation, so I was wondering whether I can create models (via AutoML or via the manual designer) from datasets containing natural-language text (a couple of sentences, paragraphs, etc.). AutoML doesn't indicate anywhere that it can process paragraphs using NLP.
There are Text Analytics features in the designer, and I heard about Azure AutoML's BERT support, so I suppose it should be possible, but I just wanted to make sure.
Right now I can upload such a dataset and create a classification model based on it, but I don't know whether it treats a cell containing a paragraph as one long opaque string or actually processes the individual words.
Could anyone let me know, please? And if it does support NLP, what can I do besides classification? Can it do sentiment analysis, entity extraction etc.?
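
To make the distinction concrete, here is a minimal sketch of the two behaviours I mean. It is only an illustration in plain Python and not a description of what AutoML or the designer does internally; the function names and the two sample sentences are made up for this example.

```python
import re
from collections import Counter


def whole_string_feature(text):
    """Treat the entire cell as one opaque category value."""
    return {text: 1}


def bag_of_words(text):
    """Split the cell into lowercase word tokens and count them."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return dict(Counter(tokens))


def shared_features(a, b):
    """Return the feature names that two rows have in common."""
    return sorted(set(a) & set(b))


# Two hypothetical cells that differ by a single word.
row_1 = "Click here to win a prize"
row_2 = "Click here to claim a prize"

# As whole strings, the rows are simply two different categories.
as_category = shared_features(whole_string_feature(row_1), whole_string_feature(row_2))

# As tokens, the rows share most of their features.
as_tokens = shared_features(bag_of_words(row_1), bag_of_words(row_2))

print(as_category)  # []
print(as_tokens)    # ['a', 'click', 'here', 'prize', 'to']
```

If the text column is handled the first way, a model cannot learn anything from near-identical sentences; handled the second way, it can. That is the difference I am trying to find out about.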

Thanks a lot!

(I don't see a tag for AutoML or the designer, that's why I tagged the classic Studio.)

Azure Machine Learning
An Azure machine learning service for building and deploying models.

2 answers

  1. Micro-Me 1 Reputation point
    2021-03-17T11:16:24.56+00:00

    @YutongTie-5848 (For whatever reason the reply button doesn't work for me, that's why I posted it this way.)

    Well, let's say it's an e-mail classification. You have a .csv file with the text of an e-mail in one column and its class in the other. An example could be:
    "Click here to win $1000.",spam
    "Hey, how are you?",normal
    "Hello, PFA pictures.",normal

    This is just an example where each e-mail is a single sentence, but you can imagine e-mails being longer (a paragraph or even more). You obviously can't treat a paragraph of text (like this one) the same way you would treat a classification where the text is just one word.

    I saw e-mail classification (and other similar tasks) being done in the Azure AI Gallery (https://gallery.azure.ai/Experiment/Email-Classification-for-Automated-Support-Ticket-Generation-Step-1-of-2-Train-and-Evaluate-Models-3), but that's in the studio/designer, not AutoML.

    My expected result is classification but I wanted to know if AutoML can do other common tasks where NLP is used (named-entity recognition, sentiment analysis etc.).

    When I choose the featurization settings, I see that there is a "Text" option as a feature type but one word is probably also "Text". So I'm asking if AutoML processes long strings in a different way.

    I know there is the API but I would like to create models specifically via AutoML (or designer but preferably AutoML) for now.


  2. YutongTie-MSFT 53,981 Reputation points Moderator
    2021-03-22T19:39:06.593+00:00

    @Micro-Me

    Hello,

    Here is a list of the samples we have right now: https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning

    I see the first three scenarios are very similar to yours. Could you please check if that fits your scenario?


    Regards,
    Yutong

