Hello James,
There are several ways to build NLP solutions with Microsoft services. One is to use Azure Cognitive Service for Language; another is to use Apache Spark as the basis for a customized NLP framework.
In Azure, Spark services like Azure Databricks, Azure Synapse Analytics, and Azure HDInsight provide NLP functionality when you use them with Spark NLP. Azure Cognitive Services is another option for NLP functionality. To decide which service to use, consider these questions:
- Do you want to use prebuilt or pretrained models? If yes, consider using the APIs that Azure Cognitive Services offers. Or download your model of choice through Spark NLP.
- Do you need to train custom models against a large corpus of text data? If yes, consider using Azure Databricks, Azure Synapse Analytics, or Azure HDInsight with Spark NLP.
- Do you need low-level NLP capabilities like tokenization, stemming, lemmatization, and term frequency/inverse document frequency (TF/IDF)? If yes, consider using Azure Databricks, Azure Synapse Analytics, or Azure HDInsight with Spark NLP. Or use an open-source software library in your processing tool of choice.
- Do you need simple, high-level NLP capabilities like entity and intent identification, topic detection, spell check, or sentiment analysis? If yes, consider using the APIs that Cognitive Services offers. Or download your model of choice through Spark NLP.
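If it helps, the four questions above can be sketched as a small checklist in Python. This is only an illustration of the decision logic, not an Azure API: the class `NlpNeeds`, its field names, and the function `suggest_services` are hypothetical names I made up for this example.

```python
from dataclasses import dataclass
from typing import List

# Option labels taken from the questions above (plain strings, not API names).
COGNITIVE = "Azure Cognitive Services APIs"
SPARK_NLP = "Azure Databricks, Azure Synapse Analytics, or Azure HDInsight with Spark NLP"
SPARK_NLP_MODEL = "A model of your choice downloaded through Spark NLP"
OPEN_SOURCE = "An open-source library in your processing tool of choice"


@dataclass
class NlpNeeds:
    """Hypothetical checklist: one yes/no flag per question."""
    prebuilt_models: bool = False
    custom_training_on_large_corpus: bool = False
    low_level_capabilities: bool = False   # tokenization, stemming, TF/IDF, ...
    high_level_capabilities: bool = False  # entities, sentiment, spell check, ...


def suggest_services(needs: NlpNeeds) -> List[str]:
    """Return the options to consider, in order, without duplicates."""
    suggestions: List[str] = []

    def add(option: str) -> None:
        if option not in suggestions:
            suggestions.append(option)

    if needs.prebuilt_models or needs.high_level_capabilities:
        add(COGNITIVE)
        add(SPARK_NLP_MODEL)
    if needs.custom_training_on_large_corpus:
        add(SPARK_NLP)
    if needs.low_level_capabilities:
        add(SPARK_NLP)
        add(OPEN_SOURCE)
    return suggestions


if __name__ == "__main__":
    # Example: custom training plus low-level processing.
    for option in suggest_services(
        NlpNeeds(custom_training_on_large_corpus=True, low_level_capabilities=True)
    ):
        print("-", option)
```

The function only returns options to consider; the final choice still depends on your scenario, such as data volume and cost.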
For more information, I would invite you to check this document: https://learn.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/natural-language-processing#capability-matrix
Please choose the option that fits your scenario, and let us know if you have any questions.
Regards,
Yutong
- Please kindly accept the answer if you find it helpful, to support the community. Thanks a lot.