Data is stored in the data lake and needs to be extracted from the data lake when a customer queries it via chat. Could this be done using the SparkNLP library on Azure Synapse?
Yes, you can do that. I hope this article helps you find the steps:
@Nara Kanga While it looks like you can use SparkNLP to add NLP to your pipelines and extract semantic information from your data, I don't think it alone can power an LLM-based chatbot.
These chatbots use a pattern called Retrieval Augmented Generation (RAG), which typically uses a vector database to perform semantic search over an indexed dataset; the retrieved results are then fed to a generative AI model to produce the final response.
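To make the retrieval half of that pattern concrete, here is a minimal, library-free sketch. The `embed` function is a crude bag-of-words stand-in for illustration only; a real system would use model-generated embeddings (e.g. from Azure OpenAI or SparkNLP) and a vector database rather than an in-memory list.

```python
import math

def embed(text, vocab):
    # Toy embedder: one dimension per vocabulary word. A real pipeline
    # would call an embedding model here instead.
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    # Cosine similarity is the usual "semantic closeness" measure.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "invoices are stored in the data lake under the finance container",
    "customer chat transcripts are retained for ninety days",
]
vocab = sorted({w for d in documents for w in d.lower().split()})
# "Indexing" here is just precomputing a vector per document.
index = [(d, embed(d, vocab)) for d in documents]

def retrieve(query, k=1):
    # Rank documents by similarity to the query vector; a vector
    # database does this same nearest-neighbour search at scale.
    qv = embed(query, vocab)
    ranked = sorted(index, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve("where are invoices stored"))
```

The top-ranked passages are then what gets handed to the generative model, rather than the raw contents of the data lake.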
The Azure OpenAI on your Data feature builds this on top of Azure Cognitive Search, which can ingest and index data using OpenAI-generated embeddings, but you could use any vector database for this approach.
There are a few possible architectures you could explore, such as the Embedding Approach mentioned there, which uses Redis as the vector database; instead of an Azure Function, you could use SparkNLP to generate the embeddings for your data in the data lake.
Your chatbot would then retrieve semantically similar information from Redis and generate responses by feeding that information to an LLM.
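The generation step is then mostly prompt assembly: the retrieved passages are placed into the model's prompt as grounding context. A minimal sketch of that hand-off, where `call_llm` is a hypothetical placeholder for whatever chat-completion API you deploy (e.g. Azure OpenAI):

```python
def build_prompt(question, passages):
    # Instructing the model to answer only from the supplied passages
    # is what keeps the chatbot grounded in your data.
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )

def call_llm(prompt):
    # Placeholder: in practice this would be a chat-completion call
    # against your deployed model. Here it just echoes a stub.
    return f"[model response grounded in {prompt.count('- ')} passage(s)]"

passages = ["invoices are stored in the data lake under the finance container"]
answer = call_llm(build_prompt("where are invoices stored?", passages))
print(answer)
```

The same structure holds whatever the vector store is; only the retrieval call changes if you swap Redis for Azure Cognitive Search.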