Serverless AI Chat with RAG using LangChain.js
This sample shows how to build a serverless AI chat experience with Retrieval-Augmented Generation using LangChain.js and Azure. The application is hosted on Azure Static Web Apps and Azure Functions, with Azure Cosmos DB for NoSQL as the vector database. You can use it as a starting point for building more complex AI applications.
Overview
Building AI applications can be complex and time-consuming, but using LangChain.js and Azure serverless technologies greatly simplifies the process. This application is a chatbot that uses a set of enterprise documents to generate responses to user queries.
We provide sample data to make this sample ready to try, but feel free to replace it with your own. We use a fictitious company called Contoso Real Estate, and the experience allows its customers to ask support questions about the usage of its products. The sample data includes a set of documents that describe its terms of service, privacy policy, and a support guide.
This application is made from multiple components:
- A web app made with a single chat web component built with Lit and hosted on Azure Static Web Apps. The code is located in the `packages/webapp` folder.
- A serverless API built with Azure Functions and using LangChain.js to ingest the documents and generate responses to the user chat queries. The code is located in the `packages/api` folder.
- A database to store the text extracted from the documents and the vectors generated by LangChain.js, using Azure Cosmos DB for NoSQL.
- A file storage to store the source documents, using Azure Blob Storage.
Prerequisites
- Node.js LTS
- Azure Developer CLI
- Git
- Azure account. If you're new to Azure, get an Azure account for free to receive some free Azure credits to get started. If you're a student, you can also get free credits with Azure for Students.
- Azure subscription with access enabled for the Azure OpenAI service. You can request access with this form.
- Azure account permissions:
  - Your Azure account must have `Microsoft.Authorization/roleAssignments/write` permissions, such as Role Based Access Control Administrator, User Access Administrator, or Owner. If you don't have subscription-level permissions, you must be granted RBAC for an existing resource group and deploy to that existing group.
  - Your Azure account also needs `Microsoft.Resources/deployments/write` permissions on the subscription level.
Set up the sample
You can run this project directly in your browser by using GitHub Codespaces, which will open a web-based VS Code.
- Fork the project to create your own copy of this repository.
- On your forked repository, select the Code button, then the Codespaces tab, and click on the button Create codespace on main.
- Wait for the Codespace to be created; this should take a few minutes.
Deploy on Azure
- Open a terminal at the root of the project.
- Authenticate with Azure by running `azd auth login`.
- Run `azd up` to deploy the application to Azure. This will provision Azure resources, deploy this sample, and build the search index based on the files found in the `./data` folder.
  - You will be prompted to select a base location for the resources. If you're unsure of which location to choose, select `eastus2`.
  - By default, the OpenAI resource will be deployed to `eastus2`. You can set a different location with `azd env set AZURE_OPENAI_RESOURCE_GROUP_LOCATION <location>`. Currently only a short list of locations is accepted; that list is based on the OpenAI model availability table and may become outdated as availability changes.
The deployment process will take a few minutes. Once it's done, you'll see the URL of the web app in the terminal.
You can now open the web app in your browser and start chatting with the bot.
Key concepts
Our API is composed of two main endpoints:
- `/documents`: This endpoint lets you upload PDF documents into the database. Using LangChain.js, we extract the text from the PDF file, split it into smaller chunks, and generate vectors for each chunk. We store the text and the vectors in the database for later use (see the first sketch after this list).
- `/chat`: This endpoint receives a list of messages, the last one being the user query, and returns a response generated by the LLM. It uses the documents stored in the database to generate the response. We use LangChain.js components to connect to the database, load the documents, and perform a vector search after vectorizing the user query. The most relevant documents are then injected into the prompt, and we generate the response. While this process may seem complex, LangChain.js does all the heavy lifting so we can focus on the application flow (see the second sketch after this list).
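To make the ingestion pipeline concrete, here is a minimal sketch using LangChain.js components. The file name, chunk sizes, and database/container names are illustrative assumptions, not the exact code from `packages/api`:

```ts
// Minimal sketch of the /documents ingestion flow (illustrative names,
// not the exact code from packages/api).
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { AzureOpenAIEmbeddings } from "@langchain/openai";
import { AzureCosmosDBNoSQLVectorStore } from "@langchain/azure-cosmosdb";

// 1. Extract the text from the PDF file
const rawDocuments = await new PDFLoader("./data/support.pdf").load();

// 2. Split the text into smaller, overlapping chunks
const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 100 });
const chunks = await splitter.splitDocuments(rawDocuments);

// 3. Generate a vector for each chunk and store text + vectors in Cosmos DB
// (Azure credentials are read from environment variables)
await AzureCosmosDBNoSQLVectorStore.fromDocuments(chunks, new AzureOpenAIEmbeddings(), {
  databaseName: "vectorSearchDB",         // illustrative
  containerName: "vectorSearchContainer", // illustrative
});
```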
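And here is a hedged sketch of the retrieval flow behind `/chat`: vectorize the query, search the vector store, inject the matching chunks into the prompt, and generate the answer. Again, all names and parameters are illustrative:

```ts
// Hedged sketch of the /chat retrieval flow (names are illustrative).
import { AzureChatOpenAI, AzureOpenAIEmbeddings } from "@langchain/openai";
import { AzureCosmosDBNoSQLVectorStore } from "@langchain/azure-cosmosdb";

async function chat(messages: { role: string; content: string }[]): Promise<string> {
  // The last message is the user query
  const question = messages[messages.length - 1].content;

  // Vectorize the query and retrieve the most relevant chunks
  const vectorStore = new AzureCosmosDBNoSQLVectorStore(new AzureOpenAIEmbeddings(), {
    databaseName: "vectorSearchDB",         // illustrative
    containerName: "vectorSearchContainer", // illustrative
  });
  const documents = await vectorStore.similaritySearch(question, 3);

  // Inject the retrieved chunks into the prompt and generate the answer
  const context = documents.map((doc) => doc.pageContent).join("\n\n");
  const llm = new AzureChatOpenAI(); // reads Azure OpenAI settings from env vars
  const response = await llm.invoke([
    ["system", `Answer the question using only this context:\n${context}`],
    ["human", question],
  ]);
  return String(response.content);
}
```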
The `/documents` endpoint is used to ingest the documents after the application is deployed, by uploading the PDFs using either `curl` commands or the Node.js script we built (have a look at the `postup` hook in the `azure.yaml` file).
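If you'd rather script the upload yourself than use `curl`, something like the following works with Node.js 18+'s built-in `fetch`. The endpoint route and the `file` form field are assumptions; check the repository's upload script for the exact contract:

```ts
// Hypothetical upload of a PDF to the /documents endpoint using Node.js 18+.
// Assumes the API accepts multipart/form-data with a "file" field.
import { readFile } from "node:fs/promises";

const apiUrl = process.env.API_URL ?? "http://localhost:7071"; // local Functions default port
const buffer = await readFile("./data/support.pdf");

const form = new FormData();
form.append("file", new Blob([buffer], { type: "application/pdf" }), "support.pdf");

const response = await fetch(`${apiUrl}/api/documents`, { method: "POST", body: form });
console.log(response.status, await response.text());
```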
The web app is a simple chat interface that sends the user queries to the `/chat` endpoint and displays the responses.
We use the HTTP protocol for AI chat apps to communicate between the web app and the API.
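As a rough illustration, a request following this protocol's general shape is a JSON body with a `messages` array of role/content pairs; the exact fields are defined by the protocol specification, so treat this sketch as an assumption:

```ts
// Rough shape of a chat request from the web app (fields follow the AI chat
// protocol's common form; verify against the protocol specification).
const response = await fetch("/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [{ role: "user", content: "What's covered by the support plan?" }],
  }),
});
console.log(await response.json());
```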
Clean up
To clean up all the Azure resources created by this sample:
- Run `azd down --purge`.
- When asked if you are sure you want to continue, enter `y`.
The resource group and all the resources will be deleted.
Troubleshooting
If you have any issues when running or deploying this sample, please check the troubleshooting guide. If you can't find a solution to your problem, please open an issue in this repository.
Next steps
Here are some resources to learn more about the technologies used in this sample:
- LangChain.js documentation
- Generative AI For Beginners
- Azure OpenAI Service
- Azure Cosmos DB for NoSQL
- Ask YouTube: LangChain.js + Azure Quickstart sample
- Chat + Enterprise data with Azure OpenAI and Azure AI Search
- Revolutionize your Enterprise Data with Chat: Next-gen Apps w/ Azure OpenAI and AI Search
You can also find more Azure AI samples here.