Private data chat on Langchain

Janarthanan S 700 Reputation points
2023-09-21T07:00:18.7166667+00:00

Hi Team,

I am trying to create a simple Q&A chat application on my private data using LangChain with the Azure OpenAI Service, but have been unable to get it working. Can you please provide a code snippet for this? It's difficult to find any resources on it.

Thanks in advance.

Regards,

Janarthanan S

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

Accepted answer
  1. santoshkc 8,775 Reputation points Microsoft Vendor
    2023-10-09T05:50:44.9433333+00:00

    Hi @Janarthanan S ,

    Glad to know that your issue has been resolved. And thanks for sharing your feedback.

    To capture the resolution here, let me summarize the gist of my first comment above.

    I understand that you are trying to create a Q&A chat application on private data using LangChain with Azure OpenAI service. I will be happy to assist you with this.

    Please refer to the code snippet at the URL below:

    https://github.com/microsoft/azure-openai-in-a-day-workshop/blob/main/qna-chat-with-langchain/qna-chat-with-langchain.ipynb

    Please 'Accept as answer' and 'Upvote' if it helped, so that it can help others in the community looking for help on similar topics.


1 additional answer

  1. ChakaravarthiRangarajanBhargavi-1820 715 Reputation points
    2023-09-22T12:42:22.2533333+00:00

    Hi Janarthanan S,

    Thanks for the question. To create a simple Q&A chat application on private data using LangChain with the Azure OpenAI Service, I would recommend the following steps:

    Before we get started with LangChain and the Azure OpenAI Service, make sure that you have the following prerequisites:

    • An active Azure subscription

    • Access granted to Azure OpenAI in the desired Azure subscription (you can request access at https://aka.ms/oai/access)

    • An Azure OpenAI resource with a model deployed

    • Python 3.7 or higher

    • The LangChain library installed (you can do so via pip install langchain)

    Steps to create:

    First, create a .env file and add your Azure OpenAI Service details:

    OPENAI_API_KEY=xxxxxx
    OPENAI_API_BASE=https://xxxxxxxx.openai.azure.com/
    OPENAI_API_VERSION=2023-05-15
    

    Next, make sure that you have gpt-35-turbo and text-embedding-ada-002 deployed, using the same name as the model itself for the deployment name.


    Let’s install the latest versions of openai and langchain via pip:

    pip install openai --upgrade
    pip install langchain --upgrade
    
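    Depending on your environment, you may also need a few supporting packages (these are the common PyPI package names; adjust if your setup differs): python-dotenv for load_dotenv(), faiss-cpu for the FAISS vector store used below, and tiktoken for the token-based text splitter:

    pip install python-dotenv faiss-cpu tiktoken
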

    We’re using openai==0.27.8 and langchain==0.0.240; newer releases of both libraries have changed their APIs, so pin these versions to follow along exactly.

    First, let’s initialize our Azure OpenAI Service connection and create the LangChain objects:

    import os
    import openai
    from dotenv import load_dotenv
    from langchain.chat_models import AzureChatOpenAI
    from langchain.embeddings import OpenAIEmbeddings
     
    # Load environment variables (set OPENAI_API_KEY, OPENAI_API_BASE, and OPENAI_API_VERSION in .env)
    load_dotenv()
     
    # Configure OpenAI API
    openai.api_type = "azure"
    openai.api_base = os.getenv('OPENAI_API_BASE')
    openai.api_key = os.getenv("OPENAI_API_KEY")
    openai.api_version = os.getenv('OPENAI_API_VERSION')
     
    # Initialize gpt-35-turbo and our embedding model
    llm = AzureChatOpenAI(deployment_name="gpt-35-turbo")
    embeddings = OpenAIEmbeddings(deployment_id="text-embedding-ada-002", chunk_size=1)
    
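    As a quick sanity check (an optional snippet, not part of the original walkthrough), you can send a single message to the chat model to confirm that the connection details and deployment name are correct:

    from langchain.schema import HumanMessage

    # Should print a short reply from your gpt-35-turbo deployment
    print(llm([HumanMessage(content="Say hello in one short sentence.")]).content)
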

    Next, we can load up a bunch of text files, chunk them and embed them. LangChain supports many different document loaders (https://python.langchain.com/docs/modules/data_connection/document_loaders.html), which makes it easy to adapt to other data sources and file formats. You can use your own sample data.

    from langchain.document_loaders import DirectoryLoader
    from langchain.document_loaders import TextLoader
    from langchain.text_splitter import TokenTextSplitter

    # Load all .txt files from data/qna/ and split them into ~1000-token chunks
    loader = DirectoryLoader('data/qna/', glob="*.txt", loader_cls=TextLoader, loader_kwargs={'autodetect_encoding': True})
    documents = loader.load()
    text_splitter = TokenTextSplitter(chunk_size=1000, chunk_overlap=0)
    docs = text_splitter.split_documents(documents)
    
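    Before embedding anything, it can be worth verifying that the loader and splitter produced sensible chunks; this optional check simply prints the counts and a preview of the first chunk:

    # Optional: confirm documents were loaded and split as expected
    print(f"Loaded {len(documents)} documents, split into {len(docs)} chunks")
    print(docs[0].page_content[:200])  # preview the first chunk
    print(docs[0].metadata)            # e.g. {'source': 'data/qna/your-file.txt'}
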

    Next, let’s ingest the documents into FAISS so we can efficiently query our embeddings:

    from langchain.vectorstores import FAISS
    db = FAISS.from_documents(documents=docs, embedding=embeddings)
    
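    You can query the index directly before wiring up the chain; similarity_search returns the chunks closest to a query, and save_local/load_local (available on the FAISS vector store in this LangChain version) let you persist the index so you don't re-embed on every run:

    # Sanity-check retrieval: print the two chunks most similar to a query
    for doc in db.similarity_search("What is Azure OpenAI Service?", k=2):
        print(doc.page_content[:100], "...")

    # Optional: persist the index and reload it later instead of re-embedding
    db.save_local("faiss_index")
    # db = FAISS.load_local("faiss_index", embeddings)
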

    Lastly, we can create our document question-answering chat chain. In this case, we specify the question prompt, which converts the user’s question to a standalone question, in case the user asked a follow-up question:

    from langchain.chains import ConversationalRetrievalChain
    from langchain.prompts import PromptTemplate

    # Adapt if needed
    CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template("""Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.
    Chat History:
    {chat_history}
    Follow Up Input: {question}
    Standalone question:""")

    qa = ConversationalRetrievalChain.from_llm(llm=llm,
                                               retriever=db.as_retriever(),
                                               condense_question_prompt=CONDENSE_QUESTION_PROMPT,
                                               return_source_documents=True,
                                               verbose=False)

    Let’s ask a question:

    chat_history = []
    query = "what is Azure OpenAI Service?"
    result = qa({"question": query, "chat_history": chat_history})
    print("Question:", query)
    print("Answer:", result["answer"])

    From here, we can also ask follow-up questions:

    chat_history = [(query, result["answer"])]
    query = "Which regions does the service support?"
    result = qa({"question": query, "chat_history": chat_history})
    print("Question:", query)
    print("Answer:", result["answer"])
    
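    To make this interactive, here is a minimal sketch (not part of the original answer) of a console loop that accumulates chat_history as (question, answer) tuples, which is the format ConversationalRetrievalChain expects:

    # Minimal interactive chat loop; type 'exit' to stop
    chat_history = []
    while True:
        query = input("You: ")
        if query.strip().lower() in ("exit", "quit"):
            break
        result = qa({"question": query, "chat_history": chat_history})
        print("Bot:", result["answer"])
        chat_history.append((query, result["answer"]))
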

    Here we go! We have created a simple Q&A chat application on private data using LangChain with the Azure OpenAI Service.

    Please try out these steps with your data and check that they work. Hope this answer helps! Please comment below if you need any further assistance. Happy to help!

    Regards,

    Chakravarthi Rangarajan Bhargavi

    Please kindly accept the answer and vote 'Yes' if you found it helpful, to support the community. Thanks a lot.

