Creating a domain-specific chatbot using Azure's Open AI service is an exciting project! However, as you've discovered, the process can be challenging. Here are some potential solutions to your issues: Accessing Data and Pretrained Knowledge: Using the Retrieve and Rank (RAG) methodology, your chatbot should indeed use your provided data in combination with its pre-existing knowledge. However, it's important to note that when the conversation involves a search query, the model will primarily use the fetched documents to respond, rather than relying on its own internal knowledge. Training on Custom Files: As of now (last update in 2023), fine-tuning or direct training of GPT models on custom data is not available in Azure Cognitive Service. You can only add data sources from which the model retrieves information to answer specific queries in the conversation. Web App Deployment Disappearing: It seems unusual for the deployed Web App to disappear and require redeployment. It might be a temporary issue or a glitch on the platform. If this continues, please reach out to Azure support for direct assistance.
Given these constraints, here's what you can do to create your tech news assistant: Retrieval Over Ingestion Model: Since GPT can't be trained on custom data in Azure, consider adopting a "retrieval over ingestion" approach where you regularly update your data source (in your case, Blob Storage) with the latest tech news articles. The updated data source can then be indexed by Azure Cognitive Search, allowing the chatbot to retrieve up-to-date information from those articles. Improve Retrieval with Tags & Metadata: You can improve the retrieval effectiveness by introducing tags and metadata in your data. For example, each tech news article could have metadata about the topics covered, published date, author, etc. This would help in fetching more accurate documents relevant to user queries. Use QnA Maker for Frequently Asked Questions: For common queries or standard responses, consider using another service like Azure QnA Maker. You can feed it pairs of questions and answers related to your domain. During a conversation, if a user asks something similar, the bot will provide the corresponding answer.
For your next steps: Consider adopting a "retrieval over ingestion" approach and updating your data source regularly. Investigate improving data retrieval with metadata and tags. Explore using Azure's QnA Maker for handling frequently asked questions.
Please let me know if this helps or if you need further clarification or guidance!