Query regarding indexing SQL Server database stored in Azure Blob Storage with Azure Cognitive Search Basic Tier

Dinnemidi Ananda Kumar 60 Reputation points
2024-04-03T18:08:56.0833333+00:00

Hi Azure community,

I'm currently working on a project where I have to load a SQL Server database into Azure Blob Storage, and I'm exploring the possibility of indexing the full database data using the Basic Tier of Azure Cognitive Search.

My specific questions are:

  1. Is it feasible to index the entire SQL Server database data stored in Azure Blob Storage using the Basic Tier of Azure Cognitive Search?
  2. Are there any limitations or constraints I should be aware of when using the Basic Tier for indexing a large dataset, approximately 20 GB in size?
  3. If indexing the entire dataset is not feasible with the Basic Tier, what are some recommended strategies or approaches to handle large datasets in Azure Cognitive Search?
  4. Additionally, once the data is indexed, I plan to use Azure OpenAI to train a model for the creation of a chatbot. Are there any considerations or best practices I should keep in mind when integrating Azure Cognitive Search data with Azure OpenAI for chatbot training?

Any insights, recommendations, or best practices would be greatly appreciated.

Thank you!

Azure Storage Accounts
Azure AI Search
Azure OpenAI Service

Accepted answer
  1. joseandresc 480 Reputation points Microsoft Employee
    2024-04-03T18:39:02.1633333+00:00

    Hello @Dinnemidi Ananda Kumar, thanks for reaching out.

    For questions 1 and 2: yes, the Basic SKU will be a constraint, because it currently provides 2 GB of storage, well below your roughly 20 GB dataset. I would recommend the Standard SKU, which provides 25 GB of storage per partition. You can start with a single partition and add another later if you need to scale out.
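To make the sizing arithmetic concrete, here is a minimal sketch that uses only the per-partition figures quoted in this answer (2 GB for Basic, 25 GB for Standard). The function names are hypothetical, and the index size on the service may differ from the size of the source data, so treat the 20 GB input as an estimate.

```python
import math

# Per-partition storage (GB) as quoted in the answer above. Actual limits can
# vary by region and service creation date, so these values are assumptions.
TIER_STORAGE_GB = {"basic": 2, "standard": 25}


def fits_in_one_partition(index_size_gb: float, tier: str) -> bool:
    """Return True if the index fits within a single partition of the tier."""
    return index_size_gb <= TIER_STORAGE_GB[tier]


def partitions_needed(index_size_gb: float, tier: str) -> int:
    """Return the minimum number of partitions whose combined storage holds the index."""
    return max(1, math.ceil(index_size_gb / TIER_STORAGE_GB[tier]))


if __name__ == "__main__":
    for tier in TIER_STORAGE_GB:
        print(tier, fits_in_one_partition(20, tier), partitions_needed(20, tier))
```

With a 20 GB estimate, a single Standard partition is enough, while the same data exceeds what one Basic partition holds.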

    As for indexing large datasets efficiently (question 3), the Standard SKU should cover your storage requirements. Depending on the indexing speed you need, however, you may want to look at this documentation, which describes the different approaches to optimizing indexing throughput, using either the Push API or indexers.
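If you go the Push API route, documents are sent in batches, and a single indexing request accepts at most 1,000 documents (there is also a payload size cap per request, which this sketch does not check). The helper below is a hypothetical illustration of splitting exported rows into batches. It does not call the service; the actual upload call, for example `SearchClient.upload_documents` from the `azure-search-documents` package, is left as a comment.

```python
from typing import Any, Dict, Iterable, Iterator, List

MAX_DOCS_PER_BATCH = 1000  # per-request document limit for the push API


def make_batches(
    docs: Iterable[Dict[str, Any]], batch_size: int = MAX_DOCS_PER_BATCH
) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of at most `batch_size` documents, preserving input order."""
    if not 1 <= batch_size <= MAX_DOCS_PER_BATCH:
        raise ValueError("batch_size must be between 1 and 1000")
    batch: List[Dict[str, Any]] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, partially filled batch
        yield batch


if __name__ == "__main__":
    # Hypothetical rows standing in for data exported from SQL Server.
    rows = ({"id": str(i), "content": f"row {i}"} for i in range(2500))
    for batch in make_batches(rows):
        # A real upload would go here, e.g. search_client.upload_documents(documents=batch)
        print(len(batch))
```

Because the helper consumes any iterable lazily, it can be fed a database cursor or a file reader without loading the full dataset into memory.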

    As for Azure OpenAI integration (question 4), you can use AI Search with the Azure OpenAI On Your Data tooling. One of the highlighted recommendations is to enable semantic ranking to improve the precision of the retrieved results, keeping in mind the potential pricing increase that comes with the feature.
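For reference, when querying the index directly through the Search REST API, semantic ranking is requested per query via the `queryType` and `semanticConfiguration` fields of the search request body. The sketch below only builds that body and sends nothing. The configuration name `my-semantic-config` is a placeholder and has to match a semantic configuration already defined on your index.

```python
import json
from typing import Any, Dict


def build_semantic_query(
    search_text: str, semantic_configuration: str, top: int = 5
) -> Dict[str, Any]:
    """Build a search request body that asks for semantic ranking of the results."""
    if top < 1:
        raise ValueError("top must be at least 1")
    return {
        "search": search_text,
        "queryType": "semantic",
        # Must name a semantic configuration that exists on the index.
        "semanticConfiguration": semantic_configuration,
        "top": top,
    }


if __name__ == "__main__":
    body = build_semantic_query(
        "How do I reset my password?", "my-semantic-config"  # placeholder name
    )
    print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a POST to the index's `docs/search` endpoint. When using the On Your Data tooling instead, the equivalent choice is made in the data source settings, not in a hand-built query.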


1 additional answer

  1. Dinnemidi Ananda Kumar 60 Reputation points
    2024-04-04T06:03:20.37+00:00

    Hello Azure community,

    I've recently integrated Azure AI Search with Azure OpenAI to create a question-answering application. I've uploaded a text file into the storage account, indexed it using Azure AI Search, and deployed the GPT-3.5 Turbo 16k model in Azure OpenAI. However, I'm encountering an issue where the application is not providing answers for certain questions from the indexed data.

    Since I have indexed the text file data in Azure AI Search and added it to Azure OpenAI, I expected the application to give correct answers. I've ensured that the data is properly formatted and relevant to the questions asked. Additionally, I've configured the indexing process correctly, and the deployment of the Azure OpenAI application seems to be functioning properly.

    Despite these efforts, I'm still facing challenges with retrieving answers for some questions. Could anyone provide insights or suggestions on how to troubleshoot and resolve this issue? Are there any known limitations or best practices I should be aware of when using Azure OpenAI for question-answering applications integrated with Azure AI Search?

    Any help or guidance would be greatly appreciated. Thank you in advance!