Plan your question answering app

To plan your question answering app, you need to understand how question answering works and interacts with other Azure services. You should also have a solid grasp of knowledge base concepts.

Azure resources

Each Azure resource created with question answering has its own purpose, limits, and pricing tier. It's important to understand the function of these resources so that you can apply that knowledge to your planning process.

Resource                     Purpose
Language resource            Authoring, query prediction endpoint, and telemetry
Cognitive Search resource    Data storage and search

Resource planning

Question answering throughput is currently capped at 10 text records per second for both management APIs and prediction APIs. To target 10 text records per second for your service, we recommend the S1 (one instance) SKU of Azure Cognitive Search.
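A client can stay under this cap by pacing its own calls. The following is a minimal sketch of client-side pacing, assuming each request carries one text record; the `RateLimiter` class and its parameters are hypothetical names for illustration and are not part of any Azure SDK.

```python
import time


class RateLimiter:
    """Client-side pacing: allow at most `rate` calls per second."""

    def __init__(self, rate=10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate   # seconds between two calls
        self._clock = clock              # injectable for testing
        self._sleep = sleep
        self._next_allowed = None        # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

Call `limiter.wait()` immediately before each management or prediction request. The service can still throttle a request, so a client should also handle throttling responses.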

Language resource

A single language resource with the custom question answering feature enabled can host more than one project/knowledge base. The number of projects/knowledge bases is determined by the Cognitive Search pricing tier's quantity of supported indexes. Learn more about the relationship of indexes to knowledge bases.

Knowledge base size and throughput

When you build a real app, plan sufficient resources for the size of your knowledge base and for your expected query prediction requests.

Knowledge base size is controlled by the Cognitive Search resource pricing tier limits and by the question answering limits.

The knowledge base query prediction request is controlled by the web app plan and web app. Refer to recommended settings to plan your pricing tier.

Understand the impact of resource selection

Proper resource selection means your knowledge base answers query predictions successfully.

If your knowledge base isn't functioning properly, it's typically an issue of improper resource management.

Improper resource selection requires investigation to determine which resource needs to change.


A project/knowledge base is directly tied to its language resource. It holds the question and answer (QnA) pairs that are used to answer query prediction requests.

Language considerations

You can now have projects in different languages within the same language resource where the custom question answering feature is enabled. When you create the first project, you choose whether the resource uses a single language for all subsequent projects/knowledge bases or whether you select a language each time you create a project.

Ingest data sources

Question answering also supports unstructured content, which you upload as a file.

Currently we do not support URLs for unstructured content.

The ingestion process converts supported content types to markdown. All further editing of the answer is done with markdown. After you create a knowledge base, you can edit QnA pairs in Language Studio with rich text authoring.

Data format considerations

Because the final format of a QnA pair is markdown, it's important to understand markdown support.

Bot personality

Add a bot personality to your project/knowledge base with chit-chat. This personality comes through in answers that use a certain conversational tone, such as professional or friendly. Chit-chat is provided as a conversational set of QnA pairs that you are free to add to, edit, and remove.

A bot personality is recommended if your bot connects to your knowledge base. You can choose to use chit-chat in your knowledge base even if you also connect to other services, but you should review how the bot service interacts to know if that is the correct architectural design for your use.

Conversation flow with a project

Conversation flow usually begins with a salutation from a user, such as Hi or Hello. Your knowledge base can answer with a general answer, such as Hi, how can I help you, and it can also provide a selection of follow-up prompts to continue the conversation.

You should design your conversational flow with a loop in mind so that a user knows how to use your bot and isn't abandoned by the bot in the conversation. Follow-up prompts provide linking between QnA pairs, which allows for the conversational flow.
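One way to review a flow before publishing is to treat the QnA pairs as a graph and look for pairs that leave the user with nowhere to go. The sketch below is illustrative only: the `QNA_PAIRS` data, its layout, and both helper functions are hypothetical and do not reflect the project export format.

```python
# Hypothetical QnA pairs: id -> (answer text, ids of follow-up prompts).
QNA_PAIRS = {
    1: ("Hi, how can I help you?", [2, 3]),
    2: ("You can reset your password on the account page.", [1]),
    3: ("Our support line is open on weekdays.", []),
}


def find_dead_ends(pairs):
    """Return ids of pairs that offer no follow-up prompt."""
    return sorted(qid for qid, (_, prompts) in pairs.items() if not prompts)


def find_broken_links(pairs):
    """Return (source id, target id) for prompts that point at a missing pair."""
    return sorted(
        (qid, target)
        for qid, (_, prompts) in pairs.items()
        for target in prompts
        if target not in pairs
    )
```

Here pair 3 is a dead end. Adding a prompt from pair 3 back to pair 1 would close the loop.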

Authoring with collaborators

Collaborators may be other developers who share the full development stack of the knowledge base application or may be limited to just authoring the knowledge base.

Knowledge base authoring supports several role-based access permissions you apply in the Azure portal to limit the scope of a collaborator's abilities.

Integration with client applications

Integration with client applications is accomplished by sending a query to the prediction runtime endpoint. A query is sent to your specific project/knowledge base with an SDK or REST-based request to your question answering web app endpoint.

To authenticate a client request correctly, the client application must send the correct credentials and knowledge base ID. If you're using an Azure Bot Service, configure these settings as part of the bot configuration in the Azure portal.
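As an illustration, a REST-based query could be assembled as follows. This is a sketch under assumptions: the URL path, the `api-version` value, the `Ocp-Apim-Subscription-Key` header, and the `question` field reflect one version of the question answering REST API as best understood here and should be checked against the current API reference. The function name, endpoint host, key, and project name are placeholders.

```python
import json
import urllib.parse
import urllib.request


def build_query_request(endpoint, key, project, question,
                        deployment="production", api_version="2021-10-01"):
    """Build (but do not send) a prediction request for one project."""
    params = urllib.parse.urlencode({
        "projectName": project,          # identifies the project/knowledge base
        "deploymentName": deployment,    # assumed deployment name
        "api-version": api_version,      # assumed API version
    })
    url = f"{endpoint.rstrip('/')}/language/:query-knowledgebases?{params}"
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,   # resource key used as the credential
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the JSON response body.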

Conversation flow in a client application

Conversation flow in a client application, such as an Azure bot, may require functionality before and after interacting with the knowledge base.

Does your client application support conversation flow, either by providing alternate means to handle follow-up prompts or by including chit-chat? If so, design these early and make sure the client application query is handled correctly, whether by another service or when sent to your knowledge base.

Active learning from a client application

Question answering uses active learning to improve your knowledge base by suggesting alternate questions to an answer. The client application is responsible for a part of this active learning. Through conversational prompts, the client application can determine that the knowledge base returned an answer that's not useful to the user, and it can determine a better answer. The client application needs to send that information back to the knowledge base to improve the prediction quality.
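The feedback a client sends back can be as small as the user's question and the ID of the QnA pair the user picked as the better answer. The sketch below only builds such a body. The field names (`records`, `userId`, `userQuestion`, `qnaId`) are assumptions based on one version of the feedback REST operation, and `build_feedback_payload` is a hypothetical helper. Verify both against the API reference before use.

```python
def build_feedback_payload(user_id, user_question, chosen_qna_id):
    """Return a feedback body naming the QnA pair the user found most useful."""
    return {
        "records": [
            {
                "userId": user_id,              # any stable identifier for the user
                "userQuestion": user_question,  # the question as the user typed it
                "qnaId": chosen_qna_id,         # id of the pair the user selected
            }
        ]
    }
```

A client would typically call this after showing the user several candidate answers and recording which one was selected.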

Providing a default answer

If your knowledge base doesn't find an answer, it returns the default answer. This answer is configurable on the Settings page.

This default answer is different from the Azure bot default answer. You configure the default answer for your Azure bot in the Azure portal as part of configuration settings. It's returned when the score threshold isn't met.
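The client-side half of this behavior can be sketched as a threshold check on the returned answers. The `answer` and `confidenceScore` keys are assumed response field names, and `pick_answer`, the threshold value, and the fallback text are hypothetical.

```python
def pick_answer(answers, threshold=0.5,
                client_default="Sorry, I don't have an answer for that."):
    """Return the best answer text, or the client default below the threshold."""
    best = max(answers, key=lambda a: a["confidenceScore"], default=None)
    if best is None or best["confidenceScore"] < threshold:
        return client_default
    return best["answer"]
```

With this shape, the knowledge base default answer and the client default stay separately configurable, which matches the distinction described above.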


The prediction is the response from your knowledge base, and it includes more information than just the answer. To get a query prediction response, use the question answering API.

Prediction score fluctuations

A score can change based on several factors:

  • Number of answers you requested in response with the top property
  • Variety of available alternate questions
  • Filtering for metadata
  • Query sent to test or production project/knowledge base

Analytics with Azure Monitor

In question answering, telemetry is offered through the Azure Monitor service. Use our top queries to understand your metrics.

Development lifecycle

The development lifecycle of a knowledge base is ongoing: editing, testing, and publishing your knowledge base.

Knowledge base development of question answer pairs

Your QnA pairs should be designed and developed based on your client application usage.

Each pair can contain:

  • Metadata - tags your QnA pairs with additional information about the source, content, format, and purpose of your data, and is filterable when querying.
  • Follow-up prompts - help to determine a path through your knowledge base so the user arrives at the correct answer.
  • Alternate questions - allow search to match your answer from different forms of the question. Active learning suggestions turn into alternate questions.
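To make the three parts concrete, the following sketch models a QnA pair as a small data class. The `QnaPair` type and its field names are hypothetical and chosen for illustration; they are not the service's storage or export schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class QnaPair:
    id: int
    answer: str
    # First entry is the main question; the rest are alternate questions.
    questions: List[str]
    # Free-form tags, for example {"source": "faq", "audience": "admin"}.
    metadata: Dict[str, str] = field(default_factory=dict)
    # Ids of the pairs offered as follow-up prompts.
    prompts: List[int] = field(default_factory=list)


def filter_by_metadata(pairs, key, value):
    """Return the pairs whose metadata has `key` set to `value`."""
    return [p for p in pairs if p.metadata.get(key) == value]
```

Planning pairs in this shape early makes it easier to decide which metadata keys the client application will filter on.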

DevOps development

Developing a knowledge base to insert into a DevOps pipeline requires that the knowledge base is isolated during batch testing.

A knowledge base shares the Cognitive Search index with all other knowledge bases on the language resource. While the knowledge base is isolated by partition, sharing the index can cause a difference in the score when compared to the published knowledge base.

To have the same score on the test and production knowledge bases, isolate a language resource to a single knowledge base. In this architecture, the resource only needs to live as long as the isolated batch test.
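A batch test in a pipeline can be as simple as a list of questions with the answer each one is expected to return. The sketch below assumes a caller-supplied `query_fn` that wraps the prediction call and returns an answer ID and a score. `run_batch_test` and its parameters are hypothetical names.

```python
def run_batch_test(query_fn, cases, min_score=0.0):
    """Run each (question, expected_id) case; return the list of failures.

    query_fn(question) must return a tuple (answer_id, score).
    """
    failures = []
    for question, expected_id in cases:
        answer_id, score = query_fn(question)
        if answer_id != expected_id or score < min_score:
            failures.append((question, expected_id, answer_id, score))
    return failures
```

Running the same cases against the isolated test resource and against production, then comparing the failure lists, would show whether index sharing is affecting scores.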
