This article answers general and transparency-related frequently asked questions about Microsoft Copilot for Azure in Cosmos DB.
General
What can Copilot do?
Copilot helps you write NoSQL queries against your own data with ease and confidence, boosting your productivity through AI-powered natural language-to-query generation.
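For example, given a hypothetical container of product items with "category" and "price" properties (these names are illustrative, not part of the product), a prompt such as "Show me all products in the electronics category that cost less than 100 dollars" might produce a query along these lines:

```sql
-- Hypothetical example: the container and property names below are illustrative.
-- Prompt: "Show me all products in the electronics category that cost less than 100 dollars"
SELECT *
FROM c
WHERE c.category = "electronics" AND c.price < 100
```

Copilot also returns a natural language explanation of what the generated query does.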
What data was used to train Copilot?
Copilot is powered by large language models (LLMs) in Azure OpenAI that are pretrained and then configured to generate Azure Cosmos DB for NoSQL queries and natural language explanations.
How does a user get the most out of Copilot?
Users can get the most out of their experience by following these steps:
- Input a prompt for AI to generate a query on a specific Azure Cosmos DB container. Users can type a natural language prompt in the Copilot box and select the Generate Query arrow button. Copilot then generates an Azure Cosmos DB for NoSQL query that matches the prompt and displays it in the query editor along with an explanation.
- Modify the prompt to be more specific and regenerate the query: If the user isn't satisfied with the query generated by Copilot, they can select the Regenerate button to ask the AI to generate a different query based on the refined prompt.
- Send feedback: Users can provide feedback to the Copilot team by using the feedback mechanism included with the query prompt. This feedback is used to improve the performance and quality of Copilot responses.
Transparency
What data does Copilot collect and how might it be used?
Copilot relies on the schema of items in your Azure Cosmos DB container to work. It collects data to provide the service, some of which is then retained for analysis, error mitigation, and product improvements. Per the preview terms of use, your data might be stored and processed outside of your tenant's geographic region, compliance boundary, or national cloud instance.
Collected data includes:
- Service data: When you use Copilot in Azure Cosmos DB, it collects usage information about events generated when you interact with the Copilot service. This data includes information such as a timestamp, database ID, collection ID, HTTP response code, and HTTP request latency. This data might be used for service improvements and error mitigation.
- Logging: If an error occurs in the Azure Cosmos DB service, we log the error and other data used by the service at the time of the error. These logs might include information such as the prompt you entered to Copilot, the generated query, or the information about your data schema sent to the Copilot service. This data might be used for service improvements and error mitigation.
- Feedback: Users have the option to give feedback on a specific query. This feedback data also contains the prompt submitted to Copilot by the user, the generated query and explanation, and any feedback the user would like to provide to Microsoft. This data might be used to improve the product.
How is the transmitted prompt and query data protected?
Copilot takes several measures to protect data including:
- Copilot-related data is encrypted both in transit and at rest: in transit by using TLS, and at rest by using Microsoft Azure's data encryption (FIPS Publication 140-2 standards).
- Access to log and feedback data is strictly controlled. The data is stored in a separate subscription and is accessible only through Just-In-Time (JIT) approval by Azure operations personnel using secure admin workstations.
Will my private prompts, queries, or data be shared with others?
No. Prompts, queries, and any other data aren't shared with others.
Where can I learn more about privacy and data protection?
For more information on how Copilot processes and uses personal data, see the Microsoft Privacy Statement.
Terms and limitations
Where can I find the preview terms for using Azure OpenAI-powered previews like Copilot?
For more information, see our preview terms.
What is Copilot's intended use?
You can generate Azure Cosmos DB for NoSQL queries from your own natural language questions and prompts within the Azure Cosmos DB Data Explorer. Each generated output also contains a natural (English) language description of the query operations. While in public preview, the performance and accuracy might be limited. Humans should review and validate all queries generated by Copilot before use.
How was Copilot evaluated? What metrics are used to measure performance?
Copilot is evaluated with test data and prompts on several metrics, including the following (an illustrative example appears after the list):
- Validity: The generated query is a valid Azure Cosmos DB for NoSQL query that can be executed on the selected container.
- Correctness: The generated query is one that would be expected in response to the user's prompt.
- Accuracy: The generated query returns the results that are relevant to and expected for the user's prompt.
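As an illustrative sketch of how these metrics differ (the prompt, container, and property names below are hypothetical), a generated query can be valid yet still fall short on correctness or accuracy:

```sql
-- Hypothetical prompt: "How many orders were placed in 2023?"
-- Assumes a container of order items with an "orderDate" string property.

-- Valid but not correct: the query executes without error, yet it returns
-- the matching orders themselves rather than the count the prompt asked for.
SELECT * FROM c WHERE STARTSWITH(c.orderDate, "2023")

-- Valid, correct, and accurate: the query returns the count of matching
-- orders, which is what the prompt asked for.
SELECT VALUE COUNT(1) FROM c WHERE STARTSWITH(c.orderDate, "2023")
```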
What are the limitations of Copilot?
Copilot is a feature that helps users write NoSQL queries for Azure Cosmos DB by providing suggestions based on natural language input. However, it has some limitations that users should be aware of and try to minimize. Some of the limitations include:
- Rate limits: Copilot limits how many queries a user can execute. If a user exceeds five calls per minute, or eight hours of total usage per day, they can receive an error message. The user will then have to wait until the next time window to use Copilot again.
- Limited accuracy: Copilot is in public preview, which means that performance and accuracy might be limited. Humans should review and validate all queries generated by Copilot before use.
- The queries generated might not be accurate or provide the results the user intended to receive. Copilot isn't a perfect system and can sometimes generate queries that are incorrect, incomplete, or irrelevant (see the illustrative example after this list). These errant queries could happen due to:
- Ambiguity in the natural language prompt
- Limitations of the underlying natural language processing
- Limitations of the underlying query generation models
- Other issues
- Users should always review the queries generated by Copilot and verify that they match their expectations and requirements. Users should also provide feedback to the Copilot team if they encounter any errors or issues with the queries. Users can submit feedback directly through the feedback mechanism in the Copilot interface.
- English-only support: Copilot officially supports English as the input and output language. Users who want to use Copilot in other languages can experience degraded quality and accuracy of results.
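To illustrate how ambiguity in a prompt can lead to different queries (the prompt, container, and property names below are hypothetical), consider the prompt "Show me the latest orders":

```sql
-- Hypothetical prompt: "Show me the latest orders"
-- Assumes a container of order items with an "orderDate" property that
-- stores ISO 8601 date strings.

-- Interpretation 1: the 10 most recently placed orders.
SELECT TOP 10 * FROM c ORDER BY c.orderDate DESC

-- Interpretation 2: all orders placed within the last seven days.
SELECT * FROM c
WHERE c.orderDate >= DateTimeAdd("dd", -7, GetCurrentDateTime())
```

Both queries are valid, so reviewing the generated query and refining the prompt are the ways to make sure the result matches the user's intent.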
Does Copilot write perfect or optimal queries?
Copilot aims to provide accurate and informative responses based on the available data. The answers generated by Copilot are based on patterns and probabilities in language data, which means that they might not always be accurate. Humans should carefully review, test, and validate all content generated by Copilot.
To mitigate the risk of sharing unexpected offensive content in results and displaying potentially harmful articles, Copilot has several measures in place. Despite these measures, you can still encounter unexpected results. We're constantly working to improve our technology to proactively address issues in line with our responsible AI principles.
What should I do if I see unexpected or offensive outputs?
Copilot uses Azure OpenAI's customized content filters to block offensive language in prompts and to avoid synthesizing suggestions in sensitive contexts. These safeguards help Copilot uphold our responsible AI principles while helping users write NoSQL queries for Azure Cosmos DB. If you still encounter unexpected or offensive output, report it to the Copilot team by using the feedback mechanism in the Copilot interface.