Data, privacy, and security for Azure OpenAI Service

This article provides details regarding how data you provide to the Azure OpenAI Service is processed, used, and stored. Azure OpenAI stores and processes data to provide the service, monitor for abusive use, and develop and improve the quality of Azure's Responsible AI systems. See also the Microsoft Products and Services Data Protection Addendum, which governs data processing by the Azure OpenAI Service except as otherwise provided in the applicable Product Terms.

Azure OpenAI was designed with compliance, privacy, and security in mind; however, customers are responsible for how they use and implement this technology.

What data does the Azure OpenAI Service process?

Azure OpenAI processes the following types of data:

  • Text prompts, queries, and responses submitted by the user via the completions, search, and embeddings operations.
  • Training and validation data. You can provide your own training data consisting of prompt-completion pairs for the purposes of fine-tuning an OpenAI model (see the example after this list).
  • Results data from the training process. After training a fine-tuned model, the service outputs metadata about the job, including the number of tokens processed and validation scores at each step.
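
For illustration, the sketch below writes a small set of prompt-completion pairs in the JSONL format commonly used for fine-tuning training and validation data. The file name and example pairs are hypothetical; check the current Azure OpenAI documentation for the exact format your api-version expects.

```python
import json

# Hypothetical prompt-completion pairs for fine-tuning (illustrative only).
training_pairs = [
    {"prompt": "Classify the sentiment: 'I love this product!' ->",
     "completion": " positive"},
    {"prompt": "Classify the sentiment: 'This is terrible.' ->",
     "completion": " negative"},
]

# Write one JSON object per line (JSONL), the format commonly used for
# fine-tuning training and validation files.
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for pair in training_pairs:
        f.write(json.dumps(pair) + "\n")
```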

How does the Azure OpenAI Service process data?

The diagram below illustrates how your data is processed. This diagram covers three different types of processing:

  1. How the Azure OpenAI Service creates a fine-tuned (custom) model with your training data;
  2. How the Azure OpenAI Service processes your text prompts to generate completions, embeddings, and search results; and
  3. How the Azure OpenAI Service and Microsoft personnel analyze prompts and completions for abuse, misuse, or harmful content generation.

Data Flow Diagram for the service

Training data for purposes of fine-tuning an OpenAI model

The training data (prompt-completion pairs) submitted to the Fine-tunes API through Azure OpenAI Studio is pre-processed using automated tools for quality checks, including data format checks. The training data is then imported to the model training component on the Azure OpenAI platform. During the training process, the training data is decomposed into batches and used to modify the weights of the OpenAI models.
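
As a hedged sketch, the following shows how training data might be uploaded through the Files API and then referenced when creating a fine-tune job via the REST API. The resource name, key, base model, and api-version below are placeholder assumptions; consult the current Azure OpenAI reference for the exact routes your service version exposes.

```python
import requests

# Placeholder assumptions: resource name, API key, and api-version.
endpoint = "https://YOUR-RESOURCE.openai.azure.com"  # hypothetical resource
headers = {"api-key": "YOUR-API-KEY"}                # hypothetical key
api_version = "2022-12-01"                           # assumed api-version

# Upload the prompt-completion pairs via the Files API.
with open("training_data.jsonl", "rb") as f:
    upload = requests.post(
        f"{endpoint}/openai/files?api-version={api_version}",
        headers=headers,
        files={"file": ("training_data.jsonl", f)},
        data={"purpose": "fine-tune"},
    )
file_id = upload.json()["id"]

# Start a fine-tune job against an assumed base model via the Fine-tunes API.
job = requests.post(
    f"{endpoint}/openai/fine-tunes?api-version={api_version}",
    headers=headers,
    json={"model": "curie", "training_file": file_id},
)
print(job.json()["id"], job.json()["status"])
```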

Training data provided by the customer is only used to fine-tune the customer’s model and is not used by Microsoft to train or improve any Microsoft models.

Text prompts to generate completions, embeddings and search results

Once a model is deployed, you can generate text with it using the Completions operation through the REST API, client libraries, or Azure OpenAI Studio.
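
For example, a completions request against a deployed model might look like the following sketch using the REST API directly. The resource name, deployment name, key, and api-version are placeholder assumptions; the official client libraries wrap this same operation.

```python
import requests

# Placeholder assumptions: resource, deployment, API key, and api-version.
endpoint = "https://YOUR-RESOURCE.openai.azure.com"  # hypothetical resource
deployment = "my-deployment"                         # hypothetical deployment
headers = {"api-key": "YOUR-API-KEY"}                # hypothetical key

# Call the Completions operation on the deployed model.
response = requests.post(
    f"{endpoint}/openai/deployments/{deployment}/completions"
    "?api-version=2022-12-01",
    headers=headers,
    json={"prompt": "Write a tagline for an ice cream shop.",
          "max_tokens": 32},
)
print(response.json()["choices"][0]["text"])
```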

Abuse and harmful content generation

The Azure OpenAI Service stores prompts and completions from the service to monitor for abusive use and to develop and improve the quality of Azure OpenAI's content management systems. Learn more about our content management and filtering. Authorized Microsoft employees can access your prompt and completion data that has triggered our automated systems for the purposes of investigating and verifying potential abuse; for customers who have deployed the Azure OpenAI Service in the European Union, the authorized Microsoft employees will be located in the European Union. This data may be used to improve our content management systems.

In the event of a confirmed policy violation, we may ask you to take immediate action to remediate the issue and to prevent further abuse. Failure to address the issue may result in suspension or termination of Azure OpenAI resource access.

How is data retained and what customer controls are available?

  • Training, validation, and training results data. The Files API allows customers to upload their training data for the purpose of fine-tuning a model. This data is stored in Azure Storage, encrypted at rest with Microsoft-managed keys, within the same region as the resource, and logically isolated by the customer's Azure subscription and API credentials. Uploaded files can be deleted by the user via the DELETE API operation (see the sketch after this list).

  • Fine-tuned OpenAI models. The Fine-tunes API allows customers to create their own fine-tuned versions of the OpenAI models based on the training data that they have uploaded to the service via the Files API. The trained fine-tuned models are stored in Azure Storage in the same region, encrypted at rest, and logically isolated by the customer's Azure subscription and API credentials. Fine-tuned models can be deleted by the user by calling the DELETE API operation.

  • Text prompts, queries, and responses. Request and response data may be temporarily stored by the Azure OpenAI Service for up to 30 days. This data is encrypted and is only accessible to authorized engineers for (1) debugging purposes in the event of a failure, (2) investigating patterns of abuse and misuse, or (3) improving the content filtering system through use of the prompts and completions flagged for abuse or misuse.
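
As a hedged sketch of the customer-controlled DELETE operations referenced in the list above, the following removes an uploaded file and a fine-tuned model. The routes, IDs, and api-version are assumptions based on the legacy fine-tunes REST surface; verify them against the current API reference.

```python
import requests

# Placeholder assumptions: resource name, API key, and api-version.
endpoint = "https://YOUR-RESOURCE.openai.azure.com"  # hypothetical resource
headers = {"api-key": "YOUR-API-KEY"}                # hypothetical key
api_version = "2022-12-01"                           # assumed api-version

# Delete an uploaded training/validation file by its file ID (assumed route).
requests.delete(
    f"{endpoint}/openai/files/FILE-ID?api-version={api_version}",
    headers=headers,
)

# Delete a fine-tuned model by its model ID (assumed route).
requests.delete(
    f"{endpoint}/openai/models/MODEL-ID?api-version={api_version}",
    headers=headers,
)
```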

To learn more about Microsoft's privacy and security commitments, visit the Microsoft Trust Center.

Frequently asked questions

Can a customer opt out of the logging and human review process?

Some customers in highly regulated industries have low-risk use cases and process sensitive data with less likelihood of misuse. Because of the nature of the data or the use case, these customers do not want, or do not have the right, to permit Microsoft to process such data for abuse detection due to their internal policies or applicable legal regulations.

To empower its enterprise customers and to strike a balance between regulatory and privacy needs and abuse prevention, the Azure OpenAI Service will include a set of Limited Access features that provide potential customers with the option to modify the following:

  1. abuse monitoring
  2. content filtering

These Limited Access features will enable potential customers to opt out of the human review and data logging processes, subject to eligibility criteria governed by Microsoft's Limited Access framework. Customers who meet Microsoft's Limited Access eligibility criteria and have a low-risk use case can apply for the ability to opt out of both the data logging and human review processes. This gives trusted customers with low-risk scenarios the data and privacy controls they require, while also allowing us to offer Azure OpenAI models to all other customers in a way that minimizes the risk of harm and abuse.

Diagram of the Azure OpenAI data review process.

If Microsoft approves a customer's request to access Limited Access features with the capability to (i) modify abuse monitoring and (ii) modify content filtering, then Microsoft will not store the associated request or response. Because no request or response data will be stored at rest in the Service Results Store in this case, the human review process is no longer feasible. Therefore, both customer-managed keys (CMK) and Customer Lockbox will be deemed out of scope for harm and abuse detection.

See also