Questions about OpenAI data privacy

JJP 0 Reputation points
2023-04-13T02:22:23.8633333+00:00

I read https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy and have a few questions that the page doesn't answer.

  1. The additional OpenAI limited access application (Add additional use cases) states that data is used for the purpose of "(3) improving the content filtering system". Do you use customer prompts and completions to improve the content filtering system?
  2. The page explains that prompts and completions are "temporarily stored by the Azure OpenAI Service in the same region as the resource for up to 30 days." If the customer wishes, can this temporarily stored data be kept permanently, encrypted, in the customer's own storage instead of being deleted?
  3. Can the customer's prompts and completions data be used to train and improve the customer's models?
  4. Are the encryption keys for prompts and completions and for fine-tuned OpenAI models Microsoft-managed keys?
  5. Are training, validation, and training results data and fine-tuned OpenAI models not filtered by the content filtering system? Are they also not monitored for abuse/misuse by Microsoft employees?
  6. Training, validation, and training results data and fine-tuned OpenAI models are "stored in Azure Storage in the same region, encrypted at rest and logically isolated with their Azure subscription and API credentials." The page then says, "CMK encrypts all customer data stored at rest in the Azure OpenAI Service (such as data uploaded for fine-tuning) except for data logged for 30 days as described above." I understand this to mean that after 30 days the customer can choose to change the encryption method. Is my understanding correct? If the customer does not have a CMK, is the data encrypted with Microsoft-managed keys? Does the encryption key change?

Thanks.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. romungi-MSFT 47,426 Reputation points Microsoft Employee
    2023-04-13T12:21:29.74+00:00

    @JJP Some of these questions are answered in the FAQ document linked here. I will answer all of the questions in the order they were posted.

    1. Azure OpenAI doesn't use customer data to retrain models. A content filtering system is in place to prevent abuse of the service, and if content is flagged by the Azure OpenAI Service's content filters, it may be reviewed by an authorized Microsoft full-time employee for the purposes mentioned in the form. For more details about content filtering, please see this page.
    2. If you need to store the data, you would need to enable diagnostic logging on your resource and then send the data to any of the available destinations. This is covered in the monitoring section of the documentation.
    3. Customer data is not used by Microsoft to retrain the models. You will have to use fine-tuning to improve the performance of your deployments.
    4. Yes, the keys are Microsoft-managed unless you choose to use customer-managed keys (CMK).
    5. The content filtering system works by applying algorithmic detection to the prompts and completions at inference time to determine if content should be filtered. For the data that is used for training, fine-tuning, and validation, the filtering process is not enabled. You can also submit a request to modify the content filter and abuse monitoring policy for your subscription, if applicable, based on the data submitted in this form.
    6. If CMK is not used, all encrypted data is encrypted with Microsoft-managed keys. The policy governing Microsoft-managed keys is the same as for Cognitive Services security/encryption management: the keys are periodically rotated and managed by Microsoft.
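
For question 2, a different route from the diagnostic logging described above is to keep your own copy of each prompt and completion on the application side, so retention does not depend on the service's temporary storage of up to 30 days. The following is a minimal sketch of that idea, not something taken from the Azure documentation. The names `log_exchange`, `complete_and_log`, and `read_log` are hypothetical, and `complete` stands in for whatever function your application uses to call its deployment. Encrypting the resulting file is left to the storage you write it to.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Dict, List


def log_exchange(path: str, prompt: str, completion: str) -> Dict[str, str]:
    """Append one prompt/completion pair to a JSON Lines file (hypothetical helper)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "completion": completion,
    }
    # One JSON object per line keeps the log append-only and easy to stream.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def complete_and_log(path: str, prompt: str, complete: Callable[[str], str]) -> str:
    """Call `complete` (an assumed stand-in for your model call) and log the exchange."""
    completion = complete(prompt)
    log_exchange(path, prompt, completion)
    return completion


def read_log(path: str) -> List[Dict[str, str]]:
    """Read back every logged exchange, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

In use, you would pass your own client call as `complete`, for example `complete_and_log("exchanges.jsonl", prompt, my_client_call)`, where `my_client_call` is a placeholder for your code. Because the file lives in storage you control, its retention period and encryption keys are your choice.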

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful?". If you have any further questions, do let us know.
