Developing Responsible Generative AI Applications and Features on Windows

Cikk
05/21/2024

This document provides an overview of recommended responsible development practices to use as you create applications and features on Windows with generative artificial intelligence.

Guidelines for responsible development of generative AI apps and features on Windows

Every team at Microsoft follows core principles and practices to responsibly build and ship AI, including Windows. You can read more about Microsoft’s approach to responsible development in the first annual Responsible AI Transparency Report. Windows follows foundational pillars of RAI development — govern, map, measure, and manage — that are aligned to the National Institute for Standards and Technology (NIST) AI Risk Management Framework.

Govern - Policies, practices, and processes

Standards are the foundation of governance and compliance processes. Microsoft has developed our own Responsible AI Standard, including six principles that you can use as a starting point to develop your guidelines for responsible AI. We recommend you build AI principles into your development lifecycle end to end, as well as into your processes and workflows for compliance with laws and regulations across privacy, security, and responsible AI. This spans from early assessment of each AI feature, using tools like the AI Fairness Checklist and Guidelines for Human-AI Interaction - Microsoft Research, to monitoring and review of AI benchmarks, testing and processes using tools like a Responsible AI scorecard, to public documentation into your AI features’ capabilities and limitations and user disclosure and controls -- notice, consent, data collection and processing information, etc. -- in keeping with applicable privacy laws, regulatory requirements, and policies.

Map - Identify risk

Recommended practices for identifying risks include:

End-to-end testing

Red-teaming: The term red teaming has historically described systematic adversarial attacks for testing security vulnerabilities. With the rise of large language models (LLMs), the term has extended beyond traditional cybersecurity and evolved in common usage to describe many kinds of probing, testing, and attacking of AI systems. With LLMs, both benign and adversarial usage can produce potentially harmful outputs, which can take many forms, including harmful content such as hate speech, incitement or glorification of violence, or sexual content.
Model evaluation: In addition to testing end-to-end, it is also important to evaluate the model itself.
- Model Card: For publicly available models, such as those on HuggingFace, you can check each model’s Model Card as a handy reference to understand if a model is the right one for your use case. Read more about Model Cards.
- Manual testing: Humans performing step-by-step tests without scripts is an important component of model evaluation that supports...
  - Measuring progress on a small set of priority issues. When mitigating specific harms, it's often most productive to keep manually checking progress against a small dataset until the harm is no longer observed before moving to automated measurement.
  - Defining and reporting metrics until automated measurement is reliable enough to use alone.
  - Spot-checking periodically to measure the quality of automatic measurement.
- Automated testing: Automatically executed testing is also an important component of model evaluation that supports...
  - Measuring at a large scale with increased coverage to provide more comprehensive results.
  - Ongoing measurement to monitor for any regression as the system, usage, and mitigations evolve.
- Model selection: Select a model that is suited for your purpose and educate yourself to understand its capabilities, limitations, and potential safety challenges. When testing your model, make sure that it produces results appropriate for your use. To get you started, destinations for Microsoft (and non-Microsoft/open source) model sources include:

Measure - Assess risks and mitigation

Recommended practices include:

Assign a Content Moderator: Content Moderator checks text, image, and video content for material that is potentially offensive, risky, or otherwise undesirable in content. Learn more: Introduction to Content Moderator (Microsoft Learn Training).
- Use content safety filters: This ensemble of multi-class classification models detects four categories of harmful content (violence, hate, sexual, and self-harm) at four severity levels respectively (safe, low, medium, and high). Learn more: How to configure content filters with Azure OpenAI Service.
- Apply a meta-prompt: A meta-prompt is a system message included at the beginning of the prompt and is used to prime the model with context, instructions, or other information relevant to your use case. These instructions are used to guide the model’s behavior. Learn more: Creating effective security guardrails with metaprompt / system message engineering.
- Utilize blocklists: This blocks the use of certain terms or patterns in a prompt. Learn more: Use a blocklist in Azure OpenAI.
- Get familiar with the provenance of the model: Provenance is the history of ownership of a model, or the who-what-where-when, and is very important to understand. Who collected the data in a model? Who does the data pertain to? What kind of data is used? Where was the data collected? When was the data collected? Knowing where model data came from can help you assess its quality, reliability, and avoid any unethical, unfair, biased, or inaccurate data use.
- Use a standard pipeline: Use one content moderation pipeline rather than pulling together parts piecemeal. Learn more: Understanding machine learning pipelines.
Apply UI mitigations: These provide important clarity to your user about capabilities and limitations of an AI-based feature. To help users and provide transparency about your feature, you can:
- Encourage users to edit outputs before accepting them
- Highlight potential inaccuracies in AI outputs
- Disclose AI’s role in the interaction
- Cite references and sources
- Limit length of input and output where appropriate
- Provide structure out input or output – prompts must follow a standard format
- Prepare pre-determined responses for controversial prompts.

Manage - Mitigate AI risks

Recommendations for mitigating AI risks include:

Abuse monitoring: This methodology detects and mitigates instances of recurring content and/or behaviors that suggest a service has been used in a manner that may violate the Code of Conduct or other applicable product terms. Learn more: Abuse Monitoring.
Phased delivery: Roll out your AI solution slowly to handle incoming reports and concerns.
Incident response plan: For every high-priority risk, evaluate what will happen and how long it will take to respond to an incident, and what the response process will look like.
Ability to turn feature or system off: Provide functionality to turn the feature off if an incident is about to or has occurred that requires pausing the functionality to avoid further harm.
User access controls/blocking: Develop a way to block users who are misusing a system.
User feedback mechanism: Streams to detect issues from the user’s side.
Responsible deployment of telemetry data: Identify, collect, and monitor signals that indicate user satisfaction or their ability to use the system as intended, ensuring you follow applicable privacy laws, policies, and commitments. Use telemetry data to identify gaps and improve the system.

Tools and resources

Responsible AI Toolbox: Responsible AI is an approach to assessing, developing, and deploying AI systems in a safe, trustworthy and ethical manner. The Responsible AI toolbox is a suite of tools providing a collection of model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly and take better data-driven actions.
Responsible AI Dashboard Model Debugging: This dashboard can help you to Identify, Diagnose, and Mitigate issues, using data to take informed actions. This customizable experience can be taken in a multitude of directions, from analyzing the model or data holistically, to conducting a deep dive or comparison on cohorts of interest, to explaining and perturbing model predictions for individual instances, and to informing users on business decisions and actions. Take the Responsible AI Decision Making Quiz.
Review the Azure Machine Learning summary of What is Responsible AI?
Read the Approach to Responsible AI for Copilot in Bing.
Read Brad Smith's article on Combating abusive AI-generated content: a comprehensive approach from Feb 13, 2024.
Read the Microsoft Security Blog.
Overview of Responsible AI practices for Azure OpenAI models - Azure AI services
How to use content filters (preview) with Azure OpenAI Service
How to use blocklists with Azure OpenAI Service
Planning red teaming for large language models (LLMs) and their applications
Azure OpenAI Service abuse monitoring
Threat modeling AI/ML systems and dependencies
AI/ML pivots to the security. A development lifecycle bug bar
Failure modes in machine learning
Tools for Managing and Ideating Responsible AI Mitigations - Microsoft Research
Planning for natural language failures with the AI Playbook
Software engineering for ML: A case study
Security and machine learning in the real world
Overreliance on AI: Literature review
Error Analysis and Build Responsible AI using Error Analysis toolkit (youtube.com)
InterpretML and How to Explain Models with IntepretML Deep Dive (youtube.com)
Black-Box and Glass-Box Explanation in Machine Learning (youtube.com)

Megosztás a következőn keresztül: