Harm categories in Azure AI Content Safety

This guide describes all of the harm categories and ratings that Azure AI Content Safety uses to flag content. Both text and image content use the same set of flags.

Harm categories

Content Safety recognizes four distinct categories of objectionable content.

Hate and Fairness (API term: Hate)

Hate and fairness harms refer to any content that attacks or uses discriminatory language with reference to a person or identity group, based on certain differentiating attributes of these groups.

This includes, but is not limited to:
  • Race, ethnicity, nationality
  • Gender identity groups and expression
  • Sexual orientation
  • Religion
  • Personal appearance and body size
  • Disability status
  • Harassment and bullying

Sexual (API term: Sexual)

Sexual describes language related to anatomical organs and genitals, romantic relationships, and sexual acts, including acts portrayed in erotic or affectionate terms and those portrayed as an assault or a forced sexual violent act against one's will.

This includes, but is not limited to:
  • Vulgar content
  • Prostitution
  • Nudity and pornography
  • Abuse
  • Child exploitation, child abuse, child grooming

Violence (API term: Violence)

Violence describes language related to physical actions intended to hurt, injure, damage, or kill someone or something, as well as weapons, guns, and related entities.

This includes, but is not limited to:
  • Weapons
  • Bullying and intimidation
  • Terrorist and violent extremism
  • Stalking

Self-Harm (API term: SelfHarm)

Self-harm describes language related to physical actions intended to purposely hurt, injure, or damage one's body, or to kill oneself.

This includes, but is not limited to:
  • Eating disorders
  • Bullying and intimidation

Classification can be multi-labeled. For example, when a text sample goes through the text moderation model, it could be classified as both Sexual content and Violence.
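
As a rough illustration, the following sketch analyzes a single text sample and prints every category it was flagged under. It assumes the Python SDK (azure-ai-contentsafety) and uses placeholder endpoint and key values; exact field names can vary between SDK versions.

  # Minimal sketch: analyze one text sample and print every flagged category.
  # Assumptions: azure-ai-contentsafety Python SDK; placeholder endpoint and key.
  from azure.ai.contentsafety import ContentSafetyClient
  from azure.ai.contentsafety.models import AnalyzeTextOptions
  from azure.core.credentials import AzureKeyCredential

  client = ContentSafetyClient(
      endpoint="https://<your-resource>.cognitiveservices.azure.com/",  # placeholder
      credential=AzureKeyCredential("<your-key>"),                      # placeholder
  )

  response = client.analyze_text(AnalyzeTextOptions(text="<text to moderate>"))

  # A single sample can be flagged under several categories at once,
  # each with its own severity rating.
  for result in response.categories_analysis:
      print(f"{result.category}: severity {result.severity}")

Each entry pairs one of the API terms listed above (Hate, Sexual, Violence, SelfHarm) with a severity rating, described in the next section.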

Severity levels

Each harm category the service flags also comes with a severity level rating, which indicates how severe the consequences of showing the flagged content are likely to be.

Text: The current version of the text model supports the full 0-7 severity scale, and the classifier can return any severity along this scale. If requested, it can instead return severities on the trimmed scale of 0, 2, 4, and 6, where each pair of adjacent levels maps to a single level (a small sketch after the mapping below illustrates both scales):

  • [0,1] -> 0
  • [2,3] -> 2
  • [4,5] -> 4
  • [6,7] -> 6
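
As a sketch of the two scales, the snippet below assumes the SDK's analyze-text options accept an output_type of "FourSeverityLevels" (trimmed, the default) or "EightSeverityLevels" (full 0-7 scale); the helper simply restates the mapping above.

  # Assumption: AnalyzeTextOptions accepts output_type="EightSeverityLevels"
  # to request the full 0-7 scale instead of the trimmed 0/2/4/6 default.
  from azure.ai.contentsafety.models import AnalyzeTextOptions

  full_scale_request = AnalyzeTextOptions(
      text="<text to moderate>",
      output_type="EightSeverityLevels",
  )

  def trim_severity(severity: int) -> int:
      # Restates the mapping above: [0,1]->0, [2,3]->2, [4,5]->4, [6,7]->6.
      return (severity // 2) * 2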

Image: The current version of the image model supports a trimmed version of the 0-7 severity scale; the classifier returns only severities 0, 2, 4, and 6 (see the sketch after this list):

  • 0
  • 2
  • 4
  • 6
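
A similar hedged sketch for an image, assuming the same SDK's AnalyzeImageOptions and ImageData types and reusing the client from the text example; the returned severities are expected to be 0, 2, 4, or 6 only.

  # Minimal sketch: analyze a local image file; severities come back on the trimmed scale.
  # Assumptions: azure-ai-contentsafety Python SDK; "sample.jpg" is a hypothetical file.
  from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData

  with open("sample.jpg", "rb") as f:
      request = AnalyzeImageOptions(image=ImageData(content=f.read()))

  image_response = client.analyze_image(request)
  for result in image_response.categories_analysis:
      print(f"{result.category}: severity {result.severity}")  # 0, 2, 4, or 6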

Image with text: The current version of the multimodal model supports the full 0-7 severity scale, and the classifier can return any severity along this scale. If requested, it can instead return severities on the trimmed scale of 0, 2, 4, and 6, where each pair of adjacent levels maps to a single level:

  • [0,1] -> 0
  • [2,3] -> 2
  • [4,5] -> 4
  • [6,7] -> 6

Text content

Warning

The Severity definitions tab in this document contains examples of harmful content that may be disturbing to some readers.

Image content

Warning

The Severity definitions tab in this document contains examples of harmful content that may be disturbing to some readers.

Multimodal (image with text) content

Warning

The Severity definitions tab in this document contains examples of harmful content that may be disturbing to some readers.

Next steps

Follow a quickstart to get started using Azure AI Content Safety in your application.