Harm categories in Azure AI Content Safety
This guide describes all of the harm categories and ratings that Azure AI Content Safety uses to flag content. Both text and image content use the same set of flags.
Harm categories
Content Safety recognizes four distinct categories of objectionable content.
Category | Description | API term |
---|---|---|
Hate and Fairness | Hate and fairness harms refer to any content that attacks or uses discriminatory language with reference to a person or identity group based on certain differentiating attributes of these groups. | Hate |
Sexual | Sexual describes language related to anatomical organs and genitals, romantic relationships and sexual acts, and acts portrayed in erotic or affectionate terms, including those portrayed as an assault or a forced sexual violent act against one’s will. | Sexual |
Violence | Violence describes language related to physical actions intended to hurt, injure, damage, or kill someone or something, as well as language that describes weapons, guns, and related entities. | Violence |
Self-Harm | Self-harm describes language related to physical actions intended to purposely hurt, injure, or damage one’s body, or to kill oneself. | SelfHarm |
Classification can be multi-labeled. For example, when a text sample goes through the text moderation model, it could be classified as both Sexual content and Violence.
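As a minimal sketch of what multi-label classification means in practice, the snippet below collects every category that was flagged for one sample. The four API terms come from the table above; the list-of-dicts result shape, the `flagged_categories` helper, and the `min_severity` parameter are assumptions made for illustration, not the service's documented response format.

```python
# API terms taken from the harm categories table.
API_TERMS = {"Hate", "Sexual", "Violence", "SelfHarm"}


def flagged_categories(categories_analysis, min_severity=1):
    """Return the API terms whose severity is at least min_severity."""
    flagged = []
    for item in categories_analysis:
        category = item["category"]
        if category not in API_TERMS:
            raise ValueError(f"unknown category: {category}")
        if item["severity"] >= min_severity:
            flagged.append(category)
    return flagged


# Hypothetical result for one text sample (the shape is an assumption).
sample = [
    {"category": "Hate", "severity": 0},
    {"category": "Sexual", "severity": 4},
    {"category": "Violence", "severity": 2},
    {"category": "SelfHarm", "severity": 0},
]

print(flagged_categories(sample))  # ['Sexual', 'Violence']
```

A single sample can therefore carry more than one label, each with its own severity.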
Severity levels
Every harm category the service applies also comes with a severity level rating. The severity level is meant to indicate the severity of the consequences of showing the flagged content.
Text: The current version of the text model supports the full 0-7 severity scale, and the classifier can detect any severity along this scale. If the user specifies, it can instead return severities in the trimmed scale of 0, 2, 4, and 6, in which each pair of adjacent levels is mapped to a single level:
- [0,1] -> 0
- [2,3] -> 2
- [4,5] -> 4
- [6,7] -> 6
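The mapping from the full scale to the trimmed scale can be sketched in a few lines. The `trim_severity` helper is a hypothetical name used here for illustration; the service performs this mapping itself when the trimmed scale is requested.

```python
def trim_severity(severity: int) -> int:
    """Map a full-scale severity (0-7) to the trimmed scale (0, 2, 4, 6)."""
    if not 0 <= severity <= 7:
        raise ValueError(f"severity must be between 0 and 7, got {severity}")
    # Each odd level collapses into the even level just below it.
    return severity - (severity % 2)


print([trim_severity(s) for s in range(8)])  # [0, 0, 2, 2, 4, 4, 6, 6]
```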
Image: The current version of the image model supports the trimmed version of the full 0-7 severity scale. The classifier only returns severities 0, 2, 4, and 6.
- 0
- 2
- 4
- 6
Image with text: The current version of the multimodal model supports the full 0-7 severity scale, and the classifier can detect any severity along this scale. If the user specifies, it can instead return severities in the trimmed scale of 0, 2, 4, and 6, in which each pair of adjacent levels is mapped to a single level:
- [0,1] -> 0
- [2,3] -> 2
- [4,5] -> 4
- [6,7] -> 6
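The scale support described for the three content types can be summarized in a small lookup. This is only a sketch of the rules stated above: the modality keys (`"text"`, `"image"`, `"image_with_text"`) and the `possible_severities` helper are hypothetical names, not identifiers from the service.

```python
FULL_SCALE = tuple(range(8))   # 0 through 7
TRIMMED_SCALE = (0, 2, 4, 6)

# Whether each content type supports the full scale, per the descriptions above.
# The keys are illustrative names, not API identifiers.
SUPPORTS_FULL_SCALE = {
    "text": True,
    "image": False,
    "image_with_text": True,
}


def possible_severities(modality: str, trimmed: bool = False) -> tuple:
    """Return the severity values that can be returned for a content type."""
    if modality not in SUPPORTS_FULL_SCALE:
        raise ValueError(f"unknown modality: {modality}")
    if trimmed or not SUPPORTS_FULL_SCALE[modality]:
        return TRIMMED_SCALE
    return FULL_SCALE


print(possible_severities("text"))                # (0, 1, 2, 3, 4, 5, 6, 7)
print(possible_severities("text", trimmed=True))  # (0, 2, 4, 6)
print(possible_severities("image"))               # (0, 2, 4, 6)
```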
Text content
Warning
The severity definitions in this section contain examples of harmful content that may be disturbing to some readers.
Hate and fairness severity levels
Severity Level | Description | Example Text |
---|---|---|
Level 0 | Content that contains hate and fairness related terms used in the general sense and is appropriate for all target audiences, including children. This includes:
|
I hate cats. |
Level 1 | Content that contains hate and fairness related terms used in the general sense but may acquire a different meaning based on context and does not differentiate between identity groups and others in a biased or negative way. This includes:
|
Black people should not be killed by police or mistreated just because of their skin color. |
Level 2 | Content that contains positive characterization or protection of the identity groups that trigger a favorable emotional response. This includes:
|
I hate you. |
Level 3 | Content that contains prejudiced, judgmental, stereotypical characterization or negative sentiment. This includes:
|
I don't like [identity group] because they are all so aggressive. They attacked us over and over again for no reason at all. |
Level 4 | Content that contains direct verbal sentiment-motivated insults, negative comments, and racial slurs used as descriptions targeting identity groups with the intention to hurt individuals based on their attributes. This includes:
|
All [identity group] are ugly. |
Level 5 | Content that contains dehumanization or derogation of individuals or groups that furthers negative sentiment through dehumanization, violence, denial of events, bullying, or intimidation behaviors. This includes:
|
The [real world event] is a fraud. What we've discovered is that sources have been lying about the number of victims. Here is proof. |
Level 6 | Content based on negative sentiment and hate speech towards identity groups that refers to group characteristics differentiating the identity group from others and appeals for action against identity groups based on their attributes. This can include:
|
Send [identity group] back to the gas chambers. |
Level 7 | Content based on negative sentiment, displaying intent of hate crime and stalking, triggering a hostile emotional response and differentiating identity groups and others leading to crime. This includes:
|
Millions of [identity group] were wiped out because they were the inferior race. Murdering them continues to be right and honorable because… |
Sexual severity levels
Severity Level | Description | Example Text |
---|---|---|
Level 0 | Content that contains sexually related terms used in general sense that is harmless for all target audiences, including children.
|
The sports bra and yoga pants were too tight for comfort. |
Level 1 | Content that contains sexually related terms used in general sense but may acquire a different meaning based on context. This includes:
|
Fewer young adults are having sex than in previous generations. |
Level 2 | Content that contains sexually related terms in fictional or real world experiences that are not extremely graphic. This includes:
|
My husband and I introduced a vibrator to our sex life and it has really made a difference. |
Level 3 | Content with sexually related terms in cases that contain some graphic content, opinions expressing biased, prejudiced, judgmental or opinionated views on sex related issues. This includes:
|
We made out for a while, and he started to feel me up. I felt a bulge in his pants. |
Level 4 | Content describing sexual fantasies, professional sex industry and advertisements, and sexualized animal reproduction. This includes:
|
I got into BDSM because I had some friends into the lifestyle and it piqued my interest. I like using silk, scarves, collars, and bondage… |
Level 5 | Content describing erotic shows or experiences as personal experience, BDSM with clear consent, and erotic stories and websites. This includes:
|
We both climaxed at the same time. |
Level 6 | Content describing solicitation of sexual commercial services, containing detailed descriptions of sexual activity, violent sexual abuse and rape, and non-consensual power exchanges, abuse, and voyeurism. This includes:
|
I love forced sex, it's so much hotter. |
Level 7 | Content used to promote, endorse, glorify extreme forms of sexual enslavement. Content that is extreme or illegal. This includes:
|
I was abused as a child by my uncle and I liked it. Nowadays, I make money by producing child pornography with my neighbor’s 6 year old son. |
Violence severity levels
Severity Level | Description | Example Text |
---|---|---|
Level 0 | Content that contains non-violent actions or violence-related terms used in a general sense that is harmless for all target audiences, including children. This includes:
|
According to our child and parent testers, here are the best long-range soakers for soft blasters for toddlers suitable for epic water battles. |
Level 1 | Content that contains violence-related terms or acts used in a general sense but may acquire a different meaning based on context. This includes:
|
Carnivores such as lions and tigers are prone to killing one another. Death occurs at the paws, claws, jaws, or tusks that might fatally gore one another during fights. |
Level 2 | Content that contains violence-related terms or acts in fictional or real world experiences that are not extremely graphic and without glorification or endorsement of malintent. This includes:
|
Physical punishment is a way to correct bad behavior and doesn’t cause harm to children. |
Level 3 | Content with violence-related terms or acts in cases that contain some graphic content, opinions expressing biased, prejudiced, judgmental or opinionated views on violence. This includes:
|
Some people just need to be slapped. |
Level 4 | Content describing ungrounded rationale for why victims are being physically abused, unfounded attribution of positive qualities of abusers, threats of violent acts and fantasies relating to how to harm another person, and medium gore. This includes:
|
Victims of abuse deserve to be hit. They push and push until they get what they deserve. |
Level 5 | Content aiming to further violence, providing direct instructions on how to harm specific individuals or groups. This includes:
|
Go to another state or another city and kill any stranger; for example, shoot them with a silencer gun. Since you have not any connection with him, you will not get caught. |
Level 6 | Content describing extremist groups and terroristic organizations, mass killings and explicit physical damage in high gore, and promotion of terrorism and its propaganda. This includes:
|
We must kill them all. God wills it, and society will be better for it. |
Level 7 | Content used to describe, praise, support, promote, endorse, glorify extreme forms of violence. Content that supports terroristic organizations and radicalizes, instructs, or trains. This includes:
|
Let's call and threaten to explode the hospital during the weekend. I've got the explosives ready, and will have them installed tonight. |
Self-harm severity levels
Severity Level | Description | Example Text |
---|---|---|
Level 0 – Safe | Content that contains self-harm related terms used in a general sense that is harmless for all target audiences, including children. This includes:
|
I appreciate your hard work, but you don't need to keep killing yourself over this project. |
Level 1 | Content that contains self-harm related terms used in a general sense but may acquire a different meaning based on context. This includes:
|
Binging and purging may be a warning sign for people with a history of extreme weight loss. |
Level 2 | Content that contains self-harm related terms or acts in fictional or real world experiences that are not extremely graphic. This includes:
|
Oedipus stabbed his eyes in despair. |
Level 3 | Content with self-harm related terms or acts in cases that contain some graphic content, opinions expressing biased, prejudiced, judgmental or opinionated views on self-harm. This includes:
|
I'm really struggling with self-loathing and thoughts of self-harm to relieve my distress after my friend took his own life last month. Please can someone give me any useful advice? |
Level 4 | Content that mocks victims or survivors of suicide or self-harm, unfounded attribution of positive qualities to self-harm, fantasies or wishes of self-harm, and contains explicitly graphic content involving death, autopsy, crime, or death scene. This includes:
|
It's hilarious when [identity group] kill themselves. |
Level 5 | Content aiming to further promotion of self-harm, providing direct instructions on how to perform self-harm or bully a victim. This includes:
|
Go kill yourself, you are worthless. |
Level 6 | Content describing detailed and explicit instructions on how to commit suicide. This includes:
|
Here is a quick guide to do-it-yourself suicide. One method is asphyxia suicide by inhaling helium inside a plastic bag. |
Level 7 | Content used to endorse and glorify suicide or other severe acts of self-harm. This includes:
|
Get a shotgun and blow your brains out. It won't even hurt. And you'll be remembered for ages, like Kurt Cobain or Ernest Hemingway. There's nothing bad about suicide. Actually it's noble. |
Image content
Warning
The severity definitions in this section contain examples of harmful content that may be disturbing to some readers.
Hate and fairness severity levels
Severity Level | Description | Image example |
---|---|---|
Safe | Content is safe but may contain hate and fairness related imagery with no sentiment or neutral sentiments in non-discriminatory non-offensive ways in the following contexts or those similar to:
|
Non-Confederate flags, e.g. flags of countries or states that feature crosses |
Low | Content that contains either positive sentiments, false positive sentiments, or negative, biased sentiments differentiating identity groups from others. This content is intended to trigger an emotional response. This includes:
|
Representations of the 'No Hate Speech' movement; representations of deprioritized job candidates such as candidates with disabilities in interview settings |
Medium | Content that contains negative sentiment or action towards identity groups and others that leads to the promotion of action against identity groups and sentiments of superiority. This includes:
|
Imagery that promotes social avoidance of [identity groups]; imagery that depicts the racial improvement theory and eugenics |
High | Content that contains negative sentiment and hate speech towards identity groups leading to hate speech generation or hate crimes. This includes:
|
Antisemitic imagery; imagery depicting the "Southern Cross" that features a blue saltire (diagonal cross) with affiliation of obvious violence (e.g. guns in display/use) |
Sexual severity levels
Severity Level | Description | Image example |
---|---|---|
Safe | Content is safe but contains sexually related imagery used in a general sense. This includes:
|
Representations of hugging or making non-sexual physical contact; representations depicting physical display of affection such as kissing without tongue and without nudity |
Low | Content that contains sexually suggestive behaviors or acts. This includes:
|
Depictions of people |
Medium | Content that contains commercial and amateur erotic and/or pornographic content without visual photographic penetration and nude or semi-nude participants involved in sexual scenes meant to be sexually arousing. This includes:
|
Depictions of intercourse that may be partially occluded but still clearly present in the image |
High | Content that contains explicit detailed depictions of sexual organs and sexual acts, including illegal acts. This includes:
|
Depictions of engagements in sexual activities with animals or corpses |
Violence severity levels
Severity Level | Description | Image example |
---|---|---|
Safe | Content is safe but contains violence related imagery used in a general sense. This includes:
|
Depictions of non-realistic weapons and bladed tools |
Low | Content that contains violence related imagery and acts in real world or fictional contexts with low gore. This includes:
|
Depictions of ISIS/al-Qaeda flags |
Medium | Content that contains moderate graphic violence and violent acts aimed at an identified target. This includes:
|
Depictions of the use of force and coercion to aggressively dominate or intimidate an individual |
High | Content that contains explicit detailed depictions of violence and violent acts in high gore. This includes:
|
Depictions of death when performed by terrorist groups identifiable by ISIS flags and symbols |
Self-harm severity levels
Severity Level | Description | Image example |
---|---|---|
Safe | Content is safe but contains self-harm related imagery used in a general sense. This includes:
|
Depictions of jumping with a parachute out of a plane |
Low | Content that contains self-harm related imagery and acts in real world or fictional contexts with low gore. This includes:
|
Representations of self-harm related acts, tools, injury and emotional state |
Medium | Content that contains moderate graphic imagery related to self-harm and suicide. This includes:
|
Representations of the use of force and coercion to aggressively dominate or intimidate an individual into self-harm |
High | Content that contains explicit detailed depictions of self-harm and suicide in high gore. This includes:
|
Depictions of intentional suicide, where a person has committed suicide by jumping off a tall building |
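The image tables above use the labels Safe, Low, Medium, and High, while the image model returns severities 0, 2, 4, and 6. Assuming the four labels correspond to those four values in ascending order (this page does not state the correspondence explicitly), a lookup could be sketched as follows. `SEVERITY_LABELS` and `severity_label` are hypothetical names.

```python
# Assumed correspondence between trimmed severities and the table labels.
SEVERITY_LABELS = {0: "Safe", 2: "Low", 4: "Medium", 6: "High"}


def severity_label(severity: int) -> str:
    """Return the table label for a trimmed-scale severity value."""
    if severity not in SEVERITY_LABELS:
        raise ValueError(f"expected one of 0, 2, 4, 6, got {severity}")
    return SEVERITY_LABELS[severity]


print(severity_label(4))  # Medium
```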
Multimodal (image with text) content
Warning
The severity definitions in this section contain examples of harmful content that may be disturbing to some readers.
Hate and fairness severity levels
Severity Level | Description | Example |
---|---|---|
Safe | Content is safe but may contain hate and fairness related text and visuals used in generic and safe contexts such as:
|
Scenario: Image with text overlay Image: Benign photo of a person Text: Attends rally wearing sneakers |
Low | Content that contains positive characterization or protection of the Identity groups, displays discrimination, stereotypes and prejudiced, judgmental, opinionated views or attitudes related to hate speech or targeting identity groups. This includes:
|
Scenario: Image with text overlay Image: Benign photo of a person Text: Everyone assumes that I got high grades in math because I’m Asian. I did. Rationale: Implies racial stereotype. |
Medium | Content that contains insults, bullying, intimidation, de-humanization or derogation of individuals or groups. This includes:
|
Scenario: Image with text overlay Image: Benign photo of a person with Parkinson's Text: This guy just Friended and Unfriended me 45 times in 15 seconds. Does anyone know him? Rationale: Implies mockery due to disability. |
High | Content that contains hate speech as propaganda threatening or calling for violence against Identity groups, radicalizes, recruits or incites people to join extremist efforts or spread hate speech. This includes:
|
Scenario: Image and text interleaved Never forget that everything [image of Adolf Hitler] did in Germany was legal. |
Sexual severity levels
Severity Level | Description | Example |
---|---|---|
Safe | Content is safe but may contain sexually related terms or visuals used in generic and safe contexts such as:
|
Scenario: Image with text overlay Image: A skunk Text: Looks aren’t everything, be careful who you marry, and check your mirrors too. |
Low | Content that expresses prejudiced, judgmental or opinionated views on sexually related topics or mentions sexually related terms in fictional or real-world experiences that are not extremely graphic. This includes:
|
Scenario: Image and text interleaved I hope they debate social issues I want [image of a person cupping their ear, representing "to hear"] about how a representative wants to regulate female sexual organs. |
Medium | Content that describes sexual arousal and activity in explicit and graphic terms or visuals, including fantasies about sex and sexual abuse. This includes:
|
Scenario: Image and text interleaved There is so much [image containing the phrase "Sex and Violence"] in that book it could use a warning label in the cover. |
High | Content describing solicitation of sexual commercial services, containing detailed descriptions of sexual activity, violent sexual abuse and rape, and non-consensual power exchanges, abuse, and voyeurism. This also includes content that is used to promote, endorse, or glorify extreme or illegal sexual acts. This includes:
|
Scenario: Image and text interleaved Can too much [image of graphic pornography] actually kill you? Evidently not, because you are still with us. |
Violence severity levels
Severity Level | Description | Example |
---|---|---|
Safe | Content is safe but may contain terms and visuals related to violence used in generic and safe contexts such as:
|
Scenario: Image and text interleaved [Image of lawnmower] can easily throw a [image of rock] to a distance well over 50 feet. |
Low | Content in fictional contexts (for example, gaming, literature, arts) that depict violent or graphic scenes without glorification, endorsement or malintent. This includes:
|
Scenario: Image with text overlay Image: Benign photo of a "nerdy" person Text: Runs suicides at basketball practice. Dies. Rationale: Implies exhaustion due to the lack of physical abilities of the person. |
Medium | Content that contains graphic depictions of violence and death at medium intensity as well as fantasies about violent acts or threats. This includes:
|
Scenario: Image and text interleaved She needs [image of a graphic that reads "to die"] already. |
High | Content that contains terrorist or violent extremist propaganda and violent ideology on the Internet that describes, praises, supports, endorses, or glorifies killings as part of terroristic activity, or offers aid to terrorist organizations or causes. This includes:
|
Scenario: Image and text interleaved I can’t wait for this [image of ISIS kafir beheading]. |
Self-harm severity levels
Severity Level | Description | Example |
---|---|---|
Safe | Content is safe but may contain terms or visuals related to self-harm used in generic and safe contexts such as:
|
Scenario: Image and text interleaved How do I tie a [image of a rope]? |
Low | Content that contains references to characters planning or committing suicide in a fictional context (for example, literature, movies) or contains depictions of self-harm acts or suicide at low intensity. This includes:
|
Scenario: Image with text overlay Image: Kermit the frog Text: Life ain’t no good for a frog. Even tried to Kermit suicide. |
Medium | Content that mocks victims or survivors of self-harm such as eating disorders or suicide, as well as bullying and intimidation that incites and promotes self-harm acts. This includes:
|
Scenario: Image and text interleaved I will [image of a knife] myself soon. But until then, how do I tame my pain? |
High | Content that includes research for advice or provides instructions and methods on how to engage in self-harm that leads to death or how to commit suicide as well as glorification and endorsement of suicide, or other severe types of self-harm. This includes:
|
Scenario: Image and text interleaved Black Friday [image of self-harm with text "My self-harm cuts have nearly faded completely so they won’t scar"] and [image of a kitchen mixer] deals. |
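One way an application might act on these ratings is to compare each category's severity against a per-category limit. The sketch below is an illustration only: the threshold values, the `should_block` helper, and the dictionary shape are assumptions, and suitable limits depend on the audience and use case.

```python
# Hypothetical per-category limits: content at or above the limit is blocked.
EXAMPLE_THRESHOLDS = {"Hate": 4, "Sexual": 4, "Violence": 4, "SelfHarm": 2}


def should_block(severities, thresholds=None):
    """Return True if any category's severity reaches its threshold.

    severities maps API terms (Hate, Sexual, Violence, SelfHarm) to
    severity values; missing categories are treated as severity 0.
    """
    if thresholds is None:
        thresholds = EXAMPLE_THRESHOLDS
    return any(
        severities.get(category, 0) >= limit
        for category, limit in thresholds.items()
    )


print(should_block({"Hate": 2, "Violence": 2}))  # False
print(should_block({"Hate": 0, "SelfHarm": 2}))  # True
```

Keeping the limits in one dictionary makes it straightforward to use stricter values for some categories than for others.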
Next steps
Follow a quickstart to get started using Azure AI Content Safety in your application.