
Azure Computer Vision and text moderation scoring criteria

2JK 241 Reputation points
2021-10-25T09:24:31.693+00:00

I have a couple of questions regarding 1) the scoring criteria for the computer vision analyze image API, particularly concerning the adult, racy and gory classification scores, and 2) the text moderation scoring.

1- I tested some samples where I used an image of a flag, and it gave a high racy score (0.8+) and a high adult score (0.7+), yet isAdultContent and isRacyContent are both false. Why is that? First, it shouldn't have scored a flag that high; second, are the scores not related to the Boolean values at all?
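For context, the `adult` node of the Analyze Image response has roughly this shape (field names from the v3.2 REST API; the values below are illustrative, not actual service output):

```python
import json

# Illustrative sample of the "adult" node returned by the Analyze
# Image API when called with visualFeatures=Adult. The scores are
# made up to mirror the behaviour described in the question.
sample = json.loads("""
{
  "adult": {
    "isAdultContent": false,
    "isRacyContent": false,
    "isGoryContent": false,
    "adultScore": 0.71,
    "racyScore": 0.83,
    "goreScore": 0.02
  }
}
""")

adult = sample["adult"]
# The Boolean flags and the raw scores are separate fields, so a
# high score does not by itself force the corresponding flag true.
print(adult["racyScore"], adult["isRacyContent"])  # 0.83 False
```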

2- For text moderation (Content Moderator's screen-text API), it gives a very high offensive-classification score (0.98+) as soon as it encounters a word that may be used in an offensive context, even when the sentence sentiment is positive. Can this be addressed, or is it just a limitation?

Community Center | Not monitored
Foundry Tools

Azure Vision in Foundry Tools

Answer accepted by question author

  YutongTie-9091 54,021 Reputation points Moderator
    2021-10-28T17:27:15.237+00:00

    @2JK

    Thanks for waiting. I have checked this situation with the PM of the Content Moderator team, but unfortunately this is a limitation of the product. Below is the response I received. I am sorry for this experience, but we are working on it.

    1. Adult/Racy image classifier (available in CM) – this classifier was developed by Bing, which is no longer actively supporting it, and we do not have the details on what triggers the Boolean “isImageAdultClassified” response in relation to the actual classification scores.

    a) The Gore score the customer mentions is actually a response from a Custom Vision classifier: the customer has created a “connector” from CM to Custom Vision. The image is sent to CM, and a custom workflow created by the customer sends that image to Custom Vision in addition to scanning it with the classifiers available in CM. The responses from all the classifiers are then combined in the response to the customer. Because of that, we also have no visibility into why the Joker image returns a high Gore score.

    Both values are provided so that the customer can decide what to trigger a “violation” off of. If they feel that the Boolean doesn’t make sense (because of the high classification score), they can choose to go with the classification score alone.
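The workaround described above (trigger off the raw scores instead of the Boolean flags) could be sketched like this; the 0.7 threshold and the function name are illustrative application choices, not anything documented by the service:

```python
def is_violation(adult_node: dict, threshold: float = 0.7) -> bool:
    """Flag an image based on the raw classifier scores rather than
    the service's Boolean fields. The threshold is an application
    choice, not a documented service default."""
    return (adult_node["adultScore"] >= threshold
            or adult_node["racyScore"] >= threshold)

# A high racy score still triggers even though isRacyContent=false.
node = {"adultScore": 0.71, "racyScore": 0.83,
        "isAdultContent": False, "isRacyContent": False}
print(is_violation(node))  # True
```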

    b) The questions about why the flag and the meme score high on Adult and Racy are also not something we can speak to, since we didn’t develop that classifier.

    2. Text scanning – It sounds like the issue is about text that is not offensive in certain contexts. If that is indeed the case, the “exact match” text scanning (the default for CM) only indicates the presence of terms that have been identified as offensive. There are two text scanners: the exact-match scanner mentioned above, and the text classifier (also developed by Bing), which returns the classification score. If a sentence receives a high classification score but is contextually not offensive, that sounds like a limitation of that classifier.
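To illustrate why exact-match scanning behaves this way, here is a toy sketch (not the actual Content Moderator implementation, and the term list is hypothetical): it reports only the presence of listed terms and has no notion of the sentence's sentiment.

```python
import re

# Hypothetical blocklist; the real service maintains its own term lists.
BLOCKLIST = {"damn"}

def screen_text(text: str) -> list[str]:
    """Return blocklisted terms found in the text, ignoring context.
    This mirrors exact-match screening: presence only, no sentiment."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t in BLOCKLIST]

# A clearly positive sentence still matches on term presence alone.
print(screen_text("That concert was damn good, I loved it!"))  # ['damn']
```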

    Regards,
    Yutong

    1 person found this answer helpful.

0 additional answers
