Quite disappointed with the text filtering capabilities of 'Content Moderator'

Sylvain Donnet 1 Reputation point
2021-02-24T10:20:45.867+00:00

Hi,

(Sorry for my English, I am French and not very fluent in English).

I am currently testing Content Moderator via the API on SMS texts. It is quite simple to implement and test.
I am using it in the "West Europe" region, with both English and French text.

I tested crude and sexual expressions in English and French. Here are my results:

  • in English: these expressions/words are only partially detected and isolated ("Terms"=...), but the text is well categorized,
  • in French: NO expression is detected/isolated, and there is no category (OK, the documentation says so, but without any "Terms" found, it is not very useful).

Does anybody have any idea how to improve the filtering of such texts?

BR

Sylvain


2 answers

  1. Ramr-msft 17,611 Reputation points
    2021-02-25T07:11:06.947+00:00

    @Sylvain Donnet Thanks for the question. Text moderation detects potential profanity in more than 100 languages, flags text that may be deemed inappropriate depending on context (in public preview), and matches text against your custom lists. Content Moderator also helps check for personally identifiable information (PII).

    You should be able to combine Content Moderator (CM) with your own custom-developed APIs for custom detection, since the CM workflow provides APIs.
    Content Moderator Review tool: https://contentmoderator.cognitive.microsoft.com/

    Only a limited number of connectors are available, so I would suggest accessing the other services programmatically via the SDK/API. That way you have access to all Cognitive Services and can extend their functionality.
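
As an illustration of the programmatic route, here is a minimal Python sketch that builds (but does not send) the same Text Screen request used in this thread, with an optional custom term list attached. It is only a sketch under assumptions: `build_screen_request` is a hypothetical helper name, the `listId` query parameter and the `Ocp-Apim-Subscription-Key` header are taken from my reading of the Text Screen API and should be checked against the current documentation, and the term list itself (for example, one holding French terms) is assumed to have been created beforehand.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

SCREEN_PATH = "/contentmoderator/moderate/v1.0/ProcessText/Screen"


def build_screen_request(endpoint: str, key: str, text: str,
                         language: str = "eng",
                         list_id: Optional[str] = None) -> Request:
    """Build a POST request for the Text Screen operation without sending it."""
    params = {
        "autocorrect": "true",
        "PII": "true",
        "classify": "true",
        "language": language,
    }
    # Assumption: "listId" selects a custom term list to match against.
    if list_id is not None:
        params["listId"] = list_id

    # rstrip avoids a double slash between the endpoint and the path.
    url = endpoint.rstrip("/") + SCREEN_PATH + "?" + urlencode(params)
    headers = {
        "Content-Type": "text/plain",
        "Ocp-Apim-Subscription-Key": key,
    }
    return Request(url, data=text.encode("utf-8"), headers=headers, method="POST")
```

The returned object can be passed to `urllib.request.urlopen` to actually perform the call; keeping the construction separate makes it easy to inspect the URL and headers first.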


  2. noc 1 Reputation point
    2021-02-25T08:38:50.31+00:00

    Hi,

    Thanks for your reply.

    I ran several curl requests against my endpoint:

    • in English :

    curl -X POST "https://XXXXXXXXXXX.cognitiveservices.azure.com/contentmoderator/moderate/v1.0/ProcessText/Screen?autocorrect=true&PII=True&classify=true&language=eng"

    with --data-ascii "I sc**** you and I f**** you. You are an idiot and an ass*****."

    The F-word and the ass* word were detected, and the Category3 score is about 98%.

    So: OK.

    • in French:
      curl -X POST "https://XXXXXXXXXXX.cognitiveservices.azure.com/contentmoderator/moderate/v1.0/ProcessText/Screen?autocorrect=true&PII=True&classify=true&language=fra"

    The same data, with French translations of the F-word, the ass* word, and so on:
    Of 4 crude words, none was detected (in the "Terms" results), and, because language=fra (French), no categorization is available.
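
To compare the English and French results side by side, a small helper like the one below could reduce each JSON response to the detected terms and the category scores. This is a rough sketch, not the service's SDK: `summarize_screen_result` and the `threshold` parameter are hypothetical, and the field names (`Terms`, `Term`, `Classification`, `CategoryN`, `Score`) are assumed from the response shape described above, with `Terms` possibly being null when nothing matches.

```python
from typing import Any, Dict


def summarize_screen_result(result: Dict[str, Any],
                            threshold: float = 0.5) -> Dict[str, Any]:
    """Reduce a Text Screen response to matched terms and category scores."""
    # "Terms" may be missing or null when no term matched.
    terms = [t.get("Term") for t in (result.get("Terms") or [])]

    # "Classification" may be absent, e.g. for languages without classification.
    classification = result.get("Classification") or {}
    scores = {
        name: value["Score"]
        for name, value in classification.items()
        if isinstance(value, dict) and "Score" in value
    }

    # Flag the text if any term matched or any score reaches the threshold.
    flagged = bool(terms) or any(score >= threshold for score in scores.values())
    return {"terms": terms, "scores": scores, "flagged": flagged}
```

With a response like the French one described here (no terms, no classification), the helper returns empty `terms` and `scores` and `flagged` set to false, which makes the gap with the English result explicit.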
