
We are experiencing two issues with Azure Cognitive Services – Text Analytics (TextAnalyticsClient):

Ketan Kumar Varude 20 Reputation points
2026-02-25T10:21:53.53+00:00

1. Language Detection Issue

Observation: Some words are not being correctly recognized in their source language. For example:

| Input Text | Expected Language | Detected Language | Issue |
| --- | --- | --- | --- |
| Toppen | sv (Swedish) | da (Danish) | Incorrect language detection |

The word “Toppen” should be detected as Swedish, but the API returns Danish, likely because it exists in both languages.

This misclassification affects downstream translation, sentiment analysis, and other language-specific processing.

Impact: Incorrect language detection can result in wrong translations and affect language-dependent analytics.


2. PII (Personally Identifiable Information) Entity Detection Issue

Observation: PII entities are either misclassified or not detected properly. For example:

| Input Text | Expected PII Type | Detected PII Type | Issue |
| --- | --- | --- | --- |
| My SSN is 123-45-6789 | US Social Security Number | Phone Number | Misclassification |
| John Doe lives in Stockholm | Person Name | Location | Incorrect category assignment |

The model is incorrectly assigning categories, which could result in compliance risks.

Impact: Misclassification or missed PII entities can cause data handling errors and compliance issues.


Azure OpenAI Service

An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.


Answer accepted by question author
  1. Q&A Assist
    2026-02-25T10:22:37.7333333+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    Language Detection Issue

    Language detection can misclassify words that exist in more than one language, such as "Toppen" being detected as Danish instead of Swedish. This is a known limitation: for mixed-language input the service returns the language with the largest representation in the text, and a short or ambiguous sample such as a single word gives it very little signal to work with. To improve accuracy, send longer inputs where possible, or pass the countryHint parameter (a two-letter ISO 3166-1 alpha-2 country/region code, for example "SE" for Sweden) when you know the likely region. The hint helps disambiguate; it does not guarantee a particular result.
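As a minimal sketch of the country-hint suggestion, the helper below passes a hint to `detect_language` and collects the primary language per input. It assumes `client` behaves like `azure.ai.textanalytics.TextAnalyticsClient` from the Python SDK; the function name `detect_with_country_hint`, the shape of the returned dictionaries, and the default hint `"SE"` are illustrative choices, not part of the service.

```python
from typing import Any, Dict, List


def detect_with_country_hint(
    client: Any, texts: List[str], country_hint: str = "SE"
) -> List[Dict[str, Any]]:
    """Detect the language of each text, passing a country hint to the service.

    `client` is assumed to behave like azure.ai.textanalytics.TextAnalyticsClient.
    `country_hint` is an ISO 3166-1 alpha-2 code; "SE" is an assumption for this example.
    """
    # The hint is applied to every document in the batch.
    results = client.detect_language(documents=texts, country_hint=country_hint)

    summary = []
    for text, result in zip(texts, results):
        if result.is_error:
            # Per-document failures come back as error objects, not exceptions.
            summary.append({
                "text": text, "language": None, "name": None,
                "confidence": None, "error": result.error.message,
            })
            continue
        lang = result.primary_language
        summary.append({
            "text": text,
            "language": lang.iso6391_name,
            "name": lang.name,
            "confidence": lang.confidence_score,
            "error": None,
        })
    return summary
```

With a real client, constructed for instance as `TextAnalyticsClient(endpoint, AzureKeyCredential(key))` with your own `endpoint` and `key`, this could be called as `detect_with_country_hint(client, ["Toppen"])`. Logging the confidence score alongside the language makes it easier to spot low-confidence detections on short inputs.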

    PII Entity Detection Issue

    PII detection can misclassify an entity when the surrounding context is thin or the value matches more than one pattern. A Social Security Number returned as a phone number, for example, means the model matched the digit pattern but assigned the wrong category. To mitigate this, keep enough surrounding context in the input for the entity type to be clear. You can also set the piiCategories parameter, which limits the results to the PII categories you list, so that only the types you expect are returned.
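The category filter mentioned above could be applied roughly as follows. This is a sketch that assumes `client` behaves like `azure.ai.textanalytics.TextAnalyticsClient`, where the REST `piiCategories` parameter is exposed as the `categories_filter` keyword. The helper name `recognize_expected_pii` and the default category list are hypothetical and chosen to match the two examples in the question.

```python
from typing import Any, Dict, List, Sequence


def recognize_expected_pii(
    client: Any,
    texts: List[str],
    language: str = "en",
    categories: Sequence[str] = ("USSocialSecurityNumber", "Person"),
) -> List[Dict[str, Any]]:
    """Run PII recognition restricted to the categories the caller expects.

    `client` is assumed to behave like azure.ai.textanalytics.TextAnalyticsClient.
    The default `categories` are an assumption for this example.
    """
    results = client.recognize_pii_entities(
        texts, language=language, categories_filter=list(categories)
    )

    found = []
    for text, result in zip(texts, results):
        if result.is_error:
            found.append({"text": text, "entities": [], "error": result.error.message})
            continue
        # Keep text, category, and confidence so misclassifications can be audited.
        entities = [(e.text, e.category, e.confidence_score) for e in result.entities]
        found.append({"text": text, "entities": entities, "error": None})
    return found
```

Restricting categories only narrows what is returned; it does not by itself guarantee that "123-45-6789" is recognized as a Social Security Number, so the returned confidence scores are still worth checking against a validation set.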

    Both issues can significantly impact downstream processes such as translation and compliance, so it is crucial to test and validate the outputs from the API thoroughly.



1 additional answer

  1. Ketan Kumar Varude 20 Reputation points
    2026-02-25T10:23:18.1733333+00:00

    Currently we are working on English and Swedish only.
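Since only English and Swedish are in scope, one possible application-side guard is to accept a detected language only when it is in the supported set and otherwise fall back to a default. The sketch below is pure Python and independent of the SDK; the name `resolve_language` and the choice of `"sv"` as the fallback are assumptions, and a fallback like this can hide genuinely out-of-scope input, so it may be worth logging every case where it triggers.

```python
from typing import Sequence

SUPPORTED_LANGUAGES = ("en", "sv")  # assumption: the two languages named in this thread


def resolve_language(
    detected: str,
    supported: Sequence[str] = SUPPORTED_LANGUAGES,
    fallback: str = "sv",
) -> str:
    """Return the detected ISO 639-1 code if supported, else the fallback.

    The fallback policy is an illustrative assumption, not service behavior.
    """
    code = detected.strip().lower()
    return code if code in supported else fallback
```

For the example in the question, `resolve_language("da")` would map the Danish result for "Toppen" to `"sv"`, while `resolve_language("en")` is returned unchanged.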
