Azure Cognitive Services Translator: any criteria to determine whether translation gave reliable results or not?

Alexey Gurevski 0 Reputation points
2023-05-29T16:38:07.9633333+00:00

We'd like to show an indication in the UI to draw the user's attention when a translation was unsuccessful. By 'unsuccessful' I don't mean internal server errors or similar, but cases where the result is poor or unreliable, for instance because of bad input text.

The problem is that the API always returns something, even if we send dummy data like 'ksalnfdljknrwwonfjlksnfk nasnfoaewrnnfsjklnfs 294#ffklsdfl'. In those cases the API normally returns a copy of the input. We would like to get some kind of score (e.g. a number from 0 to 1) or a similar quality signal.

I was thinking about checking whether output == input, but unfortunately a word's translation is sometimes identical in both languages, e.g. 'hamster' in English and German. Then I thought about running language detection on the input first, to indicate whether the input is nonsense. The good thing about the Detection API is that it returns a score, but it's not clear what threshold to use for the score value.
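To make the detection step concrete, here is a minimal sketch of how the score could be fetched. It assumes the v3.0 `/detect` route on the same endpoint and a response of the form `[{"language":"en","score":1.0,...}]`; the class and method names (`LanguageDetection`, `ParseDetectResponse`, `DetectAsync`) are made up for illustration, and it uses System.Text.Json instead of Newtonsoft.Json only to keep the snippet dependency-free.

```csharp
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class LanguageDetection
{
    private const string Endpoint = "https://api.cognitive.microsofttranslator.com";

    // Pulls the language code and score out of a /detect response body.
    // Assumed shape: [{"language":"en","score":1.0, ...}]
    public static (string Language, double Score) ParseDetectResponse(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement first = doc.RootElement[0];
            return (first.GetProperty("language").GetString(),
                    first.GetProperty("score").GetDouble());
        }
    }

    // Sends the text to /detect and returns the parsed language and score.
    public static async Task<(string Language, double Score)> DetectAsync(
        HttpClient client, string subscriptionKey, string text)
    {
        string body = JsonSerializer.Serialize(new[] { new { Text = text } });

        using (var request = new HttpRequestMessage(HttpMethod.Post, $"{Endpoint}/detect?api-version=3.0"))
        {
            request.Content = new StringContent(body, Encoding.UTF8, "application/json");
            request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

            using (HttpResponseMessage response = await client.SendAsync(request).ConfigureAwait(false))
            {
                response.EnsureSuccessStatusCode();
                string json = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                return ParseDetectResponse(json);
            }
        }
    }
}
```

The parsing is split from the HTTP call so that it can be checked against a canned response without hitting the service.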

So, hypothetically, IF input's score is < 0.3 (for instance) AND input == output THEN show an error message.
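Written out as code, the rule would look roughly like this. It is only a sketch of the idea: `TranslationQualityHeuristic` and `LooksUnreliable` are hypothetical names, and 0.3 is just the placeholder threshold from above, not a value I have any evidence for.

```csharp
using System;

public static class TranslationQualityHeuristic
{
    // Placeholder threshold; finding a sensible value is exactly the open question.
    public const double DefaultScoreThreshold = 0.3;

    // Flags a translation only when BOTH conditions hold:
    // the output is an unchanged copy of the input AND the detection score is low.
    public static bool LooksUnreliable(
        string input, string output, double detectionScore,
        double threshold = DefaultScoreThreshold)
    {
        bool unchanged = string.Equals(input?.Trim(), output?.Trim(), StringComparison.Ordinal);
        return unchanged && detectionScore < threshold;
    }
}
```

With this rule, 'hamster' translated to 'hamster' with a high detection score is not flagged, while gibberish that comes back unchanged with a low score is.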

What do you think about such an approach? How would you solve this problem? What constants could I use for the thresholds? Perhaps there are other criteria, or the Azure API has other parameters that we are not aware of but that could be helpful.

Here is a distilled piece of C# code that sends a request to the API:

using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

namespace Relesys.Core.Translation
{
    public class TranslationService
    {
        private static readonly string _endpoint = "https://api.cognitive.microsofttranslator.com";
        private static readonly string _subscriptionKey = "******************************";

        private readonly HttpClient _httpClient;

        public TranslationService()
        {
            _httpClient = new HttpClient();
            _httpClient.DefaultRequestHeaders.Clear();
        }

        public async Task<string> SendTranslateRequestAsync(string originalText)
        {
            List<string> routeAttributes = new List<string>
            {
                "api-version=3.0", // always set api version
                "textType=html", // The translate service should expect text from an html text input (as it might contain html)
                "to=de",
                "from=en"
            };

            using (HttpRequestMessage request = new HttpRequestMessage())
            {
                // Body content to send
                object[] body = new object[] { new { Text = originalText } };
                string requestBody = JsonConvert.SerializeObject(body);

                request.Method = HttpMethod.Post;
                request.RequestUri = new Uri($"{_endpoint}/translate?{string.Join("&", routeAttributes)}");
                request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
                request.Headers.Add("Ocp-Apim-Subscription-Key", _subscriptionKey);

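                HttpResponseMessage response = await _httpClient.SendAsync(request).ConfigureAwait(false);

                // Read the body once; it holds either the translation JSON or the error details.
                string responseBody = await response.Content.ReadAsStringAsync().ConfigureAwait(false);

                if (response.IsSuccessStatusCode)
                {
                    return responseBody;
                }

                // Include the status code so a failure is easier to diagnose.
                throw new HttpRequestException($"Translate request failed ({(int)response.StatusCode}): {responseBody}");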
            }
        }
    }
}
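The method above returns the raw JSON, so comparing output with input first requires pulling the translated text out of it. A minimal sketch, assuming the usual v3.0 response shape `[{"translations":[{"text":"...","to":"de"}]}]`; `TranslateResponseParser` and `ExtractFirstTranslation` are hypothetical names, and System.Text.Json is used here only to keep the snippet self-contained.

```csharp
using System.Text.Json;

public static class TranslateResponseParser
{
    // Returns the text of the first translation of the first input element.
    // Assumed shape: [{"translations":[{"text":"...","to":"de"}]}]
    public static string ExtractFirstTranslation(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement[0]
                      .GetProperty("translations")[0]
                      .GetProperty("text")
                      .GetString();
        }
    }
}
```

Note that with textType=html the returned text may contain markup, so a plain string comparison with the input is only a rough check.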
