Azure Cognitive Services Translator: are there any criteria to determine whether a translation is reliable?
We'd like to show an indicator in the UI to draw the user's attention when a translation was unsuccessful. By 'unsuccessful' I don't mean internal server errors or the like, but cases where the result is poor or unreliable, for instance because of bad input text.
The problem is that the API always returns something, even if we send it nonsense such as 'ksalnfdljknrwwonfjlksnfk nasnfoaewrnnfsjklnfs 294#ffklsdfl'. In those cases the API normally returns a copy of the input. We would like to get some kind of score (e.g. a number from 0 to 1) or a similar indicator.
I was thinking about checking whether output == input but, unfortunately, a word's translation is sometimes identical in both languages, e.g. 'hamster' in English and German. Then I thought about running language detection on the input first, to tell whether the input is nonsense. The good thing about the Detect API is that it returns a score. But it's not clear what threshold to use for the score value.
So, hypothetically: IF the input's detection score is < 0.3 (for instance) AND input == output, THEN show an error message.
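A minimal sketch of that rule, to make the idea concrete. The names `TranslationQualityCheck` and `LooksUnreliable` are hypothetical, the 0.3 default is only the placeholder value from above and not a recommended constant, and the case-insensitive comparison is an assumption (German capitalizes nouns, so 'hamster' comes back as 'Hamster').

```csharp
using System;

public static class TranslationQualityCheck
{
    // Placeholder threshold taken from the question; it is not a tuned or recommended value.
    public const double DefaultMinDetectionScore = 0.3;

    // Returns true only when BOTH conditions hold:
    //   1. the translation is (ignoring case and surrounding whitespace) a copy of the input, and
    //   2. the language detection score for the input is below the threshold.
    public static bool LooksUnreliable(
        string input,
        string output,
        double detectionScore,
        double minDetectionScore = DefaultMinDetectionScore)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (output == null) throw new ArgumentNullException(nameof(output));

        bool unchanged = string.Equals(
            input.Trim(), output.Trim(), StringComparison.OrdinalIgnoreCase);
        bool lowConfidence = detectionScore < minDetectionScore;

        return unchanged && lowConfidence;
    }
}
```

With this rule, 'hamster' translated to 'Hamster' with a high detection score would not be flagged, while an unchanged nonsense string with a low score would be.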
What do you think about such an approach? How would you solve this problem? What constants could I use for the thresholds? Perhaps there are other criteria, or the Azure API has other parameters that we are not aware of but that could be helpful.
Here is a distilled piece of C# code that sends a request to the API:
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

namespace Relesys.Core.Translation
{
    public class TranslationService
    {
        private static readonly string _endpoint = "https://api.cognitive.microsofttranslator.com";
        private static readonly string _subscriptionKey = "******************************";
        private HttpClient _httpClient;

        public TranslationService()
        {
            _httpClient = new HttpClient();
            _httpClient.DefaultRequestHeaders.Clear();
        }

        public async Task<string> SendTranslateRequestAsync(string originalText)
        {
            List<string> routeAttributes = new List<string>
            {
                "api-version=3.0", // always set api version
                "textType=html", // The translate service should expect text from an html text input (as it might contain html)
                "to=de",
                "from=en"
            };

            using (HttpRequestMessage request = new HttpRequestMessage())
            {
                // Body content to send
                object[] body = new object[] { new { Text = originalText } };
                string requestBody = JsonConvert.SerializeObject(body);

                request.Method = HttpMethod.Post;
                request.RequestUri = new Uri($"{_endpoint}/translate?{string.Join("&", routeAttributes)}");
                request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
                request.Headers.Add("Ocp-Apim-Subscription-Key", _subscriptionKey);

                HttpResponseMessage response = await _httpClient.SendAsync(request).ConfigureAwait(false);
                if (response.IsSuccessStatusCode)
                {
                    string jsonResult = await response.Content.ReadAsStringAsync();
                    return jsonResult;
                }

                string failedResult = await response.Content.ReadAsStringAsync();
                throw new Exception(failedResult);
            }
        }
    }
}
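For the preliminary detection step, the same request shape could presumably be sent to `{_endpoint}/detect?api-version=3.0`, and the score read from the JSON array that comes back. The sketch below covers only the parsing part. It assumes the response is an array of objects with `language` and `score` fields, which is how I understand the v3 Detect response (worth checking against the current reference). `DetectResponseParser` and `ReadTopDetection` are hypothetical names, and `System.Text.Json` is used here only to keep the sketch free of package dependencies; Newtonsoft.Json would work just as well.

```csharp
using System;
using System.Text.Json;

public static class DetectResponseParser
{
    // Reads the language code and score of the first detection result.
    // Assumed response shape: [ { "language": "...", "score": 0.0-1.0, ... } ]
    public static (string Language, double Score) ReadTopDetection(string detectJson)
    {
        if (detectJson == null) throw new ArgumentNullException(nameof(detectJson));

        using (JsonDocument doc = JsonDocument.Parse(detectJson))
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Array || root.GetArrayLength() == 0)
            {
                throw new FormatException("Expected a non-empty JSON array.");
            }

            JsonElement first = root[0];
            string language = first.GetProperty("language").GetString();
            double score = first.GetProperty("score").GetDouble();
            return (language, score);
        }
    }
}
```

For an illustrative (not captured) payload such as `[{"language":"en","score":1.0}]`, `ReadTopDetection` yields the language code and the score, which could then be compared against whatever threshold turns out to be sensible.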