question

TThorLeach-0941 asked · YutongTie-MSFT answered

English to Chinese Document Translation Character Encoding Problem

I'm trying to use the Microsoft Document Translation API to translate from English into multiple languages. The current workflow involves uploading a txt file to a source container in my Azure Storage Account, requesting a translation of that document/blob into 4 languages (Spanish, Portuguese, French, and Chinese), downloading the translated documents/blobs, and then saving those documents to the server.

The Spanish, Portuguese, and French documents all seem to work, although I have to convert from UTF-8 to Windows-1252 to get rid of some character encoding issues with a few of the Spanish and Portuguese characters (mb_convert_encoding($response, "Windows-1252", "UTF-8");). Unfortunately, I have been unable to find a way to convert the Chinese document into actual Chinese characters. If I write the document/blob directly to disk, I get output that looks like this: ï¼ˆå ™è¿°è€…ï¼‰å©´å„¿çš„å¤§è„‘ç”Ÿé•¿. If I run either utf8_decode or mb_convert_encoding on it, I get something that looks like this: ????????????. If I run utf8_encode, the result looks like this: (堙述者)婴儿的å.

Can anyone tell me what I'm doing wrong here? I'm able to use the Microsoft Text Translator to translate English to Chinese without a problem, but I just can't seem to get the Document Translation API to work...
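
One possible reading of the symptoms above, offered as an assumption and not a confirmed diagnosis: the downloaded bytes may already be valid UTF-8, and the garbled text is what UTF-8 looks like when it is displayed as Windows-1252 (for example, "ï¼ˆ" is how the three UTF-8 bytes of the full-width "（" render in Windows-1252). If that is the case, converting to Windows-1252 can only lose information, because that code page has no Chinese characters. The minimal sketch below illustrates this with a hypothetical sample string standing in for the downloaded blob. The variable names and the output path are made up for illustration, and the fallback branch simply assumes that a non-UTF-8 payload would be Windows-1252.

```php
<?php
// Hypothetical stand-in for the downloaded blob body: a short Chinese string
// stored as UTF-8 bytes (this source file must itself be saved as UTF-8).
$response = "（叙述者）婴儿的大脑生长";

// Check whether the bytes are already valid UTF-8 before converting anything.
$isUtf8 = mb_check_encoding($response, 'UTF-8');

// Converting UTF-8 to Windows-1252 is lossy for Chinese text: Windows-1252
// has no CJK characters, so mbstring substitutes its default replacement
// character for each one.
$lossy = mb_convert_encoding($response, 'Windows-1252', 'UTF-8');

// If the bytes are valid UTF-8, keep them unchanged. Otherwise fall back to
// treating them as Windows-1252 (an assumption, not something the API states).
$toWrite = $isUtf8
    ? $response
    : mb_convert_encoding($response, 'UTF-8', 'Windows-1252');

// Hypothetical output location, used only for this illustration.
file_put_contents(sys_get_temp_dir() . '/translated_zh.txt', $toWrite);

var_dump($isUtf8);
echo $lossy, PHP_EOL;
```

If the check reports valid UTF-8, the remaining step would be to open the saved file in an editor or browser that is set to UTF-8 and skip the Windows-1252 conversion entirely.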


azure-translator

@TThorLeach-0941 Hello, could you please share your document with us if it is not confidential? Thanks.


1 Answer

YutongTie-MSFT answered

@TThorLeach-0941

Sorry, we cannot reproduce this issue without your sample document. I would highly recommend raising a support ticket so that a support engineer can investigate it further. Please let us know if you do not have a support plan, and we can help you open a free support ticket. Thanks.

Regards,
Yutong
