Lots of bugs in PDF translation

Bigfatball 1

Microsoft has a unique opportunity to dominate the machine translation market because Google made stupid mistakes.

I'm happy to commit my time to help Microsoft achieve this goal. However, Microsoft must improve the product quickly. I've found many bugs in Microsoft's document translation API, especially in PDF translation. They include:

Bullet points (the big black dots) are translated into "%a".
Some text is left untranslated and words are concatenated. For example, in a title with the text "U N P A R A L L E L E D A C C E S S", Microsoft Translator simply removes the white space and translates it into "U N P A R A L L E L E D A C C E S S". Google Translate doesn't have this problem. This seems to be a simple PDF parsing bug.
The generated PDF file is huge. The English PDF is 1.6 MB. The PDF produced by Google Translate is 8 MB, but the PDF produced by Microsoft Translator is 24 MB.

I don't think the above issues are translator problems. But obviously your PDF reading and writing code is junk and requires an immediate fix.

I'm happy to provide a test document so you can see the problems. I can provide you the Google translated result too so you can compare.

romungi-MSFT 43,696 Reputation points Microsoft Employee

2021-12-14T09:22:28.917+00:00

@Bigfatball Thanks for the feedback. We can certainly pass it to the product team to check whether improvements can be made. Could you attach the document to this post? If the document cannot be made public, you can email it to us instead, and we can send you the details for doing so in a private message. Thanks!

Bigfatball 1 Reputation point

2021-12-14T17:56:00.237+00:00

Happy to provide the original document, the Azure-translated document, and the Google-translated document. Because the files are large and sensitive, please let me know how to send them to you. Thanks!
