What is Document Translation?
Document Translation is a cloud-based feature of the Azure Translator service and is part of the Azure Cognitive Services family of REST APIs. The Document Translation API translates multiple, complex documents across all supported languages and dialects while preserving the original document structure and data format.
Key features
Feature | Description |
---|---|
Translate large files | Translate whole documents asynchronously. |
Translate numerous files | Translate multiple files across all supported languages and dialects while preserving document structure and data format. |
Preserve source file presentation | Translate files while preserving the original layout and format. |
Apply custom translation | Translate documents using general and custom translation models. |
Apply custom glossaries | Translate documents using custom glossaries. |
Automatically detect document language | Let the Document Translation service determine the language of the document. |
Translate documents with content in multiple languages | Use the autodetect feature to translate documents with content in multiple languages into your target language. |
Note
When translating documents with content in multiple languages, the feature is intended for complete sentences in a single language. If a sentence is composed of more than one language, some of its content may not be translated into the target language. For more information on input requirements, see Document Translation request limits.
Development options
You can add Document Translation to your applications using the REST API or a client-library SDK:
- The REST API is a language-agnostic interface that enables you to create HTTP requests and authorization headers to translate documents.
- The client-library SDKs are language-specific classes, objects, methods, and code that you can quickly use by adding a reference in your project. Currently, Document Translation has programming language support for C#/.NET and Python.
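As a rough illustration of the REST option, the sketch below assembles (but does not send) an HTTP request for a translation batch, using only the Python standard library. The `/batches` path, the `Ocp-Apim-Subscription-Key` header, and the `inputs`/`source`/`targets` body shape reflect our understanding of the v1.0 REST API and should be verified against the REST reference. The resource name, key, container URLs, and the helper name `build_batch_request` are hypothetical placeholders.

```python
import json
import urllib.request


def build_batch_request(resource_name, key, source_url, target_url, target_language):
    """Assemble (without sending) a POST request that starts a translation batch."""
    # Custom endpoint format as given in this article's data residency section.
    endpoint = (
        f"https://{resource_name}.cognitiveservices.azure.com"
        "/translator/text/batch/v1.0"
    )
    # Assumed body shape: one source container, one target container and language.
    body = {
        "inputs": [
            {
                "source": {"sourceUrl": source_url},
                "targets": [
                    {"targetUrl": target_url, "language": target_language}
                ],
            }
        ]
    }
    return urllib.request.Request(
        url=f"{endpoint}/batches",  # assumed path for submitting a batch
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,  # assumed authorization header
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_batch_request(
    resource_name="my-translator",
    key="<your-key>",
    source_url="https://example.blob.core.windows.net/source?<sas-token>",
    target_url="https://example.blob.core.windows.net/target-fr?<sas-token>",
    target_language="fr",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it. Because translation runs asynchronously, the response would only acknowledge the job rather than return translated files.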
Get started
In our quickstart, you learn how to rapidly get started using Document Translation. To begin, you need an active Azure account. If you don't have one, you can create a free account.
Supported document formats
Document Translation supports the following document file types:
File type | File extension | Description |
---|---|---|
Adobe PDF | pdf | Portable document file format. Document Translation uses optical character recognition (OCR) technology to extract and translate text in scanned PDF documents while retaining the original layout. |
Comma-Separated Values | csv | A comma-delimited raw-data file used by spreadsheet programs. |
HTML | html, htm | Hyper Text Markup Language. |
Localization Interchange File Format | xlf | A parallel document format, export of Translation Memory systems. The languages used are defined inside the file. |
Markdown | markdown, mdown, mkdn, md, mkd, mdwn, mdtxt, mdtext, rmd | A lightweight markup language for creating formatted text. |
MHTML | mhtml, mht | A web page archive format used to combine HTML code and its companion resources. |
Microsoft Excel | xls, xlsx | A spreadsheet file for data analysis and documentation. |
Microsoft Outlook | msg | An email message created or saved within Microsoft Outlook. |
Microsoft PowerPoint | ppt, pptx | A presentation file used to display content in a slideshow format. |
Microsoft Word | doc, docx | A text document file. |
OpenDocument Text | odt | An open-source text document file. |
OpenDocument Presentation | odp | An open-source presentation file. |
OpenDocument Spreadsheet | ods | An open-source spreadsheet file. |
Rich Text Format | rtf | A text document containing formatting. |
Tab-Separated Values/TAB | tsv, tab | A tab-delimited raw-data file used by spreadsheet programs. |
Text | txt | An unformatted text document. |
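Before uploading files, it can be handy to screen them against the list above. The following is a minimal sketch of such a pre-check; the helper name `is_supported_document` is hypothetical, and case-insensitive extension matching is an assumption of this example rather than documented service behavior.

```python
from pathlib import Path

# Extensions taken from the supported document formats table.
# MHTML archives use the .mhtml / .mht extensions.
SUPPORTED_DOCUMENT_EXTENSIONS = {
    "pdf", "csv", "html", "htm", "xlf",
    "markdown", "mdown", "mkdn", "md", "mkd", "mdwn", "mdtxt", "mdtext", "rmd",
    "mhtml", "mht", "xls", "xlsx", "msg", "ppt", "pptx", "doc", "docx",
    "odt", "odp", "ods", "rtf", "tsv", "tab", "txt",
}


def is_supported_document(filename):
    """Return True if the file's final extension is in the supported list."""
    # Assumption: compare case-insensitively, so "REPORT.DOCX" is accepted.
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_DOCUMENT_EXTENSIONS


for name in ["report.DOCX", "notes.md", "photo.png", "README"]:
    print(name, is_supported_document(name))
```

Only the final extension is examined, so a name such as `archive.tar.gz` is judged by `gz` alone.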
Request limits
For detailed information regarding Azure Translator Service request limits, see Document Translation request limits.
Legacy file types
Source file types are preserved during document translation, with the following exceptions:
Source file extension | Translated file extension |
---|---|
.doc, .odt, .rtf | .docx |
.xls, .ods | .xlsx |
.ppt, .odp | .pptx |
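If downstream code needs to know which extension to expect on a translated file, the exceptions above can be captured in a small lookup. This is an illustrative sketch only; `translated_extension` is a hypothetical helper, and lowercasing the extension before the lookup is an assumption of the example.

```python
from pathlib import PurePosixPath

# Legacy source extensions and the extension of the translated output,
# copied from the table above.
LEGACY_EXTENSION_MAP = {
    ".doc": ".docx", ".odt": ".docx", ".rtf": ".docx",
    ".xls": ".xlsx", ".ods": ".xlsx",
    ".ppt": ".pptx", ".odp": ".pptx",
}


def translated_extension(source_name):
    """Return the extension expected on the translated file."""
    suffix = PurePosixPath(source_name).suffix.lower()
    # Any extension not listed as an exception is preserved as-is.
    return LEGACY_EXTENSION_MAP.get(suffix, suffix)


for name in ["contract.doc", "budget.ods", "slides.pptx", "notes.txt"]:
    print(name, "->", translated_extension(name))
```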
Supported glossary formats
Document Translation supports the following glossary file types:
File type | File extension | Description |
---|---|---|
Comma-Separated Values | csv | A comma-delimited raw-data file used by spreadsheet programs. |
Localization Interchange File Format | xlf, xliff | A parallel document format, export of Translation Memory systems. The languages used are defined inside the file. |
Tab-Separated Values/TAB | tsv, tab | A tab-delimited raw-data file used by spreadsheet programs. |
Data residency
Document Translation data residency depends on the Azure region where your Translator resource was created:
- Requests for Translator resources created in any region in Europe are processed at data centers in West Europe and North Europe.
- Requests for Translator resources created in any region in Asia or Australia are processed at data centers in Southeast Asia and Australia East.
- Requests for Translator resources created in all other regions, including Global, North America, and South America, are processed at data centers in East US and West US 2.
Text Translation data residency
✔️ Feature: Translator Text
✔️ Region where resource created: Any
Service endpoint | Request processing data center |
---|---|
Global (recommended): api.cognitive.microsofttranslator.com | Closest available data center. |
Americas: api-nam.cognitive.microsofttranslator.com | East US • South Central US • West Central US • West US 2 |
Europe: api-eur.cognitive.microsofttranslator.com | North Europe • West Europe |
Asia Pacific: api-apc.cognitive.microsofttranslator.com | Korea South • Japan East • Southeast Asia • Australia East |
Document Translation data residency
✔️ Feature: Document Translation
✔️ Service endpoint: Custom: <name-of-your-resource>.cognitiveservices.azure.com/translator/text/batch/v1.0
Resource region | Request processing data center |
---|---|
Global and any region in the Americas | East US • West US 2 |
Any region in Europe | North Europe • West Europe |
Any region in Asia Pacific | Southeast Asia • Australia East |
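For compliance checks, the Document Translation residency table can be encoded as a simple lookup. The sketch below is one possible way to do that; the geography keys (`"global"`, `"americas"`, `"europe"`, `"asia pacific"`) and the function name are hypothetical labels chosen for this example, not values defined by the service.

```python
# Processing data centers per resource geography, copied from the table above.
PROCESSING_DATA_CENTERS = {
    "global": ("East US", "West US 2"),
    "americas": ("East US", "West US 2"),
    "europe": ("North Europe", "West Europe"),
    "asia pacific": ("Southeast Asia", "Australia East"),
}


def processing_data_centers(resource_geography):
    """Return the data centers that process requests for a resource geography."""
    key = resource_geography.strip().lower()
    try:
        return PROCESSING_DATA_CENTERS[key]
    except KeyError:
        # Fail loudly rather than guess a residency for an unknown label.
        raise ValueError(f"Unknown geography: {resource_geography!r}") from None


print(processing_data_centers("Europe"))
```

Raising an error for unrecognized labels is a deliberate choice here: silently defaulting to a geography could hide a residency mismatch.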
Next steps