
Issues with language markup in Word and Save to PDF

Charles Belov SFMTA 10 Reputation points
2026-05-13T23:24:08.87+00:00

If I set the language for a document, Word overwrites any existing selection markup for other languages. Ideally, it would detect that there are selections marked up for other languages and ask me whether I want to overwrite them or retain them.

Once I set the languages, some of the languages make it over to the PDF when I save this PDF and some do not.

Steps:

  1. In Word, open an HTML document in which the body tag has a lang="en" attribute and several span tags have a lang="{language code for some other language}" attribute, for example, lang="es" or lang="zh-hant".
  2. Save the document as a .docx file
  3. Save as PDF

The resulting PDF retains the language attributes of the various spans, provided Acrobat supports that language, but the English content is not marked as being in English, a violation of WCAG 3.1.1 (Language of Page).

Bug: Word incorrectly fails to set the document language to English, ignoring the body tag's lang attribute.
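To confirm what the source HTML actually declares before Word opens it, a short script along these lines can list every lang attribute in document order. This is only a sketch using Python's standard html.parser module; the names LangCollector and list_lang_attributes and the sample markup are made up for illustration and are not part of Word or Acrobat.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collect every (tag, lang) pair declared in an HTML document."""

    def __init__(self):
        super().__init__()
        self.declared = []  # (tag name, lang value) in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        for name, value in attrs:
            if name == "lang" and value:
                self.declared.append((tag, value))


def list_lang_attributes(html_text):
    """Return all lang declarations found in html_text."""
    collector = LangCollector()
    collector.feed(html_text)
    collector.close()
    return collector.declared


# Hypothetical sample that mirrors the structure described in the steps above.
sample = (
    '<html><body lang="en"><p>Hello, '
    '<span lang="es">hola</span> and <span lang="zh-hant">你好</span>.</p>'
    '</body></html>'
)

if __name__ == "__main__":
    for tag, lang in list_lang_attributes(sample):
        print(f"<{tag}> lang={lang}")
```

If the body (or html) entry shows up in this list but the exported PDF has no document language, the information was lost somewhere between the HTML import and the PDF export, not in the source file.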

  4. Since the best practice is to remediate the source document rather than the PDF, so that the remediations are kept if the document is updated in the future, I go back to the .docx file, go to the Review tab, and set the proofing language for the document to English (US).

The document autosaves.

  5. Save as PDF

The resulting PDF has the entire document marked as being in English. The proofing language for the various selections in other languages has been lost.

Bug?: Word wipes out the existing markup for language of parts, a violation of WCAG 3.1.2 (Language of Parts).

While I suppose one could argue that the behavior was technically correct, it would have been better for Word to confirm that I want to wipe out the contained language of parts or to set the document language to English but leave the previously specified language of parts as is.

The only fix is to do one of the following:

  1. Go back to the .docx file, go to the Review tab, and set the proofing language for the document to English (US).
  2. Find all the content that is in other languages, and reset the proofing language for each of them, a duplication of effort.
  3. Save as PDF

or

  1. Remediate the document language in Acrobat, which is not a best practice because it would not carry over to any updates of the document.

There is a second issue, and I do not know whether it is a Word issue, an issue with the Save as PDF routine, or an Acrobat issue. I will report it here in case it is a Word issue, but I will also report it to Adobe:

Some proofing languages that can be set in Word do not make it over to Adobe Acrobat as part of the Save as PDF process while others do.

For example, if I set the proofing language in Word to Spanish (US) or to Filipino, those become null in Acrobat. However, if I set the proofing language in Word to Spanish (Mexico), that successfully carries over to Acrobat.

Microsoft 365 and Office | Word | For business | Windows

3 answers

  1. WordWizzard 1,045 Reputation points
    2026-05-14T09:06:57.4533333+00:00

    I began with this HTML:

    <!DOCTYPE html>
    <html lang="en">
    <head>
    <meta charset="UTF-8">
    <title>Language Tagging Example</title>
    </head>
    <body>
    <h1>Welcome to Our Website</h1>
    <p>This paragraph is in English, which is the default language of this document.</p>
    <!-- Paragraph tagged for a different language (Spanish) -->
    <p lang="es">Este párrafo está escrito en español.</p>
    <!-- Inline span tagged for a different language (French) -->
    <p>The French phrase for "that's life" is <span lang="fr">c'est la vie</span>.</p>
    <!-- New inline span tagged for a different language (Traditional Chinese) -->
    <p>The traditional greeting used in Taiwan and Hong Kong is <span lang="zh-hant">你好</span>.</p>
    </body>
    </html>

    Next, I opened the HTML file using Word 2021 (desktop, not web). The proofing language was correctly set to English, and the Spanish and French text were set to Spanish and French, respectively. However, since I did not have a Traditional Chinese language installed, the Chinese text was tagged as English. Finally, I saved the Word document as PDF. When I opened the PDF in Acrobat XI, all of the text was tagged correctly except for the Chinese text.

    "some of the languages make it over to the PDF when I save this PDF and some do not."

    This is likely caused by not having the proofing language installed in Word. If the language is not installed, Word will tag the text as English (default). Note: Not all authoring languages have proofing tools.

    The 365 web version does not have all the bells and whistles of the desktop version, and it may well behave differently.


  2. Vivian-HT 16,770 Reputation points Microsoft External Staff Moderator
    2026-05-14T01:23:19.2233333+00:00

    Dear @Charles Belov SFMTA,

    Thank you for the detailed report and for clearly documenting the steps and outcomes you’re seeing.

    What you’re encountering is a limitation in how Microsoft Word currently handles language metadata during conversion to tagged PDFs. When Word opens an HTML file, it does not reliably treat the <body lang="…"> attribute as the document’s default language. As a result, when the file is saved as PDF, Word may omit the document-level language tag (/Lang) even though language-of-parts (for spans marked with other languages) is preserved. This leads to a WCAG 2.x 3.1.1 (Language of Page) failure.

    When you later set the document proofing language (for example, English (US)) at the document level, Word applies that language to all text and overwrites existing language settings on individual selections. This removes previously defined language-of-parts, resulting in a WCAG 3.1.2 (Language of Parts) failure.

    Certain Word proofing languages (e.g., Spanish (US), Filipino) do not map cleanly to the language codes used in tagged PDFs and supported by Acrobat. This behavior appears to be due to how Word maps its internal proofing languages to PDF language standards and to Acrobat compatibility, and it may contribute to the inconsistencies you are observing.

    From an accessibility and standards perspective, this behavior is not ideal. However, at present it appears to be a product limitation in Word’s PDF export pipeline.

    Moreover, Microsoft hasn’t publicly acknowledged this specific issue as a bug in documentation. There is no setting in Word to automatically inherit the HTML <body lang> as the document language or preserve existing language-of-parts when changing the document’s default language.

    In this case, please be reassured that if there are any updates from Microsoft about this issue, I will reach out to you as soon as possible.

    Given the impact on accessibility standards (such as WCAG compliance) and the need for confirmation at a deeper level, we would recommend reaching out to Microsoft Support for further investigation. Here's the Microsoft Official Support site: https://support.microsoft.com/en-us/contactus

    If your environment is managed by an organization, please coordinate with your IT administrator, who can help raise a support ticket with Microsoft on your behalf. This will ensure that the issue can be reviewed by the appropriate engineering team, and they can confirm whether this behavior is expected or if any workaround or fix is available. For reference: Get support | Microsoft Docs   

    Note: Please understand that as a forum moderator, my role is to provide general guidance and troubleshooting support. I don’t have access to internal systems or backend tools, and I appreciate your understanding of these limits.

    Thank you for your patience and understanding. I look forward to your reply.



  3. AI answer

    2026-05-13T23:24:34.9633333+00:00

    The described behavior cannot be fully addressed or changed using the available settings and tools documented in the provided context.

    From the context, only the following related behaviors are documented:

    1. Word and Office language settings
      • Display and proofing (authoring) languages are controlled via File → Options → Language.
      • A document can have:
        • A default “Office authoring language and proofing” (document-level language), and
        • Additional proofing languages applied to selections.
      • Proofing languages can be added and set as preferred; language packs may include proofing tools.
      Steps (from context) to configure language in Word:
      1. Open any Office application and go to File → Options → Language.
      2. Under Office authoring languages and proofing, ensure the desired language is listed.
      3. Select the language and choose Set as Preferred.
      4. If needed, use Add a language and then install proofing tools if available.
      This governs how Word treats the document’s main authoring language and proofing, but the context does not document any prompt or safeguard to preserve existing language-of-parts markup when changing the document language, nor any way to change that behavior.
    2. Language markup in exported PDFs
      • The context notes that Word’s PDF export supports language tagging for text in different languages using PDF/UA tags. For text in a different language, Word emits a <Span> tag with Lang=language code in the PDF/UA structure (for supported versions):
        • Text in different language (Version ≥ 16.0.16922.20000): <Span> Lang=language code
      • This confirms that Word can carry language-of-parts information into PDF/UA structure when the proofing language is set on specific text runs and when the language is supported.
      • The context does not document:
        • How the document-level language is mapped into the PDF’s language metadata.
        • Any known issue where setting the document language causes loss of language-of-parts markup.
        • Any list of languages that are or are not carried through to Acrobat (e.g., Spanish (US) vs Spanish (Mexico) vs Filipino).
    3. Known limitations and issues with saving to other formats
      • There is documentation of limitations when saving Word documents as Web pages (HTML), where certain formatting and features are changed or removed. This shows that format conversions can drop or alter some document semantics, but it does not specifically address language attributes or PDF export.
      • There is also documentation of specific issues with equations not being saved when exporting to PDF/XPS in older versions, and a crash issue when saving to PDF in Word 2016, but nothing about language metadata.

    Given the context, the only supported, documented actions are:

    • Use File → Options → Language and the Review → Language → Set Proofing Language commands to set:
      • A default authoring/proofing language for the document.
      • Specific proofing languages for selections that are in other languages.
    • Ensure that the Office version is new enough (≥ 16.0.16922.20000) to emit PDF/UA language tags for text in different languages.
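The version threshold mentioned above can be checked by comparing the dotted build string shown under File → Account → About Word against 16.0.16922.20000. The snippet below is a minimal sketch of that comparison; the helper names and the sample build strings in the usage note are invented for illustration.

```python
def parse_build(version):
    """Turn a dotted build string such as '16.0.16922.20000' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


# Threshold quoted in the answer above for emitting <Span> Lang tags.
MIN_LANG_SPAN_BUILD = parse_build("16.0.16922.20000")


def emits_lang_spans(installed_version):
    """True if the installed build is at or above the quoted threshold."""
    return parse_build(installed_version) >= MIN_LANG_SPAN_BUILD


if __name__ == "__main__":
    # Hypothetical build strings, not taken from any real installation.
    for build in ("16.0.16000.20000", "16.0.16922.20000", "16.0.17000.20100"):
        print(build, emits_lang_spans(build))
```

Comparing tuples of integers avoids the mistake of comparing the strings directly, where "16.0.9000.1" would sort after "16.0.16922.20000".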

    However, the context does not provide:

    • Any official statement that Word should preserve existing language-of-parts markup when changing the document language, or that it should prompt before overwriting it.
    • Any configuration or workaround to prevent the loss of language-of-parts markup when resetting the document language.
    • Any documentation about which specific languages are preserved or dropped when exporting to PDF, or why some Word proofing languages (e.g., Spanish (US), Filipino) might not appear correctly in Acrobat while others (e.g., Spanish (Mexico)) do.

    Therefore, based strictly on the provided context, the behavior described (loss of language-of-parts when setting document language; inconsistent transfer of some languages to Acrobat) cannot be confirmed as a documented bug, nor is there a documented fix or setting to change it. The only supported approach in the context is to:

    • Set the document’s default authoring/proofing language via File → Options → Language.
    • Explicitly set proofing languages on each non-default-language selection via Review → Language → Set Proofing Language so that Word can emit <Span Lang=...> tags for those parts when exporting to PDF.

    No additional remediation steps or configuration options are documented in the context.



    AI-generated content may be incorrect. Read our transparency notes for more information.
