Microsoft Word Fails to Correctly Render Nested MathML <msup> elements via HTML AltChunk Import

SL 20 Reputation points
2025-12-02T18:23:57.8633333+00:00

Hello,

I am generating DOCX files programmatically (using Apache POI, which constructs the standard Open XML structure) and am utilizing the w:altChunk element to import HTML content containing MathML.

The core document structure is confirmed to be valid, and simple HTML and MathML elements are imported correctly. However, I am encountering a rendering failure in Microsoft Word when the MathML contains a nested superscript using the standard Presentation MathML tag <msup>.

The issue is that the nested superscript is not visually stacked correctly but is instead rendered inline or incorrectly offset.

Details & Replication

1. The MathML Code (HTML Snippet)

The following is the exact MathML snippet contained within the HTML file referenced by the AltChunk:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">
    <msup>
        <mi>X</mi>
        <msup>
            <mi>1</mi>
            <mi>2</mi>
        </msup>
    </msup>
</math>

Expected Result (Standard MathML Rendering): should be the same as the rendering of i_0.html in a browser.

[Screenshot 2025-12-02 at 10.13.30 AM]

Actual Result in Word:

The expression is flattened to "X12".

I have attached "result.txt". Please rename it to "result.docx" and unzip it to find i_0.html. Then open i_0.html in a browser and result.docx in Word, and compare how the MathML renders.

Thanks.

result.txt

Microsoft 365 and Office | Word | Other | Other

Answer accepted by question author
  1. Kai-H 5,620 Reputation points Microsoft External Staff Moderator
    2025-12-03T08:51:06.2366667+00:00

    Hi, SL

    Welcome to Microsoft Q&A forum.

    Thanks for your question. Here are the explanations and workarounds for your situation:

    What’s really happening

    • Word’s HTML importer does not implement W3C Presentation MathML. When you feed HTML that contains <math>…</math> via w:altChunk, Word’s HTML pipeline treats the MathML as unknown markup and flattens/normalizes inline runs. That’s why nested <msup> collapses into something like X12 rather than a stacked superscript.
    • The AltChunk contract is: if the application can process the content type, it will import and convert it to WordprocessingML; otherwise, it will ignore/flatten the unknown parts and continue. MathML falls into the “unknown to the HTML import” bucket.
    • Properties like w:altChunkPr/w:matchSrc only control whether source formatting is preserved during import; they don’t add MathML support.

    Implication: Rendering is correct in the browser (which understands MathML) but not in Word, because Word’s HTML>DOCX pipeline doesn’t understand MathML superscript stacking semantics.

    Recommended approaches (that do work in Word)

    1. Convert W3C MathML > OMML and insert as Word math

    Word’s native math model is OMML (<m:oMath>), not W3C MathML. If you transform your MathML into OMML and add it to document.xml as built-up equations, stacked superscripts render correctly.

    • Use the mml2omml XSLT (commonly referenced for Word) to transform input MathML to OMML, then insert the resulting <m:oMath> into the target paragraph (not via AltChunk).
    • Programmatically with Open XML SDK / Apache POI: construct <w:p><m:oMath>…</m:oMath></w:p>. In WordprocessingML, <m:oMath> is a direct child of the paragraph and a sibling of text runs (<w:r>); it is not nested inside a run.
    • Word exposes the math object model (OMaths) in the interop API, which indicates that OMML is the supported path for equations in DOCX.

    AltChunk remains fine for generic HTML, but for math you’ll want to bypass AltChunk and write OMML directly into the main document part.

    2. Generate equations using UnicodeMath or LaTeX in Word (then serialize)

    If your pipeline can originate the math as UnicodeMath or LaTeX strings, Word can build them up into professional equations and persist them as OMML. This guarantees correct stacking of superscripts/subscripts.

    • Example (UnicodeMath): X^(1^2) typed in a Word math zone builds into proper stacked superscripts and is saved as <m:oMath> in the DOCX. [Linear for...eX in Word]
    • You can control fonts and symbols (e.g., Cambria Math) and rely on Math AutoCorrect for symbol coverage.

    3. As a fallback, rasterize math to images

    If transforming to OMML is not feasible, render the MathML server-side (MathJax/other) to SVG/PNG at appropriate DPI and insert as images. This preserves appearance but loses semantic editing in Word.

    Why AltChunk+HTML fails for <msup> nesting

    • The AltChunk spec notes applications may ignore content types they cannot process; Word’s HTML importer lacks the MathML layout semantics for superscripts/subscripts and will flatten unknown trees into inline text.
    • Tweaks like w:matchSrc or CSS do not bridge that gap; they only affect styling when the HTML feature itself is supported.

    Implementation outline (MathML > OMML > DOCX)

    Below is a conceptual outline you can adapt to Apache POI (or use Open XML SDK in .NET); the key is transforming MathML to OMML and writing it into the main document part.

    <!-- Resulting OMML snippet; m:oMath sits directly inside the paragraph, not inside a w:r -->
    <w:p>
      <m:oMath>
        <m:sSup> <!-- superscript -->
          <m:e>   <!-- base -->
            <m:r><m:t>X</m:t></m:r>
          </m:e>
          <m:sup> <!-- the superscript -->
            <m:sSup>
              <m:e><m:r><m:t>1</m:t></m:r></m:e>
              <m:sup><m:r><m:t>2</m:t></m:r></m:sup>
            </m:sSup>
          </m:sup>
        </m:sSup>
      </m:oMath>
    </w:p>
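
For a quick check of how Word renders an OMML fragment like the one above, it can be dropped into a bare-bones DOCX package. The sketch below (Python, standard library only) writes the three parts a minimal package needs. The names build_docx and OMML_SNIPPET are assumptions for illustration; a document produced by Apache POI already contains these parts, so only the word/document.xml content is relevant there.

```python
import zipfile

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)

ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    "</Relationships>"
)

# X raised to (1 raised to 2); the m: prefix is declared on w:document below.
OMML_SNIPPET = (
    "<m:oMath><m:sSup>"
    "<m:e><m:r><m:t>X</m:t></m:r></m:e>"
    "<m:sup><m:sSup>"
    "<m:e><m:r><m:t>1</m:t></m:r></m:e>"
    "<m:sup><m:r><m:t>2</m:t></m:r></m:sup>"
    "</m:sSup></m:sup>"
    "</m:sSup></m:oMath>"
)


def build_docx(omml, target):
    """Write a one-paragraph DOCX whose paragraph contains the OMML.

    target may be a file path or a binary file-like object.
    """
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}" xmlns:m="{M_NS}">'
        # The equation is a direct child of w:p, next to any w:r runs.
        f"<w:body><w:p>{omml}</w:p></w:body>"
        "</w:document>"
    )
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("word/document.xml", document)
```

For example, build_docx(OMML_SNIPPET, "math_test.docx") produces a file that can be opened in Word to inspect the equation. The package deliberately omits styles and settings, so fonts and spacing may differ from a full document.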
    

    Steps:

    • For each <math> block in your HTML source, extract the MathML fragment.
    • Run the fragment through mml2omml.xsl to produce OMML.
    • In POI, build the corresponding <m:oMath> tree directly inside the target paragraph (<w:p>), alongside its runs (<w:r>) rather than inside one; avoid AltChunk for math content.
    • Ensure the doc defaults or styles reference Cambria Math for math font consistency.
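
The first step, pulling each <math> block out of the HTML, could look roughly like the following Python sketch (standard library only). It uses a regular expression rather than an HTML parser, which is an assumption that holds only when <math> elements are not nested inside one another. The function name split_math and the [[MATH:n]] placeholder format are invented for this example; the placeholders simply mark where each converted equation belongs once the remaining HTML has been imported.

```python
import re

# Matches one <math ...>...</math> element, non-greedy, across line breaks.
MATH_BLOCK = re.compile(r"<math\b[^>]*>.*?</math\s*>", re.DOTALL | re.IGNORECASE)


def split_math(html):
    """Return (html_with_placeholders, list_of_mathml_fragments)."""
    fragments = []

    def _swap(match):
        # Keep the fragment and leave a numbered marker in its place.
        fragments.append(match.group(0))
        return f"[[MATH:{len(fragments) - 1}]]"

    stripped = MATH_BLOCK.sub(_swap, html)
    return stripped, fragments
```

Each returned fragment can then be passed through the MathML to OMML transform, while the stripped HTML (now free of MathML) can still go through AltChunk if that fits the rest of the pipeline.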

    Hope this helps. Feel free to get back if you need further assistance.


