3.1 Encapsulating HTML into RTF
Given the following source HTML content:
<HTML><head>
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal {font-family:Arial;}
-->
</style>
	<!-- This is a HTML comment.
There is a horizontal tab (%x09) character before the comment, 
and some new lines inside the comment. -->
</head>
<body>
<p
class="MsoNormal">Note the line break inside a P tag. <b>This is a bold text</b>
</p>
<p class="MsoNormal">
This is a normal text with a character references: &nbsp; &lt; &uml;<br>
characters which have special meaning in RTF: {}\<br>
</p>
<ol>
<li class="MsoNormal">This is a list item
</ol>
</body>
</HTML>
By conforming to this algorithm, an encapsulating RTF writer can produce the following RTF:
{\rtf1\ANSI\ansicpg1251\fromhtml1 \deff0
{\fonttbl {\f0\fmodern Courier New;}{\f1\fswiss Arial;}{\f2\fswiss\fcharset0 Arial;}}
{\colortbl\red0\green0\blue0;\red0\green0\blue255;}
{\*\htmltag64}
\uc1\pard\plain\deftab360 \f0\fs24
{\*\htmltag <HTML><head>\par
<style>\par
<!--\par
/* Style Definitions */\par
p.MsoNormal, li.MsoNormal \{font-family:Arial;\}\par
-->\par
</style>\par
\tab <!-- This is a HTML comment.\par
There is a horizontal tab (%x09) character before the comment, \par
and some new lines inside the comment. -->\par
</head>\par
<body>\par
<p\par
class="MsoNormal">}
{\htmlrtf \f1 \htmlrtf0 Note the line break inside a P tag. {\*\htmltag <b>}{\htmlrtf \b \htmlrtf0 This is a bold text{\*\htmltag </b>}} \htmlrtf\par\htmlrtf0}
\htmlrtf \par \htmlrtf0
{\*\htmltag </p>\par
<p class="MsoNormal">\par}
{\htmlrtf \f1 \htmlrtf0 This is a normal text with a character references: {\*\htmltag &nbsp;}\htmlrtf \'a0\htmlrtf0 {\*\htmltag &lt;}\htmlrtf <\htmlrtf0 {\*\htmltag &uml;}\htmlrtf {\f2\'a8}\htmlrtf0{\*\htmltag <br>\par}\htmlrtf\line\htmlrtf0
characters which have special meaning in RTF: \{\}\\{\*\htmltag <br>\par}\htmlrtf\line\htmlrtf0\htmlrtf\par\htmlrtf0}
{\*\htmltag </p>\par
<ol>\par
<li class="MsoNormal">}{\htmlrtf {{\*\pn\pnlvlbody\pndec\pnstart1\pnindent360{\pntxta.}}\li360\fi-360{\pntext 1.\tab} \f1 \htmlrtf0 This is a list item}\htmlrtf\par\htmlrtf0}
{\*\htmltag \par
</ol>\par
</body>\par
</HTML>\par }}
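The escaping visible inside the \htmltag groups of this RTF (braces and backslashes prefixed with a backslash, line breaks written as \par, the horizontal tab written as \tab) can be illustrated with a minimal Python sketch. The function names rtf_escape and htmltag_group are hypothetical and are not defined by this algorithm. The sketch assumes ASCII-only markup, and it does not choose the numeric \htmltag parameter or emit the \htmlrtf fallback formatting that a real encapsulating writer produces.

```python
def rtf_escape(text):
    """Escape HTML source text for placement inside an RTF group.

    Assumption: input is ASCII only; non-ASCII characters would need
    \\'hh or \\u escapes, which this sketch does not produce.
    """
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)      # RTF special characters
        elif ch == "\n":
            out.append("\\par\n")      # line break; the raw newline is ignored by RTF readers
        elif ch == "\t":
            out.append("\\tab ")       # horizontal tab (%x09)
        elif ch == "\r":
            continue                   # CR is folded into the \par emitted for LF
        else:
            out.append(ch)
    return "".join(out)


def htmltag_group(markup):
    """Wrap escaped HTML markup in an ignorable \\htmltag destination group."""
    return "{\\*\\htmltag " + rtf_escape(markup) + "}"


if __name__ == "__main__":
    print(htmltag_group("p.MsoNormal {font-family:Arial;}\n"))
```

Applied to the style rule from the example, this yields a group whose content matches the corresponding line of the RTF listing: the braces become \{ and \}, and the trailing line break becomes \par.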
A de-encapsulating RTF reader can recover the original HTML document from the RTF example in this section by conforming to this algorithm.
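As a rough illustration of that recovery, the following Python sketch walks an RTF string, emits the content of \htmltag groups and unsuppressed text, and drops everything between \htmlrtf and \htmlrtf0. It is a simplified sketch, not the full algorithm. The names TOKEN, SKIPPED_DESTINATIONS, BREAKS and deencapsulate are hypothetical. It assumes that the \htmlrtf state is saved and restored with groups, that \'hh bytes decode with a caller-supplied code page, and that only the constructs appearing in this example occur (no \u, no \bin, no check for \fromhtml1, and the numeric \htmltag parameter is ignored).

```python
import re

TOKEN = re.compile(
    r"\\([A-Za-z]+)(-?\d+)? ?"  # control word, optional number, optional delimiter space
    r"|\\'([0-9A-Fa-f]{2})"     # \'hh escaped byte
    r"|\\([^A-Za-z])"           # control symbol: \{ \} \\ \* ...
    r"|([{}])"                  # group open / close
    r"|([^\\{}]+)"              # run of plain text
)

# Destinations whose content is never part of the HTML (assumed list).
SKIPPED_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info"}
BREAKS = {"par": "\r\n", "line": "\r\n", "tab": "\t"}


def deencapsulate(rtf, codepage="cp1252"):
    out = []
    stack = []                       # saved (htmlrtf, skip) state per open group
    htmlrtf = skip = star = False
    for m in TOKEN.finditer(rtf):
        word, num, hexbyte, symbol, brace, text = m.groups()
        was_star, star = star, False
        if brace == "{":
            stack.append((htmlrtf, skip))
        elif brace == "}":
            if stack:
                htmlrtf, skip = stack.pop()
        elif word == "htmlrtf":
            htmlrtf = num != "0"     # \htmlrtf turns suppression on, \htmlrtf0 off
        elif word is not None:
            if word in SKIPPED_DESTINATIONS or (was_star and word != "htmltag"):
                skip = True          # ignore this destination until its group closes
            elif word in BREAKS and not (skip or htmlrtf):
                out.append(BREAKS[word])
        elif symbol == "*":
            star = True              # the next control word names an ignorable destination
        elif not (skip or htmlrtf):
            if hexbyte is not None:
                out.append(bytes([int(hexbyte, 16)]).decode(codepage))
            elif symbol is not None:
                if symbol in "\\{}":
                    out.append(symbol)
            else:
                # raw CR/LF in the RTF stream carry no meaning
                out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out)


sample = (
    r"{\rtf1\ansi\fromhtml1 \deff0{\fonttbl{\f0\fswiss Arial;}}"
    r"{\*\htmltag <p>}{\htmlrtf \f0 \htmlrtf0 braces: \{\}\\{\*\htmltag <br>\par}"
    r"\htmlrtf\line\htmlrtf0 }{\*\htmltag </p>}}"
)

if __name__ == "__main__":
    print(repr(deencapsulate(sample)))
```

For the short sample string, the font table and the \f0 and \line fallback formatting are discarded, and the result is the markup `<p>braces: {}\<br>` followed by a CRLF and `</p>`.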