Arabic and Mathematical Enclosures

This post describes the Arabic subtending marks and discusses how their editing and display could be significantly improved by using a mathematical layout engine instead of a complex-script shaping engine. Unicode has a set of six Arabic subtending or enclosure characters: five located at U+0600..U+0604 (؀, ؁, ؂, ؃, ؄, respectively) and the End-of-Ayah mark U+06DD (۝). Proper display of these characters when followed by a sequence of Arabic-Indic digits U+0660..U+0669 (٠, ١, ٢, ٣, ٤, ٥, ٦, ٧, ٨, ٩) is to extend them under the digit sequence, or for ۝ to enclose the digit sequence. The following table displays the character codes along with a numeric example.
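As a rough illustration of the character sequences in question, the following Python sketch finds each subtending or End-of-Ayah mark that is followed by a run of Arabic-Indic digits. The names `ENCLOSURE_RE` and `find_enclosures` are made up for this example; they are not part of any shaping or layout engine.

```python
import re

# The six marks named above: U+0600..U+0604 and End of Ayah U+06DD.
SUBTENDING_MARKS = "\u0600\u0601\u0602\u0603\u0604\u06DD"
# Arabic-Indic digits U+0660..U+0669, written as a character-class range.
ARABIC_INDIC_DIGITS = "\u0660-\u0669"

# One mark followed by one or more Arabic-Indic digits.
ENCLOSURE_RE = re.compile(f"([{SUBTENDING_MARKS}])([{ARABIC_INDIC_DIGITS}]+)")


def find_enclosures(text):
    """Return (mark, digits) pairs for each mark followed by digits."""
    return [(m.group(1), m.group(2)) for m in ENCLOSURE_RE.finditer(text)]


if __name__ == "__main__":
    sample = "\u0601\u0661\u0664\u0664\u0665 \u06DD\u0661\u0662"
    for mark, digits in find_enclosures(sample):
        print(f"U+{ord(mark):04X} with {len(digits)} digit(s)")
```

A renderer would still have to decide how to draw each pair; the sketch only identifies which runs of text need the special treatment.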

The layouts here are made with the Arabic shaping engine used by Uniscribe and DWrite. The digits are drawn with a reduced font size and the subtending mark is overlaid on top. Due to limitations in the shaping engine, at most four digits can be used with the subtending marks other than ؂, and at most three with ۝. ؂ is displayed as an enhanced underline and is drawn as the ؂ glyph followed by glyph pairs, each consisting of a digit with an underline. Hence the number can have arbitrary length.
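The digit limits just described can be summarized in a small lookup. This is only a sketch of the limits as stated in this post, not code from the shaping engine; `MAX_DIGITS` and `fits_shaping_limit` are hypothetical names.

```python
# Digit limits as stated above; None means no limit (U+0602 only).
MAX_DIGITS = {
    "\u0600": 4,
    "\u0601": 4,
    "\u0602": None,  # drawn as an underline, so any length works
    "\u0603": 4,
    "\u0604": 4,
    "\u06DD": 3,     # End of Ayah
}


def fits_shaping_limit(mark, digits):
    """Return True if the digit run is within the stated limit for mark."""
    limit = MAX_DIGITS[mark]
    return limit is None or len(digits) <= limit


if __name__ == "__main__":
    five_digits = "\u0661\u0662\u0663\u0664\u0665"
    print(fits_shaping_limit("\u0601", five_digits))  # over the limit of four
    print(fits_shaping_limit("\u0602", five_digits))  # no limit for U+0602
```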

Another difficulty occurs with selection. Except for ؂, it’s tricky to figure out where to place the caret for an insertion point or the selection color rectangles for a nondegenerate selection.

Now consider how enclosed or decorated expressions are displayed using a math display engine like LineServices. Such constructs include accents, rectangles, and brackets. The expression can contain arbitrary mathematical text, not just Arabic-Indic digits, and can be arbitrarily large. The math display engine measures the ascent, descent, and width of the expression, chooses a glyph assembly of the correct size, and then positions that glyph assembly as needed for the construct in question. Examples are given in the following table along with the linear format input used to generate them.
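The measure-then-assemble step can be sketched roughly as follows. This is not how LineServices is implemented; it is a simplified model that assumes a stretchy glyph is built from fixed end pieces plus a repeatable extender piece, and it ignores the overlap between pieces that real fonts allow. The names `Box`, `measure`, and `extenders_needed` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Metrics of a glyph or expression, in arbitrary font units."""
    ascent: float
    descent: float
    width: float


def measure(glyphs):
    """Bounding metrics of glyphs laid out side by side on one baseline."""
    return Box(
        ascent=max(g.ascent for g in glyphs),
        descent=max(g.descent for g in glyphs),
        width=sum(g.width for g in glyphs),
    )


def extenders_needed(target_width, fixed_width, extender_width):
    """Number of extender pieces needed so the assembly spans target_width.

    fixed_width is the combined width of the non-repeating pieces.
    """
    shortfall = max(0.0, target_width - fixed_width)
    return math.ceil(shortfall / extender_width)


if __name__ == "__main__":
    digits = [Box(7, 0, 5), Box(8, 1, 5), Box(7, 2, 6)]
    box = measure(digits)
    print(box, extenders_needed(box.width, fixed_width=6, extender_width=4))
```

Because the assembly grows with the measured width, nothing in this model caps the number of digits, which is the property the shaping-engine approach lacks.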

Selection works as desired. You can select any part of the expressions, subject to the rules of mathematical selection.

We see that treating the Arabic subtending and End-of-Ayah mark constructs as mathematical objects could significantly improve their editing and rendering.