This glossary contains definitions for terms used in the Uniscribe documentation.
The "A" width is the space to the left of the ink (or its on-screen equivalent) that represents the glyph or run: underhang when positive (also known as "padding"), overhang when negative. The "B" width is the black width, measured from the leftmost ink to the rightmost ink. The "C" width is the corresponding space to the right of the ink: underhang when positive, overhang when negative.
The following illustration shows an italic lowercase "f" with overhang to both its left and right. That is, the "A" and "C" widths here are both negative. See underhang for an illustration of positive "A" and "C" widths.
When two or more glyphs are displayed as a unit, usually only the leftmost glyph contributes to the "A" width of the run, and only the rightmost glyph contributes to the "C" width of the run. However, this is not a strict rule. For example, if the first glyph in a run is a narrow letter and the second glyph is a wide diacritical mark, and they are handled as separate glyphs, the diacritical mark might actually extend beyond the letter.
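The way per-glyph widths combine into run widths can be sketched as follows. This is a minimal illustration, not Uniscribe code: the `ABC` tuple and the `run_abc` helper are hypothetical names, and the sketch assumes a left-to-right run in which each glyph advances the pen by A + B + C.

```python
from typing import List, NamedTuple


class ABC(NamedTuple):
    a: int  # space before the ink (negative means overhang)
    b: int  # black width: leftmost ink to rightmost ink
    c: int  # space after the ink (negative means overhang)


def run_abc(glyphs: List[ABC]) -> ABC:
    """Combine per-glyph ABC widths into ABC widths for a left-to-right run."""
    pen = 0            # starting point of the current glyph
    ink_left = None    # leftmost ink seen so far
    ink_right = None   # rightmost ink seen so far
    for g in glyphs:
        left = pen + g.a
        right = left + g.b
        ink_left = left if ink_left is None else min(ink_left, left)
        ink_right = right if ink_right is None else max(ink_right, right)
        pen += g.a + g.b + g.c  # assumed advance of this glyph
    if ink_left is None:
        return ABC(0, 0, 0)
    # A: run start to leftmost ink; C: rightmost ink to the end of the run.
    return ABC(ink_left, ink_right - ink_left, pen - ink_right)
```

With a narrow letter `ABC(1, 2, 1)` followed by a wide zero-advance mark `ABC(-6, 8, -2)`, the mark's ink reaches left of the letter, so the second glyph sets the "A" width of the run. This is the exception to the usual rule described above.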
The advance width of a glyph is the movement in the direction of writing from the starting point for rendering that glyph to the starting point for rendering the next glyph.
The bidirectional stack is a 5-bit integer that keeps track of nesting levels between left-to-right and right-to-left text. It always starts at zero for left-to-right. Thus all even-numbered values represent left-to-right text and all odd-numbered values represent right-to-left text. The bidirectional stack is represented in the uBidiLevel member of a SCRIPT_STATE structure.
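The even/odd convention can be sketched in a few lines. The helper names below are hypothetical, and the rule used to pick the next nesting level (the smallest greater level with the requested direction) is an assumption borrowed from the Unicode Bidirectional Algorithm rather than something stated in this glossary.

```python
MAX_LEVEL = 31  # largest value that fits in a 5-bit field


def is_rtl(level: int) -> bool:
    """Odd levels are right-to-left, even levels are left-to-right."""
    if not 0 <= level <= MAX_LEVEL:
        raise ValueError("level does not fit in 5 bits")
    return level % 2 == 1


def next_embedding_level(level: int, rtl: bool) -> int:
    """Return the smallest level above `level` with the requested direction."""
    if not 0 <= level <= MAX_LEVEL:
        raise ValueError("level does not fit in 5 bits")
    candidate = level + 1
    if (candidate % 2 == 1) != rtl:
        candidate += 1  # skip one level to get the right parity
    if candidate > MAX_LEVEL:
        raise ValueError("nesting too deep for a 5-bit level")
    return candidate
```

Starting from level 0, embedding right-to-left text gives level 1, and embedding left-to-right text inside that gives level 2.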
Bidirectional text contains both left-to-right and right-to-left portions, but the term is also sometimes loosely applied to pure right-to-left text. All right-to-left text requires the use of the bidirectional stack, because the default embedding level of zero implies left-to-right text.
An application can justify text to fit a line by adjusting the cell width for certain glyphs. For unjustified text, the cell width for a glyph is the same as its advance width.
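As a rough illustration of the relationship between advance widths and cell widths, the sketch below widens selected glyphs so that a line fills a target width. It is not how Uniscribe itself justifies text. The `justify` helper and its `stretchable` flags are hypothetical, and the even distribution of extra space is an assumption made for simplicity.

```python
from typing import List


def justify(advances: List[int], stretchable: List[bool],
            target_width: int) -> List[int]:
    """Return cell widths; extra space is spread over the stretchable glyphs.

    If the line is already at least as wide as the target, or nothing can
    stretch, the cell widths equal the advance widths (unjustified text).
    """
    cells = list(advances)
    slots = [i for i, can_stretch in enumerate(stretchable) if can_stretch]
    extra = target_width - sum(advances)
    if extra <= 0 or not slots:
        return cells
    share, remainder = divmod(extra, len(slots))
    for n, i in enumerate(slots):
        # The first `remainder` slots take one extra unit so the total is exact.
        cells[i] += share + (1 if n < remainder else 0)
    return cells
```

For example, with advances `[10, 4, 10, 4, 10]`, where only the two 4-unit glyphs (say, spaces) may stretch, a target width of 45 yields cell widths `[10, 8, 10, 7, 10]`.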
A cluster is the smallest linguistic unit that can be shaped. In languages such as Arabic and many of the Indic languages, the glyphs used to represent each character (Unicode code point) depend strongly on the surrounding code points, which constitute the cluster. In these languages, applications can translate code points into appropriate glyphs only by looking at the cluster. In some scripts, such as Devanagari, the order of glyphs within a cluster can differ from the order of the corresponding Unicode code points. For more information, see Windows Glyph Processing on the Microsoft typography site.
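One common way to represent clusters is a per-character array that gives, for each code point, the index of the first glyph of its cluster. The sketch below groups characters and glyphs from such an array. It assumes a left-to-right run and non-decreasing array values, and the `clusters` helper is a hypothetical name, not a Uniscribe function.

```python
from typing import List, Tuple


def clusters(log_clust: List[int],
             glyph_count: int) -> List[Tuple[int, int, int, int]]:
    """Return (char_start, char_end, glyph_start, glyph_end) per cluster.

    All ranges are half-open. Characters that share the same first-glyph
    index belong to the same cluster.
    """
    result = []
    start = 0
    for i in range(1, len(log_clust) + 1):
        at_end = i == len(log_clust)
        if at_end or log_clust[i] != log_clust[start]:
            # The cluster's glyphs end where the next cluster's glyphs begin.
            glyph_end = glyph_count if at_end else log_clust[i]
            result.append((start, i, log_clust[start], glyph_end))
            start = i
    return result
```

With `log_clust = [0, 0, 2, 3]` and four glyphs, the first two characters form one cluster drawn with glyphs 0 and 1, and the remaining characters each form their own single-glyph cluster.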
A complex script is a script with any of the following properties:
- Allows bidirectional rendering.
- Has contextual shaping.
- Has combining characters.
- Has specialized word-breaking and justification rules.
- Filters out illegal character combinations.
- Is not supported in the core Windows fonts and therefore might require font fallback.
In some complex scripts, the order of the glyphs might be quite different from the order of the underlying Unicode characters they represent. See About Complex Scripts for more detail.
In the context of typography, it is sometimes desirable to handle the Latin script used in writing English as a complex script. Examples include the Stylistic Alternates feature described in the documentation of OPENTYPE_FEATURE_RECORD, or ligatures, such as "ﬁ", where a single glyph represents two or more consecutive characters.
Font fallback is automated selection of a font other than the font selected by the user in an application. In Uniscribe, font fallback is applied by the ScriptStringAnalyse function when all or part of the text is in a script that the user-selected font does not support.
A glyph is a single unit of display in a font. For OpenType, this unit is defined by an outline. For other types of fonts, it can be defined by a bitmap, a set of graphic commands, and the like. A glyph does not necessarily correspond to a single character. For example, the "fi" ligature ("ﬁ") represents the two characters "f" and "i". The Vietnamese lowercase "o" with circumflex and tilde ("ỗ") is typically composed from multiple glyphs.
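The two examples can be inspected at the character level with Python's standard `unicodedata` module. Note the limits of this sketch: it shows Unicode character decomposition, not glyph selection, which is decided by the font and the shaping engine. The `decompose` helper is a hypothetical name.

```python
import unicodedata
from typing import List


def decompose(ch: str) -> List[str]:
    """List the code points in the compatibility decomposition (NFKD) of `ch`."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFKD", ch)]


# The "fi" ligature character stands for two letters.
ligature_parts = decompose("\uFB01")

# Lowercase "o" with circumflex and tilde decomposes into a base letter
# followed by two combining marks.
vietnamese_parts = decompose("\u1ED7")
```

Here `ligature_parts` holds the code points for "f" and "i", and `vietnamese_parts` holds "o" followed by the combining circumflex and combining tilde.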
An item has a single script and direction. The ScriptItemize or ScriptItemizeOpenType function can break a paragraph into items. An item is not necessarily a run, because it can contain characters of multiple styles. Item and run information must be combined to determine ranges, each of which has a single script, direction, and style.
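Combining item and run information amounts to splitting the text at every item boundary and every run boundary. The sketch below shows this with plain character offsets. The `ranges` helper and its inputs are hypothetical and do not mirror the structures that the itemization functions return.

```python
from typing import Iterable, List, Tuple


def ranges(item_starts: Iterable[int], run_starts: Iterable[int],
           length: int) -> List[Tuple[int, int]]:
    """Split [0, length) at every item start and every run start.

    Returns half-open (start, end) character ranges, each of which lies
    inside exactly one item and exactly one run.
    """
    cuts = set(item_starts) | set(run_starts) | {0, length}
    ordered = sorted(c for c in cuts if 0 <= c <= length)
    return list(zip(ordered, ordered[1:]))
```

For a 10-character paragraph with items starting at 0 and 5 and style runs starting at 0, 3, and 8, the result is `[(0, 3), (3, 5), (5, 8), (8, 10)]`.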
LRM indicates the LEFT-TO-RIGHT MARK (Unicode code point U+200E). This mark specifies that characters following it in logical order should be rendered left-to-right.
LTR indicates left-to-right.
RLM indicates the RIGHT-TO-LEFT MARK (Unicode code point U+200F). This mark indicates that characters following it in logical order should be rendered right-to-left.
RTL indicates right-to-left.
A run is a passage of text for Uniscribe to render. It should have a single style, that is, font, size, and color, but can be drawn from a variety of scripts. A run can contain both left-to-right and right-to-left content.
NADS indicates NATIONAL DIGIT SHAPES (Unicode code point U+206E). The term specifies that European digits (U+0030 through U+0039) should be rendered as national digits. See Digit Shapes for further discussion of national digits.
NODS indicates NOMINAL DIGIT SHAPES (Unicode code point U+206F). The term specifies that European digits (U+0030 through U+0039) should be rendered normally, not as national digits.
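The effect of the two control characters can be sketched with a toy substitution pass. This is not how Uniscribe performs digit substitution. It assumes that "national digits" means the Arabic-Indic digits (U+0660 through U+0669), that text starts in nominal mode, and that the control characters are dropped from the output. The `apply_digit_shapes` helper is a hypothetical name.

```python
NADS = "\u206E"  # NATIONAL DIGIT SHAPES
NODS = "\u206F"  # NOMINAL DIGIT SHAPES
ARABIC_INDIC_ZERO = 0x0660  # assumed set of national digits


def apply_digit_shapes(text: str) -> str:
    """Replace European digits with national digits between NADS and NODS."""
    national = False
    out = []
    for ch in text:
        if ch == NADS:
            national = True
        elif ch == NODS:
            national = False
        elif national and "0" <= ch <= "9":
            out.append(chr(ARABIC_INDIC_ZERO + ord(ch) - ord("0")))
        else:
            out.append(ch)
    return "".join(out)
```

Given `"12" + NADS + "34" + NODS + "56"`, only the digits 3 and 4 are replaced, by U+0663 and U+0664.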
The overhang is the part of the ink of a glyph that extends beyond the advance width of the glyph. Most glyphs (such as "H") have no overhang, as there is a little white space on either side to separate them from adjacent glyphs. An example of a glyph with overhang is the italic "f" used in this topic to illustrate ABC width. Both the top and bottom of the italic "f" overhang the adjacent glyphs. Overhang corresponds to a negative "A" or "C" width.
A script is a system of written language, for example, Latin script, Arabic script, Chinese script. A single script can apply to one or many human languages. The script has no particular relation to a font. For example, the Latin script can be rendered equally well by the Times New Roman or the Arial font.
The underhang is a width of white space to the left or right of the solid portion of a glyph. Underhang corresponds to a positive "A" or "C" width, as described for ABC width. Underhang is sometimes known as "padding". The following illustration shows the underhang for the lowercase letter n.