Unicode – Nemeth Character Mappings
In addition to handling 2D arrangements such as fractions, roots, subscripts, and superscripts, math layout programs need to be able to display the myriad math symbols discussed in Unicode Technical Report #25, Unicode Support for Mathematics. To interoperate with Nemeth braille, such programs need to map between Unicode characters and Nemeth braille sequences. Since Unicode and Nemeth braille were developed independently of one another, it’s not surprising that each can represent symbols that the other doesn’t have. For example, Nemeth doesn’t have Unicode’s reversed tilde ∽ and Unicode doesn’t have Nemeth’s extended tilde (like ∼∼ with no intervening space). Fortunately, all common math symbols have well-defined mappings. Nemeth braille also has rules that guide reasonable choices for many Unicode math symbols not mentioned explicitly in the Nemeth specification; see, for example, that specification’s §139 on negation and §147 on comparison signs compounded vertically.
The present post discusses some of the Nemeth methodology and gives representative mappings for some symbols. If you’re interested in the full table, email me and I’ll send you the Word document containing the current mapping collection. Because of the size of this undertaking, only Nemeth math braille is considered. It would be worthwhile for someone to undertake a similar effort for Unified English Braille (UEB) math braille. How Nemeth represents 2D layouts such as fractions is discussed in Nemeth Braille—the first math linear format.
Math zones
The focus here is on math zones, which are text ranges that have math typography rather than normal typography. Natural-language contractions are not used in math zones, so we can hope for general, language-independent math-symbol mappings. When math zones are embedded in UEB, the math-zone start delimiter would be ⠸⠩ and the math-zone end delimiter would be ⠸⠱, in accord with Using the Nemeth Code within UEB contexts. Math zones are key to working with technical documents, since math-zone typography and conventions differ from those for normal text. A user therefore needs to know where a math zone starts and ends.
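To make the delimiter idea concrete, here is a minimal Python sketch that wraps a Nemeth sequence in the two indicators quoted above and pulls math zones back out of mixed text. The function names are hypothetical, the single space on each side of the indicators is my assumption about the spacing convention, and the scan is naive: it assumes the indicator pairs never occur inside the math itself.

```python
NEMETH_OPEN = "\u2838\u2829"   # ⠸⠩ math-zone start delimiter (from the text above)
NEMETH_CLOSE = "\u2838\u2831"  # ⠸⠱ math-zone end delimiter (from the text above)


def wrap_math_zone(nemeth: str, sep: str = " ") -> str:
    """Surround a Nemeth sequence with the start and end delimiters.

    `sep` is an assumed separator between indicator and content.
    """
    return f"{NEMETH_OPEN}{sep}{nemeth}{sep}{NEMETH_CLOSE}"


def find_math_zones(text: str) -> list[str]:
    """Return the contents of every delimited math zone in `text`.

    Naive scan: takes the first end delimiter after each start delimiter.
    """
    zones = []
    pos = 0
    while True:
        start = text.find(NEMETH_OPEN, pos)
        if start < 0:
            break
        end = text.find(NEMETH_CLOSE, start + len(NEMETH_OPEN))
        if end < 0:
            break  # unterminated zone: stop rather than guess
        zones.append(text[start + len(NEMETH_OPEN):end].strip())
        pos = end + len(NEMETH_CLOSE)
    return zones
```

For example, `wrap_math_zone("⠈⠱")` gives `⠸⠩ ⠈⠱ ⠸⠱`, and `find_math_zones` applied to a string containing that result returns `["⠈⠱"]`.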
Braille symbol construction techniques
The Nemeth specification describes several symbol construction techniques. Some very productive ones are illustrated in the following table, which also includes the section number in the Nemeth specification. The structure codes such as the termination indicator ⠻ are displayed in red.
Mapping origins
The mappings given in the table below resulted from scouring the Nemeth specification, which is a PDF file of scanned images. As such it offers no search or link capabilities, and a paper version is more useful than the electronic version. It’s the first document I’ve printed in years, other than occasional tickets, boarding passes, and legal authorizations. There’s also a nicely formatted Nemeth specification in French, complete with a navigation pane with links to all the rules. You can search its text, including for braille sequences, since the sequences are encoded in ASCII braille. This is valuable for learning about sequences in general and for checking whether a potentially new sequence is already defined. For example, the combination ⠠⠱ doesn’t appear as math in the French specification except as part of the extended tilde ⠈⠠⠱, so it seems like a good candidate for encoding the missing reversed tilde ∽ (∼ is given by ⠈⠱). The French version’s content differs from the original English version in places, so I checked both in creating the table entries.
It would be nice if someone would enter the 1972 Nemeth specification into Word so that a more accessible PDF could be created in English. A partial version with MathSpeak functionality is given in The Nemeth Braille Code for Mathematics and Science. An ASCII braille version (brf) can be downloaded from here.
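Searching a document encoded in ASCII braille means converting between Unicode braille cells and their ASCII stand-ins. The sketch below does that with the 64-character North American Braille ASCII table as I understand it, ordered so that the string index equals the offset from U+2800. The function names are hypothetical, and lowercase input is simply folded to uppercase.

```python
# Braille ASCII characters indexed by dot pattern: index i <-> chr(0x2800 + i).
# Assumed to be the standard North American Braille ASCII ordering.
BRAILLE_ASCII = " A1B'K2L@CIF/MSP\"E3H9O6R^DJG>NTQ,*5<-U8V.%[$+X!&;:4\\0Z7(_?W]#Y)="
assert len(BRAILLE_ASCII) == 64


def ascii_to_unicode_braille(text: str) -> str:
    """Convert ASCII braille to Unicode braille cells (U+2800..U+283F)."""
    # str.index raises ValueError for characters outside the table.
    return "".join(chr(0x2800 + BRAILLE_ASCII.index(ch)) for ch in text.upper())


def unicode_braille_to_ascii(cells: str) -> str:
    """Convert six-dot Unicode braille cells to ASCII braille."""
    return "".join(BRAILLE_ASCII[ord(ch) - 0x2800] for ch in cells)
```

With this table, the tilde ⠈⠱ corresponds to `@:` and the extended tilde ⠈⠠⠱ to `@,:`, so a search for `,:` in an ASCII braille file would show whether ⠠⠱ is already used on its own.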
Challenging mappings
Unicode 9.0 has a total of 2310 characters that have the math property (see Math property in DerivedCoreProperties.txt). Of these, many of the more advanced symbols don’t have unambiguous Nemeth representations. In particular, Nemeth doesn’t distinguish between slanted and vertical bar overlays, e.g., U+219A ↚ and U+21F7 ⇷ are both given by ⠳⠈⠫⠪⠒⠒⠻. Nemeth doesn’t have white arrows like ⇦ or white/black arrow heads like ⭠. Unicode has symbols like the bowtie ⧑ that don’t have apparent Nemeth representations. One possibility for ⋈ is to treat it as a shape with a suggestive name like “bt”, as in ⠫⠃⠞, but one still needs to encode whether the sides are black or white, since Unicode encodes all four possibilities.
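Because several Unicode characters can share one Nemeth sequence, the Unicode-to-Nemeth direction is a function but the reverse direction is not. A small Python sketch of how one might flag such collisions in a mapping table follows; the dictionary holds only the three mappings quoted in this post, and the function name is hypothetical.

```python
from collections import defaultdict

# Tiny excerpt of a Unicode -> Nemeth table, using sequences quoted in this post.
UNICODE_TO_NEMETH = {
    "\u219A": "⠳⠈⠫⠪⠒⠒⠻",  # ↚ leftwards arrow with (slanted) stroke
    "\u21F7": "⠳⠈⠫⠪⠒⠒⠻",  # ⇷ leftwards arrow with vertical stroke
    "\u223C": "⠈⠱",         # ∼ tilde operator
}


def ambiguous_sequences(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Return each Nemeth sequence that more than one Unicode character maps to."""
    by_sequence = defaultdict(list)
    for char, sequence in mapping.items():
        by_sequence[sequence].append(char)
    return {seq: chars for seq, chars in by_sequence.items() if len(chars) > 1}
```

Running `ambiguous_sequences(UNICODE_TO_NEMETH)` reports the single shared sequence with both arrows attached, which is where a Nemeth-to-Unicode converter would have to pick a default.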
Conversely, Nemeth has quite a few symbols not in Unicode. Often these can be constructed with a combination of Unicode symbols or with math layout objects in applications like Microsoft Word. We can submit proposals to add characters given in the Nemeth specification but not yet in Unicode provided the characters occur in journals or books. Nemeth’s extended tilde is one such character to research. Since Nemeth braille is based on a productive syntax, many symbol combinations can be created. Eventually I hope to collect all Unicode math characters that have reasonable Nemeth braille sequences and add their mapping data to the information associated with Unicode Technical Report #25, Unicode Support for Mathematics.
Sample mappings
The table below lists some representative Unicode characters used in mathematical text along with their Unicode names and the corresponding Nemeth math braille sequences. The table doesn’t include any mappings of the Unicode math alphanumerics, since they are defined in the post Nemeth Braille Alphanumerics and Unicode Math Alphanumerics. Relational operators (Nemeth calls them “signs and symbols of comparison”) need to be surrounded by spaces. The spaces are not included in the table, since the relational operator property is defined in the MathClass data file and software can insert spaces programmatically. It would be easy to add a column giving the math behavior from MathClass.txt for quick reference. The full mapping table isn’t included, since I don’t know how to convince the MSDN blogging facility to use a braille font that displays non-dot placeholders, and braille is hard to read without them.
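As a rough illustration of inserting those spaces programmatically, the Python sketch below reads math classes from text in a `code;class` layout and pads any character of class R (relation) with a space on each side. The sample data is a hand-written excerpt rather than the real file, the layout is my assumption about MathClass.txt, the function names are hypothetical, and the three-entry Nemeth table takes ∼ from this post and assumes the ordinary braille letters for a and b.

```python
# Hand-written excerpt in an assumed "code;class" layout; not the real data file.
MATHCLASS_SAMPLE = """\
# code point or range ; class
003D;R
223C;R
002B;V
0030..0039;N
"""

# Illustrative Nemeth table: the ∼ entry is quoted in this post, a and b are assumed.
NEMETH = {"a": "⠁", "b": "⠃", "\u223C": "⠈⠱"}


def parse_math_classes(text: str) -> dict[str, str]:
    """Map each listed character to its math class letter."""
    classes = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split(";")
        if len(fields) < 2:
            continue
        first, _, last = fields[0].strip().partition("..")
        for code in range(int(first, 16), int(last or first, 16) + 1):
            classes[chr(code)] = fields[1].strip()
    return classes


def to_nemeth(chars: str, classes: dict[str, str], space: str = " ") -> str:
    """Translate characters one by one, spacing relational operators."""
    out = []
    for ch in chars:
        sequence = NEMETH[ch]
        if classes.get(ch) == "R":
            sequence = f"{space}{sequence}{space}"
        out.append(sequence)
    return "".join(out)
```

With the sample data, `to_nemeth("a∼b", parse_math_classes(MATHCLASS_SAMPLE))` yields `⠁ ⠈⠱ ⠃`, so the mapping table itself can stay free of spaces.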