Why is System.Text.Rune named like this?

Question

PerceptiveFilament 21

Intro

.NET uses UTF-16 for string encoding. This means:

the char type, which is the atomic unit of a string, is a 16-bit data type;
surrogate pairs are used for supplementary code points, U+10000..U+10FFFF, while for the Basic Multilingual Plane, U+0000..U+FFFF, one char is sufficient to express any Unicode scalar value.
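The two points above can be sketched in a few lines. This is only a minimal illustration, assuming C# top-level statements (C# 9 or later); the variable names are made up for the example. It shows that one supplementary code point occupies two char values and that a naive Substring can cut the pair in half.

```csharp
using System;

// "a" (BMP, one char) followed by U+1F600 (supplementary, two chars).
string s = "a\uD83D\uDE00";

int units = s.Length;                              // 3 UTF-16 code units
bool high = char.IsHighSurrogate(s[1]);            // true: first half of the pair
bool low = char.IsLowSurrogate(s[2]);              // true: second half of the pair
int codePoint = char.ConvertToUtf32(s[1], s[2]);   // 0x1F600

// An unintended split: taking the first two chars cuts the pair,
// leaving a lone high surrogate at the end of the result.
string broken = s.Substring(0, 2);
bool endsWithLoneSurrogate = char.IsSurrogate(broken[broken.Length - 1]);

Console.WriteLine($"{units} {high} {low} U+{codePoint:X} {endsWithLoneSurrogate}");
```

The string in `broken` is not well-formed UTF-16 any more, which is the kind of problem the text refers to.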

The .NET Rune type makes it possible to avoid surrogate-pair problems, e.g. a pair being split unintentionally. A single Rune variable can express any scalar value.

using System.Text;
var a = new Rune('a');
var grinningFace = new Rune('\uD83D', '\uDE00');
grinningFace.ToString(); // "😀"

Terminology

There are many terms related to the string and char types: symbol, character, pictogram, emoji, grapheme, script, letter, mark, punctuation, accent, diacritic, emoticon, … .NET chose to use "text element" as a synonym for what Unicode calls a grapheme cluster.

Unicode® Technical Standard #51

Note that all emoji sequences are single grapheme clusters:

Using "grapheme" is not entirely correct either.

Grapheme

In linguistics, a grapheme is the smallest functional unit of a writing system.

Consider, for example, the 🧑🏿‍🎄 emoji, which is composed of the following sequence:

1F9D1, 🧑, \ud83e\uddd1
1F3FF, 🏿, \ud83c\udfff
200D, ‍, \u200d
1F384, 🎄, \ud83c\udf84

It is hard to consider any of these units a functional unit of a writing system.
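The three levels involved here (char, Rune, text element) can be compared on that same emoji. The following is a rough sketch, assuming .NET 5 or later and C# top-level statements; the variable name mxClaus is made up for the example.

```csharp
using System;
using System.Globalization;
using System.Text;

// U+1F9D1, U+1F3FF, U+200D (zero width joiner), U+1F384
string mxClaus = "\U0001F9D1\U0001F3FF\u200D\U0001F384";

// Level 1: UTF-16 code units (char). 2 + 2 + 1 + 2 = 7.
int charCount = mxClaus.Length;

// Level 2: Unicode scalar values (Rune). Four of them.
int runeCount = 0;
foreach (Rune r in mxClaus.EnumerateRunes())
{
    Console.WriteLine($"U+{r.Value:X4} takes {r.Utf16SequenceLength} char(s)");
    runeCount++;
}

// Level 3: text elements (grapheme clusters). On .NET 5 and later this is
// expected to be 1; older runtimes segmented text differently.
int textElementCount = new StringInfo(mxClaus).LengthInTextElements;

Console.WriteLine($"{charCount} chars, {runeCount} runes, {textElementCount} text element(s)");
```

So a Rune sits between char and text element: it removes the surrogate-pair problem, but it does not correspond to what a reader perceives as one character.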

The document The Unicode® Standard: A Technical Introduction is more specific.

For example, in historic Spanish language sorting, "ll" counts as a single text element. However, when Spanish words are typed, "ll" is two separate text elements: "l" and "l".

Text elements are encoded as sequences of one or more characters. Certain of these sequences are called combining character sequences, made up of a base letter and one or more combining marks, which are rendered around the base letter (above it, below it, etc.). For example, a sequence of "a" followed by a combining circumflex "^" would be rendered as "â".

Some overlap between Unicode and .NET can be seen around the term text (textual) element. Nonetheless, the terminology remains rather opaque.

The Rune

https://www.vocabulary.com/dictionary/rune

A rune is a letter used in early Germanic writing. A linguist might be interested in runes because they're evidence of ancient languages, while a mystic might use runes, believed by some to have magical properties, in fortune-telling.

https://www.thefreedictionary.com/rune

1a: Any of the characters in several alphabets used by ancient Germanic peoples from the 3rd to the 13th century.

1b: A similar character in another alphabet, sometimes believed to have magic powers.

2: A poem or incantation of mysterious significance, especially a magic charm.

Conclusion

After a deeper look, "rune" does not resemble any of the terminology used by Unicode and seems to be tightly coupled with the writing system of the Germanic tribes. As far as I can see, a Rune is nothing more than a Unicode scalar value. It is not important whether a Unicode code point is expressed by a 1:1 numeric relation, by surrogate pairs or by a series of doggies and cats. It could be said that .NET is about to contribute to the terminology goulash:

Why is Rune not called UnicodeScalarValue or something more technically accurate?
I failed to find any reference on why this name was chosen.
Is there some specific reason?
Or is Rune just as fine as grapheme, glyph, symbol and others would be?
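The observation that a Rune is just a Unicode scalar value can be checked against the API. This is a small sketch, assuming C# top-level statements; the variable names are illustrative only. A scalar value is any code point in U+0000..U+10FFFF except the surrogate range U+D800..U+DFFF, and Rune appears to enforce exactly that.

```csharp
using System;
using System.Text;

bool lowest = Rune.IsValid(0x0000);       // true
bool surrogate = Rune.IsValid(0xD800);    // false: surrogates are not scalar values
bool highest = Rune.IsValid(0x10FFFF);    // true
bool beyond = Rune.IsValid(0x110000);     // false: outside the Unicode range

// A lone surrogate char cannot become a Rune, but a matching pair can.
bool loneOk = Rune.TryCreate('\uD83D', out Rune lone);
bool pairOk = Rune.TryCreate('\uD83D', '\uDE00', out Rune face);

// The constructor throws instead of returning false.
bool threw = false;
try { _ = new Rune(0xD800); }
catch (ArgumentOutOfRangeException) { threw = true; }

Console.WriteLine($"{lowest} {surrogate} {highest} {beyond} {loneOk} {pairOk} {threw}");
Console.WriteLine($"U+{face.Value:X}");
```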

Viorel 122.6K Reputation points

2024-10-02T15:59:42.79+00:00

I think that it is not unusual to borrow and adapt the terms, without standardization or technical accuracy, for example: “string”, “bug”, “method”, etc.

PerceptiveFilament 21 Reputation points

2024-10-02T16:51:36.13+00:00

Yes, of course. Generally it is quite common. On the other hand, in the .NET environment it is rather usual to be technically accurate. For instance, Swift uses Unicode.Scalar, which fits very well.

Rune is very specific to the literacy of the Germanic tribes, while Unicode scalars represent a very broad set of writable and non-writable elements.

But the motivation for this choice by Rob Pike and Ken Thompson is unknown. Maybe they found it appealing because of its magical connotations. For instance, https://thegrimoirevault.com/runic-magick/runic-magick-basics-spells-and-symbols/#:~:text=This%20article%20will%20guide%20you%20through%20the.

Next to string, char and the others, Rune comes across as showy.

PerceptiveFilament 21 Reputation points

2024-10-02T16:53:01.09+00:00

test test

Answer 1

So, I found it at https://learn.microsoft.com/en-us/dotnet/fundamentals/runtime-libraries/system-text-rune?source=recommendations#rune-in-net-vs-other-languages

The term "rune" is not defined in the Unicode Standard. The term dates back to the creation of UTF-8. Rob Pike and Ken Thompson were looking for a term to describe what would eventually become known as a code point. They settled on the term "rune", and Rob Pike's later influence over the Go programming language helped popularize the term.
