IdnMapping.GetAscii Method

Definition

Encodes a string of domain name labels that include Unicode characters outside the US-ASCII character range to a string of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E). The string is formatted according to the IDNA standard.

Overloads

GetAscii(String)

Encodes a string of domain name labels that consist of Unicode characters to a string of displayable Unicode characters in the US-ASCII character range. The string is formatted according to the IDNA standard.

GetAscii(String, Int32)

Encodes a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.

GetAscii(String, Int32, Int32)

Encodes the specified number of characters in a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.

GetAscii(String)

Source:
IdnMapping.cs
Source:
IdnMapping.cs
Source:
IdnMapping.cs

Encodes a string of domain name labels that consist of Unicode characters to a string of displayable Unicode characters in the US-ASCII character range. The string is formatted according to the IDNA standard.

C#
public string GetAscii(string unicode);

Parameters

unicode
String

The string to convert, which consists of one or more domain name labels delimited with label separators.

Returns

The equivalent of the string specified by the unicode parameter, consisting of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E) and formatted according to the IDNA standard.

Exceptions

unicode is null.

unicode is invalid based on the AllowUnassigned and UseStd3AsciiRules properties, and the IDNA standard.

Examples

The following example uses the GetAscii(String) method to convert an array of internationalized domain names to Punycode, which is an encoded equivalent that consists of characters in the US-ASCII character range. The GetUnicode(String) method then converts the Punycode domain name back into the original domain name, but replaces the original label separators with the standard label separator.

C#
using System;
using System.Globalization;

public class Example
{
   public static void Main()
   {
      string[] names = { "bücher.com", "мойдомен.рф", "παράδειγμα.δοκιμή",
                         "mycharity\u3002org",
                         "prose\u0000ware.com", "proseware..com", "a.org",
                         "my_company.com" };
      IdnMapping idn = new IdnMapping();

      foreach (var name in names) {
         try {
            string punyCode = idn.GetAscii(name);
            string name2 = idn.GetUnicode(punyCode);
            Console.WriteLine("{0} --> {1} --> {2}", name, punyCode, name2);
            Console.WriteLine("Original: {0}", ShowCodePoints(name));
            Console.WriteLine("Restored: {0}", ShowCodePoints(name2));
         }
         catch (ArgumentException) {
            Console.WriteLine("{0} is not a valid domain name.", name);
         }
         Console.WriteLine();
      }
   }

   private static string ShowCodePoints(string str1)
   {
      string output = "";
      foreach (var ch in str1)
         output += $"U+{(ushort)ch:X4} ";

      return output;
   }
}
// The example displays the following output:
//    bücher.com --> xn--bcher-kva.com --> bücher.com
//    Original: U+0062 U+00FC U+0063 U+0068 U+0065 U+0072 U+002E U+0063 U+006F U+006D
//    Restored: U+0062 U+00FC U+0063 U+0068 U+0065 U+0072 U+002E U+0063 U+006F U+006D
//
//    мойдомен.рф --> xn--d1acklchcc.xn--p1ai --> мойдомен.рф
//    Original: U+043C U+043E U+0439 U+0434 U+043E U+043C U+0435 U+043D U+002E U+0440 U+0444
//    Restored: U+043C U+043E U+0439 U+0434 U+043E U+043C U+0435 U+043D U+002E U+0440 U+0444
//
//    παράδειγμα.δοκιμή --> xn--hxajbheg2az3al.xn--jxalpdlp --> παράδειγμα.δοκιμή
//    Original: U+03C0 U+03B1 U+03C1 U+03AC U+03B4 U+03B5 U+03B9 U+03B3 U+03BC U+03B1 U+002E U+03B4 U+03BF U+03BA U+03B9 U+03BC U+03AE
//    Restored: U+03C0 U+03B1 U+03C1 U+03AC U+03B4 U+03B5 U+03B9 U+03B3 U+03BC U+03B1 U+002E U+03B4 U+03BF U+03BA U+03B9 U+03BC U+03AE
//
//    mycharity。org --> mycharity.org --> mycharity.org
//    Original: U+006D U+0079 U+0063 U+0068 U+0061 U+0072 U+0069 U+0074 U+0079 U+3002 U+006F U+0072 U+0067
//    Restored: U+006D U+0079 U+0063 U+0068 U+0061 U+0072 U+0069 U+0074 U+0079 U+002E U+006F U+0072 U+0067
//
//    prose ware.com is not a valid domain name.
//
//    proseware..com is not a valid domain name.
//
//    a.org --> a.org --> a.org
//    Original: U+0061 U+002E U+006F U+0072 U+0067
//    Restored: U+0061 U+002E U+006F U+0072 U+0067
//
//    my_company.com --> my_company.com --> my_company.com
//    Original: U+006D U+0079 U+005F U+0063 U+006F U+006D U+0070 U+0061 U+006E U+0079 U+002E U+0063 U+006F U+006D
//    Restored: U+006D U+0079 U+005F U+0063 U+006F U+006D U+0070 U+0061 U+006E U+0079 U+002E U+0063 U+006F U+006D

Remarks

The unicode parameter specifies a string of one or more labels that consist of valid Unicode characters. The labels are separated by label separators. The unicode parameter cannot begin with a label separator, but it can include and optionally end with a separator. The label separators are FULL STOP (period, U+002E), IDEOGRAPHIC FULL STOP (U+3002), FULLWIDTH FULL STOP (U+FF0E), and HALFWIDTH IDEOGRAPHIC FULL STOP (U+FF61). For example, the domain name "www.adatum.com" consists of the labels, "www", "adatum", and "com", separated by periods.

A label cannot contain any of the following characters:

The GetAscii method converts all label separators to FULL STOP (period, U+002E).

If unicode contains no characters outside the US-ASCII character range and no characters within the US-ASCII character range are prohibited, the method returns unicode unchanged.

Notes to Callers

In the .NET Framework 4.5, the IdnMapping class supports different versions of the IDNA standard, depending on the operating system in use:

See Unicode Technical Standard #46: IDNA Compatibility Processing for the differences in the way these standards handle particular sets of characters.

Applies to

.NET 10 and other versions
Product Versions
.NET Core 1.0, Core 1.1, Core 2.0, Core 2.1, Core 2.2, Core 3.0, Core 3.1, 5, 6, 7, 8, 9, 10
.NET Framework 2.0, 3.0, 3.5, 4.0, 4.5, 4.5.1, 4.5.2, 4.6, 4.6.1, 4.6.2, 4.7, 4.7.1, 4.7.2, 4.8, 4.8.1
.NET Standard 2.0, 2.1
UWP 10.0

GetAscii(String, Int32)

Source:
IdnMapping.cs
Source:
IdnMapping.cs
Source:
IdnMapping.cs

Encodes a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.

C#
public string GetAscii(string unicode, int index);

Parameters

unicode
String

The string to convert, which consists of one or more domain name labels delimited with label separators.

index
Int32

A zero-based offset into unicode that specifies the start of the substring to convert. The conversion operation continues to the end of the unicode string.

Returns

The equivalent of the substring specified by the unicode and index parameters, consisting of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E) and formatted according to the IDNA standard.

Exceptions

unicode is null.

index is less than zero.

-or-

index is greater than the length of unicode.

unicode is invalid based on the AllowUnassigned and UseStd3AsciiRules properties, and the IDNA standard.

Remarks

The unicode and index parameters define a substring with one or more labels that consist of valid Unicode characters. The labels are separated by label separators. The first character of the substring cannot begin with a label separator, but it can include and optionally end with a separator. The label separators are FULL STOP (period, U+002E), IDEOGRAPHIC FULL STOP (U+3002), FULLWIDTH FULL STOP (U+FF0E), and HALFWIDTH IDEOGRAPHIC FULL STOP (U+FF61). For example, the domain name "www.adatum.com" consists of the labels, "www", "adatum", and "com", separated by periods.

A label cannot contain any of the following characters:

The GetAscii method converts all label separators to FULL STOP (period, U+002E).

If unicode contains no characters outside the US-ASCII character range and no characters within the US-ASCII character range are prohibited, the method returns unicode unchanged.

Notes to Callers

In the .NET Framework 4.5, the IdnMapping class supports different versions of the IDNA standard, depending on the operating system in use:

See Unicode Technical Standard #46: IDNA Compatibility Processing for the differences in the way these standards handle particular sets of characters.

Applies to

.NET 10 and other versions
Product Versions
.NET Core 1.0, Core 1.1, Core 2.0, Core 2.1, Core 2.2, Core 3.0, Core 3.1, 5, 6, 7, 8, 9, 10
.NET Framework 2.0, 3.0, 3.5, 4.0, 4.5, 4.5.1, 4.5.2, 4.6, 4.6.1, 4.6.2, 4.7, 4.7.1, 4.7.2, 4.8, 4.8.1
.NET Standard 2.0, 2.1
UWP 10.0

GetAscii(String, Int32, Int32)

Source:
IdnMapping.cs
Source:
IdnMapping.cs
Source:
IdnMapping.cs

Encodes the specified number of characters in a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.

C#
public string GetAscii(string unicode, int index, int count);

Parameters

unicode
String

The string to convert, which consists of one or more domain name labels delimited with label separators.

index
Int32

A zero-based offset into unicode that specifies the start of the substring.

count
Int32

The number of characters to convert in the substring that starts at the position specified by index in the unicode string.

Returns

The equivalent of the substring specified by the unicode, index, and count parameters, consisting of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E) and formatted according to the IDNA standard.

Exceptions

unicode is null.

index or count is less than zero.

-or-

index is greater than the length of unicode.

-or-

index is greater than the length of unicode minus count.

unicode is invalid based on the AllowUnassigned and UseStd3AsciiRules properties, and the IDNA standard.

Examples

The following example uses the GetAscii(String, Int32, Int32) method to convert an internationalized domain name to a domain name that complies with the IDNA standard. The GetUnicode(String, Int32, Int32) method then converts the standardized domain name back into the original domain name, but replaces the original label separators with the standard label separator.

C#
// This example demonstrates the GetAscii and GetUnicode methods.
// For sake of illustration, this example uses the most complex
// form of those methods, not the most convenient.

using System;
using System.Globalization;

class Sample
{
    public static void Main()
    {
/*
   Define a domain name consisting of the labels: GREEK SMALL LETTER
   PI (U+03C0); IDEOGRAPHIC FULL STOP (U+3002); GREEK SMALL LETTER
   THETA (U+03B8); FULLWIDTH FULL STOP (U+FF0E); and "com".
*/
    string name = "\u03C0\u3002\u03B8\uFF0Ecom";
    string international;
    string nonInternational;

    string msg1 = "the original non-internationalized \ndomain name:";
    string msg2 = "Allow unassigned characters?:     {0}";
    string msg3 = "Use non-internationalized rules?: {0}";
    string msg4 = "Convert the non-internationalized domain name to international format...";
    string msg5 = "Display the encoded domain name:\n\"{0}\"";
    string msg6 = "the encoded domain name:";
    string msg7 = "Convert the internationalized domain name to non-international format...";
    string msg8 = "the reconstituted non-internationalized \ndomain name:";
    string msg9 = "Visually compare the code points of the reconstituted string to the " +
                  "original.\n" +
                  "Note that the reconstituted string contains standard label " +
                  "separators (U+002e).";
// ----------------------------------------------------------------------------
    CodePoints(name, msg1);
// ----------------------------------------------------------------------------

    IdnMapping idn = new IdnMapping();

    Console.WriteLine(msg2, idn.AllowUnassigned);
    Console.WriteLine(msg3, idn.UseStd3AsciiRules);
    Console.WriteLine();
// ----------------------------------------------------------------------------
    Console.WriteLine(msg4);
    international = idn.GetAscii(name, 0, name.Length);
    Console.WriteLine(msg5, international);
    Console.WriteLine();
    CodePoints(international, msg6);
// ----------------------------------------------------------------------------
    Console.WriteLine(msg7);
    nonInternational = idn.GetUnicode(international, 0, international.Length);
    CodePoints(nonInternational, msg8);
    Console.WriteLine(msg9);
    }
// ----------------------------------------------------------------------------
    static void CodePoints(string value, string title)
    {
    Console.WriteLine("Display the Unicode code points of {0}", title);
    foreach (char c in value)
        {
        Console.Write("{0:x4} ", Convert.ToInt32(c));
        }
        Console.WriteLine();
        Console.WriteLine();
    }
}
/*
This code example produces the following results:

Display the Unicode code points of the original non-internationalized
domain name:
03c0 3002 03b8 ff0e 0063 006f 006d

Allow unassigned characters?:     False
Use non-internationalized rules?: False

Convert the non-internationalized domain name to international format...
Display the encoded domain name:
"xn--1xa.xn--txa.com"

Display the Unicode code points of the encoded domain name:
0078 006e 002d 002d 0031 0078 0061 002e 0078 006e 002d 002d 0074 0078 0061 002e 0063 006f
006d

Convert the internationalized domain name to non-international format...
Display the Unicode code points of the reconstituted non-internationalized
domain name:
03c0 002e 03b8 002e 0063 006f 006d

Visually compare the code points of the reconstituted string to the original.
Note that the reconstituted string contains standard label separators (U+002e).

*/

Remarks

The Unicode, index, and count parameters define a substring with one or more labels that consist of valid Unicode characters. The labels are separated by label separators. The first character of the substring cannot begin with a label separator, but it can include and optionally end with a separator. The label separators are FULL STOP (period, U+002E), IDEOGRAPHIC FULL STOP (U+3002), FULLWIDTH FULL STOP (U+FF0E), and HALFWIDTH IDEOGRAPHIC FULL STOP (U+FF61). For example, the domain name "www.adatum.com" consists of the labels, "www", "adatum", and "com", separated by periods.

A label cannot contain any of the following characters:

The GetAscii method converts all label separators to FULL STOP (period, U+002E). If the substring contains no characters outside the US-ASCII character range, and no characters within the US-ASCII character range are prohibited, the method returns the substring unchanged.

Notes to Callers

In the .NET Framework 4.5, the IdnMapping class supports different versions of the IDNA standard, depending on the operating system in use:

See Unicode Technical Standard #46: IDNA Compatibility Processing for the differences in the way these standards handle particular sets of characters.

Applies to

.NET 10 and other versions
Product Versions
.NET Core 1.0, Core 1.1, Core 2.0, Core 2.1, Core 2.2, Core 3.0, Core 3.1, 5, 6, 7, 8, 9, 10
.NET Framework 2.0, 3.0, 3.5, 4.0, 4.5, 4.5.1, 4.5.2, 4.6, 4.6.1, 4.6.2, 4.7, 4.7.1, 4.7.2, 4.8, 4.8.1
.NET Standard 2.0, 2.1
UWP 10.0