IdnMapping.GetAscii Method

Definition

Encodes a string of domain name labels that include Unicode characters outside the US-ASCII character range to a string of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E). The string is formatted according to the IDNA standard.

Overloads

GetAscii(String)

Encodes a string of domain name labels that consist of Unicode characters to a string of displayable Unicode characters in the US-ASCII character range. The string is formatted according to the IDNA standard.

GetAscii(String, Int32)

Encodes a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.

GetAscii(String, Int32, Int32)

Encodes the specified number of characters in a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.

GetAscii(String)

Source:
IdnMapping.cs

Encodes a string of domain name labels that consist of Unicode characters to a string of displayable Unicode characters in the US-ASCII character range. The string is formatted according to the IDNA standard.

public:
 System::String ^ GetAscii(System::String ^ unicode);
public string GetAscii (string unicode);
member this.GetAscii : string -> string
Public Function GetAscii (unicode As String) As String

Parameters

unicode
String

The string to convert, which consists of one or more domain name labels delimited with label separators.

Returns

The equivalent of the string specified by the unicode parameter, consisting of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E) and formatted according to the IDNA standard.

Exceptions

ArgumentNullException

unicode is null.

ArgumentException

unicode is invalid based on the AllowUnassigned and UseStd3AsciiRules properties, and the IDNA standard.

Examples

The following example uses the GetAscii(String) method to convert an array of internationalized domain names to Punycode, which is an encoded equivalent that consists of characters in the US-ASCII character range. The GetUnicode(String) method then converts the Punycode domain name back into the original domain name, but replaces the original label separators with the standard label separator.

using System;
using System.Globalization;

public class Example
{
   public static void Main()
   {
      string[] names = { "bücher.com", "мойдомен.рф", "παράδειγμα.δοκιμή",
                         "mycharity\u3002org",
                         "prose\u0000ware.com", "proseware..com", "a.org",
                         "my_company.com" };
      IdnMapping idn = new IdnMapping();

      foreach (var name in names) {
         try {
            string punyCode = idn.GetAscii(name);
            string name2 = idn.GetUnicode(punyCode);
            Console.WriteLine("{0} --> {1} --> {2}", name, punyCode, name2);
            Console.WriteLine("Original: {0}", ShowCodePoints(name));
            Console.WriteLine("Restored: {0}", ShowCodePoints(name2));
         }
         catch (ArgumentException) {
            Console.WriteLine("{0} is not a valid domain name.", name);
         }
         Console.WriteLine();
      }
   }

   private static string ShowCodePoints(string str1)
   {
      string output = "";
      foreach (var ch in str1)
         output += $"U+{(ushort)ch:X4} ";

      return output;
   }
}
// The example displays the following output:
//    bücher.com --> xn--bcher-kva.com --> bücher.com
//    Original: U+0062 U+00FC U+0063 U+0068 U+0065 U+0072 U+002E U+0063 U+006F U+006D
//    Restored: U+0062 U+00FC U+0063 U+0068 U+0065 U+0072 U+002E U+0063 U+006F U+006D
//
//    мойдомен.рф --> xn--d1acklchcc.xn--p1ai --> мойдомен.рф
//    Original: U+043C U+043E U+0439 U+0434 U+043E U+043C U+0435 U+043D U+002E U+0440 U+0444
//    Restored: U+043C U+043E U+0439 U+0434 U+043E U+043C U+0435 U+043D U+002E U+0440 U+0444
//
//    παράδειγμα.δοκιμή --> xn--hxajbheg2az3al.xn--jxalpdlp --> παράδειγμα.δοκιμή
//    Original: U+03C0 U+03B1 U+03C1 U+03AC U+03B4 U+03B5 U+03B9 U+03B3 U+03BC U+03B1 U+002E U+03B4 U+03BF U+03BA U+03B9 U+03BC U+03AE
//    Restored: U+03C0 U+03B1 U+03C1 U+03AC U+03B4 U+03B5 U+03B9 U+03B3 U+03BC U+03B1 U+002E U+03B4 U+03BF U+03BA U+03B9 U+03BC U+03AE
//
//    mycharity。org --> mycharity.org --> mycharity.org
//    Original: U+006D U+0079 U+0063 U+0068 U+0061 U+0072 U+0069 U+0074 U+0079 U+3002 U+006F U+0072 U+0067
//    Restored: U+006D U+0079 U+0063 U+0068 U+0061 U+0072 U+0069 U+0074 U+0079 U+002E U+006F U+0072 U+0067
//
//    prose ware.com is not a valid domain name.
//
//    proseware..com is not a valid domain name.
//
//    a.org --> a.org --> a.org
//    Original: U+0061 U+002E U+006F U+0072 U+0067
//    Restored: U+0061 U+002E U+006F U+0072 U+0067
//
//    my_company.com --> my_company.com --> my_company.com
//    Original: U+006D U+0079 U+005F U+0063 U+006F U+006D U+0070 U+0061 U+006E U+0079 U+002E U+0063 U+006F U+006D
//    Restored: U+006D U+0079 U+005F U+0063 U+006F U+006D U+0070 U+0061 U+006E U+0079 U+002E U+0063 U+006F U+006D
Imports System.Globalization

Module Example
   Public Sub Main()
      Dim names() As String = { "bücher.com", "мойдомен.рф", "παράδειγμα.δοκιμή",
                                "mycharity" + ChrW(&h3002) + "org",
                                "prose" + ChrW(0) + "ware.com", "proseware..com", "a.org", 
                                "my_company.com" }
      Dim idn As New IdnMapping()
      
      For Each name In names
         Try
            Dim punyCode As String = idn.GetAscii(name)
            Dim name2 As String = idn.GetUnicode(punyCode)
            Console.WriteLine("{0} --> {1} --> {2}", name, punyCode, name2) 
            Console.WriteLine("Original: {0}", ShowCodePoints(name))
            Console.WriteLine("Restored: {0}", ShowCodePoints(name2))
         Catch e As ArgumentException 
            Console.WriteLine("{0} is not a valid domain name.", name)
         End Try
         Console.WriteLine()
      Next   
   End Sub
   
   Private Function ShowCodePoints(str1 As String) As String
      Dim output As String = ""
      For Each ch In str1
         output += String.Format("U+{0} ", Convert.ToUInt16(ch).ToString("X4"))
      Next
      Return output
   End Function
End Module
' The example displays the following output:
'    bücher.com --> xn--bcher-kva.com --> bücher.com
'    Original: U+0062 U+00FC U+0063 U+0068 U+0065 U+0072 U+002E U+0063 U+006F U+006D
'    Restored: U+0062 U+00FC U+0063 U+0068 U+0065 U+0072 U+002E U+0063 U+006F U+006D
'    
'    мойдомен.рф --> xn--d1acklchcc.xn--p1ai --> мойдомен.рф
'    Original: U+043C U+043E U+0439 U+0434 U+043E U+043C U+0435 U+043D U+002E U+0440 U+0444
'    Restored: U+043C U+043E U+0439 U+0434 U+043E U+043C U+0435 U+043D U+002E U+0440 U+0444
'    
'    παράδειγμα.δοκιμή --> xn--hxajbheg2az3al.xn--jxalpdlp --> παράδειγμα.δοκιμή
'    Original: U+03C0 U+03B1 U+03C1 U+03AC U+03B4 U+03B5 U+03B9 U+03B3 U+03BC U+03B1 U+002E U+03B4 U+03BF U+03BA U+03B9 U+03BC U+03AE
'    Restored: U+03C0 U+03B1 U+03C1 U+03AC U+03B4 U+03B5 U+03B9 U+03B3 U+03BC U+03B1 U+002E U+03B4 U+03BF U+03BA U+03B9 U+03BC U+03AE
'    
'    mycharity。org --> mycharity.org --> mycharity.org
'    Original: U+006D U+0079 U+0063 U+0068 U+0061 U+0072 U+0069 U+0074 U+0079 U+3002 U+006F U+0072 U+0067
'    Restored: U+006D U+0079 U+0063 U+0068 U+0061 U+0072 U+0069 U+0074 U+0079 U+002E U+006F U+0072 U+0067
'    
'    prose ware.com is not a valid domain name.
'    
'    proseware..com is not a valid domain name.
'    
'    a.org --> a.org --> a.org
'    Original: U+0061 U+002E U+006F U+0072 U+0067
'    Restored: U+0061 U+002E U+006F U+0072 U+0067
'    
'    my_company.com --> my_company.com --> my_company.com
'    Original: U+006D U+0079 U+005F U+0063 U+006F U+006D U+0070 U+0061 U+006E U+0079 U+002E U+0063 U+006F U+006D
'    Restored: U+006D U+0079 U+005F U+0063 U+006F U+006D U+0070 U+0061 U+006E U+0079 U+002E U+0063 U+006F U+006D

Remarks

The unicode parameter specifies a string of one or more labels that consist of valid Unicode characters. The labels are separated by label separators. The unicode parameter cannot begin with a label separator, but it can contain separators and can optionally end with one. The label separators are FULL STOP (period, U+002E), IDEOGRAPHIC FULL STOP (U+3002), FULLWIDTH FULL STOP (U+FF0E), and HALFWIDTH IDEOGRAPHIC FULL STOP (U+FF61). For example, the domain name "www.adatum.com" consists of the labels "www", "adatum", and "com", separated by periods.

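A label cannot contain characters that the IDNA standard prohibits, or characters that are disallowed by the current values of the AllowUnassigned and UseStd3AsciiRules properties.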

The GetAscii method converts all label separators to FULL STOP (period, U+002E).

If unicode contains no characters outside the US-ASCII character range and no characters within the US-ASCII character range are prohibited, the method returns unicode unchanged.
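The effect of prohibited ASCII characters can be sketched as follows. This is a minimal illustration, not part of the original sample: the class name Std3Demo and the helper TryEncode are hypothetical. It assumes that setting UseStd3AsciiRules to true causes LOW LINE (U+005F) to be rejected, whereas with the default settings an all-ASCII name such as "my_company.com" is returned unchanged, as the example output above shows.

```csharp
using System;
using System.Globalization;

public static class Std3Demo
{
    // Hypothetical helper: returns true and the encoded name if GetAscii
    // accepts the input, or false if it throws ArgumentException.
    public static bool TryEncode(IdnMapping idn, string name, out string ascii)
    {
        try
        {
            ascii = idn.GetAscii(name);
            return true;
        }
        catch (ArgumentException)
        {
            ascii = string.Empty;
            return false;
        }
    }

    public static void Main()
    {
        var relaxed = new IdnMapping();                           // UseStd3AsciiRules defaults to false
        var strict = new IdnMapping { UseStd3AsciiRules = true }; // stricter ASCII rules (assumption: rejects '_')

        foreach (var name in new[] { "a.org", "my_company.com" })
        {
            string relaxedResult = TryEncode(relaxed, name, out string r) ? r : "(invalid)";
            string strictResult = TryEncode(strict, name, out string s) ? s : "(invalid)";
            Console.WriteLine("{0}: default = {1}, UseStd3AsciiRules = {2}",
                              name, relaxedResult, strictResult);
        }
    }
}
```

Wrapping the call in a Try-style helper keeps the comparison on one line per name; production code would more likely let the ArgumentException propagate or log it.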

Notes to Callers

In .NET Framework 4.5, the IdnMapping class supports different versions of the IDNA standard, depending on the operating system in use. See Unicode Technical Standard #46: IDNA Compatibility Processing for the differences in the way these versions handle particular sets of characters.


GetAscii(String, Int32)

Source:
IdnMapping.cs

Encodes a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.

public:
 System::String ^ GetAscii(System::String ^ unicode, int index);
public string GetAscii (string unicode, int index);
member this.GetAscii : string * int -> string
Public Function GetAscii (unicode As String, index As Integer) As String

Parameters

unicode
String

The string to convert, which consists of one or more domain name labels delimited with label separators.

index
Int32

A zero-based offset into unicode that specifies the start of the substring to convert. The conversion operation continues to the end of the unicode string.

Returns

The equivalent of the substring specified by the unicode and index parameters, consisting of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E) and formatted according to the IDNA standard.

Exceptions

ArgumentNullException

unicode is null.

ArgumentOutOfRangeException

index is less than zero.

-or-

index is greater than the length of unicode.

ArgumentException

unicode is invalid based on the AllowUnassigned and UseStd3AsciiRules properties, and the IDNA standard.

Remarks

The unicode and index parameters define a substring with one or more labels that consist of valid Unicode characters. The labels are separated by label separators. The substring cannot begin with a label separator, but it can contain separators and can optionally end with one. The label separators are FULL STOP (period, U+002E), IDEOGRAPHIC FULL STOP (U+3002), FULLWIDTH FULL STOP (U+FF0E), and HALFWIDTH IDEOGRAPHIC FULL STOP (U+FF61). For example, the domain name "www.adatum.com" consists of the labels "www", "adatum", and "com", separated by periods.

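A label cannot contain characters that the IDNA standard prohibits, or characters that are disallowed by the current values of the AllowUnassigned and UseStd3AsciiRules properties.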

The GetAscii method converts all label separators to FULL STOP (period, U+002E).

If the substring contains no characters outside the US-ASCII character range and no characters within the US-ASCII character range are prohibited, the method returns the substring unchanged.
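One situation where the index overload may be convenient is encoding only the domain part of an address. The following is a minimal sketch under that assumption; the class EmailDomainDemo and the helper EncodeDomain are hypothetical names introduced for illustration, and the address parsing is deliberately naive.

```csharp
using System;
using System.Globalization;

public static class EmailDomainDemo
{
    // Hypothetical helper: encodes only the domain part of an address and
    // leaves the local part (everything up to and including '@') untouched.
    public static string EncodeDomain(string address)
    {
        int at = address.LastIndexOf('@');
        if (at < 0 || at == address.Length - 1)
            throw new ArgumentException("Expected an address of the form local@domain.", nameof(address));

        var idn = new IdnMapping();

        // index = at + 1: conversion starts at the first domain character
        // and continues to the end of the string.
        string asciiDomain = idn.GetAscii(address, at + 1);

        return address.Substring(0, at + 1) + asciiDomain;
    }

    public static void Main()
    {
        Console.WriteLine(EncodeDomain("info@bücher.com"));
        // Expected, given the bücher.com mapping shown in the GetAscii(String) example:
        //    info@xn--bcher-kva.com
    }
}
```

Passing the offset avoids allocating an intermediate substring for the domain before the conversion.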

Notes to Callers

In .NET Framework 4.5, the IdnMapping class supports different versions of the IDNA standard, depending on the operating system in use. See Unicode Technical Standard #46: IDNA Compatibility Processing for the differences in the way these versions handle particular sets of characters.


GetAscii(String, Int32, Int32)

Source:
IdnMapping.cs

Encodes the specified number of characters in a substring of domain name labels that include Unicode characters outside the US-ASCII character range. The substring is converted to a string of displayable Unicode characters in the US-ASCII character range and is formatted according to the IDNA standard.

public:
 System::String ^ GetAscii(System::String ^ unicode, int index, int count);
public string GetAscii (string unicode, int index, int count);
member this.GetAscii : string * int * int -> string
Public Function GetAscii (unicode As String, index As Integer, count As Integer) As String

Parameters

unicode
String

The string to convert, which consists of one or more domain name labels delimited with label separators.

index
Int32

A zero-based offset into unicode that specifies the start of the substring.

count
Int32

The number of characters to convert in the substring that starts at the position specified by index in the unicode string.

Returns

The equivalent of the substring specified by the unicode, index, and count parameters, consisting of displayable Unicode characters in the US-ASCII character range (U+0020 to U+007E) and formatted according to the IDNA standard.

Exceptions

ArgumentNullException

unicode is null.

ArgumentOutOfRangeException

index or count is less than zero.

-or-

index is greater than the length of unicode.

-or-

index is greater than the length of unicode minus count.

ArgumentException

unicode is invalid based on the AllowUnassigned and UseStd3AsciiRules properties, and the IDNA standard.
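The range conditions listed above can be exercised with a short sketch. The class RangeCheckDemo and the helper Describe are hypothetical names used only for illustration; the sketch assumes the out-of-range conditions surface as ArgumentOutOfRangeException and that an invalid name surfaces as ArgumentException, as the first example on this page suggests.

```csharp
using System;
using System.Globalization;

public static class RangeCheckDemo
{
    // Hypothetical helper: returns the encoded substring, or the name of
    // the exception type that GetAscii raised for the given arguments.
    public static string Describe(string unicode, int index, int count)
    {
        try
        {
            return new IdnMapping().GetAscii(unicode, index, count);
        }
        catch (ArgumentOutOfRangeException)   // derived type, so it is caught first
        {
            return "ArgumentOutOfRangeException";
        }
        catch (ArgumentException)
        {
            return "ArgumentException";
        }
    }

    public static void Main()
    {
        string name = "bücher.com";                            // 10 characters

        Console.WriteLine(Describe(name, 0, name.Length));     // whole string
        Console.WriteLine(Describe(name, -1, 3));              // index is less than zero
        Console.WriteLine(Describe(name, 11, 0));              // index is greater than the length
        Console.WriteLine(Describe(name, 7, 4));               // index is greater than length minus count
        Console.WriteLine(Describe("proseware..com", 0, 14));  // empty label, invalid name
    }
}
```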

Examples

The following example uses the GetAscii(String, Int32, Int32) method to convert an internationalized domain name to a domain name that complies with the IDNA standard. The GetUnicode(String, Int32, Int32) method then converts the standardized domain name back into the original domain name, but replaces the original label separators with the standard label separator.

// This example demonstrates the GetAscii and GetUnicode methods.
// For sake of illustration, this example uses the most complex
// form of those methods, not the most convenient.

using System;
using System.Globalization;

class Sample
{
    public static void Main()
    {
/*
   Define a domain name consisting of the labels: GREEK SMALL LETTER
   PI (U+03C0); IDEOGRAPHIC FULL STOP (U+3002); GREEK SMALL LETTER
   THETA (U+03B8); FULLWIDTH FULL STOP (U+FF0E); and "com".
*/
    string name = "\u03C0\u3002\u03B8\uFF0Ecom";
    string international;
    string nonInternational;

    string msg1 = "the original non-internationalized \ndomain name:";
    string msg2 = "Allow unassigned characters?:     {0}";
    string msg3 = "Use non-internationalized rules?: {0}";
    string msg4 = "Convert the non-internationalized domain name to international format...";
    string msg5 = "Display the encoded domain name:\n\"{0}\"";
    string msg6 = "the encoded domain name:";
    string msg7 = "Convert the internationalized domain name to non-international format...";
    string msg8 = "the reconstituted non-internationalized \ndomain name:";
    string msg9 = "Visually compare the code points of the reconstituted string to the " +
                  "original.\n" +
                  "Note that the reconstituted string contains standard label " +
                  "separators (U+002e).";
// ----------------------------------------------------------------------------
    CodePoints(name, msg1);
// ----------------------------------------------------------------------------

    IdnMapping idn = new IdnMapping();

    Console.WriteLine(msg2, idn.AllowUnassigned);
    Console.WriteLine(msg3, idn.UseStd3AsciiRules);
    Console.WriteLine();
// ----------------------------------------------------------------------------
    Console.WriteLine(msg4);
    international = idn.GetAscii(name, 0, name.Length);
    Console.WriteLine(msg5, international);
    Console.WriteLine();
    CodePoints(international, msg6);
// ----------------------------------------------------------------------------
    Console.WriteLine(msg7);
    nonInternational = idn.GetUnicode(international, 0, international.Length);
    CodePoints(nonInternational, msg8);
    Console.WriteLine(msg9);
    }
// ----------------------------------------------------------------------------
    static void CodePoints(string value, string title)
    {
    Console.WriteLine("Display the Unicode code points of {0}", title);
    foreach (char c in value)
        {
        Console.Write("{0:x4} ", Convert.ToInt32(c));
        }
        Console.WriteLine();
        Console.WriteLine();
    }
}
/*
This code example produces the following results:

Display the Unicode code points of the original non-internationalized
domain name:
03c0 3002 03b8 ff0e 0063 006f 006d

Allow unassigned characters?:     False
Use non-internationalized rules?: False

Convert the non-internationalized domain name to international format...
Display the encoded domain name:
"xn--1xa.xn--txa.com"

Display the Unicode code points of the encoded domain name:
0078 006e 002d 002d 0031 0078 0061 002e 0078 006e 002d 002d 0074 0078 0061 002e 0063 006f
006d

Convert the internationalized domain name to non-international format...
Display the Unicode code points of the reconstituted non-internationalized
domain name:
03c0 002e 03b8 002e 0063 006f 006d

Visually compare the code points of the reconstituted string to the original.
Note that the reconstituted string contains standard label separators (U+002e).

*/
' This example demonstrates the GetAscii and GetUnicode methods.
' For sake of illustration, this example uses the most complex
' form of those methods, not the most convenient.

Imports System.Globalization

Class Sample
    Public Shared Sub Main()

'   Define a domain name consisting of the labels: GREEK SMALL LETTER
'   PI (U+03C0); IDEOGRAPHIC FULL STOP (U+3002); GREEK SMALL LETTER
'   THETA (U+03B8); FULLWIDTH FULL STOP (U+FF0E); and "com".

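        ' Build the name from explicit code points so that the nonstandard
        ' label separators (U+3002 and U+FF0E) are unambiguous.
        Dim name As String = ChrW(&H03C0) & ChrW(&H3002) & ChrW(&H03B8) & ChrW(&HFF0E) & "com"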
        Dim international As String
        Dim nonInternational As String

        Dim msg1 As String = "the original non-internationalized " & vbCrLf & "domain name:"
        Dim msg2 As String = "Allow unassigned characters?:     {0}"
        Dim msg3 As String = "Use non-internationalized rules?: {0}"
        Dim msg4 As String = "Convert the non-internationalized domain name to international format..."
        Dim msg5 As String = "Display the encoded domain name:" & vbCrLf & """{0}"""
        Dim msg6 As String = "the encoded domain name:"
        Dim msg7 As String = "Convert the internationalized domain name to non-international format..."
        Dim msg8 As String = "the reconstituted non-internationalized " & vbCrLf & "domain name:"
        Dim msg9 As String = "Visually compare the code points of the reconstituted string to the " & _
                             "original." & vbCrLf & _
                             "Note that the reconstituted string contains standard label " & _
                             "separators (U+002e)."
        ' ----------------------------------------------------------------------------
        CodePoints(name, msg1)
        ' ----------------------------------------------------------------------------
        Dim idn As New IdnMapping()

        Console.WriteLine(msg2, idn.AllowUnassigned)
        Console.WriteLine(msg3, idn.UseStd3AsciiRules)
        Console.WriteLine()
        ' ----------------------------------------------------------------------------
        Console.WriteLine(msg4)
        international = idn.GetAscii(name, 0, name.Length)
        Console.WriteLine(msg5, international)
        Console.WriteLine()
        CodePoints(international, msg6)
        ' ----------------------------------------------------------------------------
        Console.WriteLine(msg7)
        nonInternational = idn.GetUnicode(international, 0, international.Length)
        CodePoints(nonInternational, msg8)
        Console.WriteLine(msg9)
    End Sub

    ' ----------------------------------------------------------------------------
    Shared Sub CodePoints(ByVal value As String, ByVal title As String)
        Console.WriteLine("Display the Unicode code points of {0}", title)
        Dim c As Char
        For Each c In  value
            Console.Write("{0:x4} ", Convert.ToInt32(c))
        Next c
        Console.WriteLine()
        Console.WriteLine()

    End Sub
End Class
'
'This code example produces the following results:
'
'Display the Unicode code points of the original non-internationalized
'domain name:
'03c0 3002 03b8 ff0e 0063 006f 006d
'
'Allow unassigned characters?:     False
'Use non-internationalized rules?: False
'
'Convert the non-internationalized domain name to international format...
'Display the encoded domain name:
'"xn--1xa.xn--txa.com"
'
'Display the Unicode code points of the encoded domain name:
'0078 006e 002d 002d 0031 0078 0061 002e 0078 006e 002d 002d 0074 0078 0061 002e 0063 006f
'006d
'
'Convert the internationalized domain name to non-international format...
'Display the Unicode code points of the reconstituted non-internationalized
'domain name:
'03c0 002e 03b8 002e 0063 006f 006d
'
'Visually compare the code points of the reconstituted string to the original.
'Note that the reconstituted string contains standard label separators (U+002e).
'

Remarks

The unicode, index, and count parameters define a substring with one or more labels that consist of valid Unicode characters. The labels are separated by label separators. The substring cannot begin with a label separator, but it can contain separators and can optionally end with one. The label separators are FULL STOP (period, U+002E), IDEOGRAPHIC FULL STOP (U+3002), FULLWIDTH FULL STOP (U+FF0E), and HALFWIDTH IDEOGRAPHIC FULL STOP (U+FF61). For example, the domain name "www.adatum.com" consists of the labels "www", "adatum", and "com", separated by periods.

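A label cannot contain characters that the IDNA standard prohibits, or characters that are disallowed by the current values of the AllowUnassigned and UseStd3AsciiRules properties.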

The GetAscii method converts all label separators to FULL STOP (period, U+002E). If the substring contains no characters outside the US-ASCII character range, and no characters within the US-ASCII character range are prohibited, the method returns the substring unchanged.
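The index and count parameters are most useful when the domain name is embedded in a longer string. The following is a minimal sketch of that idea; the class HostDemo and the helper EncodeHost are hypothetical, and the host extraction is a naive scan for "//" and the next "/" rather than a real URL parser.

```csharp
using System;
using System.Globalization;

public static class HostDemo
{
    // Hypothetical helper: encodes only the host portion of a URL-like string.
    public static string EncodeHost(string url)
    {
        // Host starts after "//" if present, otherwise at the beginning.
        int start = url.IndexOf("//", StringComparison.Ordinal);
        start = start < 0 ? 0 : start + 2;

        // Host ends at the next '/' or at the end of the string.
        int end = url.IndexOf('/', start);
        if (end < 0)
            end = url.Length;

        var idn = new IdnMapping();
        string asciiHost = idn.GetAscii(url, start, end - start);

        // Reassemble: prefix + encoded host + remainder, both left untouched.
        return url.Substring(0, start) + asciiHost + url.Substring(end);
    }

    public static void Main()
    {
        // Same labels and separators as the example above, embedded in a URL.
        Console.WriteLine(EncodeHost("http://\u03C0\u3002\u03B8\uFF0Ecom/index.html"));
        // Expected, given the encoded name shown in the example output above:
        //    http://xn--1xa.xn--txa.com/index.html
    }
}
```

Only the characters between start and end are converted, so the scheme and path never pass through the IDNA rules.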

Notes to Callers

In .NET Framework 4.5, the IdnMapping class supports different versions of the IDNA standard, depending on the operating system in use. See Unicode Technical Standard #46: IDNA Compatibility Processing for the differences in the way these versions handle particular sets of characters.
