C# does not encode Uri properly/Web client does not load page when uri has a unicode character

AlienDeal 21 Reputation points
2022-10-29T17:36:42.107+00:00

Hi,

I was trying to get data from the URL https://hanziyuan.net/#字 . In percent-encoding this is

https://hanziyuan.net/#%E5%AD%97

No matter what I do, the data that loads is from the default page
https://hanziyuan.net/#%E8%BD%A6
https://hanziyuan.net/#车

The code I used is given below. It seems the encoded part is not getting passed on to the server
by the C# client.

using System;
using System.Net;
using System.Text;

public class HelloWorld
{
    public static void Main(string[] args)
    {
        WebClient wc = new WebClient();

        // Download the raw bytes and decode them as UTF-8.
        byte[] raw = wc.DownloadData(new Uri("https://hanziyuan.net/#%E5%AD%97"));
        string webData = Encoding.UTF8.GetString(raw);

        Console.WriteLine(webData);
    }
}

The data that loads is from the default page:
https://hanziyuan.net/#%E8%BD%A6
https://hanziyuan.net/#车

while the data I expect from that code is from:
https://hanziyuan.net/#字
https://hanziyuan.net/#%E5%AD%97

I have also tried the string "https://hanziyuan.net/#字". Nothing seems to work!

Accepted answer
  1. P a u l 10,496 Reputation points
    2022-10-29T17:56:58.697+00:00

    Both https://hanziyuan.net/#字 and https://hanziyuan.net/ refer to the same page, except the former has a document fragment (note the # before 字).

    This means that the HTML you're downloading with your DownloadData call is identical for both, and the page routing is done on the client side (using hash routing, see link: https://blog.bitsrc.io/using-hashed-vs-nonhashed-url-paths-in-single-page-apps-a66234cefc96)
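
    To see this from C#, here is a minimal sketch. The class and method names (FragmentDemo, RequestTarget, Fragment) are made up for illustration and are not part of the original code. It only parses the URLs with System.Uri and makes no network call; the point is that the path and query, which is what an HTTP client puts on the request line, come out the same with or without the fragment.

```csharp
using System;

public static class FragmentDemo
{
    // Path and query are the part of the URL an HTTP client sends to the server.
    public static string RequestTarget(string url) => new Uri(url).PathAndQuery;

    // The fragment (everything from '#') is parsed separately and stays on the client.
    public static string Fragment(string url) => new Uri(url).Fragment;

    public static void Main()
    {
        string[] urls =
        {
            "https://hanziyuan.net/",
            "https://hanziyuan.net/#%E5%AD%97",
            "https://hanziyuan.net/#%E8%BD%A6",
        };

        foreach (string url in urls)
        {
            Console.WriteLine(url);
            Console.WriteLine($"  request target: {RequestTarget(url)}");
            Console.WriteLine($"  fragment:       {Fragment(url)}");
        }
    }
}
```

    All three URLs produce the request target "/", so the server sees the same request each time and has no way to know which character was asked for.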


1 additional answer

  1. Bruce (SqlWork.com) 61,731 Reputation points
    2022-10-29T18:47:52.087+00:00

    The # in a URL is a bookmark (fragment). It is not supposed to be sent to the server; after the response arrives, the browser uses it to position the page. With JavaScript routing, it is used to navigate to an internal page. In the sample URL, JavaScript uses the bookmark to make an AJAX call that fetches the custom display content.

    If you read the JavaScript, you can probably reverse engineer how the page works and do the same in C#.
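
    As a rough sketch of what that could look like, assuming the page's script turns the fragment into a simple GET request: the endpoint https://example.invalid/lookup and the parameter name chara below are placeholders, not the site's real API, and the class and method names are made up as well. The real URL, HTTP method and parameters would have to come from the site's JavaScript or from the browser's network tab.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class HashRouteClient
{
    // HYPOTHETICAL endpoint and parameter name; replace with whatever the page's script really calls.
    private const string AssumedEndpoint = "https://example.invalid/lookup";
    private const string AssumedParameter = "chara";

    // Extracts the character from a hash-routed URL, e.g. "#%E5%AD%97" becomes "字".
    public static string CharacterFromFragment(string pageUrl)
    {
        string fragment = new Uri(pageUrl).Fragment;
        return Uri.UnescapeDataString(fragment.TrimStart('#'));
    }

    // Builds the request URL, moving the character into the query string so the server receives it.
    public static string BuildLookupUrl(string character) =>
        AssumedEndpoint + "?" + AssumedParameter + "=" + Uri.EscapeDataString(character);

    // Performs the request that the browser's script would otherwise make.
    public static Task<string> FetchAsync(HttpClient http, string pageUrl) =>
        http.GetStringAsync(BuildLookupUrl(CharacterFromFragment(pageUrl)));

    public static void Main()
    {
        string character = CharacterFromFragment("https://hanziyuan.net/#%E5%AD%97");

        // No network call here, since the endpoint is only a placeholder.
        Console.WriteLine(BuildLookupUrl(character));
    }
}
```

    The key difference from the original code is that the character travels in the query string (or a request body) instead of the fragment, so it actually reaches the server.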
