Output Encoding

Hi, Anil Chintala here.

I am a developer on the CISG team, working out of the Hyderabad campus in India. I am responsible for building security software for the information security group within Microsoft IT. I have a bachelor's degree in mechanical engineering and have worked in various roles, from development to managing a dev team for a startup in India, delivering technical solutions and managing customer relations. Along the way I gained more knowledge of cryptography, general security awareness, security techniques and secure coding. Before joining Microsoft, I worked as a consultant for the ACE Team in Redmond and was involved in designing and building business-critical applications supporting their information security program. In December 2007 I left V-Empower and the United States to take up a full-time (or "FTE" in MSFT speak) position on the ACE Engineering team in India. I am currently working as a developer on the AntiXSS team, building the next generation of the AntiXSS library. I also just started my personal blog (can't believe I've waited this long), where I intend to post frequently on technical content, share interesting links and give my opinion on security, software engineering, process, tools and technologies. Apart from working for this amazing team, I enjoy watching movies and playing video games, and I recently started playing tennis to keep myself fit.

Today, as a gentle introduction, I'll try to show you how to prevent XSS vulnerabilities in your ASP.NET applications.

Cross-Site Scripting (XSS) vulnerabilities occur in your ASP.NET applications when unvalidated user input containing malicious script is rendered into a dynamically generated page and executed in the browser of whoever views it.

In general, XSS vulnerabilities can be prevented with the following countermeasures:

  1. Validate Input - Constrain all user input to an acceptable range, type and length, and restrict special characters.
  2. Encode Output - Encode any output sent to the browser that includes user input.

I'll leave "Validating Input" as a subject for another day (for more information on input validation, see How To: Use Regular Expressions to Constrain Input in ASP.NET) and limit the scope of this post to output encoding techniques for preventing XSS in ASP.NET applications. As I mentioned above, one way to prevent XSS vulnerabilities is to encode values before they are rendered to users.

The Microsoft patterns & practices guide How To: Prevent Cross-Site Scripting in ASP.NET makes the following valid recommendations, with excellent code examples:

  • Use the HttpUtility.HtmlEncode method to encode output if it contains input from the user or from other sources such as databases.
  • Similarly, use HttpUtility.UrlEncode to encode output URLs if they are constructed from input.
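As a minimal sketch of these two recommendations, the snippet below calls both methods on sample values. The class name, the helper names, and the "search.aspx?q=" link are hypothetical and used only for illustration; in a real page the untrusted values would come from the request or a database.

```csharp
using System;
using System.Web;

public static class OutputEncodingDemo
{
    // Encode untrusted text before placing it in HTML element content.
    public static string ForHtml(string untrusted)
    {
        return HttpUtility.HtmlEncode(untrusted);
    }

    // Encode an untrusted value before appending it to a URL query string.
    // "search.aspx" and "q" are hypothetical names, not from the guide.
    public static string SearchLink(string untrusted)
    {
        return "search.aspx?q=" + HttpUtility.UrlEncode(untrusted);
    }

    public static void Main()
    {
        Console.WriteLine(ForHtml("<b>bold</b>"));    // &lt;b&gt;bold&lt;/b&gt;
        Console.WriteLine(SearchLink("tom & jerry")); // search.aspx?q=tom+%26+jerry
    }
}
```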

The HttpUtility.HtmlEncode and HttpUtility.UrlEncode methods prevent XSS when an attack depends on characters like "<", ">" and "&", but code can still be vulnerable when the user input relies on characters outside this limited set. Below are examples showing how code can remain vulnerable even after using the HttpUtility.HtmlEncode method.

Insecure Example 1 - VulnerablePage.aspx

    <input type=text value=<%= HttpUtility.HtmlEncode(Request.QueryString["name"]) %> ></input>
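To see why the markup above is still exploitable, here is a small sketch that builds the same unquoted attribute in plain C#. The payload is a hypothetical example chosen because it contains none of the characters HtmlEncode escapes, so it passes through unchanged and adds a new event-handler attribute to the tag.

```csharp
using System;
using System.Web;

public static class UnquotedAttributeDemo
{
    // Mirrors the page markup above: the encoded value lands in an unquoted attribute.
    public static string RenderInput(string name)
    {
        return "<input type=text value=" + HttpUtility.HtmlEncode(name) + " >";
    }

    public static void Main()
    {
        // Hypothetical payload: no <, >, & or quotes, so HtmlEncode returns it as-is.
        string payload = "x onmouseover=alert(1)";
        Console.WriteLine(RenderInput(payload));
        // Prints: <input type=text value=x onmouseover=alert(1) >
    }
}
```

The space ends the unquoted value, so the browser treats onmouseover=alert(1) as a separate attribute.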

Insecure Example 2 - VulnerablePage.aspx

    <head runat="server">
    <title>Untitled Page</title>

    <script>
    function fnEvil()
      {
         var id = '<%= Server.HtmlEncode(name)%>';
      }
    </script>

    </head>

Insecure Example 2 - VulnerablePage.aspx.cs

    protected string name = string.Empty;

    protected void Page_Load(object sender, EventArgs e)
    {
        // Untrusted value taken straight from the query string.
        name = Request.QueryString["name"];

        // Writes the HTML-encoded form value directly to the response.
        Response.Write(HttpUtility.HtmlEncode(Request.Form["name"]));
    }

Now consider a user input like " '; alert(XSS);// ", which results in the following insecure output.

    <script type="text/javascript">
    function fnEvil()
    {
        var id = ''; alert(XSS);//';
    }
    </script>

The reason is that System.Web.HttpUtility follows a principle of exclusion, escaping only known dangerous characters (such as <, > and &), whereas the AntiXSS library follows a principle of inclusion: it lets only a small set of safe characters through unencoded and encodes everything else. The safe characters are:

  • a-z (lower case)
  • A-Z (upper case)
  • 0-9 (numeric values)
  • , (comma)
  • . (period)
  • _ (underscore)
  • - (dash)
  • Space (excluded for UrlEncode)
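The principle of inclusion can be sketched in a few lines. This is not the AntiXSS library's implementation, only an illustration that assumes the safe list above and encodes every other character as a decimal numeric character reference. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class WhitelistEncoderSketch
{
    // True only for the characters in the safe list above.
    public static bool IsSafe(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
               (c >= '0' && c <= '9') ||
               c == ',' || c == '.' || c == '_' || c == '-' || c == ' ';
    }

    // Hypothetical helper: safe characters pass through, everything else
    // becomes &#NN;. Works on UTF-16 code units; surrogate pairs are not
    // combined in this sketch.
    public static string EncodeForHtml(string input)
    {
        if (string.IsNullOrEmpty(input)) return string.Empty;
        var sb = new StringBuilder(input.Length * 2);
        foreach (char c in input)
        {
            if (IsSafe(c)) sb.Append(c);
            else sb.Append("&#").Append((int)c).Append(';');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(EncodeForHtml("x onmouseover=alert(1)"));
        // Prints: x onmouseover&#61;alert&#40;1&#41;
    }
}
```

Because the decision is "is this character known to be safe?" rather than "is this character known to be dangerous?", characters such as =, ( and ) are encoded without anyone having to anticipate them.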

Below is sample code using the encoding functions from the AntiXSS library.

Secure Example 1:

    <input type=text value=<%= AntiXss.HtmlAttributeEncode(Request.QueryString["name"]) %> ></input>

Secure Example 2 - VulnerablePage.aspx

    <head runat="server">
    <title>Untitled Page</title>

    <script>
    function fnEvil()
      {
         var id = <%= AntiXss.JavaScriptEncode(name)%>;
      }
    </script>

    </head>

In the above scenario, user input is used in a JavaScript context. AntiXss provides the JavaScriptEncode method, which encodes unsafe characters using \xSINGLE_BYTE_HEX and \uDOUBLE_BYTE_HEX notation and also wraps the output in single quotes to make it a string literal.
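The \x and \u scheme described above can be sketched as follows. This is an illustration only, not the library's source: it assumes the safe-character list given earlier in this post, and the class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class JavaScriptEncoderSketch
{
    // Same safe list as earlier in the post (assumed here).
    static bool IsSafe(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
               (c >= '0' && c <= '9') ||
               c == ',' || c == '.' || c == '_' || c == '-' || c == ' ';
    }

    // Returns a single-quoted JavaScript string literal.
    public static string Encode(string input)
    {
        var sb = new StringBuilder("'");
        foreach (char c in input ?? string.Empty)
        {
            if (IsSafe(c)) sb.Append(c);
            else if (c <= 0xFF) sb.Append("\\x").Append(((int)c).ToString("x2")); // \xHH
            else sb.Append("\\u").Append(((int)c).ToString("x4"));                // \uHHHH
        }
        return sb.Append('\'').ToString();
    }

    public static void Main()
    {
        Console.WriteLine("var id = " + Encode("'; alert(XSS);//") + ";");
        // Prints: var id = '\x27\x3b alert\x28XSS\x29\x3b\x2f\x2f';
    }
}
```

Since the quote character itself is emitted as \x27, the input can no longer close the string literal and start a new statement.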

Now, given the same input " '; alert(XSS);// ", AntiXSS generates the following safe output.

    <script type="text/javascript">
    function fnEvil()
    {
        var id = '\x27\x3b alert\x28XSS\x29\x3b\x2f\x2f';
    }
    </script>

The code sample above demonstrates that the white-list approach of the AntiXSS library, which encodes everything except a small set of safe characters, provides better protection than the classic HtmlEncode and UrlEncode utilities, which encode only known bad characters. I like AntiXSS because it looks for "good things" and not "bad things". :)

Thanks and more later...

Comments

  • Anonymous
    September 09, 2008
    Why does your example use HTML entity encoding inside a JavaScript block? This should use JavaScript escaping. The spec defines a few special characters that have specific encodings. All characters not known to be safe should use the \xHH or \uHHHH format.
  • Anonymous
    September 10, 2008
    Thank you, Jeff, for pointing out the wrong method used in the sample code. I have corrected it now. Appreciate your input.