Output Encoding

Hi, Anil Chintala here...

I am a developer on the CISG team, working out of the Hyderabad campus in India. I am responsible for building security software for the information security group within Microsoft IT. I have a bachelor's degree in mechanical engineering, and I have worked in various roles, from development to managing a dev team for a startup in India, delivering technical solutions and managing customer relations, where I gained more knowledge of cryptography, general security awareness, and secure coding techniques. Before joining Microsoft, I worked as a consultant for the ACE Team in Redmond and was involved in designing and building business-critical applications supporting their information security program. In December 2007 I left V-Empower and the United States to take up a full-time (or "FTE" in MSFT speak) position on the ACE Engineering team in India. I am currently working as a developer on the AntiXSS team, building the next generation of the AntiXSS library. I also just started my personal blog (can't believe I've waited this long), where I intend to post frequently on technical content, share interesting links, and give my opinion on security, software engineering, process, tools, and technologies. Apart from working for this amazing team, I enjoy watching movies and playing video games, and I recently started playing tennis to keep myself fit.

Today, as a gentle introduction, I'll try to show you how to prevent XSS vulnerabilities in your ASP.NET applications.

Cross-Site Scripting (XSS) vulnerabilities occur in your ASP.NET applications when unvalidated user input is included in a dynamically generated page and the browser executes it as script while the page is viewed.

In general, XSS vulnerabilities can be prevented by the following countermeasures:

Validate Input - Constrain all user input, including special characters, to an acceptable range, type, and length.
Encode Output - Encode any output sent to the browser that includes user input.

I'll treat "Validating Input" as a subject for another day (for more information on input validation, see How To: Use Regular Expressions to Constrain Input in ASP.NET) and limit the scope of this post to output encoding techniques that prevent XSS in ASP.NET applications. As I mentioned above, one way to prevent XSS vulnerabilities is to encode values before they are rendered to users.

The Microsoft patterns & practices guide How To: Prevent Cross-Site Scripting in ASP.NET makes the following valid recommendations, with excellent code examples:

Use the HttpUtility.HtmlEncode method to encode output if it contains input from the user or from other sources such as databases.
Similarly, use HttpUtility.UrlEncode to encode output URLs if they are constructed from input.

Although the HttpUtility.HtmlEncode and HttpUtility.UrlEncode methods prevent XSS when the attack relies on characters like "<", ">" and "&", code can still be vulnerable when user input contains dangerous characters outside this limited set. Below are examples that show how code can be vulnerable even after using the HttpUtility.HtmlEncode method.

Insecure Example 1 - VulnerablePage.aspx

    <input type=text value=<%= HttpUtility.HtmlEncode(Request.QueryString["name"]) %> ></input>
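
Example 1 is exploitable because the attribute value is not quoted, so input made up only of "harmless" characters can still add a new attribute. The JavaScript sketch below imitates an exclusion-style encoder to show the effect. The names minimalHtmlEncode and renderInput and the payload are assumptions made for illustration; this is not the HttpUtility implementation.

```javascript
// Hypothetical stand-in for an exclusion-style encoder: it escapes only
// a short list of known-dangerous characters and passes everything else.
function minimalHtmlEncode(input) {
  const replacements = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };
  return input.replace(/[&<>"]/g, (ch) => replacements[ch]);
}

// Mirrors the unquoted attribute in Insecure Example 1.
function renderInput(value) {
  return "<input type=text value=" + minimalHtmlEncode(value) + " ></input>";
}

// Illustrative payload: no character in it is on the exclusion list.
const payload = "x onmouseover=alert(1)";
console.log(renderInput(payload));
// prints <input type=text value=x onmouseover=alert(1) ></input>
```

Because nothing in the payload is on the escaped list, the encoder returns it untouched and the browser parses onmouseover as a second attribute of the input element.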

Insecure Example 2 - VulnerablePage.aspx

    <head runat="server">
    <title>Untitled Page</title>

    <script>
    function fnEvil()
    {
        var id = '<%= Server.HtmlEncode(name)%>';
    }
    </script>

    </head>

Insecure Example 2 - VulnerablePage.aspx.cs

    protected string name = string.Empty;

    protected void Page_Load(object sender, EventArgs e)
    {
        name = Request.QueryString["name"];
    }

    Response.Write(HttpUtility.HtmlEncode(Request.Form["name"]));

Now consider user input such as " '; alert(XSS);// ", which results in the following insecure output.

    <script type="text/javascript">
    function fnEvil()
    {
        var id = ''; alert(XSS);//';
    }
    </script>

The reason is that System.Web.HttpUtility follows a principle of exclusion, escaping only the known dangerous characters (such as <, >, and &), whereas the AntiXSS library follows a principle of inclusion: it lets only a small set of safe characters pass through unencoded and encodes everything else. The safe characters are:

a-z (lower case)
A-Z (upper case)
0-9 (numeric values)
, (comma)
. (period)
_ (underscore)
- (dash)
Space (excluded for UrlEncode)
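
The inclusion principle can be sketched in a few lines of JavaScript. The helper name allowListAttributeEncode and the choice of decimal numeric character references are assumptions made for illustration, not a copy of the AntiXSS internals; the character class simply mirrors the safe list above.

```javascript
// Characters from the safe list above: letters, digits, comma, period,
// underscore, dash, and space.
const SAFE_CHARS = /^[a-zA-Z0-9,._\- ]$/;

// Hypothetical inclusion-style encoder: anything that is not on the safe
// list is replaced by a decimal numeric character reference.
function allowListAttributeEncode(input) {
  let out = "";
  for (const ch of input) { // iterates by Unicode code point
    out += SAFE_CHARS.test(ch) ? ch : "&#" + ch.codePointAt(0) + ";";
  }
  return out;
}

console.log(allowListAttributeEncode("'; alert(XSS);//"));
// prints &#39;&#59; alert&#40;XSS&#41;&#59;&#47;&#47;
```

The encoder never needs to know which characters are dangerous in a given context; a character is encoded simply because it is not on the list.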

Below is sample code that uses the encoding functions from the AntiXSS library.

Secure Example 1:

    <input type=text value=<%= AntiXss.HtmlAttributeEncode(Request.QueryString["name"]) %> ></input>

Secure Example 2 - VulnerablePage.aspx

    <head runat="server">
    <title>Untitled Page</title>

    <script>
    function fnEvil()
    {
        var id = <%= AntiXss.JavaScriptEncode(name)%>;
    }
    </script>

    </head>

In the above scenario, user input is used in a JavaScript context. AntiXss provides the JavaScriptEncode method for this case: it encodes unsafe characters using \xSINGLE_BYTE_HEX and \uDOUBLE_BYTE_HEX notation and wraps the output in single quotes to make it a string literal, which is why the page does not add its own quotes around the expression.
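
As a rough sketch of that behavior, the JavaScript function below applies the same safe list, emits \xHH for code units below 256 and \uHHHH for the rest, and wraps the result in single quotes. The name allowListJavaScriptEncode is hypothetical, and the details are assumptions based on the description above rather than the library's actual source.

```javascript
// Hypothetical sketch of an inclusion-style JavaScript string encoder.
function allowListJavaScriptEncode(input) {
  let out = "'";
  for (let i = 0; i < input.length; i++) { // walks UTF-16 code units
    const ch = input[i];
    const code = input.charCodeAt(i);
    if (/^[a-zA-Z0-9,._\- ]$/.test(ch)) {
      out += ch; // on the safe list: pass through
    } else if (code < 256) {
      out += "\\x" + code.toString(16).padStart(2, "0"); // \xHH
    } else {
      out += "\\u" + code.toString(16).padStart(4, "0"); // \uHHHH
    }
  }
  return out + "'";
}

console.log(allowListJavaScriptEncode("'; alert(XSS);//"));
// prints '\x27\x3b alert\x28XSS\x29\x3b\x2f\x2f'
```

The attacker's single quote becomes \x27, so it can no longer terminate the string literal that the encoder opened.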

Now, given the same input " '; alert(XSS);// ", AntiXSS generates the following safe output.

    <script type="text/javascript">
    function fnEvil()
    {
        var id = '\x27\x3b alert\x28XSS\x29\x3b\x2f\x2f';
    }
    </script>

The code sample above demonstrates that the white-list approach of the AntiXSS library provides stronger protection than the classic HtmlEncode and UrlEncode utilities: it encodes everything except a small set of safe characters, whereas the classic utilities encode only known bad characters. I like AntiXSS because it looks for "good things" and not "bad things". :)

Thanks and more later...

Comments

Anonymous
August 28, 2008
As promised, I am back sooner than you expected! and I know you are one of the two people who visit my
Anonymous
September 08, 2008
Anil Chintala here... I told you in my previous blog about AntiXSS Output Encoding methodology and why
Anonymous
September 09, 2008
Why does your example use HTML entity encoding inside a JavaScript block? This should use JavaScript escaping. The spec defines a few special characters that have specific encodings. All characters not known to be safe should use \xHH or \uHHHH format.
Anonymous
September 10, 2008
Thank you, Jeff, for pointing out the wrong method used in the sample code. I have corrected it now. Appreciate your input.

Last updated on 2008-08-28
