2022-01-21

Cross-domain Ajax: Implementation and Considerations

Dino Esposito | February 17, 2011

No Ajax features would be possible without the XMLHttpRequest (XHR) object that Web browsers started supporting about a decade ago. By using a script-created instance of the XHR object, you can bypass the classic browser-led machinery that connects you to a requested URL. The XHR object uses the browser’s low-level API to ultimately open a socket and send the HTTP packet out. Beyond that, everything else is your responsibility as a developer.

In particular, the biggest difference between a browser-led and an XHR-led request lies in what happens once the response has been downloaded. The browser just processes the response in order to display it. Your script, instead, can make potentially any use of the downloaded response, from building hopefully innocuous mashups to preparing cross-site scripting attacks. Because XHR sends cookies along with each request, a user authenticated on a site (say, dino.com) might end up on another site (say, badguy.com) while the authentication cookie is still valid. At this point, a script served by badguy.com could make an XHR request to dino.com; the browser would attach the user’s cookie, and the script would behave as if it were the original user.

For these reasons, browsers have always implemented a security model around XHR that prevents cross-domain calls. In other words, the browser’s implementation of the XHR object compares the URL being opened by XHR with the current URL. If the two URLs do not share the same protocol, host, and port, the call is denied. This is a policy that browsers chose to implement for security reasons, and it goes by the name of “Same Origin Policy” (SOP).
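
The comparison can be sketched in a few lines of JavaScript. This only illustrates the rule and is not the browser's actual code. The helper name sameOrigin is made up for this example, and it assumes the standard URL class found in current browsers and Node.js.

```javascript
// Illustrative only: browsers perform this check internally.
// sameOrigin is a hypothetical helper name, not a browser API.
function sameOrigin(pageUrl, requestUrl) {
  const page = new URL(pageUrl);
  // Relative request URLs are resolved against the page, as a browser would do.
  const target = new URL(requestUrl, pageUrl);
  // URL.origin combines protocol, host name, and port.
  return page.origin === target.origin;
}

console.log(sameOrigin("https://dino.com/home", "/public/getcustomer/123")); // true
console.log(sameOrigin("https://dino.com/home", "https://badguy.com/data")); // false
console.log(sameOrigin("https://dino.com/home", "http://dino.com/data"));    // false
```

The last call fails the check because only the protocol differs, which is enough to make the two URLs belong to different origins.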

Nobody complained about SOP until Ajax became popular and widely used. SOP prevents you from easily creating mashups and, more generally, from requesting data from a site that lives on a different host or uses a different protocol. This represents a serious limitation for developers of modern Ajax applications. Workarounds have been in the works for years, but we’re still looking for an official standard solution to the issue. The W3C has a working draft for something called Cross-Origin Resource Sharing (CORS), which defines a common ground for browsers and Web servers to interoperate and enable applications to perform secure cross-site data transfers. Some browsers currently support CORS to some extent and through different APIs. That is probably the way to go in the near future.

For the rest of the article, I’ll examine the various options we have today and the browser requirements of each of them.

Tools to Leverage

Given that SOP applied to XHR is a native browser feature, what can we leverage in order to bypass it? If cross-domain XHR calls are prohibited by browsers, what else can we do? Generally speaking, there are four possible approaches:

Using a server-side proxy
Using Silverlight or Flash applets and their workarounds to bypass SOP
Leveraging cross-domain enabled HTML tags such as <script> and <iframe>
Using ad hoc browser extensions specifically created to enable cross-domain XHR calls (i.e., CORS-based features)

These are the various options you might want to consider first as a software architect. These are the options that would work without requiring each single user to tweak security settings on her browser.

Before going any further, though, let me clarify a point. Cross-domain access may be subject to security and firewall policies other than the Same Origin Policy. If such policies are in place, bypassing SOP still doesn’t give you access to the remote resources you want.

Server-side Proxy

This is probably the most common approach. It doesn't limit in any way the power of client-side scripts; at the same time, it doesn’t incur any of the limitations imposed by SOP. The trick is simple: the client-side script uses XHR to call a proxy located on the same host that served the current page. Next, the proxy makes a server-to-server call to whatever site you like. Any results obtained are then forwarded to the client as the result of the XHR operation. The programming interface exposed by the proxy works as a façade for the real target of your calls.

Quite effective and absolutely safe, this solution always requires that bit of extra work which is just what has stimulated people to look for other, more direct, solutions. Most of the time, in fact, you end up working out a specific proxy for each external Web site you intend to call. Generic proxies that accept the target URL as a parameter exist but are, at the end of the day, a bit cumbersome to use and not always as flexible as you would like.

The time you likely need to invest in the creation of a server-side proxy can be better amortized if you are developing an application with multiple front-ends. In this case, you often just need a made-to-measure façade for the real remote data source providers. The façade will normalize downloaded data to the expected UI format. In this way, you may perceive the façade as a required part of your architecture rather than as an extra, if not unnecessary, cost.
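
To make the idea concrete, here is a minimal sketch of a site-specific proxy written for Node.js. It is only one possible shape, and everything in it is an assumption made for illustration: the /proxy/customers/{id} route, the UPSTREAMS table, and the function names. It also assumes a Node.js version that provides a global fetch function (version 18 or later).

```javascript
const http = require("http");

// Hypothetical allow-list: local proxy segment -> remote base URL.
const UPSTREAMS = {
  customers: "https://www.dino.com/public/getcustomer/"
};

// Maps a same-origin path such as /proxy/customers/123 to the remote URL,
// or returns null when the path is not one the proxy is willing to serve.
function resolveUpstream(localPath) {
  const match = /^\/proxy\/([a-z]+)\/(\d+)$/.exec(localPath);
  if (!match) return null;
  const known = Object.prototype.hasOwnProperty.call(UPSTREAMS, match[1]);
  return known ? UPSTREAMS[match[1]] + match[2] : null;
}

async function handleProxy(req, res) {
  const target = resolveUpstream(new URL(req.url, "http://localhost").pathname);
  if (!target) {
    res.writeHead(404);
    res.end();
    return;
  }
  try {
    // Server-to-server call: SOP does not apply here.
    const upstream = await fetch(target);
    const body = await upstream.text();
    res.writeHead(upstream.status, {
      "Content-Type": upstream.headers.get("content-type") || "application/json"
    });
    res.end(body);
  } catch (err) {
    res.writeHead(502); // the remote site could not be reached
    res.end();
  }
}

// Call startProxy(8080) to listen; nothing is started automatically.
function startProxy(port) {
  return http.createServer(handleProxy).listen(port);
}
```

The page would then call /proxy/customers/123 on its own host with plain XHR. Restricting the proxy to a fixed table of targets, rather than accepting any URL as a parameter, keeps it from being used as an open relay.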

Silverlight and Flash Cross-domain Machinery

Flash was the first Web API to concretely address the issue of cross-domain calls (and not just that, given that Flash is still a valid option today for socket-based Web programming). Years ago, Flash set up a custom, handshake-like protocol for Web browsers to place cross-domain requests to selected sites.

The security model of XHR is inadequate for modern applications not so much because it is too restrictive, but because it is blindly restrictive. When it comes to XHR, browsers just don’t allow cross-domain calls (or allow them on a specific client machine if the user takes responsibility for that). With the success of Ajax, there’s been a growing number of scenarios in which a fully legitimate cross-domain call is safe and, more importantly, necessary for the application. If the target server is aware of exposing information that others may request, why not accept calls just from selected callers? This is the point that Flash architects scored a while back.

Unlike Web browsers, Flash has its own networking machinery and can employ its own rules as far as SOP is concerned. When faced with a cross-domain call, the Flash runtime requires that a policy file be downloaded from the target domain before any access is allowed to any resources on that domain. The policy file is an XML file named crossdomain.xml, whose contents comply with a specific XML schema. For example, the following file will enable the Web server that hosts it to receive calls from Flash clients hosted in pages from the specified domain—dino.com.

<cross-domain-policy>
   <allow-access-from domain="dino.com" />
</cross-domain-policy>

It should be noted that Flash enforces the cross-domain policy—not the target site.

Starting with version 2, Silverlight supports the same policy file as Flash, plus another, slightly more expressive format. Note that in both Silverlight and Flash, the security policy system affects cross-domain operations regardless of the network API being used, be it XHR, sockets or other helper classes such as the WebClient class in Silverlight.

The additional Silverlight policy file has a different format than the Flash policy file and, of course, a different name. The file is named ClientAccessPolicy.xml and must be located in the root of the site. Silverlight first looks for this file via HTTP. If the file is missing, inaccessible, or has invalid content, then the Silverlight runtime looks for the Flash file and denies the call if it can’t get any valid policy information. Here’s a sample ClientAccessPolicy.xml file.

<access-policy>
  <cross-domain-access>
    <policy>
      <allow-from http-methods="*">       
        <domain uri="*"/>
      </allow-from>      
      <grant-to>      
        <resource path="/public" include-subpaths="true"/>
      </grant-to>      
    </policy>
  </cross-domain-access>
</access-policy>

As you can see, you can specify the domains from which you accept calls (and the permitted HTTP methods) through the allow-from element. In addition, the grant-to element specifies which section of the site the allowed caller can see. In the code snippet, all callers are granted access only to resources under the public path. This means that Silverlight would deny client calls directed at any path outside the public directory.
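
The effect of the sample policy can be modeled with a short JavaScript sketch. This is a deliberately simplified model, not how the Silverlight runtime is implemented. The samplePolicy object and the isCallAllowed function are hypothetical names, and the domain check handles only the wildcard and an exact match.

```javascript
// A simplified, hypothetical model of the sample policy above.
const samplePolicy = {
  allowFrom: ["*"],                                        // <domain uri="*"/>
  grantTo: [{ path: "/public", includeSubpaths: true }]    // <resource .../>
};

function isCallAllowed(policy, callerDomain, requestPath) {
  // Step 1: is the caller among the accepted domains?
  const domainOk = policy.allowFrom.some(
    (uri) => uri === "*" || uri === callerDomain
  );
  if (!domainOk) return false;

  // Step 2: does the requested path fall under a granted resource?
  return policy.grantTo.some(
    (res) =>
      requestPath === res.path ||
      (res.includeSubpaths && requestPath.startsWith(res.path + "/"))
  );
}

console.log(isCallAllowed(samplePolicy, "dino.com", "/public/getcustomer/123")); // true
console.log(isCallAllowed(samplePolicy, "dino.com", "/private/orders"));         // false
```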

Legacy Cross-domain Elements

If you’re not a server-side type of Web developer, and would rather implement a quick and dirty hack that possibly works across browsers, then you can arrange some ad hoc script code that leverages a couple of legacy cross-domain HTML elements. Both the <script> and <iframe> tags can be configured to download resources from just any site regardless of any origin policy that may be set.

Downloading content, however, is not the same as using it! In this respect, <script> and <iframe> are quite different and <script> is the tag you would focus on while looking for effective cross-domain tricks. An IFRAME can successfully download content from just about anywhere but browsers apply restrictive policies as far as scripting that content is concerned. Cross-frame scripting is not allowed if content comes from different domains. So you’re back at square one: how can you actually consume the downloaded content? The IFRAME trick proves helpful only when you need to upload data in a fire-and-forget manner to a cross-domain site.

With the <script> tag, instead, the downloaded content is restricted to JavaScript but it can be freely consumed from within the caller page. With a bit of help from the remote server, you can download usable data from a different domain in the form of a JavaScript string and process it on the client. This requires using the JSON with Padding (JSONP) protocol. A JSONP solution is effective and cross-browser, but can be used only with agreeable sites and in conformance with the rules they set.

First off, a JSONP-enabled Web site is a Web site exposing a public endpoint that agrees to return a JSON string padded with a call to a caller-defined JavaScript function. For example, suppose that dino.com exposes an endpoint like below:

https://www.dino.com/public/getcustomer/123

When invoked, the endpoint returns a JSON string. The following code snippet assumes the endpoint is handled through an ASP.NET MVC application. In general, it could be anything that can return an application/json content type.

public JsonResult GetCustomer(Int32 id)
{
    // Get some data in some way
    :
    var customer = new Customer {Id = id, ...};

    return Json(customer);
}  // returns a string like {"Id":"...", "CompanyName":"...", ...}

If you try to call the above URL via XHR, you likely get an access denied error. If you use the same URL within a <script> tag you successfully download the response of the method; except that you can’t do much to further process it.

<script type="text/javascript" 
        src="https://www.dino.com/public/getcustomer/123"></script>

A JSONP-enabled endpoint would rather wrap the JSON output in a call to a JavaScript function that is defined locally within the context of the caller page. The JSONP output would look like below:

myHandler({"Id":"...", "CompanyName":"...", ...});

Because all browsers evaluate any content downloaded via a <script> tag immediately, this trick lets you invoke some cross-domain code and process its output locally. Here’s a fragment of a sample HTML page that works this way:

<html>
<head>
    <!-- Define the handler first so that it exists when the remote script runs -->
    <script type="text/javascript">
        function myHandler(jsonData) {
           alert(jsonData.Id + "  " + jsonData.CompanyName);
        }
    </script>
    <script type="text/javascript" 
            src="https://www.dino.com/public/getcustomer/123"></script>
</head>
<body>
    :
</body>
</html>

The only point to be clarified is how you can let the server know about the name of the local JavaScript function to be used to wrap the JSON return data. That depends on the API exposed by the JSONP-enabled server. A popular JSONP-enabled Web site such as Flickr requires that you add a jsoncallback query string parameter to the URL. If you control the site yourself, then the API is up to you! Most of the time it will be the name of a query string parameter. Here’s how to rewrite the ASP.NET MVC method to support JSONP. In this case, the server API expects a query string parameter named callback to indicate the name of the local function.

public JsonpResult GetCustomer(Int32 id)
{
    // Get some data in some way
    :
    var customer = new Customer {Id = id, ...};

    return this.Jsonp(customer, JsonRequestBehavior.AllowGet);
}

public class JsonpResult : JsonResult
{
    private const String JsonpCallbackName = "callback";
    public override void ExecuteResult(ControllerContext context)
    {
       if (context == null)
          throw new ArgumentNullException("context");
       var response = context.HttpContext.Response;
       response.ContentType = !String.IsNullOrEmpty(ContentType) 
                                       ? ContentType : "application/json";
       if (ContentEncoding != null)
           response.ContentEncoding = ContentEncoding;
       if (Data == null) 
           return;

       var request = context.HttpContext.Request;
       var serializer = new JavaScriptSerializer();
       var buffer = request[JsonpCallbackName] != null 
                      ? String.Format("{0}({1})", 
                            request[JsonpCallbackName], serializer.Serialize(Data)) 
                      : serializer.Serialize(Data);
       response.Write(buffer);
    }
}

public static class JsonpExtensions
{
    public static JsonpResult Jsonp(this Controller controller, 
             Object data, JsonRequestBehavior behavior)
    {
        return new JsonpResult
        {
            Data = data,
            JsonRequestBehavior = behavior
        };
    }
}

As a result, try linking the following URL to the src attribute of a <script> tag:

https://www.dino.com/public/getcustomer/123?callback=foo

The response the browser gets is

foo({ ... })

As long as you have a foo JavaScript function in the local page, everything works and your code will actually process data downloaded from a cross-domain site.
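
The wrapping step that the server performs is small enough to sketch in JavaScript as well. The toJsonp helper below is a hypothetical name, and the customer values are sample data. Unlike the listing above, it also rejects callback names that are not plain identifiers, which is an extra precaution assumed here and not part of the original code.

```javascript
// Hypothetical helper mirroring the wrapping step; not part of any library.
function toJsonp(callbackName, data) {
  const json = JSON.stringify(data);
  if (!callbackName) return json; // plain JSON when no callback is given

  // Assumption: only simple identifiers are accepted, so the query string
  // cannot be used to inject arbitrary script into the response.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error("Invalid callback name");
  }
  return callbackName + "(" + json + ")";
}

console.log(toJsonp("foo", { Id: 123, CompanyName: "Contoso" }));
// foo({"Id":123,"CompanyName":"Contoso"})
```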

If you use a <script> tag, however, the remote call would be placed as the page loads up. If you intend to download the response on demand, then you can create the <script> element programmatically and add it to the DOM. Alternatively, you can resort to the $.getScript method of the jQuery library which does just the same for you.

function buttonClicked() {
   var url = "https://www.dino.com/public/getcustomer/123?callback=foo";
   $.getScript(url, function() {});
}
function foo(jsonData) {
   // Process response here
}
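
If you prefer not to depend on jQuery, creating the <script> element yourself might look like the sketch below. The requestJsonp function and its parameters are hypothetical names, the code assumes the server reads a query string parameter named callback, and it uses current JavaScript syntax (globalThis, default parameters). The doc parameter exists only so the function can be exercised outside a browser.

```javascript
let jsonpCounter = 0;

// requestJsonp is a hypothetical helper for this sketch.
// doc defaults to the page's document when running in a browser.
function requestJsonp(baseUrl, onData, doc = document) {
  const callbackName = "jsonpCallback" + (jsonpCounter++);
  const script = doc.createElement("script");

  // The padded response calls this global function; clean up afterwards.
  globalThis[callbackName] = function (data) {
    delete globalThis[callbackName];
    if (script.parentNode) script.parentNode.removeChild(script);
    onData(data);
  };

  const separator = baseUrl.includes("?") ? "&" : "?";
  script.src = baseUrl + separator + "callback=" + callbackName;
  doc.head.appendChild(script); // the download starts here
  return callbackName;
}

function buttonClicked() {
  requestJsonp("https://www.dino.com/public/getcustomer/123", function (customer) {
    // Process response here
  });
}
```

Generating a fresh callback name per request lets several calls be in flight at once without their handlers overwriting one another.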

For a deeper overview of JSONP, refer to Wikipedia and its links: https://en.wikipedia.org/wiki/JSON.

CORS-based Features

Recently the W3C released a working draft of Cross-Origin Resource Sharing, a mechanism intended to become the standard way of enabling client-side requests directed to external domains. You can find the latest draft at https://www.w3.org/TR/cors/. The idea develops around the same concept we find implemented in Flash and Silverlight: the browser forwards a cross-site request only if the target site explicitly declares that it accepts it. What’s different between Flash/Silverlight and future browsers is the handshake protocol. For browsers, no XML policy file has to be deployed on the accepting server. Instead, CORS entails that browsers issue a cross-domain request by adding a special request header such as below:

Origin: https://samples.msdn.microsoft.com

A Web site that intends to support cross-domain calls should check this header and decide whether, given the origin server, it intends to reply. If so, it just sends out any response plus a bunch of HTTP response headers such as

Access-Control-Allow-Origin: *

Lacking the response header, the browser may refuse to accept any response. CORS is expected to be an extension to the XHR specification from W3C.
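
On the server, the decision boils down to comparing the Origin header with the callers you are willing to serve. The sketch below shows one way to express that in JavaScript. The corsHeadersFor function and the ALLOWED_ORIGINS list are hypothetical names chosen for this example, and the list itself is an assumption.

```javascript
// Hypothetical allow-list of origins this site agrees to serve.
const ALLOWED_ORIGINS = ["https://samples.msdn.microsoft.com"];

// Returns the extra response headers for a request, given its Origin header.
function corsHeadersFor(origin, allowedOrigins = ALLOWED_ORIGINS) {
  if (!origin || !allowedOrigins.includes(origin)) {
    // No CORS header: the browser will not hand the response to the script.
    return {};
  }
  return {
    "Access-Control-Allow-Origin": origin,
    // The response differs per caller, so caches must take Origin into account.
    "Vary": "Origin"
  };
}

console.log(corsHeadersFor("https://samples.msdn.microsoft.com"));
console.log(corsHeadersFor("https://badguy.com")); // {}
```

Echoing back the specific origin is a more selective alternative to the * wildcard, which accepts every caller.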

Having said that, CORS is not yet a recommended standard but browsers are already providing some support for it. Support, however, varies on a per browser (and version) basis. For example, Internet Explorer 8 does have its own object that supports a CORS semantic and it is distinct from XHR. The object is XDomainRequest. Here’s how to use it:

var xdr = new XDomainRequest();
xdr.onload = processData; 
xdr.open("get", url);
xdr.send();

The programming interface is similar to XHR except for a few extra events that signal the most important states of the object. For example, the onload event indicates when data has been successfully downloaded. You can retrieve the response using the responseText property, as below:

function processData() {
    alert(xdr.responseText);
}

It should be noted that the XDomainRequest object in IE8 ignores cookies.

Other browsers, specifically Firefox 3.5, Safari 4, Chrome 2 and newer versions, support CORS within the XHR object. The code is the same you would use with classic XHR programming, except that it works cross-domain for endpoints that return the Access-Control-Allow-Origin header set to * or the calling host.
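
Since the two families of browsers expose different objects, a small helper can pick whichever is available. The sketch below follows the common detection approach of checking for the withCredentials property on XHR. The createCorsRequest name and the env parameter are assumptions for this example; env defaults to window and is there only so the function can be tried outside a browser. The code sticks to older JavaScript syntax because the XDomainRequest branch matters only on Internet Explorer.

```javascript
// createCorsRequest is a hypothetical helper name for this sketch.
function createCorsRequest(method, url, env) {
  env = env || window;

  if (env.XMLHttpRequest) {
    var xhr = new env.XMLHttpRequest();
    // CORS-capable XHR implementations expose withCredentials.
    if ("withCredentials" in xhr) {
      xhr.open(method, url, true);
      return xhr;
    }
  }
  if (env.XDomainRequest) {
    // Internet Explorer 8: separate object with CORS semantics.
    var xdr = new env.XDomainRequest();
    xdr.open(method, url);
    return xdr;
  }
  return null; // no cross-domain support in this browser
}
```

A caller would check for null, assign onload on the returned object, and then invoke send, since both objects expose the onload event and the responseText property.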

It Depends…

As usual when you have to choose from multiple options, you can’t move one step further without first reminding the audience that “It depends”. In this case, it mostly depends on the priorities you have. Rest assured that you can’t call a cross-domain server unless it is aware of you and agrees, to some extent and in some way, to accept your calls.

Effective cross-browser support indicates JSONP or a server proxy as the ideal solution. Using a server proxy adds an extra level of indirection (and complexity) but would let you implement the Façade pattern, a way to streamline and normalize a hard API. Silverlight and Flash are probably not worth the cost unless there are other benefits in using them in the site. Browser-specific solutions are by definition tied to a particular flavor of browser: I don’t recommend going that way unless you are sure about always using a particular version. CORS features are probably the future but, from what I can say today, they are just a thing of the future. And we code mostly for the present. My quick answer would be: either JSONP or a proxy.

About the Author

Dino is the author of "Programming ASP.NET MVC" for Microsoft Press and also coauthored the bestseller "Microsoft .NET: Architecting Applications for the Enterprise" (Microsoft Press 2008). A long time author and experienced consultant and trainer, Dino lives in Italy (when not traveling) and plays tennis (when not injured).
