Cross-Document Messaging and RPC

Øyvind Sean Kinsey | June 29, 2010

 

Even though we, for security reasons, most often do not wish pages from different domains to be able to communicate, sometimes we do. And then we discover that there is no 'proper' way to do so - the current standards and the current technologies are built to disallow it. So we turn to workarounds: we use dynamic script tags included from external domains, we use JSONP, or we try our best using postMessage or the IFrame URL technique (FIM). These solutions often become quite complex and fragile, and transporting the pure string messages between the domains might very well end up making up most of the code.

Typical cross-domain scenarios:

  • A document in domain foo.com with an iframe pointing to bar.com, where either document tries to access elements, properties or code from the other
  • A document in domain foo.com trying to use the XMLHttpRequest object to load URLs from bar.com

In this article I will go through the methods available in the different browsers with example code for each one. I will then show you how you can utilize a normalized API for things like RPC, and finally I will introduce to you a ready-made framework for all of this.

One important thing to remember when it comes to working around the Same Origin Policy (SOP), is that it was set in place for a reason, and so any workarounds need to take this into account. In addition, the solution we choose needs to work reliably, and across all the targeted browsers.

So before we begin, let’s define some qualities that should be asserted in any implementation.

  • Only data should be able to pass the boundary (this means strings, as objects would incur a security risk)
  • Only the intended recipient should be able to read the messages
  • The recipient must be able to assert who the sender of the message is in order to avoid spoofing
  • The messages must be delivered reliably
  • Messages of arbitrary size should be supported
  • No context must leak (one window must not be able to reference any objects owned by another)
  • Should work equally in all targeted browsers

Failing to meet any of these means that the solution will either be susceptible to attacks or that it might not work reliably for all users/situations.

 

Available methods

Note: The example code given here is the minimum needed to show how the different techniques work. Additional code could be added that would provide the missing qualities, but this will not be covered here.

postMessage

Browser Support: Available in IE8, Firefox 3, Chrome 2, Safari 4 and Opera 9

This is a feature that is defined in the draft for the upcoming HTML5 standard, but which has already been implemented in all major browsers. The standard defines a method, window.postMessage and a corresponding message event, which allows one document to post messages, and another to register to receive messages.

Example

foo.com 
    <iframe src="https://bar.com/" id="barFrame">
    </iframe>

    var win = document.getElementById("barFrame").contentWindow;
    win.postMessage("hola!", "https://bar.com"); // make sure only https://bar.com can receive this  message

bar.com 
    window.addEventListener("message", function(message) {
        if (message.origin == "https://foo.com") { // make sure we only receive messages from those we trust
            alert(message.data);
        }
    });

Caveats:

  • Is only supported by relatively new browsers
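The origin check on the receiving side can be factored into a small reusable function. What follows is a minimal sketch and not part of any standard API: the name createMessageHandler and the idea of passing a list of trusted origins are assumptions made for illustration.

```javascript
// Hypothetical helper: wraps a callback so that it only ever sees
// messages coming from origins we explicitly trust.
function createMessageHandler(trustedOrigins, callback) {
    return function (event) {
        // event.origin is filled in by the browser, not by the sender
        if (trustedOrigins.indexOf(event.origin) === -1) {
            return false; // silently ignore messages from unknown origins
        }
        callback(event.data, event.origin);
        return true;
    };
}

var received = [];
var handler = createMessageHandler(["https://foo.com"], function (data, origin) {
    received.push(origin + ": " + data);
});

// In a browser the handler would be registered for the message event
if (typeof window !== "undefined" && window.addEventListener) {
    window.addEventListener("message", handler, false);
}
```

Returning true or false from the handler is only there to make the sketch easy to try outside a browser; the message event itself ignores the return value.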

Fragment Identifier Messaging

Browser Support: Available in all browsers

This is probably the best-known technique, and it works due to a small loophole in the SOP. In order for one document to be able to navigate another, write access to document.location is allowed, which means that we can pass information by writing it into the URL. Since we normally don't want to reload the document we are talking to, we use the fact that if we set the location property to the same URL, with only the part after the "#" changed, no reload will occur. And since the document being written to has full read access to its own location property, it can easily retrieve the message.

Example

foo.com 

    <iframe src="https://bar.com/" id="barFrame">
    </iframe>

    var win = document.getElementById("barFrame").contentWindow;
    win.location = "https://bar.com/#hola!";

bar.com 

    var prevMsg = location.hash;
    window.setInterval(function(){
        if (location.hash !== prevMsg) {
            prevMsg = location.hash;
            alert(prevMsg);
        }
    }, 100);

Caveats

  • Difficulties signaling when a message could be expected. Polling with timers, as in the example, is not advised since it wastes resources
  • No way of identifying the sender
  • Messages might be written faster than the recipient can read them
  • Different browsers have different max sizes for the URLs. For example, IE6 supports URLs with a max length of 4095 characters.
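The size limit in the last caveat is commonly worked around by splitting a message into numbered fragments and reassembling them on the other side. The following is a minimal sketch under stated assumptions: the "#index.total.payload" framing and the names toFragments and createReassembler are invented for this example, and the fragment size of 8 is arbitrary. It does not solve the pacing problem, which would need some form of acknowledgement from the recipient.

```javascript
// Split a message into fragments whose payload is at most maxLength characters.
// The "#index.total.payload" framing is invented for this sketch.
function toFragments(message, maxLength) {
    var encoded = encodeURIComponent(message);
    var total = Math.max(1, Math.ceil(encoded.length / maxLength));
    var fragments = [];
    for (var i = 0; i < total; i++) {
        fragments.push("#" + i + "." + total + "." +
            encoded.slice(i * maxLength, (i + 1) * maxLength));
    }
    return fragments;
}

// Returns a function that accepts one location.hash value at a time and
// calls onMessage once every fragment of a message has been seen.
function createReassembler(onMessage) {
    var parts = [];
    return function (hash) {
        var match = /^#(\d+)\.(\d+)\.(.*)$/.exec(hash);
        if (!match) {
            return; // not one of our fragments
        }
        var total = Number(match[2]);
        parts[Number(match[1])] = match[3];
        for (var i = 0; i < total; i++) {
            if (typeof parts[i] !== "string") {
                return; // still waiting for a fragment
            }
        }
        // decode only after joining, as an escape sequence may span two fragments
        onMessage(decodeURIComponent(parts.slice(0, total).join("")));
        parts = [];
    };
}

var delivered = [];
var receive = createReassembler(function (msg) {
    delivered.push(msg);
});
var fragments = toFragments("hola! ¿qué tal?", 8);
for (var n = 0; n < fragments.length; n++) {
    // the sender would append each fragment to the recipient's URL in turn
    receive(fragments[n]);
}
```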

'window.name'

Browser Support: Available in all browsers where the window's name property is persisted when navigating across domains

This method works due to another loophole that has now been shut in newer browsers. The basic way it works is that if you set the 'name' property of a window while it shows a document from your own domain, and then redirect that window to a document in another domain, the newly loaded document will be able to read the name property and so receive the message. And again, since we don't want the recipient's document reloaded, we use a helper frame for this.

Example

foo.com 

    <iframe src="https://bar.com/" name="barFrame" id="barFrame">
    </iframe>
    <iframe src="sender.html" id="helper">
    </iframe>

    var helper = document.getElementById("helper").contentWindow;
    helper.sendMessage("hola!", "https://bar.com/receiver.html"); //sendMessage is defined in sender.html

foo.com/sender.html 

    window.sendMessage = function(message, url){
        window.name = message;
        location.href = url;
    };

bar.com 

    function onMessage(message) {
        alert(message);
    }

bar.com/receiver.html 

    parent.frames["barFrame"].onMessage(window.name); // pass the message along
    window.history.back(); // move back to the sender.html document

Caveats

  • Requires additional helper documents, and additional frames
  • Requires additional requests to the server (unless aggressive caching is used)

'NIX'

Browser Support: Available in IE6 and IE7

This is definitely one of the more obscure techniques, and it was brought to my attention when I rummaged through the source code of the Apache Shindig project. It works due to an iframe's opener property being writable for the parent window while being readable by the iframe itself, and due to it being able to store not only primitives, but any kind of object. This can then be used to exchange functions for passing messages back and forth.

Example

foo.com 

    <iframe src="https://bar.com/" id="barFrame">
    </iframe>

    var postMessage;
    function onMessage(msg) {
        alert(msg);
    }
     
    var win = document.getElementById("barFrame").contentWindow;
    win.opener = {
        setPostMessage: function(fn) {
            postMessage = fn;
        },
        postMessage: function(msg) {
            window.setTimeout(function(){ //defer the call so that it is run in the 'correct' 'context'
                onMessage(msg);
            },0);
        }
    };
    window.setTimeout(function(){ //wait until the child document has loaded and postMessage has been set
        postMessage("hola!");
    }, 100);

bar.com 

    function onMessage(msg) {
        alert(msg);
    }
    window.opener.setPostMessage(onMessage);
    window.opener.postMessage("right back at ya");

Caveats

  • Leaks the windows' contexts. One window can easily get access to 'privileged' data and not just what is passed.
  • Numerous problems related to security, hijacking, man-in-the-middle etc.
  • No way of identifying the sender.

In addition to the above-mentioned methods you will also find techniques like JSONP to retrieve data, and cross-domain POSTs to send data, but common to these is that they require server components to maintain state, and so they aren't truly solutions for Cross-Document Communication, where both documents are able to maintain a context and a state.

Utilizing a normalized API

With a little work (actually, a lot), all of the above techniques can be normalized into a full-duplex transport with a postMessage function for sending, and an onMessage function for receiving messages, and in this case a message is always a string primitive.

/**
* @param {String} msg The message to transport
* @param {String} recipient The domain of the intended recipient  
*/
function postMessage(msg, recipient) {
    // implementation
}
 
/**
* @param {String} msg The message
* @param {String} origin The originating window's domain
*/
function onMessage(msg, origin) {
    // process message 
}
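To try out code written against this interface without a browser, the transport can be replaced by an in-memory stand-in. This is a minimal sketch, assuming a hypothetical createLoopbackPair helper that is not part of any library. It delivers messages synchronously, whereas the real transports are asynchronous.

```javascript
// Hypothetical in-memory stand-in for a real transport. Each endpoint
// offers the same postMessage/onMessage pair as described above.
function createLoopbackPair(originA, originB) {
    function createEndpoint(ownOrigin) {
        var endpoint = {
            origin: ownOrigin,
            peer: null,
            onMessage: function (msg, origin) {}, // to be replaced by the user
            postMessage: function (msg, recipient) {
                if (typeof msg !== "string") {
                    throw new TypeError("only strings may cross the boundary");
                }
                // drop the message unless the peer is the intended recipient
                if (endpoint.peer.origin !== recipient) {
                    return false;
                }
                endpoint.peer.onMessage(msg, ownOrigin);
                return true;
            }
        };
        return endpoint;
    }
    var a = createEndpoint(originA);
    var b = createEndpoint(originB);
    a.peer = b;
    b.peer = a;
    return { a: a, b: b };
}

var pair = createLoopbackPair("https://foo.com", "https://bar.com");
var log = [];
pair.b.onMessage = function (msg, origin) {
    log.push(msg + " from " + origin);
};
pair.a.postMessage("hola!", "https://bar.com");
```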

So far so good, but what use is a string-based transport, really? Wouldn't it be better if we could call methods, supply arguments, and consume return values instead? As it turns out, this is actually not hard at all, as we have two other resources available: the JSON-RPC protocol, which defines how to format a string with the information needed to invoke a method and to return its return value, and a JSON serializer/deserializer (either native or through Douglas Crockford's JSON2 library). With these tools (a string transport, a protocol for Remote Procedure Calls using string messages, and a JSON serializer), a cross-domain RPC implementation is only moments away.

Let’s start with the procedure call:

According to the JSON-RPC specification a call should have the following format

'{"jsonrpc": "2.0", "method": "methodName", "params": ["a", 2, false]}'

This is a format that we can easily achieve by serializing a standard JavaScript object using JSON.stringify. And once we have transported the string across the boundary, we can recreate the JavaScript object using JSON.parse.

Example

Note: In this example the assumption is that we have a working transport between one document and another.

foo.com 

var procedureCall = {
    jsonrpc: "2.0",
    method: "alertMessage",
    params: ["my message"]
};
//serialize the message
var jsonRpcString = JSON.stringify(procedureCall);
//send the message
postMessage(jsonRpcString, "https://bar.com"); // https://bar.com is the intended recipient

bar.com 

function alertMessage(msg) {
    alert(msg);
}
function onMessage(msg, origin) {
    //deserialize the message
    var procedureCall = JSON.parse(msg);
    //execute the call
    switch (procedureCall.method) {
        case "alertMessage":
            // call the requested method with the provided arguments
            alertMessage.apply(null, procedureCall.params);
            break;
        // ... cases for the other exposed methods
    }
}
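As the number of exposed methods grows, the switch statement can be replaced by a lookup table. The following is a minimal sketch of that idea; the names exposed and dispatch are chosen for illustration, and alert is replaced by a return value so that the code also runs outside a browser.

```javascript
// The methods we are willing to expose to the other document
var exposed = {
    alertMessage: function (msg) {
        return "alert: " + msg; // stand-in for alert(msg)
    }
};

function dispatch(methods, text) {
    var call = JSON.parse(text);
    // Only call functions that were explicitly exposed, never inherited
    // properties such as "constructor" or "toString"
    if (!Object.prototype.hasOwnProperty.call(methods, call.method)) {
        throw new Error("unknown method: " + call.method);
    }
    return methods[call.method].apply(null, call.params || []);
}

var outcome = dispatch(exposed,
    '{"jsonrpc":"2.0","method":"alertMessage","params":["my message"]}');
```

The hasOwnProperty check matters because the method name comes from the other domain and should be treated as untrusted input.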

That wasn't too complicated now was it? Now, if we want to also be able to call methods and receive return values we can mark the message with an id parameter so that we can link return values to their handlers.

Example

foo.com 

var calls = {}, callId = 0;
 
function onMessage(msg, origin) {
    //deserialize the message
    var procedureCall = JSON.parse(msg);
    if (procedureCall.method) { // this is a method call
        switch(procedureCall.method) {
            // ... handle incoming calls as in the previous example
        }
    }else{ // this is a result
        // retrieve the callback function
        var fn = calls[procedureCall.id];
        // execute it with the result 
        fn(procedureCall.result);
        // remove the callback
        delete calls[procedureCall.id];
    }
}
 
function rpcAdd(a, b, fn) {
    var id = ++callId; //get a new id
    calls[id] = fn; //store the function that should receive the return value
    postMessage(JSON.stringify({
        jsonrpc: "2.0",
        method: "add",
        params: [a, b],
        id: id
    }), "https://bar.com");
}
 
// this looks pretty much like any other asynchronous call right?
rpcAdd(3, 5, function(result) {
    alert("the result is: " + result);
});

bar.com 

function add(a, b) {
    return a + b;
}
 
function onMessage(msg, origin) {
    //deserialize the message
    var procedureCall = JSON.parse(msg);
    //execute the call
    switch (procedureCall.method) {
        case "add":
            // call the requested method with the provided arguments
            var result = add.apply(null, procedureCall.params);
            //return the value
            postMessage(JSON.stringify({
                jsonrpc: "2.0",
                id: procedureCall.id,
                result: result
            }), "https://foo.com");
            break;
        // ... cases for the other exposed methods
    }
}
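The example above only covers the happy path. JSON-RPC 2.0 also defines error responses (for instance code -32601 for "Method not found" and -32700 for "Parse error"), and it treats a request without an id as a notification that must not be answered. A minimal sketch of a handler that takes this into account could look as follows. The name handleRequest is hypothetical, only positional parameters are supported, and the function returns the response string (or null) instead of posting it.

```javascript
function handleRequest(methods, text) {
    function reply(id, key, value) {
        var response = { jsonrpc: "2.0", id: id };
        response[key] = value;
        return JSON.stringify(response);
    }
    var call;
    try {
        call = JSON.parse(text);
    } catch (e) {
        return reply(null, "error", { code: -32700, message: "Parse error" });
    }
    if (call === null || typeof call !== "object" || typeof call.method !== "string") {
        return reply(null, "error", { code: -32600, message: "Invalid Request" });
    }
    var key, value;
    if (!Object.prototype.hasOwnProperty.call(methods, call.method)) {
        key = "error";
        value = { code: -32601, message: "Method not found" };
    } else {
        try {
            key = "result";
            value = methods[call.method].apply(null, call.params || []);
        } catch (e) {
            key = "error";
            value = { code: -32603, message: "Internal error" };
        }
    }
    // a request without an id is a notification and gets no response
    return typeof call.id === "undefined" ? null : reply(call.id, key, value);
}

var methods = {
    add: function (a, b) {
        return a + b;
    }
};
var ok = handleRequest(methods,
    '{"jsonrpc":"2.0","method":"add","params":[3,5],"id":1}');
var missing = handleRequest(methods,
    '{"jsonrpc":"2.0","method":"subtract","params":[3,5],"id":2}');
```

On the calling side, onMessage would then need to check for an error member before passing the result on to the stored callback.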

Again, it's not hard when you have the right tools for the job :)

So in summary, what is needed for cross-document RPC?

Now, in order for all of this to be used reliably, the following is needed

  • Code must be written for each of the available techniques in order to satisfy the demands stated in the introduction. This means adding reliability, queuing, sender and recipient verification etc., AND making them full-duplex (going both ways)
  • The different transports need to be abstracted so that they have a common interface, and wrapped with logic to select the appropriate one
  • The transports need to be properly initialized before usage, either manually or automatically through the query string
  • Logic for generating RPC stubs, for handling RPC calls and notifications, and for handling all the edge cases for RPC needs to be added

Trust me; this is not done in an hour or two!
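To give an idea of just one of the items on the list, the logic for selecting a transport could be sketched like this. It is a simplified illustration: the transport names, the order of preference and the helperUrl option are assumptions, and the env argument is a plain object standing in for the browser's window so that the code can run anywhere.

```javascript
// Picks a transport based on what the environment offers
function selectTransport(env, options) {
    // A truthiness test is used on purpose, since some older browsers
    // report the type of host methods as "object" rather than "function"
    if (env.postMessage) {
        return "postMessage";
    }
    // window.name needs a helper document, so only use it when one is configured
    if (options && options.helperUrl) {
        return "windowName";
    }
    // Fragment Identifier Messaging is available everywhere
    return "fragment";
}

var modern = selectTransport({ postMessage: function () {} });
var legacy = selectTransport({}, { helperUrl: "/name.html" });
var fallback = selectTransport({});
```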

The solution

Luckily, there's no need for all of you to go through all of the steps above, as there is already a framework that will do all of this for you. easyXDM is a JavaScript library that enables you as a developer to easily work around the limitation set in place by the Same Origin Policy, and it does so by doing all of the above, and by doing this with quality JavaScript code. It provides two levels of abstraction: a Socket class that is used for string-based messaging, and an RPC class that provides Remote Procedure Calls.

Using the easyXDM.Socket class

You only need to pass a single argument to easyXDM for it to set up a transport, and that is the URL to the remote end. The rest, including passing the needed parameters to the other end, is handled by easyXDM.

Example

foo.com 

//To enable the use of the NameTransport, some additional arguments need to be passed, but this will not be covered here.

var socket = new  easyXDM.Socket({
    // this is the URL to the other end, the provider. This is only needed for the main document, the consumer.
    remote: "https://bar.com/index.html",
    // onMessage is the function that will be called with incoming messages.
    onMessage: function(msg, origin) {
        alert("received message:" + msg + " from " + origin);
    },
    // onReady is called once the transport is ready to use.
    onReady: function(){
        // here you can put code that should run once the transport is up and running
    }
});
 
// each Socket has a method 'postMessage' that can be used to send messages
// messages sent prior to the onReady event being fired will be buffered and executed once ready
socket.postMessage("hola!");

bar.com/index.html 

// on the provider end, easyXDM automatically sets up the transport based on information passed in the URL.
var socket = new easyXDM.Socket({
    onMessage: function(msg, origin) {
        alert("received message:" + msg + " from " + origin);
    }
});
socket.postMessage("hola!");

Try the demo

This socket works in all browsers, from IE6 and up. In IE8, Firefox 3, Chrome 2, Safari 4 and Opera 9 the transit time is less than 15ms, and in the rest it depends on the underlying transport. It will also satisfy all the demands that we stated earlier, and then some.
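The buffering of messages sent before the transport is ready can be pictured as a simple queue. The following is a minimal sketch of the idea and not easyXDM's actual implementation; createBufferedSender and markReady are hypothetical names.

```javascript
// Wraps a send function so that messages posted too early are queued
function createBufferedSender(send) {
    var queue = [];
    var ready = false;
    return {
        postMessage: function (msg) {
            if (ready) {
                send(msg);
            } else {
                queue.push(msg);
            }
        },
        // to be called by the transport once it is up and running
        markReady: function () {
            ready = true;
            while (queue.length) {
                send(queue.shift()); // flush in the order the messages were posted
            }
        }
    };
}

var sent = [];
var sender = createBufferedSender(function (msg) {
    sent.push(msg);
});
sender.postMessage("first");  // buffered
sender.postMessage("second"); // buffered
sender.markReady();           // both are sent now
sender.postMessage("third");  // sent immediately
```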

Using the easyXDM.Rpc class

Using the RPC class for Remote Procedure Calls isn't much harder either

Example

foo.com 

// the first object passed here is pretty much identical to that passed to the Socket constructor
var rpc = new easyXDM.Rpc({
    remote: "https://bar.com/index.html"
},
// this is where we define the methods that we want to expose, and the remote methods that we want to have stubs generated for
{
    // here we define the methods that we want to expose
    local: {
        barFoo: function(a, b, c, fn){
            // here we can implement the exposed method.
            // if it's a synchronous method then we can use
            // 'return value;' to return the value
            // if it's asynchronous (e.g. doing ajax) then we use
            // fn(value);
            return a + b.toString() + c.toString();
        }
    },
    // these define the stubs that easyXDM should create
    remote: {
        fooBar: {}
    }
});
 
// send a JSON-RPC 2.0 Notification (no callback functions results in a notification)
rpc.fooBar();

bar.com/index.html 

// again, here we do not need to supply any information regarding the transport, this is handled automatically
var rpc= new easyXDM.Rpc({}, {
    local: {
        fooBar: function(){
            // this was called using a JSON-RPC Notification, let's run a regular function call
            rpc.barFoo("a", 1, false, function(result) {
                alert("the result of rpc.barFoo was " + result);
            });
        }
    },
    remote: {
        barFoo: {}
    }
});

Try the demo

As you can see, the instance of the RPC class will be augmented with stubs for all of the methods, making the use no different from any other asynchronous method!
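To illustrate how such stubs can be generated from nothing but a list of method names, here is a minimal sketch. It is not how easyXDM itself is implemented: createRpcClient, the send argument and handleResult are hypothetical, and the convention that a trailing function argument is the callback is an assumption made for this example.

```javascript
function createRpcClient(methodNames, send) {
    var callbacks = {};
    var lastId = 0;
    var stubs = {};

    function createStub(name) {
        return function () {
            var params = Array.prototype.slice.call(arguments);
            var message = { jsonrpc: "2.0", method: name };
            // a trailing function is treated as the callback for the result;
            // without one, no id is set and the call becomes a notification
            if (params.length && typeof params[params.length - 1] === "function") {
                message.id = ++lastId;
                callbacks[message.id] = params.pop();
            }
            message.params = params;
            send(JSON.stringify(message));
        };
    }

    for (var i = 0; i < methodNames.length; i++) {
        stubs[methodNames[i]] = createStub(methodNames[i]);
    }

    return {
        stubs: stubs,
        // the transport passes incoming result messages to this function
        handleResult: function (text) {
            var response = JSON.parse(text);
            var fn = callbacks[response.id];
            if (fn) {
                delete callbacks[response.id];
                fn(response.result);
            }
        }
    };
}

var outgoing = [];
var results = [];
var client = createRpcClient(["fooBar", "barFoo"], function (text) {
    outgoing.push(text);
});
client.stubs.fooBar(); // notification
client.stubs.barFoo("a", 1, false, function (result) {
    results.push(result);
});
// simulate the answer from the other side
client.handleResult('{"jsonrpc":"2.0","id":1,"result":"a1false"}');
```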

Use cases

easyXDM is really just a framework that facilitates Cross-Document Messaging and RPC, and so you can think of it as a building block - all by itself it does little, but as a foundation, it can do wonders.

  • Auto-resizing iframes
  • Facebook Connect-like APIs (vkontakte.com, with 75 million users, has based its API on easyXDM)
  • Bookmarklets that augment the current page
  • Cross-Domain AJAX
  • Mashups
  • Bridging multiple web applications

Important note about the iframe

As you can see above no markup is needed, and this is due to easyXDM creating the iframes as necessary. This means that you cannot use easyXDM to connect existing iframes, something which is due to easyXDM needing full control in order to set up the transport.

By default the frame will be hidden from view as this is how most APIs will work, but easyXDM also supports visible frames.

Example

var rpc = new easyXDM.Rpc({
    remote: "https://bar.com/index.html",
    onReady: function(){
        // ...
    },
    container: document.getElementById("container"),
    props: {
        style: {
            border: "1px solid red",
            width: "100px",
            height: "200px"
        }
    }
}, {
    local: {
        // ...
    },
    remote: {
        // ...
    }
});

As you can see, we pass a container to control where in the DOM the iframe should be placed, and we pass a props object that contains all the properties that should be applied to the iframe. The props object is deep-copied onto the iframe and so we can use 'style: {...}' to set the style.
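What a deep copy of the props object means in practice can be shown with a few lines of code. This is a minimal sketch of the general idea, not easyXDM's own code; applyProps is a hypothetical name, and a plain object stands in for the iframe element so that the example runs outside a browser.

```javascript
// Recursively copies the properties of props onto target
function applyProps(target, props) {
    for (var key in props) {
        if (!Object.prototype.hasOwnProperty.call(props, key)) {
            continue;
        }
        var value = props[key];
        var existing = target[key];
        if (value !== null && typeof value === "object" &&
                existing !== null && typeof existing === "object") {
            // descend into nested objects such as 'style' instead of replacing them
            applyProps(existing, value);
        } else {
            target[key] = value;
        }
    }
    return target;
}

// a plain object standing in for an iframe element
var fakeFrame = { style: { display: "block" } };
applyProps(fakeFrame, {
    style: { border: "1px solid red", width: "100px" },
    scrolling: "no"
});
```

Descending into the existing style object rather than overwriting it is what allows 'style: {...}' to work, since an element's style property cannot simply be replaced by a plain object.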

Conclusion

Adding Cross-Document Communication to your applications might sound complicated, but it need not be - it's all about using the right tool for the job. easyXDM is one such tool: it's flexible, it's reliable and it's very easy to use, as it does all of the heavy lifting for you. And did I mention that it is completely framework agnostic and cross-browser compatible?! easyXDM is licensed under the MIT License. For more information, check out the easyXDM web site as well as its API Documentation.

 

About the Author

Sean is the CTO of BRIK, a small development company in Norway. He specializes in Front-End engineering, but is known to stick his hands into everything from redundant hardware to software load balancers.

Sean's experience ranges from teaching low-level ATM, Wi-Fi and IP-over-HF as an Officer in the Norwegian Military to building advanced web-based applications for the fitness industry. What he enjoys the most, though, is to push the boundaries of what others believe is possible, preferably by using everyday household items like HTML, JavaScript and a dash of CSS!

When not working, Sean enjoys running, biking and good beer.
