May 2012

Volume 27 Number 05

Cutting Edge - Understanding the Power of WebSockets

By Dino Esposito | May 2012

The current World Wide Web isn’t designed to be a real-time medium. Web applications give the impression of a continuous connection through classic polling implemented via AJAX, or through long-polling requests managed by ad hoc libraries such as SignalR and Comet-style frameworks. For the needs of most applications, polling is a good solution, even though it might suffer from client-to-server and server-to-client latency. In this article I’ll explore a new alternative called WebSockets.

The growing integration between Web and mobile applications with social media is lowering the threshold of tolerable delay in client/server interaction. When you update your Facebook status, you want that information to be immediately available to your friends. Similarly, when someone likes one of your posts, you want to be notified instantly. Today, all of these features are real, and this is just one of the reasons for the worldwide adoption of Facebook and for the explosion of the social network phenomenon. So in the end, there’s significant demand from developers for solutions and tools that implement real-time communication over the Web.

Achieving zero-lag connectivity between Web clients and servers requires going beyond the HTTP protocol. This is just what the WebSocket Protocol provides. Currently an Internet Engineering Task Force standard exists for the WebSocket Protocol; you can read about it at bit.ly/va6qSS. A standard API for implementing the protocol is being formalized by the World Wide Web Consortium (W3C) for browsers to support it (see bit.ly/h1IsjB). The specification is in “Candidate Recommendation” status.

The WebSocket Protocol

The new WebSocket Protocol aims to overcome a structural limitation of the HTTP protocol that makes it inefficient for Web applications hosted in browsers to stay connected to the server over a persistent connection. The WebSocket Protocol enables bidirectional communication between Web applications and Web servers over a single TCP socket. Put another way, the protocol makes it possible for a Web application hosted in a browser to stay connected with a Web endpoint all the time while incurring minimal costs such as pressure on the server, memory and resource consumption. The net effect is that data and notifications can come and go between browsers and Web servers with no delay and no need to arrange for additional requests. As emphatic as it may sound, the WebSocket Protocol opens up a whole new world of possibilities to developers and makes polling-based tricks and frameworks a thing of the past. Well, not exactly.

Using WebSockets Today

Browser support for the WebSocket Protocol will improve quickly, but only the latest versions of browsers will support WebSockets. Users who don’t upgrade their browsers regularly (or aren’t allowed to upgrade because of strict corporate policies) will be left behind.

This means developers can’t just abandon code based on AJAX polling or long-polling solutions. In this regard, it’s relevant to note that SignalR (the upcoming Microsoft framework for zero-lag messaging between browsers and Web servers) does a fantastic job of abstracting a persistent connection, automatically switching to WebSockets where supported and using long polling in all other cases. I covered SignalR in recent columns, and once again I invite you to try it out as soon as possible if you haven’t done so yet. SignalR has everything it takes to be a winning library and a tool for every developer and any Web application.

Who Supports WebSockets Today?

Figure 1 provides a brief summary of the support for WebSockets that most popular browsers provide at present.

Figure 1 Browser Support for WebSockets

Browser | WebSockets Support
Internet Explorer | WebSockets will be supported in Internet Explorer 10. Metro applications written using JavaScript and HTML5 will support WebSockets as well.
Firefox | WebSockets are supported starting with version 6 of the browser released in mid-2011. Some very early support was offered in version 4 and then dropped in version 5.
Chrome | WebSockets are supported starting with version 14, which was released in September 2011.
Opera | Support for WebSockets has been removed in version 11.
Safari | Supports an earlier version of the WebSocket Protocol.

With the exception of Firefox, you can programmatically check for WebSockets support by looking at the window.WebSocket object. For Firefox, you should currently check the MozWebSocket object. It should be noted that most HTML5-related capabilities can be checked in browsers by means of a specialized library such as Modernizr (modernizr.com). In particular, here’s the JavaScript code you need to write if you linked the Modernizr library to your page:

if (Modernizr.websockets)
{
  ...
}
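
Without Modernizr, the same check can be written by hand. The following is a minimal sketch that looks for the two constructor names mentioned above. The helper name findWebSocketConstructor is hypothetical, and the function takes the global object as a parameter so that it can also be tried outside a browser.

```javascript
// Returns the WebSocket constructor exposed by the given global object,
// or null when none is available.
// findWebSocketConstructor is a hypothetical helper name, not a browser API.
function findWebSocketConstructor(globalObject) {
  if (typeof globalObject.WebSocket !== "undefined") {
    return globalObject.WebSocket;      // standard name
  }
  if (typeof globalObject.MozWebSocket !== "undefined") {
    return globalObject.MozWebSocket;   // prefixed name used by Firefox
  }
  return null;                          // caller falls back to polling
}
```

In a page you would call findWebSocketConstructor(window) and fall back to polling or long polling when the result is null.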

Modernizr is probably an excellent choice today if you want to start using a WebSocket implementation because it detects the feature for you and makes it easy to load polyfills, that is, code that kicks in if a given feature isn’t supported on the current browser.

In the end, WebSockets are an extremely compelling feature with non-uniform support today across vendors. Microsoft, however, broadly supports WebSockets through the upcoming Internet Explorer 10 and also IIS, ASP.NET, Windows Communication Foundation (WCF) and Windows Runtime (WinRT). Note, though, that no official standard API exists yet, so the early support is a great sign of interest. The best you can do today is use WebSockets through some abstraction layer. Modernizr is a possible option if you want to stay close to the metal and write your own code that opens and closes WebSockets. SignalR is a better option if you’re looking for a framework that transparently connects a browser and a Web endpoint in a persistent manner, with no frills and no need to know many underlying details.

A Look at the WebSocket Protocol

The WebSocket Protocol for bidirectional communication requires that both the client and server application are aware of the protocol details. This means you need a WebSocket-compliant Web page that calls into a WebSocket-compliant endpoint.

A WebSocket interaction begins with a handshake in which the two parties (browser and server) mutually confirm their intention to communicate over a persistent connection. Next, a bunch of message packets are sent over TCP in both directions. Figure 2 outlines how the WebSocket Protocol works.

Figure 2 The WebSocket Protocol Schema

Note that in addition to what’s in Figure 2, when the connection is closed, both endpoints exchange a close frame to cleanly close the connection. The initial handshake consists of a plain HTTP request that the client sends to the Web server. The request is an HTTP GET configured as an upgrade request:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: https://example.com

In HTTP, a client request with the Upgrade header indicates the intention of the client to request that the server switch to another protocol. With the WebSocket Protocol, the upgrade request to the server contains a unique key that the server will return mangled as the proof that it has accepted the upgrade request. This is a practical demonstration to show the server understands the WebSocket Protocol. Here’s a sample response to a handshake request:

HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

A successful status code is always 101, and any other status code will be interpreted as a refusal to upgrade to the WebSocket Protocol. The server concatenates the received key with a fixed GUID string defined by the protocol and calculates a SHA-1 hash of the resulting string. The hash value is then encoded to Base64 and returned to the client via the Sec-WebSocket-Accept header.
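
The computation can be reproduced in a few lines. The following is a minimal sketch, assuming a Node.js environment and its built-in crypto module. The function name computeAcceptValue is hypothetical. The GUID is the fixed string defined in RFC 6455, and the hash algorithm is SHA-1.

```javascript
// Node.js built-in module; no external packages are needed.
const nodeCrypto = require("crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// computeAcceptValue is a hypothetical helper name.
// It concatenates key and GUID, hashes with SHA-1 and encodes as Base64.
function computeAcceptValue(secWebSocketKey) {
  return nodeCrypto
    .createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

console.log(computeAcceptValue("dGhlIHNhbXBsZSBub25jZQ=="));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The key used here is the one from the sample request, and the result matches the Sec-WebSocket-Accept header in the sample response.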

The client can also send other headers such as Sec-WebSocket-Protocol in order to indicate which subprotocols it’s happy to employ. A subprotocol is an application-level protocol built on top of the basic WebSocket Protocol. If the server understands some of the suggested subprotocols, it will pick one and send its name back to the client via the same header.
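
How a server picks a subprotocol is up to the server. As a rough sketch, one simple policy is to take the first name in the client’s list that the server also supports. The function name selectSubprotocol and the subprotocol names in the usage note are illustrative assumptions, not part of any framework.

```javascript
// Picks the first subprotocol requested by the client that the server supports.
// requestedHeader is the raw Sec-WebSocket-Protocol value, a comma-separated list.
// Returns null when nothing matches; the server then omits the header.
// selectSubprotocol is a hypothetical helper name.
function selectSubprotocol(requestedHeader, supportedProtocols) {
  if (!requestedHeader) {
    return null;
  }
  const requested = requestedHeader.split(",").map(function (name) {
    return name.trim();
  });
  for (const name of requested) {
    if (name !== "" && supportedProtocols.indexOf(name) !== -1) {
      return name;
    }
  }
  return null;
}
```

For example, selectSubprotocol("chat, superchat", ["superchat"]) returns "superchat", because "chat" is not in the supported list.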

Following the handshake, client and server can freely send messages over the WebSocket Protocol. Each message travels in one or more frames, and every frame starts with an opcode that indicates the operation being performed. One of these opcodes (specifically 0x8) indicates a request to close the session. Note that WebSocket messages are sent asynchronously, so a message won’t necessarily receive an immediate response, as it would in HTTP. With the WebSocket Protocol, you’re better off thinking in terms of general messages going from client to server or vice versa, and forgetting about the classic request/response pattern of HTTP.
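
To make the role of the opcode concrete, here is a small sketch that decodes the first two bytes of a frame according to the layout in RFC 6455. It’s an illustration only: extended payload lengths and the masking key are deliberately left out, and the function name parseFrameHeader is hypothetical.

```javascript
// Opcode values defined by RFC 6455.
const OPCODES = {
  0x0: "continuation",
  0x1: "text",
  0x2: "binary",
  0x8: "close",
  0x9: "ping",
  0xA: "pong"
};

// Reads the first two bytes of a frame (an array or Buffer of byte values).
// parseFrameHeader is a hypothetical helper name.
function parseFrameHeader(bytes) {
  if (bytes.length < 2) {
    throw new Error("A frame header needs at least two bytes");
  }
  const opcode = bytes[0] & 0x0f;         // low four bits of the first byte
  return {
    fin: (bytes[0] & 0x80) !== 0,         // true on the final frame of a message
    opcode: opcode,
    opcodeName: OPCODES[opcode] || "reserved",
    masked: (bytes[1] & 0x80) !== 0,      // set on frames sent by the client
    payloadLength: bytes[1] & 0x7f        // 126 and 127 signal an extended length
  };
}
```

Calling parseFrameHeader([0x88, 0x00]) reports an unmasked close frame with an empty payload.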

The typical URL for a WebSocket endpoint takes the following form:

var myWebSocket =
    new WebSocket("ws://www.websocket.org");

You use the wss protocol prefix if you want to use a secure socket connection (secure connections will generally be more successful when intermediaries are present). Finally, the WebSocket Protocol acknowledges and addresses the problem of cross-origin communication. A WebSocket client generally (but not always) permits sending requests to endpoints located on any domain. But it’s the WebSocket server that will decide whether to accept or refuse the handshake request.
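
On the server, that decision typically comes down to inspecting the Origin header of the handshake request. The following is a minimal sketch of such a check, assuming the application keeps its own list of trusted origins in lowercase. The function name isOriginAllowed and the origins in the usage note are placeholders.

```javascript
// Decides whether a handshake should be accepted based on its Origin header.
// allowedOrigins is an application-defined list of lowercase origins.
// isOriginAllowed is a hypothetical helper name.
function isOriginAllowed(originHeader, allowedOrigins) {
  if (typeof originHeader !== "string") {
    // This sketch refuses requests that carry no Origin header at all,
    // which also excludes most non-browser clients.
    return false;
  }
  return allowedOrigins.indexOf(originHeader.toLowerCase()) !== -1;
}
```

With a list of ["https://example.com"], a handshake carrying Origin: https://example.com is accepted and one from any other origin is refused.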

A Look at the WebSocket API

As mentioned, the W3C is currently standardizing an API for the WebSocket Protocol, and browsers are catching up with the various drafts as they become available. You should be aware that any code that works today might not work across all browsers and, more importantly, is not even guaranteed to work on the same browser when a new release hits the market. In any case, once you have some working WebSocket code you’re pretty much done, as any changes required in the future will likely be minor.

If you want to experiment with the WebSocket Protocol you can visit websocket.org with a browser that supports the protocol. For example, you can use a preview of Internet Explorer 10 or a recent version of Google Chrome. Figure 3 shows the handshake as it’s being tracked by Fiddler.

Figure 3 Real Handshaking Between Browser and Server

Not surprisingly, the current version of Fiddler (version 2.3.x) will only capture HTTP traffic. However, a new version of Fiddler that deals with WebSocket traffic is currently in beta.

The WebSocket API is fairly simple. On the browser side, you need to create an instance of the WebSocket browser class. This class exposes a bunch of interesting events for which you want to have proper handlers:

var wsUri = "ws://echo.websocket.org/";
var websocket = new WebSocket(wsUri);
// onOpen, onMessage, onClose and onError are handler functions defined elsewhere in the page
websocket.onopen = function(evt) { onOpen(evt); };
websocket.onmessage = function(evt) { onMessage(evt); };
websocket.onclose = function(evt) { onClose(evt); };
websocket.onerror = function(evt) { onError(evt); };

The onopen event is fired when the connection is established. The onmessage event fires whenever the client receives a message from the server. The onclose event is triggered when the connection has been closed. Finally, onerror is fired whenever an error occurs.
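
What the handlers do is entirely up to the page. As a rough sketch, the helper below wires all four events to a logging callback. The names attachLoggingHandlers and log are hypothetical, and the function accepts any WebSocket-like object so that it can be exercised without a live connection.

```javascript
// Wires the four WebSocket events to a logging callback.
// attachLoggingHandlers and log are hypothetical names used for illustration.
function attachLoggingHandlers(socket, log) {
  socket.onopen = function () {
    log("CONNECTED");
  };
  socket.onmessage = function (evt) {
    log("RECEIVED: " + evt.data);                    // data carries the message payload
  };
  socket.onclose = function (evt) {
    log("DISCONNECTED (code " + evt.code + ")");     // code is the close status
  };
  socket.onerror = function () {
    log("ERROR");                                    // the error event carries no detail
  };
  return socket;
}
```

In a page, log could append a line to a DOM element, for example attachLoggingHandlers(websocket, writeToScreen), where writeToScreen is a function you supply.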

To send a message to the server, all you need to do is place a call to the method send, as shown here:

var message = "Cutting Edge test: " +
  new Date().toString();
websocket.send(message);
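
Calling send before the connection has been established raises an error, so it can help to check the socket’s readyState first. Here is a minimal sketch of such a guard. The helper name sendIfOpen is hypothetical, while readyState and the value 1 for the open state come from the W3C API.

```javascript
// Value of WebSocket.OPEN in the W3C API.
var WEBSOCKET_OPEN = 1;

// Sends the message only when the connection is open.
// Returns true if the message was handed to the socket, false otherwise.
// sendIfOpen is a hypothetical helper name.
function sendIfOpen(socket, message) {
  if (socket.readyState !== WEBSOCKET_OPEN) {
    return false;
  }
  socket.send(message);
  return true;
}
```

A caller that gets false back can queue the message and retry it from the onopen handler.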

Figure 4 shows a sample page that’s an adaptation of the echo example found on the websocket.org Web site. In this example the server just echoes the received message back to the client.

Figure 4 The WebSocket Protocol in Action

If you’re interested in WebSocket programming for Internet Explorer 10, see bit.ly/GNYWFh.

The Server Side of WebSockets

In this article I mostly focused on the client side of the WebSocket Protocol. It should be clear that in order to use a WebSocket client, you need a proper WebSocket-compliant server that understands the requests and can reply appropriately. Frameworks for building a WebSocket server have started to appear. For example, you can try out Socket.IO for Java and Node.js (socket.io). If you’re looking for some Microsoft .NET Framework stuff, have a look at “Web Socket Server” from The Code Project at bit.ly/lc0rjt. In addition, Microsoft server support for WebSockets is available in IIS, ASP.NET and WCF. You can refer to the Channel 9 video, “Building Real-Time Web Apps with WebSockets Using IIS, ASP.NET and WCF,” for more details (bit.ly/rnYaw5).

Sliced Bread, Hot Water and WebSockets

As many have said, WebSockets are the most useful invention since sliced bread and hot water. After you’ve made sense of WebSockets you just wonder how the software world could’ve thrived without them. WebSockets are helpful in a number of applications, though not for just any applications. Any application where instant messaging is key is a potential scenario where you can seriously consider building a WebSocket server and a bunch of clients—Web, mobile or even desktop. Gaming and live-feed applications are other industry fields that will benefit immensely from the WebSocket Protocol. Yes, WebSockets are definitely the best after hot water!


Dino Esposito is the author of “Programming ASP.NET 4” (Microsoft Press, 2011) and “Programming ASP.NET MVC 3” (Microsoft Press, 2010), and coauthor of “Microsoft .NET: Architecting Applications for the Enterprise” (Microsoft Press, 2008). Based in Italy, Esposito is a frequent speaker at industry events worldwide. Follow him on Twitter at twitter.com/despos.

Thanks to the following technical experts for reviewing this article: Levi Broderick and Brian Raymor