Adventures In Comet and Multipart Mime

December 21, 2006

The Dojo Bayeux client implements a bunch of different "transports" and tries to pick the right one based on what the browser can support, the cross-domain requirements, and so forth. When we started down this path, the main reasons were to implement both the forever-frame and long-polling styles of Comet and to provide a platform for experimenting with alternate transport types (e.g., Flash sockets). One of the most promising of these experiments took advantage of the multipart mime support that's been tucked away in the Mozilla codebase for quite a while. What follows is one of those stories that makes people assume that I'm crazy to do what I do for a living. They might be right.
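To make the "pick the right one" step concrete, here is a minimal sketch of preference-ordered transport selection. It is not the actual cometd.js code: the transport labels, the `check` functions, and the `env` flags (`multipartXhr`, `xhr`, `iframes`, `crossDomain`) are all hypothetical names chosen for illustration.

```javascript
// Hypothetical registry, listed from most to least preferred.
// Each entry decides for itself whether it can run in the given environment.
const transports = [
  { name: "multipart-xhr", check: (env) => env.multipartXhr && !env.crossDomain },
  { name: "long-polling", check: (env) => env.xhr && !env.crossDomain },
  { name: "forever-frame", check: (env) => env.iframes },
];

// Return the name of the first transport whose check passes, or null if none does.
function pickTransport(candidates, env) {
  for (const transport of candidates) {
    if (transport.check(env)) {
      return transport.name;
    }
  }
  return null;
}
```

With a layout like this, demoting a misbehaving transport is just a matter of reordering or removing its entry.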

Multipart is attractive because it provides a way of avoiding TCP set-up and tear-down for each and every event across the channel. While that's not significant overhead (comparatively), cutting the number of HTTP header blocks sent can also help when it comes to wringing latency out of the system. The code indicates that multipart is supported on Safari and Mozilla, but while events are indeed delivered at the right times on Safari, you can't get at the payload until the connection closes completely. Not useful.
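For a sense of why this saves bytes, here is a rough sketch of how a server might frame each event as one part of a single long-lived multipart response. It assumes a `multipart/x-mixed-replace`-style stream with JSON payloads; the function names and the boundary value are made up for illustration.

```javascript
// Hypothetical helper: frame one event as a single multipart part.
// Only the small per-part header is repeated, not a full HTTP header block.
function formatPart(boundary, payload) {
  const body = JSON.stringify(payload);
  return `--${boundary}\r\nContent-Type: application/json\r\n\r\n${body}\r\n`;
}

// Hypothetical helper: the closing delimiter that ends the whole stream.
function formatClose(boundary) {
  return `--${boundary}--\r\n`;
}
```

The response headers (including the `Content-Type` that declares the boundary) go out once, and every later event costs only one call to `formatPart`.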

Things were looking better on Firefox and it was the preferred transport type in the Dojo client, but I think that's going to have to change. Sadly, it seems we can't actually tell when a multipart connection has failed. In "normal" XHR requests, the 200 HTTP status code plus a "finished" readystate indicates that the contents of the request can be read and control handed back. In the multipart case, each successful block fires off a load handler and resets the readystate. That means that the combination of readystate and status can't be used to differentiate between block success and connection success. Making matters worse, server-side connection failure doesn't fire any kind of readystatechange handler, and even if it did, it doesn't appear to be possible to determine if the connection is closed from any of the public properties on the object.
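The ambiguity can be shown with a toy model. The snapshots below are plain objects standing in for what a script would see on the XHR object; they are assumptions based on the behavior described above, not values captured from a browser.

```javascript
// The usual "is this request done?" test for a normal XHR.
function looksFinished(xhr) {
  return xhr.readyState === 4 && xhr.status === 200;
}

// Assumed state after one multipart block arrives (connection still open).
const afterOneBlock = { readyState: 4, status: 200 };

// Assumed state after the server drops the connection: no handler fired,
// so the properties still hold whatever the last block left behind.
const afterSilentDrop = { readyState: 4, status: 200 };
```

Both snapshots pass `looksFinished`, so the check that works for ordinary requests says nothing about whether the multipart channel is still alive.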

So, OK, what about falling back to a timer that restarts the connection every N seconds for good measure? This might work in cases like failover where a lag of 10 to 30 seconds might be acceptable but not for normal operation. Should events flow regularly, it might never be necessary to hit this "backstop". Not great, but I gave it a try, only to discover that Firefox won't give you responseText of an XHR request if the connection is marked as multipart but the response isn't a 200 wrapped in a multipart block. Since we're trying to use HTTP status codes correctly and keep the server internals from needing to fork significantly for each pluggable transport, it's something of a step backward to need this kind of hand-holding.
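A minimal sketch of that backstop timer, assuming a hypothetical `restart` callback that tears down and reopens the connection. The `createWatchdog` name and the injectable `timers` argument are illustrative only and not part of cometd.js.

```javascript
// Hypothetical watchdog: if no block arrives within timeoutMs, call restart().
// The timers argument exists only so the timing can be faked in tests.
function createWatchdog(timeoutMs, restart, timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (id) => clearTimeout(id),
}) {
  let handle = null;

  // Cancel any pending timeout and start a fresh one.
  function arm() {
    if (handle !== null) {
      timers.clear(handle);
    }
    handle = timers.set(() => {
      handle = null;
      restart(); // assume the connection is dead and reopen it
      arm();     // keep watching the new connection
    }, timeoutMs);
  }

  return {
    start: arm,
    touch: arm, // call from the load handler each time a block is delivered
    stop() {
      if (handle !== null) {
        timers.clear(handle);
      }
      handle = null;
    },
  };
}
```

Calling `touch()` on every delivered block means a busy channel never trips the timer, which matches the "backstop" behavior described above; an idle but healthy channel still gets restarted every `timeoutMs`.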

I'd still like to support the multipart transport type, but until at least one of the implementations becomes rational for use from the XHR object, I think I'm going to just be commenting this transport out in cometd.js. Like XHR itself, it's one to mark down for resurrection a year or two from now. At least we still have good enough options in the meantime.