Infrequently Noted

Alex Russell on browsers, standards, and the process of progress.

Cross Domain Comet

Last night we got Cometd working across domains, without plugins.

"Cross domain Ajax" (see JSONP) hasn't taken off the way that I'd hoped, perhaps in part because of a lack of necessity. It might just be easier to set up a permissive proxy to do requests to services hosted on other domains. Moore's Law discounts distributed coordination in favor of localized transformation. But whatever the reason, there hasn't been huge uptake, even with beautiful abstractions like Dojo's JSON-RPC facilities (see the Yahoo services example). Intensely useful, but not everyone is on board just yet. Hopefully Joseph Smarr's talk on the topic at OSCON will help turn some heads.

But there's one scenario where it's a total no-brainer: cross-domain Comet. Normally when you set up a Comet server, it either has to sit in front of the application server or have requests passed off to it from the main web server. In the case where the app server dishes off the requests, it's potentially still "involved" in the process if it acts like a proxy. Unless your front-end web server is also built with event-based-IO in it's core, this can really really hurt. Luckily, there's hope on the horizon. Sun even announced that they're baking Comet support directly into their app server. But until the web tier collectively upgrades, we need some other way to simplify the configuration of apps+Comet servers. Until now, help has come in the form of iframes plus document.domain setting. This lets you put the Comet server on a sub-domain and still pass event payloads back-and-forth between parent document and the iframe. This is how the tried-and-true mod_pubsub client operates, but it's brittle. Browsers get picky when you try to set document.domain and things don't line up exactly the right way. Add Safari's non-deterministic iframe behaviors to the mix and it's a recipe for sleepless nights and begging forgiveness from the ops folks when you need just one more DNS change and server restart.

We can do better.

The solution to this that Juggernaut employs is to use the built-in cross-domain capabilities of the Flash player and it's proprietary crossdomain.xml scheme. This isn't a bad solution, but having Flash turned off isn't unheard of. The performance-across-the-flash-boundary problems are mostly licked these days (thanks dojo.flash!), but the complexity with getting it all set up correctly puts it in the position of "Plan B" if something better comes along. And we just might have something better in the form of <script src="...">.

By building on top of the Dojo ScriptSrcIO infrastructure it was trivial to make a JSONP transport for Cometd. We just adapt the long-poll style of Comet but make the initial handshake messages JSONP-clued. That in place, the rest works just like "normal" long-polling, except that the client can connect from any domain. The configuration problem just dropped from "make sure everything cooperates at the systems and network level" to "make sure that the client and server speak the same protocol"...and I like solutions that put the problem in software and not external configuration or human effort. The checked-in server-side code shows how to handle it.

So what's the downside? The first downside is that the latency characteristics for cross-domain Comet using this technique are the same as long-polling. Not bad, mind you, but not as optimal as iframe, multipart-mime, or Flash-based solutions when you have large numbers of messages being thrown at the client. Additionally, it's more difficult to know when a connection "times out" or in some other way fails, but I think these are tractable. Having a Plan A and a Plan B for cross-domain Comet and a server that can speak both styles via pluggable transport wrappers is exciting to me. By lowering the configuration barrier, I think we'll start to see a significant uptick in Comet adoption.

Next time: Why we need a standard protocol for this stuff, and what we're doing about that in Cometd.

PS: as usual, big thanks go out to my employer SitePen who is funding this research and making it available under liberal Open Source licenses.

Cometd: The Long Tail of Bad Puns

Thanks to a premature Ajaxian mention there's some real wind in the sails of Cometd, a primordial little Comet server and protocol project that I'm lucky enough to be working on. The goal of the project is to produce a content-level protocol for publish/subscribe event notification down to browsers. We need a name for the protocol (as distinct from the code), so if you have ideas, we're all ears.

Along with the protocol, we're producing multiple server implementations and at least one JavaScript client library.

Unlike some predecessors, however, a couple of neat features should help to make Cometd clients and servers resilient in the face of browser and network stupidity. The first of these features is "transport negotiation". At the core of it, there's a realization that browsers very rarely all do the same thing the same way...especially when you're out pushing on the lightly tested bits like we are. We get around this by having the client and server advertise what they want to speak over, and if they can come to some agreement, the conversation starts. Since the protocol is so simple, this generally means that servers can implement almost every type of transport with exactly the same code. The Twisted Python server does exactly this. This is very interesting because it allows us to implement most of the known Comet techniques with very little overhead. In the last 2 days I've implemented 3 different styles (forever-frame, multi-part-mime, and long-polling) which cover most of the modern browsers with at least one transport method. Each method exhibits different bugs, browser compatibility headaches, and performance characteristics, but being able to swap out one technique cheaply for another is going to let us experiment that much faster with how best to implement Comet systems.

If there's a secret sauce to Cometd it's that both of the initial implementations are being based on async network responder frameworks. On the Perl side, Perlbal is doing the heavy lifting while the Python variant relies on Twisted. Obviously there's a lot more to these things than picking a good library, but it sure helps. There's even been some discussion of how Java, PHP, and Erlang implementations could be made to scale using similar techniques.

Hopefully we'll get a new name for the protocol bits of Cometd soon (and a draft of the spec shortly thereafter) and once the Python server does topic and client pruning I'll try to get some demos hooked up. After all, it's the sexy collaboration that sells this stuff.

Update: seems I forgot to mention that in addition to the mailing list, the Cometd project also has one of them newfangled "blog" thingers.

Upgrade

It's been too long since I've written a non-trivial amount of Python and for the last couple of days I've been spending some time to look around and see what got better and what still sucks in Python land. The good news is that a lot of things have gotten better, including the language itself. Sadly, Python is still missing a clean way to represent enclosed scope, which seems totally out of kilter for a language that is otherwise so dynamic. Sure, you return a named throw-away function from another function in order to generate a closure, but you can't create an anonymous variant of the same thing. Also, Python's lambda function is still crippled, to my great chagrin. JavaScript's function and Ruby's blocks still have Python by the balls on this one.

Also, I'm not sold on decorators. The burgeoning catastrophe that is Java 5 annotations should be an object lesson, but I'm afraid good taste hasn't prevailed in either language yet. But whatever quibbles I might have with various syntactic forms in modern Python, it's still the best thing going. Idiomatic Python remains readable, audit-able, and fast.

Here's the short list of the toolchain I'm assembling for an upcoming project:

While the greenlet project looks like an interesting alternative to Twisted, it doesn't look "baked" enough to handle most of the situations I'll be relying on Twisted for. In much the same way that library depth is the only reason I keep inflicting Java on myself, protocol implementation breadth and depth are the reasons that Twisted probably has very long legs, despite the kludgyness of requiring factory classes every time you want to sneeze.

I'm tremendously excited about lxml. It implements the ElementTree API but extends it with all kinds of XPath and XSLT goodies that make doing XML in Python much less painful than the cluster-fuck that is 4suite and PyXML. Our long XML processing nightmare may finally be over, no surprise, thanks to libxml/libxslt.

Is there anything that I've totally missed in the last 2+ years of Python development that I shouldn't be going without?

Doability

Somewhere there must be written an engineering mantra that states something along the lines of "knowing that a thing can be done liberates the mind to do it".

Is there a name for this "law" of software engineering?

Phobos builds available

I've mentioned Phobos before, but until now you couldn't get code. Seems that builds are now available for the adventurous to play with.

Older Posts

Newer Posts