Fast DOM Queries in Today's Browsers

March 18, 2006

What follows is a janky hack. If you do not have the stomach for things that are useful in the real world, please stop reading here. "But it's not standards compliant!" comments will receive no sympathy. Validatorians, you've been warned.

If you're still reading, you're probably aware of the crappy primitives that the W3C has bestowed us with for scripting arbitrary collections of nodes. Things like Microsoft's HTC and Mozilla's XBL allow for browser-specific markup-upgrade paths but these aren't really feasible in the "real world" since they require lots of code branching and different semantics for attaching a behavior. Tools only succeed where they lower our costs. This is why Dojo and Behavior (and even netWindows, back in The Day) work so hard to provide a portable basis for applying behaviors.

Given that the W3C has f'd us in the ear and that the browser manufacturers can't get it together enough to come up with one non-standard way to apply scripts to collections of nodes, we're back to iteration. Updating the collection to which a behavior should be applied when a DOM is updated presents a particular challenge. IE doesn't throw mutation events for DOM changes nor does it provide client-side XSLT. Both hands our tied behind our back.

So we need something else. document.getElementsByName() would work quite well if the browsers paid attention to name attributes for any element. Alas, they don't. Which brings us back to the one fast query browsers will support on any element: document.getElementById(). With it we can build a fast, efficient query function for every browser but Safari:

function elementsById(id){
var nodes = [];
var tmpNode = document.getElementById(id);
while(tmpNode){
nodes.push(tmpNode);
tmpNode.id = "";
tmpNode = document.getElementById(id);
}
for(var x=0; x<nodes .length; x++){
nodes[x].id = id;
}
return nodes;
}

A permutation of this that caches the results and does not set the IDs back to the original value allows re-runs of the function to determine if new elements have been added to the group and/or if a node should be removed:

var groupCache = {};
function elementsById(id){
if(!groupCache[id]){
groupCache[id] = [];
}
var nodes = groupCache[id];
for(var x=0; x<nodes .length; x++){
if(nodes[x].id != ""){
nodes.splice(x, 1);
x--;
}
}
var tmpNode = document.getElementById(id);
while(tmpNode){
nodes.push(tmpNode);
tmpNode.id = "";
tmpNode = document.getElementById(id);
}
return nodes;
}

Obviously, getting a list of added/removed nodes from this function might be preferable to receiving the full list, but I'll leave a better API for this as an exercise. Safari is the only browser which appears to not support this method of constructing queries, but we can fall back to iteration. It's certainly not going to be any slower than current methods. The hack is also made somewhat less useful by the W3C's bone-headed decision to limit elements to a single ID.

The jury is still out as to whether or not this is going to prove useful, but I can already imagine it being an optional optimization for Dojo users looking to make their apps go like hell on pages with complex DOMs.

If only it weren't necessary.

Update: After reading much WebCore source code, I'm not aware of a way to make elementsById() work on the current Safari. There is good news, however. On the latest Konqueror release and nightly Safari builds this hack works flawlessly. In short, this will very soon the the most widely available, fastest DOM query method. More news regarding elementsById() to follow shortly.