Infrequently Noted

Alex Russell on browsers, standards, and the process of progress.

How IE Mangles The Design Of JavaScript Libraries

A lot of hyperbole gets thrown around about how painful IE 6 and 7 make the world of JS development, and so I thought I'd do a bit of cataloging to help those using Dojo understand why it's built the way it is and indeed, why all JS widget libraries suffer similar design warts. I know the good folks up in Redmond are working hard at delivering something better, but the fact of the matter remains that until they outline when we're going to get it (and the version after) and how it's going to be distributed, IE 8 only serves to make the current situation look as shabby as it really is. Here are but 5 examples of how IE makes your toolkit of choice less elegant than it probably should be.

  1. Arrays Can't Be Usefully Subclassed (test case)

    At first blush, this seems wrong. You can use the Array base-class as the prototypal delegate for any user-defined class you wish. Methods are correctly delegated to and hash-style indexes work fine. Almost everything works right...except when you try to use the built-in array manipulation methods like push, pop, and shift. They dutifully change the internal contents of the subclass instance's indexed attributes, but they don't manipulate the length property. This means that while you can use for(var x in list){ ... }-style iteration, you can't do anything *aside from key iteration* to know how many items are in the array. Obviously, one could try to wrap the intrinsic functions and detect how they manipulate the length property, but then you've ruined their [DontEnum] status and now they end up in the iterable surface area of instances. Ugh.

    Arrays without a working length property are nearly useless, and JScript mangles the design of toolkits as a result.
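
    The pattern under discussion can be sketched as follows. MyList and its first method are hypothetical names used only for illustration. The comments describe what an ES5-conforming engine does; the JScript behavior described above differs in that push leaves length stale.

```javascript
// Hypothetical subclass: delegate to a real Array through the prototype chain.
function MyList() {}
MyList.prototype = [];

// Hypothetical extra method, added only to show that delegation works.
MyList.prototype.first = function () {
  return this[0];
};

var list = new MyList();
list.push("a");
list.push("b");

// Inherited methods and indexed reads behave as expected.
var first = list.first();

// An ES5-conforming engine has push() write the new length back onto the
// instance. Where push() does not do this, length stays stale and the
// subclass is close to useless as a list.
var lengthAfterPush = list.length;

// Direct index assignment does not update length on such an instance,
// because the instance itself is not a true Array.
list[5] = "f";
var lengthAfterIndexSet = list.length;
```

    Even where push behaves, the last two lines show that the instance is still not a full Array: writing past the end by index does not grow length.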

    So how do we get dojo.NodeList to be a "real" array with extra methods then?

    As you might expect, it's a giant hack. When you use the "new" keyword with the dojo.NodeList function, you expect that the system will create a new instance and do its normal "stamp with a constructor" business. Instead, we resort to creating (and populating) a regular Array instance and "NodeList-ifying" it by copying named attributes from the class prototype into the instance as member properties. The "constructor" function then explicitly returns that object, bypassing the "new" keyword's create/stamp machinery, at which point the return of the new operator becomes our explicit return and not the object which it would have otherwise implicitly returned.
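
    A minimal sketch of that create-and-mix-in idea, assuming nothing about Dojo's actual source: MyList and first are hypothetical stand-ins for dojo.NodeList and its methods.

```javascript
// Hypothetical stand-in for a NodeList-like class; not Dojo's actual source.
function MyList() {
  // Build a real Array from the arguments so length tracking is native.
  var arr = Array.prototype.slice.call(arguments);
  // "Stamp" the instance by copying every method off the prototype.
  for (var name in MyList.prototype) {
    arr[name] = MyList.prototype[name];
  }
  // Returning an object from a constructor overrides the object that
  // "new" would otherwise have produced.
  return arr;
}

MyList.prototype.first = function () {
  return this[0];
};

var list = new MyList("a", "b");
list.push("c"); // native push, so length is maintained by the engine
```

    The costs are visible in the sketch: every construction pays for the copy loop, the result is not an instanceof MyList, and methods added to the prototype later never reach instances that already exist.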

    In Dojo 0.9 we had used an even more aggressively hackish workaround for IE which involved creating a sub-document and mapping its interpreter's intrinsic Array class into the parent document at a different name. Both are slow for different reasons but we eventually switched to the create-and-mix-in style because some popup blockers were interfering with the old method.

    Lest you think that Dojo is a dirty, dirty toolkit for doing this kind of thing, consider the janktastic "it's not really an array" thing that jQuery resorts to instead. By giving up all "[]" index operations, jQuery manually maintains its internal length property by re-implementing push, pop, and the other array manipulation functions. This has the benefit of allowing prototypal delegation to work for pre-existing instances when new features are added to the base prototype, but at the expense of no longer being able to think about a dense list of things as an array. Dojo's approach is painful, but so are all the alternatives today.
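
    For comparison, the array-like alternative might look roughly like this. Bag is a hypothetical class written for illustration and is not taken from jQuery's source.

```javascript
// Hypothetical array-like class; not jQuery's actual source.
function Bag() {
  this.length = 0;
}

// Re-implemented push: store the item and bump length by hand.
Bag.prototype.push = function (item) {
  this[this.length] = item;
  this.length += 1;
  return this.length;
};

// Re-implemented pop: remove the last item and shrink length by hand.
Bag.prototype.pop = function () {
  if (this.length === 0) {
    return undefined;
  }
  this.length -= 1;
  var item = this[this.length];
  delete this[this.length];
  return item;
};

var bag = new Bag();
bag.push("a");
bag.push("b");
var popped = bag.pop();

// Methods added to the prototype later are visible on existing instances.
Bag.prototype.first = function () {
  return this[0];
};
var first = bag.first();
```

    Delegation works for instances created before first was defined, but an assignment such as bag[3] = "x" would bypass the bookkeeping and leave length wrong.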

    I think it's safe to say that both toolkits would subclass Array directly to save code, were it a reasonable thing to do.

  2. Where Art Thou Getters/Setters?

    As JavaScript toolkits get pushed out of their current workhorse tasks of plastering over JavaScript and DOM implementation gaffes by positive browser progress and Moore's Law, they increasingly take on application construction tasks (e.g., dojo.data). As the toolkits have approached these tasks, we've collectively started to hit some very serious usability limitations due, in large part, to JScript's lack of progress.

    Toolkits like Dojo have widget systems because HTML just hasn't kept up. That means that these toolkits have a responsibility to keep as many of the positive aspects of the "native" web as they can. From CSS customization to accessibility all the way through implementing declarative creation and DOM-ish JavaScript APIs, the better a job a toolkit can do in making the abstraction feel more solid, the better the toolkit is. Widgets are essentially a way to "subclass HTML elements".

    In many places, DOM allows you to affect the behavior of the visible document using "setter"-style syntax. For example:

    
    document.getElementById("foo").style.border = "5px dotted black";
    

    Custom widget classes can have the same behavior on every browser except IE.

    This means, of course, that JavaScript toolkits can't really implement the behavior, backing JavaScript programmers up against a wall when they design their tools. Instead of providing the natural property-oriented behavior, it forces class authors to write getSomeProperty/setSomeProperty method pairs on their classes should they want to do anything when values are gotten or set. The resulting code feels a lot more like Java than JavaScript, which is usually a sign that something is horribly wrong in a browser.
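
    The difference in API feel can be sketched like this. Both widget classes are hypothetical, and the property version assumes an engine that supports ES5's Object.defineProperty, which is exactly the kind of language-level support the method-pair version has to do without.

```javascript
// Hypothetical widget with a method pair: the only portable option when
// the engine has no getters or setters.
function MethodWidget() {
  this._label = "";
  this.renderCount = 0;
}
MethodWidget.prototype.getLabel = function () {
  return this._label;
};
MethodWidget.prototype.setLabel = function (value) {
  this._label = String(value);
  this.renderCount += 1; // stand-in for re-rendering the widget
};

// The same hypothetical widget with an accessor property.
function PropertyWidget() {
  this._label = "";
  this.renderCount = 0;
}
Object.defineProperty(PropertyWidget.prototype, "label", {
  get: function () {
    return this._label;
  },
  set: function (value) {
    this._label = String(value);
    this.renderCount += 1; // same side effect, triggered by assignment
  },
  enumerable: true,
  configurable: true
});

var a = new MethodWidget();
a.setLabel("Save");

var b = new PropertyWidget();
b.label = "Save"; // reads like setting a DOM property
```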

    As bad as this problem is for visual widgets, it's worse for data binding systems. APIs like dojo.data would be designed in fundamentally different ways if getters and setters were available everywhere. Instead of the rigamarole of making users fetch an item reference and then fetch attribute values using the opaque item handle and the property name, we'd just set up getters and setters on the properties themselves and defer the implementation of fetching those values down to the data store. Further, assigning a linkage between a dojo.data query or store and a widget which represents it could be as simple as assigning a property to the widget object.
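
    A rough sketch of the two styles side by side. Store, fetchItem, and bindItem are hypothetical names for an in-memory example and are not claimed to match the dojo.data API; the property style again assumes ES5's Object.defineProperty is available.

```javascript
// Hypothetical in-memory store; names are illustrative, not dojo.data's.
function Store(rows) {
  this._rows = rows;
}
// Opaque-handle style: the caller holds an item and asks the store for values.
Store.prototype.fetchItem = function (index) {
  return { _index: index };
};
Store.prototype.getValue = function (item, attr) {
  return this._rows[item._index][attr];
};
Store.prototype.setValue = function (item, attr, value) {
  this._rows[item._index][attr] = value;
};

// Property style: wrap the handle so plain reads and writes defer to the store.
function bindItem(store, item, attrs) {
  var bound = {};
  attrs.forEach(function (attr) {
    Object.defineProperty(bound, attr, {
      get: function () {
        return store.getValue(item, attr);
      },
      set: function (value) {
        store.setValue(item, attr, value);
      },
      enumerable: true
    });
  });
  return bound;
}

var store = new Store([{ name: "Ada", role: "admin" }]);
var item = store.fetchItem(0);
var viaHandle = store.getValue(item, "name");

var person = bindItem(store, item, ["name", "role"]);
person.role = "owner"; // routed through store.setValue
var viaProperty = person.role; // routed through store.getValue
```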

    So are workarounds to this possible? I think they are, and I'm testing some of them out for use in Dojo 1.1 right now. I'll post more about them should they pan out. Every avenue which looks potentially workable right now involves gigantic hacks which also deeply constrain API designs. Fundamentally, this problem can't be solved without good language-level support.

    It's perhaps folly to assume that this will be addressed in IE 8, but given the enormous back-pressure of nearly every JavaScript toolkit author demanding this feature and the embarrassment of every other browser beating them to the punch, I have some hope that we could see getters and setters for JScript in the near future. It won't matter much, though, unless the JScript team ships their new engine to all IE versions when they release IE 8. Not bloody likely.

  3. Performance

    Kudos are in order to the JScript team for fixing their long-b0rken GC heuristic and pushing it out to everyone...but it's the tip of the iceberg.

    Performance is one of those areas where differences in implementations can tightly circumscribe what's possible despite exacting spec conformance. On this front, JScript's raw VM-level execution time leaves a lot to be desired, but the true travesties really show up when you hit the DOM for computed style information or try to do anything reasonably complicated that involves string operations.

    Most non-trivial blocks of JS code today rely on innerHTML to bootstrap some new chunk of DOM in response to user action due in large part to the cross-browser speed and size advantages of innerHTML vs. raw DOM methods for equivalent DOM structures. This reality pushes IE's string performance woes to the fore as more and more client-side systems push far enough to hit the new "wall".
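
    To make the string dependence concrete, here is a small sketch of innerHTML-style bootstrapping. The helper names and the "target" element id are hypothetical. Collecting fragments in an array and joining once is a commonly cited workaround for slow string concatenation, used here as an assumption rather than something measured for this post.

```javascript
// Escape text for safe inclusion in markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Collect fragments in an array and join once, rather than growing one
// string with repeated "+=".
function buildList(items) {
  var parts = ["<ul>"];
  for (var i = 0; i < items.length; i++) {
    parts.push("<li>", escapeHtml(items[i]), "</li>");
  }
  parts.push("</ul>");
  return parts.join("");
}

var html = buildList(["one", "two & three"]);

// In a browser, a single assignment then lets the parser build the subtree.
// "target" is a hypothetical element id.
if (typeof document !== "undefined") {
  var target = document.getElementById("target");
  if (target) {
    target.innerHTML = html;
  }
}
```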

    Similarly, getting computed box model calculations out of IE is not for the faint of optimization foo. When we profiled Dojo's widgets to plan our attack for 0.9 and 1.0, we noted very quickly that getting box-model data out of the browser for any element is hugely costly on every browser, but on IE the cost was not just big... it was enormous. Our best guess right now is that the properties on the currentStyle property are re-calculated when they're requested from script and not cached in the bound object when layout happens. The resulting performance penalty requires that developers nearly never manage layout in code, severely constraining the layouts which are attempted by toolkits like Dojo.

    Across the board, from DOM performance to raw JScript execution speed, IE is a dog, and the odds are good that whatever toolkit you're using spends a lot of time working around that reality.

  4. Doctype Switching

    Doctype switching to toggle box-model behavior is perhaps the single most limiting implementation error in IE. Saner browsers allow you to use a CSS property to affect the layout model in use in a particular section of a document. This makes tons of sense in a templated world where most of the markup your system generates starts and ends in the "news hole". Today, that covers nearly everyone. A quick line count in any HTML document shows that doctypes are a scarce resource whose scarcity is made problematic when its semantics are overloaded. I've been on product teams where the idea of changing the doctype would require months to recover from. That kind of cost related to what should be simply a markup dialect change (not a formatting policy change) implies strongly that the doctype is a terribly brittle way to control several independent concerns.

    Instead of giving devs fine-grained layout system control, IE makes it all-or-nothing. The global flag approach backs toolkit developers into doing script-based layout calculations or "just throw it in another div" solutions where we'd really rather not. Both are slow and both may be required since it's completely impractical to dictate to users which doctype they'll be using. While any app may be able to be disciplined enough to not care, toolkit developers must work everywhere. Hilarity ensues.
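
    The arithmetic a toolkit ends up doing in script looks something like the following. The function names are hypothetical, and the sketch assumes uniform padding and border on both sides and ignores margins.

```javascript
// Rendered outer width of a box, given its declared CSS width.
// model is "content-box" (the standards-mode default) or "border-box"
// (how IE sizes boxes in quirks mode).
function outerWidth(box, model) {
  if (model === "border-box") {
    // Declared width already includes padding and border.
    return box.width;
  }
  // Declared width covers content only; padding and border are added.
  return box.width + 2 * box.padding + 2 * box.border;
}

// The declared width needed to hit a target outer width under each model.
function declaredWidthFor(targetOuter, box, model) {
  if (model === "border-box") {
    return targetOuter;
  }
  return targetOuter - 2 * box.padding - 2 * box.border;
}

var box = { width: 200, padding: 10, border: 5 };
var standards = outerWidth(box, "content-box"); // 200 + 20 + 10 = 230
var quirks = outerWidth(box, "border-box"); // 200
```

    Because the toolkit cannot know which doctype the host page uses, it has to detect the model at runtime and run this kind of correction for every box it sizes.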

    I fear this is going to get even worse with IE8 as the IE team looks to implement some of HTML 5 and hopefully many of CSS 2.1's clarifications. The sooner they abandon the global switch, the better...but I'll wager it's pain they just don't feel. Building a browser is a very different pursuit from building portable apps to run inside it.

  5. HTCs Can't Be Inlined (Even With Hacks)

    Modern browsers have built-in widget systems. On IE, it's HTCs + Viewlink and on Firefox it's XBL. Even a cursory reading through the docs for both is enough to illuminate the gigantic overlap. Alas, no one is yelling at them to standardize and the result is a terrible mess in which both sub-optimal formats limp along with nearly zero Open Web usage.

    So why do I single out IE for whipping here when XBL is just as lame and similarly b0rken with regards to single-file embedding? Well, on Mozilla, you have a lot more "outs". I strongly suspect that you can use "data:" urls to generate and evaluate component definitions for FF, which would enable compiling down from a single (more sane) format in the running page environment. IE prevents any such useful code-loading approaches, meaning you have to generate files on disk in their b0rken-ass format in order to be able to use them. Given how far we've gotten with non-builtin widget systems, it's pretty clear that toolkit authors aren't going to contort themselves into requiring a build step that splats files all over disk just so that we can give IE and Mozilla different views of the same component description. Instead, we all limp along on our own class hierarchies without any of the benefits of element subclassing, getter/setter handling, and inline (scoped) method description that these browser-provided systems would allow. It's kind of pitiful, really.
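
    The "data:" URL idea amounts to packaging a generated definition as a URL instead of a file. The sketch below only builds such a URL; toDataUrl, the MIME type, and the placeholder source string are assumptions, and whether any browser would accept the result as a component definition is untested here.

```javascript
// Package generated source text as a "data:" URL.
// The MIME type is an assumption chosen for illustration.
function toDataUrl(source, mimeType) {
  return "data:" + mimeType + ";charset=utf-8," + encodeURIComponent(source);
}

// Placeholder for a component definition compiled in the running page.
// This is not real XBL or HTC markup.
var generated = "<component/>";
var url = toDataUrl(generated, "application/xml");

// The payload survives the round trip through the URL encoding.
var payload = decodeURIComponent(url.slice(url.indexOf(",") + 1));
```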

    IE8 may include strong "data:" URL support given that it's needed to pass Acid2, but I'm not holding my breath. I strongly suspect that HTCs are a dark, unloved corner of the Trident codebase that none of the current engineering team are really fired up about fixing (which they could have done by just allowing Data Islands to contain HTC definitions...but I digress). The takeaway here is that we probably shouldn't even need JS toolkits to build widget systems, but it's too late now. We're an abstraction short and a decade late and now the Flex frankenstein is beating HTML at its own game.

In a vacuum of feature data or builds to work with, it's prudent to assume that IE 8's DOM and JavaScript implementations will continue to warp attempts at building useful, idiomatic JavaScript libraries to ease the problems that HTML + CSS aren't effectively solving. So next time you wonder why your toolkit of choice is built the way it is or why it's even necessary, just remember that in many cases they are protecting you from a decade or more of bad decision making.

From that perspective, they're worth every penny.