Smaller is better.
I gathered the numbers and stand behind them, so let me quickly outline where they come from, why they're fair, and why they matter to your app.
I took the average of three separate runs of the TaskSpeed benchmark in comparing the latest versions of both Dojo and jQuery. The numbers were collected on isolated VM's on a system doing little else. You may not be able to reproduce the exact numbers, but across a similar set of runs, the relative timings should be representative.
So why is TaskSpeed a fair measuring stick? First, it does representative tasks and the runtime harness is calibrated to ensure statistically significant results. Secondly, the versions of the code for each library are written by the library authors themselves. The Dojo team contributed the Dojo versions of the baseline tasks and the jQuery team contributed theirs. If any library wants to take issue with the tests or the results, they only need to send Pete a patch. Lastly, the tests run in relative isolation in iframes. This isn't bulletproof -- GC interactions can do strange things and I've argued for longer runs -- but it's pretty good as these things go. I took averages of multiple runs in part to hedge against these problems.
The comparison to jQuery is fair on the basis of syntax and market share. If you compare the syntax used for Dojo's tests with the jQuery versions, you'll see that they're similarly terse and provide analogous conveniences for DOM manipulation, but the Dojo versions lose the brevity race in a few places. That's the price of speed, and TaskSpeed makes those design decisions clear. As for market share, I'll let John do the talking. It would be foolish of me to suggest that we should be comparing Dojo to some other library without simultaneously suggesting that his market share numbers are wrong; and I doubt they are.
Given all of that, do the TaskSpeed numbers actually matter for application performance? I argue that they do for two reasons. First, TaskSpeed is explicitly designed to capture common-case web development tasks. You might argue that the weightings should be different (a discussion I'd like to see happen more openly), but it's much harder to argue that the tests do things that real applications don't. Because the toolkit teams contributed the test implementations, they provide a view to how developers should approach a task using a particular library. It's also reasonable to suspect that they demonstrate the fastest way in each library to accomplish each task. It's a benchmark, after all. This dynamic makes plain the tradeoffs between speed and convenience in API design, leaving you to make informed decisions based on the costs and benefits of convenience. The APIs, after all, are the mast your application will be lashed to.
I encourage you to go run the numbers for yourself, investigate each library's contributed tests to get a sense for the syntax that each encourages, and get involved in making the benchmarks and the libraries better. That's the approach that the Dojo team has taken, and one that continues to pay off for Dojo's users in the form of deliberately designed APIs and tremendous performance.
I've been invited by Chris Messina and some kindly folks at MSFT to participate in a panel at this year's SxSW regarding the value and/or necessity of view-source, and so with apologies to my fellow panelists, I want to get the conversation started early.
First, my position: ceteris paribus, view-source was necessary (but not sufficient) to make HTML the dominant application platform of our times. I also hold that it is under attack — not least of all from within — and that losing view-source poses a significant danger to the overall health of the web.
That's a lot to hang on the shoulders of a relatively innocent design decision, and I don't mean to imply that any system that has a view-source like feature will become dominant. But I do argue that it helps, particularly when coupled with complementary features like reliable parsing, semantic-ish markup, and plain-text content. Perhaps it's moving the goal line a bit, but when I talk about the importance of view-source, I'm more often than not discussing these properties together.
To understand the importance of view-source, consider how people learn. Some evidence exists that even trained software engineers chose to work with copy-and-pasted example code. Participants in the linked study even expressed guilt over the copy-paste-tweak method of learning, but guilt didn't change the dynamic: a blank slate and abstract documentation doesn't facilitate learning nearly as well as poking at an example and feeling out the edges by doing. View-source provides a powerful catalyst to creating a culture of shared learning and learning-by-doing, which in turn helps formulate a mental model of the relationship between input and output faster. Web developers get started by taking some code, pasting it into a file, saving, loading it in a browser and hitting
ctrl-r. Web developers switch between editor and browser between even the most minor changes. This is a stark contrast with technologies that impose a compilation step where the process of seeing what was done requires an intermediate step. In other words, immediacy of output helps build an understanding of how the system will behave, and
ctrl-r becomes a seductive and productive way for developers to accelerate their learning in the copy-paste-tweak loop. The only required equipment is a text editor and a web browser, tools that are free and work together instantly. That is to say, there's no waiting between when you save the file to disk and when you can view the results. It's just a
With that hyper-productive workflow as the background, view-source helps turn the entire web into a giant learning lab, and one that's remarkably resilient to error and experimentation. See an interesting technique or layout? No one can tell you "no" to figuring out how it was done. Copy some of it, paste it into your document, and you'll get something out the other side. Browsers recovering from errors gracefully create a welcome learning environment, free of the inadequacy that a compile failure tends to evoke. You can see what went wrong as often as not. The evolutionary advantages of reliable parsing have helped to ensure that strict XML content comprises roughly none of the web, a decade after it was recognized as "better" by world+dog. Even the most sophisticated (or broken) content is inspectable at the layout level and tools like Firebug and the Web Inspector accelerate the copy-paste-tweak cycle by inspecting dynamic content and allowing live changes without reloads, even on pages you don't "own". The predicate to these incredibly powerful tools is the textual, interpreted nature of HTML. There's much more to say about this, but lets instead turn to the platform's relative weaknesses as a way of understanding how view-source is easily omitted from competing technologies.
The first, and most obvious, downside to the open-by-default nature of the web is that it encourages multiple renderers. Combined with the ambiguities of reliable parsing and semantics that leave room for interpretation, it's no wonder that web developers struggle through incompatibilities. In a world where individual users each need to be convinced to upgrade to the newest version of even a single renderer, differences only in version can wreak havoc in the development process. Things that work in one place may not look exactly the same in another. This is both a strength and a weakness for the platform, but at the level of sophisticated applications, it's squarely a liability. Next, ambiguities in interpretation and semantics mean that the project of creating tooling for the platform is significantly more complex. If only one viewer is prevalent (for whatever reason), then tools only need to consume and generate code that understands the constraints, quirks, and performance of a single runtime. Alternate forms of this simplification include only allowing code (not markup) so as to eliminate parsing ambiguity. The code-not-markup approach yields a potentially more flexible platform and one that can begin to execute content more quickly (as Flash does). These advantages, taken together, can create an incredibly productive environment for experts in the tools that generate content: no output ambiguity, better performance, and tools that can deliver true WYSIWYG authoring. These tools can sidestep the
ctrl-r cycle entirely.
Similarly Flex, Laszlo, GWT's UI Binder, and Silverlight have discovered the value in markup as a simple declarative way for developers to understand the hierarchical relationships between components, but they correspond to completely unambiguous definitions of components they rely on compiled code — not reliably parsed markup — for final delivery of the UI. These tight contracts turn into an evolutionary straightjacket. Great if you're shipping compiled code down the wire that can meet the contract, but death if those tags and attributes are designed to live for long periods of time or across multiple implementations. You might be able to bolt view-source into the output, but it'll always be optional and ad-hoc, features that work against it being pervasive. Put another way, the markup versions of these systems are leaky abstractions on the precise, code-centric system that under-girds both the authoring and runtime environments. This code-centric bias is incredibly powerful for toolmakers and "real" developers, but it cuts out others entirely; namely those who won't "learn to program" or who want to build tools that inject content into the delivery format.
Whatever the strengths of code-based UI systems, they throw web crawlers for a loop. Today, most search engines deal best with text-based formats, and those search engines help make content more valuable in aggregate than it is on its own. Perhaps it's inevitable that crawlers and search engines will need to execute code in order to understand the value of content, but I remain unconvinced. As a thought experiment, consider a web constructed entirely of Flash content. Given that Flash bytecode lacks a standard, semantic way to denote a relationship between bits of Flash content, what parts of the web wouldn't have been built? What bits of your work would you do differently? What would the process be? There's an alternate path forward that suggests that we can upgrade the coarse semantics of the web to deal with ever-more-sophisticated content requirements. Or put another way, use the features of today's toolkits and code generators as a TODO list for markup driven features. But the jury is still out on the viability that approach; the same dynamic that makes multiple renderers possible ensures that getting them to move in a coordinated way is much harder than the unilateral feature roadmap that plugin vendors enjoy. HTML 5 and CSS 3 work is restarting those efforts, but only time will tell if we can put down the code and pick markup back up as a means to express ourselves.
I've glossed over a lot of details here, and I haven't discussed implications for the server side of a non-text format as our lingua-franca, nor have I dug into the evolution metaphor. Many of the arguments are likewise conditional on economic assumptions. There's lots of discussion yet to have, so if you've got links to concrete research in either direction or have an experience that bears on the debate post in the comments! Hopefully my fellow panelists will respond in blog form and I'll update this post when they do.