Infrequently Noted

Alex Russell on browsers, standards, and the process of progress.

View-Source Is Good? Discuss.

I've been invited by Chris Messina and some kindly folks at MSFT to participate in a panel at this year's SxSW regarding the value and/or necessity of view-source, and so with apologies to my fellow panelists, I want to get the conversation started early.

First, my position: ceteris paribus, view-source was necessary (but not sufficient) to make HTML the dominant application platform of our times. I also hold that it is under attack — not least of all from within — and that losing view-source poses a significant danger to the overall health of the web.

That's a lot to hang on the shoulders of a relatively innocent design decision, and I don't mean to imply that any system with a view-source-like feature will become dominant. But I do argue that it helps, particularly when coupled with complementary features like reliable parsing, semantic-ish markup, and plain-text content. Perhaps it's moving the goal line a bit, but when I talk about the importance of view-source, I'm more often than not discussing these properties together.

To understand the importance of view-source, consider how people learn. There's evidence that even trained software engineers choose to work from copy-and-pasted example code. Participants in the linked study even expressed guilt over the copy-paste-tweak method of learning, but guilt didn't change the dynamic: a blank slate and abstract documentation don't facilitate learning nearly as well as poking at an example and feeling out the edges by doing. View-source is a powerful catalyst for a culture of shared learning-by-doing, which in turn helps developers build a mental model of the relationship between input and output faster. Web developers get started by taking some code, pasting it into a file, saving, and loading it in a browser. From then on, they switch between editor and browser for even the most minor changes. This stands in stark contrast to technologies that impose a compilation step between making a change and seeing its effect. Immediacy of output builds an understanding of how the system behaves, and ctrl-r becomes a seductive, productive accelerant for the copy-paste-tweak loop. The only required equipment is a text editor and a web browser: free tools that work together instantly. There's no waiting between saving the file to disk and viewing the results. It's just a ctrl-r away.
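As a deliberately minimal illustration of my own (not from any particular tutorial): the entire toolchain can be one file. Save it, open it in a browser, tweak a value, and ctrl-r shows the result:

    <!DOCTYPE html>
    <html>
      <head>
        <title>copy-paste-tweak</title>
        <style>
          /* change the color, save, ctrl-r: instant feedback */
          h1 { color: steelblue; }
        </style>
      </head>
      <body>
        <h1>Hello, web</h1>
        <script>
          // pasted from an example somewhere, tweaked until it behaves
          document.getElementsByTagName('h1')[0].innerHTML += '!';
        </script>
      </body>
    </html>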

With that hyper-productive workflow as the backdrop, view-source turns the entire web into a giant learning lab, one that's remarkably resilient to error and experimentation. See an interesting technique or layout? No one can tell you "no" when you set about figuring out how it was done. Copy some of it, paste it into your document, and you'll get something out the other side. Because browsers recover from errors gracefully, the result is a welcoming learning environment, free of the sense of inadequacy that a compile failure tends to evoke. More often than not, you can see what went wrong. The evolutionary advantages of reliable parsing have helped ensure that strict XML content comprises roughly none of the web, a decade after world+dog pronounced it "better". Even the most sophisticated (or broken) content is inspectable at the layout level, and tools like Firebug and the Web Inspector accelerate the copy-paste-tweak cycle by inspecting dynamic content and allowing live changes without reloads, even on pages you don't "own". The predicate for these incredibly powerful tools is the textual, interpreted nature of HTML. There's much more to say about this, but let's instead turn to the platform's relative weaknesses as a way of understanding how view-source gets omitted from competing technologies.
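To make "resilient to error" concrete, here's an illustrative fragment of my own devising. The list items are never closed and the inline tags are misnested, yet every browser will repair the tree and render something sensible, where a strict XML parser would refuse to proceed past the first error:

    <!-- unclosed <li> tags and misnested inline markup: browsers repair the tree -->
    <ul>
      <li>first item
      <li>second item
    </ul>
    <p>some <i>misnested <b>inline</i> content</b> that isn't
    well-formed XML, but renders all the same.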

The first, and most obvious, downside to the open-by-default nature of the web is that it encourages multiple renderers. Combined with the ambiguities of reliable parsing and semantics that leave room for interpretation, it's no wonder that web developers struggle with incompatibilities. In a world where each individual user must be convinced to upgrade to the newest version of even a single renderer, mere differences in version can wreak havoc on the development process. Things that work in one place may not look exactly the same in another. This is both a strength and a weakness for the platform, but at the level of sophisticated applications it's squarely a liability.

Next, ambiguities in interpretation and semantics make the project of building tools for the platform significantly more complex. If only one viewer is prevalent (for whatever reason), then tools need only consume and generate content for the constraints, quirks, and performance of a single runtime. An alternate form of this simplification is to allow only code (not markup), eliminating parsing ambiguity entirely. The code-not-markup approach yields a potentially more flexible platform, and one that can begin executing content more quickly (as Flash does). Taken together, these advantages can create an incredibly productive environment for experts in the tools that generate content: no output ambiguity, better performance, and tools that can deliver true WYSIWYG authoring. Such tools can sidestep the ctrl-r cycle entirely.

"But wait," I hear you shout, "it's possible to do code-only, toolable, full-fidelity development in JavaScript!" Indeed: tools like GWT and Cappuccino generate code that generates UI, ensuring that only those who can write code (or who have tools that do) will participate, and removing the potential value of view-source for those apps. But let's be honest: view-source is nearly never locally beneficial. I can hardly count the number of times I've seen the "how do I hide my code?" question from a web n00b who (rightly or wrongly) imagines there's value in it. For GWT, the fact that the output is an HTML DOM styled with CSS is as much annoyance as benefit; the big upside is that browsers are the dominant platform and you don't have to convince users to install some new runtime.
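To see why view-source adds so little for these apps, consider what their output looks like. This sketch is schematic (the class name and structure are invented for illustration, not actual GWT or Cappuccino output), but it captures the flavor: the UI is assembled imperatively by compiler-generated script, so view-source reveals machine artifacts rather than anything a human wrote:

    // schematic stand-in for compiler output; names are invented
    var el = document.createElement('div');
    el.className = 'x9f';  // obfuscated, compiler-generated class
    el.appendChild(document.createTextNode('Click me'));
    document.body.appendChild(el);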

Similarly, Flex, Laszlo, GWT's UiBinder, and Silverlight have discovered the value of markup as a simple, declarative way for developers to express the hierarchical relationships between components. But their tags correspond to completely unambiguous component definitions, and they rely on compiled code (not reliably parsed markup) for final delivery of the UI. These tight contracts become an evolutionary straitjacket: great if you're shipping compiled code down the wire that can meet the contract, but death if those tags and attributes are meant to live for long periods of time or across multiple implementations. You might be able to bolt view-source onto the output, but it will always be optional and ad hoc, properties that work against it ever being pervasive. Put another way, the markup versions of these systems are leaky abstractions over the precise, code-centric system that undergirds both the authoring and runtime environments. That code-centric bias is incredibly powerful for toolmakers and "real" developers, but it cuts out others entirely: those who won't "learn to program" and those who want to build tools that inject content into the delivery format.
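Contrast that with HTML's loose contract, which is exactly what lets tags and attributes survive across implementations and long spans of time. An unknown element doesn't break the page; it degrades. The HTML 5 video element (used here purely as an illustration of mine, file name hypothetical) leans on this:

    <!-- browsers that don't know <video> simply render the fallback content -->
    <video src="demo.ogv" controls>
      Your browser doesn't support video.
      <a href="demo.ogv">Download it instead.</a>
    </video>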

Whatever the strengths of code-based UI systems, they throw web crawlers for a loop. Today's search engines deal best with text-based formats, and those search engines make content more valuable in aggregate than it is on its own. Perhaps it's inevitable that crawlers and search engines will need to execute code in order to understand the value of content, but I remain unconvinced. As a thought experiment, consider a web constructed entirely of Flash content. Given that Flash bytecode lacks a standard, semantic way to denote relationships between bits of content, what parts of the web wouldn't have been built? What bits of your work would you do differently? What would the process be? There's an alternate path forward which suggests that we can upgrade the coarse semantics of the web to handle ever-more-sophisticated content requirements. Or, put another way, we can use the features of today's toolkits and code generators as a TODO list for markup-driven features. But the jury is still out on the viability of that approach: the same dynamic that makes multiple renderers possible ensures that getting them to move in a coordinated way is much harder than the unilateral feature roadmaps that plugin vendors enjoy. HTML 5 and CSS 3 work is restarting those efforts, but only time will tell whether we can put down the code and pick markup back up as a means of expressing ourselves.
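As a reminder of what's at stake in that choice (an illustrative fragment of my own; the URL is hypothetical), HTML's text-based, semantic surface is precisely what lets a crawler discover content and relationships without executing anything:

    <!-- a crawler finds the heading and the link target by parsing text alone -->
    <div class="post">
      <h1>Why view-source matters</h1>
      <p>See <a href="http://example.com/counter-argument">the counter-argument</a>
         for another take.</p>
    </div>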

I've glossed over a lot of details here: I haven't discussed the server-side implications of a non-text format as our lingua franca, nor have I dug into the evolution metaphor, and many of the arguments are conditional on economic assumptions. There's lots of discussion yet to have, so if you've got links to concrete research in either direction, or an experience that bears on the debate, post it in the comments! Hopefully my fellow panelists will respond in blog form, and I'll update this post when they do.