The Tyranny of Validation

July 22, 2005

Like a lot of people who read this blog, I've been writing HTML for a long time. Heck, I remember when "A List Apart" was an actual, honest-to-goodness mailing list. As a certifiable old fart, I remember the "Bad Old Days" only too clearly. <font> tags, tables for formatting, the whole bit. Boy am I glad we've moved past that. Now if only we could move past the mental block of validating markup.

Today, web developers (myself included) are held to a higher standard of accessibility, usability, and visual appeal than ever before. With the recent resurgence of interest in my little corner of the web development world, we're also providing better interactivity, responsiveness, and overall experience.

As the evolution from tinkerers to professionals has progressed, an amazing amount of pressure has been brought to bear, first on browser makers, and then on their fellow developers, by a very vocal segment of the webdev community. Like the internet itself, the net effect has (hopefully) been to reduce costs. No one will argue more strongly than me that this is a Good Thing for everyone. I've been fighting browsers for almost a decade and I'm truly grateful for the effects they've had.

But I've got a beef with one of the agents of this change. The Validation game. As a former true believer, I think it's time that the air was cleared about some of the non-standard things I've done (including the extra attribute syntax in Dojo). You see, I came to web development from programming (not the other way around), and one of the first lessons you learn when writing network programs is "be liberal in what you accept, strict in what you produce". The implication here, of course, is that there's a standard that you can be liberal or strict about. In the case of internet software, there almost always is (whether or not you're ignorant of said standard is another question). This little bit of wisdom has enabled programmers for decades to get along despite products shipping too early, implementations based on draft versions of standards which then diverge, and the general tendency of programmers to be human.
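For anyone who hasn't written network code, here is roughly what that lesson looks like in practice. This is only a minimal sketch, assuming a made-up pair of helpers (`parse_header` and `format_header` are hypothetical names, not from any real library): the reader tolerates sloppy input, the writer emits exactly one canonical form.

```python
def parse_header(line):
    """Liberal in what we accept: tolerate stray whitespace, odd letter
    case in the name, and either CRLF or bare LF line endings."""
    name, sep, value = line.partition(":")
    if not sep:
        raise ValueError(f"not a header line: {line!r}")
    return name.strip().lower(), value.strip()


def format_header(name, value):
    """Strict in what we produce: canonical capitalisation, a single
    space after the colon, and a CRLF terminator."""
    canonical = "-".join(part.capitalize() for part in name.split("-"))
    return f"{canonical}: {value}\r\n"


# A sloppy line that a picky peer might reject outright.
sloppy = "  content-TYPE :text/html \n"
name, value = parse_header(sloppy)
print(repr(format_header(name, value)))  # 'Content-Type: text/html\r\n'
```

Note that both halves only make sense because there is a canonical form to aim at, which is the point about a standard existing in the first place.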

What does this have to do with XHTML? Fire up a browser and point it at google.com. Did that work? If so, you can thank some very fast-and-loose code at almost every point in the chain between you and Google's data center. The TCP stack in your computer's kernel probably re-assembled some fragmented packets for you. The DNS resolver in your OS probably consulted a stale cache, but just kept on working anyway, or fired off an asynchronous request to update it after it knowingly gave you a stale answer. The browser itself likely continued to process the response to your request despite all kinds of behaviors that RFC 2616 frowns very sternly on, and when it came time to process the content...well...I think you can take it from there. Or, rather, the W3C validator can.

Which makes the W3C validator one of the most widely used academic programs ever to crawl out of a lab. You see, the validator is just an implementation of a client that has been explicitly tuned to be as bitchy as possible. This is the direct opposite of what the client you actually use every day (the browser) is tuned to do. If your browser were as bitchy as the W3C validator, you wouldn't just get frustrated with it, you'd throw it out. Generalizing that point, we can say:

Leniency in the face of incompatibility is a feature of useful software, not a bug.

History sides with the continued existence of bugs in software. Well engineered software doesn't bitch, it just does what you wanted it to do in the first place.
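To make the contrast concrete, here is a rough sketch using two parsers from the Python standard library as stand-ins: `html.parser` plays the forgiving browser and `xml.etree.ElementTree` plays the strict client. This is an assumption-laden illustration, not how the W3C validator works (it checks against a DTD, while the strict parser here only checks XML well-formedness), and the helper names are mine.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextCollector(HTMLParser):
    """Gather text content, shrugging off whatever the tags are doing."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def lenient_text(markup):
    """Browser-ish behaviour: extract the text no matter what."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def strict_text(markup):
    """Strict behaviour: give the user nothing if the markup is malformed."""
    try:
        return "".join(ET.fromstring(markup).itertext())
    except ET.ParseError:
        return None


SLOPPY = "<p>Hello <b>world<br></p>"  # <b> and <br> are never closed

print(lenient_text(SLOPPY))  # Hello world
print(strict_text(SLOPPY))   # None
```

Same bytes, same intent on the author's part. One client hands the user the content and the other hands them an error.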

There are very few programs that can provide value by being bitchy, and even those that are explicitly designed to be testy (GPG and SSL spring to mind) often must support multiple incompatible and non-validating modes of operation (S/MIME anyone?). When was the last time you browsed to a site and saw an SSL cert revocation warning based on a CRL or OCSP check? My money is on never. Why? Because you don't have CRL or OCSP checking turned on. Because it's slow. And yet somehow, the sky hasn't fallen and you probably continue to bank online. And if you don't, the odds are pretty high it's because you're worried about spyware or phishing. Not an invalid cert.

Once you're trying to get something done on more than one computer, leniency helps make software useful. Useful software helps me (the user) do things I couldn't before; without complaint. Software that wins markets does that better, faster, and in the face of more incompatibilities than any of the competitors. The market for software will always, always favor the lenient.

The corollary to this is:

By encouraging the use of standards, you in no way decrease the amount of work that will be required of implementors of useful software.

This means that all the arguments about how producing valid XHTML is somehow better for mobile devices aren't just ignorant of Moore's Law, they are wasting our time. Tim Bray didn't get it and now we're all paying for his poor instincts as a software engineer.

Now, before you think that I just said that standards are worthless, let me defend myself a bit. I think they are tremendously useful. In open systems, somewhere between novelty and ubiquity, hopefully a standard gets written. This process is good, inevitable, and gives everyone who's been furiously building stuff a chance to catch their breath and acknowledge where things should be improved. It also helps iron out interoperability concerns, which is something that users care very much about. This process is good because it reduces everyone's costs. That's why standards, when done well, are good.

But back to the validation game. I call it a game because that's precisely what it has become for many of us. I'm just as guilty as anyone. Once I've done a lot of actual, hard work figuring out how a page should be structured, worked to make it look right on as many devices as I can lay my hands on, and then fretted about usability (which includes accessibility, brain-damaged WCAG terminology aside), I can then go play the validation game. You start playing the game by scp-ing your files up to a server and then running them through the validation service. Tweak, re-upload, re-check. Does it pass? w00t! I r0xor!

Or not. Either way, no one cares.

What people do care about is whether or not your stuff works for them. And this is fundamentally why web standards have been important. They have allowed WaSP and others to bludgeon and shame browser vendors into submission such that the same amount of effort expended by the same web developer today makes a thing work better for more people than it did before. That behavior is dependent on the clients, not the spec or any adherence to it. Clients define behavior, markup is only a series of suggestions about what that behavior might, possibly, be some day.

For a long time now, I've been adding extra attributes to (X)HTML elements that throw all kinds of warnings in the validator, and I've been conflicted about it, enough to bake in ways around the problem in both nWidgets and Dojo. The hacks rely on the same kinds of (completely validating (X)HTML) value and structure conventions that are now being used to carry microformats. But no one uses those validating structure conventions in nW or in Dojo. They just go ahead and do the low-effort thing by blithely adding non-validating attributes to their tags. When people ask about validation for them, my response has been "well, you can just build a custom DTD". Which is kinda like a cab driver saying "sure, we can go there, just get out and push".
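If you haven't seen the two styles side by side, here is a small sketch of a client that accepts both. The `widgetType` attribute and the `widget-` class prefix are hypothetical stand-ins I made up for illustration, not the actual nWidgets or Dojo syntax, and the parser is Python's stdlib `html.parser` rather than anything a browser runs.

```python
from html.parser import HTMLParser


class WidgetFinder(HTMLParser):
    """Collect (tag, widget type) pairs declared in either style."""

    def __init__(self):
        super().__init__()
        self.widgets = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # html.parser lowercases attribute names
        # Style 1: the low-effort, non-validating extra attribute.
        if "widgettype" in attributes:
            self.widgets.append((tag, attributes["widgettype"]))
            return
        # Style 2: the validating convention, smuggled into class values.
        for token in (attributes.get("class") or "").split():
            if token.startswith("widget-"):
                self.widgets.append((tag, token[len("widget-"):]))


def find_widgets(markup):
    finder = WidgetFinder()
    finder.feed(markup)
    finder.close()
    return finder.widgets


PAGE = (
    '<div widgetType="Button">Save</div>'
    '<div class="primary widget-Button">Cancel</div>'
)
print(find_widgets(PAGE))  # [('div', 'Button'), ('div', 'Button')]
```

The client ends up with the same information either way. Only the first form upsets the validator, and it is also the one people actually type.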

In a recent article, the W3C Validation Team discuss the prospect and end with:

Custom DTDs can be a very useful tool to enrich the existing markup languages or create entirely new ones. One always has to keep in mind that they are tantamount to creating a new language, and that proprietary languages are best kept in closed environments where they can be taught to a limited set of agents and tools, and NOT to make the web a modern version of the Tower of Babel by unleashing them in the wilderness.

It's about here that my bullshit detector started doing the full-on "woop! woop!" thing. Could it really be that they are arguing against the only way of being both useful and valid? And are they really arguing that we should only be using the full power of the available standards in intranets?

And then I remembered. It doesn't matter. Most of the discussed approaches aren't available. If the clients that people use will accept the extra elements without complaint, and if they don't degrade the experience for anyone, and if the dominant browsers don't support custom DTDs or namespaces sanely anyway, then the only people who ever have to wring their hands over this are academics. And their tools. I for one am opting out of this particular full-employment-for-academics plan. And I should have done it a lot sooner.

So until validation actually starts to matter, I'm gonna proudly display my invalidation badge, redirect all questions about validation to this page and get back to work. Although you've gotta admit, beating the validation boss at the end of the InterWebDev game sure was fun while it lasted.

corrections/clarifications

corrected inaccurate statement about clients of microformats (thanks Tantek)