Infrequently Noted

Alex Russell on browsers, standards, and the process of progress.

Chandler 1.0!

Somehow I missed this last week, but OSAF's Chandler super-PIM just went 1.0. It's been a long time in coming, and the result isn't at all what I had expected. Instead of being an "email client++", Chandler 1.0 is a calendar and task management tool that happens to be super-savvy about talking to your existing IMAP folders and lets you share and coordinate via CalDAV. This is fundamentally different from Things in that it also has enough of the "guts" of a real PIM to allow scheduling and coordination on tasks to be an integral part of the experience. Fundamentally it's "us" oriented and not "me" oriented, and I'm excited to see what organizations use it for and what kinds of organizations discover its value.

The Chandler Hub also strikes me as a gem hidden in plain sight. Not only is it a great way to share parts of your schedule with others, it's an amazingly complete Dojo-based, Open Source UI for getting it all done, too. You can run your own Cosmo server (the code that runs Chandler Hub) inside of your department or organization but more than that, you can get the source. If you know Java (or employ someone that does), the Cosmo server is perhaps the easiest-to-hack-on option for an organization needing a flexible, lightweight task and team management option. Given that every organization I've ever worked for has struggled with exactly this type of coordination, the availability of source code here is probably going to beget some amazing integrations with bug trackers and the other project and task management systems that organizations already rely on. In some ways, despite being almost completely different in scope, Chandler Hub and Plaxo's kick-butt online features are both bringing a level of visibility to different types of activities that cry out for better and deeper integrations with the tools that get used every day to "do the work" or track it in other ways. A few lightweight bridges to MS Project and/or Trac/Redmine would make Cosmo jet fuel for team visibility. I can't wait.

The Chandler team also told me last week that they're hard at work on a re-architecture of their Python-based desktop client in order to improve performance and startup time and to make the whole system more hackable. Given that the desktop and web clients can speak to the same Cosmo server back-end (which can federate data out to lots of other places to boot), this seems like a promising path forward as the team completes a transition to a more traditional OSS distributed-development approach. Truth be told, I probably won't give up Things for Chandler desktop until performance does improve, but I'm sure gonna be tying my calendars together with Jennifer's via Chandler Hub ASAP.

Congrats again to the Chandler (and Cosmo and Hub) team(s)!

CSS Variables Are The Future

or: "Reports of the Harm Caused By CSS Variables Are Greatly Exaggerated"

To say that CSS is abominable isn't controversial. The implementations are leading the spec in some places, and we're getting real progress there. Firefox's rounded corners and WebKit's drop-shadows, declarative animations, background tiling, and CSS variables are all hugely important and liberating. But where the spec is in front of the important implementations...well, I've ranted before on the topic. CSS sucks, and the editor of the spec has now written at length of his intent to keep it that way (via Simon). His arguments are flim-flam, but just saying so isn't enough to convince anyone. Making the case requires answering long-hand and showing our work.

By The Numbers

Let's look first at the numbers presented in the sidebar here. Remember that the survey numbers come from documents on the W3C website. The article would have us believe that this sample set bears some relationship to the rest of the web such that we can extrapolate a case against CSS variables out of them. The relationship is tenuous enough that this disclaimer is included:

The authors who write on this Web site site are probably more careful about the structure and re-usability (and download speed) of their documents than the average Web author and there are indications that the average size of style sheets on the Web is two or three times higher. That is still small for computer code, though. Although it doesn't fit in one window, like the average W3C style sheet, it doesn't take more than scrolling once or twice either.

There's much wrong with this leap of logic: if the real web is at least 2x more complicated, how can we dismiss the clamors of real web developers for more powerful tools? Furthermore, what's to say that what is or isn't being encoded in CSS is due to complexity? There are lots of things which we'd like to put in CSS but don't because CSS just can't do many of the things we should expect of it. Real-world CSS is likely to get longer, not shorter, as CSS evolves toward its manifest destiny and allows us to declare property bindings, animations, and all manner of complex layouts for which we currently turn to table elements and layout systems like the Dojo BorderContainer and ExpandoPane widgets. This isn't a foot-note, it's an out-and-out refutation of the proffered case.

We can also do much better than the chosen sample set. A quick wget of the front pages of the top 20 Alexa sites (to stop short of the porn) reveals a world which the article's sample set bears no resemblance to. Remember, these are only the front pages, as well. Internal pages can be significantly more complex as they trend toward applications and away from relatively static views of data. Here's what I ran to get data to work with:

media:css_stats alex$ ls
./		../		out/		sites.txt
media:css_stats alex$ cat sites.txt
http://yahoo.com/
http://google.com/
http://youtube.com/
http://live.com/
http://msn.com/
http://myspace.com/
http://facebook.com/
http://blogger.com/
http://orkut.com/
http://rapidshare.com/
http://microsoft.com/
http://google.co.in/
http://ebay.com/
http://hi5.com/
http://aol.com/
http://google.co.uk/
http://photobucket.com/
http://amazon.com/
http://imdb.com/
http://imageshack.us/
media:css_stats alex$ wget --user-agent="..." -P out -l1 -p -H -i sites.txt
--15:05:15--  http://yahoo.com/
           => `out/yahoo.com/index.html'
Resolving yahoo.com... 68.180.206.184, 206.190.60.37
Connecting to yahoo.com|68.180.206.184|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.yahoo.com/ [following]
...

Of these 20 pages, only 28 external style sheets are referenced (per wget); keeping that count low is likely meant to speed rendering time. Most of the pages that do specify external style sheets also host them on "static" servers to increase download parallelism and reduce resource usage on front-end boxes. We can already say something about the use of CSS in large, commercially successful websites:

Site authors are going to enormous lengths to ensure that pages load as fast as possible. Techniques which reduce redundancy, and therefore result in smaller style sheets, will have significant value to content authors.

Recall that one of the primary drivers behind CSS inheritance, macros, and variables is to reduce the size of style sheets. Just looking at the behavior of real-world page authors at least makes a strong case for techniques to create terse style sheets if not for these specific solutions.

These external style sheets have a mean line count of about 280, seemingly validating the essay's assertion that the "real world" is 2x to 3x more complicated. But something tells me that's also misleading. Indeed:

media:css_stats alex$ cd out
media:out alex$ find . | grep "\.css" > css_files.out
media:out alex$ cat `cat css_files.out` | wc -l
    7866
media:out alex$ CSS_FILES=`cat css_files.out`
media:out alex$ for f in $CSS_FILES; do wc -l $f; done
      92 ./i.media-imdb.com/images/SF389ed063874f275ab16e1fef86e6462a/css2/consumerhome.css
     419 ./i.media-imdb.com/images/SFf399dbc3948ab005b0ed6e733d294c7c/css2/consumersite.css
      59 ./imageshack.us/css/styles.css
     138 ./imageshack.us/img/style-def.css
      32 ./imageshack.us/img/tooltips.css
     122 ./include.ebaystatic.com/v4css/en_US/e573/GlobalNavVjoOpt23_Ebay_e5736892865_en_US.css
     128 ./include.ebaystatic.com/v4css/en_US/e577/CCHP_HomepageV4_SLDR_e5777009417_en_US.css
     310 ./rapidshare.com/img2/styles.css
      86 ./static.ak.fbcdn.net/rsrc.php/102900/css/typeaheadpro.css
      58 ./static.ak.fbcdn.net/rsrc.php/104778/css/dialogpro.css
     332 ./static.ak.fbcdn.net/rsrc.php/108104/css/ubersearch.css
     150 ./static.ak.fbcdn.net/rsrc.php/98481/css/welcome.css
      10 ./static.ak.fbcdn.net/rsrc.php/99258/css/webkit.css
    2081 ./static.ak.fbcdn.net/rsrc.php/pkg/77/113750/css/common.css.pkg.php
       0 ./static.hi5.com/friend/styles/global_1215662072.css
       0 ./static.hi5.com/friend/styles/headernav_1214604328.css
       0 ./static.hi5.com/friend/styles/homepage_anon_1203011808.css
       0 ./static.photobucket.com/include/css/pkgs/homepage_v14.2.4.css
       0 ./static.photobucket.com/include/css/pkgs/photobucket_v14.2.4.css
       0 ./stc.msn.com/br/hp/en-us/css/54/bluso.css
     377 ./x.myspacecdn.com/modules/common/static/css/global017.css
     346 ./x.myspacecdn.com/modules/common/static/css/header/siteheader004.css
     192 ./x.myspacecdn.com/modules/common/static/css/master.css
     853 ./x.myspacecdn.com/modules/splash/static/css/splashv2007.css
     386 ./z-ecx.images-amazon.com/images/G/01/nav2/gamma/n2CoreLibs/n2CoreLibs-n2v1-43122._V252271770_.css
    1428 ./z-ecx.images-amazon.com/images/G/01/nav2/gamma/navbarCSS/navbarCSS-navbar-53059._V252284495_.css
     256 ./z-ecx.images-amazon.com/images/G/01/nav2/gamma/topproductideasCSS/topproductideasCSS-topproductideasCSS-50166._V13960170_.css
      11 ./z-ecx.images-amazon.com/images/G/01/s9-campaigns/popover._V255266624_.css
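As a cross-check on the listing above, here is a small Python sketch (not part of the original shell session) that summarizes the per-file counts. The numbers are copied by hand from the `wc -l` output. Note how far apart the mean and median land once the zero-line files and the few very large files are included.

```python
from statistics import mean, median

# Per-file line counts copied from the `wc -l` listing above (28 files).
line_counts = [
    92, 419, 59, 138, 32, 122, 128, 310, 86, 58, 332, 150, 10, 2081,
    0, 0, 0, 0, 0, 0,
    377, 346, 192, 853, 386, 1428, 256, 11,
]

total = sum(line_counts)
print(f"files:  {len(line_counts)}")        # 28
print(f"total:  {total}")                   # 7866, matching the wc total
print(f"mean:   {mean(line_counts):.1f}")   # 280.9
print(f"median: {median(line_counts)}")     # 125.0
```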

But what about those 0-line files? ls says there's more there:

 8.8K  ./static.hi5.com/friend/styles/global_1215662072.css
 4.4K  ./static.hi5.com/friend/styles/headernav_1214604328.css
 2.3K  ./static.hi5.com/friend/styles/homepage_anon_1203011808.css
 8.8K  ./static.photobucket.com/include/css/pkgs/homepage_v14.2.4.css
  34K  ./static.photobucket.com/include/css/pkgs/photobucket_v14.2.4.css
 1.6K  ./stc.msn.com/br/hp/en-us/css/54/bluso.css

Indeed, they're all one line long and missing a trailing newline char due to the whitespace removal that's been applied to them. The shortest of the files, when expanded for readability, is longer than 100 lines. Clearly counting lines doesn't actually tell us much about the complexity of production-quality CSS, at least not without some normalization.

Since most production-level CSS is embedded in the served document, we should probably have a look at it too and figure out ways to normalize the whole shooting match to determine some sort of "style complexity factor", since it's very much the case that the impact of CSS is not often isolated to individual elements. Indeed, some of the hardest maintenance problems with CSS come from the overall difficulty of knowing what's affecting which element, and those rules can come from anywhere. So getting an accurate view of the amount of style that a developer or designer needs to keep in their head at once (the central argument of the original piece) is usually some factor of the total number and applicability of the rules in the page to the elements currently being styled. Therefore, to get a sense of the complexity of the style being applied to a page, it would be far better to know the number of normalized lines of CSS on the page plus a count of the total number of rules applied to the document.
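To make the zero-line result concrete: `wc -l` counts newline characters, so a minified file with no trailing newline reports 0. The sketch below is an illustration only, separate from the script the post goes on to describe. It shows that count next to a deliberately naive normalization; the `normalize` helper and the sample CSS are made up for the example.

```python
import re

def newline_count(css: str) -> int:
    """What `wc -l` reports: the number of newline characters."""
    return css.count("\n")

def normalize(css: str) -> list:
    """Naively pretty-print CSS: a selector line, one line per
    declaration, and a closing-brace line for every rule.

    Assumption: no comments, no braces or semicolons inside strings,
    and no nested blocks such as @media. Real CSS needs a real parser.
    """
    lines = []
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        lines.append(selector.strip() + " {")
        for decl in body.split(";"):
            if decl.strip():
                lines.append("    " + decl.strip() + ";")
        lines.append("}")
    return lines

minified = "a{color:red;text-decoration:none}h1,h2{margin:0;font-weight:bold}"
print(newline_count(minified))     # 0
print(len(normalize(minified)))    # 8
```

The same bytes count as zero lines or eight depending on formatting, which is why raw line counts say little about minified production CSS.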

I put together a small python script to do just this. Quickly summarized, it sweeps through documents looking for external stylesheet links, @import statements, or <style> tags, parsing the contents of each into a normalized, pretty-printed form. Here's the summary output:

Totals:

files examined: 17
total # of CSS rules: 9462
normalized lines of CSS: 39999
average # of CSS lines per file: 2352 (mean) 1428 (median)
average # of CSS rules per file: 556 (mean) 308 (median)
average # of CSS style sheets per file: 4 (mean) 3 (median)

(full output here)
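For readers who want to try something similar, here is a rough sketch of the sweep described above. It is not the script that produced these totals: it uses only the Python standard library, and `StyleSourceFinder`, `find_imports`, `count_rules`, and the sample page are all hypothetical. It finds external stylesheet links, inline `<style>` text, and `@import` targets, then counts rules by counting declaration blocks. Fetching the external files is left out.

```python
import re
from html.parser import HTMLParser

class StyleSourceFinder(HTMLParser):
    """Collect external stylesheet URLs and inline <style> text."""

    def __init__(self):
        super().__init__()
        self.external = []       # href values of <link rel="stylesheet">
        self.inline = []         # text content of each <style> element
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            if attrs.get("href"):
                self.external.append(attrs["href"])
        elif tag == "style":
            self._in_style = True
            self.inline.append("")

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:
            self.inline[-1] += data

def find_imports(css: str) -> list:
    """Return the targets of @import statements (naive pattern match)."""
    return re.findall(r"""@import\s+(?:url\(\s*)?["']?([^"')\s;]+)""", css)

def count_rules(css: str) -> int:
    """Count declaration blocks, a rough stand-in for the number of rules."""
    return len(re.findall(r"\{[^{}]*\}", css))

# Made-up page used only to exercise the helpers.
page = """
<html><head>
<link rel="stylesheet" type="text/css" href="http://static.example.com/site.css">
<style>
@import url("base.css");
body{margin:0} a{color:blue}
</style>
</head><body></body></html>
"""

finder = StyleSourceFinder()
finder.feed(page)
inline_css = "".join(finder.inline)

print(finder.external)            # ['http://static.example.com/site.css']
print(find_imports(inline_css))   # ['base.css']
print(count_rules(inline_css))    # 2
```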

This output comes after removing items from the list which don't have valid index pages (e.g. microsoft.com got wise to my faked user agent) and after including some manually fetched external CSS files that wget initially missed. Also, it's worth noting that the Google home-pages pull the averages down significantly since they only include 111 lines of normalized CSS per page, and the home pages for google.com, google.co.in, and google.co.uk are seemingly identical from this perspective and all occur in the top 20.

Nevertheless, the results are still astounding when compared with the results from the original article. CSS authors who maintain the world's most popular landing pages are contending with thousands of lines of CSS per page and hundreds of rules. This is orders of magnitude more complexity than the initially presented numbers, hopefully dispelling any notion that we could rely on those numbers as the basis for asserting anything but that those in the employ of the W3C and the volunteers who join them are not typical content authors and do not attempt the feats of CSS which are merely work-a-day constructions for commercially successful websites. Granted, the home-pages of the most successful websites on the internet are also not "normal" (by definition), but they are significantly more representative of where the web is heading and the standards that content authors likely aspire to.

The article's numbers may indeed present a compelling case for not adding CSS variables for the sake of the W3C's content authors, but no such case holds in the "real world" of deployed content.

Upon Further Inspection...

One argument made in the article is that the ability to understand what you see in a style sheet is bolstered by a lack of indirection. This argument is simplistic insofar as CSS is already rife with indirections. From cascades to the order of precedence in style application to the !important rules to media queries, CSS currently provides many facilities for content authors to create difficult situations in determining where a particular rule is coming from or what the impact of a rule will be. Inspectability and what-you-see-is-what-you-get were properties of an earlier, simpler web which the article harkens back to. But the WYSIWYG principle has already been lost as HTML and CSS have failed to keep pace with the tasks being demanded of them. When a pile of non-semantic div or table elements is employed to create the canonical 3-column layout, for which the CSS is a mind-bending combination of art and science, no novice can be expected to follow along at home. I whole-heartedly agree that the ability to "View Source" on the web and have that mean something is a powerful evolutionary advantage to the Open Web, but it is one which is being under-cut most forcefully by the lack of evolution in HTML and CSS, not by the addition of features to them. While HTML and CSS lack semantics for simple construction of common visual and structural idioms, we should continue to expect contorted, complex sets of rules and markup. Visual and interaction designers aren't demanding less of the user experience simply because CSS isn't up to the task. Instead, they're turning to JavaScript toolkits like Dojo which can and do deliver the goods. Hardly a better position for the platform to compete from.

On this point the essay also contains a rhetorical bait-and-switch which I find distasteful: it dismisses variables because they don't inherently do anything to reduce the lengths of pages (true!) and then argues against macros and inheritance because they create levels of indirection which can be confusing. Inheritance and macro definitions can play a key role in drastically reducing the length of style sheets. In this way, they promote understanding through exactly the same "memory effect" mechanism that is cited as a liability when discussing variables.

Variables, on the other hand, provide an effective and over-due mechanism for consolidating the definition of shared values across style sheets which may be defined in distributed places (say, via a CMS's default template which is later customized by users). For the very-lengthy, real-world styles which occur frequently on the public internet, this ability to cleanly separate the definition of common values into a single style sheet would prove a huge boon to the development and maintenance of sites for which large teams must cooperate on the generation of what ends up being a single page. Style sheets are already long, and the proponents of variables assume this to be true. That variables do not shorten style sheets is not a valid argument against the considerable good that they can do in ensuring that style sheets are maintainable.
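The maintenance benefit can be shown without committing to any proposed syntax. The sketch below is a toy preprocessor built on Python's `string.Template`. The `$name` placeholders, the value names, and the two sheets are all invented for illustration and are not the syntax of any CSS variables proposal. What matters is that the shared values are defined once and consumed by sheets that could live in different places.

```python
from string import Template

# Shared values live in exactly one place (hypothetical names and values).
shared_values = {
    "brand_color": "#336699",
    "gutter": "12px",
}

# Two style sheets, perhaps maintained by different teams, refer to the
# shared values by name rather than repeating the literals.
header_css = Template("#header { background: $brand_color; padding: $gutter; }")
sidebar_css = Template(".sidebar a { color: $brand_color; margin: $gutter; }")

def render(sheets, values):
    """Substitute the shared values into every sheet."""
    return [sheet.substitute(values) for sheet in sheets]

for css in render([header_css, sidebar_css], shared_values):
    print(css)
```

Changing `brand_color` in the one dictionary updates both sheets. Neither sheet gets shorter, and that is consistent with the argument here: the gain is in maintainability rather than length.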

The essay dismisses the idea that variable names are (and should be) self-documenting. The argument that a comment would somehow "be better" ignores the reality of today's large style sheets. There isn't a way for rules to effect similar visual appearance on different properties without repetition today, leading to tremendous maintenance headaches.

Modern style sheets are already well beyond the complexity levels which allow us to fit them all in a single screen, and re-using values today requires the exact same looking-in-multiple-places burden that the essay deems unacceptable in a future with macros. The results for this extra effort today, however, aren't consistent and maintainable rules which can be changed with relatively few updates. Instead we're left with a mish-mash of hard-coded values which are copied here and there, often across multiple style sheets. Adding multiple classes to a single node is also no panacea as it quickly devolves into tens or hundreds of small rules to define what are essentially parameterized constants for a single set of layout, color, or typography decision combinations. These decisions cannot be conveyed through a simple selector but must instead be applied in the right combinations throughout a document directly to the elements which require them. Ugh. Even if the original article were to make a cogent case against the need for variables, that can't be extended to a case against inheritance or macro capabilities. Composition is far too difficult and authors are already awash in complexity. Denying them effective, optional tools to deal with this complexity is simply to deny the truth of the web which has evolved.

Cobbling It Together

Perhaps the most disingenuous argument fielded is that the addition of variables in CSS places an onerous burden on the developers of user agents. User agent developers are best suited to know the difficulties and pitfalls in implementing CSS variables, and at least one team has decided that it is workable: they have authored the spec now under discussion and have implemented several different syntaxes for the feature in parallel in order to figure out what will work best. Were there hue-and-cry from other implementers, I'd be much more sympathetic to this point. However, given the general lack of objection amongst implementers, the long-standing ambiguities in the CSS specifications, the inscrutable choices of box models, and the weirdisms of CSS in general, it seems that we're very far down the path in terms of the complexity required of any implementer. It is probably the case that authoring a new HTML and CSS rendering engine that will consume the real web isn't a realistic prospect today save for the most well-heeled and motivated of teams. Adding or not adding CSS variables and/or macros doesn't change that reality.

The arguments against variables and macros/inheritance get weakest when they are taken as a whole. Variables are likely just the first step to a CSS that allows both simple parameterization (variables) and composition (inheritance, macros, etc.). One without the other is weak sauce, and the essay tacitly acknowledges as much by arguing against them in turn (but not together). A CSS dialect which includes inheritance will allow the specialization of "parent" rules without relying on extra nodes or multiple classes added directly to nodes. CSS with variables will allow for much simpler maintenance and "templating" of complex visual identities. Taken together, these techniques allow sophisticated CSS authors to stop repeating themselves and get a handle on the thousands of lines of code which they're managing to construct pages today.

Arguing that they are too hard or too confusing simply ignores the deeply painful experience of today's content authoring process. The time has long passed when we can delay progress or claim it "harmful" without proof. Once again, it's time to let the implementations lead and time for the standards bodies to stand aside and cheer them on.

Transition

Two days ago I dusted off the rarely-used voting procedure for Dojo Foundation projects in order to kick off a transition that I'm very excited about: as of this afternoon, the committers of the Dojo project have elected Peter Higgins the new Project Lead for the toolkit project.

I've had the pleasure of working with Peter both in the Dojo project and at SitePen and his energy and enthusiasm for making Dojo better and helping designers and developers work better together is infectious.

For anyone not familiar with Peter's work, he's been instrumental in the creation of the amazing Dojo Campus website (along with the Uxebu chaps) as well as being primary author of many DojoX components. His work in shepherding new contributors through the contributor to committer process is nearly legendary inside the project, and Peter has been a one-man outreach and support machine via the Dojo blog and his endless patience on the forums and IRC (#dojo on irc.freenode.net). I couldn't be happier about where Dojo is headed under his direction.

There have already been some recurring questions about this transition amongst the committers and on today's Open Web podcast, so let me quickly recap them here:

Q: Will you still be involved in Dojo?

Absolutely! I'm excited that Pete is taking on the figurehead and "vision thing" duties, which is a role that he's naturally suited to. Part of this transition is about me wanting more time to focus on experimental and edgy stuff that can make a huge difference in how we work with the web, and I have no doubt that Peter is the right guy to help us grow the truly open Dojo community even further. He absolutely gets the importance of a truly open community, the need to be conservative about where IP comes from and meet our promises of backwards-compatibility, and how Dojo can make big changes in the lives of application developers and designers. I'm grateful to have the opportunity to continue working with him on Dojo and will continue to do so in whatever capacity Peter deems appropriate.

Q: Does this change your role at the Foundation?

Nope. I'm still serving as President of the Dojo Foundation. This transition will allow me to also focus more time on ensuring that the Foundation is running well, that a new Board is elected soon, and that the Foundation's other projects succeed on their own terms. The Foundation has always been about more than just giving Dojo a home, and we're now looking to expand the umbrella of the Foundation to help nurture other JavaScript and Open Web projects more than ever.

Q: Will you still be doing talks on Dojo?

Yep, I'll still be out there advocating the Dojo case, but you can expect to see Peter doing more of that over time as well. If you're planning a conference and are looking for a cogent person to talk about Dojo, Peter is now your go-to guy. I've enjoyed having the opportunity to think and talk about where the Open Web is headed, so I'll also be doing a lot more of that. There are lots of meta-issues that this transition will let me work harder on, so expect more from me there.

My hat's off to the Dojo community and Peter in particular. The work that has gone into 1.2 and will land in 1.3 and beyond under his direction really is changing the way we view what the web can and should be used for.

A Little Perspective

I live in a world dominated by the unfeeling, unaccountable whims of browser vendors and so it is with a 50 lb bag of rock salt that I consume the optimistic projections of Firefox's triumphal ascendence to market dominance.

As it turns out, my skepticism is absolutely justified. Now, this isn't to say that I'm not an optimist – I am, and given what I do for a living, I need to be hopeful about the future. I would have succumbed to serious depression long ago if I weren't deeply hopeful at some level. Clearly, though, big dreams are only rewarded with the depleted melancholy of a future forever deferred. We Ajax hackers color in the margins because we know all-too-well that Firefox hasn't won. Yeah, it's great for the 20% that have picked it up, but we can't count on them. People, it seems, don't easily change their habits regarding which button on the desktop to click to get to "the google".

Let's put the last year's market numbers for browsers in some perspective. From July '07 to July '08, IE 6 market share declined from ~45% to ~26%, a drop of 19 points. Given that we know that the Win2K market share numbers are now well below 5% on the public interwebs, this gives us serious reason to hope: if the remaining 25% could just be convinced to get with the century and run Windows Update (12 times or so), we'll all be able to target the best in late-'90s browser technology. In the same year-ish timeframe, IE 7 went from ~33% to ~47%, a gain of 14 points. That leaves 5 points of the churn on the table for non-MSFT browsers, most of which was picked up by Firefox. Since Win2K was still below 5% a year ago, who are these users who pick Firefox and not IE 7?
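The churn arithmetic in that paragraph is easy to lose in prose, so here it is as a short Python sketch. It uses only the approximate shares quoted above; the variable names are mine. It also separates the drop in share points from the relative decline, which are easy to conflate.

```python
# Approximate shares (percent of all users) quoted in the text.
ie6 = {"jul_2007": 45, "jul_2008": 26}
ie7 = {"jul_2007": 33, "jul_2008": 47}

ie6_loss = ie6["jul_2007"] - ie6["jul_2008"]   # points lost by IE 6
ie7_gain = ie7["jul_2008"] - ie7["jul_2007"]   # points gained by IE 7
left_over = ie6_loss - ie7_gain                # points that went elsewhere

print(ie6_loss, ie7_gain, left_over)           # 19 14 5

# The same IE 6 loss expressed relative to where IE 6 started.
print(f"{ie6_loss / ie6['jul_2007']:.0%}")     # 42%
```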

Part of the story about the uptake of IE 7 is the story of Vista. Vista was hovering near 5% in June of last year and a year later had taken a nearly 15% share. Now, it's hard to say anything useful about potential correlations, but if there's a 10-point gain in IE 7-by-default installs of an OS, and if some large percentage of that number came at the expense of non-upgraded XP boxen (likely old computers), it wouldn't be hard to spin a yarn about how Firefox is gaining market share primarily at the expense of IE 6. To get a concrete answer to this will likely mean paying $100 for access to the "pro" version of the Net Applications numbers, but even without it we can say something very concrete about the power of OS bundling of browsers:

Competitor displacement of bundled browsers on the monopoly market-share OS has a demonstrated year-on-year market-share improvement rate of 4%/year.

That 4% a year is pretty consistent since 2004 as well. Ouch. By way of comparison remember that IE 7 is replacing IE 6 at a rate triple that. Obviously that's not apples-to-apples since every browser's "internal" version replacement rates are much higher than their competitor-displacement-rates, but it's clear that most users aren't making choices about browsers. Auto-upgrades are largely doing their thing and users are making choices about OSes and (mostly) living with whatever shows up on the desktop.

This isn't to say that Firefox can't do better and that it's not having an effect. It appears that the adoption rate for Firefox is going to have improved this year versus last, perhaps significantly. Safari is also making inroads into IE's market-share, both as a result of iTunes bundling and Mac market-share gains in the face of Vista. It's also unlikely that IE 7 and IE 8 would be happening but for the competition that Firefox has brought. But the take-away here is as powerful as it is dispiriting: we may be able to abandon IE 6 in another year or two, but no matter who works to displace it, IE 7 is going to be with us a long, long time. Worse, the sustained rate of competitor-displacement in the browser market is now much, much lower than it was in the previous era of browser competition. In one sense, competition is working in that every browser vendor is creating new versions. But the bigger picture remains: Flash can get to "ubiquitous" across the entire web with new capabilities in roughly 18 months and the Open Web faces a best case replacement time-frame of 5 years.

Reducing that differential from 42 months to zero is now the defining challenge of the Open Web. HTML is back in the hunt. Time to see how fast we can teach it to remember the new tricks we're so eager to teach.

The Price of Anonymity: Our Principles?

I'm blessed with many friends in the Bay Area and incredibly grateful to count Caryl Shaw among them. It was pretty horrifying, then, to see the Digg "commentary" on an article which she wrote for PC Gamer. Luckily, much of the worst of the lot are being modded down as time goes on, but seriously, who really thinks that blatantly sexist comments are passable in 2008? That those kinds of comments occur on high-volume sites like Digg or Slashdot, sadly, doesn't surprise me.

It's really hard to know where to start in pondering the deep-seated misogyny that leads anyone to think that comments along those lines are OK, particularly in a public forum. That's perhaps part of the issue: while public, Digg (and Slashdot, etc.) comments are anonymous enough to give voice to the kinds of behavior that any society must excise if it hopes to achieve anything near its potential. We have shared principles that govern our society because we agree (together) that they're best for everyone and not just some smaller set of people. Anonymity suppresses the social enforcement functions that usually keep this kind of stuff from dominating the discussion by removing the sense of public shame that should be felt when saying vile things about others. Typing away at a keyboard allows one to feel alone but act in public in a way that creates an all-too-common dynamic online.

That got me thinking about OSCON and the talks that get proposed on the topic of gender balance nearly every year (I serve on the program committee). I usually find myself conflicted about such proposals, in part because I think the Open Source world has, in the main, been incredibly dishonest with itself to date regarding gender disparities. Jennifer and I seem to discuss it as it comes up every year, always ending up at the frustrating conclusion that this is the outcome the community allows. Surely this kind of objectionable behavior wouldn't show up so frequently if we were closer to gender balance in the OSS world. But the larger tech world seems to be addressing the topic badly if at all and OSS is no exception. Organizations like LinuxChix, SFWOW, and the Anita Borg Institute seem to me as much a defense mechanism against pervasive misogyny as a viable path forward. Segregation can't be our answer. Luckily there was a great talk this year by Emma Jane Hogbin (good notes here) which got to a lot of the meat of the issue (also, see Pia Waugh's talk summary). I find the discussion about the offhand comments which are tolerated by OSS communities to be particularly spot on: many of these communities have very strict rules about how they build and discuss code but are completely tone-deaf to how they alienate 50% of the world. Under the surface of both gaming and OSS is much the same dynamic at play when it comes to the treatment of women and, well, anyone else who's not a young white male from somewhere in the midwest. I've certainly seen my share of deplorable IRC conversations in rooms ostensibly dedicated to Open Source projects. Small or highly-focused communities might not put up with the crap that passes for discussion on Digg, but as communities grow without a strong set of norms in place and enforced, it seems inevitable that the semi-anonymous nature of the medium begets a hostile environment.

This is about the point where folks jump in to note that anonymity on the internet is a great tool for freedom; a way for the oppressed to express themselves and organize to further causes which are actually worth rallying to. But this argument breaks down quickly here: degenerate behavior in support channels or on discussions about popular links serves no principle, rises to no higher cause than prurient interest, and builds no "community" other than those who tolerate the objectification and denigration of half (or more) of the world's population. Frankly, that's not a community I want any part of.

So what, then, is the lesson for Open Source? Having just spent the week at OSCON, I've been slapped in the face once again by the complete lack of gender balance in Open Source contribution and computer engineering disciplines in general. It's kinda painful to walk around the expo hall and just imagine that for every 5 guys there are 4 women who were insulted, condescended to, or in some other way diverted from the path that would have landed them at OSCON. Simplistic arguments about graduation and enrollment rates are the dismissible results of completely antiquated cultural biases (via a new large-scale UW-Madison study). The UW study makes the case plainly: when we stop expecting differences and behave as though they are abnormal, they go away. Yes, yes, there are evolutionary differences in the physiology of men and women, but nothing that in any way explains anything like the complete dearth of female participation in Open Source. So we are left with just ourselves to blame.

In the Dojo project forums, mailing lists, and IRC channel, there is a strict policy forbidding offensive and lewd behavior. With that basic rule in place and enforced by long-time members of the community, the hostile environment so common elsewhere hasn't formed. That leads to a further puzzle: the Open Source world finds itself debating the moral and practical consequences of obtuse licensing aspects on a daily basis. What makes norms of community behavior around race, gender, and other forms of bias so different and loaded that Open Source community leaders can't or won't speak to them? If we're developing this software with society at large, for society at large, why is the absence of half of society from the process not the largest topic of discussion in the OSS world? It's certainly much more disturbing to me personally than any of the dickering over licenses that consumes so much time and attention.

The gaming world will need to clean up its own act, but the Open Source community doesn't need to wait for that to happen before acting. Unlike for-profit endeavors, Open Source projects have total leeway to act because it's the right thing to do and for no other reason. Open Source communities set standards – codes of conduct, if you will – regarding how code is developed, tested, licensed, and distributed. Open Source project leaders are in the business of setting standards for how well-organized communities act when it comes to code. So why are so many projects stopping there? The Ubuntu community Code of Conduct talks about respect but doesn't mention gender at all and while the OSI Code of Conduct talks about civility, it doesn't describe the norms which the community is held to aside from a reference to their Terms of Service which bury these expectations in 5 pages of legalese. At Dojo we haven't laid out our code of conduct in a document to date, but this latest incident has convinced me now that it's time to do so. Finding ways to modify our expectations around OSS participation by the "missing half" is now something I'm convinced is critical to the future of Open Source and computer science in general.

In that spirit, here's a first draft of a Code of Conduct for all Dojo Foundation projects which I'll send to the main Foundation list today for discussion and, hopefully, adoption. Your thoughts on how it can be improved are much appreciated. It may not change the entire world of Open Source software development, computer science, or for that matter gaming, but we've got to start somewhere. We haven't let the Dojo community be complicit in the kind of misogyny-fueled belligerence that passes for commentary on Digg, so perhaps by codifying those standards we can help create a clean, brightly-lit space where everyone can work, not just young white guys with too much time and not enough perspective.

Update: Emma Jane Hogbin notes that others are starting to run with this too. The Dojo Foundation response to the proposed Code of Conduct has been very positive while there seems to be a lot of skepticism so far on the FLOSS Foundations mailing list regarding the need for a pan-Foundation statement of conduct principles. It'll be interesting to see where it goes from here.

Update 2: as I was listening to my podcasts this evening, I ran across a fascinating On The Media piece from this week that's pretty much required listening on this topic. Amazing and introspective stuff.

Update 3: What would Digg be like with Yog Rules?