Service Workers and PWAs: It’s About Reliable Performance, Not “Offline”

A lot of smart folks keep asking me why AppCache isn’t a good enough solution for “offline” and why it was necessary to invent Service Workers. It’s a great question!

The regrettably uneven browser support landscape for Service Workers creates a real incentive to “just make something work offline” on iOS or old IE. But that phrasing obscures the primary experience difference between native apps and web content: native apps always “boot” when you tap them. The legacy web, by contrast, can take as long as the TCP timeout (2 minutes on many devices) to end in failure. That’s a looooong time to be staring at a white screen.

But doesn’t AppCache let you cache documents you might want offline? Sort of. Kinda. Ish. Turns out this only works in trivial cases. The issue is that the AppCache design only allows you to do “routing” using the FALLBACK section, and to trigger the FALLBACK URL (which can boot up, inspect its environment, and do something custom), the request needs to have actually failed. To handle the general case — a site with an unbounded set of URLs — that means users are waiting on a flaky, slow network for minutes (if they’re patient) before the router page starts, which might then forward them on to content or a custom offline experience.
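
To make that concrete, here’s a minimal sketch of an AppCache manifest that does FALLBACK-based routing. The file names are illustrative; the key point is that /offline-router.html only gets a chance to run after the network request for a URL has already failed:

```
CACHE MANIFEST
# v1: bump this comment to force a (fully atomic) re-download

CACHE:
/offline-router.html
/app.css

FALLBACK:
# Any same-origin URL that fails at the network falls back to the
# router page, which can then inspect its environment and redirect.
/ /offline-router.html

NETWORK:
*
```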

“But wait!”, you say, “doesn’t AppCache also allow you to put documents in the cache directly, bypassing all of that?”. Yes, it does, but because of AppCache’s atomic update system and practical limits on storage space, the set of URLs in this list needs to be small and change infrequently. You see this play out, for example, in the AppCache manifests that major Google services like Google Docs generate. These services wind up with major complexity related to creating manifests on a per-user basis and managing the entries in them. Those entries are carefully curated from amongst the entire set of possible URLs to ensure high performance for the entries users are likely to visit offline, but this still leaves everything else at the mercy of network flakiness.

It’s hard to stress enough how much better reliable performance is. Think of it this way: the first load of a native app sucks. It’s gated by an app store and a huge download, but once the app is installed, that up-front cost is amortized across all app starts, and none of those starts has a variable delay. Each application start is as fast as the last; no variance. What native apps deliver is reliable performance.

Until Service Workers arrived, it simply wasn’t possible to achieve this on the web, and that’s why I designed Chrome’s Add-to-Homescreen prompting behavior to require a Service Worker that responds quickly for the URL listed in the Manifest’s start_url. The heuristic is designed to only promote experiences that are “app-like”, recognizing that a key differentiator between the “old web” and native apps is reliable performance.
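
What does “responds quickly” look like in practice? Here’s a minimal sketch: precache a shell at install time, then answer requests cache-first so start_url renders without waiting on the network. The cache name and file list are assumptions for illustration, not anything the heuristic mandates.

```js
// Illustrative Service Worker: the cache name and URL list below are
// assumptions for this sketch, not requirements of the heuristic.
self.addEventListener('install', (event) => {
  event.waitUntil(
    caches.open('shell-v1').then((cache) =>
      cache.addAll(['/', '/app.css', '/app.js'])
    )
  );
});

self.addEventListener('fetch', (event) => {
  // Cache-first: start instantly from the shell; hit the network only
  // for things that weren't precached.
  event.respondWith(
    caches.match(event.request).then(
      (cached) => cached || fetch(event.request)
    )
  );
});
```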

Remy Sharp blogged about the experience and expectation differences that we’ve worked hard to bake into Progressive Web Apps. Reliable performance is the most important of these and the reason Progressive Web Apps require Service Workers. It isn’t that “offline” support doesn’t matter — it does — but apps that work well offline are a subset of things that are apps, experiences you can trust to start any time, anywhere, on any connection.

I’ll be speaking more about this at Google I/O in a few weeks, and I’m hugely excited about the ways that the web is going to get better this year; starting with experiences that always work.

EWS Melbourne

What follows is roughly the text of a talk I gave last week at the Extensible Web Summit in Melbourne.


The Point of Extensibility

Mark, apparently, “volunteered” me to give a lightning talk last night over a dinner that I wasn’t at, so apologies in advance if I run short or long.

Something that comes up frequently in our work on the TAG is the relationship between extensibility as a principle and how it relates to specific features we want in the web platform. To get anywhere in these debates, I think it’s worth zooming out a bit.

The web has a strange origin story: we didn’t build our way up from assembler and C, and the notion of moving words around memory is so far removed from the level of abstraction that HTML, JavaScript, and CSS provide that you can barely see it from there. Even VBScript, perhaps the most relevant contemporary, had a story for how that layering worked. The web, for decades, managed to do without.

Extensibility, then, has been an effective tool in the modern era for trying to understand the hidden linkages between the strange fauna and flora of this alien world. We can almost always get somewhere by asking dumb questions like “so, how does this thing relate to that other thing over there?”. We nearly always turn up some missing primitive that we can catalog and reuse on our shared exploration.

But I’d submit that Extensibility is a tool in the same sense as Standards: it’s possible to drive yourself mad with them if you lose track of the goal. Despite our setting today, this isn’t academic. We’re doing it all for a reason, and that reason needs to be a goal we share.

I can’t set goals for you, but I can tell you what mine are and ask you to join me in them. My specific goal, then, is to improve the rate of progress on the web for the benefit of users and developers, in that order.

Put another way, I want to ensure the web is the platform developers invest in first and most, and that they do so because it’s the easiest, best way to deliver great experiences.

With that in mind, it’s easier to be kind to proposals that come to the TAG for high-level features, particularly if they’re forging new ground or opening up new fundamental capabilities that the web didn’t have but that will enrich users’ experiences by improving developers’ options.

Extensibility and Standards are incredibly useful tools but they make crummy religions.

There will be new features that aren’t extensible by default. That’s OK. Not ideal, but OK and, sometimes, unavoidable. What is avoidable is leaving the web’s features sealed shut, never asking the questions about how things relate; never connecting the dots.

That would be a damned shame. I’m grateful the TAG has joined some of us insurrectionists in asking these questions, but we’ve got a long way to go. And my hope as we go there together is that we don’t mistake the means for the goals.

Thanks.

Doing Science On The Web

Cross-posted at Medium

This post is about vendor prefixes, why they didn’t work, and why it’s toxic not to be able to launch experimental features. Also, what to do about it. The arguments require nuance and long-term thinking. Brace yourself.

Vendor prefixes are a very sore topic, and one where I’ve disagreed with the overwhelming consensus. In the heat of the ’11–12 debate (“prefixpocalypse”) I tried to outline a hierarchy of web platform needs:

  1. Meeting developer & user experience needs with new features
  2. Eventual interoperability for successful features
  3. Minimizing harm to the ecosystem from experiments-gone-wrong

The debate and subsequent (conflicting) prohibitions & advice centered on the third point: minimizing pollution.

Recall that in 2012, Google, Apple, Blackberry, and a host of other vendors were all shipping browsers based on a single CSS engine (WebKit) without changing the -webkit-* prefixes to be vendor-specific; e.g. -cr-*, -apple-*, or -bb-*. As a result, many experimental features experienced premature compatibility: developers could get the benefits of broad feature support without a corresponding standard. This backed non-WebKit-based browsers into a terrible choice: “camp” on the other vendor’s prefixed behavior to render content correctly, or suffer a loss of user and developer loyalty.

This illustrates what happens when experiments inadvertently become critical infrastructure. It has happened before. Over, and over, and over again.

Prefixes were supposed to allow experimentation while discouraging misuse, but in practice they don’t. Prefixes “look” ugly, and the thought was that ugliness, combined with web developers’ aversion to proprietary gunk, would cause sites to stop using them once standards were in place and browsers implemented them. But that’s not what happens.

Useful features that live a long time in the “experimental” phase tend to get “burned in”, particularly if the browsers supporting them are widely used. Breaking existing content is the third rail for browsers; all of their product instincts and incentives keep them from doing it, even if the breakage comes from retracting proprietary features. This means that many prefixed properties continue to work long after standard versions are added. Likewise, sites and pages that work with prefixes are all-too-easy for web developers to write and abandon. It’s unsettling to remove a prefix when you might break a user with an old browser. Maintenance of both sites and browsers rarely subtracts, but the theory of prefixes hinges on subtraction.
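
As a sketch of how this plays out in stylesheets (the property and values here are just an example), stanzas like the following pile up, and the prefixed lines almost never get deleted once the standard one ships:

```css
/* Written during the experimental phase, then abandoned in place.
   Deleting the prefixed lines risks breaking users on old browsers,
   so they linger long after the standard property is universal. */
.card {
  -webkit-border-radius: 4px;
  -moz-border-radius: 4px;
  border-radius: 4px;
}
```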

Everyone who uses prefixes, browser engineers and web developers alike, starts down the path thinking they’ll stop at some point. But for predictable reasons, that isn’t what happens. Good intentions are not an effective prophylactic. Not for web developers or browser makers (to say nothing of amorous teens).

This situation is the natural consequence of platform and developer time-scales that are out of sync. Browsers move more slowly than sites (at the micro scale), but sites must contend with huge browser diversity and are therefore much more conservative about removing “working” code than browser engineers expected.

Now What?

Years after Prefixpocalypse, everyone who works on a browser understands that prefixes haven’t succeeded in minimizing harm, yet vendors proudly announce new prefixed features and developers blithely (ab)use them. Clearly, the need for new features trumps interoperability and pollution concerns. This is natural and perhaps even healthy. A static web, one which doesn’t do more to make lives better, is one that doesn’t deserve to thrive and grow. In technology as in life, there is no stasis, only various speeds of growth or decay.

Browsers could stop prefix pollution of the ecosystem by simply vowing not to add new features. This neatly analyses the problem (some experiments don’t work out, and some get out of hand) and proposes a solution (no experimentation), but as H.L. Mencken famously wrote:

…there is always a well-known solution to every human problem — neat, plausible, and wrong.

We have already run a natural experiment in this area. At the low point after the first browser war, Microsoft (temporarily) shrank from the challenge of building the web into a platform. Meanwhile, IE 6’s momentum assured its place as the boat-anchor browser. Between 2002 and 2006, the web (roughly) didn’t add any new features. Was that better? Hardly. I’m glad to be done with 9-table-cell image hacks to accomplish rounded corners. Not all change is progress, but without change there is no progress.

Or, put better by W3C Memes:

“One does not simply ship no new features for a year and remain competitive”

We do need new features, and we’d like good versions of them — fewer document.alls, WebSQLs, and AppCaches, thanks.

We know from experience developing software of all kinds that more iteration yields better results. Experimentation, chances to learn, and opportunities to try alternatives are what separate good ideas from great products. Members of the Google Gears team report that they considered building something like Service Workers. Instead, they built an AppCache-style system that failed in all the ways AppCache later failed (which they couldn’t have known at the time). It shouldn’t have taken 6+ years to course-correct. We need to be able to experiment and iterate. Now that we understand the problems with prefixes, we need another mechanism.

Experiments That Stay Experiments

Prefixpocalypse happened because experiments escaped the lab. Wide-scale use of experimental properties isn’t healthy. Because prefixed properties were available to any site (no matter how large), it was straightforward for the killer combination of broad browser support and major-site usage to ensure that compatibility concerns would work against ever ending the experiment. The key to doing better, then, is to limit the size of the experimental population.

The way prefixes were run was like making a new drug available over the counter as soon as a promising early trial was conducted, skipping animal, human, and large-scale clinical trials. Of course that would be ludicrous; “first do no harm” requires starting with a small population, showing efficacy, gathering data about side-effects, and iterating.

In the web platform, the missing ingredient has been the ability to limit the experimental population. Experiments can run for a fixed duration without fear of breaking the web if we can be sure they never imperil the whole web in the first place. Short durations and small, committed test populations allow for more iteration, which should, in the end, lead to better features. Web developer feedback needs to be the most important voice in the standards process, and we’ll never get there until web developers have more ability to participate in feature evolution.

Experimental outcomes are ammo for the standards development process; in the best-case they can provide good evidence that a feature is both needed and well-designed.

Putting evidence at the core of web feature and standards development is a 180° change from the current M.O., but one we sorely need.

So how do we get there?

Some mechanisms I’ve thought through and rejected (with reasons):

  • “Just have users flip things in about:flags”
    This doesn’t limit the size of the experimental population: if every site encourages users to flip a particular flag, odds are that enough users will do so to push usage above a red-line threshold.

  • “Enable it by default on your Beta/Dev channel browser”
    Like the flag-flipping mechanism, this puts a burden on users, which is exactly backward. Experiments will get better feedback when developers can work with features without the friction of asking users to switch browsers.

The Chrome Team has been thinking about this problem for the past several years, including conversations with other vendors, and those ideas have congealed into a few interlocking mechanisms that haven’t been rejected:

  1. Developer registration & usage keys.
    A large part of the reason it’s difficult to change developer behavior around experimental features is that it’s hard to find the people responsible! Who would you call to talk about use of some prefixed CSS thing on facebook.com? I don’t know either. Having an open communication channel is critical to learning how features are working (or not) in the real world. To that end, new experimental features will be tied to specific origins using keys vended by a developer program; sites supply the keys to the browser through header/meta tags, enabling the features dynamically (a sketch of how that might look follows this list). Registration for the program will probably require giving a (valid) email address and agreeing to answer survey questions about experimental features. Because of auto-self-destruct (see below), there’s less worry that these experiments will be abused to provide proprietary features to “preferred” origins. Public dashboards of running experiments and their users will ensure transparency to this effect.

  2. Global usage caps.
    The Blink project generally uses a ~0.03% usage threshold to decide whether it’s plausible to remove a feature. Experimenters might use our Use Counter infrastructure and RAPPOR to monitor use. Any feature that breaches this threshold can automatically close the experiment to new users and, if any individual site goes above ~0.01% (global) use, a config update can be pushed to throttle use on that site.

  3. Feature auto-self-destruct.
    Experimental features should be backed by a process that’s trying to learn. To enable this, we’re going to ensure that each version of an experimental feature auto-self-destructs, tentatively set at 12–18 weeks per experiment. New iterations which are designed to test some theory can be launched once an experiment has finished (but must have some API or semantic difference, preferably breaking). Sites that want to opt into the next experiment and were part of a previous group will be asked survey questions in the key-update process (which is probably going to be a requirement for access to future experimental versions). Experiments can overlap to provide continuity for end-users who are willing to move to the next-best-guess and provide feedback.
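
Here’s the key-delivery sketch promised in point 1. None of this has shipped, so the meta tag name, token format, and API name below are all hypothetical placeholders:

```html
<!-- Hypothetical names: no such mechanism has shipped yet. The token
     would be vended per-origin (and per-feature) by the developer
     program described above. -->
<meta http-equiv="experimental-feature-key"
      content="base64-token-bound-to-this-origin">
<script>
  // Feature-detect before use: the experiment may have hit its usage
  // cap or auto-self-destructed since the key was issued.
  if ('someExperimentalAPI' in window) {
    // Exercise the feature; be ready to answer survey questions.
  } else {
    // Fall back gracefully. Experiments aren't infrastructure.
  }
</script>
```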

We’re also going to work to ensure that the surfaced APIs are done in a responsible way, including feature-detection where possible. These properties add up to a solution that gives us confidence that we can create Ctrl-Z for web features without damaging users or sites.

In discussions with our friends in the community and at other browser vendors, we’ve thought through alternative ways to throttle or shrink the experimental population: randomness in API names, limiting APIs to postMessage-style calling, or shortening experiment lifetimes. As Chrome is going first, we’ll be iterating on the experimental framework to try to strike the right balance: enough use to learn from, but not so much that we inadvertently commit to an API. We’ll also be sharing what we learn.

My hope is that other browsers implement similar programs and, as a corollary, cease use of prefixes. If they do, I can imagine many future areas for collaboration on developing and running these experiments. That said, it’s desirable for different browsers to try different designs; we learn more through diversity than premature monoculture.

Moving faster and building better features don’t have to be in tension; we can do better. It’s time to try.

Thanks to Owen Campbell-Moore, Joe Medley, Jeff Yasskin, Adrian Bateman, Jake Archibald, Ian Clelland, Michael Stillwell, Addy Osmani, Chris Wilson, and Paul Irish for their invaluable feedback on drafts of this post.

A Funny Thing Happened On The Way To The Future…

There’s a post on the fetch() API by Ludovico Fischer doing the rounds. As a co-instigator for adding the API to the platform, I find it a curious thing to read commentary about an API I helped design, and this one more than most. It brings together the epic slog that was the Promises design (which we also waded into in order to get Service Workers done, and which will improve with await/async) with the in-process improvements that will come from Streams, and mixes in a dollop of FUD, misunderstanding, and derision.

This sort of article is emblematic of a common misunderstanding. It expresses (at the end) a latent belief that there is some better alternative available for making progress on the web platform than to work feature-by-feature, compromise-by-compromise towards a better world. That because the first version of fetch() didn’t have everything we want, it won’t ever. That there was either some other way to get fetch() shipped or that there was a way to get cancellation through TC39 in ’13. Or that subclassing is somehow illegitimate and “non-standard” (even though the subclass would clearly be part of the Fetch Standard).

These sorts of undirected, context-free critiques rely on snapshots of the current situation (particularly deficiencies thereof) to argue implicitly that someone must be to blame for things not yet being as good as the future we imagine. To get there, one must apply skepticism to all future progress: “who knows when that’ll be done!” or “yeah, fine, you shipped it, but it isn’t available in Browser X!!!”.

They’re hard to refute because they’re both true and wrong. It’s the prepper/end-of-the-world mentality applied to technological progress. Sure, the world could come to a grinding halt, society could disintegrate, and the things we’re currently working on could never materialize. But, realistically, is that likely? The worst-case-scenario peddlers don’t need to bother with that question. It’s cheap and easy to “teach the controversy”. The appearance of drama is its own reward.

Perhaps most infuriatingly, these sorts of cheap snapshots laced with FUD do real harm to the process of progress. They make it harder for the better things to actually appear because “controversy” can frequently become a self-fulfilling prophecy; browser makers get cold feet for stupid reasons, which can create negative feedback loops of indecision and navel-gazing. It won’t prevent progress forever, but it sure can slow it down.

I’m disappointed in SitePoint for publishing the last few paragraphs in an otherwise brilliant article, but the good news is that it (probably) won’t slow Cancellation or Streams down. They are composable additions to fetch() and Promises. We didn’t block the initial versions on them because they are straightforward to add later and getting the first versions done required cuts. Both APIs were designed with extensions in mind, and the controversies are small. Being tactical is how we make progress happen, even if it isn’t all at once. Those of us engaged in this struggle for progress are going to keep at it, one feature (and compromise) at a time.
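
To make the composability point concrete, here’s a rough sketch. The Promise-based surface ships today; the body-reader in the second half was still being specified when this was written, so treat its details as illustrative rather than final:

```js
// Shipping today: the Promise-based surface of fetch().
fetch('/data.json')
  .then((response) => response.json())
  .then((data) => console.log('got', data));

// Streams layer onto the same Response object later, without changing
// the base API. (The body-reader details were still in flux when this
// was written; treat them as illustrative.)
fetch('/big-file').then((response) => {
  const reader = response.body.getReader();
  let received = 0;
  function pump() {
    return reader.read().then(({ done, value }) => {
      if (done) {
        console.log('streamed', received, 'bytes');
        return;
      }
      received += value.byteLength;
      return pump();
    });
  }
  return pump();
});
```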