Why Browser Security UI Isn't Specified
Update: my colleague Emily Stark has just written an essential-reading post on this topic. We probably need to write something about all of the non-prompt controls that Chrome has put in place behind the patterns discussed in this post, but Emily's post really drives home the case for flexibility with practical examples.
From time to time, working groups (particularly at the W3C) are asked to adopt requirements on browser-presented UI (pixels drawn outside or over-top the page content) in the normative text of a specification. For example, someone might advocate for text like along the lines of "before proceeding to Step 5, conforming User Agents MUST request user consent". Less frequently, advocates will push for the inclusion of specific text to be presented, or in the most extreme cases, specific UI layouts.
The most intense of these debates occur in the context of security-sensitive moments: warning users of potential risks, requesting permission to access sensitive features, etc. Thankfully, these efforts usually fail in the face of bristling browser feedback.
What's going on here? Don't browsers care about providing consistent UI? Don't they want the best possible security?
The answer to both questions is an emphatic yes. Counter-intuitively, it is because browsers care so much about security, privacy, and consistency (in that order) that we err on the side of rejecting constraints on security UI. In the modern era, that is paired with API design patterns that maximize flexibility about these moments.
Bit Shifter Priorities #
Web Standards are voluntary. The force that most powerfully compels their adoption is competition, rather than regulation. In practice, this means that vendors of integrated browsers have final say over what their browsers do. Those who ship the bits, rather than those who write the spec text, are ultimately responsible for the properties of the subsequent experience. This is the basis for functioning browser competition. Those who shift the bits call the shots.
This is an inherent property of modern browsers. Vendors participate in standards processes not because they need anyone else to tell them what to do, and not because they are somehow subject to the dictates of standards bodies, but rather to learn from developers and find agreement with competitors in a problem space where compatibility returns outsized gains.
Web Standards can be understood as liberally licensed, meticulously ventilated documentation about the detailed workings of a system that implementers agree to potentially be wrong about together.
Normative restrictions in a well-built spec should only cover aspects of a design where benefits of consistency and compatibility reliably outweigh the costs to getting some detail wrong. That is, if an API is found to be wanting in some way, the specified bits of the design are limited to aspects vendors can take time to address through an iterative, deliberative process.
Of course, there are classes of errors that vendors are not willing to be wrong about for the time it takes to forge new consensus. Chief among them are security vulnerabilities. Pervasive, abusive behaviour is not far behind. There are dozens of instances over the decades of browsers unilaterally breaking spec'd behaviours in the interest of guarding users. UI prescriptions in specifications that intersect with these concerns will be blithely ignored by vendors looking to protect users and their own reputations. This flows directly from the frequently-cited priority of constituencies.
Normative text about properties that are neither developer exposed nor are likely to be upheld in the breech create long-term liabilities. Over time, predictable conformance breakage — either from incident response or iterative exploration toward better solutions — has the effect of putting asterisks over otherwise-implementable specifications. This creates credibility problems for specs, their editors, working groups that back them, reviewers who do not flag unenforceable text, and SDOs that ratify toothless language. In a world of voluntary standards, reputation matters, and bad spec language can create a lasting negative impression.
Hitchen's Razor #
One of the strongest reasons to avoid making
MUST (or even
SHOULD)-strength statements about what browsers present is the vast depth of our collective ignorance.
Security UX is a relatively young field, and the set of organisations funding research into it is relatively small. We know from practice that understandability of systems is a major challenge and that users are poor at judging risk. This often causes those concerned about specific risks to demand prompts or user confirmation, but evidence also teaches that we risk "dialog blindness" through repeated re-confirmations.
The most successful examples of guiding users to safety in recent years involve a great deal of trial-and-error and follow-up research. Our best tools, collectively, are studies of variations and watching behaviour at scale. There are no models that I'm aware of that consistently maximize user understanding of choice over time while reducing decision burdens.
Consequently, as API designers, we owe it to our UXR peers to maximize their ability to iterate on potential presentations in service of those dual goals. Failing to acknowledge our own ignorance is a surefire way to paint ourselves into a bad corner.
If the case for a specific UI treatment on the basis of reckons is suspect, the case for consistency between browsers is doubly so. There is some evidence that a not-inconsiderable subset of desktop users may encounter multiple browsers in a week, but it is not a majority. Consistency between them should not then be a more important goal than internal consistency within a single browser's UI surfaces and/or consistency with underlying OS affordances (which also differ). Unlike the APIs that might invoke them, these UIs are not developer defined or exposed, removing the usual argument for consistency in a web standard.
Frustratingly, intense demands are often paired with generalized statements of the form "I think..." and "users don't know X...", without associated research. These reckons can take up a great deal of time when they should fail-fast against Hitchen's Razor, particularly given the browser community's history of iteration on these surfaces. Should future proposals for normative browser security surfaces come with peer-reviewed, wide-scale UXR attached then, perhaps, it may be worth revisiting blanket push-back.
Until then, and given our vast lack of understanding, it seems prudent to design defensively while expanding each spec's
Privacy and Security Considerations sections to exhaustively enumerate risks that UX researchers will be asked to navigate. Designing for maximum iteration potential in this area has been, thus far, the best way to be least-cumulatively-wrong.
Integrated browsers are those that ship an engine implementation along with browser UI. "WebView browsers" are an example of non-integrated browsers; they are able to define browser UI, can't change the behaviour of the entire API surface. They are hostage in some measure to the system-provided WebView for consequential choices that affect both users and developers. Browser vendors are hopping mad at Apple about iOS because — unlike every other commodity OS, including ChromeOS — it restricts integrated browsers to a single (first) party, forcing all other vendors to implicitly market Apple's (sub-standard) idea of what a browser can be through WebView-browsers with their brands...or forfeit the right to reach iOS users entirely. This undermines the competition that is central to progress and should be viewed by all users (and particularly web developers) as an intolerable distortion. If there is anything good to be said for browser engine diversity, it is that it spurs competition, sparking improvement across the board. Undercutting competition is, therefore, striking directly at the heart of the reason to want multiple engines in the first place. ↩︎
End-to-end vendor responsibility in integrated browsers — and the inability to make better decisions about consequential choices within WebView-based browsers — also explains why the Hotel Cupertino clause is a slow-moving disaster for progress, depressing the rate of innovation on the web overall. ↩︎
For a live example, consider the HTML5 History API. It's a hot mess. But it's an overwhelmingly compatible mess. As we dig into potential solutions, none of them involve breaking changes to the existing APIs and, if breaking changes were proposed, would only be done so on the basis of close study (e.g. via usage statistics), potentially spanning multiple quarters. The plan to deprecate of Web Components V0 APIs, for example, is in its second year of execution, with a lifeline for deprecation extending until at least next year. We'd expect breaking changes to the HTML5 History API to require a multiple on that timeline given relative use. Therefore, proposed fixes are largely constrained to compatible extensions and new, parallel API surfaces that can be incrementally adopted by developers over a span of years. By the same token, we have been willing to subvert the behavior of this same API in breaking ways to prevent abuse of users on relatively tight timelines, and without consensus. Security and abuse issues are a different class of concern and reliably trump compatibility. ↩︎
Non-normative notes and external "explainer" documents can help supplement the core text and provide examples of ideas for implementers to keep in mind. If a spec must contain some text that tries to talk about browser UI, this is the way to do it. ↩︎