Infrequently Noted

Apple Is Not Defending Browser Engine Choice

June 23, 2022
Updated: June 27, 2022

This is part six of the nine-part series "Browser Choice Must Matter"

Gentle reader, I made a terrible mistake. Yes, that's right: I read the comments on a MacRumors article. At my age, one knows better. And yet.

As penance for this error, and for being short with Miguel, I must deconstruct the ways Apple has undermined browser engine diversity. Contrary to claims of Apple partisans, iOS engine restrictions are not preventing a "takeover" by Chromium — at least that's not the primary effect. Apple uses its power over browsers to strip-mine and sabotage the web, hurting all engine projects and draining the web of future potential.

As we will see, both the present and future of browser engine choice are squarely within Cupertino's control.

Apple's Long-Standing Policies Are Anti-Diversity
"WebKit couldn't compete if it had to."
Browsers Are Big Business
WebKit Is No Charity
Choices, Choices
- Recent Developments
How Apple Gutted Mozilla's Chances
Back Of The Napkin
The Best Kind Of Correct
What Now?

Apple's Long-Standing Policies Are Anti-Diversity #

A refresher on Apple's iOS browser policies:

From iOS 2.0 in '08 to iOS 14 in late '20, Apple would not allow any browser but Safari to be the default.
For 14 years and counting, Apple has prevented competing browsers from bringing their own engines, forcing vendors to build skins over Apple's WebKit binary, which has historically been slower, less secure, and lacking in features.
Apple will not even allow competing browsers to provide different runtime flags to WebKit. Instead, Fruit Co. publishes a paltry set of options that carry an unmistakable odour of first-party app requirements.
Apple continues to self-preference through exclusive API access for Safari; e.g., the ability to install PWAs to the home screen, implement media codecs, and much else.

Defenders of Apple's monopoly offer hard-to-test claims, but many boil down to the idea that Apple's product is inferior by necessity. This line is frankly insulting to the good people that work on WebKit. They're excellent engineers; some of the best, pound for pound, but there aren't enough of them. And that's a choice.

"WebKit couldn't compete if it had to." #

Nobody frames it precisely this way; instead they'll say, if WebKit weren't mandated, Chromium would take over, or Google would dominate the web if not for the WebKit restriction. That potential future requires mechanisms of action — something to cause Safari users to switch. What are those mechanisms? And why are some commenters so sure the end is nigh for WebKit?

Recall the status quo: websites can already ask iOS users to download alternative browsers. Thanks to (belated) questioning by Congress, they can even be set as the user's default, ensuring a higher probability to generate search traffic and derive associated revenue. None of that hinges on browser engine choice; it's just marketing. At the level of commerce, Apple's capitulation on default browser choice is a big deal, but it falls short of true differentiation.

So, websites can already put up banners asking users to get different browsers, If WebKit is doomed, its failure must lie in other stars; e.g., that Safari's WebKit is inferior to Gecko and Blink.

But the quality and completeness of WebKit is entirely within Apple's control.

Past swings away from OS default browsers happened because alternatives offered new features, better performance, improved security, and good site compatibility. These are properties intrinsic to the engine, not just the badge on the bonnet. Marketing and distribution have a role, but in recent browser battles, better engines have powered market shifts.

To truly differentiate and win, competitors must be able to bring their own engines. The leads of OS incumbents are not insurmountable because browsers are commodities with relatively low switching costs. Better products tend to win, if allowed, and Apple knows it.

Destkop <abbr>OS</abbr>es have long created a vibrant market for browser choice, enabling competitors not tied to OS defaults to flourish over the years. — Destkop OSes have long created a vibrant market for browser choice, enabling competitors not tied to OS defaults to flourish over the years.

Apple's prohibition on iOS browser engine competition has drained the potential of browser choice to deliver improvements. Without the ability to differentiate on features, security, performance, privacy, and compatibility, what's to sell? A slightly different UI? That's meaningful, but identically feeble web features cap the potential of every iOS browser. Nobody can pull ahead, and no product can offer future-looking capabilities that might make the web a more attractive platform.

This is working as intended:

<a href='https://developer.apple.com/app-store/review/guidelines/#2.5.6'>Apple's policies</a> explicitly prevent meaningful competition between browsers on iOS. In 2022, you can have any default you like, as long as it's as buggy as Safari. — Apple's policies explicitly prevent meaningful competition between browsers on iOS. In 2022, you can have any default you like, as long as it's as buggy as Safari.

On OSes with proper browser competition, sites can recommend browsers with engines that cost less to support or unlock crucial capabilities. Major sites asking users to switch can be incredibly effective in aggregate. Standards support is sometimes offered as a solution, but it's best to think of it as a trailing indicator.^[1] Critical capabilities often arrive in just one engine to start with, and developers that need these features may have incentive to prompt users to switch.

Developers are reluctanct to do this, however; turning away users isn't a winning growth strategy, and prompting visitors to switch is passé.

Still, in extremis, missing features and the parade of showstopping bugs render some services impossible to deliver. In these cases, suggesting an alternative beats losing users entirely.

But what if there's no better alternative? This is the situation that Apple has engineered on iOS. Cui bono? — who benefits?

All iOS browsers present as Safari to developers. There's no point in recommending a better browser because none is available. The combined mass of all iOS browsing pegged to the trailing edge means that folks must support WebKit or decamp for Apple's App Store, where it hands out capabilities like candy, but at a shocking price.

iOS's mandated inadequacy has convinced some that when engine choice is possible, users will stampede of away from Safari. This would, in turn, cause developers to skimp on testing for Apple's engine, making it inevitable that browsers based on WebKit and other minority engines could not compete. Or so the theory goes.

But is it predestined?

Perhaps some users will switch, but browser market changes take a great deal of time, and Apple enjoys numerous defences.

To the extent that Apple wants to win developers and avoid losing users, it has plenty of time.

It took over five years for Chrome to earn majority share on Windows with a superior product, and there's no reason to think iOS browser share will move faster. Then there's the countervailing evidence from macOS, where Safari manages to do just fine.

Regulatory mandates about engine choice will also take more than a year to come into force, giving Apple plenty of time to respond and improve the competitiveness of its engine. And that's the lower bound.

Apple's pattern of malaicious compliance will likely postpone true choice even futher. As Apple fights tooth-and-nail to prevent alternative browser engines, it will try to create ambiguity about vendor's ability to ship their best products worldwide, potentially delaying high-cost investment in ports with uncertain market reach.

Cupertino may also try to create arduous processes that force vendors to individually challenge the lack of each API, one geography at a time. In the best case, time will still be lost to this sort of brinksmanship. This is time that Apple can use to improve WebKit and Safari to be properly competitive.

Why would developers recommend alternatives if Safari adds features, improves security, prioritises performance, and fumigates for showstopping bugs? Remember: developers don't want to prompt users to switch; they only do it under duress. The features and quality of Safari are squarely in Apple's control.

So, given that Apple has plenty of time to catch up, is it a rational business decision to invest enough to compete?

Browsers Are Big Business #

Browsers are both big business and industrial-scale engineering projects. Hundreds of folks are needed to implement and maintain a competitive browser with specialisations in nearly every area of computing. World-class experts in graphics, networking, cryptography, databases, language design, VM implementation, security, usability (particularly usable security), power management, compilers, fonts, high-performance layout, codecs, real-time media, audio and video pipelines, and per-OS specialisation are required. And then you need infrastructure; lots of it.

How much does all of this cost? A reasonable floor comes from Mozilla's annual reports. The latest consolidated financials (PDF) are from 2020 and show that, without marketing expenses, Mozilla spends between $380 and $430 million US per year on software development. Salaries are the largest category of these costs (~$180-210 million), and Mozilla economises by hiring remote employees paid local market rates, without large bonuses or stock-based compensation.

From this data, we can assume a baseline cost to build and maintain a competitive, cross-platform browser at $450 million per year.

Browser vendors fund their industrial-scale software engineering projects through integrations. Search engines pay browser makers for default placement within their products. They, in turn, make a lot of money because browsers send them transactional and commercial intent searches as part of the query stream.

Advertisers bid huge sums to place ads against keywords in these categories. This market, in turn, funds all the R&D and operational costs of search engines, including "traffic acquisition costs" like browser search default deals.^[2]

How much money are we talking about? Mozilla's $450 million in annual revenue comes from approximately 8% of the desktop market and negligible mobile share. Browsers are big, big business.

WebKit Is No Charity #

Despite being largely open source, browsers and their engines are not loss leaders.

Safari, in particular, is wildly profitable. The New York Times reported in late 2020 that Google now pays Apple between $8-12 billion per year to remain Safari's default search engine, up from $1 billion in 2014. Other estimates put the current payments in the $15 billion range. What does this almighty torrent of cash buy Google? Searches, preferably of the commercial intent sort.

Mobile accounts for two-thirds of web traffic (or thereabouts), making outsized iOS adoption among wealthy users particularly salient to publishers and advertisers. Google's payments to Apple are largely driven by the iPhone rather than its niche desktop products where effective browser competition has reduced the influence of Apple's defaults.

Against considerable competition, Safari was used by 52% of visitors to <abbr>US</abbr> Government websites from macOS devices from <time datetime='2022-03-06'>March 6<sup>th</sup></time> to <time datetime='2022-04-04'>April 4<sup>th</sup>, 2022</time> — Against considerable competition, Safari was used by 52% of visitors to US Government websites from macOS devices from March 6^th to April 4^th, 2022

The influence of a dozen years of suppressed browser choice is evident on iOS, where Safari is used 90% of the time. Apple's policies caused Mozilla to delay producing an iOS browser for seven years, and its de minimus iOS share (versus 3.6% on macOS) is a predictable result.

iOS represents 75% of all visits to <abbr>US</abbr> Government websites from Apple <abbr>OS</abbr>es — iOS represents 75% of all visits to US Government websites from Apple OSes

Even with Apple's somewhat higher salaries per engineer, the skeleton staffing of WebKit, combined with the easier task of supporting fewer platforms, suggests that Apple is unlikely to spend considerably more than Mozilla does on browser development. In 2014, Apple would have enjoyed a profit margin of 50% if it had spent half a billion on browser engineering. Today, that margin would be 94-97%, depending on which figure you believe for Google's payments.

In absolute terms, that's more profit than Apple makes selling Macs.

Compare Cupertino's 3-6% search revenue reinvestment in the web with Mozilla's near 100% commitment, then recall that Mozilla has consistently delivered a superior engine to more platforms. I don't know what's more embarrassing: that some folks argue with a straight face that Apple is trying hard to build a good browser, or that it is consistently overmatched in performance, security, and compatibility by a plucky non-profit foundation that makes just ~5% of Apple's web revenue.

Choices, Choices #

Steve Jobs launched Safari for Windows in the same WWDC keynote that unveiled the iPhone.

Commenters often fixate on the iPhone's original web-based pitch, but don't give Apple stick for reducing engine diversity by abandoning Windows three versions later.

Today, Apple doesn't compete outside its home turf, and when it has agency, it prevents others from doing so. These are not the actions of a firm that is consciously attempting to promote engine diversity. If Apple is an ally in that cause, it is only by accident.

Theories that postulate a takeover by Chromium dismiss Apple's power over a situation it created and recommits to annually through its budgeting process.

This is not a question of resources. Recall that Apple spends $85 billion per year on stock buybacks^[3], $15 billion on dividends, enjoys free cash flow larger than the annual budgets of 47 nations, and retain tens of billions of dollars of cash on hand.^[4] And that's to say nothing of Apple's $100+ billion in non-business-related long-term investments.

Even if Safari were a loss leader, Apple would be able to avoid producing a slower, stifled, less secure, famously buggy engine without breaking the bank.

Apple needs fewer staff to deliver equivalent features because Safari supports fewer OSes. The necessary investments are also R&D expenses that receive heavy tax advantages. Apple enjoys enviable discounts to produce a credible browser, but refuses to do so.

Unlike Microsoft's late and underpowered efforts with IE 7-11, Safari enjoys tolerable web compatibility, more than 90% share on a popular OS, and an unheard-of war chest with which to finance a defence. The postulated apocalypse seems far away and entirely within Apple's power to forestall.

Recent Developments #

One way to understand the voluntary nature of Safari's poor competitiveness is to put Cupertino's recent burst of effort in context.

When regulators and legislators began asking questions in 2019, a response was required. Following Congress' query about default browser choice, Apple quietly allowed it through iOS 14 (however ham-fistedly) the following year. This underscores Apple's gatekeeper status and the tiny scale of investment required to enable large changes.

In the past six months, the Safari team has gone on a veritable hiring spree. This month's WWDC announcements showcased returns on that investment. By spending more in response to regulatory pressure, Apple has eviscerated notions that it could not have delivered a safer, more capable, and competitive browser many years earlier.

Safari's incremental headcount allocation has been large compared to the previous size of the Safari team, but in terms of Apple's P&L, it's loose change. Predictably, hiring talent to catch up has come at no appreciable loss to profitability.

The competitive potential of any browser hinges on headcount, and Apple is not limited in its ability to hire engineering talent. Recent efforts demonstrate that Apple has been able to build a better browser all along and, year after year, chose not to.

How Apple Gutted Mozilla's Chances #

For over a dozen years, setting any browser other than Safari as the iOS default was impossible. This spotted Safari a massive market share head-start. Meanwhile, restrictions on engine choice continue to hamstring competitors, removing arguments for why users should switch. But don't take my word for it; here's the recent "UK CMA Final Report on Mobile Ecosystems" summarising submissions by Mozilla and others (pages 154-155):

5.48 The WebKit restriction also means that browser vendors that want to use Blink or Gecko on other operating systems have to build their browser on two different browser engines. Several browser vendors submitted that needing to code their browser for both WebKit and the browser engine they use on Android results in higher costs and features being deployed more slowly.

5.49 Two browser vendors submitted that they do not offer a mobile browser for iOS due to the lack of differentiation and the extra costs, while Mozilla told us that the WebKit restriction delayed its entrance into iOS by around seven years

That's seven years of marketing, feature iteration, and brand loyalty that Mozilla sacrificed on the principle that if they could not bring their core differentiator, there was no point.

It would have been better if Mozilla had made a ruckus, rather than hoping the world would notice its stoic virtue, but thankfully the T-rex has roused from its slumber.

Given the hard times the Mozilla Foundation has found itself in, it seems worth trying to quantify the costs.

To start, Mozilla must fund a separate team to re-develop features atop a less-capable runtime. Every feature that interacts with web content must be rebuilt in an ad-hoc way using inferior tools. Everything from form autofill to password management to content blocking requires extra resources to build for iOS. Not only does this tax development of the iOS product, it makes coordinated feature launches more costly across all ports.

Most substantially, iOS policies against default browser choice — combined with "in-app-browser" and search entry point shenanigans — have delayed and devalued browser choice.

Until late 2020, users needed to explicitly tap the Firefox icon on the home screen to get back to their browser. Naïvely tapping links would, instead, load content in Safari. This split experience causes a sort of pervasive forgetfulness, making the web less useful.

Continuous partial amnesia about browser-managed information is bad for users, but it hurts browser makers too. On OSes with functional competition, convincing a user to download a new browser has a chance of converting nearly all of their browsing to that product. iOS (along with Android and Facebook's mobile apps) undermine this by constantly splitting browsing, ignoring the user's default. When users don't end up in their browser, searches occur through it less often, affecting revenue. Web developers also experience this as a reduction in visible share of browsing from competing products, reducing incentives to support alternative engines.

A foregetful web also hurts publishers. Ad bid rates are suppressed, and users struggle to access pay-walled content when browsing is split. The conspicuious lack of re-engagement features like Push Notifications are the rotten cherry on top, forcing sites to push users to the App Store where Apple doesn't randomly log users out, or deprive publishers of key features.

Users, browser makers, web developers, and web businesses all lose. The hat-trick of value destruction.

Back Of The Napkin #

The pantomime of browser choice on iOS has created an anaemic, amnesiac web. Tapping links is more slogging than surfing when autofill fails, passwords are lost, and login state is forgotten. Browsers become less valuable as the web stops being a reliable way to complete tasks.

Can we quantify these losses?

Estimating lost business from user frustration and ad rate depression is challenging. But we can extrapolate what a dozen years of choice might have meant for Mozilla from what we know about how Apple monetises the web.

For the purposes of argument, let's assume Mozilla would be paid for web traffic at the same rate as Apple; $8-15 billion per year for ~75% share of traffic from Apple OSes.

If the traffic numbers to US government websites are reasonable proxies for the iOS/macOS traffic mix (big "if"s), then equal share for Firefox on iOS to macOS would be worth $215-400 million per year.^[5] Put differently; there's reason to think that Mozilla would not have suffered layoffs if Apple were an ally of engine choice.

Apple's policies have made the web a less compelling ecosystem, its anti-competitive behaviours drive up competitor's costs, and it simultaneously starves them of revenue by undermining browser choice.

If Apple is a friend of engine diversity, who needs enemies?

The Best Kind Of Correct #

There is a narrow, fetid sense in which Apple's influence is nominally pro-diversity. Having anchored a significant fraction of web traffic at the trailing edge, businesses that do not decamp for the App Store may feel obliged to support WebKit.

This is a malignant form of diversity, not unlike other lagging engines through the years that harmed users and web-based businesses by externalizing costs. But on OSes with true browser choice, alternatives were meaningful.

Consider the loathed memory of IE 6, a browser that overstayed its welcome by nearly a decade. For as bad as it was, folks could recommend alternatives. Plugins also allowed us to transparently upgrade the platform.

Before the rise of open-source engines, the end of one browser lineage may have been a deep loss to ecosystem diversity, but in the past 15 years, the primary way new engines emerge has been through forking and remixing.

But the fact of an engine being different does not make that difference valuable, and WebKit's differences are incremental. Sure, Blink now has a faster layout engine, better security, more features, and fewer bugs, but like WebKit, it is also derived from KHTML. Both engines are forks and owe many present-day traits to their ancestors.

The history of browsers includes many forks and remixes. It's naïve to think that will end if iOS becomes hospitable to browser competition. After all, it has been competition that spurred engine improvements and forks.

Today's KHTML descendants are not the end of the story. Future forks are possible. New codebases can be built from parts. Indeed, there's already valuable cross-pollination in code between Gecko, WebKit, and Chromium. Unlike the '90s and early 2000s, diversity can arrive in valuable increments through forking and recombination.

What's necessary for leading edge diversity, however, is funding.

By simultaneously taking a massive pot of cash for browser-building off the table, returning the least it can to engine development, and preventing others from filling the gap, Apple has foundationally imperilled the web ecosystem by destroying the utility of a diverse population of browsers and engines.

Apple has agency. It is not a victim, and it is not defending engine diversity.

What Now? #

A better, brighter future for the web is possible, and thanks to belated movement by regulators, increasingly likely. The good folks over at Open Web Advocacy are leading the way, clearly explaining to anyone who will listen both what's at stake and what it will take to improve the situation.

Investigations are now underway worldwide, so if you think Apple shouldn't be afraid of a bit of competition if it will help the web thrive, consider getting involved. And if you're in the UK or do business there, consider helping the CMA help the web before July 22nd, 2022. The future isn't written yet, and we can change it for the better.

Many commenters come to debates about compatibility and standards compliance with a mistaken view of how standards are made. As a result, they perceive vendors with better standards conformance (rather than content compatibility) to occupy a sort of moral high ground. They do not. Instead, it usually represents a broken standards-setting process.

This can happen for several reasons. Sometimes standards bodies shutter, and the state of the art moves forward without them. This presents some risk for vendors that forge ahead without the cover of an SDO's protective IP umbrella, but that risk is often temporary and measured. SDOs aren't hard to come by; if new features are valuable, they can be standardised in a new venue. Alternatively, vendors can renovate the old one if others are interested in the work.

More often, working groups move at the speed of their most obstinate participants, uncomfortably prolonging technical debates already settled in the market and preventing definitive documentation of the winning design. In other cases, a vendor may play games with intellectual property claims to delay standardisation or lure competitors into a patent minefield (as Apple did with Touch Events).

At the leading edge, vendors need space to try new ideas without the need for the a priori consensus represented by a standard. However, compatibility concerns expressed by developers take on a different tinge over time.

When the specific API details and capabilities of ageing features do not converge, a continual tax is placed on folks trying to build sites using features from that set. When developers stress the need for compatibility, it is often in this respect.

Disingenuous actors sometimes try to misrepresent this interest and claim that all features must become standards before they are introduced in any engine. This interpretation runs against the long practice of internet standards development and almost always hides an ulterior motive.

The role of standards is to consolidate gains introduced at the leading edge through responsible competition. Vendors that fail to participate constructively in this process earn scorn. They bring ignominy upon their houses by failing to bring implementations in line with the rough (documented and tested) consensus or by playing the heel in SDOs to forestall progress they find inconvenient.

Vendors like Apple. ↩︎
In the financial reports of internet businesses, you will see the costs to acquire business through channels reported as "Traffic Acquisition Costs" or "TACM". Many startups report their revenue "excluding TAC" or "ex-TAC". These are all ways of saying, "we paid for lead generation", and search engines are no different. ↩︎
This is money Apple believes it cannot figure out a way to invest in its products. That's literally what share buybacks indicate. They're an admission that a company is not smart enough to invest the money in something productive. Buybacks are attractive to managers because they create artificial scarcity for shares to drive up realised employee compensation — their own included. Employees who are cheesed to realise that their projects are perennially short-staffed are encouraged not to make a stink through RSU appreciation. Everyone gets a cut, RSU-rich managers most of all. ↩︎
Different analysts use different ways of describing Apple's "cash on hand". Some analysts lump in all marketable securities, current and non-current, which consistently pushes the number north of $150 billion. Others report only the literal cash value on the books ($34 billion as of May 2020). All of this means that it can require more context to compare the numbers in Apple's consolidated financial statements (PDF) with public reporting on them.

The picture is also clouded by changes in the way Apple manages its cash horde. Over the past two years, Apple has begun to draw from this almighty pile of dollars and spend more to inflate its stock price through share buybacks and dividends. This may cast Apple as more cash-poor than it is. A better understanding of the actual situation is derived from free cash flow. Perhaps Apple will continue to draw down from its tall cash mountain to inflate its stock price via buybacks, but that's not a material change in the amount Apple can potentially spend on improving its products. ↩︎
Since this post first ran, several commenters have noted a point I considered while writing, but omitted in order to avoid heaping scorn on a victim; namely that Mozilla's management has been asleep at the switch regarding the business of its business.

Historically, when public records were available for both Opera and Mozilla, it was easy to understand how poorly Mozilla negotiated with search partners. Under successive leaders, Mozilla negotiated deals that led to payments less than as half as much per point of share. There's no reason to think MoCo's negotiating skills have improved dramatically in recent years. Apple, therefore, is likely to caputre much more revenue per search than an install of Firefox.

But even if Mozilla only made 1/3 of Apple's haul for equivalent use, the combined taxes of iOS feature re-development and loss of revenue would be material to the Mozilla Foundation's bottom line.

Obviously, to get that share, Mozilla would need to prioritise mobile, which it has not done. This is a deep own-goal and a point of continued sadness for me.

A noble house reduced to rubble is a tragedy no matter who demolishes the final wall. Management incompetence is in evidence, and Mozilla's Directors are clearly not fit for purpose.

But none of that detracts from what others have done to the Foundation and the web, and it would be wrong to claim Mozilla should have been perfect in ways its enemies and competitors were not. ↩︎

A Management Maturity Model for Performance

May 9, 2022

This is part five of the seven-part series "The Performance Inequality Gap"

Since 2015 I have been lucky to collaborate with more than a hundred teams building PWAs and consult on some of the world's largest sites. Engineers and managers on these teams universally want to deliver great experiences and have many questions about how to approach common challenges. Thankfully, much of what once needed hand-debugging by browser engineers has become automated and self-serve thanks to those collaborations.

Despite advances in browser tooling, automated evaluation, lab tools, guidance, and runtimes, teams I've worked with consistently struggle to deliver minimally acceptable performance with today's popular frameworks. This is not a technical problem per se — it's a management issue, and one that teams can conquer with the right frame of mind and support.

What is Performance?
Value Propositions
Protecting the Commons
Levels of Performance Management Maturity
Uneven Steps, Regression, & False Starts
The Role of Senior Management
- Questions for Senior Managers
- "o11y, But Make it Performance"

What is Performance? #

It may seem a silly question, but what is performance, exactly?

This is a complex topic, but to borrow from a recent post, web performance expands access to information and services by reducing latency and variance across interactions in a session, with a particular focus on the tail of the distribution (P75+). Performance isn't a binary and there are no silver bullets.

Only teams that master their systems can make intentional trade-offs. Organisations that serve their tools will tread water no matter how advanced their technology, while groups that understand and intentionally manage their systems can succeed on any stack.^[1]

Value Propositions #

The value of performance is deeply understood within a specific community and in teams that have achieved high maturity. But outside those contexts it can be challenging to communicate. One helpful lens is to view the difference between good and bad performance as a gap between expectations and reality.

For executives that value:

Revenue
Performance is a significant revenue contributor. To the extent that a product performs poorly, a fraction of revenue will be lost and the narrower the funnel becomes for all types of conversions.^[2]
Engagement
Poor performance has a well-documented relationship to reduced engagement. The space between good and bad performance is the opportunity for a competitor's product to fill the same opening. An under-appreciated aspect of this equation is variance. When a product often performs well but sometimes is slow to respond, users lose trust and use that service less. Consistent performance matters just as much as low average latency.
Design
Poor performance is a gap between the responsiveness of Figma mockups and brand reality. Products that perform well are closer to the approved design. Many brands build their reputations on fanatical attention to detail about products and the physical environments they're sold in. A laggy digital experience is out of alignment with these values.
Accessibility
Performance is the foundation of access. Users experiencing slow services may be unable to access them in the first place, mooting other work poured into improving a11y. Reduced network and device capacity correlate with other access challenges. a11y that isn't founded on consistently excellent performance can easily veer into performative, rather than effective, territory.

Performance is rarely the single determinant of product success, but it can be the margin of victory. Improving latency and reducing variance allows teams to test other product hypotheses with less noise. A senior product leader recently framed a big performance win as creating space that allows us to be fallible in other areas.

Protecting the Commons #

Like accessibility, security, UI coherence, privacy, and testability, performance is an aggregate result. Any single component of a system can regress latency or create variance, which means that like other cross-cutting product properties, performance must be managed as a commons. The approaches that work over time are horizontal, culturally-based, and require continual investment to sustain.

Teams I've consulted with are too often wrenched between celebration over launching "the big rewrite" and the morning-after realisation that the new stack is tanking business metrics.

Now saddled with the excesses of npm, webpack, React, and a thousand promises of "great performance" that were never critically evaluated, it's easy for managers to lose hope. These organisations sometimes spiral into recrimination and mistrust. Where hopes once flourished, the horrors of a Bundle Buddy readout looms. Who owns this code? Why is it there? How did it get away from the team so quickly?

Many "big rewrite" projects begin with the promise of better performance. Prototypes "seem fast", but nobody's actually benchmarking them on low-end hardware. Things go fine for a while, but when sibling teams are brought in to integrate late in the process, attention to the cumulative experience may suffer. Before anyone knows it, the whole thing is as slow as molasses, but "there's no going back"... and so the lemon launches with predictably sour results.

In the midst of these crises, thoughtful organisations begin to develop a performance management discipline. This, in turn, helps to create a culture grounded in high expectations. Healthy performance cultures bake the scientific method into their processes and approaches; they understand that modern systems are incredibly complex and that nobody knows everything — and so we learn together and investigate the unknown to develop an actionable understanding.

Products that maintain a healthy performance culture elevate management of latency, variance, and other performance attributes to OKRs because they understand how those factors affect the business.

Levels of Performance Management Maturity #

Performance management isn't widely understood to be part of what it means to operate a high-functioning team. This is a communcation challenge with upper management, but also a potential differentiator or even a strategic advantage. Teams that develop these advantages progress through a hierarchy of management practice phases. In drafting this post, I was pointed to similar work developed independently by others^[3]; that experienced consultants have observed similar trends helps give me confidence in this assessment:

Level 0: Bliss #

Hear no evil, see no evil, speak no evil. — Photo by von Vix

Level 0 teams do not know they have a problem. They may be passively collecting some data (e.g., through one of the dozens of analytics tools they've inevitably integrated over the years), but nobody looks at it. It isn't anyone's job description to do so.

Folks at this level of awareness might also simply assume that "it's the web, of course it's slow" and reach for native apps as a panacea (they aren't). The site "works" on their laptops and phones. What's the problem?

Management Attributes #

Managers in Level 0 teams are unaware that performance can be a serious product problem; they instead assume the technology they acquired on the back of big promises will be fine. This blindspot usually extends up to the C-suite. They do not have latency priorities and they uncritically accept assertions that a tool or architecture is "performant" or "blazing fast". They lack the technical depth to validate assertions, and move from one framework to another without enunciating which outcomes are good and which are unacceptable. Faith-based product management, if you will.

Level 0 PMs fail to build processes or cultivate trusted advisors to assess the performance impacts of decisions. These organisations often greenlight rewrites because we can hire easily for X, and we aren't on it yet. These are vapid narratives, but Level 0 managers don't have the situational awareness, experience, or confidence to push back appropriately.

These organisations may perform incidental data collection (from business analytics tools, e.g.) but are inconsistently reviewing performance metrics or considering them when formulating KPIs and OKRs.

Level 1: Fire Fighting #

Shit's on fire, yo. — Photo by Jay Heike

At Level 1, managers will have been made aware that the performance of the service is unacceptable.^[4]

Service quality has degraded so much that even fellow travelers in the tech privilege bubble^[4:1] have noticed. Folks with powerful laptops, new iPhones, and low-latency networks are noticing, which is a very bad sign. When an executive enquires about why something is slow, a response is required.

This is the start of a painful remediation journey that can lead to heightened performance management maturity. But first, the fire must be extinguished.

Level 1 managers will not have a strong theory about what's amiss, and an investigation will commence. This inevitably uncovers a wealth of potential metrics and data points to worry about; a few of those will be selected and tracked throughout the remediation process. But were those the right ones? Will tracking them from now on keep things from going bad? The first firefight instills gnawing uncertainty about what it even means to "be fast". On teams without good leadership or a bias towards scientific inquiry, it can be easy for Level 1 investigations to get preoccupied with one factor while ignoring others. This sort of anchoring effect can be overcome by pulling in external talent, but this is often counter-intuitive and sometimes even threatening to green teams.

Competent managers will begin to look for more general "industry standard" baseline metrics to report against their data. The industry's default metrics are moving to a better place, but Level 1 managers are unequipped to understand them deeply. Teams at Level 1 (and 2) may blindly chase metrics because they have neither a strong, shared model of their users, nor an understanding of their own systems that would allow them to focus more tightly on what matters to the eventual user experience. They aren't thinking about the marginal user yet, so even when they do make progress on directionally aligned metrics, nasty surprises can reoccur.

Low levels of performance management maturity are synonymous with low mastery of systems and an undeveloped understanding of user needs. This leaves teams unable to quickly track down culprits when good scores on select metrics fail to consistently deliver great experiences.

Management Attributes #

Level 1 teams are in transition, and managers of those teams are in the most fraught part of their journey. Some begin an unproductive blame game, accusing tech leads of incompetence, or worse. Wise PMs will perceive performance remediation work as akin to a service outage and apply the principles of observability culture, including "blameless postmortems".

It's never just one thing that's amiss on a site that prompts Level 1 awareness. Effective managers can use the collective learning process of remediation to improve a team's understanding of its systems. Discoveries will be made about the patterns and practices that lead to slowness. Sharing and celebrating these discoveries is a crucial positive attribute.

Strong Level 1 managers will begin to create dashboards and request reports about factors that have previously caused problems in the product. Level 1 teams tend not to staff or plan for continual attention to these details, and the systems often become untrustworthy.

Teams can get stuck at Level 1, treating each turn through a Development ➡️ Remediation ➡️ Celebration loop as "the last time". This is pernicious for several reasons. Upper management will celebrate the first doused fire but will begin to ask questions about the fourth and fifth blazes. Are their services just remarkably flammable? Is there something wrong with their team? Losing an organisation's confidence is a poor recipe for maximising personal or group potential.

Next, firefighting harms teams, and doubly so when management is unwilling to adopt incident response framing. Besides potential acrimony, each incident drains the team's ability to deliver solutions. Noticeably bad performance is an expression of an existing feature working below spec, and remediation is inherently in conflict with new feature development. Level 1 incidents are de facto roadmap delays.

Lastly, teams stuck in a Level 1 loop risk losing top talent. Many managers imagine this is fine because they're optimising for something else, e.g. the legibility of their stack to boot camp grads. A lack of respect for the ways that institutional knowledge accelerates development is all too common.

It's difficult for managers who do not perceive the opportunities that lie beyond firefighting to comprehend how much stress they're placing on teams through constant remediation. Fluctuating between Levels 1 and 0 ensures a team never achieves consistent velocity, and top performers hate failing to deliver.

The extent to which managers care about this — and other aspects of the commons, such as a11y and security — is a reasonable proxy for their leadership skills. Line managers can prevent regression back to Level 0 by bolstering learning and inquiry within their key personnel, including junior developers who show a flair for performance investigation.

Level 2: Global Baselines & Metrics #

Think globally, then reset. — The global baseline isn't what folks in the privilege bubble assume.

Thoughtful managers become uncomfortable as repeated Level 1 incidents cut into schedules, hurt morale, and create questions about system architecture. They sense their previous beliefs about what's "reasonable" need to be re-calibrated... but against what baseline?

It's challenging for teams climbing the maturity ladder to sift through the many available browser and tool-vendor data points to understand which ones to measure and manage. Selected metrics are what influence future investments, and identifying the right ones allows teams to avoid firefighting and prevent blindspots.

A diagram of the W3C Navigation Timing timline events — Browsers provide a *lot* of data about site performance. Harnessing it requires a deep understanding of the product and its users.

Teams looking to grow past Level 1 develop (or uncover they already had) Real User Monitoring ("RUM data") infrastructure in previous cycles. They will begin to report to management against these aggregates.

Against the need for quicker feedback and a fog of metrics, managers who achieve Level 2 maturity look for objective, industry-standard reference points that correlate with business success. Thankfully, the web performance community has been busy developing increasingly representative and trustworthy measurements. Still, Level 2 teams will not yet have learned to live with the dissatisfaction that lab measurements cannot always predict a system's field behavior. Part of mastery is accepting that the system is complex and must be investigated, rather than fully modeled. Teams at Level 2 are just beginning to learn this lesson.

Strong Level 2 managers acknowledge that they don't know what they don't know. They calibrate their progress against studies published by peers and respected firms doing work in this area. These data points reflect a global baseline that may (or may not) be appropriate for the product in question, but they're significantly better than nothing.

Management Attributes #

Managers who bring teams to Level 2 spread lessons from remediation incidents, create a sense of shared ownership over performance, and try to describe performance work in terms of business value. They work with their tech leads and business partners to adopt industry-standard metrics and set expectations based on them.

Level 2 teams buy or build services that help them turn incidental data collection into continual reporting against those standard metrics. These reports tend to focus on averages and may not be sliced to focus on specific segments (e.g., mobile vs. desktop) and geographic attributes. Level 2 (and 3) teams may begin drowning in data, with too many data points being collected and sliced. Without careful shepherding to uncover the most meaningful metrics to the business, this can engender boredom and frustration, leading to reduced focus on important RUM data sources.

Strong Level 2 managers will become unsatisfied with how global rules of thumb and metrics fail to map directly into their product's experience and may begin to look for better, more situated data that describe more of the user journeys they care about. The canniest Level 2 managers worry that their teams lack confidence that their work won't regress these metrics.

Teams that achieve Level 2 competence can regress to Level 1 under product pressure (removing space to watch and manage metrics), team turnover, or assertions that "the new architecture" is somehow "too different" to measure.

Level 3: P75+, Site-specific Baselines & Metrics #

Level 3 teams are starting to fly the plane instead of being passengers on an uncomfortable journey — Photo by Launde Morel

The unease of strong Level 2 management regarding metric appropriateness can lead to Level 3 awareness and exploration. At this stage, managers and TLs become convinced that the global numbers they're watching "aren't the full picture" — and they're right!

At Level 3, teams begin to document important user journeys within their products and track the influence of performance across the full conversion funnel. This leads to introducing metrics that aren't industry-standard, but are more sensitive and better represent business outcomes. The considerable cost to develop and validate this understanding seems like a drop in the bucket compared to flying blind, so Level 3 teams do it, in part, to eliminate the discomfort of being unable to confidently answer management questions.

Substantially enlightened managers who reach Level 3 will have become accustomed to percentile thinking. This often comes from their journey to understand the metrics they've adopted at Levels 1 and 2. The idea that the median isn't the most important number to track will cause a shift in the internal team dialogue. Questions like, "Was that the P50 number?" and "What does it look like at P75 and P90?" will become part of most metrics review meetings (which are now A Thing (™).

Percentiles and histograms become the only way to talk about RUM data in teams that reach Level 3. Most charts have three lines — P75, P90, and P95 — with the median, P50, thrown in as a vanity metric to help make things legible to other parts of the organisation that have yet to begin thinking in distributions.

Treating data as a distribution fundamentally enables comparison and experimentation because it creates a language for describing non-binary shifts. Moving traffic from one histogram bucket to another becomes a measure of success, and teams at Level 3 begin to understand their distributions are nonparametric, and they adopt more appropriate comparisons in response.

Management Attributes #

Level 3 managers and their teams are becoming scientists. For the first time, they will be able to communicate with confidence about the impact of performance work. They stop referring to "averages", understand that medians (P50) can tell a different story than the mean, and become hungry to explore the differences in system behavior at P50 and outlying parts of the distribution.

Significant effort is applied to the development and maintenance of custom metrics and tools. Products that do not report RUM data in more sliceable ways (e.g., by percentile, geography, device type, etc.) are discarded for those that better support an investigation.

Teams achieving this level of discipline about performance begin to eliminate variance from their lab data by running tests in "less noisy" environments than somewhere like a developer's laptop, a shared server, or a VM with underlying system variance. Low noise is important because these teams understand that as long as there's contamination in the environment, it is impossible to trust the results. Disaster is just around the corner when teams can't trust tests designed to keep the system from veering into a bad state.

Level 3 teams also begin to introduce a critical asset to their work: integration of RUM metrics reporting with their experimentation frameworks. This creates attribution for changes and allows teams to experiment with more confidence. Modern systems are incredibly complex, and integrating this experimentation into the team's workflow only intensifies as groups get ever-more sophisticated moving forward.

Teams can regress from Level 3 because the management structures that support consistent performance are nascent. Lingering questions about the quality of custom metrics can derail or stall progress, and some teams can get myopic regarding the value of RUM vs. lab data (advanced teams always collect both and try to cross-correlate, but this isn't yet clear to many folks who are new to Level 3). Viewing metrics with tunnel vision and an unwillingness to mark metrics to market are classic failure modes.

Level 4: Variance Control & Regression Prevention #

Level 4 teams are beginning to understand and manage the tolerances of their service. — Photo by Mastars

Strong Level 3 managers will realise that many performance events (both better and worse than average) occur along a user journey. This can be disorienting! Everything one thought they knew about how "it's going" is invalidated all over again. The P75 latency for interaction (in an evenly distributed population) isn't the continuous experience of a single user; it's every fourth tap!

Suddenly, the idea of managing averages looks naive. Medians have no explanatory power and don't even describe the average session! Driving down the median might help folks who experience slow interactions, but how can the team have any confidence about that without constant management of the tail latency?

This new understanding of the impact that variance has on user experiences is both revelatory and terrifying. The good news is that the tools that have been developed to this point can serve to improve even further.

Level 4 teams also begin to focus on how small, individually innocuous changes add up to a slow bleed that can degrade the experience over time. Teams that have achieved this sort of understanding are mature enough to forecast a treadmill of remediation in their future and recognise it as a failure mode. And failure modes are avoidable with management processes and tools, rather than heroism or blinding moments of insight.

Management Attributes #

Teams that achieve Level 4 maturity almost universally build performance ship gates. These are automated tests that watch the performance of PRs through a commit queue, and block changes that tank the performance of important user flows. This depends on the team having developed metrics that are known to correlate well with user and business success.

This implies all of the maturity of the previous levels because it requires a situated understanding of which user flows and scenarios are worth automating. These tests are expensive to run, so they must be chosen well. This also requires an investment in infrastructure and continuous monitoring. Making performance more observable, and creating a management infrastructure that avoids reactive remediation is the hallmark of a manager who has matured to Level 4.

Many teams on the journey from Level 3 to 4 will have built simpler versions of these sorts of gates (bundle size checks, e.g.). These systems may allow for small continuous increases in costs. Over time, though, these unsophisticated gates become a bad proxy for performance. Managers at Level 4 learn from these experiences and build or buy systems to watch trends over time. This monitoring ought to include data from both the lab and the field to guard against "metric drift". These more sophisticated monitoring systems also need to be taught to alert on cumulative, month-over-month and quarter-over-quarter changes.

Level 4 maturity teams also deputise tech leads and release managers to flag regressions along these lines, and reward them for raising slow-bleed regressions before they become crises. This responsibility shift, backed up by long-run investments and tools, is one of the first stable, team-level changes that can work against cultural regression. For the first time, the team is considering performance on longer time scales. This also begins to create organisational demand for latency budgeting and slowness to be attributed to product contributions.

Teams that achieve Level 4 maturity are cautious acquirers of technology. They manage on an intentional, self-actualised level and value an ability to see through the fog of tech fads. They do bake-offs and test systems before committing to them. They ask hard questions about how any proposed "silver bullets" will solve the problems that they have. They are charting a course based on better information because they are cognizant that it is both valuable and potentially available.

Level 4 teams begin to explicitly staff a "performance team", or a group of experts whose job it is to run investigations and drive infrastructure to better inform inquiry. This often happens out of an ad-hoc virtual team that forms in earlier stages but is now formalised and has long-term staffing.

Teams can quickly regress from Level 4 maturity through turnover. Losing product leaders that build to Level 4 maturity can set groups back multiple maturity levels in short order, and losing engineering leaders who have learned to value these properties can do the same. Teams are also capable of losing this level of discipline and maturity by hiring or promoting the wrong people. Level 4 maturity is cultural and cultures need to be defended and reinforced to maintain even the status quo.

Level 5: Strategic Performance #

Level 5 teams have understood the complexity of their environment and can make tradeoffs with confidence. — Photo by Colton Sturgeon

Teams that fully institutionalise performance management come to understand it as a strategic asset.

These teams build management structures and technical foundations that grow their performance lead and prevent cultural regressions. This includes internal training, external advocacy and writing^[5], and the staffing of research work to explore the frontier of improved performance opportunities.

Strategic performance is a way of working that fully embeds the idea that "faster is better", but only when it serves user needs. Level 5 maturity managers and teams will gravitate to better-performing options that may require more work to operate. They have learned that fast is not free, but it has cumulative value.

These teams also internally evangelise the cause of performance. Sibling teams may not be at the same place, so they educate about the need to treat performance as a commons. Everyone benefits when the commons is healthy, and all areas of the organisation suffer when it regresses.

Level 5 teams institute "latency budgets" for fractional feature rollouts. They have structures (such as managers or engineering leadership councils) that can approve requests for non-latency-neutral changes that may have positive business value. When business leaders demand the ability to ram slow features into the product, these leaders are empowered to say no.

Lastly, Level 5 teams are focused on the complete user journey. Teams in this space can make trades intelligently, moving around code and time within a system they have mastered to ensure the best possible outcomes in essential flows.

Management Attributes #

Level 3+ team behaviours are increasingly illegible to less-advanced engineers and organisations. At Level 5, serious training and guardrails are required to integrate new talent. Most hires will not yet share the cultural norms that a strategically performant organisation uses to deliver experiences with consistent quality.^[6]

Strategy is what you do differently from the competition, and Level 5 teams understand their way of working is a larger advantage than any single optimisation. They routinely benchmark against their competition on important flows and can understand when a competitor has taken the initiative to catch up (it rarely happens through a single commit or launch). These teams can respond at a time of their choosing because their lead will have compounded. They are fully out of firefighting mode.

Level 5 teams do not emerge without business support. They earn space to adopt these approaches because the product has been successful (thanks in part to work at previous levels). Level 5 culture can only be defended from a position of strength. Managers in this space are operating for the long term, and performance is understood to be foundational to every new feature or improvement.

Teams at Level 5 degrade more slowly than at previous levels, but it does happen. Sometimes, Level 5 teams are poor communicators about their value and their values, and when sibling teams are rebuffed, political pressure can grow to undermine leaders. More commonly, enough key people leave a Level 5 team for reasons unrelated to performance management, like when the hard-won institutional understanding of what it takes to excel is lost. Sometimes, simply failing to reward continual improvement can drive folks out. Level 5 managers need to be on guard regarding their culture and their value to the organisation as much as the system's health.

Uneven Steps, Regression, & False Starts #

It's possible for strong managers and tech leads to institute Level 1 discipline by fiat. Level 2 is perhaps possible on a top-down basis in a small or experienced team. Beyond that, though, maturity is a growth process. Progression beyond global baseline metrics requires local product and market understanding. TLs and PMs need to become curious about what is and isn't instrumented, begin collecting data, then start the directed investigations necessary to uncover what the system is really doing in the wild. From there, tools and processes need to be built to recreate those tough cases on the lab bench in a repeatable way, and care must be taken to continually re-validate those key user journeys against the evolving product reality.

Advanced performance managers build groups that operate on mutual trust to explore the unknown and then explain it out to the rest of the organisation. This means that advancement through performance maturity isn't about tools.

Managers who get to Level 4 are rare, but the number who imagine they are could fill stadiums because they adopted the technologies that high-functioning leaders encourage. But without the trust, funding to enquire and explore, and an increasingly fleshed-out understanding of users at the margins, adopting a new monitoring tool is a hollow expenditure. Nothing is more depressing than managerial cosplay.

It's also common for teams to take several steps forward under duress and regress when heroics stop working, key talent burns out, and the managerial focus moves on. These aren't fatal moments, but managers need to be on the lookout to understand if they support continual improvement. Without a plan for an upward trajectory, product owners are putting teams on a loop of remediation and inevitable burnout... and that will lead to regression.

The Role of Senior Management #

Line engineers want to do a good job. Nobody goes to work to tank the product, lose revenue, or create problems for others down the line. And engineers are trained to value performance and quality. The engineering mindset is de facto optimising. What separates Level 0 firefighting teams from those that have achieved self-actualised Level 5 execution is not engineering will; it's context, space, and support.

Senior management sending mixed signals about the value of performance is the fastest way to degrade a team's ability to execute. The second-fastest is to use blame and recrimination. Slowness has causes, but the solution isn't to remove the folks that made mistakes, but rather to build structures that support iteration so they can learn. Impatience and blame are not assets or substitutes for support to put performance consistently on par with other concerns.

Teams that reach top-level performance have management support at the highest level. Those managers assume engineers want to do a good job but have the wrong incentives and constraints, and it isn't the line engineer's job to define success — it's the job of management.

Questions for Senior Managers #

Senior managers looking to help their teams climb the performance management maturity hill can begin by asking themselves a few questions:

Do we understand how better performance would improve our business?
- Is there a shared understanding in the leadership team that slowness costs money/conversions/engagement/customer-success?
- Has that relationship been documented in our vertical or service?
- Do we know what "strategic performance" can do for the business?
What constraints have we given the team?
- Do they have a device-class or network condition target?
- Can engineers rely on those targets to negotiate with other parts of the organisation (e.g., sales, marketing, etc.)?
Have we developed a management fluency wth histograms and distributions over time?
- Do we write OKRs for performance?
- Are they phrased in terms of marginal device and network targets, as well as distributions?
What support do we give teams that want to improve performance?
- Do folks believe they can appeal directly to you if they feel the system's performance will be compromised by other decisions?
- Can folks (including PMs, designers, and SREs — not just engineers) get promoted for making the site faster?
- Can middle managers appeal to performance as a way to push back on feature requests?
- Are there systems in place for attributing slowness to changes over time?
- Can teams win kudos for consistent, incremental performance improvement?
- Can a feature be blocked because it might regress performance?
- Can teams easily acquire or build tools to track performance?
What support do we give mid-level managers who push back on shiny tech in favour of better performance?
- Have we institutionalised critial questions for adopting new technologies?
- Are aspects of the product commons (e.g., uptime, security, privacy, a11y, performance) managed in a coherent way?
- Do managers get as much headcount and funding to make steady progress as they would from proposing rewrites?
Have we planned to staff a performance infrastructure team?
- It's the job of every team to monitor and respond to performance challenges, but will there be a group that can help wrangle the data to enable everyone to do that?
- Can any group in the organisation serve as a resource for other teams that are trying to get started in their latency and variance learning journeys?

The answers to these questions help organisations calibrate how much space they have created to scientifically interrogate their systems. Computers are complex, and as every enterprise becomes a "tech company", becoming intentional about these aspects is as critical as building DevOps and Observability to avoid downtime.

It's always cheaper in the long run to build understanding than it is to fight fires, and successful management can create space to unlock their team's capacity.

"o11y, But Make it Performance" #

Mature technology organisations may already have and value a discipline to manage performance: "Site Reliability Engineering" (SRE), aka "DevOps", aka "Observability". These folks manage and operate complex systems and work to reduce failures, which looks a lot like the problems of early performance maturity teams.

These domains are linked: performance is just another aspect of system mastery, and the tools one builds to manage approaches like experimental, flagged rollouts need performance to be accounted for as a significant aspect of the success of a production spike.

Senior managers who want to build performance capacity can push on this analogy. Performance is like every other cross-cutting concern; important, otherwise un-owned, and a chance to differentiate. Managers have a critical role to forge solidarity between engineers, SREs, and other product functions to get the best out of their systems and teams.

Everyone wants to do a great job; it's the manager's role to define what that means.

It takes a village to keep my writing out of the ditch, so my deepest thanks go to Annie Sullivan, Jamund Ferguson, Andy Tuba, Barry Pollard, Bruce Lawson, Tanner Hodges, Joe Liccini, Amiya Gupta, Dan Shappir, Cheney Tsai, and Tim Kadlec for their invaluable comments and corrections on drafts of this post.

High-functioning teams can succeed with any stack, but they will choose not to. Good craftsmen don't blame their tools, nor do they carry deficient implementations.

Per Kellan Elliot-McCrea's classic "Questions for new technology", this means that high-functioning teams will not be on the shiniest stack. Teams choices that are highly correlated with hyped solutions are a warning sign, not an asset. And while "outdated" systems are unattractive, they also don't say much at all about the quality of the product or the team.

Reading this wrong is a sure tell of immature engineers and managers, whatever their title. ↩︎
An early confounding factor for teams trying to remediate performance issues is that user intent matters a great deal, and thus the value of performance will differ based on context. Users who have invested a lot of context with a service will be less likely to bounce based on bad performance than those who are "just browsing". For example, a user that has gotten to the end of a checkout flow or are using a government-mandated system may feel they have no choice. This isn't a brand or service success case (failing to create access is always a failure), but when teams experience different amounts of elasticity in demand vs. performance, it's always worth trying to understand the user's context and intent.

Users that "succeed" but have a bad time aren't assets for a brand or service, they're likely to be ambasassadors for any other way to accomplish their tasks. That's not great, long-term, for a team or for their users. ↩︎
Some prior art was brought to my attention by people who reviewed earlier drafts of this post; notably this 2021 post by the Splunk team and the following tweet by the NCC Group from 2016 (as well as a related PowerPoint presentation):

NCC Group Web Perf @NCCGroupWebperf

Where are you on the #webperf maturity model? ow.ly/miAi3020A9G #perfmatters
3 03:04 AM · Jul 7, 2016

It's comforting that we have all independently formulated roughly similar framing. People in the performance community are continually learning from each other, and if you don't take my formulation, I hope you'll consider theirs. ↩︎
Something particularly problematic about modern web development is the way it has reduced solidarity between developers, managers, and users. These folks now fundamentally experience the same sites differently, thanks to the shocking over-application of client-side JavaScript to every conceivable problem.

This creates structural illegibility of budding performance crises in new, uncomfortably exciting ways.

In the desktop era, developers and upper management would experience sites through a relatively small range of screen sizes and connection conditions. JavaScript was applied in the breach when HTML and CSS couldn't meet a need.^[7] Techniques like Progressive Enhancement ensured that the contribution of CPU performance to the distribution of experiences was relatively small. When content is predominantly HTML, CSS, and images, browsers are able to accelerate processing across many cores and benefit from the ability to incrementally present the results.

By contrast, JavaScript-delivered UI strips the browser of its ability to meaningfully reorder and slice up work so that it prioritises responsiveness and smooth animations. JavaScript is the fuck it, we'll do it live way to construct UI, and stresses the relative performance of a single core more than competing approaches. Because JavaScript is, byte for byte, the most expensive thing you can ask a browser to process, this stacks the difficulty involved in doing a good job on performance. JavaScript-driven UI is inherently working with a smaller margin for error, and that means today's de facto approach of using JavaScript for roughly everything leaves teams with much less headroom.

Add this change in default architecture to the widening gap between the high end (where all developers and managers live) and the median user. It's easy to understand how perfectly mistimed the JavaScript community's ascendence has been. Not since the promise of client-side Java has the hype cycle around technology adoption been more out of step with average usability.

Why has it gone this badly?

In part because of the privilege bubble. When content mainly was markup, performance problems were experienced more evenly. The speed of a client device isn't the limiting site speed factor in an HTML-first world. When database speed or server capacity is the biggest variable, issues affect managers and executives at the same rate they impact end users.

When the speed of a device dominates, wealth correlates heavily with performance. This is why server issues reliably get fixed, but JavaScript bloat has continued unabated for a decade. Rich users haven't borne the brunt of these architectural shifts, allowing bad choices to fly under the radar much longer which, in turn, increase the likelihood of expensive remediation incidents.

Ambush by JavaScript is a bad time, and when managers and execs only live in the privilege bubble, it's users and teams who suffer most. ↩︎ ↩︎
Managers may fear that by telling everyone about how strategic and important performance has become to them, that their competitiors will wise up and begin to out-execute on the same dimension.

This almost never happens, and the risks are low. Why? Because, as this post exhaustively details, the problems that prevent the competition from achieving high-functioning performance are not strictly technical. They cannot — and more importantly, will not — adopt tools and techniques you evangelise because it is highly unlikely that they are at a maturity level that would allow them to benefit. In many cases, adding another tool to the list for a Level 1-3 team to consider can even slow down and confound them.

Strategic performance is hard to beat because it is hard to construct at a social level. ↩︎
Some hires or transfers into Level 5 teams will not easily take to shared performance values and training.

Managers should anticipate pushback from these quarters and learn to re-assert the shared cultural norms that are critical to success.

There's precious little space in a Level 5 team for résumé-oriented development because a focus on the user has evacuated the intellectual room that hot air once filled. Thankfully, this can mostly be avoided through education, support, and clear promotion criteria that align to the organisation's evolved way of working.

Nearly everyone can be taught, and great managers will be on the lookout to find folks who need more support. ↩︎
Your narrator built JavaScript frameworks in the desktop era; it was a lonely time compared to the clogged market for JavaScript tooling today. The complexity of what we were developing for was higher than nearly every app I see today; think GIS systems, full PIM (e.g., email, calendar, contacts, etc.) apps, complex rich text editing, business apps dealing with hundreds of megabytes worth of normalised data in infinite grids, and BI visualisations.

When the current crop of JavaScript bros tells you they need increased complexity because business expectations are higher now, know that they are absolutely full of it. The mark has barely moved in most experiences. The complexity of apps is not much different, but the assumed complexity of solutions is. That experiences haven't improved for most users is a shocking indictment of the prevailing culture. ↩︎

Older Posts

Newer Posts