On Airbnb's Q1 2026 earnings call, two figures travelled fast: AI now writes sixty percent of the company's new code, and its support bot resolves forty percent of customer issues without human escalation — up from roughly thirty-three percent earlier in the year.1 Brian Chesky framed the shift as leverage rather than replacement: "where you might have needed a team of twenty engineers before, an engineer can now spin up agents." Revenue rose eighteen percent to $2.7 billion, net income 3.9 percent to $160 million, nights booked nine percent to 156.2 million. The financials are strong. The AI claims are louder.
Both numbers deserve to be taken seriously. Neither deserves to be taken at face value.
The denominator problem
Airbnb has not disclosed the unit. Lines? Commits? Files? Tokens? Pull requests merged? The distinction is not pedantic. AI tools are demonstrably good at boilerplate, scaffolding, test stubs, and auto-generated documentation: categories that inflate any line-count metric without touching the architectural decisions that determine whether a system is maintainable, secure, or correct. A figure that bundles autocompleted import statements with production business logic produces a headline that is investor-friendly and analytically empty.
AI now writes 60% of which code, measured how, against which baseline?
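To make the ambiguity concrete, here is a minimal Python sketch. The sample changes, category labels, and function names are invented for illustration and are not Airbnb data; the sketch also assumes authorship can be cleanly attributed per change, which is itself a generous assumption. It shows one set of merged changes yielding three different "AI share" figures depending on the denominator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """One merged change. Every value below is invented for illustration."""
    category: str      # hypothetical labels: "boilerplate", "tests", "business_logic"
    lines: int         # lines added by the change
    ai_authored: bool  # assumes authorship can be attributed per change at all


def ai_share_by_lines(changes: list[Change], category: Optional[str] = None) -> float:
    """Fraction of added lines that came from AI-authored changes."""
    pool = [c for c in changes if category is None or c.category == category]
    total = sum(c.lines for c in pool)
    return sum(c.lines for c in pool if c.ai_authored) / total


def ai_share_by_changes(changes: list[Change]) -> float:
    """Fraction of merged changes that were AI-authored, ignoring their size."""
    return sum(c.ai_authored for c in changes) / len(changes)


# Hypothetical sample: two large AI-generated changes, three smaller human ones.
SAMPLE = [
    Change("boilerplate", 400, True),
    Change("tests", 200, True),
    Change("business_logic", 150, False),
    Change("business_logic", 150, False),
    Change("business_logic", 100, False),
]

print(f"by lines:          {ai_share_by_lines(SAMPLE):.0%}")
print(f"by merged changes: {ai_share_by_changes(SAMPLE):.0%}")
print(f"business logic:    {ai_share_by_lines(SAMPLE, 'business_logic'):.0%}")
```

On this made-up sample the same codebase is sixty percent AI-written by lines, forty percent by merged changes, and zero percent where business logic is concerned. All three figures are arithmetically correct; only the first would make a headline.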
The same ambiguity sits underneath comparable claims from Google, Microsoft, and Spotify cited in the surrounding press coverage.2 Until one of these companies defines the denominator, "majority of code written by AI" is a marketing-grade statement, not an engineering measurement. The earnings call is also, by construction, an investor-relations venue. That does not make the figures false. It does mean independent verification — by auditors, by journalists with internal data access, by post-incident security reviews — has not happened.
There is also a separate body of public evidence pointing the other way. METR, a respected evaluation organisation, ran a randomised controlled trial in mid-2025 with sixteen experienced open-source developers across 246 real tasks on codebases they had each contributed to for years. The developers predicted on average that AI would make them twenty-four percent faster. After the experiment, they still believed they had been twenty percent faster. The measured result was that tasks took nineteen percent longer when AI tools were allowed.3 DORA's 2024 State of DevOps Report, the longest-running quantitative study of software delivery, estimated that for every twenty-five percent increase in AI adoption, delivery throughput fell by 1.5 percent and delivery stability by 7.2 percent.4
Both can be true at once: AI can dramatically accelerate the production of certain categories of code while not changing — and possibly worsening — measured engineering throughput. The Airbnb number is a capability claim. The METR and DORA numbers are outcome claims. They are not contradictory; they are about different things, and the difference is what the earnings-call framing elides.
The forty-percent figure has the same problem in a different costume
Resolution-without-escalation is a deflection metric, not a satisfaction metric. A bot that closes tickets the customer abandons in frustration produces the same number as a bot that genuinely solves problems. The history of interactive voice response systems in the 2000s is instructive: deflection rates climbed steadily while net promoter scores collapsed, and the customers most damaged by automation were precisely the ones whose cases the system could not handle.
Airbnb has not published CSAT data, repeat-contact rates, or complaint volumes alongside the forty-percent figure. Until it does, the climb from roughly thirty-three percent to forty percent within months could describe accelerating capability or accelerating ticket-closure aggression. The metric cannot tell the two apart.
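A small Python sketch illustrates why. The ticket fields, the two bots, and the use of repeat contact as a proxy for an unsolved problem are assumptions made for illustration; nothing here reflects Airbnb's actual data or tooling.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    escalated: bool    # handed to a human agent
    recontacted: bool  # customer came back about the same issue (hypothetical field)


def deflection_rate(tickets):
    """Share of tickets closed without human escalation (the headline-style metric)."""
    return sum(not t.escalated for t in tickets) / len(tickets)


def resolved_without_recontact_rate(tickets):
    """Share of tickets the bot closed where the customer did not return."""
    return sum(not t.escalated and not t.recontacted for t in tickets) / len(tickets)


def make_tickets(total, bot_closed, came_back):
    """Build `total` tickets: `bot_closed` not escalated, of which `came_back` recontacted."""
    tickets = [Ticket(escalated=False, recontacted=i < came_back) for i in range(bot_closed)]
    tickets += [Ticket(escalated=True, recontacted=False) for _ in range(total - bot_closed)]
    return tickets


# Two invented bots with identical headline numbers.
helpful_bot = make_tickets(total=10, bot_closed=4, came_back=0)
aggressive_bot = make_tickets(total=10, bot_closed=4, came_back=3)

for name, tickets in [("helpful", helpful_bot), ("aggressive", aggressive_bot)]:
    print(name, deflection_rate(tickets), resolved_without_recontact_rate(tickets))
```

Both invented bots report a deflection rate of 0.4, yet once repeat contacts are subtracted the second has settled only one ticket in ten. Repeat contact is itself only a partial proxy: a customer who gives up silently still counts as resolved here, which is why satisfaction and complaint data matter alongside it.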
Chesky's own caveat is the more interesting story
Buried in the same call, the CEO said plainly that "no one has figured out AI for travel or e-commerce yet." He named four structural problems with chatbot interfaces in his category: too much text in a photo-forward product, no direct manipulation, poor comparison across many options, and single-player design in an inherently multi-player booking context.1 This is unusually candid for an earnings call. It also implies that the forty-percent support figure may represent a ceiling imposed by interface design rather than a waypoint on the way to full automation.
The company is deploying AI at majority-threshold scale in a domain its own CEO believes the technology cannot yet serve well. That tension is the spine of the story, not a footnote to it.
Why this matters from Romandy
Two implications land directly on Swiss desks.
The first is regulatory. The revised Federal Act on Data Protection (revFADP), in force since September 2023, contains Article 21 provisions on automated individual decision-making that require notification, the right to express a viewpoint, and, under certain conditions, the right to human review.5 A support bot resolving forty percent of cases without human review is the kind of deployment that can trigger these obligations wherever a decision taken solely by automated processing has legal consequences for the data subject or affects them significantly. For EU-facing Swiss firms, the EU AI Act's transparency obligations for AI systems interacting with natural persons compound the picture.6 The enforcement scope for this class of deployment is not yet fully settled, but Swiss financial services, insurance, and healthcare teams running similar configurations should already be asking what their disclosure posture looks like.
The second is economic. If the productivity multiplier Chesky describes is even directionally accurate, Swiss enterprise software employers face a structural choice between adopting equivalent tooling and competing on talent at a disadvantage. The labour question is not whether headcount falls at Airbnb. It is whether the per-engineer output ratio at firms that adopt these tools widens fast enough to reset hiring expectations across the sector. Anthropic's 2026 Agentic Coding Report found that even at AI-native companies, full delegation of coding tasks sits between zero and twenty percent — meaning the gap between assisted and delegated is where the entire operational question now lives.7
The shape of the honest story
The honest version of today's announcement is not that AI has crossed a majority threshold in knowledge work. It is that the threshold itself is being redefined, in venues optimised for investor reception, before the definitions are settled. No Airbnb executive told investors in 2018 what percentage of the company's code was human-written, because the question would have been meaningless. The fact that the inverse claim is now considered a strategic signal is itself the data point, and the data point is about how companies are competing for the attention of capital markets, not about how software is being built.
The piece that ages well will be the one that names that instability rather than reporting around it. For CTOs in Romandy reading the headlines, the operational question is not "are we sixty percent yet?" It is closer to: which categories of work in our codebase are mechanical enough to delegate, which carry regulatory or domain risk that demands human authorship, and what evidence (measurable, repeatable, not earnings-call-grade) would we need to know whether our deployment is delivering or eroding throughput?
The answer to that question is what separates the firms that will quietly compound an advantage over the next two years from the firms that will issue a press release in 2027 with a different number and a similar absence of a denominator.
