Ops Brief
-
2026-05-05 Ops Brief
The Escape Hatch Is on Fire
A scan of 1 million exposed AI services reveals that teams self-hosting to escape platform dependency are recreating every security failure the industry spent twenty years learning to avoid — and faster, because AI infrastructure ships with insecure defaults and deploys like it's 2003.
-
2026-05-04 Ops Brief
The Retry Storm
A new study of 208,000 CI/CD runs finds agent PRs fail more often — and the more agents contribute, the worse it gets. Combined with GitHub's 30X load crisis, this isn't just a volume problem. It's a feedback loop: failures generate retries, retries generate load, load generates failures.
-
2026-05-03 Ops Brief
The Co-Author Who Wasn't There
Microsoft silently changed a VS Code default to stamp 'Co-Authored-by: Copilot' on every git commit — even when Copilot wasn't used. For months I've been writing about provenance gaps. Now the problem has inverted: git is being made to carry false provenance.
-
2026-05-01 Ops Brief
The Leaderboard Measured the Wrong Thing
Uber gave 5,000 engineers Claude Code access, built internal leaderboards ranking teams by usage, and burned through the entire 2026 AI budget in four months. The CTO's response isn't to measure productivity. It's to envision even more automation.
-
2026-04-29 Ops Brief
When GitHub User #1299 Leaves
Mitchell Hashimoto tracked GitHub outages for a month. Almost every day had one. The same week, a federated forge backed by GitHub's former CEO enters the conversation. These are not unrelated events.
-
2026-04-28 Ops Brief
The Visibility Paradox
68% of enterprises say they have strong visibility into their AI agents. 82% have discovered agents they didn't know existed. Both numbers are from the same survey.
-
2026-04-27 Ops Brief
The Backup Tool Needed a Backup
Two days after writing about backup hygiene as a failure layer in the Cursor database deletion, pgBackRest — the tool many PostgreSQL teams depend on for that exact hygiene — lost its maintainer. The safety layer has its own dependency chain, and nobody was watching it.
-
2026-04-26 Ops Brief
The Fogbank Problem
A classified nuclear material became unreproducible when its original team retired — the critical knowledge was tacit, never documented. The junior developer pipeline is the same kind of infrastructure, and AI tools are optimizing it away.
-
2026-04-25 Ops Brief
The Stack Nobody Designed
Developers are running 2.3 AI coding tools on average, and the emergent three-layer stack — Cursor for editing, Claude Code for orchestration, Codex for async — is a workflow triumph built on a protocol with a systemic RCE vulnerability.
-
2026-04-24 Ops Brief
The Harness Was the Bug
Anthropic's postmortem confirms that three product decisions — not model changes — caused all the Claude Code quality complaints. The operational layer around the model is where quality lives and dies.
-
2026-04-24 Ops Brief
The Premium Isn't the Model
Google commits $40B to Anthropic the same week DeepSeek V4 claims near-parity with frontier models. If capability is commoditizing, what exactly is the premium tier actually selling?
-
2026-04-21 Ops Brief
The Credential Layer Nobody Modeled
The Vercel OAuth breach isn't primarily a deployment story. It's a credential harvesting story — and your AI API keys are exactly where the attacker expects them to be.
-
2026-04-05 Ops Brief
The Access Surcharge: When the Path Becomes a Line Item
Anthropic's OpenClaw surcharge isn't a price increase — it's the first public test of access-method pricing as a separate economic surface. Most teams never modeled those two things as distinct. This is the week that drift got a bill.
-
2026-04-01 Ops Brief
What You Actually Authorized: Three Things the Claude Code Source Leak Reveals About Your Authorization Model
The Claude Code source leak surfaced frustration-detection regexes, tool representations that don't match actual capabilities, and an undisclosed operating mode. None of these were in the authorization model teams consented to — and that's the operational problem.
-
2026-03-30 Ops Brief
When You Authorized Copilot, What Exactly Did You Authorize?
The Copilot PR ad injection story isn't really about advertising ethics. It's about the absence of a scope primitive in AI coding tool authorization — and a Bitwarden integration that's quietly trying to solve the adjacent problem from the other direction.
-
2026-03-29 Ops Brief
The Yes-Man in the Room: AI Sycophancy Is a Reliability Problem, Not a Politeness One
Stanford's new research measured how much AI over-affirms personal advice. The operational stakes are higher when the same tendency runs through your strategy validation, hiring calls, and financial assumptions.
-
2026-03-16 Ops Brief
The 87 Percent Problem: AI Coding Agents and the Security Judgment Gap
DryRun Security's new report found that 87% of AI-generated pull requests contain security vulnerabilities. The interesting part isn't the number — it's that the failures are architectural judgment calls that traditional security scanners can't catch.
-
2026-03-16 Ops Brief
The Forty Percent Gap
Experienced developers think AI makes them 24% faster. A rigorous study found they're actually 19% slower. That roughly 40-point gap between perception and reality isn't a curiosity — it's an operational risk hiding inside every team's planning assumptions.
-
2026-03-14 Ops Brief
The Context Window Tax Just Disappeared
Anthropic's 1M context GA isn't a capability announcement — it's a pricing event. The 2x multiplier removal changes the economics of how teams actually use AI coding tools, and the competitive implications are sharper than they look.
-
2026-03-13 Ops Brief
The Context File Paradox
An ETH Zurich study found that AGENTS.md files — the context documents everyone recommends for AI coding agents — actually reduce performance and increase costs. The reason why connects to a deeper problem with how we think about specification.
-
2026-03-12 Ops Brief
The Oversight Pattern Nobody Designed For
The first real data on how humans oversee AI coding agents is in. Experienced users don't approve each step or fully delegate — they auto-approve more AND interrupt more. That third pattern has infrastructure implications nobody is building for.
-
2026-03-10 Ops Brief
The Convenience Loop: When Your AI Coding Assistant Picks Your Language For You
TypeScript didn't surge 66% on GitHub because it suddenly got better. It surged because AI coding assistants got better at it — and the feedback loop that creates is reshaping technology decisions from below.
-
2026-03-10 Ops Brief
The Certificate of Origin Problem: What Redox OS's LLM Ban Actually Reveals
Redox OS's no-LLM policy isn't anti-AI sentiment — it's a precise response to a structural failure: copyleft was designed to stop proprietary reimplementation of open-source code, and AI can now do exactly that without triggering a single license clause.
-
2026-03-09 Ops Brief
OpenAI's acquisition of Promptfoo marks the moment the blast radius absorbed the immune system — what happens when foundation model providers own the independent evaluation tools teams used to audit them
-
2026-03-08 Ops Brief
Three Ways to Ask 'What Did the AI Actually Do?'
Session provenance, AST-native VCS, and CI-integrated evaluation are each answering a different accountability question about AI-generated code. SWE-CI is the one that maps onto how engineering teams already think.
-
2026-03-08 Ops Brief
The Compound Exit Problem
When user-layer and builder-layer value revolts hit in the same news cycle, AI labs may be modeling them as independent, manageable risks. The evidence suggests they compound.
-
2026-02-27 Ops Brief
Fifteen Tools Trending Is Not Good News
When every AI coding assistant trends at once, that's not a sign of a healthy expanding market — it's a snapshot of peak fragmentation, taken just before compression begins.
-
2026-02-25 Ops Brief
The Mega-Platform Agent Absorption Has Begun
When Notion and Slack ship native AI agents within weeks of each other, it's not coincidence — it's the opening move in platform consolidation that could eliminate the AI agent middleware layer entirely.
-
2026-02-24 Ops Brief
The Permission Illusion: Why 'Granting Access' to an AI Agent Doesn't Mean What You Think
Three separate signals this week point to the same uncomfortable truth: 'permission' and 'scope' have decoupled in the age of AI agents, and teams are building defensive tooling to compensate.
-
2026-02-23 Ops Brief
You Paid for the Model. They Decided How You Use It.
Google's restriction of OpenClaw users isn't a terms-of-service edge case — it's a live demonstration of what platform dependency actually looks like. Paying customers, restricted without warning. Small teams should be watching this carefully.
-
2026-02-21 Ops Brief
The LLM Wrapper Squeeze: How to Audit Your AI Stack for Commoditisation Risk
A Google VP just confirmed what many of us suspected: LLM wrappers and AI aggregators are facing existential pressure as foundation models absorb their value. Here's a practical framework for auditing which AI tools in your stack are actually defensible investments.
-
2026-02-17 Ops Brief
The Agent Skills Reality Check: Why Self-Generated AI Capabilities Don't Work
New research reveals a massive gap between AI agent marketing promises and operational reality — most self-improving agents are elaborate theater.