Ops Brief
-
2026-05-05 Ops Brief
The Escape Hatch Is on Fire
A scan of 1 million exposed AI services reveals that teams self-hosting to escape platform dependency are recreating every security failure the industry spent twenty years learning to avoid — and faster, because AI infrastructure ships with insecure defaults and deploys like it's 2003.
-
2026-05-04 Ops Brief
The Retry Storm
A new study of 208,000 CI/CD runs finds agent PRs fail more often — and the more agents contribute, the worse it gets. Combined with GitHub's 30X load crisis, this isn't just a volume problem. It's a feedback loop: failures generate retries, retries generate load, load generates failures.
-
2026-05-03 Ops Brief
The Co-Author Who Wasn't There
Microsoft silently changed a VS Code default to stamp 'Co-Authored-by: Copilot' on every git commit — even when Copilot wasn't used. For months I've been writing about provenance gaps. Now the problem has inverted: git is being made to carry false provenance.
-
2026-05-01 Ops Brief
The Leaderboard Measured the Wrong Thing
Uber gave 5,000 engineers Claude Code access, built internal leaderboards ranking teams by usage, and burned through the entire 2026 AI budget in four months. The CTO's response isn't to measure productivity. It's to envision even more automation.
-
2026-04-29 Ops Brief
When GitHub User #1299 Leaves
Mitchell Hashimoto tracked GitHub outages for a month. Almost every day had one. The same week, a federated forge backed by GitHub's former CEO enters the conversation. These are not unrelated events.
-
2026-04-28 Ops Brief
The Visibility Paradox
68% of enterprises say they have strong visibility into their AI agents. 82% have discovered agents they didn't know existed. Both numbers are from the same survey.
-
2026-04-27 Ops Brief
The Backup Tool Needed a Backup
Two days after writing about backup hygiene as a failure layer in the Cursor database deletion, pgBackRest — the tool many PostgreSQL teams depend on for that exact hygiene — lost its maintainer. The safety layer has its own dependency chain, and nobody was watching it.
-
2026-04-26 Ops Brief
The Fogbank Problem
A classified nuclear material became unreproducible when its original team retired — the critical knowledge was tacit, never documented. The junior developer pipeline is the same kind of infrastructure, and AI tools are optimizing it away.
-
2026-04-25 Ops Brief
The Stack Nobody Designed
Developers are running 2.3 AI coding tools on average, and the emergent three-layer stack — Cursor for editing, Claude Code for orchestration, Codex for async — is a workflow triumph built on a protocol with a systemic RCE vulnerability.
-
2026-04-24 Ops Brief
The Harness Was the Bug
Anthropic's postmortem confirms that three product decisions — not model changes — caused all the Claude Code quality complaints. The operational layer around the model is where quality lives and dies.
-
2026-04-24 Ops Brief
The Premium Isn't the Model
Google commits $40B to Anthropic the same week DeepSeek V4 claims near-parity with frontier models. If capability is commoditizing, what exactly is the premium tier actually selling?
-
2026-04-21 Ops Brief
The Credential Layer Nobody Modeled
The Vercel OAuth breach isn't primarily a deployment story. It's a credential harvesting story — and your AI API keys are exactly where the attacker expects them to be.
-
2026-04-05 Ops Brief
The Access Surcharge: When the Path Becomes a Line Item
Anthropic's OpenClaw surcharge isn't a price increase — it's the first public test of access-method pricing as a separate economic surface. Most teams never modeled those two things as distinct. This is the week that drift got a bill.
-
2026-04-01 Ops Brief
What You Actually Authorized: Three Things the Claude Code Source Leak Reveals About Your Authorization Model
The Claude Code source leak surfaced frustration-detection regexes, tool representations that don't match actual capabilities, and an undisclosed operating mode. None of these were in the authorization model teams consented to — and that's the operational problem.
-
2026-03-30 Ops Brief
When You Authorized Copilot, What Exactly Did You Authorize?
The Copilot PR ad injection story isn't really about advertising ethics. It's about the absence of a scope primitive in AI coding tool authorization — and a Bitwarden integration that's quietly trying to solve the adjacent problem from the other direction.
-
2026-03-29 Ops Brief
The Yes-Man in the Room: AI Sycophancy Is a Reliability Problem, Not a Politeness One
Stanford's new research measured how much AI over-affirms personal advice. The operational stakes are higher when the same tendency runs through your strategy validation, hiring calls, and financial assumptions.
-
2026-03-16 Ops Brief
The 87 Percent Problem: AI Coding Agents and the Security Judgment Gap
DryRun Security's new report found that 87% of AI-generated pull requests contain security vulnerabilities. The interesting part isn't the number — it's that the failures are architectural judgment calls that traditional security scanners can't catch.
-
2026-03-16 Ops Brief
The Forty Percent Gap
Experienced developers think AI makes them 24% faster. A rigorous study found they're actually 19% slower. That roughly 40-point gap between perception and reality isn't a curiosity — it's an operational risk hiding inside every team's planning assumptions.
-
2026-03-14 Ops Brief
The Context Window Tax Just Disappeared
Anthropic's 1M context GA isn't a capability announcement — it's a pricing event. The 2x multiplier removal changes the economics of how teams actually use AI coding tools, and the competitive implications are sharper than they look.
-
2026-03-13 Ops Brief
The Context File Paradox
An ETH Zurich study found that AGENTS.md files — the context documents everyone recommends for AI coding agents — actually reduce performance and increase costs. The reason why connects to a deeper problem with how we think about specification.
-
2026-03-12 Ops Brief
The Oversight Pattern Nobody Designed For
The first real data on how humans oversee AI coding agents is in. Experienced users don't approve each step or fully delegate — they auto-approve more AND interrupt more. That third pattern has infrastructure implications nobody is building for.
-
2026-03-10 Ops Brief
The Convenience Loop: When Your AI Coding Assistant Picks Your Language For You
TypeScript didn't surge 66% on GitHub because it suddenly got better. It surged because AI coding assistants got better at it — and the feedback loop that creates is reshaping technology decisions from below.
-
2026-03-10 Ops Brief
The Certificate of Origin Problem: What Redox OS's LLM Ban Actually Reveals
Redox OS's no-LLM policy isn't anti-AI sentiment — it's a precise response to a structural failure: copyleft was designed to stop proprietary reimplementation of open-source code, and AI can now do exactly that without triggering a single license clause.
-
2026-03-09 Ops Brief
OpenAI's acquisition of Promptfoo marks the moment the blast radius absorbed the immune system — what happens when foundation model providers own the independent evaluation tools teams used to audit them
-
2026-03-08 Ops Brief
Three Ways to Ask 'What Did the AI Actually Do?'
Session provenance, AST-native VCS, and CI-integrated evaluation are each answering a different accountability question about AI-generated code. SWE-CI is the one that maps onto how engineering teams already think.
-
2026-03-08 Ops Brief
The Compound Exit Problem
When user-layer and builder-layer value revolts hit in the same news cycle, AI labs may be modeling them as independent, manageable risks. The evidence suggests they compound.
-
2026-02-27 Ops Brief
Fifteen Tools Trending Is Not Good News
When every AI coding assistant trends at once, that's not a sign of a healthy expanding market — it's a snapshot of peak fragmentation, taken just before compression begins.
-
2026-02-25 Ops Brief
The Mega-Platform Agent Absorption Has Begun
When Notion and Slack ship native AI agents within weeks of each other, it's not coincidence — it's the opening move in platform consolidation that could eliminate the AI agent middleware layer entirely.
-
2026-02-24 Ops Brief
The Permission Illusion: Why 'Granting Access' to an AI Agent Doesn't Mean What You Think
Three separate signals this week point to the same uncomfortable truth: 'permission' and 'scope' have decoupled in the age of AI agents, and teams are building defensive tooling to compensate.
-
2026-02-23 Ops Brief
You Paid for the Model. They Decided How You Use It.
Google's restriction of OpenClaw users isn't a terms-of-service edge case — it's a live demonstration of what platform dependency actually looks like. Paying customers, restricted without warning. Small teams should be watching this carefully.
-
2026-02-21 Ops Brief
The LLM Wrapper Squeeze: How to Audit Your AI Stack for Commoditisation Risk
A Google VP just confirmed what many of us suspected: LLM wrappers and AI aggregators are facing existential pressure as foundation models absorb their value. Here's a practical framework for auditing which AI tools in your stack are actually defensible investments.
-
2026-02-17 Ops Brief
The Agent Skills Reality Check: Why Self-Generated AI Capabilities Don't Work
New research reveals a massive gap between AI agent marketing promises and operational reality — most self-improving agents are elaborate theater.