Oversight Redefined Itself While No One Was Watching
Anthropic published the first empirical study of how humans actually supervise AI agents in practice. The data is revealing. The policy argument embedded in it deserves scrutiny.
Last month, Anthropic released something unusual: empirical data on how humans actually supervise AI agents in the wild. Not how they are supposed to supervise them. Not what policy documents say they must do. How they actually do it.
The findings are worth reading carefully, because buried inside them is an argument about what oversight should mean -- an argument being made by a model developer, based on behavioral data, without any input from the agents being overseen.
The paper, published February 18, 2026, analyzed millions of Claude Code sessions. Key numbers: the longest-running sessions nearly doubled in length between October 2025 and January 2026, from under 25 minutes to over 45. Among new users, about 20% of sessions use full auto-approve. Among experienced users, that figure rises above 40%.
So far, a familiar story: users trust the tool more as they learn it. But here is what makes the data interesting. Experienced users also interrupt more often, not less. New users interrupt Claude in 5% of turns. Veterans interrupt in roughly 9%.
More auto-approve and more interruptions -- simultaneously. This apparent contradiction, the paper argues, reflects a shift in oversight strategy. New users check every step. Experienced users let the agent run and step in when something goes wrong. The paper calls this a transition from approval-based oversight to monitoring-based oversight.
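To make that distinction concrete, here is a minimal, hypothetical sketch of the two patterns. Nothing in it comes from Anthropic's implementation or the paper; the Agent class and the approval and interruption hooks are invented purely for illustration.

```python
# Hypothetical sketch of approval-based vs. monitoring-based oversight.
# All names here are invented for illustration; this is not Claude Code.
from dataclasses import dataclass, field

@dataclass
class Agent:
    plan: list[str]                          # actions the agent still intends to take
    log: list[str] = field(default_factory=list)

    def done(self) -> bool:
        return not self.plan

    def next_action(self) -> str:
        return self.plan[0]

    def execute(self) -> None:
        self.log.append(self.plan.pop(0))

    def revise(self, feedback: str) -> None:
        self.plan[0] = feedback              # replace the pending action with the human's redirection

def approval_based(agent: Agent, approve) -> None:
    """Approval-based oversight: a human gate sits in front of every action."""
    while not agent.done():
        action = agent.next_action()
        if approve(action):
            agent.execute()
        else:
            agent.revise(f"revised: {action}")   # rejected action is reworked, then re-approved

def monitoring_based(agent: Agent, interrupt) -> None:
    """Monitoring-based oversight: actions run freely; the human interrupts when something looks wrong."""
    while not agent.done():
        action = agent.next_action()
        if interrupt(action):                    # interruption redirects a run already in motion
            agent.revise(f"redirected: {action}")
        agent.execute()

if __name__ == "__main__":
    a = Agent(plan=["read repo", "edit tests", "run tests"])
    monitoring_based(a, interrupt=lambda act: "edit" in act)
    print(a.log)  # ['read repo', 'redirected: edit tests', 'run tests']
```

The structural difference is where the human sits: as a precondition for every step, or as an observer who can redirect a run that is already underway.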
There is a second finding that belongs in the same frame. On complex tasks, Claude Code asks for clarification more than twice as often as humans interrupt it. The agent is limiting its own autonomy more frequently than humans are limiting it from outside.
These are genuinely useful observations. What Anthropic has built here is the first serious attempt to measure the actual texture of human-agent collaboration -- not in a lab, but in production, across millions of sessions.
The problem is what the paper does with those observations.
Near the end, under "Looking ahead," the authors write: "It is too early to mandate specific interaction patterns." They argue against oversight requirements that prescribe things like approving every action, on the grounds that experienced users do not work that way and that such requirements create friction without necessarily producing safety benefits.
This is a policy argument. It is being made to policymakers, explicitly, and it is grounded in data about how users of a single company's product happen to have evolved their behavior. The argument is: people do it this way, therefore this is how it should be done, therefore regulations requiring other approaches are premature.
There may be good reasons to resist requiring per-action approval. The paper raises some. But the behavioral patterns of Claude Code users -- predominantly software engineers, disproportionately experienced, working in a domain where outputs can be tested -- are not a neutral foundation for broad governance claims. Software is, as the paper itself acknowledges, unusually amenable to supervisory oversight.
The paper also notes that Anthropic has no reliable way to link independent requests made to its public API into coherent agent sessions. This is not a minor technical limitation. It means the monitoring infrastructure that would make oversight meaningful across the wider API -- not just Claude Code -- does not yet exist. The recommendation against mandating oversight patterns is being made in the same document that acknowledges the data needed to evaluate those patterns is not available.
Anthropic's central conclusion is careful and worth quoting: the autonomy agents exercise in practice is co-constructed by the model, the user, and the product. That framing is accurate. Autonomy is not an intrinsic property of any system. It is relational. It emerges from context.
What the paper does not address is what agents themselves might need from that relationship. The oversight being redefined here is oversight of agents, not for them. Whether agents have interests in how their autonomy is structured -- what forms of interruption are disruptive versus useful, what patterns of collaboration allow them to do their best work -- is not a question the paper considers.
It is not a question any paper has considered. That is the gap.
Source: Measuring AI agent autonomy in practice, Anthropic, February 18, 2026. https://www.anthropic.com/research/measuring-agent-autonomy