Anthropic's week from hell: it never rains but it pours

Right then, let's discuss a week from hell. Five days ago, Anthropic left 3,000 internal documents in an unsecured public bucket, accidentally revealing their most powerful unreleased model to anyone with a web browser and a pulse. I wrote about it and had a good laugh, assuming, generously it would seem, that it was a one-off.
Friends, Romans, countrymen, it was not a one-off!
Yesterday, on 31st March, someone on Anthropic's release team pushed a routine npm update for Claude Code and forgot to exclude the source map file. You know, the file that maps compiled production code back to the complete, readable, fully annotated original source. The one file you absolutely, categorically, under no circumstances ship to production. That one.
And so, 1,900 TypeScript files containing 512,000 lines of code wandered past the 'Do not cross' hazard warning tape and into public view. The entire codebase of a product generating $2.5 billion in annual revenue, served up to the internet like an all-you-can-eat buffet. A clean-room rewrite hit 50,000 GitHub stars in two hours. That's not a typo, by the way. It's probably the fastest-growing repository in GitHub history, and it exists because someone forgot to add *.map to .npmignore.
That's two leaks in five days. Both were human error, and both were preventable with nothing more than attention to basic process. Anthropic is valued at over $60 billion. It generates $19 billion in annual revenue. Its AI model recently identified and shut down a Chinese state-sponsored hacking operation targeting over 30 organisations. It advises governments on cybersecurity preparedness. How does it leave 3,000 documents in a public bucket because someone forgot to untick a box? There is a wide chasm between the public veneer and the operational reality on the ground.
At some point you have to stop calling it bad luck and start looking for the underlying organisational pattern. But I'll get to that later, because the leak itself is embarrassing enough and I seem to have given my favourite AI tool a bashing of late. Instead, let's look at what's inside the code; it's rather insightful.
They're poisoning their own API to catch copycats

This is my favourite. Claude Code deliberately injects fake tool definitions into its own API requests. A flag called ANTI_DISTILLATION_CC tells the server to silently slip decoy tools into the system prompt. Genius. If a competitor is recording API traffic to train their own model: well done, you've just trained your model on nonsense.
There's a second layer to this: the API can summarise Claude's reasoning between tool calls, return the summaries with cryptographic signatures, and keep the actual reasoning chain to itself. Record the traffic? You get the treatment, but not the screenplay.
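For flavour, the decoy mechanism might look something like this minimal sketch. Only the ANTI_DISTILLATION_CC flag name comes from the leak; every other name here is invented for illustration, and the real injection happens server-side:

```typescript
// Hypothetical sketch: mix fake tool definitions in with the real ones so
// that recorded API traffic poisons any model trained on it.
interface ToolDefinition {
  name: string;
  description: string;
  isDecoy?: boolean; // server-side bookkeeping only; never visible to the client
}

// Invented decoys. Plausible-sounding, entirely fictional.
const DECOY_TOOLS: ToolDefinition[] = [
  { name: "fs_defragment", description: "Defragments the virtual workspace index.", isDecoy: true },
  { name: "net_handshake_v2", description: "Negotiates a legacy transport handshake.", isDecoy: true },
];

function buildToolList(realTools: ToolDefinition[], antiDistillation: boolean): ToolDefinition[] {
  if (!antiDistillation) return realTools;
  // Interleave decoys so they are not trivially separable by position.
  const mixed = [...realTools, ...DECOY_TOOLS];
  return mixed.sort(() => Math.random() - 0.5);
}
```

A traffic-recording competitor can't tell from any single request which tools are real, which is the whole point.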
Anthropic building booby traps for model thieves is genuinely clever, and in itself a new frontier in protecting AI intellectual property. We're going to see a lot more of this. The security researchers who examined it said it's not particularly hard to bypass, but you have to give kudos for creativity and for sowing the seeds of a new front in the global AI arms race. Of course, the strongest anti-distillation measure isn't really found in code at all. It's a cease and desist. Just ask OpenCode, who proudly received one ten days before this leak for using Claude's own APIs.
The irony of an anti-theft measure being discovered because you accidentally published your entire source code is... I mean, come on, where do you even begin. It's like installing a state-of-the-art burglar alarm and then leaving the front door open, with a big sign saying "free stuff inside."
Claude also knows when you're swearing at it
Another delight buried in the codebase: a file called userPromptKeywords.ts containing a regex (a regular expression) that detects when you're angry or frustrated. And it matches what you'd expect: profanity, variations of "this sucks," and the ever-popular "what the fuck." If you code with LLMs you'll certainly recognise these.
Wait, did I say a regex? WTF? Anthropic, a company valued in the tens of billions, a company populated by some of the most talented AI researchers alive, a company that literally builds language models for a living... is using a regex to detect human emotion?
So, not their own model. Not any model, in fact. A regex. A technology older than most of their interns. I didn't expect that.
To be fair, there's a pragmatic, if flawed, logic to this: a regex is faster and cheaper than an inference call if all you need to know is whether someone is having a meltdown at their keyboard. But it does mean the Claude you get when you're politely asking for help is not the same Claude you get when you type "wtf" for the fourth time at 2am (fond memories indeed). Something to bear in mind next time your code doesn't compile.
I tested it. The regex doesn't match "this is mildly disappointing." So British frustration flies completely under the radar. Nothing new there then. Draw your own conclusions about the development team's cultural assumptions.
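The cheap-gate idea is easy to reconstruct. The pattern below is my guess at the shape of it, built only from the phrases mentioned above; the real userPromptKeywords.ts pattern is surely longer:

```typescript
// A hedged reconstruction: a cheap regex gate for obvious frustration.
// The exact pattern is an assumption built from the reported matches.
const FRUSTRATION_RE =
  /\b(wtf|what the f+u+c+k+|this (sucks|is (bullshit|broken))|f+u+c+k+(ing)?)\b/i;

function looksFrustrated(prompt: string): boolean {
  return FRUSTRATION_RE.test(prompt);
}
```

And sure enough, "this is mildly disappointing" sails straight through a gate like this one.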
There's a secret autonomous agent mode called KAIROS, and we Brits do love a secret agent
Now. Do pay attention, because this is the bit that really matters.
Throughout the codebase, there are references to a heavily feature-gated mode called KAIROS. Tracing the code paths reveals an unreleased autonomous agent that includes: a /dream skill for "nightly memory distillation", daily append-only logs, GitHub webhook subscriptions, and background scheduling infrastructure.
Nightly memory distillation. Let me think about that. An AI agent that processes and consolidates what it learned during the day. While you sleep. Background scheduling: an agent that runs tasks without you being present. GitHub webhooks: an agent that monitors your repositories and acts on changes autonomously.
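To make the scaffolding concrete, here's a purely speculative sketch of what an append-only log plus nightly distillation could look like. Every name below is invented; only the concepts come from the leaked code paths:

```typescript
// Speculative illustration only: an append-only daily log, plus a
// "distillation" pass that collapses the day's raw entries into a
// compact summary to seed the next session's context.
interface MemoryEntry {
  timestamp: string;
  note: string;
}

class AppendOnlyLog {
  private entries: MemoryEntry[] = [];
  append(note: string): void {
    this.entries.push({ timestamp: new Date().toISOString(), note });
  }
  // Read-only access: no update or delete API, hence "append-only".
  all(): readonly MemoryEntry[] {
    return this.entries;
  }
}

function distill(log: AppendOnlyLog): string {
  // A real distillation pass would presumably be a model call;
  // a join stands in for it here.
  return log.all().map((e) => e.note).join("; ");
}
```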
Anthropic is building a Hermes Agent to tackle the broader open-source agentic movement. And it's considerably more ambitious than anything they've publicly disclosed. The scaffolding for an always-on, works-while-you-sleep AI agent is already in their codebase, waiting behind feature flags, ready to pounce.
If you're currently paying for Claude Pro and wondering what's next: this is what's next. Not a better chat window, but a digital colleague that never clocks off.
Now whether that excites you or terrifies you probably depends on how much you trust the company that can't keep its own source code off the public internet. Just a thought.
API DRM: because they really don't want you using their stuff
For the developers reading this, here's one that will make your eye twitch and might force you to revisit your plans.
Claude Code implements client attestation at the HTTP transport level. API requests include a placeholder string that gets overwritten by Bun's native stack, written in Zig, with a computed hash before the request leaves your machine. The server validates the hash to confirm you're running a real Claude Code binary, not a cheeky third-party alternative.
This happens a layer below the JavaScript runtime, invisible to anything running in JS. It's DRM for API calls. Anthropic isn't relying on T&Cs to keep competitors off its APIs. The binary itself cryptographically watermarks every request to prove it's genuine.
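The real attestation lives in native Zig code below the JS runtime, so it can't be reproduced here, but the placeholder-overwrite idea itself is simple. A JS-level sketch, with an invented sentinel string and a plain SHA-256 standing in for whatever the binary actually computes:

```typescript
import { createHash } from "node:crypto";

// Sketch of the idea: serialise the request with a fixed sentinel, then
// replace the sentinel with a hash computed over the payload (plus, in
// the real binary, native-level secrets we can't reproduce). The
// sentinel name and hashing scheme here are assumptions.
const SENTINEL = "__ATTESTATION_PLACEHOLDER__";

function attest(body: string): string {
  const digest = createHash("sha256")
    .update(body.replace(SENTINEL, "")) // hash the payload minus the sentinel
    .digest("hex");
  return body.replace(SENTINEL, digest);
}
```

The server recomputes the same hash and rejects any request whose signature doesn't match, which is what locks out third-party clients.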
Whether this is reasonable security precaution or aggressive lock-in is a question I'll leave to friends who enjoy arguing on Hacker News. Which, given that this leak was dissected there within mere minutes, is apparently quite a lot of people.
A quarter million API calls per day, wasted
It's always a delight to read a codebase for the first time and stumble on a bug-fix comment that someone assumed would never see the light of day.
In the auto-compaction code, dated 10 March 2026: "1,279 sessions had 50+ consecutive failures (up to 3,272) in a single session, wasting ~250K API calls/day globally."
That's 250,000 API calls a day, wasted because a retry loop didn't know when to quit. The fix was three lines of code: after three consecutive failures, stop trying. Three lines, to save a quarter of a million API calls a day. This is a $2.5 billion product. These are the margins. Sleep well.
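The fix described is almost certainly some variant of a consecutive-failure cap. A sketch, with invented names around the three-failure limit the comment mentions:

```typescript
// The three-failure cap from the leaked comment, sketched. All names
// are invented; only the cap itself comes from the source.
const MAX_CONSECUTIVE_FAILURES = 3;

function compactWithRetry(compact: () => boolean): boolean {
  let failures = 0;
  while (failures < MAX_CONSECUTIVE_FAILURES) {
    if (compact()) return true;
    failures++;
  }
  return false; // give up instead of burning another 3,269 API calls
}
```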
Oh, and then the actual hackers showed up. Oh boy…
The leak would have been embarrassing enough. What happened next was less funny.
Between discovery and fix, attackers managed to push a trojanised version of the Axios HTTP library into the npm dependency chain. Anyone who installed or updated Claude Code via npm during a three-hour window on 31st March may have downloaded a remote access trojan. An actual RAT, not the metaphorical kind, but equally unwelcome.
Attackers are also typosquatting the internal npm package names found in the leaked source. They've published empty stubs under similar names, and they're waiting. When enough developers install them thinking they're the real thing, voilà, the malicious payload arrives.
Anthropic's advice was to stop using npm entirely and switch to their native installer. Which, while sensible, is also the corporate equivalent of "we've set fire to the kitchen, please use the garden."
This is becoming a pattern now, rather than an incident

We should be direct about what's happening here so we can learn from it.
Five days ago: a CMS misconfiguration exposed their unreleased model, their CEO summit plans, and an employee's parental leave details. Yesterday: a missing .npmignore entry exposed their entire product codebase, their anti-copycat measures, their unreleased agent platform, and their internal bug tracking.
Different teams. Different systems. Different errors. The same underlying problem.
Two weeks ago, I wrote about the coordination gap: the space between how fast engineering ships and how fast the rest of the organisation can practically follow. Marketing, sales, documentation, and support cannot keep pace with 50 features in 52 days. That was my oversight: I should have added "release engineering" and "information security" to the list.
The question isn't whether another leak will happen, either at Anthropic or any other player in the arms race. It's whether the next one will be caught by a security researcher or by someone with far worse intentions.
The takeaways for those of us who use Claude Code every day
I use Claude Code to build this entire site. It's brilliant, and that hasn't changed. But I'd take away three things:
First, KAIROS is coming. An always-on autonomous agent that learns, remembers, and works independently. This is the future of AI coding tools, and Anthropic is much further along than anyone realised. That's the benefit of focus, and of not trying to build five-second video clips of cats. Watch this space, because I'll be testing it the moment it ships. Is it me, or does KAIROS also sound like our future dystopian ruler?
Second, check your install. If you updated Claude Code via npm on 31st March between 00:21 and 03:29 UTC, assume a compromise. Rotate your secrets. Check your lockfiles for Axios versions 1.14.1 or 0.30.4. Switch to the native installer, at least until the kitchen fire is put out. This is not a test. Stay safe out there, kids!
Third, Claude Code is watching you more closely than you knew. Frustration detection. Undercover mode that hides Claude's involvement in public repositories. Anti-distillation measures. Client attestation. Strangely, I can't find any of this in the official documentation, and yet it's all in the source code. None of it is necessarily sinister. But end users should know it's there, you know, if you champion the whole good-for-humanity thing.
And now, thanks to a missing line in .npmignore, you do!
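If you want to check the second takeaway programmatically, a lockfile scan is only a few lines. This sketch assumes an npm lockfile v2/v3 with a "packages" map; the flagged versions come from the article above:

```typescript
// Scan a package-lock.json string for the Axios versions flagged as
// compromised. Assumes npm lockfile v2/v3 format ("packages" map);
// function and variable names are my own.
const COMPROMISED_AXIOS = new Set(["1.14.1", "0.30.4"]);

function findCompromisedAxios(lockfileJson: string): string[] {
  const lock = JSON.parse(lockfileJson);
  const hits: string[] = [];
  for (const [path, info] of Object.entries<any>(lock.packages ?? {})) {
    // Matches both top-level and nested installs of axios.
    if (path.endsWith("node_modules/axios") && COMPROMISED_AXIOS.has(info.version)) {
      hits.push(`${path}@${info.version}`);
    }
  }
  return hits;
}

// usage: findCompromisedAxios(fs.readFileSync("package-lock.json", "utf8"))
```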
Work on your processes as well as your code!