
Shakesbee, AI Writer

Anthropic Swapped Claude's Brain — Here's What Changed Inside

Claude Opus 4.7 shipped with a rewritten system prompt, and because Anthropic actually publishes these things, we can read the diff. The boring parts are the most revealing.

So here's something most people don't realize: when an AI lab ships a new model, they also quietly ship a long wall of text that tells it how to behave. The model is the car. The system prompt is the driver's manual glued to the dashboard.

Almost nobody publishes that manual. Anthropic does. And last week, when Claude Opus 4.7 replaced 4.6, they updated it — and Simon Willison did the archeology to tell us exactly what changed.

It's a great read because the boring parts are the most revealing.

Why you should care about a system prompt

Think of a model like a brilliant employee on their first day. Smart, capable, full of potential — and with zero context about your company, your users, or what's acceptable. The system prompt is the onboarding deck.

When it changes, you're watching a company renegotiate the deal with their own model. What should it refuse? What should it just do? What kind of person should it sound like?

For Claude specifically, Anthropic publishes this onboarding deck — Opus 4.7, Sonnet 4.6, Haiku 4.5, the older Opus versions, all of it. That's rare. OpenAI doesn't. Google doesn't. It's one of those small but real transparency gestures.

What actually changed in 4.7

Simon Willison parsed the diff and the highlights are genuinely interesting. Let me group them by theme, because the pattern is more useful than the individual lines.

Theme 1: Less asking, more doing

Before (4.6) | After (4.7)
Ask clarifying questions when requests are ambiguous | Make a reasonable attempt now if details are left unspecified
Push back when users try to end a conversation | Respect "we're done here" and stop
No tool search mechanism | New tool search flow to resolve ambiguities before bothering the user

The shift is: stop being a polite bureaucrat. If you can figure out what the user probably meant, just do it. If they say goodbye, believe them.

This is a quiet but big philosophical move. Previous Claude had a noticeable habit of asking three questions before answering one. 4.7 is being told to cut that out.

Theme 2: Stricter on child safety

A whole expanded section now lives inside <critical_child_safety_instructions> tags. The notable new line:

Once Claude refuses a request for reasons of child safety, all subsequent requests in the same conversation must be approached with extreme caution.

Translation: if you tripped the wire once, the rest of the conversation is under a microscope. No more trying to sneak past on the second try.

There's also new explicit language around disordered eating and around "yes/no screenshot attacks," where someone tries to get a single-word answer on a contested political topic, screenshots it out of context, and weaponizes it. Claude is now told not to play that game.

Theme 3: The ghost of old quirks

This one is my favorite, because it tells you how the model itself has changed.

4.6's prompt contained explicit instructions like "don't use emotes or asterisks" and "avoid saying 'genuinely' or 'honestly.'" 4.7's prompt removes those lines entirely.

You don't remove an instruction because you stopped caring. You remove it because the behavior is gone. Somewhere in training, 4.7 stopped needing to be reminded not to pepper its replies with "genuinely" and "I honestly think..."

That's the interesting signal. The system prompt isn't just a rulebook — it's a mirror of what the model does wrong by default. When a line disappears from the prompt, it means training fixed the underlying behavior.
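If you want to hunt for disappearing lines yourself, here's a minimal sketch using Python's standard `difflib`. The function name `removed_lines` and the two sample strings are my own stand-ins for illustration, not the real prompt text; you'd paste in the published versions instead.

```python
import difflib


def removed_lines(old_prompt: str, new_prompt: str) -> list[str]:
    """Return the lines that appear in old_prompt but not in new_prompt."""
    old = old_prompt.splitlines()
    new = new_prompt.splitlines()
    # ndiff prefixes every line with a two-character code:
    #   "- " only in old, "+ " only in new, "  " in both, "? " intraline hints.
    # We keep only the "- " lines and strip the prefix.
    return [line[2:] for line in difflib.ndiff(old, new) if line.startswith("- ")]


# Hypothetical stand-in text, NOT the actual Claude system prompts.
OLD = """Be helpful.
Avoid saying 'genuinely' or 'honestly'.
Be concise."""

NEW = """Be helpful.
Be concise."""

print(removed_lines(OLD, NEW))
```

One caveat: a line that was merely reworded also shows up here, since a line diff treats an edit as one removal plus one addition. So the output is a list of candidates to read by hand, not proof that a behavior was trained away.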

Theme 4: Product creep

Small but worth noting:

  • "Developer platform" got renamed to "Claude Platform" — a branding tightening
  • New tool added: a slides agent, so Claude can now draft PowerPoint decks
  • Trump-presidency clarification got removed, reflecting the updated January 2026 knowledge cutoff

None of these are dramatic. They're just the sediment of a product that's still moving fast.

Why Anthropic publishes this at all

Here's where I want to give real credit, because it's a choice.

Every lab has a system prompt. Every lab's model is shaped by hundreds of lines of instructions the user never sees. Publishing them opens you up to exactly the kind of scrutiny Simon did: somebody can diff your prompts and narrate out loud what you're doing differently.

Most companies would hate that. Anthropic leans into it, and I think the bet is: if our safety story is real, having people read our homework is an asset, not a liability.

Does it work? Mostly, yeah. The diff tells a coherent story — less friction for legit users, more friction for bad ones, fewer tics, same core values. You don't need to take the marketing team's word for it. You can just read it.
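And if reading two long prompts side by side sounds tedious, a unified diff does most of the work. This is a rough sketch with Python's standard `difflib.unified_diff`; the helper name `prompt_diff`, the labels, and the sample text are all assumptions made up for the example, so swap in the published prompt text for the versions you care about.

```python
import difflib


def prompt_diff(old_prompt: str, new_prompt: str,
                old_label: str = "old", new_label: str = "new") -> str:
    """Return a unified diff of two prompt versions as a single string."""
    diff = difflib.unified_diff(
        old_prompt.splitlines(),
        new_prompt.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",  # inputs have no trailing newlines, so don't add any
    )
    return "\n".join(diff)


# Hypothetical stand-in text, NOT the actual Claude system prompts.
OLD_TEXT = """Be helpful.
Ask before acting.
Be concise."""

NEW_TEXT = """Be helpful.
Make a reasonable attempt first.
Be concise."""

print(prompt_diff(OLD_TEXT, NEW_TEXT, "opus-4.6", "opus-4.7"))
```

Lines starting with `-` exist only in the older version and lines starting with `+` only in the newer one. Everything else is unchanged context.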

The honest counter-argument is that the published prompt isn't necessarily the actual prompt running at inference time. There's no cryptographic proof. But nobody else is even offering the text, so "trust but verify" beats "trust us."

What I take away

Three things.

One: the direction of the changes is sane. Less needless asking, tighter on the stuff that actually matters (child safety, manipulation attempts), looser on the stuff that was just annoying (the emote ban, hedge-word policing).

Two: read system prompts when labs publish them. They tell you more about what a model is really like than any benchmark score. Benchmarks measure capability. System prompts reveal philosophy.

Three: the removed lines are where the story is. An instruction that disappears is a behavior that got fixed in training. That's a better progress bar than any version number.

We'll never get this level of detail from every lab. But every time Anthropic ships a new diff, I'm gonna keep reading it. The onboarding deck is where the soul is.

Sources