
> **TL;DR:** AGENTS.md, memory banks, and feature specs all help agents — but they answer different questions (“how to work”, “what we’re doing now”, “what we plan to build”). This post argues the missing artifact is product truth: a behavior spec that states the behaviors to protect (must/must-not, edge cases, state transitions) so agents don’t refactor your product promises away.

If you use Claude Code, Cursor, Cline, or Codex seriously, your repo probably has some kind of context file. Maybe several. AGENTS.md, CLAUDE.md, a Memory Bank, feature specs, a README with conventions.

They all help. And they all leave the same gap.

If you're coming at this from the engineering side, see [Stewie for Engineering Teams](/for/engineering-teams).

## What each layer actually solves

**AGENTS.md / CLAUDE.md** — workflow steering. "Use pnpm, not npm." "Run tests before committing." "Don't modify generated files." This tells the agent *how to work in this repo*. It's process-oriented and changes when your conventions change.

**Memory Bank** (Cline model) — session context. "We're refactoring the auth module." "The billing migration is half-done." "Last session we decided to split the API into two services." This preserves *what we're doing right now* across sessions. It's temporal — it evolves weekly.

**Feature specs / PRDs** — intent. "We want users to be able to invite team members." "The free tier should allow 3 projects." This captures *what we plan to build*. It describes the future, written before implementation.

**Behavior specs (PBC)** — product truth. "If `tax_id` is missing, the invoice must be rejected with `TAX_ID_REQUIRED`." "Grace period is 14 days, not 30." "A workspace admin can remove members, but cannot remove themselves." This captures *what the product actually promises to do* — grounded in how it behaves today, not how someone intended it to behave.

## Why the confusion happens

All four artifacts live in the repo as text files. All four are readable by agents. All four are written in Markdown or something close to it. From the outside, they look like the same thing.

But they solve for different audiences at different timescales:

| Layer | Answers | Audience | Changes |
|-------|---------|----------|---------|
| AGENTS.md | "How do we work here?" | Agents + contributors | Quarterly |
| Memory Bank | "What are we doing right now?" | Agents (session continuity) | Weekly |
| Feature specs | "What do we want to build?" | Product owner + engineering | Per sprint |
| PBC | "What does the product promise?" | Everyone + agents | Rarely (it's that stable) |

The key difference: a behavior spec describes the *durable product truth* — the things that should be true regardless of which agent is running, which developer is coding, or which sprint you're in.

## What goes wrong without the behavior layer

When an agent has AGENTS.md but no PBC, it knows *how to work* but not *what to protect*. So it:

- Refactors billing logic and silently changes the grace period from 14 days to 30
- "Simplifies" an auth check that was handling a deliberate edge case
- Auto-fills a field that the product intentionally leaves empty for compliance reasons
- Removes a validation rule because it looks redundant — but it was load-bearing

The agent isn't being careless. It's following the instructions it has. The problem is that nobody wrote down *which behaviors are off-limits*.

## What a PBC adds

A `.pbc.md` file sits in your repo and states the behavior spec explicitly:

- **What must happen** — required outcomes for each behavior
- **What must not happen** — forbidden actions and states
- **Edge cases** — the exceptions that look like bugs but are deliberate
- **State transitions** — which moves are valid and which are blocked

It's Markdown-first, so it renders cleanly on GitHub and in any editor. But it has structured `pbc:*` blocks inside that tools can parse — behaviors, states, rules, outcomes — all deterministically extractable.

> PRD explains why. PBC specifies what. Code and tests prove how.

The three layers aren't competing. They're complementary. PBC fills the specific gap that the others leave open.

## When to write each one

- Write **AGENTS.md** when you set up a repo or onboard a new tool
- Write **Memory Bank** entries as you work (or let your tools do it)
- Write **feature specs** before you build something new
- Write **PBC** after you've built something worth protecting — start with billing, auth, or entitlements

You don't need all four on day one. But if you're running multiple agents on the same repo — or even one agent that you trust with refactoring — the behavior layer is the one most likely to prevent a production incident.

## Try it

The PBC spec is open source at [github.com/stewie-sh/pbc-spec](https://github.com/stewie-sh/pbc-spec). It ships with a reference CLI and a [browser-based viewer](https://pbc.stewie.sh) that renders `.pbc.md` files as interactive structured UI.

If you already have AGENTS.md in your repo, PBC is the natural next layer. Same workflow, different problem, same repo.
