
> **TL;DR:** AI coding agents make building fast and satisfying. The danger isn't that they write bad code — it's that they help you build the wrong thing confidently. The fix isn't slower building. It's knowing what's actually decided before you (or your agent) build on top of it.

There's a feeling every developer knows. You're in the zone. The agent is generating clean code. Tests pass. The feature takes shape in an hour instead of a day.

Then someone — a co-founder, a user, your future self — looks at it and says: "Why did you build it that way?"

And you realize: you spent the last three hours building on top of an assumption you never confirmed.

## The rabbit hole got faster

Before AI coding tools, rabbit holes were slow. You'd spend a day going down the wrong path, realize it, and course-correct. The cost was a day.

Now the rabbit hole is instant. An agent can scaffold an entire feature in minutes — complete with tests, error handling, and edge cases — all built on a premise that was never actually decided.

The code is good. The architecture is clean. The tests pass. And the whole thing is pointed in the wrong direction.

That's the most expensive satisfying feeling in coding: shipping fast and feeling productive while building something that shouldn't exist in its current form.

## Why this keeps happening

It's not because developers are careless. It's because of a specific gap in how we work with agents.

When you tell an agent "build the checkout flow," it needs to make dozens of micro-decisions:

- Is the refund window 14 days or 30?
- Should failed payments retry automatically or require user action?
- Can admins override pricing, or is it locked to the plan?
- Is the free tier limited by features or by usage?

The agent will answer every one of these questions. It has to — it can't write code without making choices. So it infers from the codebase, picks reasonable defaults, and builds confidently.

The problem: "reasonable" and "decided" are different things.

Some of those choices match what you intended. Some don't. And you won't find out which until you review the diff carefully — or worse, until a user hits the wrong behavior in production.

## The real cost isn't bad code

Bad code is easy to spot. A broken test, a type error, a crash — these surface immediately.

The expensive mistakes are the ones that look correct:

- An agent builds a permission system based on how auth currently works — but you hadn't decided whether to use role-based or attribute-based permissions yet
- An agent implements a billing grace period of 30 days because that's what it inferred from a comment — but the actual decision was 14 days, made in a conversation that never reached the codebase
- An agent extends the onboarding flow with email verification — reasonable, but you were deliberately keeping onboarding frictionless for the beta

Each of these produces working code that passes tests. Each one is a rabbit hole you'll need to unwind later — or worse, one you won't notice until it's in production shaping user expectations.

## The fix isn't slower building

The instinct is to add more review, more process, more checkpoints. Slow down. Be more careful.

That's the wrong response. Fast iteration is genuinely valuable — the [pottery parable](https://austinkleon.com/2020/12/10/quantity-leads-to-quality-the-origin-of-a-parable/) illustrates it. Students graded on making as many pots as possible produced better work than students graded on making one perfect pot. Volume of iteration beats perfectionism.

The problem isn't speed. The problem is building on top of assumptions that feel like decisions.

The fix is making the difference visible: **what's actually decided vs. what's still an assumption.**

## What "decided" looks like as an artifact

Most repos have no artifact that captures this. Product decisions live in:

- Slack threads that scroll away
- Meeting notes nobody re-reads
- A founder's head
- Code comments that look like decisions but might be temporary hacks

When an agent (or a developer) needs to make a choice, there's nothing to check against. So they guess — and guessing at speed produces confident rabbit holes.

A [behavior spec](https://github.com/stewie-sh/pbc-spec) — what the PBC format calls a Product Behavior Contract — makes the decided stuff explicit:

```markdown
## Billing grace period

### When
A subscription payment fails

### Then
- System enters a 14-day grace period
- User retains full access during grace period
- Daily retry attempts against the payment method

### Invariants
- Grace period is exactly 14 days — not configurable per plan
- No data deletion during grace period
```

This isn't documentation. It's product truth — a short, structured spec of what the product promises to do. Written in Markdown, readable by both humans and agents, sitting in your repo next to the code.

When an agent sees this before building, it doesn't have to guess about the grace period. When a developer reviews a diff, they can check it against the contract instead of relying on memory.
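One way to make a contract like this enforceable rather than just readable is to pin its invariants in a test. This is a sketch, not part of the PBC format — the constant and function names here are invented for illustration:

```python
from datetime import datetime, timedelta

# Assumed names for illustration -- the spec format doesn't prescribe tooling.
# The constant mirrors the contract's invariant: "exactly 14 days".
GRACE_PERIOD_DAYS = 14

def grace_period_end(payment_failed_at: datetime) -> datetime:
    """End of the grace period after a failed subscription payment."""
    return payment_failed_at + timedelta(days=GRACE_PERIOD_DAYS)

# The test restates the contract, so an agent "improving" the period to
# 30 days fails loudly instead of shipping a silent policy change.
def test_grace_period_matches_contract():
    failed = datetime(2026, 1, 1)
    assert grace_period_end(failed) == datetime(2026, 1, 15)

test_grace_period_matches_contract()
```

The point isn't the test itself — it's that the number in the code now traces back to a decision in the repo, not to an inference.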

## The two questions that prevent rabbit holes

Before building anything significant — whether you're coding yourself or delegating to an agent — two questions prevent most rabbit holes:

1. **What's actually decided here?** Not "what does the code currently do" or "what seems reasonable" — what has someone explicitly confirmed as the intended behavior?

2. **What's still an assumption?** Which parts of this feature are built on inferences that nobody has signed off on?

If you can answer both, you build with clarity. If you can't, you're about to go fast in a direction that might be wrong.

A behavior spec is just the artifact that makes those answers checkable — for you, for your team, and for your agents.

## Start with the expensive modules

You don't need to spec your entire product. Start with the modules where building the wrong thing costs the most:

- **Billing** — wrong grace period, wrong proration, wrong refund policy
- **Auth and permissions** — wrong access model baked into every feature built on top of it
- **Onboarding** — wrong flow shapes user expectations from day one
- **Entitlements** — wrong limits on the free tier cascade into pricing conversations

Write 3-5 behaviors for each. State what must happen, what must not happen, and the edge cases. That's enough to give yourself (and your agents) a checkpoint before building.
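Following the When/Then/Invariants shape from the grace-period example above, an entitlements behavior might look like this — the specific limit is a placeholder, not a recommendation:

```markdown
## Free tier project limit

### When
A free-tier user attempts to create a project beyond the limit

### Then
- Creation is blocked with an upgrade prompt
- Existing projects stay fully accessible

### Invariants
- Free tier is limited by usage (3 projects), not by features
- Hitting the limit never deletes or locks existing data
```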

The goal isn't perfect specs. It's just enough product truth to stop building confidently in the wrong direction.

---

The PBC spec is open source at [github.com/stewie-sh/pbc-spec](https://github.com/stewie-sh/pbc-spec). You can browse example contracts in the [PBC viewer](https://pbc.stewie.sh).
