Philosophy — plan once, execute cheap

Spec-driven development (SDD) with AI agents has two distinct phases that people conflate:

Thinking. Reading code, weighing tradeoffs, deciding what to change and why. This is expensive per token and benefits from the best model you have.
Typing. Applying the plan: edit files, run tests, fix typos, wire things up. This is verbose and repetitive. A weak model can do it — if the plan is concrete enough.

Squad-kit is built around the observation that most SDD tooling fails to separate these. In that tooling, every “implement” turn reloads the planner-level context, reruns synthesis, and re-reads meta-artifacts the executor does not need, so you pay top-tier token prices for typing work. Keeping the two phases separate is the whole product thesis; see Getting started for the concrete commands.

The rule

Plan once. Execute cheap.

One expensive session produces NN-story-<slug>.md.
That file is the contract: paths, line ranges, signatures, verification commands, done criteria.
Implementation sessions attach only that file. No meta-prompt reload. No cross-artifact consistency checks.
If the plan is wrong, fix the plan. If the executor is wrong, tighten the plan next time.

Token math

Rough numbers from a real story in a production repo:

Phase	Squad-kit context on turn 1	Spec-Kit context on turn 1
Plan generation	intake (~2 KB) + meta-prompt (~5 KB) + repo files the planner chooses to read	`spec.md` + `plan.md` template (~4 KB) + constitution (~2 KB) + `/plan` orchestration (~4 KB) + model-driven research
Implementation	the plan file (~5–15 KB), nothing else	`/implement` template (~13 KB) + `tasks.md` + `plan.md` + `data-model.md` + `contracts/` + `research.md` + `quickstart.md`

The implementation delta is what matters: you run that loop dozens of times per feature. Five extra kilobytes of boilerplate loaded 40 times is 200 KB of wasted cache and tokens. It gets worse once you factor in that the cheap executor pays the same per-token rate as the expensive planner when those tokens sit in context.
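The arithmetic above can be sketched in a few lines. This is an illustration only: the helper name is made up, and the conversion to tokens assumes roughly 4 bytes per token and 1 KB = 1000 bytes, a rule of thumb that varies by tokenizer and content.

```typescript
// Hypothetical helper: total overhead from reloading fixed boilerplate on every turn.
function reloadedOverheadKb(boilerplateKbPerTurn: number, turns: number): number {
  if (boilerplateKbPerTurn < 0 || turns < 0) {
    throw new RangeError("inputs must be non-negative");
  }
  return boilerplateKbPerTurn * turns;
}

// The example from the text: 5 KB of boilerplate across 40 implementation turns.
const wastedKb: number = reloadedOverheadKb(5, 40); // 200 KB

// ASSUMPTION: ~4 bytes per token and 1 KB = 1000 bytes. Real tokenizers differ.
const ASSUMED_BYTES_PER_TOKEN = 4;
const wastedTokens: number = Math.round((wastedKb * 1000) / ASSUMED_BYTES_PER_TOKEN);
```

Under those assumptions the 200 KB comes to about 50,000 tokens per feature, before any per-token pricing is applied.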

Where the direct planner fits

The direct planner (squad new-plan --api) does not violate “plan once, execute cheap”:

The expensive model still plans once per story. squad-kit is the transport (API calls, tool loop, file writer), not a second “thinker” in the loop.
Context is demand-driven: the planner requests files through a bounded tool loop (read_file, list_dir) with a budget in config. There is no blind full-repo slurp.
The output is still one plan file the executor session attaches — same contract as the in-agent path.
The switch is about shortening the path from a reviewed intake to that plan on disk, not about adding another expensive planning pass.
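To make “demand-driven with a budget” concrete, here is a minimal sketch of a budget-enforced file reader. It is not squad-kit's implementation: the class, field names, and result shape are hypothetical, the repository is an in-memory map so the example runs without disk access, and string length stands in for byte size.

```typescript
type ReadStatus = "ok" | "not_found" | "budget_exceeded";

interface ReadResult {
  status: ReadStatus;
  content: string;        // empty unless status is "ok"
  remainingBytes: number; // budget left after this call
}

// Hypothetical reader: serves files on request until the budget is spent.
class BudgetedReader {
  private usedBytes = 0;
  private readonly files: Map<string, string>;
  private readonly budgetBytes: number;

  constructor(files: Map<string, string>, budgetBytes: number) {
    this.files = files;
    this.budgetBytes = budgetBytes;
  }

  readFile(path: string): ReadResult {
    const remaining = this.budgetBytes - this.usedBytes;
    const content = this.files.get(path);
    if (content === undefined) {
      return { status: "not_found", content: "", remainingBytes: remaining };
    }
    // Refuse the read rather than truncate, so the planner never sees partial files.
    if (content.length > remaining) {
      return { status: "budget_exceeded", content: "", remainingBytes: remaining };
    }
    this.usedBytes += content.length;
    return { status: "ok", content, remainingBytes: this.budgetBytes - this.usedBytes };
  }
}

// Usage with made-up file names and a 10-byte budget.
const demoRepo = new Map<string, string>([
  ["a.ts", "abcdef"],     // 6 characters
  ["b.ts", "0123456789"], // 10 characters
]);
const demoReader = new BudgetedReader(demoRepo, 10);
const firstRead = demoReader.readFile("a.ts");  // fits: 4 bytes remain
const secondRead = demoReader.readFile("b.ts"); // refused: exceeds the remaining budget
```

The point of the design is that the refusal happens in the transport layer, so the budget holds regardless of how many files the model asks for.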

	In-agent (`/squad-plan`)	Direct (`squad new-plan --api`)
Where planning happens	Inside the agent session you already use.	In the terminal, via the provider API.
Who feeds context	The agent (may over-read).	squad-kit (budget-enforced).
Credentials	Agent’s existing credentials.	`.squad/secrets.yaml` or a provider env var.
Best when	You already have a capable agent open.	You want one-shot CLI output without changing tools.
Cost shape	Same planner tier, agent overhead on top.	Same planner tier, no agent session overhead.

In both cases the “expensive” work is a single planning pass per story. The direct path simply avoids an agent session and lets squad-kit enforce read budgets the agent might ignore. The executor step stays identical: it still ingests one NN-story-*.md and nothing else.

What makes a plan “concrete enough”?

Every task in a squad-kit plan meets this bar:

A file path (or Create file:) so the executor knows exactly where to edit.
A symbol, line range, or regex when the change is in-place.
Type signatures or DTOs when adding new structures, in language-tagged code fences.
A verification command at the end: what to run, what passing looks like.
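The bar above can be expressed as a small check. The sketch below is illustrative only: squad-kit plans are Markdown files, and the `PlanTask` shape, its field names, and the example paths and commands are assumptions made for this example, not a format the tool defines.

```typescript
// Hypothetical shape for one plan task; field names are illustrative.
interface PlanTask {
  filePath: string;      // path to edit, or the path of a "Create file:" task
  createsFile: boolean;  // true when the task creates the file
  locator?: string;      // symbol, line range, or regex for in-place changes
  signatures?: string[]; // type signatures or DTOs, only when adding new structures
  verifyCommand: string; // what to run
  passCriteria: string;  // what passing looks like
}

// Returns the list of details a task still lacks; an empty list means it meets the bar.
function missingDetails(task: PlanTask): string[] {
  const missing: string[] = [];
  if (task.filePath.trim() === "") missing.push("file path");
  const hasLocator = task.locator !== undefined && task.locator.trim() !== "";
  // In-place edits need a locator; newly created files do not.
  if (!task.createsFile && !hasLocator) missing.push("symbol, line range, or regex");
  if (task.verifyCommand.trim() === "") missing.push("verification command");
  if (task.passCriteria.trim() === "") missing.push("passing criteria");
  return missing;
}

// Made-up examples: one concrete task, one that is only vague guidance.
const concreteTask: PlanTask = {
  filePath: "src/orders/service.ts",
  createsFile: false,
  locator: "function createOrder",
  verifyCommand: "npm test",
  passCriteria: "all tests pass",
};

const vagueTask: PlanTask = {
  filePath: "",
  createsFile: false,
  verifyCommand: "",
  passCriteria: "",
};
```

Here `missingDetails(concreteTask)` returns an empty array, while `missingDetails(vagueTask)` lists all four gaps, which is the signal that the item is still a planning decision rather than a task.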

Vague guidance (“consider introducing a service layer”) does not belong here. That is a planning decision and belongs in the plan before it becomes a task.

What squad-kit gives up

The in-agent and direct planner paths share the same bundled generate-plan.md rules; they differ only in who reads the repo and how reads are bounded. Neither path adds a second “planning pass” before execution: the implementation turn still starts from a single artifact.

Spec-Kit’s /clarify and /analyze catch planning mistakes before implementation. Squad-kit does not have those. The tradeoff:

You trust your planner. If the planning model is weak, the plan is weak, and no squad-kit command saves you.
Planning is a single session, human-reviewed. That review replaces /clarify.
Cross-artifact consistency is unnecessary when there is only one artifact.

This is a deliberate choice, not an oversight. If you want safety nets, Spec-Kit is the right tool. For a feature-level comparison, see squad-kit vs Spec-Kit.

Why Markdown prompts, not embedded logic

Squad-kit’s default planning rules live in three Markdown files shipped inside the npm package (templates/prompts/). Your project conventions — verification commands, product rules, acceptance criteria — belong in intakes and plans under .squad/stories/ and .squad/plans/, which you own and commit.

A fork is the supported customisation path for changing those three files. The 0.1.x .squad/prompts/ override was removed in 0.2.0 because silent version drift between user copies and CLI behaviour was breaking workflows. The new contract: the CLI you installed defines the prompts — upgrade the package to pick up template changes, or maintain a fork.
