SDD Deep Dive: Five Tools, One Feature

Intermediate · Deck · 35 slides

The same forgot-password spec — built five different ways.

Five Tools, One Feature

Same discipline. Different hands on the pen.

From Principles to Tools

In the intro deck, four principles produced a cleaner result:

| Principle | What it means |
| --- | --- |
| Gaps before code | Surface every ambiguity before the agent generates anything |
| Spec is the contract | The agent reads the spec — not your memory or your Slack thread |
| Tests trace to spec lines | A failing test names a decision, not a code location |
| Spec is the changelog | When requirements change, the spec changes first |

Now you’ll see the same forgot-password spec — built five different ways. Each tool enforces all four principles. The pen changes hands.

What You’ll Learn

  1. The five main SDD toolchains

    Spec Kit, BMAD, Matt Pocock Skills, Superpowers, OpenSpec — what each one is and who it’s for.

  2. How to match a tool to your context

    Solo vs team, greenfield vs legacy, AI-drafted vs author-driven.

  3. The same discipline across all five

    Same four principles. Different workflow. Different hands on the pen.

The Spec We’re Building From

The feature from the intro deck. The spec is already written.

yaml
# specs/forgot-password.yaml
endpoint: POST /auth/forgot-password
version:  "1.0"

inputs:
  - name: email
    type: Email
    required: true

invariants:
  - response is identical for known and unknown emails    # no enumeration
  - max 5 requests / hour / IP
  - max 3 requests / hour / email
  - token TTL <= 15 minutes
  - token is single-use
  - token stored as SHA-256 hash, never plaintext

outputs:
  status: 202 Accepted
  body:   { ok: true }

side_effects:
  - email sent IF user exists
  - audit log: { userId?, ip, requestedAt }
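The three token invariants can be sketched in the stack the deck later chooses (TypeScript with node:crypto). This is a minimal illustration; the name issueResetToken is an assumption, not part of the spec:

```typescript
import { randomBytes, createHash } from "node:crypto";

// Sketch of token issuance honouring three invariants: TTL <= 15 minutes,
// SHA-256 hash at rest, plaintext returned exactly once and never stored.
const TOKEN_TTL_MS = 15 * 60 * 1000;

function issueResetToken(): { token: string; tokenHash: string; expiresAt: Date } {
  const token = randomBytes(32).toString("hex");                      // emailed to the user
  const tokenHash = createHash("sha256").update(token).digest("hex"); // the only thing persisted
  return { token, tokenHash, expiresAt: new Date(Date.now() + TOKEN_TTL_MS) };
}
```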

Five tools. Five paths to this same spec — and to working code that satisfies it.

Tool 1 of 5 — Spec Kit

You write the spec. Slash commands gate every phase.

github.com/github/spec-kit

Author-Driven

You write every decision. The AI executes, never guesses.

Gated Phases

Slash commands act as gates: specify → plan → tasks → implement.

Best For

Solo developers who want maximum control over every decision.

Spec Kit — Install & Constitution

bash
uv tool install specify-cli \
  --from git+https://github.com/github/spec-kit.git

specify init . --integration claude

Set the project-wide rules once with /speckit.constitution; they are prepended to every subsequent agent prompt.

plaintext
/speckit.constitution 
Governing principles for an auth service: never leak account existence in any endpoint;
all security-sensitive endpoints rate-limited at multiple keys; 
tokens must be single-use, hashed, time-bounded; 
all auth events must be audit-logged.

Spec Kit — Specify

Describe intent, not stack.

plaintext
/speckit.specify 
POST /auth/forgot-password — user requests a reset email. 
Same response for known and unknown emails (never leak existence). 
Rate limit: 5/hour/IP, 3/hour/email. Token: 15-minute TTL, single-use, SHA-256 hashed at rest. 
Audit log every request.

What the agent does next: drafts spec.md with acceptance criteria. Anywhere it sees a gap, it inserts a [NEEDS CLARIFICATION: <question>] marker rather than guessing. For critical ambiguities it may pause and ask up to 3 grouped questions before writing.

Spec Kit — Clarify

Don’t move on with [NEEDS CLARIFICATION] markers still in the spec. Run:

plaintext
/speckit.clarify

What the agent does next: runs a structured ambiguity scan over the spec, then asks up to 5 targeted clarifying questions, one at a time — multiple-choice where possible (e.g. “200 / 202 / 204?”). Each answer is written back into the spec immediately. Stops when 5 are answered or no critical gaps remain.

Spec Kit — Plan

Now describe stack, not behaviour.

plaintext
/speckit.plan 
TypeScript + Express. Prisma + PostgreSQL. 
Redis for rate-limit counters. Use express-rate-limit middleware. 
Use crypto.randomBytes for tokens. 
Use existing audit-log table and nodemailer SMTP wrapper.

What the agent does next: reads the clarified spec + the constitution, then writes plan.md mapping each spec line to a concrete implementation choice in your stack. Does not ask questions — by this point, the spec should be unambiguous.

Spec Kit — Tasks & Implement

plaintext
/speckit.tasks

What the agent does next: reads the spec + plan, generates a numbered task list — each task referencing the spec line it traces back to:

plaintext
[ ] 1. Add Zod EmailSchema       — Ref: spec acceptance line 1
[ ] 2. Add rate-limit middleware (5/h/IP)   — Ref: line 2
[ ] 3. Add per-email rate limiter (3/h/email) — Ref: line 3
[ ] 4. Migration: tokens table (tokenHash, expiresAt, usedAt) — Ref: lines 4-6
[ ] 5. Implement forgotPassword handler — Ref: spec (all)
[ ] 6. Wire audit.log in success and miss paths — Ref: line 7
[ ] 7. Tests — one per acceptance line

plaintext
/speckit.implement

What the agent does next: works through the task list in order, ticking each off. Every line of code it writes is traceable to a spec acceptance line. Tests are generated against acceptance lines, not against implementation shape.
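A spec-traceable test can be as small as this sketch. The forgotPassword stub below stands in for the generated handler; it is illustrative, not Spec Kit output:

```typescript
// Sketch of a test that traces to a spec acceptance line rather than a code
// location. forgotPassword is a stub standing in for the real handler.
type ForgotPasswordResponse = { status: number; body: { ok: boolean } };

function forgotPassword(_email: string): ForgotPasswordResponse {
  // The real handler looks up the user, enqueues the email, writes the audit
  // log; the response never varies with user existence.
  return { status: 202, body: { ok: true } };
}

// Named after the invariant it verifies (spec acceptance line 1):
const known = forgotPassword("user@example.com");
const unknown = forgotPassword("nobody@example.com");
console.assert(
  JSON.stringify(known) === JSON.stringify(unknown),
  "acceptance line 1: identical response for known and unknown emails"
);
```

When a test like this fails, the message names a spec decision, not a file and line in the implementation.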

Tool 2 of 5 — BMAD Method

AI agents draft each artifact. You review and approve between phases.

github.com/bmad-code-org/BMAD-METHOD

AI-Drafted

Analyst, PM, Architect, Developer, UX, Tech Writer agents draft each artifact.

Human-Approved

You review and approve before the next agent runs.

Best For

Teams with dedicated reviewers who want AI to draft the spec.

BMAD — Install & Meet the Agents

bash
npx bmad-method install
# Select your AI IDE — Claude Code, Cursor, Copilot, etc.

Each agent is installed as a skill. Default agents:

| Agent skill | Persona | Role |
| --- | --- | --- |
| bmad-analyst | Mary | Discovery & research |
| bmad-agent-pm | John | Product Manager |
| bmad-agent-architect | Winston | Software Architect |
| bmad-agent-dev | Amelia | Developer |
| bmad-ux-designer | Sally | UX Designer |
| bmad-tech-writer | Paige | Technical Writer |
| bmad-help | | Navigator — invoke any time |

Each agent reads only what the previous agent produced. You review every artifact before the next agent runs. The chain is the spec.

BMAD — Start With the Help Skill

After install, in a fresh chat:

plaintext
bmad-help

What the skill does next: scans your project state (which artifacts already exist, which modules are installed) and recommends the next agent and workflow. For a new feature, it will point you at bmad-agent-pm + the create-PRD workflow.

BMAD — PM Drafts the PRD

plaintext
bmad-agent-pm
Create a PRD for POST /auth/forgot-password.
Auth service. Must not leak account existence. Rate limit per IP and per email.
Tokens short-lived, single-use, hashed. Audit every request.

What the agent does next: the PM persona (John) activates and runs the create-PRD workflow. It walks you through goals → users → functional requirements → acceptance criteria, asking targeted questions where your brief was thin. The output is a PRD with acceptance criteria like:

plaintext
[ ] Identical 202 response for known and unknown emails
[ ] 5/hour/IP and 3/hour/email rate limits
[ ] Token TTL <= 15 minutes, single-use
[ ] Token stored as hash, not plaintext
[ ] Audit log on every request

You review. Approve. Move on.

BMAD — Architect Designs Against the PRD

plaintext
bmad-agent-architect
Read the PRD. Produce an architecture for forgot-password.
Stack: Express + Prisma + PostgreSQL + Redis.

What the agent does next: the Architect persona (Winston) activates, reads the PRD, and runs the create-architecture workflow. It produces the concurrency strategy, error classes, data model, and any cross-cutting decisions — asking only when a PRD acceptance line has more than one defensible design.

BMAD — Developer & Verification

plaintext
bmad-agent-dev
Implement POST /auth/forgot-password.
Read the PRD and the architecture before any code.
Every acceptance criterion must be satisfied.

What the agent does next: the Developer persona (Amelia) activates, reads both artifacts, and runs the dev-story workflow. It implements one story at a time against the PRD, generating tests against acceptance criteria. Verification then walks the criteria one by one against the implementation:

plaintext
✓  AC1  identical 202 for known and unknown — passes
✓  AC2  5/hour/IP — passes
✓  AC3  3/hour/email — passes
✓  AC4  token TTL — passes
✓  AC5  token hash storage — passes
✓  AC6  audit log on every request — passes

NFR p99 < 200ms — no load test present. Recommend k6 smoke before deploy.
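The AC6 walk checks for an audit entry on both the success and the miss path. A minimal sketch of that side effect, with an in-memory array standing in for the real audit-log table (shape follows the spec's side_effects block):

```typescript
// Audit side effect on both paths: userId is present only when the user
// lookup succeeded. The in-memory array stands in for the audit-log table.
type AuditEntry = { userId?: string; ip: string; requestedAt: Date };

const auditLog: AuditEntry[] = [];
function audit(entry: AuditEntry): void {
  auditLog.push(entry);
}

// Success path (user found) and miss path (unknown email) both log:
audit({ userId: "u_123", ip: "203.0.113.7", requestedAt: new Date() });
audit({ ip: "203.0.113.8", requestedAt: new Date() });
console.assert(auditLog.length === 2, "every request is audit-logged");
```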

Tool 3 of 5 — Matt Pocock’s Skills

Lightweight Claude Code skills. Each enforces one piece of the discipline. You compose them.

github.com/mattpocock/skills

Composable

Pick only the rungs you need. Mix with other workflows.

Lightweight

No install ceremony. npx skills@latest add mattpocock/skills

Best For

Developers who want to mix and match — some specs by you, some by AI.

Matt Pocock Skills — Install & Setup

bash
npx skills@latest add mattpocock/skills

Pick the skills you want — and make sure /setup-matt-pocock-skills is one of them. Then run it once per repo to scaffold the per-repo config (issue tracker, triage labels, docs location):

plaintext
/setup-matt-pocock-skills

Now the engineering skills are wired up:

| Command | What it does |
| --- | --- |
| /grill-with-docs | Interrogates you about a feature against the existing domain model. Output: a hardened brief plus updates to CONTEXT.md and ADRs. |
| /to-prd | Turns the conversation into a PRD and submits it as a GitHub issue. |
| /to-issues | Breaks a plan, spec, or PRD into independently-grabbable GitHub issues (vertical slices). |
| /tdd | Drives test-first implementation, red → green → refactor. |
| /diagnose | Disciplined diagnosis loop for hard bugs: reproduce → minimise → hypothesise → instrument → fix. |

Matt Pocock Skills — Grill the Spec

The clarify phase as a skill.

plaintext
/grill-with-docs Build POST /auth/forgot-password — email, link, done.

What the skill does next (quoting the SKILL.md): “Interview me relentlessly about every aspect of this plan until we reach a shared understanding. Walk down each branch of the design tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer. Ask the questions one at a time, waiting for feedback on each question before continuing.”

It also reads CONTEXT.md and any ADRs in your repo first, then sharpens the conversation against that vocabulary. Expect questions like:

plaintext
1.  What response for unknown emails? (any inconsistency leaks existence)
2.  Per-IP rate limit, or per-email, or both?
3.  Token TTL? Single-use, or re-usable until expiry?
4.  Plaintext or hashed at rest?
5.  Audit log scope — only user-found, or every request?
6.  HTTP status — 200, 202, 204?
7.  What does the email body say if user doesn't exist?

Matt Pocock Skills — TDD the Implementation

Spec hardened. Now drive the build.

plaintext
/tdd Implement POST /auth/forgot-password against the brief above. Test first. Smallest possible step.

What the skill does next: enforces vertical-slice TDD — one acceptance line at a time, not all tests up front. The SKILL.md explicitly bans the “horizontal slice” of writing all tests then all code, because bulk-written tests test imagined behaviour instead of actual behaviour.

  1. Red

    Write one failing integration-style test for one acceptance line — testing behaviour through the public interface, not implementation details.

  2. Green

    Implement the minimum to pass it.

  3. Refactor

    Only after green, never before.

If a bug surfaces, /diagnose runs the structured debugging loop (reproduce → minimise → hypothesise → instrument → fix → regression-test) instead of patching blind.
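One vertical slice of that loop, sketched in TypeScript. The in-memory sliding-window counter is an assumption standing in for Redis; a real slice would exercise the middleware through the public route:

```typescript
// One acceptance line driven red -> green: "max 5 requests / hour / IP".
// In-memory sliding-window counter stands in for Redis (illustration only).
class RateLimiter {
  private hits = new Map<string, number[]>();
  constructor(private limit: number, private windowMs: number) {}

  allow(key: string, now = Date.now()): boolean {
    const recent = (this.hits.get(key) ?? []).filter(t => now - t < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // over the limit inside the window
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}

// The red test came first; the class above is the minimum that turns it green.
const perIp = new RateLimiter(5, 60 * 60 * 1000);
for (let i = 0; i < 5; i++) {
  console.assert(perIp.allow("203.0.113.7"), "requests 1-5 allowed");
}
console.assert(!perIp.allow("203.0.113.7"), "request 6 within the hour rejected");
```

The next slice (3/hour/email) gets its own failing test before any further code is written.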

Tool 4 of 5 — Superpowers

A skills library that auto-activates the right discipline at the right moment. No slash commands to remember.

github.com/obra/superpowers

Auto-Activated

Skills trigger automatically by context — brainstorming before code, TDD during build, review between tasks.

Mandatory Workflows

“Mandatory workflows, not suggestions.” The agent checks for relevant skills before any task.

Best For

Reviewer-first culture where you want the discipline baked in by default.

Superpowers — Install

The plugin installs via the Claude Code plugin marketplace.

plaintext
/plugin install superpowers@claude-plugins-official

Other agents have their own marketplaces — Codex CLI, Cursor, and Copilot CLI are documented in the README.

Once installed, the agent has the skills. There is no /brainstorm or /execute to type — the workflow activates from conversation context.

Superpowers — The Workflow Skills

| Skill | Activates when |
| --- | --- |
| brainstorming | You start describing a feature, before any code |
| writing-plans | A design has been agreed |
| subagent-driven-development | A plan exists and execution begins |
| executing-plans | Batch-running approved plan steps |
| test-driven-development | During implementation |
| requesting-code-review | Between tasks |
| verification-before-completion | Before claiming done |

The skill set covers the full loop from a fuzzy idea to verified code.

Superpowers — In Practice

Conversation, not commands:

plaintext
> Need a forgot-password endpoint. Auth service.

What the skill does next (this is from the brainstorming SKILL.md checklist):

  1. Explore project context

    Reads files, docs, recent commits — grounds itself in your codebase before anything else.

  2. Ask clarifying questions, one at a time

    Purpose, constraints, success criteria. Never bulk questions.

  3. Propose 2–3 approaches with tradeoffs

    Each with a recommendation.

  4. Present the design, get approval section by section

    Cannot proceed without it.

  5. Write design doc

    Saved to docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md and committed.

Superpowers — The Output

What you get back from brainstorming for the forgot-password endpoint:

plaintext
Options to decide:
  - Response shape: identical (no leak) vs explicit (UX clearer)
  - Rate limit: IP only / email only / both / sliding window
  - Token storage: plaintext (fast) vs hashed (safe)
  - TTL: 5 / 15 / 60 minutes — security vs UX tradeoff

Risks flagged:
  - CWE-203 (account enumeration) if responses differ
  - SMTP exhaustion if no email-level rate limit
  - Token replay if no usedAt
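The "token replay if no usedAt" risk is the single-use invariant. A sketch of the consume-once check, with an in-memory row standing in for the database record (field names are assumptions):

```typescript
// Single-use check: a token row is consumed exactly once by stamping usedAt.
// TokenRow is an assumed shape standing in for the database record.
type TokenRow = { tokenHash: string; expiresAt: Date; usedAt: Date | null };

function consumeToken(row: TokenRow, now = new Date()): boolean {
  if (row.usedAt !== null) return false; // replay: already consumed
  if (now > row.expiresAt) return false; // past the TTL
  row.usedAt = now;                      // burn it on first use
  return true;
}

const row: TokenRow = { tokenHash: "ab12", expiresAt: new Date(Date.now() + 60_000), usedAt: null };
console.assert(consumeToken(row) === true, "first use succeeds");
console.assert(consumeToken(row) === false, "replay is rejected");
```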

Tool 5 of 5 — OpenSpec

Delta-spec for brownfield code. You don’t spec the whole system — you spec what’s changing.

github.com/Fission-AI/OpenSpec

Brownfield-First

No need to spec the whole system. Start with the delta.

Change as Changelog

Each delta becomes a permanent record of what changed and why.

Best For

Legacy codebases with no specs at all. Start anywhere.

OpenSpec — Install & Init

bash
npm install -g @fission-ai/openspec@latest
openspec init

| Greenfield SDD | OpenSpec |
| --- | --- |
| Spec the whole feature | Spec only what's changing |
| Full spec before code | Change proposal IS the spec |
| New codebase | Current code is the baseline |
| | Old changes archived as changelog |

For an existing forgot-password endpoint with no rate limit, no TTL, and no audit log, you don't write forgot-password.yaml from scratch. You write a change.

OpenSpec — Propose a Change

plaintext
/opsx:propose add-forgot-password-hardening

What the command does next: the agent reads your existing code, infers the current behaviour, and generates a complete proposal folder at openspec/changes/add-forgot-password-hardening/:

| File | Purpose |
| --- | --- |
| proposal.md | why we're doing this, what's changing |
| specs/ | requirements and scenarios |
| design.md | technical approach |
| tasks.md | implementation checklist |

The flow is generation-first, not interview-first — you review and edit the artifacts, then run /opsx:apply when satisfied.

OpenSpec — Inside the Proposal

Looking at real proposals in the OpenSpec repo, proposal.md follows a four-section structure:

markdown
# Harden POST /auth/forgot-password

## Why
Current implementation has no rate limit, leaks account existence,
and stores plaintext tokens. Security review flagged all three.

## What Changes
- identical 202 response for known and unknown emails
- 5/hour/IP and 3/hour/email rate limits
- token TTL = 15 minutes, single-use (usedAt set on first use)
- token stored as SHA-256 hash
- audit log on every request

## Capabilities
- Rate limiting (per-IP, per-email)
- Token hashing and TTL enforcement
- Audit logging across success and miss paths

## Impact
- Existing tests must still pass
- 7 new tests required for the new invariants
- No breaking changes to email template, SMTP wrapper, or route shape

OpenSpec — Apply & Archive

plaintext
/opsx:apply

What the command does next: walks the tasks.md checklist for the active change, ticking each item off as the code edits land. From the README’s worked example:

text
You: /opsx:apply
AI:  Implementing tasks...
     ✓ 1.1 Add rate-limit middleware (per-IP)
     ✓ 1.2 Add rate-limit middleware (per-email)
     ✓ 2.1 Migrate tokens table — add tokenHash, expiresAt, usedAt
     ✓ 2.2 Hash tokens with SHA-256 on create
     ✓ 3.1 Wire audit.log into success and miss paths
     All tasks complete!

Once green, archive the change so it becomes part of the changelog:

plaintext
/opsx:archive

OpenSpec — Expanded Workflow (Optional)

The README documents an expanded set of commands behind a profile switch:

bash
openspec config profile   # select the expanded profile
openspec update           # apply the new slash commands

| Command | Purpose |
| --- | --- |
| /opsx:new | Start a new change (more deliberate than propose) |
| /opsx:continue | Resume work on an existing change |
| /opsx:ff | Fast-forward progress |
| /opsx:verify | Check implementation against specs |
| /opsx:bulk-archive | Archive multiple changes at once |
| /opsx:onboard | Team workflow setup |

For most flows the three-command core (propose → apply → archive) is enough.

Side By Side — Five Paths

| Phase | Spec Kit | BMAD | Matt Pocock | Superpowers | OpenSpec |
| --- | --- | --- | --- | --- | --- |
| Clarify | /speckit.specify + /speckit.clarify | bmad-agent-pm PRD interview | /grill-with-docs | brainstorming skill | Existing code + "What's changing?" |
| Spec | spec.md | PRD | Brief transcript | Design doc + plan | proposal.md + specs/ |
| Plan | /speckit.plan | bmad-agent-architect | (in /tdd prompt) | writing-plans skill | design.md + tasks.md |
| Implement | /speckit.implement | bmad-agent-dev | /tdd | subagent-driven-development | /opsx:apply |
| Verify | Tests from spec | Acceptance walk-through | /diagnose | verification-before-completion | /opsx:verify (expanded profile) |
| Asks questions? | Yes — /clarify asks ≤5, one at a time | Yes — PM/Architect walk you through | Yes — /grill-with-docs interviews | Yes — brainstorming one at a time | No — generates from existing code |
| Spec author | You | AI (PM agent) | You + AI grilling | You + AI brainstorm | You (change only) |

The same discipline closes every column: code traces to a spec, tests trace to spec lines, failures name invariants.

Which One For You?

Spec Kit

You want to own every decision yourself. Solo, maximum control.

BMAD

Your team has dedicated reviewers and you want AI to draft the artifacts.

Matt Pocock's Skills

You want to mix and match — some specs by you, some by AI, all light.

Superpowers

You want the discipline auto-applied without remembering commands.

OpenSpec

You’re working in a legacy codebase with no specs at all.

Five Tools, One Discipline

| Tool | Best for | Principle it most visibly enforces |
| --- | --- | --- |
| Spec Kit | Solo, max control | Gaps before code — you close every gap yourself |
| BMAD | Teams with dedicated reviewers | Spec is the contract — AI drafts, human signs off |
| Matt Pocock Skills | Lightweight, composable flow | Tests trace to spec — TDD-first loop built in |
| Superpowers | Reviewer-first culture | Spec is the contract — skills auto-apply the discipline |
| OpenSpec | Legacy codebases | Spec is the changelog — the change proposal IS the spec |

Resources

| Tool | Repository |
| --- | --- |
| Spec Kit | github.com/github/spec-kit |
| BMAD Method | github.com/bmad-code-org/BMAD-METHOD |
| Matt Pocock's Skills | github.com/mattpocock/skills |
| Superpowers | github.com/obra/superpowers |
| OpenSpec | github.com/Fission-AI/OpenSpec |