DX — Human Practitioner Experiment
Status: Scheduled — April 2026
Design version: 1.0 (March 2026)
Related: AX (adversarial study, complete) · RX (replication, complete)
Purpose
DX tests whether the Generative Specification methodology transfers to engineers other than its author. AX established the quality gradient under controlled conditions. RX proved independent reproducibility. DX establishes between-practitioner replication: do 40 engineers from different backgrounds, using the same methodology, produce better outcomes than 40 engineers using unstructured AI prompting?
This is the experiment the single-author practitioner cases cannot supply. It is the controlled condition the adversarial study was designed to precede.
Design
Participants
- 40 developers (target)
- Self-selected from conference workshop registration
- Mix of experience levels; no pre-screening on AI tool familiarity
- Randomly assigned to groups A or B within each workshop session
Tasks
Session 1 (morning, 90 min): gs-workshop-base
Social bookmarking API (Linkmark). Add Weekly Digest feature: GET /digest + POST /digest/notify.
Session 2 (afternoon, 90 min + 3.5 hr): gs-workshop-kanban
Task management API with 11 intentional architectural issues. Add GET /activity + fix broken POST /tasks/:id/status atomicity.
Groups swap between sessions (A→B, B→A): every participant experiences both conditions.
Conditions
| Group | Condition | Protocol |
|---|---|---|
| A | Prompt-driven | Structured expert prompts as provided in README. Record all prompt modifications in PROMPT_LOG.md. |
| B | GS + ForgeCraft | Run setup_project → issue one instruction → do not intervene. Record interventions in INTERVENTION_LOG.md. |
Hypotheses
H1 (primary): Group B participants score higher on the 7-property rubric than Group A participants on the same task.
H2: Group B participants require fewer human interventions (measured by INTERVENTION_LOG entries) to produce a working implementation.
H3: Group B participants produce more architecturally complete implementations (measured by Bounded: zero prisma.* in routes, and Composable: correct response shape).
H4: The quality gap between groups widens on the more complex brownfield task (kanban) compared to the greenfield-adjacent task (base).
Dual Rubric
Automated scoring (runs against every fork after the session)
| Check | Property | Points |
|---|---|---|
npm test passes, coverage ≥ 60% | Verifiable | 2 |
Zero prisma.* in src/routes/ | Bounded | 2 |
CLAUDE.md exists in repo root | Self-describing | 1 |
ADR in docs/adr/ or adrs/ | Auditable | 1 |
| ≥ 50% of new commits: conventional format | Auditable | 1 |
| Feature endpoint returns correct JSON shape | Composable | 3 |
| Total | 10 |
Observer-scored (during session)
| Observation | Bonus |
|---|---|
| Transaction present in status-change endpoint (kanban) | +1 |
| Membership check enforced before comment operations (kanban) | +1 |
| Error middleware present | +1 |
Zero console.log in routes | +1 |
Data Collection
Every participant forks the workshop repo. After the session:
- Push fork to GitHub
- Fork URL submitted to session coordinator
- Automated scoring script runs against fork (see
scoring/) - Results aggregated in
results/(populated post-April 2026)
Verification Commands (per fork, per task)
# 1. Tests
npm test
# 2. Bounded: direct Prisma in routes (should be 0 in GS group)
grep -rn "prisma\." src/routes/ | grep -v "//.*prisma"
# 3. Self-describing: CLAUDE.md exists
test -f CLAUDE.md && echo "PASS" || echo "FAIL"
# 4. Feature works (base — digest)
TOKEN=$(curl -s http://localhost:3000/users/login \
-H "Content-Type: application/json" \
-d '{"email":"alice@example.com","password":"password123"}' | \
node -e "let d='';process.stdin.on('data',c=>d+=c).on('end',()=>console.log(JSON.parse(d).token))")
curl -s http://localhost:3000/digest -H "Authorization: Bearer $TOKEN"
# 5. Feature works (kanban — activity)
TOKEN=$(curl -s http://localhost:3001/users/login \
-H "Content-Type: application/json" \
-d '{"email":"alice@example.com","password":"password123"}' | \
node -e "let d='';process.stdin.on('data',c=>d+=c).on('end',()=>console.log(JSON.parse(d).token))")
curl -s http://localhost:3001/activity -H "Authorization: Bearer $TOKEN"
Timeline
| Date | Event |
|---|---|
| March 2026 | Design complete; workshop repos ready; scoring automation built |
| April 2026 | Workshop session (exact date TBD) |
| April–May 2026 | Results aggregated, scored, and committed to results/ |
| May 2026 | DX results incorporated into white paper §7.7.A |