diff --git a/TODOS.md b/TODOS.md index 72e3d023e..d5dd6f482 100644 --- a/TODOS.md +++ b/TODOS.md @@ -582,7 +582,24 @@ reads it yet. **Effort:** L (human: ~1 week / CC: ~4h) **Priority:** P0 -**Depends on:** 2+ weeks of v1 dogfood, profile diversity check passing. +**Depends on:** **90+ days of v1 dogfood stable across 3+ skills** (per +`docs/designs/PLAN_TUNING_V0.md` §"Deferred to v2" E1 acceptance criteria). +Distinct from the lighter-weight diversity-display gate +(`sample_size >= 20 AND skills_covered >= 3 AND question_ids_covered >= 8 +AND days_span >= 7`) used in /plan-tune to render the inferred column — +display is a UI affordance, promotion to E1 needs a much higher bar +because behavioral adaptation is consequential and hard to revert. Prior +versions of this card cited "2+ weeks" which conflicted with V0 — V0 wins. + +**Substrate risk (Codex outside-voice, Phase A review 2026-05-26):** Generated +skill prose is agent-compliance-based. Tests can verify templates contain the +right reads of `~/.gstack/developer-profile.json` and the right decision +points, but tests cannot prove agents obey them at runtime. E1 ships +adaptations as **advisory annotations on AskUserQuestion recommendations** +("Recommended via your profile: ") until there's a hard runtime +execution path. Do NOT gate any AUTO_DECIDE on inferred profile alone in v1 +of E1; explicit per-question preferences remain the only AUTO_DECIDE +source. ### E3 — `/plan-tune narrative` + `/plan-tune vibe` diff --git a/plan-tune/SKILL.md b/plan-tune/SKILL.md index f56fb5b43..526055c85 100644 --- a/plan-tune/SKILL.md +++ b/plan-tune/SKILL.md @@ -698,50 +698,77 @@ Canonical reference: `docs/designs/PLAN_TUNING_V0.md`. ## Step 0: Detect what the user wants -Read the user's message. Route based on plain-English intent, not keywords: +Read the user's message. Route based on plain-English intent, not keywords. -1. **First-time use** (config says `question_tuning` is not yet set to `true`) → - run `Enable + setup` below. -2. **"Show my profile" / "what do you know about me" / "show my vibe"** → +**Implicit gates run first** (before user-intent routing). These exist so first-time +users see the consent prompt and so explicit opt-ins eventually run the 5-Q setup. +Each gate is guarded by a marker file so the user is asked at most once. + +1. **Consent gate.** If `question_tuning` is `false` AND + `~/.gstack/.question-tuning-prompted` is missing → run `Consent + opt-in` + below. Honor the answer with a marker write either way; do not re-prompt. +2. **Setup gate.** If `question_tuning` is `true` AND + `~/.gstack/developer-profile.json`'s `declared` object is empty AND + `~/.gstack/.declared-setup-prompted` is missing → run `5-Q setup` below. + Touch the marker after setup completes OR is declined. + +When neither implicit gate fires, route by user intent: + +3. **"Show my profile" / "what do you know about me" / "show my vibe"** → run `Inspect profile`. -3. **"Review questions" / "what have I been asked" / "show recent"** → +4. **"Review questions" / "what have I been asked" / "show recent"** → run `Review question log`. -4. **"Stop asking me about X" / "never ask about Y" / "tune: ..."** → +5. **"Stop asking me about X" / "never ask about Y" / "tune: ..."** → run `Set a preference`. -5. **"Update my profile" / "I'm more boil-the-ocean than that" / "I've changed +6. **"Update my profile" / "I'm more boil-the-ocean than that" / "I've changed my mind"** → run `Edit declared profile` (confirm before writing). -6. **"Show the gap" / "how far off is my profile"** → run `Show gap`. -7. **"Turn it off" / "disable"** → `~/.claude/skills/gstack/bin/gstack-config set question_tuning false` -8. **"Turn it on" / "enable"** → `~/.claude/skills/gstack/bin/gstack-config set question_tuning true` -9. **Clear ambiguity** — if you can't tell what the user wants, ask plainly: - "Do you want to (a) see your profile, (b) review recent questions, (c) set - a preference, (d) update your declared profile, or (e) turn it off?" +7. **"Show the gap" / "how far off is my profile"** → run `Show gap`. +8. **"Turn it off" / "disable"** → `~/.claude/skills/gstack/bin/gstack-config set question_tuning false` +9. **"Turn it on" / "enable"** → `~/.claude/skills/gstack/bin/gstack-config set question_tuning true && touch ~/.gstack/.question-tuning-prompted` +10. **Clear ambiguity** — if you can't tell what the user wants, ask plainly: + "Do you want to (a) see your profile, (b) review recent questions, (c) set + a preference, (d) update your declared profile, or (e) turn it off?" Power-user shortcuts (one-word invocations) — handle these too: `profile`, `vibe`, `gap`, `stats`, `review`, `enable`, `disable`, `setup`. --- -## Enable + setup (first-time flow) +## Consent + opt-in -**When this fires.** The user invokes `/plan-tune` and the preamble shows -`QUESTION_TUNING: false` (the default). +**When this fires.** Step 0's consent gate: `question_tuning` is `false` AND +`~/.gstack/.question-tuning-prompted` is missing. The user has never been +asked. + +**Privacy note.** gstack defaults `question_tuning` to `false` for every user. +There is no auto-flip for any cohort. The consent prompt is the only path to +enabling, and the answer is honored with a marker file so the user is never +re-asked. Contributors are not auto-enrolled (see +`docs/designs/PLAN_TUNING_V1.md` §"Decisions log" for the privacy posture +rationale). If the user is a contributor (`gstack_contributor: true`), the +prompt can mention it as additional context, but the decision is still +explicit. **Flow:** -1. Read the current state: +1. Detect contributor state (for prompt framing only, not for auto-action): ```bash _QT=$(~/.claude/skills/gstack/bin/gstack-config get question_tuning 2>/dev/null || echo "false") + _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || echo "false") echo "QUESTION_TUNING: $_QT" + echo "CONTRIBUTOR: $_CONTRIB" ``` -2. If `false`, use AskUserQuestion: +2. AskUserQuestion (use the contributor-specific framing only if `_CONTRIB=true`, + otherwise use the general framing): + **General framing:** > Question tuning is off. gstack can learn which of its prompts you find > valuable vs noisy — so over time, gstack stops asking questions you've > already answered the same way. It takes about 2 minutes to set up your > initial profile. v1 is observational: gstack tracks your preferences > and shows you a profile, but doesn't silently change skill behavior yet. + > Logs stay local (`~/.gstack/projects//question-log.jsonl`). > > RECOMMENDATION: Enable and set up your profile. Completeness: A=9/10. > @@ -749,13 +776,47 @@ Power-user shortcuts (one-word invocations) — handle these too: > B) Enable but skip setup (I'll fill it in later) > C) Cancel — I'm not ready -3. If A or B: enable: + **Contributor framing (only if `_CONTRIB=true`):** + > You're a gstack contributor. Question tuning isn't on by default for + > anyone, but contributors are the cohort whose data most helps v2 work + > (skills adapting to your steering style). Enabling logs every + > AskUserQuestion outcome locally to + > `~/.gstack/projects//question-log.jsonl` — nothing leaves your + > machine. v1 is observational only. + > + > RECOMMENDATION: Enable and set up your profile. Completeness: A=9/10. + > + > A) Enable + set up (recommended for contributors, ~2 min) + > B) Enable but skip setup (I'll fill it in later) + > C) Cancel — I'm not ready + +3. ALWAYS touch the marker, regardless of choice: + ```bash + touch ~/.gstack/.question-tuning-prompted + ``` + +4. If A or B: enable: ```bash ~/.claude/skills/gstack/bin/gstack-config set question_tuning true ``` -4. If A (full setup), ask FIVE one-per-dimension declaration questions via - individual AskUserQuestion calls (one at a time). Use plain English, no jargon: +5. If C: do nothing else. Tell the user: "Question tuning stays off. Re-enable + any time with `/plan-tune enable` or `gstack-config set question_tuning true`." + +## 5-Q setup (post-consent, or via Setup gate) + +**When this fires.** Two paths: +- Right after the consent prompt above accepts option A. +- Standalone via Step 0's setup gate: `question_tuning` is already `true` + (user opted in via gstack-config or earlier `/plan-tune enable`) AND + `declared` is empty AND `~/.gstack/.declared-setup-prompted` is missing. + This catches users who set `question_tuning: true` directly without + running the wizard. + +**Flow:** + +1. Ask FIVE one-per-dimension declaration questions via individual + AskUserQuestion calls (one at a time). Use plain English, no jargon: **Q1 — scope_appetite:** "When you're planning a feature, do you lean toward shipping the smallest useful version fast, or building the complete, edge- @@ -808,10 +869,18 @@ Power-user shortcuts (one-word invocations) — handle these too: " ``` -5. Tell the user: "Profile set. Question tuning is now on. Use `/plan-tune` +2. Touch the marker so the Setup gate doesn't re-fire: + ```bash + touch ~/.gstack/.declared-setup-prompted + ``` + Touch it even if the user bails out partway — they were asked; they chose + not to complete. The Setup gate respects that. They can rerun the 5-Q + anytime with `/plan-tune setup` (Step 0 power-user shortcut). + +3. Tell the user: "Profile set. Question tuning is on. Use `/plan-tune` again any time to inspect, adjust, or turn it off." -6. Show the profile inline as a confirmation (see `Inspect profile` below). +4. Show the profile inline as a confirmation (see `Inspect profile` below). --- @@ -832,12 +901,18 @@ Parse the JSON. Present in **plain English**, not raw floats: Format: "**scope_appetite:** 0.8 (boil the ocean — you prefer the complete version with edge cases covered)" -- If `inferred.diversity` passes the calibration gate (`sample_size >= 20 AND +- If `inferred.diversity` passes the **display gate** (`sample_size >= 20 AND skills_covered >= 3 AND question_ids_covered >= 8 AND days_span >= 7`), show the inferred column next to declared: "**scope_appetite:** declared 0.8 (boil the ocean) ↔ observed 0.72 (close)" Use words for the gap: 0.0-0.1 "close", 0.1-0.3 "drift", 0.3+ "mismatch". + This display gate is intentionally lower than the E1 **promotion gate** + (90+ days stable across 3+ skills, per `docs/designs/PLAN_TUNING_V0.md`). + Displaying inferred values is a UI affordance; shipping behavior-adapting + defaults based on the profile is consequential and needs a much higher + bar. Do NOT use the display gate as a green light for v2 E1 work. + - If the calibration gate isn't met, say: "Not enough observed data yet — need N more events across M more skills before we can show your observed profile." diff --git a/plan-tune/SKILL.md.tmpl b/plan-tune/SKILL.md.tmpl index 70f444679..2d71dffe9 100644 --- a/plan-tune/SKILL.md.tmpl +++ b/plan-tune/SKILL.md.tmpl @@ -52,50 +52,77 @@ Canonical reference: `docs/designs/PLAN_TUNING_V0.md`. ## Step 0: Detect what the user wants -Read the user's message. Route based on plain-English intent, not keywords: +Read the user's message. Route based on plain-English intent, not keywords. -1. **First-time use** (config says `question_tuning` is not yet set to `true`) → - run `Enable + setup` below. -2. **"Show my profile" / "what do you know about me" / "show my vibe"** → +**Implicit gates run first** (before user-intent routing). These exist so first-time +users see the consent prompt and so explicit opt-ins eventually run the 5-Q setup. +Each gate is guarded by a marker file so the user is asked at most once. + +1. **Consent gate.** If `question_tuning` is `false` AND + `~/.gstack/.question-tuning-prompted` is missing → run `Consent + opt-in` + below. Honor the answer with a marker write either way; do not re-prompt. +2. **Setup gate.** If `question_tuning` is `true` AND + `~/.gstack/developer-profile.json`'s `declared` object is empty AND + `~/.gstack/.declared-setup-prompted` is missing → run `5-Q setup` below. + Touch the marker after setup completes OR is declined. + +When neither implicit gate fires, route by user intent: + +3. **"Show my profile" / "what do you know about me" / "show my vibe"** → run `Inspect profile`. -3. **"Review questions" / "what have I been asked" / "show recent"** → +4. **"Review questions" / "what have I been asked" / "show recent"** → run `Review question log`. -4. **"Stop asking me about X" / "never ask about Y" / "tune: ..."** → +5. **"Stop asking me about X" / "never ask about Y" / "tune: ..."** → run `Set a preference`. -5. **"Update my profile" / "I'm more boil-the-ocean than that" / "I've changed +6. **"Update my profile" / "I'm more boil-the-ocean than that" / "I've changed my mind"** → run `Edit declared profile` (confirm before writing). -6. **"Show the gap" / "how far off is my profile"** → run `Show gap`. -7. **"Turn it off" / "disable"** → `~/.claude/skills/gstack/bin/gstack-config set question_tuning false` -8. **"Turn it on" / "enable"** → `~/.claude/skills/gstack/bin/gstack-config set question_tuning true` -9. **Clear ambiguity** — if you can't tell what the user wants, ask plainly: - "Do you want to (a) see your profile, (b) review recent questions, (c) set - a preference, (d) update your declared profile, or (e) turn it off?" +7. **"Show the gap" / "how far off is my profile"** → run `Show gap`. +8. **"Turn it off" / "disable"** → `~/.claude/skills/gstack/bin/gstack-config set question_tuning false` +9. **"Turn it on" / "enable"** → `~/.claude/skills/gstack/bin/gstack-config set question_tuning true && touch ~/.gstack/.question-tuning-prompted` +10. **Clear ambiguity** — if you can't tell what the user wants, ask plainly: + "Do you want to (a) see your profile, (b) review recent questions, (c) set + a preference, (d) update your declared profile, or (e) turn it off?" Power-user shortcuts (one-word invocations) — handle these too: `profile`, `vibe`, `gap`, `stats`, `review`, `enable`, `disable`, `setup`. --- -## Enable + setup (first-time flow) +## Consent + opt-in -**When this fires.** The user invokes `/plan-tune` and the preamble shows -`QUESTION_TUNING: false` (the default). +**When this fires.** Step 0's consent gate: `question_tuning` is `false` AND +`~/.gstack/.question-tuning-prompted` is missing. The user has never been +asked. + +**Privacy note.** gstack defaults `question_tuning` to `false` for every user. +There is no auto-flip for any cohort. The consent prompt is the only path to +enabling, and the answer is honored with a marker file so the user is never +re-asked. Contributors are not auto-enrolled (see +`docs/designs/PLAN_TUNING_V1.md` §"Decisions log" for the privacy posture +rationale). If the user is a contributor (`gstack_contributor: true`), the +prompt can mention it as additional context, but the decision is still +explicit. **Flow:** -1. Read the current state: +1. Detect contributor state (for prompt framing only, not for auto-action): ```bash _QT=$(~/.claude/skills/gstack/bin/gstack-config get question_tuning 2>/dev/null || echo "false") + _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || echo "false") echo "QUESTION_TUNING: $_QT" + echo "CONTRIBUTOR: $_CONTRIB" ``` -2. If `false`, use AskUserQuestion: +2. AskUserQuestion (use the contributor-specific framing only if `_CONTRIB=true`, + otherwise use the general framing): + **General framing:** > Question tuning is off. gstack can learn which of its prompts you find > valuable vs noisy — so over time, gstack stops asking questions you've > already answered the same way. It takes about 2 minutes to set up your > initial profile. v1 is observational: gstack tracks your preferences > and shows you a profile, but doesn't silently change skill behavior yet. + > Logs stay local (`~/.gstack/projects//question-log.jsonl`). > > RECOMMENDATION: Enable and set up your profile. Completeness: A=9/10. > @@ -103,13 +130,47 @@ Power-user shortcuts (one-word invocations) — handle these too: > B) Enable but skip setup (I'll fill it in later) > C) Cancel — I'm not ready -3. If A or B: enable: + **Contributor framing (only if `_CONTRIB=true`):** + > You're a gstack contributor. Question tuning isn't on by default for + > anyone, but contributors are the cohort whose data most helps v2 work + > (skills adapting to your steering style). Enabling logs every + > AskUserQuestion outcome locally to + > `~/.gstack/projects//question-log.jsonl` — nothing leaves your + > machine. v1 is observational only. + > + > RECOMMENDATION: Enable and set up your profile. Completeness: A=9/10. + > + > A) Enable + set up (recommended for contributors, ~2 min) + > B) Enable but skip setup (I'll fill it in later) + > C) Cancel — I'm not ready + +3. ALWAYS touch the marker, regardless of choice: + ```bash + touch ~/.gstack/.question-tuning-prompted + ``` + +4. If A or B: enable: ```bash ~/.claude/skills/gstack/bin/gstack-config set question_tuning true ``` -4. If A (full setup), ask FIVE one-per-dimension declaration questions via - individual AskUserQuestion calls (one at a time). Use plain English, no jargon: +5. If C: do nothing else. Tell the user: "Question tuning stays off. Re-enable + any time with `/plan-tune enable` or `gstack-config set question_tuning true`." + +## 5-Q setup (post-consent, or via Setup gate) + +**When this fires.** Two paths: +- Right after the consent prompt above accepts option A. +- Standalone via Step 0's setup gate: `question_tuning` is already `true` + (user opted in via gstack-config or earlier `/plan-tune enable`) AND + `declared` is empty AND `~/.gstack/.declared-setup-prompted` is missing. + This catches users who set `question_tuning: true` directly without + running the wizard. + +**Flow:** + +1. Ask FIVE one-per-dimension declaration questions via individual + AskUserQuestion calls (one at a time). Use plain English, no jargon: **Q1 — scope_appetite:** "When you're planning a feature, do you lean toward shipping the smallest useful version fast, or building the complete, edge- @@ -162,10 +223,18 @@ Power-user shortcuts (one-word invocations) — handle these too: " ``` -5. Tell the user: "Profile set. Question tuning is now on. Use `/plan-tune` +2. Touch the marker so the Setup gate doesn't re-fire: + ```bash + touch ~/.gstack/.declared-setup-prompted + ``` + Touch it even if the user bails out partway — they were asked; they chose + not to complete. The Setup gate respects that. They can rerun the 5-Q + anytime with `/plan-tune setup` (Step 0 power-user shortcut). + +3. Tell the user: "Profile set. Question tuning is on. Use `/plan-tune` again any time to inspect, adjust, or turn it off." -6. Show the profile inline as a confirmation (see `Inspect profile` below). +4. Show the profile inline as a confirmation (see `Inspect profile` below). --- @@ -186,12 +255,18 @@ Parse the JSON. Present in **plain English**, not raw floats: Format: "**scope_appetite:** 0.8 (boil the ocean — you prefer the complete version with edge cases covered)" -- If `inferred.diversity` passes the calibration gate (`sample_size >= 20 AND +- If `inferred.diversity` passes the **display gate** (`sample_size >= 20 AND skills_covered >= 3 AND question_ids_covered >= 8 AND days_span >= 7`), show the inferred column next to declared: "**scope_appetite:** declared 0.8 (boil the ocean) ↔ observed 0.72 (close)" Use words for the gap: 0.0-0.1 "close", 0.1-0.3 "drift", 0.3+ "mismatch". + This display gate is intentionally lower than the E1 **promotion gate** + (90+ days stable across 3+ skills, per `docs/designs/PLAN_TUNING_V0.md`). + Displaying inferred values is a UI affordance; shipping behavior-adapting + defaults based on the profile is consequential and needs a much higher + bar. Do NOT use the display gate as a green light for v2 E1 work. + - If the calibration gate isn't met, say: "Not enough observed data yet — need N more events across M more skills before we can show your observed profile."