gstack

History

Garry Tan a6fb31726c v1.48.0.0 feat: AskUserQuestion split rule + runtime AUTO_DECIDE carve-out (#1740)

* feat(preamble): add "Handling 5+ options — split, never drop" rule

Agents repeatedly hit Conductor's 4-option AskUserQuestion cap and silently drop one option to fit, shrinking the user's decision space. This rule names the bug and gives two compliant shapes: batch into ≤4-groups (for coherent alternatives) or split into N sequential per-option calls (for independent scope items, default).

Inline preamble subsection is ~15 lines (rule + buckets + pointer). Full reference with worked examples, Hold/dependency semantics, and final-summary validation lives in docs/askuserquestion-split.md. The agent loads the docs file on demand when N>4.

Per-option call shape: D<N>.k header, ELI10, Recommendation, kind-note (no completeness score; decision actions, not coverage), Include / Defer / Cut / Hold buckets. Hold stops the chain immediately; the final D<N>.final call validates dependencies and confirms the assembled scope.

question_ids: <skill>-split-<option-slug> (kebab-case ASCII, ≤64 chars).

Also fixes orphan "12. " prefix on the existing CJK rule.

Tier-2+ skills inherit via the existing resolver. SKILL.md regenerated for all 41 affected skills + 3 golden fixtures. Net diff per SKILL.md: ~34 lines (vs ~110 for the full inline version).

6 tests pin the inline contract (4-option cap, buckets, D-numbering, docs pointer, runtime AUTO_DECIDE gate reference, orphan 12 regression).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(question-pref): runtime AUTO_DECIDE carve-out for -split- ids

Split chains (per-option AskUserQuestion calls emitted by the new "Handling 5+ options" rule) must never be silently auto-approved via /plan-tune preferences. The user's option set is sacred.

Layer 1 (mechanism): unique <skill>-split-<option-slug> ids prevent cross-option preference leakage.

Layer 2 (this commit): the runtime checker `gstack-question-preference --check` detects any id matching -split- and forces ASK_NORMALLY even when never-ask or ask-only-for-one-way preferences exist for that exact id. An explanatory note tells the user their preference was bypassed and why.

7 tests pin the carve-out: no-pref baseline, never-ask override, explanatory note text, ask-only-for-one-way override, always-ask (no note), non-split id containing "split" word (negative case for regex specificity), multi-skill split id formats.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): split-overflow regression for /plan-ceo-review

Periodic-tier E2E test that catches the original failure mode the user complained about: 5+ options for ONE decision must split into N sequential AskUserQuestion calls, not drop one to fit Conductor's 4-option cap.

Fixture: 5 independent chat-platform integration candidates (Slack/Discord/Teams/Telegram/Mattermost), each carrying its own include/defer/cut decision. Floor = 4 review-phase AUQs (standard [N-1] tolerance band). Pre-fix "drop to 4 + 1 dropped" fails this floor.

Wired into test/helpers/touchfiles.ts: tier periodic, depends on plan-ceo-review/*, the new preamble subsection, the question-pref binary (for the carve-out), and the runner helper. touchfiles.test.ts expected count bumped 21 → 22 to account for the new entry.

Cost: ~$0.30/run when EVALS_TIER=periodic. Skips silently otherwise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: post-merge regen + rebase size-budget baseline to v1.47.0.0

After merging origin/main (v1.45 → v1.47), three things needed cleanup:

1. spec/SKILL.md (main's new skill) regenerated to include our split-vs-drop preamble subsection, the same mechanical regen as the other 41 tier-2+ skills.
2. Three golden ship fixtures refreshed to capture main's GSTACK_PLAN_MODE block + /spec routing entry + jargon-list.json refactor.
3. docs/skills.md: added the /spec table row that main's PR (#1698/#1733) shipped without. Pre-existing failure on main; this PR catches and fixes.

Also rebased test/skill-size-budget.test.ts from v1.44.1 → v1.47.0.0 baseline. Main's v1.46 (catalog tokens trim) + v1.47 (/spec skill) pushed the v1.44.1 anchor past the 5% ratchet to ×1.059, a pre-existing failure on main. This PR captures a fresh parity-baseline-v1.47.0.0.json and re-anchors the test there. Historical v1.44.1.json and v1.46.0.0.json retained in test/fixtures/ for reference.

Our subsection contributes ~0.1% of the post-rebase corpus.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.48.0.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-26 23:43:07 -07:00
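The two layers described in the commit message (unique `<skill>-split-<option-slug>` ids, plus a runtime check that forces ASK_NORMALLY for any id matching `-split-`) can be illustrated with a minimal TypeScript sketch. This is not the code of `gstack-question-preference`: the function names `toSlug`, `splitQuestionId`, `isSplitQuestionId`, and `checkPreference`, the `Decision` shape, the note wording, and the reading of "≤64 chars" as a limit on the whole id are all assumptions made for illustration. How the real tool handles preferences on non-split ids is also not shown in the source, so the sketch simplifies that path.

```typescript
// Hypothetical sketch of the split-id format and the AUTO_DECIDE carve-out.
// All names and the note text are assumptions, not the gstack implementation.

type Preference = "never-ask" | "ask-only-for-one-way" | "always-ask";
type Decision = { action: "ASK_NORMALLY" | "AUTO_DECIDE"; note?: string };

const MAX_ID_LENGTH = 64; // assumed to apply to the whole id, not only the slug

// Reduce free text to a kebab-case ASCII slug.
function toSlug(text: string): string {
  return text
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "") // strip combining accents left by NFKD
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // any other run of characters becomes one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

// Layer 1: one unique id per option, so a stored preference for one option
// cannot leak onto another.
function splitQuestionId(skill: string, option: string): string {
  const prefix = `${skill}-split-`;
  const room = Math.max(0, MAX_ID_LENGTH - prefix.length);
  const slug = toSlug(option).slice(0, room).replace(/-+$/, "");
  return prefix + slug;
}

// Requires the hyphens on both sides, so an id that merely contains the
// word "split" (for example "plan-splitter-choice") does not match.
function isSplitQuestionId(id: string): boolean {
  return /-split-/.test(id);
}

// Layer 2: split ids are always asked, whatever preference is stored.
function checkPreference(id: string, pref?: Preference): Decision {
  if (isSplitQuestionId(id)) {
    const bypassed = pref === "never-ask" || pref === "ask-only-for-one-way";
    return bypassed
      ? {
          action: "ASK_NORMALLY",
          note: `Preference "${pref}" bypassed: split-chain questions are always asked.`,
        }
      : { action: "ASK_NORMALLY" };
  }
  // Simplified non-split path: only "never-ask" auto-decides in this sketch.
  return pref === "never-ask" ? { action: "AUTO_DECIDE" } : { action: "ASK_NORMALLY" };
}

const exampleId = splitQuestionId("plan-ceo-review", "Slack integration");
console.log(exampleId); // plan-ceo-review-split-slack-integration
console.log(checkPreference(exampleId, "never-ask").action); // ASK_NORMALLY
```

Matching on `-split-` with both hyphens, rather than on the bare word, mirrors the negative test case the commit message lists for regex specificity.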
..
preamble	v1.48.0.0 feat: AskUserQuestion split rule + runtime AUTO_DECIDE carve-out (#1740)	2026-05-26 23:43:07 -07:00
browse.ts	fix: avoid tilde-in-assignment to silence Claude Code permission prompts (#993)	2026-04-16 14:49:56 -07:00
codex-helpers.ts	feat: Factory Droid compatibility — works across Claude Code, Codex, and Factory (v0.13.5.0) (#621)	2026-03-29 08:57:34 -07:00
composition.ts	feat: composable skills — INVOKE_SKILL resolver + factoring infrastructure (v0.13.7.0) (#644)	2026-03-29 23:35:17 -06:00
confidence.ts	v1.43.2.0 fix wave: post-Daegu paper-cut — 18 fixes, 28 bisect commits (#1642)	2026-05-21 21:21:07 -07:00
constants.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040)	2026-04-19 17:50:31 +08:00
design.ts	v1.45.0.0 feat(design): persistent board daemon — 24h boards, one tab, board history (#1710)	2026-05-25 20:45:12 -07:00
dx.ts	feat: /plan-devex-review + /devex-review — DX review skills (v0.15.3.0) (#784)	2026-04-03 16:22:57 -07:00
gbrain.ts	v1.42.0.0 Daegu wave: 23 community-filed bugs + PTY classifier enforcement (24 bisect commits) (#1594)	2026-05-20 07:35:01 -07:00
index.ts	v1.46.0.0 feat: gstack v2 foundation — catalog tokens drop 56%, eval-first floor covers all 51 skills (#1712)	2026-05-26 16:50:03 -07:00
learnings.ts	v1.33.1.0 fix(learnings): token-OR query + task-shaped retrieval in 3 long skills (#1442)	2026-05-11 19:34:33 -07:00
make-pdf.ts	feat(v1.4.0.0): /make-pdf — markdown to publication-quality PDFs (#1086)	2026-04-20 13:20:30 +08:00
model-overlay.ts	feat(v1.10.1.0): overlay efficacy harness + Opus 4.7 fanout nudge removal (#1166)	2026-04-23 18:42:58 -07:00
preamble.ts	v1.46.0.0 feat: gstack v2 foundation — catalog tokens drop 56%, eval-first floor covers all 51 skills (#1712)	2026-05-26 16:50:03 -07:00
question-tuning.ts	v1.15.0.0 feat: slim preamble + real-PTY plan-mode E2E harness (#1215)	2026-04-26 13:55:13 -07:00
review-army.ts	v1.42.0.0 Daegu wave: 23 community-filed bugs + PTY classifier enforcement (24 bisect commits) (#1594)	2026-05-20 07:35:01 -07:00
review.ts	v1.42.0.0 Daegu wave: 23 community-filed bugs + PTY classifier enforcement (24 bisect commits) (#1594)	2026-05-20 07:35:01 -07:00
tasks-section.ts	v1.38.1.0 fix wave: surrogate-safe page captures (#1440), Implementation Tasks across review skills (#1454), root-level artifact patterns (#1452) (#1504)	2026-05-14 21:46:50 -07:00
testing.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040)	2026-04-19 17:50:31 +08:00
types.ts	v1.46.0.0 feat: gstack v2 foundation — catalog tokens drop 56%, eval-first floor covers all 51 skills (#1712)	2026-05-26 16:50:03 -07:00
utility.ts	feat(v1.5.2.0): Opus 4.7 migration — model overlay, voice, routing (#1117)	2026-04-22 01:06:22 -07:00