gstack/test/fixtures
Garry Tan 1a4f0c9c15
v1.33.1.0 fix(learnings): token-OR query + task-shaped retrieval in 3 long skills (#1442)
* fix(learnings): use token-OR matching in gstack-learnings-search --query

Split the query on whitespace into tokens; a learning matches if ANY
token appears as a substring in ANY of key/insight/files. Previously
the whole query was a single substring, so multi-word queries like
"debug investigation" only matched learnings whose insight contained
that exact contiguous phrase, which is usually nothing.

Whitespace-only query falls through to no-query (matches today's no-flag
behavior). Single-word queries behave exactly as before.

Adds test/gstack-learnings-search.test.ts: 3 assertions covering
multi-token, single-token, and no-query backwards compat.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(resolver): parameterized LEARNINGS_SEARCH with shell-injection guard

The {{LEARNINGS_SEARCH}} macro now accepts a query=KEYWORD argument that
gets interpolated as --query "<keyword>" into the generated bash. Empty
value falls through to no-query (principle of least surprise: a stray
{{LEARNINGS_SEARCH:query=}} placeholder gets today's behavior, not a
build failure). Pattern reuses the parameterized-macro parsing from
composition.ts. The 13 templates that don't pass a query stay
byte-identical in their generated SKILL.md output.

Shell-injection guard: the query value is whitelisted to
^[A-Za-z0-9 _-]+$ at gen-skill-docs time. Any \$(), backticks,
semicolons, or quotes throw a loud build error instead of emitting
executable bash. Static template queries are safe by inspection;
this defends against future contributors writing dangerous values.

Adds 5 assertions to test/gen-skill-docs.test.ts covering no-args,
claude+query=foo bar on both cross-project and project-scoped branches,
codex host variant, empty value semantics, and shell-injection payloads
(\$(whoami), backticks, ;, &, ", \\, \$x) throwing build errors.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(skills): task-shaped queries + mid-flow refresh in /investigate /qa /ship

The three long skills now pull learnings keyed to their theme at the
top, then re-pull at phase boundaries as work shifts to new sub-tasks.

Top-of-skill queries (5-6 token unions, token-OR matched):
- investigate: "debug investigation root cause hypothesis bug fix"
- qa: "qa testing bug regression flake fixture"
- ship: "release ship version changelog merge pr"

Mid-flow refresh blocks (concrete keyword recipe + worked examples):
- investigate: between Phase 1 (hypothesis) and Phase 2 (analysis),
  keyed to the hypothesis noun. Examples: auth-cookie, session-expiry.
- qa: between Phase 7 (triage) and Phase 8 (fix loop), keyed to the
  buggy component name. Examples: checkout-button, signup-form.
- ship: just before Step 12 (VERSION bump), keyed to the headline
  feature. Examples: learnings-search, pacing, worktree-ship.

Keyword recipe enforces alphanumeric+hyphen only (no quotes, slashes,
dots, colons) so dynamic queries cannot inject shell metacharacters.

The other 13 short-lived skills keep the bare {{LEARNINGS_SEARCH}} form.
Backwards-compat verified via diff: their generated SKILL.md output is
byte-identical to before this change.

Golden ship fixtures regenerated to match the new ship/SKILL.md output.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: bump version and changelog (v1.33.1.0)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test: refresh codex+factory ship golden fixtures

Follow-up to 513c9660 — the codex and factory host outputs needed
regeneration too, missed in the initial commit because gen:skill-docs
was only run for the claude host. Now matches gen:skill-docs --host all.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 19:34:33 -07:00
..
golden v1.33.1.0 fix(learnings): token-OR query + task-shaped retrieval in 3 long skills (#1442) 2026-05-11 19:34:33 -07:00
mode-posture feat: mode-posture energy fix for /plan-ceo-review and /office-hours (v1.1.2.0) (#1065) 2026-04-19 05:44:39 +08:00
plans v1.15.0.0 feat: slim preamble + real-PTY plan-mode E2E harness (#1215) 2026-04-26 13:55:13 -07:00
coverage-audit-fixture.ts feat: test coverage catalog — shared audit across plan/ship/review (v0.10.1.0) (#259) 2026-03-22 11:28:16 -07:00
eval-baselines.json fix: rewrite session-runner to claude -p subprocess, lower flaky baselines 2026-03-14 02:34:10 -05:00
forcing-finding-seeds.ts v1.31.0.0 fix: delete AskUserQuestion fallback (root cause of forever war) + harness primitives (#1390) 2026-05-09 17:01:13 -07:00
golden-ship-claude.md fix: community security wave — 8 PRs, 4 contributors (v0.15.13.0) (#847) 2026-04-06 00:47:04 -07:00
overlay-nudges.ts feat(v1.10.1.0): overlay efficacy harness + Opus 4.7 fanout nudge removal (#1166) 2026-04-23 18:42:58 -07:00
qa-eval-checkout-ground-truth.json fix: 100% E2E pass — isolate test dirs, restart server, relax FP thresholds 2026-03-14 07:17:17 -05:00
qa-eval-ground-truth.json fix: 100% E2E pass — isolate test dirs, restart server, relax FP thresholds 2026-03-14 07:17:17 -05:00
qa-eval-spa-ground-truth.json fix: 100% E2E pass — isolate test dirs, restart server, relax FP thresholds 2026-03-14 07:17:17 -05:00
review-army-migration.sql feat: Review Army — parallel specialist reviewers for /review (v0.14.3.0) (#692) 2026-03-30 22:07:50 -06:00
review-army-n-plus-one.rb feat: Review Army — parallel specialist reviewers for /review (v0.14.3.0) (#692) 2026-03-30 22:07:50 -06:00
review-eval-design-slop.css feat: design review lite in /review and /ship + gstack-diff-scope (v0.6.3) (#142) 2026-03-17 20:12:55 -05:00
review-eval-design-slop.html feat: design review lite in /review and /ship + gstack-diff-scope (v0.6.3) (#142) 2026-03-17 20:12:55 -05:00
review-eval-enum-diff.rb feat: contributor mode, session awareness, recommendation format (#90) 2026-03-16 01:45:50 -05:00
review-eval-enum.rb feat: contributor mode, session awareness, recommendation format (#90) 2026-03-16 01:45:50 -05:00
review-eval-vuln.rb feat: 3-tier eval suite with planted-bug outcome testing (EVALS=1) 2026-03-14 01:17:36 -05:00