gstack/test/helpers
Garry Tan ed0e00daab
test(touchfiles): register Phase 4 + judge-fixture entries, add llm-judge dep to format tests
Two new entries:
- office-hours-phase4-fork (periodic) — for the silent-auto-decide regression
- llm-judge-recommendation (periodic) — for the judge rubric fixture test

Plus extend the four plan-{ceo,eng}-review-format-* entries with
test/helpers/llm-judge.ts so rubric tweaks invalidate the wired-in tests.

Verified by simulation that surgical office-hours/SKILL.md.tmpl changes fire
office-hours-auto-mode + office-hours-phase4-fork without over-firing
llm-judge-recommendation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 14:18:35 -07:00
..
providers v1.24.0.0 feat: cross-platform hardening — curated Windows lane + Bun.which resolver + path-portability helper (#1252) 2026-05-01 07:21:28 -07:00
agent-sdk-runner.ts v1.24.0.0 feat: cross-platform hardening — curated Windows lane + Bun.which resolver + path-portability helper (#1252) 2026-05-01 07:21:28 -07:00
benchmark-judge.ts feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040) 2026-04-19 17:50:31 +08:00
benchmark-runner.ts feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040) 2026-04-19 17:50:31 +08:00
claude-pty-runner.ts v1.25.0.0 fix: AskUserQuestion resolves to host MCP variant when native is disallowed (#1287) 2026-05-01 08:45:36 -07:00
claude-pty-runner.unit.test.ts v1.21.1.0 test: tighten plan-ceo-review smoke (Step 0 must fire) (#1255) 2026-04-30 02:50:09 -07:00
codex-session-runner.ts fix: enforce Codex 1024-char description limit + auto-heal stale installs (v0.11.9.0) (#391) 2026-03-23 08:44:08 -07:00
e2e-helpers.ts feat: remove trigger guard + proactive opt-out prompt (#457) 2026-03-24 18:07:36 -07:00
eval-store.test.ts feat: QA restructure, browser ref staleness, eval efficiency metrics (v0.4.0) (#83) 2026-03-15 23:55:39 -05:00
eval-store.ts v1.15.0.0 feat: slim preamble + real-PTY plan-mode E2E harness (#1215) 2026-04-26 13:55:13 -07:00
gemini-session-runner.test.ts feat: Gemini CLI E2E tests (v0.9.2.0) (#252) 2026-03-20 08:30:09 -07:00
gemini-session-runner.ts feat: Gemini CLI E2E tests (v0.9.2.0) (#252) 2026-03-20 08:30:09 -07:00
llm-judge.ts test(helpers): add judgeRecommendation with deterministic regex + Haiku rubric 2026-05-01 14:17:58 -07:00
observability.test.ts fix: never clean up observability artifacts — partial file persists after finalize 2026-03-14 12:37:38 -05:00
pricing.ts feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040) 2026-04-19 17:50:31 +08:00
secret-sink-harness.ts v1.12.0.0 feat: /setup-gbrain — coding-agent onboarding for gbrain (#1183) 2026-04-24 01:38:21 -07:00
session-runner.test.ts feat: stream-json NDJSON parser for real-time E2E progress 2026-03-14 03:49:36 -05:00
session-runner.ts fix(checkpoint): rename /checkpoint → /context-save + /context-restore (v1.0.1.0) (#1064) 2026-04-19 08:38:19 +08:00
skill-parser.ts feat: content security — 4-layer prompt injection defense for pair-agent (#815) 2026-04-06 14:41:06 -07:00
tool-map.ts feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040) 2026-04-19 17:50:31 +08:00
touchfiles.ts test(touchfiles): register Phase 4 + judge-fixture entries, add llm-judge dep to format tests 2026-05-01 14:18:35 -07:00