gstack/test
Garry Tan 7c1d31a25e
test: LLM-judge for 10 skills + gstack-upgrade E2E
Add LLM-judge quality evals for all uncovered skills using a DRY
runWorkflowJudge helper with section marker guards. Add real E2E
test for gstack-upgrade using a mock git remote (replaces test.todo).
Add plan-edit assertion to plan-design-review E2E.

14/15 skills now at full coverage. setup-browser-cookies remains
deferred (needs real browser).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 20:42:35 -07:00
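
The commit message mentions "section marker guards" on the runWorkflowJudge helper but does not show them. Below is a minimal sketch of what such a guard could look like, assuming its purpose is to fail fast when an expected section is missing from a skill document so that a judge is never asked to score empty text. The function name `extractSection`, the marker strings, and the sample document are hypothetical and not taken from the gstack repository.

```typescript
// Hypothetical sketch: names and marker strings are illustrative assumptions,
// not taken from the gstack repository.

/** Return the text between startMarker and endMarker, or throw if either is missing. */
function extractSection(doc: string, startMarker: string, endMarker: string): string {
  const start = doc.indexOf(startMarker);
  if (start === -1) {
    // Guard: refuse to continue when the section start cannot be found.
    throw new Error(`start marker not found: ${startMarker}`);
  }
  const bodyStart = start + startMarker.length;
  const end = doc.indexOf(endMarker, bodyStart);
  if (end === -1) {
    // Guard: refuse to continue when the section is not closed.
    throw new Error(`end marker not found: ${endMarker}`);
  }
  return doc.slice(bodyStart, end).trim();
}

// Made-up skill document used only to exercise the guard.
const skillDoc = [
  "# Example skill",
  "## Workflow",
  "1. Read the diff.",
  "2. Report findings.",
  "## Notes",
  "Unrelated text.",
].join("\n");

const workflow = extractSection(skillDoc, "## Workflow", "## Notes");
console.log(workflow);
```

Throwing instead of returning an empty string means a renamed or deleted heading surfaces as a test failure rather than as a low judge score.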
Name | Last commit | Date
fixtures | feat: design review lite in /review and /ship + gstack-diff-scope (v0.6.3) (#142) | 2026-03-17 20:12:55 -05:00
helpers | test: validation + touchfile entries for 100% coverage | 2026-03-17 20:42:32 -07:00
gen-skill-docs.test.ts | refactor: rename qa-design-review → design-review | 2026-03-17 20:23:05 -07:00
skill-e2e.test.ts | test: LLM-judge for 10 skills + gstack-upgrade E2E | 2026-03-17 20:42:35 -07:00
skill-llm-eval.test.ts | test: LLM-judge for 10 skills + gstack-upgrade E2E | 2026-03-17 20:42:35 -07:00
skill-parser.test.ts | feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41) | 2026-03-13 21:08:12 -07:00
skill-validation.test.ts | test: validation + touchfile entries for 100% coverage | 2026-03-17 20:42:32 -07:00
touchfiles.test.ts | refactor: rename qa-design-review → design-review | 2026-03-17 20:23:05 -07:00