gstack


Garry Tan 17c1c06cd9 feat: diff-based test selection for E2E and LLM-judge evals (v0.6.1.0) (#139)	2026-03-17 18:45:41 -05:00

* feat: diff-based test selection for E2E and LLM-judge evals
  Each test declares file dependencies in a TOUCHFILES map. The test runner checks git diff against the base branch and only runs tests whose dependencies were modified. Global touchfiles (session-runner, eval-store, gen-skill-docs) trigger all tests.
  New scripts: test:e2e:all, test:evals:all, eval:select
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: bump version and changelog (v0.6.1.0)
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: plan-design-review-audit eval — bump turns to 30, add efficiency hints
  The test was flaky at 20 turns because the agent reads a 300-line SKILL.md, navigates, extracts design data, and writes a report. Added hints to skip preamble/batch commands/write early while still testing the real SKILL.md. Now completes in ~13 turns consistently.
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
fixtures	feat: contributor mode, session awareness, recommendation format (#90)	2026-03-16 01:45:50 -05:00
helpers	feat: diff-based test selection for E2E and LLM-judge evals (v0.6.1.0) (#139)	2026-03-17 18:45:41 -05:00
gen-skill-docs.test.ts	feat: SELECTIVE EXPANSION + smarter ship gates (v0.5.3) (#134)	2026-03-17 12:22:10 -05:00
skill-e2e.test.ts	feat: diff-based test selection for E2E and LLM-judge evals (v0.6.1.0) (#139)	2026-03-17 18:45:41 -05:00
skill-llm-eval.test.ts	feat: diff-based test selection for E2E and LLM-judge evals (v0.6.1.0) (#139)	2026-03-17 18:45:41 -05:00
skill-parser.test.ts	feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41)	2026-03-13 21:08:12 -07:00
skill-validation.test.ts	feat: Completeness Principle — Boil the Lake (v0.6.1) (#140)	2026-03-17 16:34:08 -05:00
touchfiles.test.ts	feat: diff-based test selection for E2E and LLM-judge evals (v0.6.1.0) (#139)	2026-03-17 18:45:41 -05:00
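
The diff-based selection described in the latest commit message (a TOUCHFILES map per test, plus global touchfiles that trigger everything) can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the test names, file paths, the `matches` and `selectTests` helpers, and the "trailing slash means directory prefix" rule are all assumptions made for the example. Only the `TOUCHFILES` name and the three global touchfile names come from the commit message.

```typescript
// Hypothetical sketch: test names, paths, and matching rules are assumptions,
// not taken from the gstack repository.

type TouchfileMap = Record<string, string[]>;

// Each test lists the files (or directory prefixes ending in "/") it depends on.
const TOUCHFILES: TouchfileMap = {
  "plan-design-review-audit": ["plan-design-review/"],
  "browse-basic": ["browse/", "test/fixtures/basic.html"],
};

// A change to any of these files triggers every test (paths are assumed).
const GLOBAL_TOUCHFILES: string[] = [
  "test/helpers/session-runner.ts",
  "test/helpers/eval-store.ts",
  "scripts/gen-skill-docs.ts",
];

// A pattern ending in "/" matches any file under that directory;
// anything else must match the file path exactly.
function matches(pattern: string, file: string): boolean {
  return pattern.endsWith("/") ? file.startsWith(pattern) : file === pattern;
}

// Given the changed files, return the names of the tests that should run.
function selectTests(
  changed: string[],
  touchfiles: TouchfileMap,
  globals: string[],
): string[] {
  const all = Object.keys(touchfiles);
  // Any global touchfile in the diff selects every test.
  if (changed.some((f) => globals.some((g) => matches(g, f)))) {
    return all;
  }
  // Otherwise keep only tests with at least one modified dependency.
  return all.filter((name) =>
    touchfiles[name].some((p) => changed.some((f) => matches(p, f))),
  );
}
```

In a real runner the `changed` list would presumably come from something like `git diff --name-only <base>...HEAD`, split on newlines. Keeping `selectTests` a pure function of that list makes the selection logic easy to unit-test without a git checkout.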