gstack

Garry Tan 3492f98e82 docs: add E2E eval failure blame protocol	2026-03-16 11:18:27 -05:00

"Not related to our changes" is an extraordinary claim that requires extraordinary proof. When evals fail during /ship:

1. Run the same eval on main; prove it fails there too
2. If it passes on main, it IS your change; trace the blame
3. If you can't verify, say "unverified" not "pre-existing"

Added to CLAUDE.md and as a comment in skill-e2e.test.ts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
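The three-step blame protocol in the commit message above can be read as a small decision rule. The following is a minimal sketch of that rule only; the type names and the `classifyEvalFailure` function are hypothetical illustrations and are not taken from skill-e2e.test.ts or any other gstack file.

```typescript
// Hypothetical types for illustration; not part of the gstack codebase.
// Result of re-running the failing eval on the main branch.
type MainOutcome = "fail" | "pass" | "not-run";

// Label to attach to a failure seen on the working branch during /ship.
type Blame = "pre-existing" | "your-change" | "unverified";

// Apply the protocol to an eval that failed on the working branch.
function classifyEvalFailure(onMain: MainOutcome): Blame {
  switch (onMain) {
    case "fail":
      // Step 1: the same eval fails on main, so the failure predates the change.
      return "pre-existing";
    case "pass":
      // Step 2: it passes on main, so the change is responsible.
      return "your-change";
    case "not-run":
      // Step 3: without a run on main, nothing has been proven.
      return "unverified";
  }
}
```

The point of the sketch is that "pre-existing" is reachable only through an actual failing run on main; skipping that run yields "unverified", never "pre-existing".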
fixtures	feat: contributor mode, session awareness, recommendation format (#90)	2026-03-16 01:45:50 -05:00
helpers	feat: QA restructure, browser ref staleness, eval efficiency metrics (v0.4.0) (#83)	2026-03-15 23:55:39 -05:00
gen-skill-docs.test.ts	fix: dynamic base branch detection across all SKILL templates (v0.3.10) (#81)	2026-03-16 10:59:13 -05:00
skill-e2e.test.ts	docs: add E2E eval failure blame protocol	2026-03-16 11:18:27 -05:00
skill-llm-eval.test.ts	fix: lower planted-bug detection baselines and LLM judge thresholds for reliability	2026-03-14 05:16:17 -05:00
skill-parser.test.ts	feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41)	2026-03-13 21:08:12 -07:00
skill-validation.test.ts	test: add deterministic contributor mode preamble validation	2026-03-16 11:17:47 -05:00