gstack

Garry Tan 0ac7ef4e81 fix: harden planted-bug eval prompt for reliable form testing	2026-03-14 13:28:18 -05:00

Phase 3 was too vague ("click every nav link") causing the agent to wander instead of systematically testing form fields. Now explicitly directs: fill every input, clear it, try invalid values, submit and check console. Added Phase 4 finalize step to ensure report is updated with all findings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fixtures	fix: 100% E2E pass — isolate test dirs, restart server, relax FP thresholds	2026-03-14 07:17:17 -05:00
helpers	fix: auto-clear stale heartbeat when process is dead	2026-03-14 12:55:40 -05:00
gen-skill-docs.test.ts	feat: template-ify all skills + E2E tests for plan-ceo-review, plan-eng-review, retro	2026-03-14 07:28:02 -05:00
skill-e2e.test.ts	fix: harden planted-bug eval prompt for reliable form testing	2026-03-14 13:28:18 -05:00
skill-llm-eval.test.ts	fix: lower planted-bug detection baselines and LLM judge thresholds for reliability	2026-03-14 05:16:17 -05:00
skill-parser.test.ts	feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41)	2026-03-13 21:08:12 -07:00
skill-validation.test.ts	fix: remove false-positive Exit code 1 pattern, fix NEEDS_SETUP test, update QA tests	2026-03-14 04:48:35 -05:00