gstack/test
Garry Tan f9cfabeda8
feat: add E2E observability — heartbeat, progress.log, NDJSON persistence, savePartial()
session-runner: atomic heartbeat file (e2e-live.json), per-run log directory
(~/.gstack-dev/e2e-runs/{runId}/), progress.log + per-test NDJSON persistence,
failure transcripts to persistent run dir instead of tmpdir.
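
A minimal sketch of the two persistence patterns named above: an atomically replaced heartbeat file and an append-only NDJSON log. The commit does not show the implementation, so the `Heartbeat` fields and the function names `writeHeartbeatAtomic` and `appendNdjson` are hypothetical. The sketch uses a temp directory in place of ~/.gstack-dev/e2e-runs/{runId}/.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical heartbeat shape; the real fields of e2e-live.json are not shown in the commit.
interface Heartbeat {
  runId: string;
  currentTest: string;
  turn: number;
  updatedAt: string;
}

// Write to a temp file in the same directory, then rename it over the target.
// A rename within one filesystem replaces the target in a single step on POSIX,
// so a reader polling the file never sees a half-written JSON document.
function writeHeartbeatAtomic(file: string, hb: Heartbeat): void {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(hb, null, 2));
  fs.renameSync(tmp, file);
}

// NDJSON: one JSON object per line, appended as events arrive, so everything
// written before a crash remains readable afterwards.
function appendNdjson(file: string, event: object): void {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.appendFileSync(file, JSON.stringify(event) + "\n");
}

// Demo in a throwaway directory (stand-in for the per-run log directory).
const demoRunDir = fs.mkdtempSync(path.join(os.tmpdir(), "e2e-run-"));
writeHeartbeatAtomic(path.join(demoRunDir, "e2e-live.json"), {
  runId: "demo",
  currentTest: "example-test",
  turn: 1,
  updatedAt: new Date().toISOString(),
});
appendNdjson(path.join(demoRunDir, "example-test.ndjson"), { type: "tool_call", turn: 1 });
```

The temp file sits next to the target rather than in the system tmpdir because a rename across filesystems is not atomic.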

eval-store: 3 new diagnostic fields (exit_reason, timeout_at_turn, last_tool_call),
savePartial() writes _partial-e2e.json after each addTest() for crash resilience,
finalize() cleans up partial file.
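
The savePartial()/finalize() lifecycle described above could look roughly like the sketch below. Only the method names, the three diagnostic field names, and the `_partial-e2e.json` filename come from the commit message. The class name `EvalCollectorSketch`, the field types, and the final filename `e2e-final.json` are assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Field names are from the commit message; their types are assumed.
interface TestResult {
  name: string;
  passed: boolean;
  exit_reason?: string;
  timeout_at_turn?: number;
  last_tool_call?: string;
}

// Hypothetical collector; not the actual eval-store class.
class EvalCollectorSketch {
  readonly dir: string;
  readonly tests: TestResult[] = [];

  constructor(dir: string) {
    this.dir = dir;
    fs.mkdirSync(dir, { recursive: true });
  }

  get partialPath(): string {
    return path.join(this.dir, "_partial-e2e.json");
  }

  // Persist after every test so a crash loses at most the test in flight.
  addTest(result: TestResult): void {
    this.tests.push(result);
    this.savePartial();
  }

  savePartial(): void {
    fs.writeFileSync(this.partialPath, JSON.stringify({ tests: this.tests }, null, 2));
  }

  // Write the final result file (name assumed), then remove the partial one.
  finalize(): string {
    const finalPath = path.join(this.dir, "e2e-final.json");
    fs.writeFileSync(finalPath, JSON.stringify({ tests: this.tests }, null, 2));
    fs.rmSync(this.partialPath, { force: true });
    return finalPath;
  }
}

// Usage in a throwaway directory.
const store = new EvalCollectorSketch(fs.mkdtempSync(path.join(os.tmpdir(), "eval-")));
store.addTest({ name: "example", passed: false, exit_reason: "timeout", timeout_at_turn: 7 });
```

If the process dies before finalize() runs, `_partial-e2e.json` still holds every result recorded so far, which is the crash resilience the commit message refers to.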

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 11:04:16 -05:00
fixtures fix: 100% E2E pass — isolate test dirs, restart server, relax FP thresholds 2026-03-14 07:17:17 -05:00
helpers feat: add E2E observability — heartbeat, progress.log, NDJSON persistence, savePartial() 2026-03-14 11:04:16 -05:00
gen-skill-docs.test.ts feat: template-ify all skills + E2E tests for plan-ceo-review, plan-eng-review, retro 2026-03-14 07:28:02 -05:00
skill-e2e.test.ts fix: plan-ceo-review timeout — init git repo, skip codebase exploration, bump to 420s 2026-03-14 08:39:26 -05:00
skill-llm-eval.test.ts fix: lower planted-bug detection baselines and LLM judge thresholds for reliability 2026-03-14 05:16:17 -05:00
skill-parser.test.ts feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41) 2026-03-13 21:08:12 -07:00
skill-validation.test.ts fix: remove false-positive Exit code 1 pattern, fix NEEDS_SETUP test, update QA tests 2026-03-14 04:48:35 -05:00