gstack/test
Garry Tan 029a7c2a37
feat: eval-watch dashboard + observability unit tests (15 tests, 11 codepaths)
eval-watch: live terminal dashboard reads heartbeat + partial file every 1s,
shows completed/running tests, stale detection (>10min), --tail flag for
progress.log tail. Pure renderDashboard() function for testability.
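
To illustrate why a pure renderDashboard() helps testability, here is a minimal sketch, not the code from this commit. The `Heartbeat` and `PartialResults` shapes, their field names, the `isStale` helper, and the output wording are all assumptions made for illustration; only the 10-minute stale threshold and the function name come from the message above.

```typescript
// Hypothetical shapes: the real heartbeat and partial-file schemas
// are not shown in this commit message.
interface Heartbeat {
  runId: string;
  currentTest: string | null;
  lastUpdate: string; // ISO-8601 timestamp (assumed)
}

interface PartialResults {
  _partial: boolean;
  tests: { name: string; passed: boolean }[];
}

// A run counts as stale when the heartbeat is more than 10 minutes old.
const STALE_MS = 10 * 60 * 1000;

function isStale(hb: Heartbeat, now: Date): boolean {
  return now.getTime() - new Date(hb.lastUpdate).getTime() > STALE_MS;
}

// Pure: no file reads, no clock access, no console output.
// The caller (e.g. a 1s polling loop) supplies the parsed files and the time.
function renderDashboard(
  hb: Heartbeat | null,
  partial: PartialResults | null,
  now: Date,
): string {
  if (hb === null) return "No heartbeat found. Is an eval run in progress?";
  const tests = partial?.tests ?? [];
  const lines: string[] = [`Run ${hb.runId}`, `Completed: ${tests.length}`];
  for (const t of tests) {
    lines.push(`  ${t.passed ? "PASS" : "FAIL"} ${t.name}`);
  }
  lines.push(`Running: ${hb.currentTest ?? "(none)"}`);
  if (isStale(hb, now)) lines.push("WARNING: heartbeat is stale (>10min)");
  return lines.join("\n");
}
```

Because the current time is passed in rather than read inside the function, a unit test can exercise stale detection with fixed timestamps and no timers or filesystem.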

observability.test.ts: unit tests for sanitizeTestName, heartbeat schema,
progress.log format, NDJSON file naming, savePartial() with _partial flag,
finalize() cleanup, diagnostic fields, watcher rendering, stale detection,
and non-fatal I/O guarantees.
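
The sketch below shows one way two of the tested behaviours could look: name sanitizing and a savePartial() that sets a _partial flag and never throws. It is an assumption-laden illustration, not the helpers in this repository: the slug rule, the `partial.json` file name, the boolean return value, and the write-then-rename step are all hypothetical.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed rule: collapse anything outside [A-Za-z0-9_-] into a single hyphen
// and trim hyphens from both ends, so the name is safe to use in a file name.
function sanitizeTestName(name: string): string {
  return name.replace(/[^a-zA-Z0-9_-]+/g, "-").replace(/^-+|-+$/g, "");
}

// Non-fatal I/O: any filesystem error is swallowed and reported as `false`,
// so a failed observability write cannot fail the test run itself.
function savePartial(dir: string, results: Record<string, unknown>): boolean {
  try {
    fs.mkdirSync(dir, { recursive: true });
    const tmp = path.join(dir, "partial.json.tmp");
    // Mark the file as in-progress so readers can tell it from a final result.
    fs.writeFileSync(tmp, JSON.stringify({ ...results, _partial: true }, null, 2));
    // Rename after writing so a watcher does not read a half-written file.
    fs.renameSync(tmp, path.join(dir, "partial.json"));
    return true;
  } catch {
    return false;
  }
}

// Usage: write a partial result into a throwaway temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "eval-demo-"));
const demoOk = savePartial(demoDir, { tests: [{ name: sanitizeTestName("qa: quick run") }] });
```

Returning a boolean instead of throwing keeps the caller in control: a test harness can log the failure and continue.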

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 11:04:40 -05:00
fixtures fix: 100% E2E pass — isolate test dirs, restart server, relax FP thresholds 2026-03-14 07:17:17 -05:00
helpers feat: eval-watch dashboard + observability unit tests (15 tests, 11 codepaths) 2026-03-14 11:04:40 -05:00
gen-skill-docs.test.ts feat: template-ify all skills + E2E tests for plan-ceo-review, plan-eng-review, retro 2026-03-14 07:28:02 -05:00
skill-e2e.test.ts feat: wire runId + testName + diagnostics through all E2E tests 2026-03-14 11:04:28 -05:00
skill-llm-eval.test.ts fix: lower planted-bug detection baselines and LLM judge thresholds for reliability 2026-03-14 05:16:17 -05:00
skill-parser.test.ts feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41) 2026-03-13 21:08:12 -07:00
skill-validation.test.ts fix: remove false-positive Exit code 1 pattern, fix NEEDS_SETUP test, update QA tests 2026-03-14 04:48:35 -05:00