gstack/scripts
Garry Tan ed802d0c7f
feat: eval CLI tools + docs cleanup
Add eval:list, eval:compare, eval:summary CLI scripts for exploring
eval history from ~/.gstack-dev/evals/. eval:compare reuses the shared
comparison functions from eval-store.ts.

- eval:list: sorted table with branch/tier/cost filters
- eval:compare: thin wrapper around compareEvalResults + formatComparison
- eval:summary: aggregate stats, flaky test detection, branch rankings
- Remove unused @anthropic-ai/claude-agent-sdk from devDependencies
- Update CLAUDE.md: streaming docs, eval CLI commands, remove Agent SDK refs
- Add GH Actions eval upload (P2) and web dashboard (P3) to TODOS.md
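
For illustration only, the "flaky test detection" mentioned for eval:summary could look roughly like the sketch below. The `EvalRun` shape, the `findFlakyTests` name, and the definition of "flaky" (a test that both passed and failed across the stored runs) are all assumptions, not the actual code in eval-summary.ts or eval-store.ts.

```typescript
// Hypothetical shape of one stored eval run. The real on-disk format under
// ~/.gstack-dev/evals/ is not shown here, so this is an assumption.
interface EvalRun {
  branch: string;
  results: { [testName: string]: boolean }; // test name -> passed?
}

// Assumed definition: a test is "flaky" if it has at least one pass and
// at least one failure across the given runs.
function findFlakyTests(runs: EvalRun[]): string[] {
  const seen = new Map<string, { passed: boolean; failed: boolean }>();
  for (const run of runs) {
    for (const name of Object.keys(run.results)) {
      const entry = seen.get(name) ?? { passed: false, failed: false };
      if (run.results[name]) {
        entry.passed = true;
      } else {
        entry.failed = true;
      }
      seen.set(name, entry);
    }
  }
  // Keep only tests with mixed outcomes, sorted by name for stable output.
  return Array.from(seen.entries())
    .filter(([, e]) => e.passed && e.failed)
    .map(([name]) => name)
    .sort();
}
```

A test that fails in every run is reported as consistently failing rather than flaky under this definition, and a test seen in only one run can never be flagged.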

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 03:49:57 -05:00
dev-skill.ts feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41) 2026-03-13 21:08:12 -07:00
eval-compare.ts feat: eval CLI tools + docs cleanup 2026-03-14 03:49:57 -05:00
eval-list.ts feat: eval CLI tools + docs cleanup 2026-03-14 03:49:57 -05:00
eval-summary.ts feat: eval CLI tools + docs cleanup 2026-03-14 03:49:57 -05:00
gen-skill-docs.ts fix: browse binary discovery broken for agents (v0.3.5) (#44) 2026-03-14 00:24:06 -07:00
skill-check.ts feat: SKILL.md template system, 3-tier testing, DX tools (v0.3.3) (#41) 2026-03-13 21:08:12 -07:00