mirror of https://github.com/garrytan/gstack.git
- 4 E2E tests: land-and-deploy (Fly.io detection + deploy report), canary (monitoring report structure), benchmark (perf report schema), setup-deploy (platform detection → CLAUDE.md config) - 4 LLM-judge evals: workflow quality for all 4 new skills - Touchfile entries for diff-based test selection (E2E + LLM-judge) - 460 free tests pass, 0 fail Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| codex-session-runner.ts | ||
| eval-store.test.ts | ||
| eval-store.ts | ||
| llm-judge.ts | ||
| observability.test.ts | ||
| session-runner.test.ts | ||
| session-runner.ts | ||
| skill-parser.ts | ||
| touchfiles.ts | ||