mirror of https://github.com/garrytan/gstack.git
Never passed reliably. Tests ambiguous routing ("think bigger" →
plan-ceo-review) but Claude legitimately answers directly instead
of invoking a skill. The other 10 journey tests cover routing
with clear, actionable signals.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| File |
|---|
| codex-session-runner.ts |
| e2e-helpers.ts |
| eval-store.test.ts |
| eval-store.ts |
| gemini-session-runner.test.ts |
| gemini-session-runner.ts |
| llm-judge.ts |
| observability.test.ts |
| session-runner.test.ts |
| session-runner.ts |
| skill-parser.ts |
| touchfiles.ts |