mirror of https://github.com/garrytan/gstack.git
Periodic-tier E2E test that catches the original failure mode the user complained about: 5+ options for ONE decision must split into N sequential AskUserQuestion calls, not drop one to fit Conductor's 4-option cap.

Fixture: 5 independent chat-platform integration candidates (Slack/Discord/Teams/Telegram/Mattermost), each carrying its own include/defer/cut decision. Floor = 4 review-phase AUQs (standard [N-1] tolerance band). Pre-fix "drop to 4 + 1 dropped" fails this floor.

Wired into test/helpers/touchfiles.ts: tier periodic, depends on plan-ceo-review/**, the new preamble subsection, the question-pref binary (for the carve-out), and the runner helper. touchfiles.test.ts expected count bumped 21 → 22 to account for the new entry.

Cost: ~$0.30/run when EVALS_TIER=periodic. Skips silently otherwise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

| | | |
|---|---|---|
| golden | ||
| ios-qa/FixtureApp | ||
| mode-posture | ||
| plans | ||
| coverage-audit-fixture.ts | ||
| eval-baselines.json | ||
| forcing-finding-seeds.ts | ||
| golden-ship-claude.md | ||
| overlay-nudges.ts | ||
| qa-eval-checkout-ground-truth.json | ||
| qa-eval-ground-truth.json | ||
| qa-eval-spa-ground-truth.json | ||
| review-army-migration.sql | ||
| review-army-n-plus-one.rb | ||
| review-eval-design-slop.css | ||
| review-eval-design-slop.html | ||
| review-eval-enum-diff.rb | ||
| review-eval-enum.rb | ||
| review-eval-vuln.rb | ||
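
The floor check described in the commit message above can be illustrated with a minimal TypeScript sketch. Everything here is hypothetical: the `AskUserQuestionCall` shape and the names `reviewAuqFloor` and `meetsFloor` are assumptions made for illustration and are not taken from the repository's test or runner helper. Only the 4-option cap and the [N-1] tolerance band come from the commit message.

```typescript
// Hypothetical sketch; none of these names are from the gstack codebase.
interface AskUserQuestionCall {
  phase: "review" | "other"; // assumed tag for where the question was asked
  options: string[];         // options offered in this single question
}

// Per the commit message, Conductor caps a question at 4 options.
const OPTION_CAP = 4;

// Floor for N independent candidates under the [N-1] tolerance band.
function reviewAuqFloor(candidateCount: number): number {
  return Math.max(candidateCount - 1, 0);
}

// True when every review-phase question respects the cap and the number
// of review-phase questions reaches the floor.
function meetsFloor(
  calls: AskUserQuestionCall[],
  candidateCount: number,
): boolean {
  const reviewCalls = calls.filter((c) => c.phase === "review");
  const withinCap = reviewCalls.every((c) => c.options.length <= OPTION_CAP);
  return withinCap && reviewCalls.length >= reviewAuqFloor(candidateCount);
}

// Post-fix behavior: one question per candidate, each include/defer/cut.
const split: AskUserQuestionCall[] = [
  "Slack", "Discord", "Teams", "Telegram", "Mattermost",
].map(() => ({ phase: "review", options: ["include", "defer", "cut"] }));

// Pre-fix behavior: one question listing 4 candidates, the fifth dropped.
const dropped: AskUserQuestionCall[] = [
  { phase: "review", options: ["Slack", "Discord", "Teams", "Telegram"] },
];

const splitPasses = meetsFloor(split, 5);     // 5 review questions >= floor of 4
const droppedPasses = meetsFloor(dropped, 5); // 1 review question < floor of 4
```

Under these assumptions the split run passes and the "drop to 4" run fails, which is the distinction the fixture is meant to catch.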