mirror of https://github.com/garrytan/gstack.git
Two periodic-tier E2E tests exercise the preamble's privacy gate
end-to-end via the Agent SDK + canUseTool. Both paths were previously uncovered:
- Positive: stages a fake gbrain on PATH and sets gbrain_sync_mode_prompted=false
  in config, runs a real skill, and intercepts tool use. Asserts that the
  preamble fires a 3-option AskUserQuestion matching the canonical
  prose ("publish session memory" / "artifact" / "decline") and does
  NOT fire a second time in the same run (idempotency within a session).
- Negative: same staging, but with gbrain_sync_mode_prompted=true. Asserts
  that the gate stays silent even with gbrain detected on the host.
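
The interception described in both cases can be sketched roughly as follows. This is a minimal illustration, not the code in the test files: the `Decision` and `RecordedCall` shapes, `makeRecorder`, and `countGatePrompts` are hypothetical names, and the real Agent SDK callback may carry additional arguments. It assumes the canonical phrases appear somewhere in the serialized tool input.

```typescript
// Hypothetical shapes for illustration; the real Agent SDK types may differ.
type ToolInput = Record<string, unknown>;
type Decision =
  | { behavior: "allow"; updatedInput: ToolInput }
  | { behavior: "deny"; message: string };

interface RecordedCall {
  toolName: string;
  input: ToolInput;
}

// Builds a canUseTool-style callback that records every tool call it sees
// and then allows the call to proceed unchanged.
function makeRecorder() {
  const calls: RecordedCall[] = [];
  const canUseTool = async (
    toolName: string,
    input: ToolInput,
  ): Promise<Decision> => {
    calls.push({ toolName, input });
    return { behavior: "allow", updatedInput: input };
  };
  return { calls, canUseTool };
}

// The three canonical option phrases quoted in the commit message.
const CANONICAL = ["publish session memory", "artifact", "decline"];

// Counts AskUserQuestion calls whose serialized input mentions all three
// canonical phrases (case-insensitive substring match).
function countGatePrompts(calls: RecordedCall[]): number {
  return calls.filter((call) => {
    if (call.toolName !== "AskUserQuestion") return false;
    const text = JSON.stringify(call.input).toLowerCase();
    return CANONICAL.every((phrase) => text.includes(phrase));
  }).length;
}
```

Under these assumptions, the positive case would expect `countGatePrompts(calls)` to equal 1 after the run (fired once, never repeated), and the negative case would expect 0.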
Registered in test/helpers/touchfiles.ts as `brain-privacy-gate`
(periodic) with dependency tracking on generate-brain-sync-block.ts,
the three gstack-brain-* bins, gstack-config, and the Agent SDK runner.
Diff-based selection re-runs the E2E when any of those change.
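
As a rough sketch of how diff-based selection of this kind can work: the registry shape, the directory prefixes in the paths, and the `matches` and `selectTests` helpers below are assumptions for illustration, not the actual contents of test/helpers/touchfiles.ts. Only a single trailing `*` wildcard is handled here.

```typescript
// Hypothetical registry: test name -> file patterns it depends on.
// The directory prefixes are illustrative guesses, not the real layout.
const TOUCHFILES: Record<string, string[]> = {
  "brain-privacy-gate": [
    "scripts/generate-brain-sync-block.ts",
    "bin/gstack-brain-*",
    "bin/gstack-config",
    "test/helpers/agent-sdk-runner.ts",
  ],
};

// Matches a file against a pattern; supports only one trailing "*".
function matches(pattern: string, file: string): boolean {
  if (pattern.endsWith("*")) {
    return file.startsWith(pattern.slice(0, -1));
  }
  return file === pattern;
}

// Returns the names of tests whose dependencies intersect the changed files.
function selectTests(
  changedFiles: string[],
  registry: Record<string, string[]> = TOUCHFILES,
): string[] {
  return Object.entries(registry)
    .filter(([, patterns]) =>
      changedFiles.some((file) => patterns.some((p) => matches(p, file))),
    )
    .map(([name]) => name);
}
```

With this sketch, a diff touching `bin/gstack-brain-sync` would select `brain-privacy-gate`, while a diff touching only `README.md` would select nothing.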
Cost: ~$0.30-$0.50 per run. Only fires under EVALS=1 EVALS_TIER=periodic;
gate tier stays free.
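
The tier gating can be pictured with a small predicate. This is a sketch under the assumption that the harness compares the two environment variables literally; `shouldRunPeriodic` is a hypothetical name, and the real runner may decide differently.

```typescript
type Env = Record<string, string | undefined>;

// True only when evals are enabled and the periodic tier is requested.
// Assumes exact string comparison against "1" and "periodic".
function shouldRunPeriodic(env: Env): boolean {
  return env.EVALS === "1" && env.EVALS_TIER === "periodic";
}
```

A test file could call `shouldRunPeriodic(process.env)` and skip the paid E2E when it returns false, which keeps the default gate-tier run free.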
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| Name |
|---|
| providers |
| agent-sdk-runner.ts |
| benchmark-judge.ts |
| benchmark-runner.ts |
| codex-session-runner.ts |
| e2e-helpers.ts |
| eval-store.test.ts |
| eval-store.ts |
| gemini-session-runner.test.ts |
| gemini-session-runner.ts |
| llm-judge.ts |
| observability.test.ts |
| plan-mode-handshake-helpers.ts |
| pricing.ts |
| secret-sink-harness.ts |
| session-runner.test.ts |
| session-runner.ts |
| skill-parser.ts |
| tool-map.ts |
| touchfiles.ts |