gstack


Garry Tan c13cee6939 test(e2e): privacy-gate AskUserQuestion fires from preamble (periodic tier)	2026-04-24 01:07:30 -07:00

Two periodic-tier E2E tests exercising the preamble's privacy gate end-to-end via the Agent SDK + canUseTool. Previously uncovered:

- Positive: stages a fake gbrain on PATH + gbrain_sync_mode_prompted=false in config, runs a real skill, intercepts tool-use. Asserts the preamble fires a 3-option AskUserQuestion matching the canonical prose ("publish session memory" / "artifact" / "decline") and does NOT fire a second time in the same run (idempotency within session).
- Negative: same staging but prompted=true. Asserts the gate stays silent even with gbrain detected on the host.

Registered in test/helpers/touchfiles.ts as `brain-privacy-gate` (periodic) with dependency tracking on generate-brain-sync-block.ts, the three gstack-brain-* bins, gstack-config, and the Agent SDK runner. Diff-based selection re-runs the E2E when any of those change.

Cost: ~$0.30-$0.50 per run. Only fires under EVALS=1 EVALS_TIER=periodic; gate tier stays free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
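
The diff-based selection described in the commit message can be pictured roughly as follows. This is a minimal sketch, not the contents of touchfiles.ts: the `TouchfileEntry` shape, the `globToRegExp` and `selectTests` helpers, the exact-tier-match rule, and every dependency path other than test/helpers/agent-sdk-runner.ts are assumptions made for illustration.

```typescript
// Hypothetical registry shape: test name -> tier + dependency globs.
type Tier = "gate" | "periodic";

interface TouchfileEntry {
  tier: Tier;
  deps: string[]; // file paths or simple "*" globs
}

const touchfiles: Record<string, TouchfileEntry> = {
  "brain-privacy-gate": {
    tier: "periodic",
    deps: [
      "scripts/generate-brain-sync-block.ts", // assumed location
      "bin/gstack-brain-*",                   // assumed location
      "bin/gstack-config",                    // assumed location
      "test/helpers/agent-sdk-runner.ts",
    ],
  },
};

// Convert a glob with "*" wildcards into an anchored RegExp.
// "*" matches any run of characters within one path segment.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^/]*");
  return new RegExp(`^${escaped}$`);
}

// Return the tests in the active tier whose dependencies
// overlap the list of changed files.
// Assumption: a test runs only when its tier equals the active tier.
function selectTests(
  changedFiles: string[],
  registry: Record<string, TouchfileEntry>,
  activeTier: Tier,
): string[] {
  return Object.entries(registry)
    .filter(([, entry]) => entry.tier === activeTier)
    .filter(([, entry]) =>
      entry.deps.some((dep) => {
        const re = globToRegExp(dep);
        return changedFiles.some((file) => re.test(file));
      }),
    )
    .map(([name]) => name);
}
```

Under these assumptions, `selectTests(["test/helpers/agent-sdk-runner.ts"], touchfiles, "periodic")` selects `brain-privacy-gate`, while the same change with the tier set to `"gate"` selects nothing, which mirrors the note that the gate tier stays free.
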
providers	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040)	2026-04-19 17:50:31 +08:00
agent-sdk-runner.ts	v1.11.1.0 fix: plan-mode handshake + canUseTool test harness (#1182)	2026-04-24 00:04:53 -07:00
benchmark-judge.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040)	2026-04-19 17:50:31 +08:00
benchmark-runner.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040)	2026-04-19 17:50:31 +08:00
codex-session-runner.ts	fix: enforce Codex 1024-char description limit + auto-heal stale installs (v0.11.9.0) (#391)	2026-03-23 08:44:08 -07:00
e2e-helpers.ts	feat: remove trigger guard + proactive opt-out prompt (#457)	2026-03-24 18:07:36 -07:00
eval-store.test.ts	feat: QA restructure, browser ref staleness, eval efficiency metrics (v0.4.0) (#83)	2026-03-15 23:55:39 -05:00
eval-store.ts	feat: worktree isolation for E2E tests + infrastructure elegance (v0.11.12.0) (#425)	2026-03-23 23:05:22 -07:00
gemini-session-runner.test.ts	feat: Gemini CLI E2E tests (v0.9.2.0) (#252)	2026-03-20 08:30:09 -07:00
gemini-session-runner.ts	feat: Gemini CLI E2E tests (v0.9.2.0) (#252)	2026-03-20 08:30:09 -07:00
llm-judge.ts	feat: mode-posture energy fix for /plan-ceo-review and /office-hours (v1.1.2.0) (#1065)	2026-04-19 05:44:39 +08:00
observability.test.ts	fix: never clean up observability artifacts — partial file persists after finalize	2026-03-14 12:37:38 -05:00
plan-mode-handshake-helpers.ts	v1.11.1.0 fix: plan-mode handshake + canUseTool test harness (#1182)	2026-04-24 00:04:53 -07:00
pricing.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040)	2026-04-19 17:50:31 +08:00
secret-sink-harness.ts	feat(test): add secret-sink-harness for negative-space leak testing (D21 #5)	2026-04-24 00:09:04 -07:00
session-runner.test.ts	feat: stream-json NDJSON parser for real-time E2E progress	2026-03-14 03:49:36 -05:00
session-runner.ts	fix(checkpoint): rename /checkpoint → /context-save + /context-restore (v1.0.1.0) (#1064)	2026-04-19 08:38:19 +08:00
skill-parser.ts	feat: content security — 4-layer prompt injection defense for pair-agent (#815)	2026-04-06 14:41:06 -07:00
tool-map.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040)	2026-04-19 17:50:31 +08:00
touchfiles.ts	test(e2e): privacy-gate AskUserQuestion fires from preamble (periodic tier)	2026-04-24 01:07:30 -07:00