gstack

History

Garry Tan bc6040fda7 feat(spec): expansions — flags, archive, quality gate, plan-mode-aware Phase 5, /ship integration, tests Builds on the @jayzalowitz foundation (commit `a4e6ee38`) with the full expansion set from CEO + Eng + DX review (24 user decisions + 23 of 28 codex adversarial findings). spec/SKILL.md.tmpl additions: - Flag reference table (--dedupe / --no-gate / --audit / --execute / --no-execute / --file-only / --plan-file / --sync-archive). - Phase 1b --dedupe (default ON): gh issue list --search with graceful skip on gh-not-installed / unauthed / rate-limited / other errors. AskUserQuestion when matches found (merge / file-new / cancel). - Phase 3 HARD requirement: agent MUST grep/read at least one piece of evidence before asking. Project-level fallback prose for prompts with no concrete file mapping. Greenfield escape clause. - Phase 4.5 quality gate (default ON): codex adversarial dispatch with fail-closed redaction (AWS/GitHub/Anthropic/OpenAI/private-key regex), hard <<<USER_SPEC>>> delimiters + instruction boundary (prompt-injection defense), score 0-10 with <7 block, up to 3 iterations, AskUserQuestion escape on persistent <7 (ship anyway / save draft / one more try). - Phase 5 plan-mode-aware dispatch: reads GSTACK_PLAN_MODE env. Active → file-only + load into plan file. Inactive → file + --execute spawn by default. CLI overrides for explicit control. - Archive block via eval $(gstack-paths) → $GSTACK_STATE_ROOT/projects/ $SLUG/specs/<datetime>-<pid>-<slug>.md. Atomic .tmp/mv write. Sync excluded by default; --sync-archive to opt in. - --execute path: dirty-worktree gate (porcelain check + 3-option AUQ continue/stash/cancel), TOCTOU re-check after AUQ answer, SHA pin via git rev-parse HEAD, unique branch spec/<slug>-$$ + PID-suffixed worktree, mandatory final-confirm gate, stash policy with restore safety (preserve ref, never auto-drop). - TTHW timestamps captured at Phase 1 / first citation / file-or-spawn, emitted as ttfc_ms + tthw_ms in preamble telemetry envelope. Cross-system plumbing: - scripts/resolvers/preamble/generate-preamble-bash.ts: emit GSTACK_PLAN_MODE=active\|inactive based on CLAUDE_PLAN_FILE presence. - scripts/resolvers/preamble/generate-routing-injection.ts: add /spec to the routing block injected into project CLAUDE.md. - ship/SKILL.md.tmpl: new "Linked Spec" PR-body section. Reads archive frontmatter spec_issue_number and adds Closes #N when full delivery confirmed by existing plan-completion gate (codex F4 — conditional). Branch-name inference NOT used (codex F3 — fragile under rebase). Tests (W7): - test/spec-template-invariants.test.ts: 35 deterministic assertions covering Phase 1 hard gate, Phase 3 hard-grep mandate, --dedupe graceful-skip paths, --execute race + security hardening (TOCTOU, SHA pin, unique branch), quality-gate redaction + BLOCKED path, archive atomic write + sync exclusion, plan-mode-aware Phase 5. - test/spec-template-sync.test.ts: regen + byte-identical check. - test/skill-e2e-spec-execute.test.ts (periodic-tier scaffold). - test/skill-llm-eval-spec.test.ts (periodic-tier scaffold). - test/helpers/touchfiles.ts: register both periodics in E2E_TIERS + LLM_JUDGE_TOUCHFILES. 37/37 /spec tests pass. Full bun test exit 0 (pre-existing url-validation timeout unrelated to /spec). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>		2026-05-26 18:44:29 -07:00
..
providers	v1.30.0.0 fix wave: 21 community PRs + Windows CI extension + codex flag-semantics smoke (#1391 )	2026-05-09 08:06:47 -07:00
agent-sdk-runner.ts	v1.24.0.0 feat: cross-platform hardening — curated Windows lane + Bun.which resolver + path-portability helper (#1252 )	2026-05-01 07:21:28 -07:00
benchmark-judge.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040 )	2026-04-19 17:50:31 +08:00
benchmark-runner.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040 )	2026-04-19 17:50:31 +08:00
claude-pty-runner.ts	v1.31.0.0 fix: delete AskUserQuestion fallback (root cause of forever war) + harness primitives (#1390 )	2026-05-09 17:01:13 -07:00
claude-pty-runner.unit.test.ts	v1.31.0.0 fix: delete AskUserQuestion fallback (root cause of forever war) + harness primitives (#1390 )	2026-05-09 17:01:13 -07:00
codex-session-runner.ts	fix: enforce Codex 1024-char description limit + auto-heal stale installs (v0.11.9.0) (#391 )	2026-03-23 08:44:08 -07:00
e2e-helpers.ts	v1.39.2.0 feat: GSTACK_* env-shim for Conductor + gbrain/gstack setup docs (#1534 )	2026-05-16 12:32:33 -07:00
eval-store.test.ts	feat: QA restructure, browser ref staleness, eval efficiency metrics (v0.4.0) (#83 )	2026-03-15 23:55:39 -05:00
eval-store.ts	v1.32.0.0 fix wave: 7 community PRs + 5 gate-eval hardenings (#1431 )	2026-05-11 12:16:26 -07:00
gemini-session-runner.test.ts	feat: Gemini CLI E2E tests (v0.9.2.0) (#252 )	2026-03-20 08:30:09 -07:00
gemini-session-runner.ts	feat: Gemini CLI E2E tests (v0.9.2.0) (#252 )	2026-03-20 08:30:09 -07:00
llm-judge.ts	v1.25.1.0 fix: office-hours Phase 4 STOP gate + AskUserQuestion recommendation judge (#1296 )	2026-05-01 19:51:51 -07:00
observability.test.ts	fix: never clean up observability artifacts — partial file persists after finalize	2026-03-14 12:37:38 -05:00
pricing.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040 )	2026-04-19 17:50:31 +08:00
secret-sink-harness.ts	v1.12.0.0 feat: /setup-gbrain — coding-agent onboarding for gbrain (#1183 )	2026-04-24 01:38:21 -07:00
session-runner.test.ts	feat: stream-json NDJSON parser for real-time E2E progress	2026-03-14 03:49:36 -05:00
session-runner.ts	fix(checkpoint): rename /checkpoint → /context-save + /context-restore (v1.0.1.0) (#1064 )	2026-04-19 08:38:19 +08:00
skill-parser.ts	feat: content security — 4-layer prompt injection defense for pair-agent (#815 )	2026-04-06 14:41:06 -07:00
tool-map.ts	feat(v1.3.0.0): open agents learnings + cross-model benchmark skill (#1040 )	2026-04-19 17:50:31 +08:00
touchfiles.ts	feat(spec): expansions — flags, archive, quality gate, plan-mode-aware Phase 5, /ship integration, tests	2026-05-26 18:44:29 -07:00