mirror of https://github.com/garrytan/gstack.git
v1.57.10.0 feat: Codex review default-on across review/ship/plan/docs (#1966)
* feat(config): make codex_reviews the master switch for all Codex review Broaden the codex_reviews doc to describe it governing /review, /ship, /document-release, plan reviews, and /autoplan. Reject invalid values on set (preserving the existing value) so a typo can never silently flip paid Codex calls on or off. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(review): Codex review default-on across review/ship/plan/docs Add a shared codexPreflight() helper (constants.ts) that, in one bash block, reads codex_reviews, sources gstack-codex-probe, checks install + auth, and echoes a single canonical mode (ready/not_installed/not_authed/ disabled). All Codex resolvers route through it. - generateCodexPlanReview: opt-in question removed; the outside voice now runs automatically (default-on), falling back to a Claude subagent when Codex is missing/unauthed. Cross-model tension still gates on user approval (sovereignty preserved). - generateAdversarialStep: probe-based availability (install AND auth), distinct not-installed vs not-authed guidance; 200-line structured-review threshold unchanged. - generateCodexDocReview (new, wired via CODEX_DOC_REVIEW): reviews the release's docs against the shipped diff range, informational + an explicit apply-fixes decision point, never auto-edits. - autoplan Phase 0.5 now honors codex_reviews=disabled so the switch is truly global. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * chore(docs): regenerate SKILL docs + refresh ship golden Output of gen:skill-docs for the Codex-default-on resolver/template changes. Refreshes the factory-ship golden fixture (codex-host output unchanged — resolvers strip for the codex host). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(infra): widen size-budget guards for default-on Codex outside-voice The codexPreflight() block + CODEX_MODE branch prose (replacing the smaller opt-in question) grows plan-ceo/eng/devex-review and review by 5-7% over baseline. Each bump carries a comment justifying it as intentional capability, not slop. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: guard Codex default-on + config reject-on-set skill-validation: assert plan reviews no longer carry the opt-in question and render the default-on outside-voice, document-release carries the doc review, and the codex host strips all of it. gstack-config: codex_reviews defaults to enabled, accepts enabled/disabled, and rejects an invalid value while preserving the existing one. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(test): align gstack-config tests with defaults-fallback behavior Three tests (last touched v0.13.7.0) asserted get/list print empty for unset keys, but gstack-config falls back to the documented defaults table (get returns the default, list shows the active-values block). Update the assertions to the real behavior and split out an unknown-key case that does still return empty. Pre-existing red, unrelated to codex review. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * v1.57.10.0 feat: Codex review default-on across review/ship/plan/docs Codex cross-model review now runs by default on /review, /ship, all four plan reviews, /document-release, and /autoplan, governed by one master switch (codex_reviews, default enabled). Plan-review outside voice is default-on; /document-release gets a new Codex doc-vs-diff audit; every call site detects install AND auth and falls back to a Claude subagent with a clear reason. Disable everything with: gstack-config set codex_reviews disabled Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
8241949357
commit
a5833c413f
86
CHANGELOG.md
86
CHANGELOG.md
|
|
@ -1,5 +1,91 @@
|
|||
# Changelog
|
||||
|
||||
## [1.57.10.0] - 2026-06-10
|
||||
|
||||
## **Codex review now runs by default everywhere it matters.**
|
||||
## **One switch governs it, and it falls back to Claude when Codex is missing or unauthed.**
|
||||
|
||||
Codex cross-model review used to be inconsistent. `/review` and `/ship` ran it
|
||||
automatically, but plan reviews hid it behind a "Want an outside voice?" question
|
||||
you had to say yes to every time, `/document-release` never ran it at all, and every
|
||||
entry point only checked whether the `codex` binary existed, not whether it was
|
||||
logged in. Now `codex_reviews` is one master switch (default `enabled`) that governs
|
||||
Codex review across `/review`, `/ship`, `/plan-ceo-review`, `/plan-eng-review`,
|
||||
`/plan-design-review`, `/plan-devex-review`, `/document-release`, and `/autoplan`.
|
||||
The plan-review outside voice runs automatically. `/document-release` gets a new
|
||||
Codex pass that checks your docs against what actually shipped. Every call site now
|
||||
detects install AND auth separately, and degrades to a Claude subagent with a clear
|
||||
one-line reason instead of silently skipping. Turn the whole thing off with one
|
||||
command: `gstack-config set codex_reviews disabled`.
|
||||
|
||||
### The numbers that matter
|
||||
|
||||
Verified by the gate-tier E2E evals that exercise these exact paths
|
||||
(`codex-offered-ceo-review`, `codex-offered-eng-review`, `document-release`,
|
||||
`codex-review-findings`), all green this run.
|
||||
|
||||
| Metric | Before | After | Δ |
|
||||
|--------|--------|-------|---|
|
||||
| Skills where Codex review runs by default | 2 | 8 | +6 |
|
||||
| Prompts to get a plan-review outside voice | 1 (opt-in each time) | 0 (automatic) | -1 |
|
||||
| Codex readiness detection | install only | install + auth | sharper |
|
||||
| Master switches to disable it all | 0 (per-skill only) | 1 (`codex_reviews`) | +1 |
|
||||
| `/document-release` Codex doc audit | none | doc-vs-diff pass | new |
|
||||
|
||||
When Codex is installed but not logged in, you used to get nothing on the paths that
|
||||
checked only `command -v codex`. Now you get a named reason ("Codex installed but not
|
||||
authenticated, using Claude subagent") and the review still happens. A typo on the
|
||||
switch (`gstack-config set codex_reviews disabledd`) is rejected and your existing
|
||||
setting is preserved, so a fat-finger can never silently turn paid Codex calls on or
|
||||
off.
|
||||
|
||||
### What this means for you
|
||||
|
||||
If you run gstack day to day, you stop deciding whether to get a second model's eyes
|
||||
on every plan and every release. It is just there, on by default, the way the strong
|
||||
reviewers already worked on diffs. If you do not have Codex set up, nothing breaks:
|
||||
you get the Claude outside voice instead, with a one-line note telling you how to add
|
||||
Codex for true cross-model coverage. If you want it gone, one command turns off all
|
||||
eight surfaces at once.
|
||||
|
||||
### Itemized changes
|
||||
|
||||
#### Added
|
||||
- **`codex_reviews` as the master switch** for Codex review across `/review`, `/ship`,
|
||||
`/document-release`, all four plan reviews, and `/autoplan` (`bin/gstack-config`).
|
||||
Default `enabled`. Invalid values on `set` are rejected with the existing value
|
||||
preserved, so a typo cannot flip paid Codex calls.
|
||||
- **`/document-release` Codex doc audit** (`generateCodexDocReview`): reviews the
|
||||
docs you touched against the release diff for stale claims, undocumented new
|
||||
surface, and over/under-sold CHANGELOG entries. Informational, with an explicit
|
||||
apply-fixes decision point. Never auto-edits docs.
|
||||
- **`codexPreflight()` shared helper** (`scripts/resolvers/constants.ts`): one
|
||||
self-contained bash block that reads the switch, sources the probe, checks install
|
||||
and auth, and emits a single canonical mode (`ready` / `not_installed` /
|
||||
`not_authed` / `disabled`).
|
||||
|
||||
#### Changed
|
||||
- **Plan-review outside voice is default-on**, not opt-in. The "Want an outside
|
||||
voice?" question is gone; it runs automatically and falls back to a Claude subagent
|
||||
when Codex is unavailable. Incorporating its findings still requires your explicit
|
||||
approval (cross-model tension is presented, never auto-applied).
|
||||
- **Adversarial review detects auth, not just install** (`generateAdversarialStep`):
|
||||
distinct "not installed" vs "not authenticated" guidance. The 200-line threshold
|
||||
for the heavier structured `codex review` is unchanged.
|
||||
- **`/autoplan` honors `codex_reviews=disabled`** in its Phase 0.5 preflight, so the
|
||||
switch is truly global.
|
||||
|
||||
#### Fixed
|
||||
- Three `gstack-config` tests asserted `get`/`list` print empty for unset keys; the
|
||||
tool falls back to the documented defaults table. Assertions now match real behavior.
|
||||
|
||||
#### For contributors
|
||||
- Size-budget guards widened for the default-on outside-voice prose, each with a
|
||||
rationale comment (`test/helpers/carve-guards.ts`, `test/helpers/parity-harness.ts`).
|
||||
- Static guards added: plan reviews must not carry the opt-in question and must render
|
||||
the default-on voice; `/document-release` must carry the doc review; the codex host
|
||||
strips all of it (`test/skill-validation.test.ts`).
|
||||
|
||||
## [1.57.9.0] - 2026-06-09
|
||||
|
||||
## **Your gstack checkout stays clean when gbrain is installed.**
|
||||
|
|
|
|||
|
|
@ -1065,11 +1065,17 @@ workflow.
|
|||
|
||||
```bash
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe
|
||||
|
||||
# Master switch first: codex_reviews=disabled turns off ALL Codex work globally,
|
||||
# including autoplan's own dual-voice orchestration. Honor it before probing.
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
echo "[codex disabled by config — Claude-only voices] Re-enable: gstack-config set codex_reviews enabled"
|
||||
_CODEX_AVAILABLE=false
|
||||
# Check Codex binary. If missing, tag the degradation matrix and continue
|
||||
# with Claude subagent only (autoplan's existing degradation fallback).
|
||||
if ! command -v codex >/dev/null 2>&1; then
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_gstack_codex_log_event "codex_cli_missing"
|
||||
echo "[codex-unavailable: binary not found] — proceeding with Claude subagent only"
|
||||
_CODEX_AVAILABLE=false
|
||||
|
|
|
|||
|
|
@ -243,11 +243,17 @@ workflow.
|
|||
|
||||
```bash
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe
|
||||
|
||||
# Master switch first: codex_reviews=disabled turns off ALL Codex work globally,
|
||||
# including autoplan's own dual-voice orchestration. Honor it before probing.
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
echo "[codex disabled by config — Claude-only voices] Re-enable: gstack-config set codex_reviews enabled"
|
||||
_CODEX_AVAILABLE=false
|
||||
# Check Codex binary. If missing, tag the degradation matrix and continue
|
||||
# with Claude subagent only (autoplan's existing degradation fallback).
|
||||
if ! command -v codex >/dev/null 2>&1; then
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_gstack_codex_log_event "codex_cli_missing"
|
||||
echo "[codex-unavailable: binary not found] — proceeding with Claude subagent only"
|
||||
_CODEX_AVAILABLE=false
|
||||
|
|
|
|||
|
|
@ -86,7 +86,16 @@ CONFIG_HEADER='# gstack configuration — edit freely, changes take effect on ne
|
|||
# # --no-plan-tune-hooks, or env GSTACK_PLAN_TUNE_HOOKS.
|
||||
#
|
||||
# ─── Advanced ────────────────────────────────────────────────────────
|
||||
# codex_reviews: enabled # disabled = skip Codex adversarial reviews in /ship
|
||||
# codex_reviews: enabled # Master switch for Codex cross-model review. enabled =
|
||||
# # Codex runs as a standard step in /review, /ship,
|
||||
# # /document-release, plan reviews, and /autoplan (auto
|
||||
# # falls back to a Claude subagent if Codex is missing or
|
||||
# # not authenticated). disabled = skip all Codex passes.
|
||||
# # Asymmetry on disabled: diff-review (/review, /ship) still
|
||||
# # runs the free Claude adversarial subagent; plan-review and
|
||||
# # /document-release skip the outside-voice step entirely.
|
||||
# # An invalid value is REJECTED (existing value preserved) so
|
||||
# # a typo cannot silently turn paid Codex calls on or off.
|
||||
# gstack_contributor: false # true = file field reports when gstack misbehaves
|
||||
# skip_eng_review: false # true = skip eng review gate in /ship (not recommended)
|
||||
#
|
||||
|
|
@ -302,6 +311,13 @@ case "${1:-}" in
|
|||
echo "Warning: plan_tune_hooks '$VALUE' not recognized. Valid values: prompt, yes, no. Using prompt." >&2
|
||||
VALUE="prompt"
|
||||
fi
|
||||
# codex_reviews controls PAID Codex calls. Unlike the warn-and-default keys above,
|
||||
# an invalid value is REJECTED and the existing setting is left unchanged — a typo
|
||||
# must never silently flip the switch and turn paid Codex calls on or off.
|
||||
if [ "$KEY" = "codex_reviews" ] && [ "$VALUE" != "enabled" ] && [ "$VALUE" != "disabled" ]; then
|
||||
echo "Error: codex_reviews '$VALUE' not recognized. Valid values: enabled, disabled. Existing value left unchanged." >&2
|
||||
exit 1
|
||||
fi
|
||||
mkdir -p "$STATE_DIR"
|
||||
# Write annotated header on first creation
|
||||
if [ ! -f "$CONFIG_FILE" ]; then
|
||||
|
|
|
|||
|
|
@ -41,9 +41,16 @@ afterEach(() => {
|
|||
|
||||
describe('gstack-config', () => {
|
||||
// ─── get ──────────────────────────────────────────────────
|
||||
test('get on missing file returns empty, exit 0', () => {
|
||||
test('get on missing file returns the default, exit 0', () => {
|
||||
// auto_upgrade has a default of false; get falls back to the defaults table.
|
||||
const { exitCode, stdout } = run(['get', 'auto_upgrade']);
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('false');
|
||||
});
|
||||
|
||||
test('get unknown key on missing file returns empty, exit 0', () => {
|
||||
const { exitCode, stdout } = run(['get', 'some_unknown_key']);
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
});
|
||||
|
||||
|
|
@ -110,10 +117,12 @@ describe('gstack-config', () => {
|
|||
expect(stdout).toContain('update_check: false');
|
||||
});
|
||||
|
||||
test('list on missing file returns empty, exit 0', () => {
|
||||
test('list on missing file shows defaults, exit 0', () => {
|
||||
// list prints the active-values block with defaults for unset keys.
|
||||
const { exitCode, stdout } = run(['list']);
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('');
|
||||
expect(stdout).toContain('proactive:');
|
||||
expect(stdout).toContain('(default)');
|
||||
});
|
||||
|
||||
// ─── usage ────────────────────────────────────────────────
|
||||
|
|
@ -151,6 +160,29 @@ describe('gstack-config', () => {
|
|||
expect(content).toContain('skip_eng_review:');
|
||||
});
|
||||
|
||||
// ─── codex_reviews (paid-calls switch: reject-on-set, preserve existing) ──
|
||||
test('codex_reviews defaults to enabled', () => {
|
||||
const { exitCode, stdout } = run(['get', 'codex_reviews']);
|
||||
expect(exitCode).toBe(0);
|
||||
expect(stdout).toBe('enabled');
|
||||
});
|
||||
|
||||
test('codex_reviews accepts enabled and disabled', () => {
|
||||
expect(run(['set', 'codex_reviews', 'disabled']).exitCode).toBe(0);
|
||||
expect(run(['get', 'codex_reviews']).stdout).toBe('disabled');
|
||||
expect(run(['set', 'codex_reviews', 'enabled']).exitCode).toBe(0);
|
||||
expect(run(['get', 'codex_reviews']).stdout).toBe('enabled');
|
||||
});
|
||||
|
||||
test('codex_reviews rejects an invalid value and preserves the existing one', () => {
|
||||
run(['set', 'codex_reviews', 'disabled']);
|
||||
const { exitCode, stderr } = run(['set', 'codex_reviews', 'disabledd']);
|
||||
expect(exitCode).not.toBe(0); // rejected, not warn-and-default
|
||||
expect(stderr).toContain('not recognized');
|
||||
// existing value must be untouched — a typo never silently flips paid Codex on/off
|
||||
expect(run(['get', 'codex_reviews']).stdout).toBe('disabled');
|
||||
});
|
||||
|
||||
test('header written only once, not duplicated on second set', () => {
|
||||
run(['set', 'foo', 'bar']);
|
||||
run(['set', 'baz', 'qux']);
|
||||
|
|
@ -176,9 +208,9 @@ describe('gstack-config', () => {
|
|||
});
|
||||
|
||||
// ─── routing_declined ──────────────────────────────────────
|
||||
test('routing_declined defaults to empty (not set)', () => {
|
||||
test('routing_declined defaults to false (not set)', () => {
|
||||
const { stdout } = run(['get', 'routing_declined']);
|
||||
expect(stdout).toBe('');
|
||||
expect(stdout).toBe('false');
|
||||
});
|
||||
|
||||
test('routing_declined can be set and read', () => {
|
||||
|
|
|
|||
|
|
@ -358,3 +358,120 @@ Diagram drift:
|
|||
```
|
||||
|
||||
If all coverage is complete and no diagrams drifted, output: "Coverage: all shipped features have adequate documentation."
|
||||
|
||||
---
|
||||
|
||||
## Codex Documentation Review (default-on)
|
||||
|
||||
After the documentation updates above are written, run an independent cross-model pass that
|
||||
checks the docs against what actually shipped. This is a standard part of /document-release,
|
||||
not an opt-in. The user turns it off only by asking explicitly
|
||||
(`gstack-config set codex_reviews disabled`).
|
||||
|
||||
**Preflight — decide whether and how the doc review runs:**
|
||||
|
||||
```bash
|
||||
# Codex preflight: one block (functions sourced here don't persist to later blocks).
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
_CODEX_MODE="disabled"
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true
|
||||
else
|
||||
_CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true
|
||||
fi
|
||||
echo "CODEX_MODE: $_CODEX_MODE"
|
||||
```
|
||||
|
||||
Branch on the echoed `CODEX_MODE`:
|
||||
- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`."
|
||||
- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path.
|
||||
- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path.
|
||||
- **`ready`** — run the Codex pass below.
|
||||
|
||||
When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch
|
||||
stays discoverable: "Running the Codex doc review automatically (standard step). Disable: `gstack-config set codex_reviews disabled`."
|
||||
|
||||
**Determine the release diff range (D3 — reuse the method, do not invent one).**
|
||||
Recompute the SAME range document-release used in its pre-flight / diff analysis, with the
|
||||
documented merge-base method:
|
||||
|
||||
```bash
|
||||
DOC_DIFF_BASE=$(git merge-base origin/<base> HEAD 2>/dev/null || echo "<base>")
|
||||
echo "DOC_DIFF_BASE: $DOC_DIFF_BASE"
|
||||
```
|
||||
|
||||
Do NOT rely on an in-memory variable from an earlier step — shell vars do not survive across
|
||||
blocks. Recompute it here.
|
||||
|
||||
**Construct the doc-review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`).
|
||||
Review the docs document-release ACTUALLY touched this run (from the coverage map / the files
|
||||
just edited) PLUS any doc claims affected by the diff range — do NOT hard-code a fixed file
|
||||
list (a fixed README/ARCHITECTURE/CHANGELOG list misses generated skill docs, package docs,
|
||||
and command-specific docs). **Always start with the filesystem boundary instruction:**
|
||||
|
||||
"IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nYou are reviewing documentation changes against the code that shipped on this
|
||||
branch. Run \`git diff \$DOC_DIFF_BASE...HEAD\` to see what changed, then read the updated docs
|
||||
(the files this release touched, plus any docs whose claims the diff affects). Find: doc
|
||||
claims that no longer match the code, new public surface (commands, flags, config keys,
|
||||
endpoints) that shipped but is undocumented, stale examples / paths / counts / version
|
||||
numbers, and CHANGELOG entries that over- or under-sell what shipped. Be terse. Just the gaps.
|
||||
|
||||
THE DOCS AND DIFF: <list the touched doc paths>"
|
||||
|
||||
**If `CODEX_MODE: ready` — run Codex:**
|
||||
|
||||
```bash
|
||||
TMPERR_DOC=$(mktemp /tmp/codex-docreview-XXXXXXXX)
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR_DOC"
|
||||
```
|
||||
|
||||
Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
|
||||
```bash
|
||||
cat "$TMPERR_DOC"
|
||||
```
|
||||
|
||||
Present the full output verbatim under `CODEX SAYS (documentation review):`.
|
||||
|
||||
**Error handling:** All errors are non-blocking — the documentation review is informational.
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): note and skip
|
||||
- Timeout: note timeout duration and skip
|
||||
- Empty response: note and skip
|
||||
On any error: continue — documentation review is informational, not a gate.
|
||||
|
||||
**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):**
|
||||
|
||||
Dispatch via the Agent tool with the same prompt. Bound it at a 5-minute timeout.
|
||||
Present findings under `DOCUMENTATION REVIEW (Claude subagent):`. If it fails: "Doc review unavailable. Continuing."
|
||||
|
||||
**Apply decision (T3B — informational, never auto-edit, but findings don't evaporate).**
|
||||
If there are zero findings, say "Docs match what shipped — no gaps." and continue. Otherwise
|
||||
present the findings, then use AskUserQuestion ONCE:
|
||||
|
||||
> "The doc review found N gaps between the docs and what shipped. How do you want to handle them?"
|
||||
>
|
||||
> RECOMMENDATION: Choose A if the gaps are concrete doc fixes (stale path, missing flag). The
|
||||
> doc review only reports; nothing is edited without your say-so. Completeness: A=9/10, B=4/10, C=8/10.
|
||||
|
||||
Options:
|
||||
- A) Apply all the doc fixes now
|
||||
- B) Skip — leave docs as-is
|
||||
- C) Decide per-finding
|
||||
|
||||
On A or per-finding approvals, make the approved edits yourself (the tool never silently
|
||||
rewrites docs). On B, note the gaps in the output so they're visible.
|
||||
|
||||
**Persist the result:**
|
||||
```bash
|
||||
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-doc-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}'
|
||||
```
|
||||
Substitute: STATUS = "clean" if no gaps, "issues_found" if gaps exist. SOURCE = "codex" if Codex ran, "claude" if the subagent ran.
|
||||
|
||||
**Cleanup:** Run `rm -f "$TMPERR_DOC"` after processing (if Codex was used).
|
||||
|
||||
---
|
||||
|
|
|
|||
|
|
@ -356,3 +356,7 @@ Diagram drift:
|
|||
```
|
||||
|
||||
If all coverage is complete and no diagrams drifted, output: "Coverage: all shipped features have adequate documentation."
|
||||
|
||||
---
|
||||
|
||||
{{CODEX_DOC_REVIEW}}
|
||||
|
|
|
|||
|
|
@ -1,6 +1,6 @@
|
|||
{
|
||||
"name": "gstack",
|
||||
"version": "1.57.9.0",
|
||||
"version": "1.57.10.0",
|
||||
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
|
||||
"license": "MIT",
|
||||
"type": "module",
|
||||
|
|
|
|||
|
|
@ -253,38 +253,46 @@ If this plan has significant UI scope, recommend: "Consider running /plan-design
|
|||
**STOP.** AskUserQuestion once per issue. Do NOT batch. Recommend + WHY. If this section turned up zero findings, state "No issues, moving on" and proceed. If the section has findings, you MUST call AskUserQuestion as a tool_use — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. Do NOT proceed until the user responds.
|
||||
**Reminder: Do NOT make any code changes. Review only.**
|
||||
|
||||
## Outside Voice — Independent Plan Challenge (optional, recommended)
|
||||
## Outside Voice — Independent Plan Challenge (default-on)
|
||||
|
||||
After all review sections are complete, offer an independent second opinion from a
|
||||
different AI system. Two models agreeing on a plan is stronger signal than one model's
|
||||
thorough review.
|
||||
After all review sections are complete, run an independent second opinion from a
|
||||
different AI system automatically — it is a standard part of plan review, not an
|
||||
opt-in. Two models agreeing on a plan is stronger signal than one model's thorough
|
||||
review. The user turns this off only by asking explicitly
|
||||
(`gstack-config set codex_reviews disabled`).
|
||||
|
||||
**Check tool availability:**
|
||||
**Preflight — decide whether and how the outside voice runs:**
|
||||
|
||||
```bash
|
||||
command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
# Codex preflight: one block (functions sourced here don't persist to later blocks).
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
_CODEX_MODE="disabled"
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true
|
||||
else
|
||||
_CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true
|
||||
fi
|
||||
echo "CODEX_MODE: $_CODEX_MODE"
|
||||
```
|
||||
|
||||
Use AskUserQuestion:
|
||||
Branch on the echoed `CODEX_MODE`:
|
||||
- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`."
|
||||
- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path.
|
||||
- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path.
|
||||
- **`ready`** — run the Codex pass below.
|
||||
|
||||
> "All review sections are complete. Want an outside voice? A different AI system can
|
||||
> give a brutally honest, independent challenge of this plan — logical gaps, feasibility
|
||||
> risks, and blind spots that are hard to catch from inside the review. Takes about 2
|
||||
> minutes."
|
||||
>
|
||||
> RECOMMENDATION: Choose A — an independent second opinion catches structural blind
|
||||
> spots. Two different AI models agreeing on a plan is stronger signal than one model's
|
||||
> thorough review. Completeness: A=9/10, B=7/10.
|
||||
When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch
|
||||
stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`."
|
||||
|
||||
Options:
|
||||
- A) Get the outside voice (recommended)
|
||||
- B) Skip — proceed to outputs
|
||||
|
||||
**If B:** Print "Skipping outside voice." and continue to the next section.
|
||||
|
||||
**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file
|
||||
the user pointed this review at, or the branch diff scope). If a CEO plan document
|
||||
was written in Step 0D-POST, read that too — it contains the scope decisions and vision.
|
||||
**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`).
|
||||
Read the plan file being reviewed (the file the user pointed this review at, or the branch
|
||||
diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains
|
||||
the scope decisions and vision.
|
||||
|
||||
Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB,
|
||||
truncate to the first 30KB and note "Plan truncated for size"). **Always start with the
|
||||
|
|
@ -302,7 +310,7 @@ compliments. Just the problems.
|
|||
THE PLAN:
|
||||
<plan content>"
|
||||
|
||||
**If CODEX_AVAILABLE:**
|
||||
**If `CODEX_MODE: ready` — run Codex:**
|
||||
|
||||
```bash
|
||||
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
|
||||
|
|
@ -325,15 +333,15 @@ CODEX SAYS (plan review — outside voice):
|
|||
```
|
||||
|
||||
**Error handling:** All errors are non-blocking — the outside voice is informational.
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate."
|
||||
- Timeout: "Codex timed out after 5 minutes."
|
||||
- Empty response: "Codex returned no response."
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." Fall back to the Claude subagent below.
|
||||
- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below.
|
||||
- Empty response: "Codex returned no response." Fall back to the Claude subagent below.
|
||||
|
||||
On any Codex error, fall back to the Claude adversarial subagent.
|
||||
|
||||
**If CODEX_NOT_AVAILABLE (or Codex errored):**
|
||||
**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):**
|
||||
|
||||
Dispatch via the Agent tool. The subagent has fresh context — genuine independence.
|
||||
Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking"
|
||||
is also "never hanging."
|
||||
|
||||
Subagent prompt: same plan review prompt as above.
|
||||
|
||||
|
|
@ -341,6 +349,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header.
|
|||
|
||||
If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs."
|
||||
|
||||
(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.)
|
||||
|
||||
**Cross-model tension:**
|
||||
|
||||
After presenting the outside voice findings, note any points where the outside voice
|
||||
|
|
|
|||
|
|
@ -239,38 +239,46 @@ Check each item. For any unchecked item, explain what's missing and suggest the
|
|||
|
||||
**STOP.** AskUserQuestion for any item that requires a design decision.
|
||||
|
||||
## Outside Voice — Independent Plan Challenge (optional, recommended)
|
||||
## Outside Voice — Independent Plan Challenge (default-on)
|
||||
|
||||
After all review sections are complete, offer an independent second opinion from a
|
||||
different AI system. Two models agreeing on a plan is stronger signal than one model's
|
||||
thorough review.
|
||||
After all review sections are complete, run an independent second opinion from a
|
||||
different AI system automatically — it is a standard part of plan review, not an
|
||||
opt-in. Two models agreeing on a plan is stronger signal than one model's thorough
|
||||
review. The user turns this off only by asking explicitly
|
||||
(`gstack-config set codex_reviews disabled`).
|
||||
|
||||
**Check tool availability:**
|
||||
**Preflight — decide whether and how the outside voice runs:**
|
||||
|
||||
```bash
|
||||
command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
# Codex preflight: one block (functions sourced here don't persist to later blocks).
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
_CODEX_MODE="disabled"
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true
|
||||
else
|
||||
_CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true
|
||||
fi
|
||||
echo "CODEX_MODE: $_CODEX_MODE"
|
||||
```
|
||||
|
||||
Use AskUserQuestion:
|
||||
Branch on the echoed `CODEX_MODE`:
|
||||
- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`."
|
||||
- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path.
|
||||
- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path.
|
||||
- **`ready`** — run the Codex pass below.
|
||||
|
||||
> "All review sections are complete. Want an outside voice? A different AI system can
|
||||
> give a brutally honest, independent challenge of this plan — logical gaps, feasibility
|
||||
> risks, and blind spots that are hard to catch from inside the review. Takes about 2
|
||||
> minutes."
|
||||
>
|
||||
> RECOMMENDATION: Choose A — an independent second opinion catches structural blind
|
||||
> spots. Two different AI models agreeing on a plan is stronger signal than one model's
|
||||
> thorough review. Completeness: A=9/10, B=7/10.
|
||||
When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch
|
||||
stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`."
|
||||
|
||||
Options:
|
||||
- A) Get the outside voice (recommended)
|
||||
- B) Skip — proceed to outputs
|
||||
|
||||
**If B:** Print "Skipping outside voice." and continue to the next section.
|
||||
|
||||
**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file
|
||||
the user pointed this review at, or the branch diff scope). If a CEO plan document
|
||||
was written in Step 0D-POST, read that too — it contains the scope decisions and vision.
|
||||
**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`).
|
||||
Read the plan file being reviewed (the file the user pointed this review at, or the branch
|
||||
diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains
|
||||
the scope decisions and vision.
|
||||
|
||||
Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB,
|
||||
truncate to the first 30KB and note "Plan truncated for size"). **Always start with the
|
||||
|
|
@ -288,7 +296,7 @@ compliments. Just the problems.
|
|||
THE PLAN:
|
||||
<plan content>"
|
||||
|
||||
**If CODEX_AVAILABLE:**
|
||||
**If `CODEX_MODE: ready` — run Codex:**
|
||||
|
||||
```bash
|
||||
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
|
||||
|
|
@ -311,15 +319,15 @@ CODEX SAYS (plan review — outside voice):
|
|||
```
|
||||
|
||||
**Error handling:** All errors are non-blocking — the outside voice is informational.
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate."
|
||||
- Timeout: "Codex timed out after 5 minutes."
|
||||
- Empty response: "Codex returned no response."
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." Fall back to the Claude subagent below.
|
||||
- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below.
|
||||
- Empty response: "Codex returned no response." Fall back to the Claude subagent below.
|
||||
|
||||
On any Codex error, fall back to the Claude adversarial subagent.
|
||||
|
||||
**If CODEX_NOT_AVAILABLE (or Codex errored):**
|
||||
**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):**
|
||||
|
||||
Dispatch via the Agent tool. The subagent has fresh context — genuine independence.
|
||||
Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking"
|
||||
is also "never hanging."
|
||||
|
||||
Subagent prompt: same plan review prompt as above.
|
||||
|
||||
|
|
@ -327,6 +335,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header.
|
|||
|
||||
If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs."
|
||||
|
||||
(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.)
|
||||
|
||||
**Cross-model tension:**
|
||||
|
||||
After presenting the outside voice findings, note any points where the outside voice
|
||||
|
|
|
|||
|
|
@ -329,38 +329,46 @@ For each issue found in this section, call AskUserQuestion individually. One iss
|
|||
|
||||
**STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent.
|
||||
|
||||
## Outside Voice — Independent Plan Challenge (optional, recommended)
|
||||
## Outside Voice — Independent Plan Challenge (default-on)
|
||||
|
||||
After all review sections are complete, offer an independent second opinion from a
|
||||
different AI system. Two models agreeing on a plan is stronger signal than one model's
|
||||
thorough review.
|
||||
After all review sections are complete, run an independent second opinion from a
|
||||
different AI system automatically — it is a standard part of plan review, not an
|
||||
opt-in. Two models agreeing on a plan is stronger signal than one model's thorough
|
||||
review. The user turns this off only by asking explicitly
|
||||
(`gstack-config set codex_reviews disabled`).
|
||||
|
||||
**Check tool availability:**
|
||||
**Preflight — decide whether and how the outside voice runs:**
|
||||
|
||||
```bash
|
||||
command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
# Codex preflight: one block (functions sourced here don't persist to later blocks).
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
_CODEX_MODE="disabled"
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true
|
||||
else
|
||||
_CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true
|
||||
fi
|
||||
echo "CODEX_MODE: $_CODEX_MODE"
|
||||
```
|
||||
|
||||
Use AskUserQuestion:
|
||||
Branch on the echoed `CODEX_MODE`:
|
||||
- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`."
|
||||
- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path.
|
||||
- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path.
|
||||
- **`ready`** — run the Codex pass below.
|
||||
|
||||
> "All review sections are complete. Want an outside voice? A different AI system can
|
||||
> give a brutally honest, independent challenge of this plan — logical gaps, feasibility
|
||||
> risks, and blind spots that are hard to catch from inside the review. Takes about 2
|
||||
> minutes."
|
||||
>
|
||||
> RECOMMENDATION: Choose A — an independent second opinion catches structural blind
|
||||
> spots. Two different AI models agreeing on a plan is stronger signal than one model's
|
||||
> thorough review. Completeness: A=9/10, B=7/10.
|
||||
When the mode is `ready`, `not_installed`, or `not_authed`, print one line so the off-switch
|
||||
stays discoverable: "Running the outside voice automatically (standard step). Disable: `gstack-config set codex_reviews disabled`."
|
||||
|
||||
Options:
|
||||
- A) Get the outside voice (recommended)
|
||||
- B) Skip — proceed to outputs
|
||||
|
||||
**If B:** Print "Skipping outside voice." and continue to the next section.
|
||||
|
||||
**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file
|
||||
the user pointed this review at, or the branch diff scope). If a CEO plan document
|
||||
was written in Step 0D-POST, read that too — it contains the scope decisions and vision.
|
||||
**Construct the plan review prompt** (for `ready`, `not_installed`, and `not_authed` — skip only on `disabled`).
|
||||
Read the plan file being reviewed (the file the user pointed this review at, or the branch
|
||||
diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains
|
||||
the scope decisions and vision.
|
||||
|
||||
Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB,
|
||||
truncate to the first 30KB and note "Plan truncated for size"). **Always start with the
|
||||
|
|
@ -378,7 +386,7 @@ compliments. Just the problems.
|
|||
THE PLAN:
|
||||
<plan content>"
|
||||
|
||||
**If CODEX_AVAILABLE:**
|
||||
**If `CODEX_MODE: ready` — run Codex:**
|
||||
|
||||
```bash
|
||||
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
|
||||
|
|
@ -401,15 +409,15 @@ CODEX SAYS (plan review — outside voice):
|
|||
```
|
||||
|
||||
**Error handling:** All errors are non-blocking — the outside voice is informational.
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate."
|
||||
- Timeout: "Codex timed out after 5 minutes."
|
||||
- Empty response: "Codex returned no response."
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \`codex login\` to authenticate." Fall back to the Claude subagent below.
|
||||
- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below.
|
||||
- Empty response: "Codex returned no response." Fall back to the Claude subagent below.
|
||||
|
||||
On any Codex error, fall back to the Claude adversarial subagent.
|
||||
|
||||
**If CODEX_NOT_AVAILABLE (or Codex errored):**
|
||||
**If `CODEX_MODE: not_installed` or `not_authed` (or Codex errored at runtime):**
|
||||
|
||||
Dispatch via the Agent tool. The subagent has fresh context — genuine independence.
|
||||
Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking"
|
||||
is also "never hanging."
|
||||
|
||||
Subagent prompt: same plan review prompt as above.
|
||||
|
||||
|
|
@ -417,6 +425,8 @@ Present findings under an `OUTSIDE VOICE (Claude subagent):` header.
|
|||
|
||||
If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs."
|
||||
|
||||
(On `CODEX_MODE: disabled` you already skipped this section per the preflight — do not reach here.)
|
||||
|
||||
**Cross-model tension:**
|
||||
|
||||
After presenting the outside voice findings, note any points where the outside voice
|
||||
|
|
|
|||
|
|
@ -1602,23 +1602,47 @@ If no documentation files exist, skip this step silently.
|
|||
|
||||
Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical.
|
||||
|
||||
**Detect diff size and tool availability:**
|
||||
**Detect diff size:**
|
||||
|
||||
```bash
|
||||
DIFF_BASE=$(git merge-base origin/<base> HEAD)
|
||||
DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
|
||||
DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0")
|
||||
DIFF_TOTAL=$((DIFF_INS + DIFF_DEL))
|
||||
command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
# Legacy opt-out — only gates Codex passes, Claude always runs
|
||||
OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true)
|
||||
echo "DIFF_SIZE: $DIFF_TOTAL"
|
||||
echo "OLD_CFG: ${OLD_CFG:-not_set}"
|
||||
```
|
||||
|
||||
If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section.
|
||||
**Detect the Codex master switch + tool availability:**
|
||||
|
||||
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size.
|
||||
```bash
|
||||
# Codex preflight: one block (functions sourced here don't persist to later blocks).
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
_CODEX_MODE="disabled"
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true
|
||||
else
|
||||
_CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true
|
||||
fi
|
||||
echo "CODEX_MODE: $_CODEX_MODE"
|
||||
```
|
||||
|
||||
Branch on the echoed `CODEX_MODE`:
|
||||
- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only."
|
||||
- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path.
|
||||
- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path.
|
||||
- **`ready`** — run the Codex pass below.
|
||||
|
||||
For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the
|
||||
Claude adversarial subagent below still runs (it's free and fast). `ready` runs the Codex
|
||||
passes; `not_installed` / `not_authed` skip them with the printed note and continue with
|
||||
Claude only.
|
||||
|
||||
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`).
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -1639,9 +1663,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co
|
|||
|
||||
---
|
||||
|
||||
### Codex adversarial challenge (always runs when available)
|
||||
### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`)
|
||||
|
||||
If Codex is available AND `OLD_CFG` is NOT `disabled`:
|
||||
If `CODEX_MODE` is `ready`:
|
||||
|
||||
```bash
|
||||
TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX)
|
||||
|
|
@ -1663,13 +1687,13 @@ Present the full output verbatim. This is informational — it never blocks ship
|
|||
|
||||
**Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing.
|
||||
|
||||
If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`"
|
||||
If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only.
|
||||
|
||||
---
|
||||
|
||||
### Codex structured review (large diffs only, 200+ lines)
|
||||
|
||||
If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
|
||||
If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`:
|
||||
|
||||
```bash
|
||||
TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
|
||||
|
|
|
|||
|
|
@ -56,3 +56,61 @@ export function codexErrorHandling(feature: string): string {
|
|||
- Empty response: note and skip
|
||||
On any error: continue — ${feature} is informational, not a gate.`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Shared Codex preflight bash block — the single source of truth for deciding
|
||||
* whether a Codex review pass should run. Used by ADVERSARIAL_STEP,
|
||||
* CODEX_PLAN_REVIEW, and CODEX_DOC_REVIEW so install/auth/config detection
|
||||
* lives in exactly one place.
|
||||
*
|
||||
* Emits ONE self-contained bash block (the caller must place it in a single
|
||||
* fenced block — CLAUDE.md: each block is a fresh shell, so functions sourced
|
||||
* here do NOT persist to later blocks). It:
|
||||
* 1. reads the `codex_reviews` master switch,
|
||||
* 2. sources `gstack-codex-probe`,
|
||||
* 3. runs `command -v codex` (literal — keeps the e2e substring assertion),
|
||||
* then `_gstack_codex_auth_probe`, then `_gstack_codex_version_check`,
|
||||
* 4. logs the relevant `_gstack_codex_log_event` for each non-ready outcome,
|
||||
* 5. sets ONE canonical mode var and echoes `CODEX_MODE: <mode>` so the agent
|
||||
* gates later blocks on the echoed value.
|
||||
*
|
||||
* Mode values: `disabled` (config off) | `not_installed` | `not_authed` | `ready`.
|
||||
* The path is host-rewritten at gen-skill-docs time (pathRewrites), so the
|
||||
* literal `~/.claude/skills/gstack` is correct here and becomes `$GSTACK_ROOT`
|
||||
* etc. for non-Claude hosts.
|
||||
*
|
||||
* `disabledBehavior` controls the `disabled`-mode interpretation, which is the
|
||||
* one branch that legitimately differs per caller (D1):
|
||||
* - `skip-all` (plan / doc reviews): disabled means no extra review step at
|
||||
* all — skip the section, no Claude fallback.
|
||||
* - `codex-only` (diff adversarial): disabled gates only the Codex passes; the
|
||||
* free Claude adversarial subagent still runs.
|
||||
*/
|
||||
export function codexPreflight(opts: { modeVar?: string; disabledBehavior: 'skip-all' | 'codex-only' }): string {
|
||||
const m = opts.modeVar ?? '_CODEX_MODE';
|
||||
const disabledLine = opts.disabledBehavior === 'codex-only'
|
||||
? 'Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only."'
|
||||
: 'Skip this section entirely; do NOT fall back to a Claude subagent — disabled means no extra review step. Print: "Codex review skipped (codex_reviews disabled). Re-enable: `gstack-config set codex_reviews enabled`."';
|
||||
return `\`\`\`bash
|
||||
# Codex preflight: one block (functions sourced here don't persist to later blocks).
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
${m}="disabled"
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
${m}="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then
|
||||
${m}="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true
|
||||
else
|
||||
${m}="ready"; _gstack_codex_version_check 2>/dev/null || true
|
||||
fi
|
||||
echo "CODEX_MODE: $${m}"
|
||||
\`\`\`
|
||||
|
||||
Branch on the echoed \`CODEX_MODE\`:
|
||||
- **\`disabled\`** — the user turned Codex reviews off (\`codex_reviews=disabled\`). ${disabledLine}
|
||||
- **\`not_installed\`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: \`npm install -g @openai/codex\`." Fall back to the Claude subagent path.
|
||||
- **\`not_authed\`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run \`codex login\` or set \`$CODEX_API_KEY\`." Fall back to the Claude subagent path.
|
||||
- **\`ready\`** — run the Codex pass below.`;
|
||||
}
|
||||
|
|
|
|||
|
|
@ -22,7 +22,7 @@ import { generateTestFailureTriage } from './preamble';
|
|||
import { generateCommandReference, generateSnapshotFlags, generateBrowseSetup } from './browse';
|
||||
import { generateDesignMethodology, generateDesignHardRules, generateDesignOutsideVoices, generateDesignReviewLite, generateDesignSketch, generateDesignSetup, generateDesignMockup, generateDesignShotgunLoop, generateTasteProfile, generateUXPrinciples } from './design';
|
||||
import { generateTestBootstrap, generateTestCoverageAuditPlan, generateTestCoverageAuditShip, generateTestCoverageAuditReview } from './testing';
|
||||
import { generateReviewDashboard, generatePlanFileReviewReport, generateExitPlanModeGate, generateAntiShortcutClause, generateSpecReviewLoop, generateBenefitsFrom, generateCodexSecondOpinion, generateAdversarialStep, generateCodexPlanReview, generatePlanCompletionAuditShip, generatePlanCompletionAuditReview, generatePlanVerificationExec, generateScopeDrift, generateCrossReviewDedup } from './review';
|
||||
import { generateReviewDashboard, generatePlanFileReviewReport, generateExitPlanModeGate, generateAntiShortcutClause, generateSpecReviewLoop, generateBenefitsFrom, generateCodexSecondOpinion, generateAdversarialStep, generateCodexPlanReview, generateCodexDocReview, generatePlanCompletionAuditShip, generatePlanCompletionAuditReview, generatePlanVerificationExec, generateScopeDrift, generateCrossReviewDedup } from './review';
|
||||
import { generateSlugEval, generateSlugSetup, generateBaseBranchDetect, generateDeployBootstrap, generateQAMethodology, generateCoAuthorTrailer, generateChangelogWorkflow } from './utility';
|
||||
import { generateLearningsSearch, generateLearningsLog } from './learnings';
|
||||
import { generateConfidenceCalibration } from './confidence';
|
||||
|
|
@ -73,6 +73,7 @@ export const RESOLVERS: Record<string, ResolverValue> = {
|
|||
SCOPE_DRIFT: generateScopeDrift,
|
||||
DEPLOY_BOOTSTRAP: generateDeployBootstrap,
|
||||
CODEX_PLAN_REVIEW: generateCodexPlanReview,
|
||||
CODEX_DOC_REVIEW: generateCodexDocReview,
|
||||
PLAN_COMPLETION_AUDIT_SHIP: generatePlanCompletionAuditShip,
|
||||
PLAN_COMPLETION_AUDIT_REVIEW: generatePlanCompletionAuditReview,
|
||||
PLAN_VERIFICATION_EXEC: generatePlanVerificationExec,
|
||||
|
|
|
|||
|
|
@ -14,6 +14,7 @@
|
|||
*/
|
||||
import type { TemplateContext } from './types';
|
||||
import { generateInvokeSkill } from './composition';
|
||||
import { codexPreflight, codexErrorHandling } from './constants';
|
||||
|
||||
const CODEX_BOUNDARY = 'IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\\n\\n';
|
||||
|
||||
|
|
@ -479,23 +480,26 @@ export function generateAdversarialStep(ctx: TemplateContext): string {
|
|||
|
||||
Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical.
|
||||
|
||||
**Detect diff size and tool availability:**
|
||||
**Detect diff size:**
|
||||
|
||||
\`\`\`bash
|
||||
DIFF_BASE=$(git merge-base origin/<base> HEAD)
|
||||
DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
|
||||
DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0")
|
||||
DIFF_TOTAL=$((DIFF_INS + DIFF_DEL))
|
||||
command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
# Legacy opt-out — only gates Codex passes, Claude always runs
|
||||
OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true)
|
||||
echo "DIFF_SIZE: $DIFF_TOTAL"
|
||||
echo "OLD_CFG: \${OLD_CFG:-not_set}"
|
||||
\`\`\`
|
||||
|
||||
If \`OLD_CFG\` is \`disabled\`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section.
|
||||
**Detect the Codex master switch + tool availability:**
|
||||
|
||||
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size.
|
||||
${codexPreflight({ disabledBehavior: 'codex-only' })}
|
||||
|
||||
For this diff-review path, \`CODEX_MODE: disabled\` means skip the Codex passes ONLY — the
|
||||
Claude adversarial subagent below still runs (it's free and fast). \`ready\` runs the Codex
|
||||
passes; \`not_installed\` / \`not_authed\` skip them with the printed note and continue with
|
||||
Claude only.
|
||||
|
||||
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires \`CODEX_MODE: ready\`).
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -516,9 +520,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co
|
|||
|
||||
---
|
||||
|
||||
### Codex adversarial challenge (always runs when available)
|
||||
### Codex adversarial challenge (runs whenever \`CODEX_MODE: ready\`)
|
||||
|
||||
If Codex is available AND \`OLD_CFG\` is NOT \`disabled\`:
|
||||
If \`CODEX_MODE\` is \`ready\`:
|
||||
|
||||
\`\`\`bash
|
||||
TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX)
|
||||
|
|
@ -540,13 +544,13 @@ Present the full output verbatim. This is informational — it never blocks ship
|
|||
|
||||
**Cleanup:** Run \`rm -f "$TMPERR_ADV"\` after processing.
|
||||
|
||||
If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: \`npm install -g @openai/codex\`"
|
||||
If \`CODEX_MODE\` is \`not_installed\` / \`not_authed\` / \`disabled\`: the preflight already printed the reason; run Claude adversarial only.
|
||||
|
||||
---
|
||||
|
||||
### Codex structured review (large diffs only, 200+ lines)
|
||||
|
||||
If \`DIFF_TOTAL >= 200\` AND Codex is available AND \`OLD_CFG\` is NOT \`disabled\`:
|
||||
If \`DIFF_TOTAL >= 200\` AND \`CODEX_MODE\` is \`ready\`:
|
||||
|
||||
\`\`\`bash
|
||||
TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
|
||||
|
|
@ -610,38 +614,25 @@ export function generateCodexPlanReview(ctx: TemplateContext): string {
|
|||
// Codex host: strip entirely — Codex should never invoke itself
|
||||
if (ctx.host === 'codex') return '';
|
||||
|
||||
return `## Outside Voice — Independent Plan Challenge (optional, recommended)
|
||||
return `## Outside Voice — Independent Plan Challenge (default-on)
|
||||
|
||||
After all review sections are complete, offer an independent second opinion from a
|
||||
different AI system. Two models agreeing on a plan is stronger signal than one model's
|
||||
thorough review.
|
||||
After all review sections are complete, run an independent second opinion from a
|
||||
different AI system automatically — it is a standard part of plan review, not an
|
||||
opt-in. Two models agreeing on a plan is stronger signal than one model's thorough
|
||||
review. The user turns this off only by asking explicitly
|
||||
(\`gstack-config set codex_reviews disabled\`).
|
||||
|
||||
**Check tool availability:**
|
||||
**Preflight — decide whether and how the outside voice runs:**
|
||||
|
||||
\`\`\`bash
|
||||
command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
\`\`\`
|
||||
${codexPreflight({ disabledBehavior: 'skip-all' })}
|
||||
|
||||
Use AskUserQuestion:
|
||||
When the mode is \`ready\`, \`not_installed\`, or \`not_authed\`, print one line so the off-switch
|
||||
stays discoverable: "Running the outside voice automatically (standard step). Disable: \`gstack-config set codex_reviews disabled\`."
|
||||
|
||||
> "All review sections are complete. Want an outside voice? A different AI system can
|
||||
> give a brutally honest, independent challenge of this plan — logical gaps, feasibility
|
||||
> risks, and blind spots that are hard to catch from inside the review. Takes about 2
|
||||
> minutes."
|
||||
>
|
||||
> RECOMMENDATION: Choose A — an independent second opinion catches structural blind
|
||||
> spots. Two different AI models agreeing on a plan is stronger signal than one model's
|
||||
> thorough review. Completeness: A=9/10, B=7/10.
|
||||
|
||||
Options:
|
||||
- A) Get the outside voice (recommended)
|
||||
- B) Skip — proceed to outputs
|
||||
|
||||
**If B:** Print "Skipping outside voice." and continue to the next section.
|
||||
|
||||
**If A:** Construct the plan review prompt. Read the plan file being reviewed (the file
|
||||
the user pointed this review at, or the branch diff scope). If a CEO plan document
|
||||
was written in Step 0D-POST, read that too — it contains the scope decisions and vision.
|
||||
**Construct the plan review prompt** (for \`ready\`, \`not_installed\`, and \`not_authed\` — skip only on \`disabled\`).
|
||||
Read the plan file being reviewed (the file the user pointed this review at, or the branch
|
||||
diff scope). If a CEO plan document was written in Step 0D-POST, read that too — it contains
|
||||
the scope decisions and vision.
|
||||
|
||||
Construct this prompt (substitute the actual plan content — if plan content exceeds 30KB,
|
||||
truncate to the first 30KB and note "Plan truncated for size"). **Always start with the
|
||||
|
|
@ -659,7 +650,7 @@ compliments. Just the problems.
|
|||
THE PLAN:
|
||||
<plan content>"
|
||||
|
||||
**If CODEX_AVAILABLE:**
|
||||
**If \`CODEX_MODE: ready\` — run Codex:**
|
||||
|
||||
\`\`\`bash
|
||||
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
|
||||
|
|
@ -682,15 +673,15 @@ CODEX SAYS (plan review — outside voice):
|
|||
\`\`\`
|
||||
|
||||
**Error handling:** All errors are non-blocking — the outside voice is informational.
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \\\`codex login\\\` to authenticate."
|
||||
- Timeout: "Codex timed out after 5 minutes."
|
||||
- Empty response: "Codex returned no response."
|
||||
- Auth failure (stderr contains "auth", "login", "unauthorized"): "Codex auth failed. Run \\\`codex login\\\` to authenticate." Fall back to the Claude subagent below.
|
||||
- Timeout: "Codex timed out after 5 minutes." Fall back to the Claude subagent below.
|
||||
- Empty response: "Codex returned no response." Fall back to the Claude subagent below.
|
||||
|
||||
On any Codex error, fall back to the Claude adversarial subagent.
|
||||
|
||||
**If CODEX_NOT_AVAILABLE (or Codex errored):**
|
||||
**If \`CODEX_MODE: not_installed\` or \`not_authed\` (or Codex errored at runtime):**
|
||||
|
||||
Dispatch via the Agent tool. The subagent has fresh context — genuine independence.
|
||||
Bound it the same way as Codex: cap the dispatch at a 5-minute timeout so "never blocking"
|
||||
is also "never hanging."
|
||||
|
||||
Subagent prompt: same plan review prompt as above.
|
||||
|
||||
|
|
@ -698,6 +689,8 @@ Present findings under an \`OUTSIDE VOICE (Claude subagent):\` header.
|
|||
|
||||
If the subagent fails or times out: "Outside voice unavailable. Continuing to outputs."
|
||||
|
||||
(On \`CODEX_MODE: disabled\` you already skipped this section per the preflight — do not reach here.)
|
||||
|
||||
**Cross-model tension:**
|
||||
|
||||
After presenting the outside voice findings, note any points where the outside voice
|
||||
|
|
@ -747,6 +740,101 @@ SOURCE = "codex" if Codex ran, "claude" if subagent ran.
|
|||
---`;
|
||||
}
|
||||
|
||||
export function generateCodexDocReview(ctx: TemplateContext): string {
|
||||
// Codex host: strip entirely — Codex should never invoke itself
|
||||
if (ctx.host === 'codex') return '';
|
||||
|
||||
return `## Codex Documentation Review (default-on)
|
||||
|
||||
After the documentation updates above are written, run an independent cross-model pass that
|
||||
checks the docs against what actually shipped. This is a standard part of /document-release,
|
||||
not an opt-in. The user turns it off only by asking explicitly
|
||||
(\`gstack-config set codex_reviews disabled\`).
|
||||
|
||||
**Preflight — decide whether and how the doc review runs:**
|
||||
|
||||
${codexPreflight({ disabledBehavior: 'skip-all' })}
|
||||
|
||||
When the mode is \`ready\`, \`not_installed\`, or \`not_authed\`, print one line so the off-switch
|
||||
stays discoverable: "Running the Codex doc review automatically (standard step). Disable: \`gstack-config set codex_reviews disabled\`."
|
||||
|
||||
**Determine the release diff range (D3 — reuse the method, do not invent one).**
|
||||
Recompute the SAME range document-release used in its pre-flight / diff analysis, with the
|
||||
documented merge-base method:
|
||||
|
||||
\`\`\`bash
|
||||
DOC_DIFF_BASE=$(git merge-base origin/<base> HEAD 2>/dev/null || echo "<base>")
|
||||
echo "DOC_DIFF_BASE: $DOC_DIFF_BASE"
|
||||
\`\`\`
|
||||
|
||||
Do NOT rely on an in-memory variable from an earlier step — shell vars do not survive across
|
||||
blocks. Recompute it here.
|
||||
|
||||
**Construct the doc-review prompt** (for \`ready\`, \`not_installed\`, and \`not_authed\` — skip only on \`disabled\`).
|
||||
Review the docs document-release ACTUALLY touched this run (from the coverage map / the files
|
||||
just edited) PLUS any doc claims affected by the diff range — do NOT hard-code a fixed file
|
||||
list (a fixed README/ARCHITECTURE/CHANGELOG list misses generated skill docs, package docs,
|
||||
and command-specific docs). **Always start with the filesystem boundary instruction:**
|
||||
|
||||
"${CODEX_BOUNDARY}You are reviewing documentation changes against the code that shipped on this
|
||||
branch. Run \\\`git diff \\$DOC_DIFF_BASE...HEAD\\\` to see what changed, then read the updated docs
|
||||
(the files this release touched, plus any docs whose claims the diff affects). Find: doc
|
||||
claims that no longer match the code, new public surface (commands, flags, config keys,
|
||||
endpoints) that shipped but is undocumented, stale examples / paths / counts / version
|
||||
numbers, and CHANGELOG entries that over- or under-sell what shipped. Be terse. Just the gaps.
|
||||
|
||||
THE DOCS AND DIFF: <list the touched doc paths>"
|
||||
|
||||
**If \`CODEX_MODE: ready\` — run Codex:**
|
||||
|
||||
\`\`\`bash
|
||||
TMPERR_DOC=$(mktemp /tmp/codex-docreview-XXXXXXXX)
|
||||
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
|
||||
codex exec "<prompt>" -C "$_REPO_ROOT" -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR_DOC"
|
||||
\`\`\`
|
||||
|
||||
Use a 5-minute timeout (\`timeout: 300000\`). After the command completes, read stderr:
|
||||
\`\`\`bash
|
||||
cat "$TMPERR_DOC"
|
||||
\`\`\`
|
||||
|
||||
Present the full output verbatim under \`CODEX SAYS (documentation review):\`.
|
||||
|
||||
${codexErrorHandling('documentation review')}
|
||||
|
||||
**If \`CODEX_MODE: not_installed\` or \`not_authed\` (or Codex errored at runtime):**
|
||||
|
||||
Dispatch via the Agent tool with the same prompt. Bound it at a 5-minute timeout.
|
||||
Present findings under \`DOCUMENTATION REVIEW (Claude subagent):\`. If it fails: "Doc review unavailable. Continuing."
|
||||
|
||||
**Apply decision (T3B — informational, never auto-edit, but findings don't evaporate).**
|
||||
If there are zero findings, say "Docs match what shipped — no gaps." and continue. Otherwise
|
||||
present the findings, then use AskUserQuestion ONCE:
|
||||
|
||||
> "The doc review found N gaps between the docs and what shipped. How do you want to handle them?"
|
||||
>
|
||||
> RECOMMENDATION: Choose A if the gaps are concrete doc fixes (stale path, missing flag). The
|
||||
> doc review only reports; nothing is edited without your say-so. Completeness: A=9/10, B=4/10, C=8/10.
|
||||
|
||||
Options:
|
||||
- A) Apply all the doc fixes now
|
||||
- B) Skip — leave docs as-is
|
||||
- C) Decide per-finding
|
||||
|
||||
On A or per-finding approvals, make the approved edits yourself (the tool never silently
|
||||
rewrites docs). On B, note the gaps in the output so they're visible.
|
||||
|
||||
**Persist the result:**
|
||||
\`\`\`bash
|
||||
~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"codex-doc-review","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}'
|
||||
\`\`\`
|
||||
Substitute: STATUS = "clean" if no gaps, "issues_found" if gaps exist. SOURCE = "codex" if Codex ran, "claude" if the subagent ran.
|
||||
|
||||
**Cleanup:** Run \`rm -f "$TMPERR_DOC"\` after processing (if Codex was used).
|
||||
|
||||
---`;
|
||||
}
|
||||
|
||||
// ─── Plan File Discovery (shared helper) ──────────────────────────────
|
||||
|
||||
function generatePlanFileDiscovery(): string {
|
||||
|
|
|
|||
|
|
@ -4,23 +4,47 @@
|
|||
|
||||
Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical.
|
||||
|
||||
**Detect diff size and tool availability:**
|
||||
**Detect diff size:**
|
||||
|
||||
```bash
|
||||
DIFF_BASE=$(git merge-base origin/<base> HEAD)
|
||||
DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
|
||||
DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0")
|
||||
DIFF_TOTAL=$((DIFF_INS + DIFF_DEL))
|
||||
command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
# Legacy opt-out — only gates Codex passes, Claude always runs
|
||||
OLD_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || true)
|
||||
echo "DIFF_SIZE: $DIFF_TOTAL"
|
||||
echo "OLD_CFG: ${OLD_CFG:-not_set}"
|
||||
```
|
||||
|
||||
If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section.
|
||||
**Detect the Codex master switch + tool availability:**
|
||||
|
||||
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size.
|
||||
```bash
|
||||
# Codex preflight: one block (functions sourced here don't persist to later blocks).
|
||||
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$(~/.claude/skills/gstack/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source ~/.claude/skills/gstack/bin/gstack-codex-probe 2>/dev/null || true
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
_CODEX_MODE="disabled"
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true
|
||||
else
|
||||
_CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true
|
||||
fi
|
||||
echo "CODEX_MODE: $_CODEX_MODE"
|
||||
```
|
||||
|
||||
Branch on the echoed `CODEX_MODE`:
|
||||
- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only."
|
||||
- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path.
|
||||
- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path.
|
||||
- **`ready`** — run the Codex pass below.
|
||||
|
||||
For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the
|
||||
Claude adversarial subagent below still runs (it's free and fast). `ready` runs the Codex
|
||||
passes; `not_installed` / `not_authed` skip them with the printed note and continue with
|
||||
Claude only.
|
||||
|
||||
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`).
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -41,9 +65,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co
|
|||
|
||||
---
|
||||
|
||||
### Codex adversarial challenge (always runs when available)
|
||||
### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`)
|
||||
|
||||
If Codex is available AND `OLD_CFG` is NOT `disabled`:
|
||||
If `CODEX_MODE` is `ready`:
|
||||
|
||||
```bash
|
||||
TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX)
|
||||
|
|
@ -65,13 +89,13 @@ Present the full output verbatim. This is informational — it never blocks ship
|
|||
|
||||
**Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing.
|
||||
|
||||
If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`"
|
||||
If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only.
|
||||
|
||||
---
|
||||
|
||||
### Codex structured review (large diffs only, 200+ lines)
|
||||
|
||||
If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
|
||||
If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`:
|
||||
|
||||
```bash
|
||||
TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
|
||||
|
|
|
|||
|
|
@ -2332,23 +2332,47 @@ For each comment in `comments`:
|
|||
|
||||
Every diff gets adversarial review from both Claude and Codex. LOC is not a proxy for risk — a 5-line auth change can be critical.
|
||||
|
||||
**Detect diff size and tool availability:**
|
||||
**Detect diff size:**
|
||||
|
||||
```bash
|
||||
DIFF_BASE=$(git merge-base origin/<base> HEAD)
|
||||
DIFF_INS=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
|
||||
DIFF_DEL=$(git diff "$DIFF_BASE" --stat | tail -1 | grep -oE '[0-9]+ deletion' | grep -oE '[0-9]+' || echo "0")
|
||||
DIFF_TOTAL=$((DIFF_INS + DIFF_DEL))
|
||||
command -v codex >/dev/null 2>&1 && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
|
||||
# Legacy opt-out — only gates Codex passes, Claude always runs
|
||||
OLD_CFG=$($GSTACK_ROOT/bin/gstack-config get codex_reviews 2>/dev/null || true)
|
||||
echo "DIFF_SIZE: $DIFF_TOTAL"
|
||||
echo "OLD_CFG: ${OLD_CFG:-not_set}"
|
||||
```
|
||||
|
||||
If `OLD_CFG` is `disabled`: skip Codex passes only. Claude adversarial subagent still runs (it's free and fast). Jump to the "Claude adversarial subagent" section.
|
||||
**Detect the Codex master switch + tool availability:**
|
||||
|
||||
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size.
|
||||
```bash
|
||||
# Codex preflight: one block (functions sourced here don't persist to later blocks).
|
||||
_TEL=$($GSTACK_ROOT/bin/gstack-config get telemetry 2>/dev/null || echo off)
|
||||
_CODEX_CFG=$($GSTACK_ROOT/bin/gstack-config get codex_reviews 2>/dev/null || echo enabled)
|
||||
source $GSTACK_ROOT/bin/gstack-codex-probe 2>/dev/null || true
|
||||
if [ "$_CODEX_CFG" = "disabled" ]; then
|
||||
_CODEX_MODE="disabled"
|
||||
elif ! command -v codex >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_installed"; _gstack_codex_log_event "codex_cli_missing" 2>/dev/null || true
|
||||
elif ! _gstack_codex_auth_probe >/dev/null 2>&1; then
|
||||
_CODEX_MODE="not_authed"; _gstack_codex_log_event "codex_auth_failed" 2>/dev/null || true
|
||||
else
|
||||
_CODEX_MODE="ready"; _gstack_codex_version_check 2>/dev/null || true
|
||||
fi
|
||||
echo "CODEX_MODE: $_CODEX_MODE"
|
||||
```
|
||||
|
||||
Branch on the echoed `CODEX_MODE`:
|
||||
- **`disabled`** — the user turned Codex reviews off (`codex_reviews=disabled`). Skip the Codex passes only; the Claude adversarial subagent below STILL runs (it is free and fast). Print: "Codex passes skipped (codex_reviews disabled) — running Claude adversarial only."
|
||||
- **`not_installed`** — Codex CLI absent. Print: "Codex not installed — using Claude subagent. Install for cross-model coverage: `npm install -g @openai/codex`." Fall back to the Claude subagent path.
|
||||
- **`not_authed`** — installed but no credentials. Print: "Codex installed but not authenticated — using Claude subagent. Run `codex login` or set `$CODEX_API_KEY`." Fall back to the Claude subagent path.
|
||||
- **`ready`** — run the Codex pass below.
|
||||
|
||||
For this diff-review path, `CODEX_MODE: disabled` means skip the Codex passes ONLY — the
|
||||
Claude adversarial subagent below still runs (it's free and fast). `ready` runs the Codex
|
||||
passes; `not_installed` / `not_authed` skip them with the printed note and continue with
|
||||
Claude only.
|
||||
|
||||
**User override:** If the user explicitly requested "full review", "structured review", or "P1 gate", also run the Codex structured review regardless of diff size (still requires `CODEX_MODE: ready`).
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -2369,9 +2393,9 @@ If the subagent fails or times out: "Claude adversarial subagent unavailable. Co
|
|||
|
||||
---
|
||||
|
||||
### Codex adversarial challenge (always runs when available)
|
||||
### Codex adversarial challenge (runs whenever `CODEX_MODE: ready`)
|
||||
|
||||
If Codex is available AND `OLD_CFG` is NOT `disabled`:
|
||||
If `CODEX_MODE` is `ready`:
|
||||
|
||||
```bash
|
||||
TMPERR_ADV=$(mktemp /tmp/codex-adv-XXXXXXXX)
|
||||
|
|
@ -2393,13 +2417,13 @@ Present the full output verbatim. This is informational — it never blocks ship
|
|||
|
||||
**Cleanup:** Run `rm -f "$TMPERR_ADV"` after processing.
|
||||
|
||||
If Codex is NOT available: "Codex CLI not found — running Claude adversarial only. Install Codex for cross-model coverage: `npm install -g @openai/codex`"
|
||||
If `CODEX_MODE` is `not_installed` / `not_authed` / `disabled`: the preflight already printed the reason; run Claude adversarial only.
|
||||
|
||||
---
|
||||
|
||||
### Codex structured review (large diffs only, 200+ lines)
|
||||
|
||||
If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
|
||||
If `DIFF_TOTAL >= 200` AND `CODEX_MODE` is `ready`:
|
||||
|
||||
```bash
|
||||
TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
|
||||
|
|
|
|||
|
|
@ -145,6 +145,9 @@ export const CARVE_GUARDS: Record<string, CarveGuard> = {
|
|||
maxSkeletonBytes: 90_000,
|
||||
minUnionBytes: 80_000,
|
||||
mustContain: ['SCOPE EXPANSION', 'SELECTIVE EXPANSION', 'HOLD SCOPE', 'SCOPE REDUCTION'],
|
||||
// Default-on Codex outside-voice (codexPreflight block + CODEX_MODE branch
|
||||
// prose replacing the smaller opt-in question) lands this ~5.2% over baseline.
|
||||
maxSizeRatio: 1.08,
|
||||
},
|
||||
'plan-eng-review': {
|
||||
skill: 'plan-eng-review',
|
||||
|
|
@ -162,9 +165,11 @@ export const CARVE_GUARDS: Record<string, CarveGuard> = {
|
|||
minUnionBytes: 70_000,
|
||||
mustContain: ['Architecture', 'Code Quality', 'Test', 'Performance'],
|
||||
// Cross-cutting preamble growth (v1.57.2.0 AUQ-failure prose fallback + the
|
||||
// decision-memory nudge + the v1.57.4.0 Boil-the-Ocean rename) lands this just
|
||||
// over the strict 1.05; small headroom for the shared preamble additions.
|
||||
maxSizeRatio: 1.06,
|
||||
// decision-memory nudge + the v1.57.4.0 Boil-the-Ocean rename) plus the
|
||||
// default-on Codex outside-voice (codexPreflight block + CODEX_MODE branch
|
||||
// prose, replacing the smaller opt-in question) land this at ~6.6% over the
|
||||
// v1.53.0.0 baseline. Headroom for those intentional additions.
|
||||
maxSizeRatio: 1.08,
|
||||
},
|
||||
'plan-design-review': {
|
||||
skill: 'plan-design-review',
|
||||
|
|
@ -197,6 +202,9 @@ export const CARVE_GUARDS: Record<string, CarveGuard> = {
|
|||
maxSkeletonBytes: 76_000,
|
||||
minUnionBytes: 70_000,
|
||||
mustContain: ['developer experience', 'Getting Started'],
|
||||
// Default-on Codex outside-voice (codexPreflight block + CODEX_MODE branch
|
||||
// prose replacing the smaller opt-in question) lands this ~5.7% over baseline.
|
||||
maxSizeRatio: 1.08,
|
||||
},
|
||||
'office-hours': {
|
||||
skill: 'office-hours',
|
||||
|
|
@ -232,11 +240,15 @@ export const CARVE_GUARDS: Record<string, CarveGuard> = {
|
|||
maxSkeletonBytes: 50_000,
|
||||
minUnionBytes: 55_000,
|
||||
mustContain: ['CHANGELOG', 'Diataxis', 'coverage'],
|
||||
// The AUQ-failure prose fallback (v1.57.2.0) adds ~2KB to every skill's
|
||||
// always-loaded preamble; on this small carved skeleton that lands at ~5.9%
|
||||
// over the pre-carve/pre-AUQ v1.53.0.0 baseline. Headroom for the
|
||||
// cross-cutting addition; all other skills keep the strict 1.05 ceiling.
|
||||
maxSizeRatio: 1.08,
|
||||
// Two intentional additions stack on this small skill: the AUQ-failure prose
|
||||
// fallback (v1.57.2.0, ~2KB to every preamble) AND the new default-on Codex
|
||||
// documentation-review section (codexPreflight + prompt + apply-gate, carved
|
||||
// into release-body so the SKELETON stays under maxSkeletonBytes). On a ~55KB
|
||||
// baseline that whole new capability is ~18.6% of union bytes. The doc review
|
||||
// is a deliberate new feature, not preamble creep; the union ceiling is raised
|
||||
// to match while the skeleton budget (50_000) still holds the always-loaded
|
||||
// cost flat.
|
||||
maxSizeRatio: 1.20,
|
||||
},
|
||||
'design-consultation': {
|
||||
skill: 'design-consultation',
|
||||
|
|
|
|||
|
|
@ -210,7 +210,11 @@ const MONOLITH_INVARIANTS: ParityInvariant[] = [
|
|||
skill: 'review',
|
||||
mustContain: ['confidence', 'P1', 'P2'],
|
||||
mustHaveHeadings: ['## Preamble', '## When to invoke'],
|
||||
maxSizeRatio: 1.05,
|
||||
// The adversarial step swapped its bare `command -v codex` check for the shared
|
||||
// codexPreflight() block (install + auth tri-state + CODEX_MODE branch prose),
|
||||
// landing ~6.3% over the v1.53.0.0 baseline. Intentional: it adds proper
|
||||
// not-installed vs not-authed handling, not slop.
|
||||
maxSizeRatio: 1.08,
|
||||
minBytes: 70_000,
|
||||
},
|
||||
{
|
||||
|
|
|
|||
|
|
@ -1386,15 +1386,16 @@ describe('Codex skill', () => {
|
|||
expect(content).toContain('Adversarial review (always-on)');
|
||||
// Always-on: both Claude and Codex adversarial
|
||||
expect(content).toContain('Claude adversarial subagent (always runs)');
|
||||
expect(content).toContain('Codex adversarial challenge (always runs when available)');
|
||||
expect(content).toContain('Codex adversarial challenge (runs whenever');
|
||||
// Claude adversarial subagent dispatch
|
||||
expect(content).toContain('Agent tool');
|
||||
expect(content).toContain('FIXABLE');
|
||||
expect(content).toContain('INVESTIGATE');
|
||||
// Codex availability check
|
||||
expect(content).toContain('CODEX_NOT_AVAILABLE');
|
||||
// OLD_CFG only gates Codex, not Claude
|
||||
expect(content).toContain('skip Codex passes only');
|
||||
// Probe-based availability via the shared codexPreflight() (install + auth)
|
||||
expect(content).toContain('CODEX_MODE');
|
||||
expect(content).toContain('command -v codex'); // install check kept literal
|
||||
// codex_reviews=disabled gates Codex passes only; Claude adversarial still runs
|
||||
expect(content).toContain('skip the Codex passes ONLY');
|
||||
// Review log
|
||||
expect(content).toContain('adversarial-review');
|
||||
expect(content).toContain('reasoning_effort="high"');
|
||||
|
|
@ -1449,6 +1450,43 @@ describe('Codex skill', () => {
|
|||
expect(content).toContain('codex exec');
|
||||
});
|
||||
|
||||
// D5 regression guard: the Codex outside voice is default-on, not opt-in. A future
|
||||
// gen-skill-docs change must not silently reintroduce the "Want an outside voice?"
|
||||
// AskUserQuestion. The CODEX_PLAN_REVIEW content renders into each skill's
|
||||
// sections/review-sections.md (the skeleton points at it). plan-design-review uses
|
||||
// DESIGN_OUTSIDE_VOICES, not CODEX_PLAN_REVIEW, so it is excluded here.
|
||||
test('plan reviews run the Codex outside voice default-on (no opt-in question)', () => {
|
||||
for (const skill of ['plan-eng-review', 'plan-ceo-review', 'plan-devex-review']) {
|
||||
const content = fs.readFileSync(
|
||||
path.join(ROOT, skill, 'sections', 'review-sections.md'), 'utf-8');
|
||||
expect(content).not.toContain('Want an outside voice');
|
||||
expect(content).toContain('Outside Voice — Independent Plan Challenge (default-on)');
|
||||
expect(content).toContain('CODEX_MODE');
|
||||
expect(content).toContain('command -v codex'); // preflight install check (e2e relies on it)
|
||||
}
|
||||
});
|
||||
|
||||
test('/document-release includes the default-on Codex documentation review', () => {
|
||||
// The doc-review renders into the carved release-body section (kept out of the
|
||||
// always-loaded skeleton to respect the skeleton-byte budget).
|
||||
const content = fs.readFileSync(
|
||||
path.join(ROOT, 'document-release', 'sections', 'release-body.md'), 'utf-8');
|
||||
expect(content).toContain('Codex Documentation Review (default-on)');
|
||||
expect(content).toContain('CODEX_MODE');
|
||||
expect(content).toContain('codex-doc-review');
|
||||
});
|
||||
|
||||
test('codex-host document-release does NOT contain the Codex doc review', () => {
|
||||
// .agents/ is gitignored — generate on demand (codex never invokes itself)
|
||||
Bun.spawnSync(['bun', 'run', 'scripts/gen-skill-docs.ts', '--host', 'codex'], {
|
||||
cwd: ROOT, stdout: 'pipe', stderr: 'pipe',
|
||||
});
|
||||
const content = fs.readFileSync(
|
||||
path.join(ROOT, '.agents', 'skills', 'gstack-document-release', 'SKILL.md'), 'utf-8');
|
||||
expect(content).not.toContain('Codex Documentation Review');
|
||||
expect(content).not.toContain('codex-doc-review');
|
||||
});
|
||||
|
||||
test('codex review invocations avoid the prompt plus --base argument shape', () => {
|
||||
for (const rel of ['codex/SKILL.md', 'review/SKILL.md', 'ship/SKILL.md']) {
|
||||
// ship's codex command moved into sections/adversarial.md (T9 carve).
|
||||
|
|
|
|||
Loading…
Reference in New Issue