Before filing each CLEAN commit (i.e. one not bucketed EXACT_DUP / OVERLAP /
SIBLING), invoke `codex review` for an independent second opinion. The Codex
CLI uses a different model family (OpenAI vs Claude), so the signal is
genuinely independent: it catches structural bugs the author missed
during write-up.
Skill is optional: if `codex` CLI not on PATH, emit soft warning
and continue. Don't block on tool availability.
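A minimal sketch of the optional-tool check described above, assuming a plain PATH lookup is enough to decide availability. The helper names `codex_available` and `maybe_review` are hypothetical and not part of the skill; the real skill does this in inline bash.

```python
import shutil
import sys


def codex_available(binary: str = "codex") -> bool:
    """Return True when the named CLI can be found on PATH."""
    return shutil.which(binary) is not None


def maybe_review(binary: str = "codex") -> str:
    """Return "review" when the CLI exists; otherwise warn softly and return "skip"."""
    if not codex_available(binary):
        # Soft warning only: a missing tool must never block the run.
        print(f"warning: {binary} not on PATH, skipping second-opinion review",
              file=sys.stderr)
        return "skip"
    return "review"
```

The caller treats "skip" as a normal outcome and carries on with filing.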
Severity escalation: P0/P1 findings bump commit from CLEAN to OVERLAP
(don't file as new PR until addressed). P2 stays CLEAN with author
discretion (fix-before-file or document in PR body).
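The escalation rule can be sketched as a small pure function. This is an illustration only: the function name `rebucket` and the choice to pass non-CLEAN buckets through unchanged are assumptions (the text only says the review runs on CLEAN commits).

```python
BLOCKING = {"P0", "P1"}  # severities that escalate a CLEAN commit


def rebucket(bucket: str, severities: list) -> str:
    """Apply review findings to a commit's bucket.

    Assumption: only CLEAN commits are reviewed, so any other bucket
    is returned unchanged.
    """
    if bucket != "CLEAN":
        return bucket
    if any(sev in BLOCKING for sev in severities):
        return "OVERLAP"  # hold the PR until the finding is addressed
    return "CLEAN"  # P2-only findings are left to author discretion
```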
Real-world motivating case (2026-05-26): two CLEAN-bucketed PRs to
garrytan/gbrain (#1427 synopsis doc truncate, #1428 models doctor
args[0]) both had codex-caught P2 issues:
- #1427: env-overridable cap not folded into the computeCorpusGeneration
  hash → different caps produced the same corpus_generation → cache
  invalidation broken.
- #1428: `gbrain models doctor --help` regressed into a network probe
  run instead of a usage print (ternary ordering bug).
Both fixed pre-merge as follow-up commits. Net cost avoided: 2
review ping-pong cycles + 2 fix PRs after upstream caught it.
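To illustrate the #1427 bug class, here is a hedged sketch of a generation hash that folds the configurable cap into the key. It is not gbrain's `computeCorpusGeneration`; the function name `corpus_generation`, its parameters, and the hashing scheme are all hypothetical.

```python
import hashlib


def corpus_generation(doc_ids: list, synopsis_cap: int) -> str:
    """Hypothetical generation key: hash of the corpus plus every setting
    that changes the derived output."""
    h = hashlib.sha256()
    for doc_id in sorted(doc_ids):
        h.update(doc_id.encode("utf-8"))
        h.update(b"\0")  # separator so ["ab", "c"] differs from ["a", "bc"]
    # The fix: the env-overridable cap is part of the key, so a different
    # cap yields a different generation and stale cache entries are dropped.
    h.update(f"cap={synopsis_cap}".encode("utf-8"))
    return h.hexdigest()[:16]
```

Leaving out the `cap=` line reproduces the reported symptom: two runs with different caps share one key and the cache is never invalidated.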
Step 1.4: fetch `CONTRIBUTING.md` (case-insensitive) via gh api from
the upstream repo, cache to /tmp, extract pre-push commands + test
layout conventions + branch naming rules + banned patterns. The agent
uses these inline when writing PR bodies.
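A rough sketch of the two pure pieces of Step 1.4: the case-insensitive filename match and the command extraction. Fetching via `gh api` and caching are left out. The names `find_contributing`, `extract_commands`, and the `RUNNERS` set are hypothetical, and treating inline-code spans as candidate commands is an assumption about how the extraction could work.

```python
import re
from typing import Optional

# Hypothetical allow-list of command prefixes worth surfacing.
RUNNERS = {"bun", "npm", "make", "gbrain"}


def find_contributing(names: list) -> Optional[str]:
    """Return the repo-root entry matching CONTRIBUTING.md in any casing."""
    for name in names:
        if name.lower() == "contributing.md":
            return name
    return None


def extract_commands(markdown: str) -> list:
    """Collect inline-code spans whose first word is a known runner."""
    found = []
    for span in re.findall(r"`([^`\n]+)`", markdown):
        parts = span.split()
        if parts and parts[0] in RUNNERS and span not in found:
            found.append(span)
    return found
```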
Step 4.5: annotate each CLEAN/OVERLAP/SIBLING commit row with the
required pre-push gate (e.g. `bun run verify`), whether file
changes trigger special test paths (eval-replay for retrieval),
and whether the commit added tests. Emit a soft warning for missing
tests when it is unclear whether tests are required; don't block, let
the human decide.
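The per-row annotation could look roughly like the sketch below. The `GateNote` shape, the `src/retrieval/` prefix, and the test-file heuristics are illustrative assumptions, not the skill's actual rules.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class GateNote:
    gate: str                 # required pre-push command
    needs_eval_replay: bool   # commit touches retrieval paths
    added_tests: bool
    warning: Optional[str]    # soft warning, never a blocker


def annotate(
    files: Sequence[str],
    gate: str = "bun run verify",
    retrieval_prefixes: Sequence[str] = ("src/retrieval/",),  # hypothetical path
    tests_required: Optional[bool] = None,  # None means "unclear"
) -> GateNote:
    needs_replay = any(f.startswith(tuple(retrieval_prefixes)) for f in files)
    # Hypothetical heuristic for spotting test files.
    added_tests = any(f.startswith("test/") or ".test." in f for f in files)
    warning = None
    if not added_tests and tests_required is not False:
        warning = "no tests in this commit; confirm whether tests are required"
    return GateNote(gate, needs_replay, added_tests, warning)
```

The warning is only suppressed when tests were added or when it is known that none are required.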
Motivating case 2026-05-26: contributor reading gbrain's CONTRIBUTING.md
discovered `bun run verify` was the canonical pre-push gate AND
that retrieval-touching commits need `gbrain eval replay` against
a baseline NDJSON. Without pr-prep surfacing this, every external
PR risks failing the verify gate on push.
`/ship` now invokes `/pr-prep --base $BASE_BRANCH --json` with
`GSTACK_FROM_SHIP=1` before any of: merge base branch, run tests,
version bump, push. If pr-prep returns EXACT_DUP (exit 1), ship
aborts with a pinpoint message naming the upstream PR + resolution
paths (close mine / cherry-pick unique parts / coordinate + retry
with `--skip-pr-prep`).
Skip conditions:
- No upstream remote configured (solo-repo case)
- `--skip-pr-prep` flag (override)
- pr-prep skill not installed (older gstack — stderr warn + continue)
SIBLING / OVERLAP / CLEAN buckets do NOT block. The JSON report is
written to `/tmp/ship-pr-prep.json` so Step 19 (PR body assembly)
can render upstream context as a collapsed section. Helps reviewers
triage faster.
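The gate logic above can be summarised as a small decision function. This is a sketch under stated assumptions: `pr_prep_gate` and its parameters are hypothetical names, and the only exit code modelled is the documented exit 1 for EXACT_DUP.

```python
import sys
from typing import Optional


def pr_prep_gate(
    has_upstream: bool,
    skip_flag: bool,
    skill_installed: bool,
    exit_code: Optional[int] = None,
) -> str:
    """Return "skip", "abort", or "continue" for the ship run."""
    if not has_upstream or skip_flag:
        return "skip"  # solo repo, or explicit --skip-pr-prep override
    if not skill_installed:
        print("warning: pr-prep skill not installed, continuing", file=sys.stderr)
        return "skip"
    if exit_code == 1:
        return "abort"  # EXACT_DUP: stop before merge, tests, bump, push
    return "continue"  # SIBLING / OVERLAP / CLEAN never block
```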
Catches the real bug class motivating the new skill: a contributor
pushes a branch with N commits, M of which duplicate already-open
upstream work; the reviewer closes the dups, and branch cleanup churn
follows. /ship now fails fast before any test run wastes time on a
doomed branch.
Walks `git log base..HEAD`, derives search keywords per commit from
subject + changed file paths, queries upstream issues + PRs via `gh`,
scores each commit against upstream collisions (EXACT_DUP / OVERLAP /
SIBLING / CLEAN) on a title-token + file-overlap Jaccard, and refuses
to proceed when EXACT_DUP found. Designed to slot into `/ship` as a
Step 0 hook (env `GSTACK_FROM_SHIP=1` switches to JSON output + skips
interactive prompts).
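A minimal sketch of the scoring idea, assuming an equal-weight blend of title-token and file-path Jaccard. The weights, the tokenizer, and the bucket thresholds (0.8 / 0.5 / 0.25) are illustrative guesses; the skill's actual values are not stated here.

```python
import re


def tokens(title: str) -> set:
    """Lowercased alphanumeric tokens of a commit or PR title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a: set, b: set) -> float:
    """|a ∩ b| / |a ∪ b|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(commit_title, commit_files, pr_title, pr_files, w_title=0.5) -> float:
    """Blend title-token similarity with changed-file overlap."""
    title_sim = jaccard(tokens(commit_title), tokens(pr_title))
    file_sim = jaccard(set(commit_files), set(pr_files))
    return w_title * title_sim + (1.0 - w_title) * file_sim


def bucket(s: float) -> str:
    """Map a score to a bucket; thresholds are hypothetical."""
    if s >= 0.8:
        return "EXACT_DUP"
    if s >= 0.5:
        return "OVERLAP"
    if s >= 0.25:
        return "SIBLING"
    return "CLEAN"
```

Each commit would be scored against every candidate upstream issue or PR, keeping the highest-scoring match as its bucket.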
Motivating case (real, 2026-05-26): contributor's branch on
`garrytan/gbrain` had 8 commits ready for upstream PRs. Without
pr-prep, 4 of 4 unverified commits would have been duplicates:
- `e96332c5` (reindex CLI_ONLY one-char fix) collided with #913
OPEN 14 days, exact same fix
- `74819cec` (sourceId fallback) collided with #836 OPEN
- `787da2af` + `829099f9` (synopsis env-override) collided with
#1358 OPEN, same env-override pattern
- `e0133d8a` (LM Studio recipe) collided with #1051 + #1329
Cost avoided per branch: ~4 noise PRs, ~4 reviewer triage rounds,
contributor goodwill hit, ~4 branch closures. pr-prep catches all
in ~30-60s of `gh` queries.
v0.1.0 ships inline bash in SKILL.md (reviewable in one file). v0.2.0
should move Jaccard math + report rendering into `bin/gstack-pr-prep`
once tests exist. Out of scope: diff-content similarity, cross-repo
audit, LLM-judged semantic dup detection, auto-comment on upstream PR.
Skill check: `bun run skill:check pr-prep` clean (the warning about
"no \$B commands found" is informational — matches every other
non-browser skill like ship/review/plan-eng-review).
2026-05-26 00:07:54 +10:00
5 changed files with 1605 additions and 2 deletions
@ -910,7 +910,55 @@ If CEO Review is missing, mention as informational ("CEO Review not run — reco
For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 9, but consider running /design-review for a full visual audit post-implementation." Still never block.
-Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9.
+Continue to Step 1.5 — do NOT block or ask. Ship runs its own review in Step 9.
@ -97,7 +97,55 @@ If CEO Review is missing, mention as informational ("CEO Review not run — reco
For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 9, but consider running /design-review for a full visual audit post-implementation." Still never block.
-Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9.
+Continue to Step 1.5 — do NOT block or ask. Ship runs its own review in Step 9.