Commit Graph

3 Commits

Author SHA1 Message Date
Benjamin D. Smith b617f7916d feat(pr-prep): step 4.4 — second-opinion via codex review on CLEAN commits
Before filing each CLEAN commit (i.e. not bucketed EXACT_DUP / OVERLAP /
SIBLING), invoke `codex review` for independent second opinion. Codex
CLI uses a different model family (OpenAI vs Claude) so signal is
genuinely independent — catches structural bugs the author missed
during write-up.

Skill is optional: if `codex` CLI not on PATH, emit soft warning
and continue. Don't block on tool availability.
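
The optional-tool check described above could be sketched roughly as
follows. This is an illustration, not the skill's actual code: the
helper name `codex_available` and the warning text are hypothetical;
only the `codex` binary name comes from this commit message.

```python
import shutil
import sys


def codex_available(binary: str = "codex") -> bool:
    """Return True if `binary` is on PATH; otherwise warn and return False.

    Hypothetical helper: the name and the message wording are assumptions.
    """
    if shutil.which(binary) is None:
        # Soft warning only. The caller continues without a second opinion.
        print(
            f"warning: `{binary}` not found on PATH, skipping second-opinion review",
            file=sys.stderr,
        )
        return False
    return True
```

A caller would skip step 4.4 when this returns False and carry on with
the remaining steps.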

Severity escalation: P0/P1 findings bump commit from CLEAN to OVERLAP
(don't file as new PR until addressed). P2 stays CLEAN with author
discretion (fix-before-file or document in PR body).
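
A minimal sketch of the escalation rule above, assuming findings
arrive as a list of severity strings such as "P0", "P1", "P2". The
function name `escalate` and that input shape are hypothetical.

```python
def escalate(bucket: str, severities: list[str]) -> str:
    """Apply the step 4.4 escalation rule to one commit's bucket.

    Assumption: only CLEAN commits are reviewed, so any other bucket
    is returned unchanged.
    """
    if bucket != "CLEAN":
        return bucket
    if any(sev in ("P0", "P1") for sev in severities):
        # Blocking findings: do not file as a new PR until addressed.
        return "OVERLAP"
    # P2-only (or no findings): stays CLEAN, left to author discretion.
    return "CLEAN"
```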

Real-world motivating case (2026-05-26): two CLEAN-bucketed PRs to
garrytan/gbrain (#1427 synopsis doc truncate, #1428 models doctor
args[0]) both had codex-caught P2 issues:

  #1427: env-overridable cap not folded into computeCorpusGeneration
    hash → different caps produced same corpus_generation → cache
    invalidation broken.
  #1428: `gbrain models doctor --help` regressed into network probe
    run instead of usage print (ternary ordering bug).
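
The #1427 class of bug can be illustrated with a small sketch. This is
not gbrain's code: `generation_hash`, its parameters, and the
`include_cap` switch are hypothetical stand-ins for
`computeCorpusGeneration`, used only to show why the cap has to be
part of the hashed input.

```python
import hashlib


def generation_hash(doc_ids, cap, include_cap=True):
    """Hash the inputs that determine what a cache entry contains.

    include_cap=False reproduces the bug: two different caps yield the
    same hash, so a cache keyed on it is never invalidated.
    """
    h = hashlib.sha256()
    for doc_id in sorted(doc_ids):
        h.update(doc_id.encode("utf-8"))
        h.update(b"\0")  # separator so "ab","c" differs from "a","bc"
    if include_cap:
        h.update(f"cap={cap}".encode("utf-8"))
    return h.hexdigest()
```

With `include_cap=False`, caps of 100 and 200 hash identically; with
the cap folded in, they differ.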

Both were fixed pre-merge as follow-up commits. Net cost avoided: 2
review ping-pong cycles + 2 fix PRs that would have followed once
upstream caught the issues.
2026-05-26 02:11:12 +10:00
Benjamin D. Smith c2cb8bc423 feat(pr-prep): read upstream CONTRIBUTING.md + surface pre-push gates per commit
Step 1.4: fetch `CONTRIBUTING.md` (case-insensitive) via gh api from
the upstream repo, cache to /tmp, extract pre-push commands + test
layout conventions + branch naming rules + banned patterns. The agent
uses these inline when writing PR bodies.

Step 4.5: annotate each CLEAN/OVERLAP/SIBLING commit row with the
required pre-push gate (e.g. `bun run verify`), whether file
changes trigger special test paths (eval-replay for retrieval),
and whether the commit added tests. Emit a soft warning on missing
tests when it is unclear whether tests are required; don't block,
let the human decide.

Motivating case 2026-05-26: contributor reading gbrain's CONTRIBUTING.md
discovered `bun run verify` was the canonical pre-push gate AND
that retrieval-touching commits need `gbrain eval replay` against
a baseline NDJSON. Without pr-prep surfacing this, every external
PR risks failing the verify gate on push.
2026-05-26 01:52:47 +10:00
Benjamin D. Smith aa8ba4d43c feat(pr-prep): pre-PR upstream duplicate audit skill
Walks `git log base..HEAD`, derives search keywords per commit from
subject + changed file paths, queries upstream issues + PRs via `gh`,
scores each commit against upstream collisions (EXACT_DUP / OVERLAP /
SIBLING / CLEAN) on a title-token + file-overlap Jaccard, and refuses
to proceed when EXACT_DUP found. Designed to slot into `/ship` as a
Step 0 hook (env `GSTACK_FROM_SHIP=1` switches to JSON output + skips
interactive prompts).
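
The scoring step above can be sketched as follows. The commit message
names the signals (title tokens and file overlap) and the four
buckets, but not how they are combined, so the equal weighting, the
threshold values, and the function names here are all assumptions.

```python
import re


def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union; 0.0 if both empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tokens(title: str) -> set:
    """Lowercased alphanumeric tokens of a commit or PR title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def bucket(commit_title, commit_files, upstream_title, upstream_files,
           exact=0.8, overlap=0.5, sibling=0.25):
    """Classify one commit against one upstream issue or PR.

    Assumptions: the two Jaccard scores are averaged with equal weight,
    and the thresholds are illustrative defaults.
    """
    title_score = jaccard(tokens(commit_title), tokens(upstream_title))
    file_score = jaccard(set(commit_files), set(upstream_files))
    score = 0.5 * title_score + 0.5 * file_score
    if score >= exact:
        return "EXACT_DUP"
    if score >= overlap:
        return "OVERLAP"
    if score >= sibling:
        return "SIBLING"
    return "CLEAN"
```

For example, titles sharing two of four distinct tokens and touching
the same single file score 0.5 * 0.5 + 0.5 * 1.0 = 0.75, which lands
in OVERLAP under these assumed thresholds.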

Motivating case (real, 2026-05-26): contributor's branch on
`garrytan/gbrain` had 8 commits ready for upstream PRs. Without
pr-prep, 4 of 4 unverified commits would have been duplicates:

  - `e96332c5` (reindex CLI_ONLY one-char fix) collided with #913
    OPEN 14 days, exact same fix
  - `74819cec` (sourceId fallback) collided with #836 OPEN
  - `787da2af` + `829099f9` (synopsis env-override) collided with
    #1358 OPEN, same env-override pattern
  - `e0133d8a` (LM Studio recipe) collided with #1051 + #1329

Cost avoided per branch: ~4 noise PRs, ~4 reviewer triage rounds,
contributor goodwill hit, ~4 branch closures. pr-prep catches all
in ~30-60s of `gh` queries.

v0.1.0 ships inline bash in SKILL.md (reviewable in one file). v0.2.0
should move Jaccard math + report rendering into `bin/gstack-pr-prep`
once tests exist. Out of scope: diff-content similarity, cross-repo
audit, LLM-judged semantic dup detection, auto-comment on upstream PR.

Skill check: `bun run skill:check pr-prep` clean (the warning about
"no \$B commands found" is informational — matches every other
non-browser skill like ship/review/plan-eng-review).
2026-05-26 00:07:54 +10:00