This commit is contained in:
Benjamin D. Smith 2026-05-31 12:32:16 +10:00 committed by GitHub
commit cadd40db5c
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
5 changed files with 1605 additions and 2 deletions

View File

@ -51,6 +51,7 @@ Conventions:
- [/plan-devex-review](plan-devex-review/SKILL.md): Interactive developer experience plan review.
- [/plan-eng-review](plan-eng-review/SKILL.md): Eng manager-mode plan review.
- [/plan-tune](plan-tune/SKILL.md): Self-tuning question sensitivity + developer psychographic for gstack (v1: observational).
- [/pr-prep](pr-prep/SKILL.md): Pre-PR upstream duplicate audit.
- [/qa](qa/SKILL.md): Systematically QA test a web application and fix bugs found.
- [/qa-only](qa-only/SKILL.md): Report-only QA testing.
- [/retro](retro/SKILL.md): Weekly engineering retrospective.

1159
pr-prep/SKILL.md Normal file

File diff suppressed because it is too large Load Diff

347
pr-prep/SKILL.md.tmpl Normal file
View File

@ -0,0 +1,347 @@
---
name: pr-prep
preamble-tier: 4
version: 0.1.0
description: |
Pre-PR upstream duplicate audit. Walks `git log base..HEAD`, derives
search keywords from commit subjects + changed file paths, queries
upstream issues + PRs via `gh`, scores each commit against upstream
collisions (EXACT_DUP / OVERLAP / SIBLING / CLEAN), and refuses to
proceed when EXACT_DUP found. Use when asked to "audit my PR",
"check for duplicates", "pr-prep", "is this already filed",
"upstream check before PR", or "pre-PR audit".
Proactively invoke this skill (do NOT skip the audit) before any
`gh pr create` against a tracked upstream repo. Hooks into /ship
as Step 0. (gstack)
allowed-tools:
- Bash
- Read
- Grep
- Glob
- AskUserQuestion
triggers:
- pr-prep
- audit my PR
- check for duplicates
- upstream check
- pre-PR audit
- is this already filed
- dup PR check
---
{{PREAMBLE}}
{{BASE_BRANCH_DETECT}}
# pr-prep: Pre-PR Upstream Duplicate Audit
You are running the `/pr-prep` workflow. This is a **read-only audit** that
verifies your branch's commits against upstream issues + PRs before you
file a duplicate. Refuses to proceed only on hard duplicates; everything
else is informational.
**Why this exists:** every contributor faces the upstream-dup risk.
Open issues sit for weeks. Multiple PRs converge on the same surface.
Filing a dup wastes reviewer time, contributor goodwill, and your own
branch cleanup. This skill catches dups in ~30s of `gh` queries
*before* the PR exists.
**Output:** per-commit collision report with severity buckets +
recommended action.
---
## Step 1: Pre-flight
1. Run `git status` (never with `-uall`). Working tree must be clean
or have only the commits-being-audited. Abort cleanly if the
working tree has unrelated mid-edit state.
2. Determine the base branch (already set by `{{BASE_BRANCH_DETECT}}`
into `$BASE_BRANCH`). Honor `--base <name>` flag override.
3. Resolve the upstream repo via `gh repo view --json nameWithOwner -q .nameWithOwner`.
Default uses `origin`; override via `--repo owner/name`.
4. **Read the upstream `CONTRIBUTING.md` (if present)** and surface its
pre-push gates + test requirements so the agent knows what must
pass BEFORE filing. Cache to `/tmp/pr-prep-contributing.md` for
the rest of the run.
```bash
gh api "repos/$REPO/contents/CONTRIBUTING.md" --jq .content 2>/dev/null \
| base64 -d > /tmp/pr-prep-contributing.md || \
gh api "repos/$REPO/contents/contributing.md" --jq .content 2>/dev/null \
| base64 -d > /tmp/pr-prep-contributing.md || \
echo "" > /tmp/pr-prep-contributing.md
```
Extract + echo at this step (no need to dump the whole file in the
final report — the agent uses it inline when writing PR bodies):
- Required pre-push commands (e.g. `bun run verify`, `npm test`,
`cargo test`). Look for "before pushing", "pre-push", "verify",
"must pass", "required" headings.
- Test layout conventions (where do unit / e2e / regression tests
belong). Look for "Writing tests", "test structure" sections.
- Branch naming / commit message conventions. Look for "branch
name", "commit format", "conventional commits".
- Welcomed PR areas (if listed). Skips contributions that conflict
with the repo's roadmap.
- Banned patterns (e.g. "never add to allowlist", "no new mocks",
"no breaking changes"). Treat as hard gates.
5. Sanity-check: at least 1 commit in `$BASE_BRANCH..HEAD`. If zero,
abort with "no commits to audit; you're already on $BASE_BRANCH".
```bash
BASE="${BASE_BRANCH:-main}"
REPO=$(gh repo view --json nameWithOwner -q .nameWithOwner 2>/dev/null)
COMMITS=$(git log "$BASE"..HEAD --pretty='%H' 2>/dev/null)
if [ -z "$COMMITS" ]; then
echo "No commits to audit: $(git branch --show-current) is at $BASE."
exit 0
fi
echo "Auditing $(echo "$COMMITS" | wc -l | tr -d ' ') commits against $REPO@$BASE"
```
## Step 2: Walk each commit, extract search signals
For each commit hash:
- **Subject**: `git show -s --format=%s <sha>` — strip conventional-commit
prefix (`fix(scope):`, `feat:`, `chore(deps):`, etc).
- **Changed files**: `git show --stat --name-only --format= <sha>`.
- **Keywords**: from subject, drop stop words + verbs (fix/add/update/
bump/remove). Keep 3-6 meaningful tokens.
Build a search query per commit by joining keywords. Example:
- Subject: `fix(synopsis): tail-truncate documentText for small-model chat handlers`
- Keywords: `synopsis tail-truncate documentText small-model chat`
- Query: `synopsis documentText truncate`
Cap query to ~5 tokens. Too long → zero matches. Too short → noisy
matches.
## Step 3: Query upstream issues + PRs
For each commit's query, run:
```bash
# Open issues + PRs (highest collision risk)
gh issue list --repo "$REPO" --state open --search "$QUERY" --limit 8 --json number,title,url,labels
gh pr list --repo "$REPO" --state open --search "$QUERY" --limit 8 --json number,title,url,headRefName,author
# Closed in last 90 days (might be unreleased master fix)
gh issue list --repo "$REPO" --state closed --search "$QUERY" --limit 5 --json number,title,url,closedAt
gh pr list --repo "$REPO" --state merged --search "$QUERY" --limit 5 --json number,title,url,mergedAt
```
Hard guard: skip if the search call returns rate-limit (HTTP 429).
Print warning + suggest `gh auth refresh`. Don't false-clear on
rate-limit silence.
## Step 4: Score each upstream hit
For every issue/PR returned, compute a collision score:
- **Title token overlap (Jaccard)**: intersect commit-subject keywords
with upstream-title keywords. ≥0.5 = strong match.
- **File overlap (open PRs only)**: `gh pr diff <number> --name-only`
vs the commit's changed files. ≥0.5 = strong match.
- **State weighting**:
- OPEN PR → 1.0× (highest dup risk)
- OPEN issue → 0.7× (someone's tracking it)
- MERGED last 14 days → 0.6× (might be unreleased)
- CLOSED issue → 0.2× (low risk, but useful context)
Final severity bucket per commit:
| Bucket | Trigger |
|---|---|
| **EXACT_DUP** | Any OPEN PR with title Jaccard ≥0.6 OR file overlap ≥0.6 |
| **OVERLAP** | Any OPEN PR/issue with score ≥0.3, or ≥3 OPEN issues |
| **SIBLING** | OPEN issues but no PR; or merged-recently with overlap |
| **CLEAN** | No hits, or only old closed issues |
## Step 4.4: Second-opinion review via codex (CLEAN commits only)
For each commit bucketed CLEAN (i.e. not duplicating upstream work),
run an independent second-opinion code review BEFORE the PR is opened.
Catches bugs the author missed without spending reviewer attention
upstream.
The skill assumes `codex` CLI is on PATH (OpenAI's official CLI;
`brew install codex` on macOS). If absent, emit a soft warning + skip
this step — don't block. Different model family from Claude gives
genuine independent signal.
```bash
if command -v codex >/dev/null 2>&1; then
for sha in $CLEAN_COMMIT_SHAS; do
diff=$(git show "$sha" --stat --pretty=format:"%s")
subject=$(git log -1 --format=%s "$sha")
codex review "Review commit ${sha:0:8} '${subject}' for correctness,
edge cases, and CONTRIBUTING.md compliance. Focus on: regression
risk on adjacent code paths, missing tests, hash/version-bump
invariants if touching cache keys, ordering bugs if touching
conditionals or dispatchers. Flag P0/P1/P2 issues with file:line."
done
else
echo "[pr-prep] codex CLI not found — skipping second-opinion review.
Install via 'brew install codex' or pin a fork-specific reviewer in
your skill config."
fi
```
Surface findings in the report under each commit as a `Codex P{N}`
line. P0/P1 findings escalate the commit's severity to OVERLAP at
minimum (don't file as CLEAN until addressed). P2 findings stay
CLEAN — author decides whether to fix-before-file or note-in-PR-body.
Real-world example (2026-05-26 motivating case):
- PR #1427 (synopsis doc truncate) → codex P2: env-overridable cap
not folded into `computeCorpusGeneration` hash. Different caps
produce same `corpus_generation` → cache invalidation breaks.
Fixed pre-merge, pushed as follow-up commit, comment posted to PR.
- PR #1428 (models doctor args[0]) → codex P2: `--help` regressed
into running network probes. Reorder ternary so `hasHelp` checked
first. Fixed pre-merge.
Both findings were structural, not stylistic. Author missed them
during own write-up. Net cost avoided: 2 review-cycle ping-pongs
upstream + a follow-up fix PR per finding.
## Step 4.5: Surface CONTRIBUTING.md pre-push gates per commit
For each commit that survives audit (CLEAN / OVERLAP / SIBLING — not
EXACT_DUP), check whether the changed files trigger any
CONTRIBUTING.md-stated test path. Example: a commit touching
`src/core/search/*` should run the eval-replay loop per the gbrain
CONTRIBUTING.md "Trigger paths" section.
Annotate each CLEAN/OVERLAP/SIBLING row with:
```
Pre-push gate: bun run verify (from CONTRIBUTING.md)
Trigger paths matched: none (no retrieval / no special test required)
Tests added in commit: yes / no / not required
```
If `Tests added: no` AND `not required` is unclear, surface as a
soft warning in the report but don't block — let the human decide.
## Step 5: Render report
Markdown table per commit:
```
## Audit: branch `feat/foo-bar` vs garrytan/gstack@main
### commit 20ed0eee fix(models): dispatch subcommand reads args[0] not args[1]
| Severity | # | Title | State | Author | Score |
|---|---|---|---|---|---|
| CLEAN | — | (no matches above threshold) | — | — | — |
**Action:** safe to file.
### commit ac213aa6 feat(synopsis): tail-truncate documentText
| Severity | # | Title | State | Author | Score |
|---|---|---|---|---|---|
| EXACT_DUP | #1358 | fix: allow contextual synopsis model env override | OPEN | lost9999 | 0.78 |
| OVERLAP | #1356 | fix: classify contextual synopsis transient errors | OPEN | lost9999 | 0.34 |
**Action:** close mine, comment on #1358 with my angle. DO NOT file new PR.
```
Print summary at end:
```
Summary: 1 EXACT_DUP, 1 CLEAN. 1 commit blocked.
```
## Step 6: Refusal on EXACT_DUP
If ANY commit is EXACT_DUP and `--force` is NOT set, exit non-zero
with a pinpoint message:
```
✗ Blocked: 1 commit duplicates open upstream work.
- ac213aa6 → #1358 (lost9999, OPEN 14d)
Resolutions:
1. Close your version, comment on #1358 with your angle.
2. Cherry-pick the unique parts to a new branch + file separately.
3. Override with `/pr-prep --force` if you've coordinated with
the existing PR author.
```
Always exit 0 on OVERLAP / SIBLING / CLEAN — those are informational.
## Step 7: /ship integration
When invoked by `/ship` (env `GSTACK_FROM_SHIP=1`):
- Skip the interactive AskUserQuestion confirmations
- Exit 0 on CLEAN/OVERLAP/SIBLING
- Exit 1 on EXACT_DUP (blocks /ship)
- Print machine-readable JSON to a known path so /ship can render
collisions in the PR body
## Flags
| Flag | Default | Effect |
|---|---|---|
| `--base <name>` | `main` | Base branch for commit walk |
| `--repo owner/name` | from `gh repo view` | Upstream repo for queries |
| `--force` | off | Proceed past EXACT_DUP (still print report) |
| `--json` | off | Machine-readable output, no markdown table |
| `--limit N` | 8 | Per-query result cap |
| `--no-file-diff` | off | Skip `gh pr diff` calls (faster, less accurate) |
## Cost + speed
- ~30-60s for a 5-commit branch
- 4 `gh` calls per commit (open issues, open PRs, closed issues, merged PRs)
- 1 extra `gh pr diff` per OPEN PR hit (capped at 5)
- ~25-50 `gh` calls total on a typical branch
- gh CLI personal rate limit: 5000/hr authenticated. Safe headroom.
## Real-world example (motivating case 2026-05-26)
User's branch on `garrytan/gbrain` had 8 commits ready for upstream
PRs. Without pr-prep, 4 of 4 unverified commits would have been
duplicates:
- `e96332c5` (reindex CLI_ONLY one-char fix) → #913 OPEN 14 days, same fix
- `74819cec` (sourceId fallback) → #836 OPEN, threads sourceId
- `787da2af` + `829099f9` (synopsis env-override) → #1358 OPEN, same env-override
- `e0133d8a` (LM Studio recipe) → #1051 + #1329, crowded space
Cost avoided: 4 noise PRs, 4 reviewer triage rounds, contributor
goodwill hit, 4 branch closures. pr-prep catches all 4 in ~45s.
---
## Implementation note for future maintainers
Two reasonable build modes:
1. **Inline bash in SKILL.md** (current) — agent walks the steps,
composes `gh` calls, computes Jaccard via `comm` + `wc`. Slower,
more transparent.
2. **Helper script `bin/gstack-pr-prep`** — bash entry point that
does the heavy lifting, agent just orchestrates + renders.
Faster, more testable. Migration path when v0.2.0 lands tests.
v0.1.0 ships mode 1 because it's reviewable in a single file. v0.2.0
should move the Jaccard math + report rendering into `bin/`.
## Out of scope (v0.1.0)
- Diff-content (not just file-name) similarity scoring. Useful but
expensive (`gh pr diff` × N × full body).
- Cross-repo audit (e.g., fix in fork A applies to upstream B).
- LLM-judged semantic dup detection. Out of scope for a deterministic
pre-flight check.
- Auto-comment on the upstream PR. Owner must decide what to say.
These belong in a v0.2+ wave once the deterministic gate proves out.

View File

@ -910,7 +910,55 @@ If CEO Review is missing, mention as informational ("CEO Review not run — reco
For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 9, but consider running /design-review for a full visual audit post-implementation." Still never block.
Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9.
Continue to Step 1.5 — do NOT block or ask. Ship runs its own review in Step 9.
---
## Step 1.5: Upstream duplicate audit (pr-prep gate)
Catches the case where a contributor's branch would file a PR that
duplicates an already-open upstream PR or issue. Without this gate
the duplicate gets filed and is closed days later, costing reviewer
time + contributor goodwill.
Skip on:
- Forks that don't have a tracked upstream remote
- Branches where the base is the user's own fork (no upstream to dup)
- Explicit `--skip-pr-prep` flag
Otherwise run `/pr-prep` inline with `GSTACK_FROM_SHIP=1`:
```bash
# Skip if no upstream remote configured (solo-repo case)
if ! gh repo view --json nameWithOwner -q .nameWithOwner >/dev/null 2>&1; then
echo "[ship] no upstream repo detected, skipping pr-prep audit"
else
GSTACK_FROM_SHIP=1 ~/.claude/skills/gstack/bin/gstack-skill pr-prep --base "$BASE_BRANCH" --json > /tmp/ship-pr-prep.json 2>&1
PR_PREP_EXIT=$?
if [ "$PR_PREP_EXIT" -eq 1 ]; then
# EXACT_DUP found
cat /tmp/ship-pr-prep.json
echo ""
echo "✗ Ship aborted: pr-prep found exact duplicate upstream work."
echo " Resolution paths:"
echo " 1. Close your version, comment on the upstream PR with your angle"
echo " 2. Cherry-pick unique parts to a new branch + file separately"
echo " 3. Override with /ship --skip-pr-prep if coordinated with the upstream PR author"
exit 1
fi
# CLEAN / OVERLAP / SIBLING — render summary, continue
jq -r '.summary' /tmp/ship-pr-prep.json 2>/dev/null || true
fi
```
Note: the JSON report path (`/tmp/ship-pr-prep.json`) is read again in
Step 19 (PR body assembly) to surface SIBLING / OVERLAP findings as a
collapsed "Upstream context" section in the PR body. SIBLING context
helps reviewers triage faster; it does NOT block ship.
If the pr-prep skill is not installed (older gstack install), fall
through with a stderr warn and continue. Don't hard-fail ship on a
missing skill.
---

View File

@ -97,7 +97,55 @@ If CEO Review is missing, mention as informational ("CEO Review not run — reco
For Design Review: run `source <(~/.claude/skills/gstack/bin/gstack-diff-scope <base> 2>/dev/null)`. If `SCOPE_FRONTEND=true` and no design review (plan-design-review or design-review-lite) exists in the dashboard, mention: "Design Review not run — this PR changes frontend code. The lite design check will run automatically in Step 9, but consider running /design-review for a full visual audit post-implementation." Still never block.
Continue to Step 2 — do NOT block or ask. Ship runs its own review in Step 9.
Continue to Step 1.5 — do NOT block or ask. Ship runs its own review in Step 9.
---
## Step 1.5: Upstream duplicate audit (pr-prep gate)
Catches the case where a contributor's branch would file a PR that
duplicates an already-open upstream PR or issue. Without this gate
the duplicate gets filed and is closed days later, costing reviewer
time + contributor goodwill.
Skip on:
- Forks that don't have a tracked upstream remote
- Branches where the base is the user's own fork (no upstream to dup)
- Explicit `--skip-pr-prep` flag
Otherwise run `/pr-prep` inline with `GSTACK_FROM_SHIP=1`:
```bash
# Skip if no upstream remote configured (solo-repo case)
if ! gh repo view --json nameWithOwner -q .nameWithOwner >/dev/null 2>&1; then
echo "[ship] no upstream repo detected, skipping pr-prep audit"
else
GSTACK_FROM_SHIP=1 ~/.claude/skills/gstack/bin/gstack-skill pr-prep --base "$BASE_BRANCH" --json > /tmp/ship-pr-prep.json 2>&1
PR_PREP_EXIT=$?
if [ "$PR_PREP_EXIT" -eq 1 ]; then
# EXACT_DUP found
cat /tmp/ship-pr-prep.json
echo ""
echo "✗ Ship aborted: pr-prep found exact duplicate upstream work."
echo " Resolution paths:"
echo " 1. Close your version, comment on the upstream PR with your angle"
echo " 2. Cherry-pick unique parts to a new branch + file separately"
echo " 3. Override with /ship --skip-pr-prep if coordinated with the upstream PR author"
exit 1
fi
# CLEAN / OVERLAP / SIBLING — render summary, continue
jq -r '.summary' /tmp/ship-pr-prep.json 2>/dev/null || true
fi
```
Note: the JSON report path (`/tmp/ship-pr-prep.json`) is read again in
Step 19 (PR body assembly) to surface SIBLING / OVERLAP findings as a
collapsed "Upstream context" section in the PR body. SIBLING context
helps reviewers triage faster; it does NOT block ship.
If the pr-prep skill is not installed (older gstack install), fall
through with a stderr warn and continue. Don't hard-fail ship on a
missing skill.
---