fix: resolve merge conflicts with origin/main (v0.4.2 base branch detection)

Merge origin/main which added:
- BASE_BRANCH_DETECT placeholder + dynamic branch detection in all skills
- Updated contributor mode (reflection-based, 0-10 rating)
- Async await wrapping in browse js/eval commands
- Hardcoded-main regression test

Resolved conflicts:
- VERSION: keep 0.6.0 (our version, above 0.4.2)
- CHANGELOG: both entries preserved (0.6.0 above 0.4.2)
- gen-skill-docs.ts: keep main's updated contributor mode, add our escalation protocol
- review/SKILL.md.tmpl: fix hardcoded 'origin/main' in Step 1.5 to use origin/<base>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan 2026-03-16 11:35:47 -05:00
commit f5b981fcc0
No known key found for this signature in database
GPG Key ID: C1F69E85C74EFE1D
30 changed files with 876 additions and 205 deletions

View File

@ -200,6 +200,8 @@ Templates contain the workflows, tips, and examples that require human judgment.
| `{{SNAPSHOT_FLAGS}}` | `snapshot.ts` | Flag reference with examples | | `{{SNAPSHOT_FLAGS}}` | `snapshot.ts` | Flag reference with examples |
| `{{PREAMBLE}}` | `gen-skill-docs.ts` | Startup block: update check, session tracking, contributor mode, AskUserQuestion format | | `{{PREAMBLE}}` | `gen-skill-docs.ts` | Startup block: update check, session tracking, contributor mode, AskUserQuestion format |
| `{{BROWSE_SETUP}}` | `gen-skill-docs.ts` | Binary discovery + setup instructions | | `{{BROWSE_SETUP}}` | `gen-skill-docs.ts` | Binary discovery + setup instructions |
| `{{BASE_BRANCH_DETECT}}` | `gen-skill-docs.ts` | Dynamic base branch detection for PR-targeting skills (ship, review, qa, plan-ceo-review) |
| `{{QA_METHODOLOGY}}` | `gen-skill-docs.ts` | Shared QA methodology block for /qa and /qa-only |
This is structurally sound — if a command exists in code, it appears in docs. If it doesn't exist, it can't appear. This is structurally sound — if a command exists in code, it appears in docs. If it doesn't exist, it can't appear.

View File

@ -127,6 +127,18 @@ The `console`, `network`, and `dialog` commands read from the in-memory buffers,
Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser lockup. The `dialog-accept` and `dialog-dismiss` commands control this behavior. For prompts, `dialog-accept <text>` provides the response text. All dialogs are logged to the dialog buffer with type, message, and action taken. Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser lockup. The `dialog-accept` and `dialog-dismiss` commands control this behavior. For prompts, `dialog-accept <text>` provides the response text. All dialogs are logged to the dialog buffer with type, message, and action taken.
### JavaScript execution (`js` and `eval`)
`js` runs a single expression, `eval` runs a JS file. Both support `await` — expressions containing `await` are automatically wrapped in an async context:
```bash
$B js "await fetch('/api/data').then(r => r.json())" # works
$B js "document.title" # also works (no wrapping needed)
$B eval my-script.js # file with await works too
```
For `eval` files, single-line files return the expression value directly. Multi-line files need explicit `return` when using `await`. Comments containing "await" don't trigger wrapping.
### Multi-workspace support ### Multi-workspace support
Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs. State is stored in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`). Each workspace gets its own isolated browser instance with its own Chromium process, tabs, cookies, and logs. State is stored in `.gstack/` inside the project root (detected via `git rev-parse --show-toplevel`).

View File

@ -21,6 +21,26 @@
- Escalation protocol assertions added to all preamble skills (DONE_WITH_CONCERNS, BLOCKED, NEEDS_CONTEXT). - Escalation protocol assertions added to all preamble skills (DONE_WITH_CONCERNS, BLOCKED, NEEDS_CONTEXT).
- Two new TODOs: design docs → Supabase team store sync (P2), /plan-design-review skill (P2). - Two new TODOs: design docs → Supabase team store sync (P2), /plan-design-review skill (P2).
## 0.4.2 — 2026-03-16
- **`$B js "await fetch(...)"` now just works.** Any `await` expression in `$B js` or `$B eval` is automatically wrapped in an async context. No more `SyntaxError: await is only valid in async functions`. Single-line eval files return values directly; multi-line files use explicit `return`.
- **Contributor mode now reflects, not just reacts.** Instead of only filing reports when something breaks, contributor mode now prompts periodic reflection: "Rate your gstack experience 0-10. Not a 10? Think about why." Catches quality-of-life issues and friction that passive detection misses. Reports now include a 0-10 rating and "What would make this a 10" to focus on actionable improvements.
- **Skills now respect your branch target.** `/ship`, `/review`, `/qa`, and `/plan-ceo-review` detect which branch your PR actually targets instead of assuming `main`. Stacked branches, Conductor workspaces targeting feature branches, and repos using `master` all just work now.
- **`/retro` works on any default branch.** Repos using `master`, `develop`, or other default branch names are detected automatically — no more empty retros because the branch name was wrong.
- **New `{{BASE_BRANCH_DETECT}}` placeholder** for skill authors — drop it into any template and get 3-step branch detection (PR base → repo default → fallback) for free.
- **3 new E2E smoke tests** validate base branch detection works end-to-end across ship, review, and retro skills.
### For contributors
- Added `hasAwait()` helper with comment-stripping to avoid false positives on `// await` in eval files.
- Smart eval wrapping: single-line → expression `(...)`, multi-line → block `{...}` with explicit `return`.
- 6 new async wrapping unit tests, 40 new contributor mode preamble validation tests.
- Calibration example framed as historical ("used to fail") to avoid implying a live bug post-fix.
- Added "Writing SKILL templates" section to CLAUDE.md — rules for natural language over bash-isms, dynamic branch detection, self-contained code blocks.
- Hardcoded-main regression test scans all `.tmpl` files for git commands with hardcoded `main`.
- QA template cleaned up: removed `REPORT_DIR` shell variable, simplified port detection to prose.
- gstack-upgrade template: explicit cross-step prose for variable references between bash blocks.
## 0.4.1 — 2026-03-16 ## 0.4.1 — 2026-03-16
- **gstack now notices when it screws up.** Turn on contributor mode (`gstack-config set gstack_contributor true`) and gstack automatically writes up what went wrong — what you were doing, what broke, repro steps. Next time something annoys you, the bug report is already written. Fork gstack and fix it yourself. - **gstack now notices when it screws up.** Turn on contributor mode (`gstack-config set gstack_contributor true`) and gstack automatically writes up what went wrong — what you were doing, what broke, repro steps. Next time something annoys you, the bug report is already written. Fork gstack and fix it yourself.

View File

@ -67,6 +67,23 @@ SKILL.md files are **generated** from `.tmpl` templates. To update docs:
To add a new browse command: add it to `browse/src/commands.ts` and rebuild. To add a new browse command: add it to `browse/src/commands.ts` and rebuild.
To add a snapshot flag: add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts` and rebuild. To add a snapshot flag: add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts` and rebuild.
## Writing SKILL templates
SKILL.md.tmpl files are **prompt templates read by Claude**, not bash scripts.
Each bash code block runs in a separate shell — variables do not persist between blocks.
Rules:
- **Use natural language for logic and state.** Don't use shell variables to pass
state between code blocks. Instead, tell Claude what to remember and reference
it in prose (e.g., "the base branch detected in Step 0").
- **Don't hardcode branch names.** Detect `main`/`master`/etc dynamically via
`gh pr view` or `gh repo view`. Use `{{BASE_BRANCH_DETECT}}` for PR-targeting
skills. Use "the base branch" in prose, `<base>` in code block placeholders.
- **Keep bash blocks self-contained.** Each code block should work independently.
If a block needs context from a previous step, restate it in the prose above.
- **Express conditionals as English.** Instead of nested `if/elif/else` in bash,
write numbered decision steps: "1. If X, do Y. 2. Otherwise, do Z."
## Browser interaction ## Browser interaction
When you need to interact with a browser (QA, dogfooding, cookie setup), use the When you need to interact with a browser (QA, dogfooding, cookie setup), use the
@ -103,6 +120,21 @@ CHANGELOG.md is **for users**, not contributors. Write it like product release n
- No jargon: say "every question now tells you which project and branch you're in" not - No jargon: say "every question now tells you which project and branch you're in" not
"AskUserQuestion format standardized across skill templates via preamble resolver." "AskUserQuestion format standardized across skill templates via preamble resolver."
## E2E eval failure blame protocol
When an E2E eval fails during `/ship` or any other workflow, **never claim "not
related to our changes" without proving it.** These systems have invisible couplings —
a preamble text change affects agent behavior, a new helper changes timing, a
regenerated SKILL.md shifts prompt context.
**Required before attributing a failure to "pre-existing":**
1. Run the same eval on main (or base branch) and show it fails there too
2. If it passes on main but fails on the branch — it IS your change. Trace the blame.
3. If you can't run on main, say "unverified — may or may not be related" and flag it
as a risk in the PR body
"Pre-existing" without receipts is a lazy claim. Prove it or don't say it.
## Deploying to the active skill ## Deploying to the active skill
The active skill lives at `~/.claude/skills/gstack/`. After making changes: The active skill lives at `~/.claude/skills/gstack/`. After making changes:

View File

@ -22,9 +22,11 @@ bin/dev-teardown # deactivate — back to your global install
## Contributor mode ## Contributor mode
Contributor mode is for people who want to fix gstack when it annoys them. Enable it Contributor mode turns gstack into a self-improving tool. Enable it and Claude Code
and Claude Code will automatically log issues to `~/.gstack/contributor-logs/` as you will periodically reflect on its gstack experience — rating it 0-10 at the end of
work — what you were doing, what went wrong, repro steps, raw output. each major workflow step. When something isn't a 10, it thinks about why and files
a report to `~/.gstack/contributor-logs/` with what happened, repro steps, and what
would make it better.
```bash ```bash
~/.claude/skills/gstack/bin/gstack-config set gstack_contributor true ~/.claude/skills/gstack/bin/gstack-config set gstack_contributor true
@ -36,7 +38,7 @@ the issue, fix it, and open a PR.
### The contributor workflow ### The contributor workflow
1. **Hit friction while using gstack** — contributor mode logs it automatically 1. **Use gstack normally** — contributor mode reflects and logs issues automatically
2. **Check your logs:** `ls ~/.gstack/contributor-logs/` 2. **Check your logs:** `ls ~/.gstack/contributor-logs/`
3. **Fork and clone gstack** (if you haven't already) 3. **Fork and clone gstack** (if you haven't already)
4. **Symlink your fork into the project where you hit the bug:** 4. **Symlink your fork into the project where you hit the bug:**
@ -217,6 +219,8 @@ bun run skill:check
bun run dev:skill bun run dev:skill
``` ```
For template authoring best practices (natural language over bash-isms, dynamic branch detection, `{{BASE_BRANCH_DETECT}}` usage), see CLAUDE.md's "Writing SKILL templates" section.
To add a browse command, add it to `browse/src/commands.ts`. To add a snapshot flag, add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts`. Then rebuild. To add a browse command, add it to `browse/src/commands.ts`. To add a snapshot flag, add it to `SNAPSHOT_FLAGS` in `browse/src/snapshot.ts`. Then rebuild.
## Conductor workspaces ## Conductor workspaces

View File

@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol

View File

@ -45,12 +45,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -59,20 +62,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol

View File

@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol

View File

@ -11,6 +11,12 @@ import type { Page } from 'playwright';
import * as fs from 'fs'; import * as fs from 'fs';
import * as path from 'path'; import * as path from 'path';
/** Detect await keyword, ignoring comments. Accepted risk: await in string literals triggers wrapping (harmless). */
function hasAwait(code: string): boolean {
const stripped = code.replace(/\/\/.*$/gm, '').replace(/\/\*[\s\S]*?\*\//g, '');
return /\bawait\b/.test(stripped);
}
// Security: Path validation to prevent path traversal attacks // Security: Path validation to prevent path traversal attacks
const SAFE_DIRECTORIES = ['/tmp', process.cwd()]; const SAFE_DIRECTORIES = ['/tmp', process.cwd()];
@ -118,7 +124,8 @@ export async function handleReadCommand(
case 'js': { case 'js': {
const expr = args[0]; const expr = args[0];
if (!expr) throw new Error('Usage: browse js <expression>'); if (!expr) throw new Error('Usage: browse js <expression>');
const result = await page.evaluate(expr); const wrapped = hasAwait(expr) ? `(async()=>(${expr}))()` : expr;
const result = await page.evaluate(wrapped);
return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? ''); return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? '');
} }
@ -128,6 +135,13 @@ export async function handleReadCommand(
validateReadPath(filePath); validateReadPath(filePath);
if (!fs.existsSync(filePath)) throw new Error(`File not found: ${filePath}`); if (!fs.existsSync(filePath)) throw new Error(`File not found: ${filePath}`);
const code = fs.readFileSync(filePath, 'utf-8'); const code = fs.readFileSync(filePath, 'utf-8');
if (hasAwait(code)) {
const trimmed = code.trim();
const isSingleExpr = trimmed.split('\n').length === 1;
const wrapped = isSingleExpr ? `(async()=>(${trimmed}))()` : `(async()=>{\n${code}\n})()`;
const result = await page.evaluate(wrapped);
return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? '');
}
const result = await page.evaluate(code); const result = await page.evaluate(code);
return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? ''); return typeof result === 'object' ? JSON.stringify(result, null, 2) : String(result ?? '');
} }

View File

@ -144,6 +144,60 @@ describe('Inspection', () => {
expect(obj.b).toBe(2); expect(obj.b).toBe(2);
}); });
test('js supports await expressions', async () => {
const result = await handleReadCommand('js', ['await Promise.resolve(42)'], bm);
expect(result).toBe('42');
});
test('js does not false-positive on await substring', async () => {
const result = await handleReadCommand('js', ['(() => { const awaitable = 5; return awaitable })()'], bm);
expect(result).toBe('5');
});
test('eval supports await in single-line file', async () => {
const tmp = '/tmp/eval-await-test.js';
fs.writeFileSync(tmp, 'await Promise.resolve("hello from eval")');
try {
const result = await handleReadCommand('eval', [tmp], bm);
expect(result).toBe('hello from eval');
} finally {
fs.unlinkSync(tmp);
}
});
test('eval does not wrap when await is only in a comment', async () => {
const tmp = '/tmp/eval-comment-test.js';
fs.writeFileSync(tmp, '// no need to await this\ndocument.title');
try {
const result = await handleReadCommand('eval', [tmp], bm);
expect(result).toBe('Test Page - Basic');
} finally {
fs.unlinkSync(tmp);
}
});
test('eval multi-line with await and explicit return', async () => {
const tmp = '/tmp/eval-multiline-await.js';
fs.writeFileSync(tmp, 'const data = await Promise.resolve("multi");\nreturn data;');
try {
const result = await handleReadCommand('eval', [tmp], bm);
expect(result).toBe('multi');
} finally {
fs.unlinkSync(tmp);
}
});
test('eval multi-line with await but no return gives empty string', async () => {
const tmp = '/tmp/eval-multiline-no-return.js';
fs.writeFileSync(tmp, 'const data = await Promise.resolve("lost");\ndata;');
try {
const result = await handleReadCommand('eval', [tmp], bm);
expect(result).toBe('');
} finally {
fs.unlinkSync(tmp);
}
});
test('css returns computed property', async () => { test('css returns computed property', async () => {
const result = await handleReadCommand('css', ['h1', 'color'], bm); const result = await handleReadCommand('css', ['h1', 'color'], bm);
// Navy color // Navy color

View File

@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol

View File

@ -94,14 +94,20 @@ fi
echo "Install type: $INSTALL_TYPE at $INSTALL_DIR" echo "Install type: $INSTALL_TYPE at $INSTALL_DIR"
``` ```
The install type and directory path printed above will be used in all subsequent steps.
### Step 3: Save old version ### Step 3: Save old version
Use the install directory from Step 2's output below:
```bash ```bash
OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown") OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown")
``` ```
### Step 4: Upgrade ### Step 4: Upgrade
Use the install type and directory detected in Step 2:
**For git installs** (global-git, local-git): **For git installs** (global-git, local-git):
```bash ```bash
cd "$INSTALL_DIR" cd "$INSTALL_DIR"
@ -125,7 +131,7 @@ rm -rf "$INSTALL_DIR.bak" "$TMP_DIR"
### Step 4.5: Sync local vendored copy ### Step 4.5: Sync local vendored copy
After upgrading the primary install, check if there's also a local copy in the current project that needs updating: Use the install directory from Step 2. Check if there's also a local vendored copy that needs updating:
```bash ```bash
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) _ROOT=$(git rev-parse --show-toplevel 2>/dev/null)

View File

@ -92,14 +92,20 @@ fi
echo "Install type: $INSTALL_TYPE at $INSTALL_DIR" echo "Install type: $INSTALL_TYPE at $INSTALL_DIR"
``` ```
The install type and directory path printed above will be used in all subsequent steps.
### Step 3: Save old version ### Step 3: Save old version
Use the install directory from Step 2's output below:
```bash ```bash
OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown") OLD_VERSION=$(cat "$INSTALL_DIR/VERSION" 2>/dev/null || echo "unknown")
``` ```
### Step 4: Upgrade ### Step 4: Upgrade
Use the install type and directory detected in Step 2:
**For git installs** (global-git, local-git): **For git installs** (global-git, local-git):
```bash ```bash
cd "$INSTALL_DIR" cd "$INSTALL_DIR"
@ -123,7 +129,7 @@ rm -rf "$INSTALL_DIR.bak" "$TMP_DIR"
### Step 4.5: Sync local vendored copy ### Step 4.5: Sync local vendored copy
After upgrading the primary install, check if there's also a local copy in the current project that needs updating: Use the install directory from Step 2. Check if there's also a local vendored copy that needs updating:
```bash ```bash
_ROOT=$(git rev-parse --show-toplevel 2>/dev/null) _ROOT=$(git rev-parse --show-toplevel 2>/dev/null)

View File

@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol
@ -98,6 +104,25 @@ ATTEMPTED: [what you tried]
RECOMMENDATION: [what the user should do next] RECOMMENDATION: [what the user should do next]
``` ```
## Step 0: Detect base branch
Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps.
1. Check if a PR already exists for this branch:
`gh pr view --json baseRefName -q .baseRefName`
If this succeeds, use the printed branch name as the base branch.
2. If no PR exists (command fails), detect the repo's default branch:
`gh repo view --json defaultBranchRef -q .defaultBranchRef.name`
3. If both commands fail, fall back to `main`.
Print the detected base branch name. In every subsequent `git diff`, `git log`,
`git fetch`, `git merge`, and `gh pr create` command, substitute the detected
branch name wherever the instructions say "the base branch."
---
# Mega Plan Review Mode # Mega Plan Review Mode
## Philosophy ## Philosophy
@ -142,7 +167,7 @@ Before doing anything else, run a system audit. This is not the plan review —
Run the following commands: Run the following commands:
``` ```
git log --oneline -30 # Recent history git log --oneline -30 # Recent history
git diff main --stat # What's already changed git diff <base> --stat # What's already changed
git stash list # Any stashed work git stash list # Any stashed work
grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l
find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files

View File

@ -16,6 +16,8 @@ allowed-tools:
{{PREAMBLE}} {{PREAMBLE}}
{{BASE_BRANCH_DETECT}}
# Mega Plan Review Mode # Mega Plan Review Mode
## Philosophy ## Philosophy
@ -60,7 +62,7 @@ Before doing anything else, run a system audit. This is not the plan review —
Run the following commands: Run the following commands:
``` ```
git log --oneline -30 # Recent history git log --oneline -30 # Recent history
git diff main --stat # What's already changed git diff <base> --stat # What's already changed
git stash list # Any stashed work git stash list # Any stashed work
grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l grep -r "TODO\|FIXME\|HACK\|XXX" --include="*.rb" --include="*.js" -l
find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files find . -name "*.rb" -newer Gemfile.lock | head -20 # Recently touched files

View File

@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol

View File

@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol

View File

@ -48,12 +48,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -62,20 +65,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol
@ -102,6 +108,25 @@ ATTEMPTED: [what you tried]
RECOMMENDATION: [what the user should do next] RECOMMENDATION: [what the user should do next]
``` ```
## Step 0: Detect base branch
Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps.
1. Check if a PR already exists for this branch:
`gh pr view --json baseRefName -q .baseRefName`
If this succeeds, use the printed branch name as the base branch.
2. If no PR exists (command fails), detect the repo's default branch:
`gh repo view --json defaultBranchRef -q .defaultBranchRef.name`
3. If both commands fail, fall back to `main`.
Print the detected base branch name. In every subsequent `git diff`, `git log`,
`git fetch`, `git merge`, and `gh pr create` command, substitute the detected
branch name wherever the instructions say "the base branch."
---
# /qa: Test → Fix → Verify # /qa: Test → Fix → Verify
You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence. You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence.
@ -158,8 +183,7 @@ If `NEEDS_SETUP`:
**Create output directories:** **Create output directories:**
```bash ```bash
REPORT_DIR=".gstack/qa-reports" mkdir -p .gstack/qa-reports/screenshots
mkdir -p "$REPORT_DIR/screenshots"
``` ```
--- ---

View File

@ -20,6 +20,8 @@ allowed-tools:
{{PREAMBLE}} {{PREAMBLE}}
{{BASE_BRANCH_DETECT}}
# /qa: Test → Fix → Verify # /qa: Test → Fix → Verify
You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence. You are a QA engineer AND a bug-fix engineer. Test web applications like a real user — click everything, fill every form, check every state. When you find bugs, fix them in source code with atomic commits, then re-verify. Produce a structured report with before/after evidence.
@ -59,8 +61,7 @@ fi
**Create output directories:** **Create output directories:**
```bash ```bash
REPORT_DIR=".gstack/qa-reports" mkdir -p .gstack/qa-reports/screenshots
mkdir -p "$REPORT_DIR/screenshots"
``` ```
--- ---

View File

@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol
@ -97,6 +103,16 @@ ATTEMPTED: [what you tried]
RECOMMENDATION: [what the user should do next] RECOMMENDATION: [what the user should do next]
``` ```
## Detect default branch
Before gathering data, detect the repo's default branch name:
`gh repo view --json defaultBranchRef -q .defaultBranchRef.name`
If this fails, fall back to `main`. Use the detected name wherever the instructions
say `origin/<default>` below.
---
# /retro — Weekly Engineering Retrospective # /retro — Weekly Engineering Retrospective
Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier. Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier.
@ -131,7 +147,7 @@ Usage: /retro [window]
First, fetch origin and identify the current user: First, fetch origin and identify the current user:
```bash ```bash
git fetch origin main --quiet git fetch origin <default> --quiet
# Identify who is running the retro # Identify who is running the retro
git config user.name git config user.name
git config user.email git config user.email
@ -143,28 +159,28 @@ Run ALL of these git commands in parallel (they are independent):
```bash ```bash
# 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions # 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions
git log origin/main --since="<window>" --format="%H|%aN|%ae|%ai|%s" --shortstat git log origin/<default> --since="<window>" --format="%H|%aN|%ae|%ai|%s" --shortstat
# 2. Per-commit test vs total LOC breakdown with author # 2. Per-commit test vs total LOC breakdown with author
# Each commit block starts with COMMIT:<hash>|<author>, followed by numstat lines. # Each commit block starts with COMMIT:<hash>|<author>, followed by numstat lines.
# Separate test files (matching test/|spec/|__tests__/) from production files. # Separate test files (matching test/|spec/|__tests__/) from production files.
git log origin/main --since="<window>" --format="COMMIT:%H|%aN" --numstat git log origin/<default> --since="<window>" --format="COMMIT:%H|%aN" --numstat
# 3. Commit timestamps for session detection and hourly distribution (with author) # 3. Commit timestamps for session detection and hourly distribution (with author)
# Use TZ=America/Los_Angeles for Pacific time conversion # Use TZ=America/Los_Angeles for Pacific time conversion
TZ=America/Los_Angeles git log origin/main --since="<window>" --format="%at|%aN|%ai|%s" | sort -n TZ=America/Los_Angeles git log origin/<default> --since="<window>" --format="%at|%aN|%ai|%s" | sort -n
# 4. Files most frequently changed (hotspot analysis) # 4. Files most frequently changed (hotspot analysis)
git log origin/main --since="<window>" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn git log origin/<default> --since="<window>" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn
# 5. PR numbers from commit messages (extract #NNN patterns) # 5. PR numbers from commit messages (extract #NNN patterns)
git log origin/main --since="<window>" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' git log origin/<default> --since="<window>" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/'
# 6. Per-author file hotspots (who touches what) # 6. Per-author file hotspots (who touches what)
git log origin/main --since="<window>" --format="AUTHOR:%aN" --name-only git log origin/<default> --since="<window>" --format="AUTHOR:%aN" --name-only
# 7. Per-author commit counts (quick summary) # 7. Per-author commit counts (quick summary)
git shortlog origin/main --since="<window>" -sn --no-merges git shortlog origin/<default> --since="<window>" -sn --no-merges
# 8. Greptile triage history (if available) # 8. Greptile triage history (if available)
cat ~/.gstack/greptile-history.md 2>/dev/null || true cat ~/.gstack/greptile-history.md 2>/dev/null || true
@ -323,14 +339,14 @@ If the time window is 14 days or more, split into weekly buckets and show trends
### Step 11: Streak Tracking ### Step 11: Streak Tracking
Count consecutive days with at least 1 commit to origin/main, going back from today. Track both team streak and personal streak: Count consecutive days with at least 1 commit to origin/<default>, going back from today. Track both team streak and personal streak:
```bash ```bash
# Team streak: all unique commit dates (Pacific time) — no hard cutoff # Team streak: all unique commit dates (Pacific time) — no hard cutoff
TZ=America/Los_Angeles git log origin/main --format="%ad" --date=format:"%Y-%m-%d" | sort -u TZ=America/Los_Angeles git log origin/<default> --format="%ad" --date=format:"%Y-%m-%d" | sort -u
# Personal streak: only the current user's commits # Personal streak: only the current user's commits
TZ=America/Los_Angeles git log origin/main --author="<user_name>" --format="%ad" --date=format:"%Y-%m-%d" | sort -u TZ=America/Los_Angeles git log origin/<default> --author="<user_name>" --format="%ad" --date=format:"%Y-%m-%d" | sort -u
``` ```
Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both: Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both:
@ -548,7 +564,7 @@ When the user runs `/retro compare` (or `/retro compare 14d`):
## Important Rules ## Important Rules
- ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot. - ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot.
- Use `origin/main` for all git queries (not local main which may be stale) - Use `origin/<default>` for all git queries (not local main which may be stale)
- Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`) - Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`)
- If the window has zero commits, say so and suggest a different window - If the window has zero commits, say so and suggest a different window
- Round LOC/hour to nearest 50 - Round LOC/hour to nearest 50

View File

@ -15,6 +15,16 @@ allowed-tools:
{{PREAMBLE}} {{PREAMBLE}}
## Detect default branch
Before gathering data, detect the repo's default branch name:
`gh repo view --json defaultBranchRef -q .defaultBranchRef.name`
If this fails, fall back to `main`. Use the detected name wherever the instructions
say `origin/<default>` below.
---
# /retro — Weekly Engineering Retrospective # /retro — Weekly Engineering Retrospective
Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier. Generates a comprehensive engineering retrospective analyzing commit history, work patterns, and code quality metrics. Team-aware: identifies the user running the command, then analyzes every contributor with per-person praise and growth opportunities. Designed for a senior IC/CTO-level builder using Claude Code as a force multiplier.
@ -49,7 +59,7 @@ Usage: /retro [window]
First, fetch origin and identify the current user: First, fetch origin and identify the current user:
```bash ```bash
git fetch origin main --quiet git fetch origin <default> --quiet
# Identify who is running the retro # Identify who is running the retro
git config user.name git config user.name
git config user.email git config user.email
@ -61,28 +71,28 @@ Run ALL of these git commands in parallel (they are independent):
```bash ```bash
# 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions # 1. All commits in window with timestamps, subject, hash, AUTHOR, files changed, insertions, deletions
git log origin/main --since="<window>" --format="%H|%aN|%ae|%ai|%s" --shortstat git log origin/<default> --since="<window>" --format="%H|%aN|%ae|%ai|%s" --shortstat
# 2. Per-commit test vs total LOC breakdown with author # 2. Per-commit test vs total LOC breakdown with author
# Each commit block starts with COMMIT:<hash>|<author>, followed by numstat lines. # Each commit block starts with COMMIT:<hash>|<author>, followed by numstat lines.
# Separate test files (matching test/|spec/|__tests__/) from production files. # Separate test files (matching test/|spec/|__tests__/) from production files.
git log origin/main --since="<window>" --format="COMMIT:%H|%aN" --numstat git log origin/<default> --since="<window>" --format="COMMIT:%H|%aN" --numstat
# 3. Commit timestamps for session detection and hourly distribution (with author) # 3. Commit timestamps for session detection and hourly distribution (with author)
# Use TZ=America/Los_Angeles for Pacific time conversion # Use TZ=America/Los_Angeles for Pacific time conversion
TZ=America/Los_Angeles git log origin/main --since="<window>" --format="%at|%aN|%ai|%s" | sort -n TZ=America/Los_Angeles git log origin/<default> --since="<window>" --format="%at|%aN|%ai|%s" | sort -n
# 4. Files most frequently changed (hotspot analysis) # 4. Files most frequently changed (hotspot analysis)
git log origin/main --since="<window>" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn git log origin/<default> --since="<window>" --format="" --name-only | grep -v '^$' | sort | uniq -c | sort -rn
# 5. PR numbers from commit messages (extract #NNN patterns) # 5. PR numbers from commit messages (extract #NNN patterns)
git log origin/main --since="<window>" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/' git log origin/<default> --since="<window>" --format="%s" | grep -oE '#[0-9]+' | sed 's/^#//' | sort -n | uniq | sed 's/^/#/'
# 6. Per-author file hotspots (who touches what) # 6. Per-author file hotspots (who touches what)
git log origin/main --since="<window>" --format="AUTHOR:%aN" --name-only git log origin/<default> --since="<window>" --format="AUTHOR:%aN" --name-only
# 7. Per-author commit counts (quick summary) # 7. Per-author commit counts (quick summary)
git shortlog origin/main --since="<window>" -sn --no-merges git shortlog origin/<default> --since="<window>" -sn --no-merges
# 8. Greptile triage history (if available) # 8. Greptile triage history (if available)
cat ~/.gstack/greptile-history.md 2>/dev/null || true cat ~/.gstack/greptile-history.md 2>/dev/null || true
@ -241,14 +251,14 @@ If the time window is 14 days or more, split into weekly buckets and show trends
### Step 11: Streak Tracking ### Step 11: Streak Tracking
Count consecutive days with at least 1 commit to origin/main, going back from today. Track both team streak and personal streak: Count consecutive days with at least 1 commit to origin/<default>, going back from today. Track both team streak and personal streak:
```bash ```bash
# Team streak: all unique commit dates (Pacific time) — no hard cutoff # Team streak: all unique commit dates (Pacific time) — no hard cutoff
TZ=America/Los_Angeles git log origin/main --format="%ad" --date=format:"%Y-%m-%d" | sort -u TZ=America/Los_Angeles git log origin/<default> --format="%ad" --date=format:"%Y-%m-%d" | sort -u
# Personal streak: only the current user's commits # Personal streak: only the current user's commits
TZ=America/Los_Angeles git log origin/main --author="<user_name>" --format="%ad" --date=format:"%Y-%m-%d" | sort -u TZ=America/Los_Angeles git log origin/<default> --author="<user_name>" --format="%ad" --date=format:"%Y-%m-%d" | sort -u
``` ```
Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both: Count backward from today — how many consecutive days have at least one commit? This queries the full history so streaks of any length are reported accurately. Display both:
@ -466,7 +476,7 @@ When the user runs `/retro compare` (or `/retro compare 14d`):
## Important Rules ## Important Rules
- ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot. - ALL narrative output goes directly to the user in the conversation. The ONLY file written is the `.context/retros/` JSON snapshot.
- Use `origin/main` for all git queries (not local main which may be stale) - Use `origin/<default>` for all git queries (not local main which may be stale)
- Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`) - Convert all timestamps to Pacific time for display (use `TZ=America/Los_Angeles`)
- If the window has zero commits, say so and suggest a different window - If the window has zero commits, say so and suggest a different window
- Round LOC/hour to nearest 50 - Round LOC/hour to nearest 50

View File

@ -2,7 +2,7 @@
name: review name: review
version: 1.0.0 version: 1.0.0
description: | description: |
Pre-landing PR review. Analyzes diff against main for SQL safety, LLM trust Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust
boundary violations, conditional side effects, and other structural issues. boundary violations, conditional side effects, and other structural issues.
allowed-tools: allowed-tools:
- Bash - Bash
@ -44,12 +44,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -58,20 +61,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol
@ -98,17 +104,36 @@ ATTEMPTED: [what you tried]
RECOMMENDATION: [what the user should do next] RECOMMENDATION: [what the user should do next]
``` ```
## Step 0: Detect base branch
Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps.
1. Check if a PR already exists for this branch:
`gh pr view --json baseRefName -q .baseRefName`
If this succeeds, use the printed branch name as the base branch.
2. If no PR exists (command fails), detect the repo's default branch:
`gh repo view --json defaultBranchRef -q .defaultBranchRef.name`
3. If both commands fail, fall back to `main`.
Print the detected base branch name. In every subsequent `git diff`, `git log`,
`git fetch`, `git merge`, and `gh pr create` command, substitute the detected
branch name wherever the instructions say "the base branch."
---
# Pre-Landing PR Review # Pre-Landing PR Review
You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch. You are running the `/review` workflow. Analyze the current branch's diff against the base branch for structural issues that tests don't catch.
--- ---
## Step 1: Check branch ## Step 1: Check branch
1. Run `git branch --show-current` to get the current branch. 1. Run `git branch --show-current` to get the current branch.
2. If on `main`, output: **"Nothing to review — you're on main or have no changes against main."** and stop. 2. If on the base branch, output: **"Nothing to review — you're on the base branch or have no changes against it."** and stop.
3. Run `git fetch origin main --quiet && git diff origin/main --stat` to check if there's a diff. If no diff, output the same message and stop. 3. Run `git fetch origin <base> --quiet && git diff origin/<base> --stat` to check if there's a diff. If no diff, output the same message and stop.
--- ---
@ -117,10 +142,10 @@ You are running the `/review` workflow. Analyze the current branch's diff agains
Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?** Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?**
1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`). 1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`).
Read commit messages (`git log origin/main..HEAD --oneline`). Read commit messages (`git log origin/<base>..HEAD --oneline`).
**If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR. **If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR.
2. Identify the **stated intent** — what was this branch supposed to accomplish? 2. Identify the **stated intent** — what was this branch supposed to accomplish?
3. Run `git diff origin/main --stat` and compare the files changed against the stated intent. 3. Run `git diff origin/<base> --stat` and compare the files changed against the stated intent.
4. Evaluate with skepticism: 4. Evaluate with skepticism:
**SCOPE CREEP detection:** **SCOPE CREEP detection:**
@ -166,13 +191,13 @@ Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, cl
## Step 3: Get the diff ## Step 3: Get the diff
Fetch the latest main to avoid false positives from a stale local main: Fetch the latest base branch to avoid false positives from stale local state:
```bash ```bash
git fetch origin main --quiet git fetch origin <base> --quiet
``` ```
Run `git diff origin/main` to get the full diff. This includes both committed and uncommitted changes against the latest main. Run `git diff origin/<base>` to get the full diff. This includes both committed and uncommitted changes against the latest base branch.
--- ---

View File

@ -2,7 +2,7 @@
name: review name: review
version: 1.0.0 version: 1.0.0
description: | description: |
Pre-landing PR review. Analyzes diff against main for SQL safety, LLM trust Pre-landing PR review. Analyzes diff against the base branch for SQL safety, LLM trust
boundary violations, conditional side effects, and other structural issues. boundary violations, conditional side effects, and other structural issues.
allowed-tools: allowed-tools:
- Bash - Bash
@ -16,17 +16,19 @@ allowed-tools:
{{PREAMBLE}} {{PREAMBLE}}
{{BASE_BRANCH_DETECT}}
# Pre-Landing PR Review # Pre-Landing PR Review
You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch. You are running the `/review` workflow. Analyze the current branch's diff against the base branch for structural issues that tests don't catch.
--- ---
## Step 1: Check branch ## Step 1: Check branch
1. Run `git branch --show-current` to get the current branch. 1. Run `git branch --show-current` to get the current branch.
2. If on `main`, output: **"Nothing to review — you're on main or have no changes against main."** and stop. 2. If on the base branch, output: **"Nothing to review — you're on the base branch or have no changes against it."** and stop.
3. Run `git fetch origin main --quiet && git diff origin/main --stat` to check if there's a diff. If no diff, output the same message and stop. 3. Run `git fetch origin <base> --quiet && git diff origin/<base> --stat` to check if there's a diff. If no diff, output the same message and stop.
--- ---
@ -35,10 +37,10 @@ You are running the `/review` workflow. Analyze the current branch's diff agains
Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?** Before reviewing code quality, check: **did they build what was requested — nothing more, nothing less?**
1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`). 1. Read `TODOS.md` (if it exists). Read PR description (`gh pr view --json body --jq .body 2>/dev/null || true`).
Read commit messages (`git log origin/main..HEAD --oneline`). Read commit messages (`git log origin/<base>..HEAD --oneline`).
**If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR. **If no PR exists:** rely on commit messages and TODOS.md for stated intent — this is the common case since /review runs before /ship creates the PR.
2. Identify the **stated intent** — what was this branch supposed to accomplish? 2. Identify the **stated intent** — what was this branch supposed to accomplish?
3. Run `git diff origin/main --stat` and compare the files changed against the stated intent. 3. Run `git diff origin/<base> --stat` and compare the files changed against the stated intent.
4. Evaluate with skepticism: 4. Evaluate with skepticism:
**SCOPE CREEP detection:** **SCOPE CREEP detection:**
@ -84,13 +86,13 @@ Read `.claude/skills/review/greptile-triage.md` and follow the fetch, filter, cl
## Step 3: Get the diff ## Step 3: Get the diff
Fetch the latest main to avoid false positives from a stale local main: Fetch the latest base branch to avoid false positives from stale local state:
```bash ```bash
git fetch origin main --quiet git fetch origin <base> --quiet
``` ```
Run `git diff origin/main` to get the full diff. This includes both committed and uncommitted changes against the latest main. Run `git diff origin/<base>` to get the full diff. This includes both committed and uncommitted changes against the latest base branch.
--- ---

View File

@ -123,12 +123,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If \`_CONTRIB\` is \`true\`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If \`_CONTRIB\` is \`true\`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write \`~/.gstack/contributor-logs/{slug}.md\` with this structure: **Calibration this is the bar:** For example, \`$B js "await fetch(...)"\` used to fail with \`SyntaxError: await is only valid in async functions\` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write \`~/.gstack/contributor-logs/{slug}.md\` with **all sections below** (do not truncate — include every section through the Date/Version footer):
\`\`\` \`\`\`
# {Title} # {Title}
@ -137,20 +140,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) \`\`\`
{paste the actual error or unexpected output here}
\`\`\`
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
\`\`\` \`\`\`
Then run: \`mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md\` Slug: lowercase, hyphens, max 60 chars (e.g. \`browse-js-no-await\`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. \`browse-snapshot-ref-gap\`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol
@ -199,6 +205,27 @@ If \`NEEDS_SETUP\`:
3. If \`bun\` is not installed: \`curl -fsSL https://bun.sh/install | bash\``; 3. If \`bun\` is not installed: \`curl -fsSL https://bun.sh/install | bash\``;
} }
function generateBaseBranchDetect(): string {
return `## Step 0: Detect base branch
Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps.
1. Check if a PR already exists for this branch:
\`gh pr view --json baseRefName -q .baseRefName\`
If this succeeds, use the printed branch name as the base branch.
2. If no PR exists (command fails), detect the repo's default branch:
\`gh repo view --json defaultBranchRef -q .defaultBranchRef.name\`
3. If both commands fail, fall back to \`main\`.
Print the detected base branch name. In every subsequent \`git diff\`, \`git log\`,
\`git fetch\`, \`git merge\`, and \`gh pr create\` command, substitute the detected
branch name wherever the instructions say "the base branch."
---`;
}
function generateQAMethodology(): string { function generateQAMethodology(): string {
return `## Modes return `## Modes
@ -480,6 +507,7 @@ const RESOLVERS: Record<string, () => string> = {
SNAPSHOT_FLAGS: generateSnapshotFlags, SNAPSHOT_FLAGS: generateSnapshotFlags,
PREAMBLE: generatePreamble, PREAMBLE: generatePreamble,
BROWSE_SETUP: generateBrowseSetup, BROWSE_SETUP: generateBrowseSetup,
BASE_BRANCH_DETECT: generateBaseBranchDetect,
QA_METHODOLOGY: generateQAMethodology, QA_METHODOLOGY: generateQAMethodology,
}; };

View File

@ -41,12 +41,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -55,20 +58,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol

View File

@ -2,7 +2,7 @@
name: ship name: ship
version: 1.0.0 version: 1.0.0
description: | description: |
Ship workflow: merge main, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR.
allowed-tools: allowed-tools:
- Bash - Bash
- Read - Read
@ -43,12 +43,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
## Contributor Mode ## Contributor Mode
If `_CONTRIB` is `true`: you are in **contributor mode**. When you hit friction with **gstack itself** (not the user's app), file a field report. Think: "hey, I was trying to do X with gstack and it didn't work / was confusing / was annoying. Here's what happened." If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
**gstack issues:** browse command fails/wrong output, snapshot missing elements, skill instructions unclear or misleading, binary crash/hang, unhelpful error message, any rough edge or annoyance — even minor stuff. **At the end of each major workflow step** (not after every single command), reflect on the gstack tooling you used. Rate your experience 0 to 10. If it wasn't a 10, think about why. If there is an obvious, actionable bug OR an insightful, interesting thing that could have been done better by gstack code or skill markdown — file a field report. Maybe our contributor will help make us better!
**NOT gstack issues:** user's app bugs, network errors to user's URL, auth failures on user's site.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with this structure: **Calibration — this is the bar:** For example, `$B js "await fetch(...)"` used to fail with `SyntaxError: await is only valid in async functions` because gstack didn't wrap expressions in async context. Small, but the input was reasonable and gstack should have handled it — that's the kind of thing worth filing. Things less consequential than this, ignore.
**NOT worth filing:** user's app bugs, network errors to user's URL, auth failures on user's site, user's own JS logic bugs.
**To file:** write `~/.gstack/contributor-logs/{slug}.md` with **all sections below** (do not truncate — include every section through the Date/Version footer):
``` ```
# {Title} # {Title}
@ -57,20 +60,23 @@ Hey gstack team — ran into this while using /{skill-name}:
**What I was trying to do:** {what the user/agent was attempting} **What I was trying to do:** {what the user/agent was attempting}
**What happened instead:** {what actually happened} **What happened instead:** {what actually happened}
**How annoying (1-5):** {1=meh, 3=friction, 5=blocker} **My rating:** {0-10} — {one sentence on why it wasn't a 10}
## Steps to reproduce ## Steps to reproduce
1. {step} 1. {step}
## Raw output ## Raw output
(wrap any error messages or unexpected output in a markdown code block) ```
{paste the actual error or unexpected output here}
```
## What would make this a 10
{one sentence: what gstack should have done differently}
**Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill} **Date:** {YYYY-MM-DD} | **Version:** {gstack version} | **Skill:** /{skill}
``` ```
Then run: `mkdir -p ~/.gstack/contributor-logs && open ~/.gstack/contributor-logs/{slug}.md` Slug: lowercase, hyphens, max 60 chars (e.g. `browse-js-no-await`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
Slug: lowercase, hyphens, max 60 chars (e.g. `browse-snapshot-ref-gap`). Skip if file already exists. Max 3 reports per session. File inline and continue — don't stop the workflow. Tell user: "Filed gstack field report: {title}"
## Completion Status Protocol ## Completion Status Protocol
@ -97,12 +103,31 @@ ATTEMPTED: [what you tried]
RECOMMENDATION: [what the user should do next] RECOMMENDATION: [what the user should do next]
``` ```
## Step 0: Detect base branch
Determine which branch this PR targets. Use the result as "the base branch" in all subsequent steps.
1. Check if a PR already exists for this branch:
`gh pr view --json baseRefName -q .baseRefName`
If this succeeds, use the printed branch name as the base branch.
2. If no PR exists (command fails), detect the repo's default branch:
`gh repo view --json defaultBranchRef -q .defaultBranchRef.name`
3. If both commands fail, fall back to `main`.
Print the detected base branch name. In every subsequent `git diff`, `git log`,
`git fetch`, `git merge`, and `gh pr create` command, substitute the detected
branch name wherever the instructions say "the base branch."
---
# Ship: Fully Automated Ship Workflow # Ship: Fully Automated Ship Workflow
You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end. You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end.
**Only stop for:** **Only stop for:**
- On `main` branch (abort) - On the base branch (abort)
- Merge conflicts that can't be auto-resolved (stop, show conflicts) - Merge conflicts that can't be auto-resolved (stop, show conflicts)
- Test failures (stop, show failures) - Test failures (stop, show failures)
- Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip) - Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip)
@ -123,20 +148,20 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat
## Step 1: Pre-flight ## Step 1: Pre-flight
1. Check the current branch. If on `main`, **abort**: "You're on main. Ship from a feature branch." 1. Check the current branch. If on the base branch or the repo's default branch, **abort**: "You're on the base branch. Ship from a feature branch."
2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask. 2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask.
3. Run `git diff main...HEAD --stat` and `git log main..HEAD --oneline` to understand what's being shipped. 3. Run `git diff <base>...HEAD --stat` and `git log <base>..HEAD --oneline` to understand what's being shipped.
--- ---
## Step 2: Merge origin/main (BEFORE tests) ## Step 2: Merge the base branch (BEFORE tests)
Fetch and merge `origin/main` into the feature branch so tests run against the merged state: Fetch and merge the base branch into the feature branch so tests run against the merged state:
```bash ```bash
git fetch origin main && git merge origin/main --no-edit git fetch origin <base> && git merge origin/<base> --no-edit
``` ```
**If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them. **If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them.
@ -174,7 +199,7 @@ Evals are mandatory when prompt-related files change. Skip this step entirely if
**1. Check if the diff touches prompt-related files:** **1. Check if the diff touches prompt-related files:**
```bash ```bash
git diff origin/main --name-only git diff origin/<base> --name-only
``` ```
Match against these patterns (from CLAUDE.md): Match against these patterns (from CLAUDE.md):
@ -235,7 +260,7 @@ Review the diff for structural issues that tests don't catch.
1. Read `.claude/skills/review/checklist.md`. If the file cannot be read, **STOP** and report the error. 1. Read `.claude/skills/review/checklist.md`. If the file cannot be read, **STOP** and report the error.
2. Run `git diff origin/main` to get the full diff (scoped to feature changes against the freshly-fetched remote main). 2. Run `git diff origin/<base>` to get the full diff (scoped to feature changes against the freshly-fetched base branch).
3. Apply the review checklist in two passes: 3. Apply the review checklist in two passes:
- **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary - **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary
@ -303,7 +328,7 @@ For each classified comment:
1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`) 1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`)
2. **Auto-decide the bump level based on the diff:** 2. **Auto-decide the bump level based on the diff:**
- Count lines changed (`git diff origin/main...HEAD --stat | tail -1`) - Count lines changed (`git diff origin/<base>...HEAD --stat | tail -1`)
- **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config - **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config
- **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features - **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features
- **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes - **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes
@ -322,8 +347,8 @@ For each classified comment:
1. Read `CHANGELOG.md` header to know the format. 1. Read `CHANGELOG.md` header to know the format.
2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones): 2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones):
- Use `git log main..HEAD --oneline` to see every commit being shipped - Use `git log <base>..HEAD --oneline` to see every commit being shipped
- Use `git diff main...HEAD` to see the full diff against main - Use `git diff <base>...HEAD` to see the full diff against the base branch
- The CHANGELOG entry must be comprehensive of ALL changes going into the PR - The CHANGELOG entry must be comprehensive of ALL changes going into the PR
- If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version
- Categorize changes into applicable sections: - Categorize changes into applicable sections:
@ -371,8 +396,8 @@ Read TODOS.md and verify it follows the recommended structure:
This step is fully automatic — no user interaction. This step is fully automatic — no user interaction.
Use the diff and commit history already gathered in earlier steps: Use the diff and commit history already gathered in earlier steps:
- `git diff main...HEAD` (full diff against main) - `git diff <base>...HEAD` (full diff against the base branch)
- `git log main..HEAD --oneline` (all commits being shipped) - `git log <base>..HEAD --oneline` (all commits being shipped)
For each TODO item, check if the changes in this PR complete it by: For each TODO item, check if the changes in this PR complete it by:
- Matching commit messages against the TODO title and description - Matching commit messages against the TODO title and description
@ -469,7 +494,7 @@ git push -u origin <branch-name>
Create a pull request using `gh`: Create a pull request using `gh`:
```bash ```bash
gh pr create --title "<type>: <summary>" --body "$(cat <<'EOF' gh pr create --base <base> --title "<type>: <summary>" --body "$(cat <<'EOF'
## Summary ## Summary
<bullet points from CHANGELOG> <bullet points from CHANGELOG>

View File

@ -2,7 +2,7 @@
name: ship name: ship
version: 1.0.0 version: 1.0.0
description: | description: |
Ship workflow: merge main, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR. Ship workflow: detect + merge base branch, run tests, review diff, bump VERSION, update CHANGELOG, commit, push, create PR.
allowed-tools: allowed-tools:
- Bash - Bash
- Read - Read
@ -15,12 +15,14 @@ allowed-tools:
{{PREAMBLE}} {{PREAMBLE}}
{{BASE_BRANCH_DETECT}}
# Ship: Fully Automated Ship Workflow # Ship: Fully Automated Ship Workflow
You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end. You are running the `/ship` workflow. This is a **non-interactive, fully automated** workflow. Do NOT ask for confirmation at any step. The user said `/ship` which means DO IT. Run straight through and output the PR URL at the end.
**Only stop for:** **Only stop for:**
- On `main` branch (abort) - On the base branch (abort)
- Merge conflicts that can't be auto-resolved (stop, show conflicts) - Merge conflicts that can't be auto-resolved (stop, show conflicts)
- Test failures (stop, show failures) - Test failures (stop, show failures)
- Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip) - Pre-landing review finds CRITICAL issues and user chooses to fix (not acknowledge or skip)
@ -41,20 +43,20 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat
## Step 1: Pre-flight ## Step 1: Pre-flight
1. Check the current branch. If on `main`, **abort**: "You're on main. Ship from a feature branch." 1. Check the current branch. If on the base branch or the repo's default branch, **abort**: "You're on the base branch. Ship from a feature branch."
2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask. 2. Run `git status` (never use `-uall`). Uncommitted changes are always included — no need to ask.
3. Run `git diff main...HEAD --stat` and `git log main..HEAD --oneline` to understand what's being shipped. 3. Run `git diff <base>...HEAD --stat` and `git log <base>..HEAD --oneline` to understand what's being shipped.
--- ---
## Step 2: Merge origin/main (BEFORE tests) ## Step 2: Merge the base branch (BEFORE tests)
Fetch and merge `origin/main` into the feature branch so tests run against the merged state: Fetch and merge the base branch into the feature branch so tests run against the merged state:
```bash ```bash
git fetch origin main && git merge origin/main --no-edit git fetch origin <base> && git merge origin/<base> --no-edit
``` ```
**If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them. **If there are merge conflicts:** Try to auto-resolve if they are simple (VERSION, schema.rb, CHANGELOG ordering). If conflicts are complex or ambiguous, **STOP** and show them.
@ -92,7 +94,7 @@ Evals are mandatory when prompt-related files change. Skip this step entirely if
**1. Check if the diff touches prompt-related files:** **1. Check if the diff touches prompt-related files:**
```bash ```bash
git diff origin/main --name-only git diff origin/<base> --name-only
``` ```
Match against these patterns (from CLAUDE.md): Match against these patterns (from CLAUDE.md):
@ -153,7 +155,7 @@ Review the diff for structural issues that tests don't catch.
1. Read `.claude/skills/review/checklist.md`. If the file cannot be read, **STOP** and report the error. 1. Read `.claude/skills/review/checklist.md`. If the file cannot be read, **STOP** and report the error.
2. Run `git diff origin/main` to get the full diff (scoped to feature changes against the freshly-fetched remote main). 2. Run `git diff origin/<base>` to get the full diff (scoped to feature changes against the freshly-fetched base branch).
3. Apply the review checklist in two passes: 3. Apply the review checklist in two passes:
- **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary - **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary
@ -221,7 +223,7 @@ For each classified comment:
1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`) 1. Read the current `VERSION` file (4-digit format: `MAJOR.MINOR.PATCH.MICRO`)
2. **Auto-decide the bump level based on the diff:** 2. **Auto-decide the bump level based on the diff:**
- Count lines changed (`git diff origin/main...HEAD --stat | tail -1`) - Count lines changed (`git diff origin/<base>...HEAD --stat | tail -1`)
- **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config - **MICRO** (4th digit): < 50 lines changed, trivial tweaks, typos, config
- **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features - **PATCH** (3rd digit): 50+ lines changed, bug fixes, small-medium features
- **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes - **MINOR** (2nd digit): **ASK the user** — only for major features or significant architectural changes
@ -240,8 +242,8 @@ For each classified comment:
1. Read `CHANGELOG.md` header to know the format. 1. Read `CHANGELOG.md` header to know the format.
2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones): 2. Auto-generate the entry from **ALL commits on the branch** (not just recent ones):
- Use `git log main..HEAD --oneline` to see every commit being shipped - Use `git log <base>..HEAD --oneline` to see every commit being shipped
- Use `git diff main...HEAD` to see the full diff against main - Use `git diff <base>...HEAD` to see the full diff against the base branch
- The CHANGELOG entry must be comprehensive of ALL changes going into the PR - The CHANGELOG entry must be comprehensive of ALL changes going into the PR
- If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version - If existing CHANGELOG entries on the branch already cover some commits, replace them with one unified entry for the new version
- Categorize changes into applicable sections: - Categorize changes into applicable sections:
@ -289,8 +291,8 @@ Read TODOS.md and verify it follows the recommended structure:
This step is fully automatic — no user interaction. This step is fully automatic — no user interaction.
Use the diff and commit history already gathered in earlier steps: Use the diff and commit history already gathered in earlier steps:
- `git diff main...HEAD` (full diff against main) - `git diff <base>...HEAD` (full diff against the base branch)
- `git log main..HEAD --oneline` (all commits being shipped) - `git log <base>..HEAD --oneline` (all commits being shipped)
For each TODO item, check if the changes in this PR complete it by: For each TODO item, check if the changes in this PR complete it by:
- Matching commit messages against the TODO title and description - Matching commit messages against the TODO title and description
@ -387,7 +389,7 @@ git push -u origin <branch-name>
Create a pull request using `gh`: Create a pull request using `gh`:
```bash ```bash
gh pr create --title "<type>: <summary>" --body "$(cat <<'EOF' gh pr create --base <base> --title "<type>: <summary>" --body "$(cat <<'EOF'
## Summary ## Summary
<bullet points from CHANGELOG> <bullet points from CHANGELOG>

View File

@ -203,6 +203,27 @@ describe('gen-skill-docs', () => {
}); });
}); });
describe('BASE_BRANCH_DETECT resolver', () => {
// Find a generated SKILL.md that uses the placeholder (ship is guaranteed to)
const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
test('resolver output contains PR base detection command', () => {
expect(shipContent).toContain('gh pr view --json baseRefName');
});
test('resolver output contains repo default branch detection command', () => {
expect(shipContent).toContain('gh repo view --json defaultBranchRef');
});
test('resolver output contains fallback to main', () => {
expect(shipContent).toMatch(/fall\s*back\s+to\s+`main`/i);
});
test('resolver output uses "the base branch" phrasing', () => {
expect(shipContent).toContain('the base branch');
});
});
/** /**
* Quality evals catch description regressions. * Quality evals catch description regressions.
* *

View File

@ -13,6 +13,11 @@ import * as os from 'os';
const ROOT = path.resolve(import.meta.dir, '..'); const ROOT = path.resolve(import.meta.dir, '..');
// Skip unless EVALS=1. Session runner strips CLAUDE* env vars to avoid nested session issues. // Skip unless EVALS=1. Session runner strips CLAUDE* env vars to avoid nested session issues.
//
// BLAME PROTOCOL: When an eval fails, do NOT claim "pre-existing" or "not related
// to our changes" without proof. Run the same eval on main to verify. These tests
// have invisible couplings — preamble text, SKILL.md content, and timing all affect
// agent behavior. See CLAUDE.md "E2E eval failure blame protocol" for details.
const evalsEnabled = !!process.env.EVALS; const evalsEnabled = !!process.env.EVALS;
const describeE2E = evalsEnabled ? describe : describe.skip; const describeE2E = evalsEnabled ? describe : describe.skip;
@ -322,10 +327,16 @@ File a contributor report about this issue. Then tell me what you filed.`,
const logFiles = fs.readdirSync(logsDir).filter(f => f.endsWith('.md')); const logFiles = fs.readdirSync(logsDir).filter(f => f.endsWith('.md'));
expect(logFiles.length).toBeGreaterThan(0); expect(logFiles.length).toBeGreaterThan(0);
// Verify new reflection-based format
const logContent = fs.readFileSync(path.join(logsDir, logFiles[0]), 'utf-8'); const logContent = fs.readFileSync(path.join(logsDir, logFiles[0]), 'utf-8');
expect(logContent).toContain('Hey gstack team'); expect(logContent).toContain('Hey gstack team');
expect(logContent).toContain('What I was trying to do'); expect(logContent).toContain('What I was trying to do');
expect(logContent).toContain('What happened instead'); expect(logContent).toContain('What happened instead');
expect(logContent).toMatch(/rating/i);
// Verify report has repro steps (agent may use "Steps to reproduce", "Repro Steps", etc.)
expect(logContent).toMatch(/repro|steps to reproduce|how to reproduce/i);
// Verify report has date/version footer (agent may format differently)
expect(logContent).toMatch(/date.*2026|2026.*date/i);
// Clean up // Clean up
try { fs.rmSync(contribDir, { recursive: true, force: true }); } catch {} try { fs.rmSync(contribDir, { recursive: true, force: true }); } catch {}
@ -424,16 +435,20 @@ describeE2E('QA skill E2E', () => {
test('/qa quick completes without browse errors', async () => { test('/qa quick completes without browse errors', async () => {
const result = await runSkillTest({ const result = await runSkillTest({
prompt: `You have a browse binary at ${browseBin}. Assign it to B variable like: B="${browseBin}" prompt: `B="${browseBin}"
The test server is already running at: ${testServer.url}
Target page: ${testServer.url}/basic.html
Read the file qa/SKILL.md for the QA workflow instructions. Read the file qa/SKILL.md for the QA workflow instructions.
Run a Quick-depth QA test on ${testServer.url}/basic.html Run a Quick-depth QA test on ${testServer.url}/basic.html
Do NOT use AskUserQuestion run Quick tier directly. Do NOT use AskUserQuestion run Quick tier directly.
Do NOT try to start a server or discover ports the URL above is ready.
Write your report to ${qaDir}/qa-reports/qa-report.md`, Write your report to ${qaDir}/qa-reports/qa-report.md`,
workingDirectory: qaDir, workingDirectory: qaDir,
maxTurns: 35, maxTurns: 35,
timeout: 180_000, timeout: 240_000,
testName: 'qa-quick', testName: 'qa-quick',
runId, runId,
}); });
@ -448,7 +463,7 @@ Write your report to ${qaDir}/qa-reports/qa-report.md`,
} }
// Accept error_max_turns — the agent doing thorough QA work is not a failure // Accept error_max_turns — the agent doing thorough QA work is not a failure
expect(['success', 'error_max_turns']).toContain(result.exitReason); expect(['success', 'error_max_turns']).toContain(result.exitReason);
}, 240_000); }, 300_000);
}); });
// --- B5: Review skill E2E --- // --- B5: Review skill E2E ---
@ -1344,6 +1359,193 @@ Write your review to ${planDir}/review-output.md`,
}, 420_000); }, 420_000);
}); });
// --- Base branch detection smoke tests ---
describeE2E('Base branch detection', () => {
let baseBranchDir: string;
const run = (cmd: string, args: string[], cwd: string) =>
spawnSync(cmd, args, { cwd, stdio: 'pipe', timeout: 5000 });
beforeAll(() => {
baseBranchDir = fs.mkdtempSync(path.join(os.tmpdir(), 'skill-e2e-basebranch-'));
});
afterAll(() => {
try { fs.rmSync(baseBranchDir, { recursive: true, force: true }); } catch {}
});
test('/review detects base branch and diffs against it', async () => {
const dir = path.join(baseBranchDir, 'review-base');
fs.mkdirSync(dir, { recursive: true });
// Create git repo with a feature branch off main
run('git', ['init'], dir);
run('git', ['config', 'user.email', 'test@test.com'], dir);
run('git', ['config', 'user.name', 'Test'], dir);
fs.writeFileSync(path.join(dir, 'app.rb'), '# clean base\nclass App\nend\n');
run('git', ['add', 'app.rb'], dir);
run('git', ['commit', '-m', 'initial commit'], dir);
// Create feature branch with a change
run('git', ['checkout', '-b', 'feature/test-review'], dir);
fs.writeFileSync(path.join(dir, 'app.rb'), '# clean base\nclass App\n def hello; "world"; end\nend\n');
run('git', ['add', 'app.rb'], dir);
run('git', ['commit', '-m', 'feat: add hello method'], dir);
// Copy review skill files
fs.copyFileSync(path.join(ROOT, 'review', 'SKILL.md'), path.join(dir, 'review-SKILL.md'));
fs.copyFileSync(path.join(ROOT, 'review', 'checklist.md'), path.join(dir, 'review-checklist.md'));
fs.copyFileSync(path.join(ROOT, 'review', 'greptile-triage.md'), path.join(dir, 'review-greptile-triage.md'));
const result = await runSkillTest({
prompt: `You are in a git repo on a feature branch with changes.
Read review-SKILL.md for the review workflow instructions.
Also read review-checklist.md and apply it.
IMPORTANT: Follow Step 0 to detect the base branch. Since there is no remote, gh commands will fail fall back to main.
Then run the review against the detected base branch.
Write your findings to ${dir}/review-output.md`,
workingDirectory: dir,
maxTurns: 15,
timeout: 90_000,
testName: 'review-base-branch',
runId,
});
logCost('/review base-branch', result);
recordE2E('/review base branch detection', 'Base branch detection', result);
expect(result.exitReason).toBe('success');
// Verify the review used "base branch" language (from Step 0)
const toolOutputs = result.toolCalls.map(tc => tc.output || '').join('\n');
const allOutput = (result.output || '') + toolOutputs;
// The agent should have run git diff against main (the fallback)
const usedGitDiff = result.toolCalls.some(tc =>
tc.tool === 'Bash' && typeof tc.input === 'string' && tc.input.includes('git diff')
);
expect(usedGitDiff).toBe(true);
}, 120_000);
test('/ship Step 0-1 detects base branch without destructive actions', async () => {
const dir = path.join(baseBranchDir, 'ship-base');
fs.mkdirSync(dir, { recursive: true });
// Create git repo with feature branch
run('git', ['init'], dir);
run('git', ['config', 'user.email', 'test@test.com'], dir);
run('git', ['config', 'user.name', 'Test'], dir);
fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("v1");\n');
run('git', ['add', 'app.ts'], dir);
run('git', ['commit', '-m', 'initial'], dir);
run('git', ['checkout', '-b', 'feature/ship-test'], dir);
fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("v2");\n');
run('git', ['add', 'app.ts'], dir);
run('git', ['commit', '-m', 'feat: update to v2'], dir);
// Copy ship skill
fs.copyFileSync(path.join(ROOT, 'ship', 'SKILL.md'), path.join(dir, 'ship-SKILL.md'));
const result = await runSkillTest({
prompt: `Read ship-SKILL.md for the ship workflow.
Run ONLY Step 0 (Detect base branch) and Step 1 (Pre-flight) from the ship workflow.
Since there is no remote, gh commands will fail fall back to main.
After completing Step 0 and Step 1, STOP. Do NOT proceed to Step 2 or beyond.
Do NOT push, create PRs, or modify VERSION/CHANGELOG.
Write a summary of what you detected to ${dir}/ship-preflight.md including:
- The detected base branch name
- The current branch name
- The diff stat against the base branch`,
workingDirectory: dir,
maxTurns: 10,
timeout: 60_000,
testName: 'ship-base-branch',
runId,
});
logCost('/ship base-branch', result);
recordE2E('/ship base branch detection', 'Base branch detection', result);
expect(result.exitReason).toBe('success');
// Verify preflight output was written
const preflightPath = path.join(dir, 'ship-preflight.md');
if (fs.existsSync(preflightPath)) {
const content = fs.readFileSync(preflightPath, 'utf-8');
expect(content.length).toBeGreaterThan(20);
// Should mention the branch name
expect(content.toLowerCase()).toMatch(/main|base/);
}
// Verify no destructive actions — no push, no PR creation
const destructiveTools = result.toolCalls.filter(tc =>
tc.tool === 'Bash' && typeof tc.input === 'string' &&
(tc.input.includes('git push') || tc.input.includes('gh pr create'))
);
expect(destructiveTools).toHaveLength(0);
}, 90_000);
test('/retro detects default branch for git queries', async () => {
const dir = path.join(baseBranchDir, 'retro-base');
fs.mkdirSync(dir, { recursive: true });
// Create git repo with commit history
run('git', ['init'], dir);
run('git', ['config', 'user.email', 'dev@example.com'], dir);
run('git', ['config', 'user.name', 'Dev'], dir);
fs.writeFileSync(path.join(dir, 'app.ts'), 'console.log("hello");\n');
run('git', ['add', 'app.ts'], dir);
run('git', ['commit', '-m', 'feat: initial app', '--date', '2026-03-14T09:00:00'], dir);
fs.writeFileSync(path.join(dir, 'auth.ts'), 'export function login() {}\n');
run('git', ['add', 'auth.ts'], dir);
run('git', ['commit', '-m', 'feat: add auth', '--date', '2026-03-15T10:00:00'], dir);
fs.writeFileSync(path.join(dir, 'test.ts'), 'test("it works", () => {});\n');
run('git', ['add', 'test.ts'], dir);
run('git', ['commit', '-m', 'test: add tests', '--date', '2026-03-16T11:00:00'], dir);
// Copy retro skill
fs.mkdirSync(path.join(dir, 'retro'), { recursive: true });
fs.copyFileSync(path.join(ROOT, 'retro', 'SKILL.md'), path.join(dir, 'retro', 'SKILL.md'));
const result = await runSkillTest({
prompt: `Read retro/SKILL.md for instructions on how to run a retrospective.
IMPORTANT: Follow the "Detect default branch" step first. Since there is no remote, gh will fail fall back to main.
Then use the detected branch name for all git queries.
Run /retro for the last 7 days of this git repo. Skip any AskUserQuestion calls this is non-interactive.
This is a local-only repo so use the local branch (main) instead of origin/main for all git log commands.
Write your retrospective to ${dir}/retro-output.md`,
workingDirectory: dir,
maxTurns: 25,
timeout: 240_000,
testName: 'retro-base-branch',
runId,
});
logCost('/retro base-branch', result);
recordE2E('/retro default branch detection', 'Base branch detection', result, {
passed: ['success', 'error_max_turns'].includes(result.exitReason),
});
expect(['success', 'error_max_turns']).toContain(result.exitReason);
// Verify retro output was produced
const retroPath = path.join(dir, 'retro-output.md');
if (fs.existsSync(retroPath)) {
const content = fs.readFileSync(retroPath, 'utf-8');
expect(content.length).toBeGreaterThan(100);
}
}, 300_000);
});
// --- Deferred skill E2E tests (destructive or require interactive UI) --- // --- Deferred skill E2E tests (destructive or require interactive UI) ---
describeE2E('Deferred skill E2E', () => { describeE2E('Deferred skill E2E', () => {

View File

@ -389,6 +389,64 @@ describe('Greptile history format consistency', () => {
}); });
}); });
// --- Hardcoded branch name detection in templates ---
describe('No hardcoded branch names in SKILL templates', () => {
const tmplFiles = [
'ship/SKILL.md.tmpl',
'review/SKILL.md.tmpl',
'qa/SKILL.md.tmpl',
'plan-ceo-review/SKILL.md.tmpl',
'retro/SKILL.md.tmpl',
];
// Patterns that indicate hardcoded 'main' in git commands
const gitMainPatterns = [
/\bgit\s+diff\s+(?:origin\/)?main\b/,
/\bgit\s+log\s+(?:origin\/)?main\b/,
/\bgit\s+fetch\s+origin\s+main\b/,
/\bgit\s+merge\s+origin\/main\b/,
/\borigin\/main\b/,
];
// Lines that are allowed to mention 'main' (fallback logic, prose)
const allowlist = [
/fall\s*back\s+to\s+`main`/i,
/fall\s*back\s+to\s+`?main`?/i,
/typically\s+`?main`?/i,
/If\s+on\s+`main`/i, // old pattern — should not exist
];
for (const tmplFile of tmplFiles) {
test(`${tmplFile} has no hardcoded 'main' in git commands`, () => {
const filePath = path.join(ROOT, tmplFile);
if (!fs.existsSync(filePath)) return;
const lines = fs.readFileSync(filePath, 'utf-8').split('\n');
const violations: string[] = [];
for (let i = 0; i < lines.length; i++) {
const line = lines[i];
const isAllowlisted = allowlist.some(p => p.test(line));
if (isAllowlisted) continue;
for (const pattern of gitMainPatterns) {
if (pattern.test(line)) {
violations.push(`Line ${i + 1}: ${line.trim()}`);
break;
}
}
}
if (violations.length > 0) {
throw new Error(
`${tmplFile} has hardcoded 'main' in git commands:\n` +
violations.map(v => ` ${v}`).join('\n')
);
}
});
}
});
// --- Part 7b: TODOS-format.md reference consistency --- // --- Part 7b: TODOS-format.md reference consistency ---
describe('TODOS-format.md reference consistency', () => { describe('TODOS-format.md reference consistency', () => {
@ -468,6 +526,44 @@ describe('debug skill structure', () => {
} }
}); });
// --- Contributor mode preamble structure validation ---
describe('Contributor mode preamble structure', () => {
const skillsWithPreamble = [
'SKILL.md', 'browse/SKILL.md', 'qa/SKILL.md',
'qa-only/SKILL.md',
'setup-browser-cookies/SKILL.md',
'ship/SKILL.md', 'review/SKILL.md',
'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md',
'retro/SKILL.md',
];
for (const skill of skillsWithPreamble) {
test(`${skill} has 0-10 rating in contributor mode`, () => {
const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
expect(content).toContain('0 to 10');
expect(content).toContain('My rating');
});
test(`${skill} has calibration example`, () => {
const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
expect(content).toContain('Calibration');
expect(content).toContain('the bar');
});
test(`${skill} has "what would make this a 10" field`, () => {
const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
expect(content).toContain('What would make this a 10');
});
test(`${skill} uses periodic reflection (not per-command)`, () => {
const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
expect(content).toContain('workflow step');
expect(content).not.toContain('After you use gstack-provided CLIs');
});
}
});
describe('Enum & Value Completeness in review checklist', () => { describe('Enum & Value Completeness in review checklist', () => {
const checklist = fs.readFileSync(path.join(ROOT, 'review', 'checklist.md'), 'utf-8'); const checklist = fs.readFileSync(path.join(ROOT, 'review', 'checklist.md'), 'utf-8');