* feat: browser ref staleness detection via async count() validation
resolveRef() now checks element count to detect stale refs after page
mutations (e.g. SPA navigation). RefEntry stores role+name metadata
for better diagnostics. 3 new snapshot tests for staleness detection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: qa-only skill, qa fix loop, plan-to-QA artifact flow
Add /qa-only (report-only, Edit tool blocked), restructure /qa with
find-fix-verify cycle, add {{QA_METHODOLOGY}} DRY placeholder for
shared methodology. /plan-eng-review now writes test-plan artifacts
to ~/.gstack/projects/<slug>/ for QA consumption.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: eval efficiency metrics — turns, duration, commentary across all surfaces
Add generateCommentary() for natural-language delta interpretation,
per-test turns/duration in comparison and summary output, judgePassed
unit tests, 3 new E2E tests (qa-only, qa fix loop, plan artifact).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: bump version and changelog (v0.4.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update ARCHITECTURE, BROWSER, CONTRIBUTING, README for v0.4.0
- ARCHITECTURE: add ref staleness detection section, update RefEntry type
- BROWSER: add ref staleness paragraph to snapshot system docs
- CONTRIBUTING: update eval tool descriptions with commentary feature
- README: fix missing qa-only in project-local uninstall command
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add user-facing benefit descriptions to v0.4.0 changelog
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: log non-ENOENT errors in ensureStateDir() instead of silently swallowing
Replace bare catch {} with ENOENT-only silence. Non-ENOENT errors (EACCES,
ENOSPC) are now logged to .gstack/browse-server.log. Includes test for
permission-denied scenario with chmod 444.
* feat: merge TODO.md + TODOS.md into unified backlog with shared format reference
Merge TODO.md (roadmap) and TODOS.md (near-term) into one file organized by
skill/component with P0-P4 priority ordering and Completed section. Add shared
review/TODOS-format.md for canonical format. Add static validation tests.
* feat: add 2-tier Greptile reply system with escalation detection
Add reply templates (Tier 1 friendly, Tier 2 firm), explicit escalation
detection algorithm, and severity re-ranking guidance to greptile-triage.md.
* feat: cross-skill TODOS awareness + Greptile template refs in all skills
/ship Step 5.5: auto-detect completed TODOs, offer reorganization.
/review Step 5.5: cross-reference PR against open TODOs.
/plan-ceo-review, /plan-eng-review: TODOS context in planning.
/retro: Backlog Health metric. /qa: bug TODO context in diff-aware mode.
All Greptile-aware skills now reference reply templates and escalation detection.
* chore: bump version and changelog (v0.3.8)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update CONTRIBUTING.md for v0.3.8 changes
Clarify test tier cost table (Tier 3 standalone vs combined), add TODOS.md
to "Things to know", mention Greptile triage in ship workflow description.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The `[ -n "$_UPD" ] && echo "$_UPD"` line in 5 skills was missing `|| true`,
causing exit code 1 when the update check finds no update (empty $_UPD).
Fix: convert ship/, review/, plan-ceo-review/, plan-eng-review/, retro/ to
.tmpl templates using {{UPDATE_CHECK}} placeholder (same as browse/qa/etc).
All 9 skills now generated from templates — preamble changes propagate everywhere.
Also: regenerates qa/SKILL.md which had drifted from its template, adds 12 tests
validating the update check preamble exits 0 in all skills, removes completed TODO.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>