From a25c301ad24e209f3e7291cd0bca862bd9828f5a Mon Sep 17 00:00:00 2001 From: Garry Tan Date: Mon, 25 May 2026 19:43:47 -0700 Subject: [PATCH] chore: bump version and changelog (v1.45.0.0) Design boards now live 24h, not 10 minutes. One daemon hosts every board, one tab survives the whole day. See CHANGELOG.md for the full release summary + metrics + itemized changes. TODOS.md gains a "design daemon: follow-ups" section capturing the P3 test gaps + maintainability nits the /ship review army flagged but that aren't blocking for this release. Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 59 +++++++++++++++++++++++++++++++++++++++++++++ TODOS.md | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++ VERSION | 2 +- package.json | 2 +- 4 files changed, 129 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index b32ceed4b..c41dba887 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,64 @@ # Changelog +## [1.45.0.0] - 2026-05-25 + +## **Design boards now live 24 hours, not 10 minutes. One daemon hosts every board, one tab survives the whole day.** + +Run `$D compare --serve` and you get a persistent design daemon at `.gstack/design.json` instead of a fresh process per call. Open three design sessions across an afternoon and they all land at `/boards//` on the same port. The browser tab you opened first still works for the board you published an hour later. The idle timeout went from 10 minutes (the old per-process server) to 24 hours of inactivity (the daemon's lifetime). Submit a board, the URL stays accessible until the daemon idles out, so you can scroll back through the day's design history at `http://127.0.0.1:N/`. + +Skill invocations (`/design-shotgun`, `/design-consultation`, `/plan-design-review`, `/design-review`, `/office-hours`) keep calling `$D compare --serve` exactly the same way. The CLI shape is unchanged. What's different is the binary now self-execs into daemon mode under the hood, attaches to a running daemon if one is there, spawns a fresh one if not, and prints `BOARD_PUBLISHED: http://127.0.0.1:N/boards//` to stderr so the skill can echo the URL. The legacy `--no-daemon` flag preserves the old single-process behavior for tests and debugging. + +### The numbers that matter + +Source: `bun test design/test/` and `git diff origin/main...HEAD --stat`. + +| Metric | Before | After | Δ | +|-----------------------------------------|---------------|---------------|----------------| +| Idle timeout per board | 10 minutes | 24 hours | 144× | +| Server processes for N boards | N | 1 | N× | +| Browser tabs to keep open | one per board | one total | N× | +| Design tests in repo | 16 | 67 | +51 | +| Test paths covered (failure modes) | not enumerated| 38 / 100% | full coverage | +| Plan-review findings absorbed pre-impl | 2 | 19 | 17× from Codex | + +| Component | New lines | Test lines | +|----------------------------|-----------|------------| +| design/src/daemon.ts | ~580 | 30 tests | +| design/src/daemon-client.ts| ~340 | 17 tests | +| design/src/daemon-state.ts | ~180 | (via client + daemon tests) | +| Browser round-trip via HTTP| (existed) | 4 tests | + +The compression: 51 new tests cover every endpoint, lifecycle path, LRU eviction, idle handling, identity-verified spawn, version mismatch with and without active boards, PID-reuse safety, path traversal rejection, and cross-board feedback isolation. The plan-review pass caught 2 architectural issues in-house; an outside Codex pass caught 17 more, all absorbed into the implementation before any code was written. The version-mismatch path now refuses to silently kill a daemon with active boards (it prints a warning and exits 1), so upgrading gstack mid-design-session doesn't drop your in-memory board history. + +### What this means for the builder + +Open `/design-shotgun` Monday morning, work through three rounds of variants, walk away for lunch, come back, click Submit. The board is still there. Open a second `/design-shotgun` for a different feature in the afternoon, get a new URL at `/boards//`, no port churn, your morning board still works. The whole day's worth of design exploration accumulates as a browsable history at the daemon's root. Stop worrying about the 10-minute death clock. + +### Itemized changes + +#### Added +- **Persistent design daemon** (`design/src/daemon.ts`). Bun HTTP server on `127.0.0.1` hosting many boards under `/boards//`. Per-board state machine (`serving | regenerating | done`), LRU cap of 50 boards (evicts `done` first, returns 503 when 50 non-done coexist), 24h idle timeout with 1h extensions up to a 28h ceiling when boards are still active, per-board async mutex serializing feedback POST vs reload POST. Index page at `/` lists recent boards newest first. +- **`$D daemon status`** and **`$D daemon stop [--force]`**. The stop sub-command refuses without `--force` when active boards exist, so a casual stop doesn't drop in-flight history. +- **Daemon client** (`design/src/daemon-client.ts`). `ensureDaemon()` handles spawn-or-attach with file-lock-protected spawn (re-reads state inside the lock to close the two-CLIs-race window) and identity-verified SIGTERM (reads `/proc/PID/cmdline` on Linux, `ps -p PID -o command=` on macOS, only signals if `gstack-design-daemon` is in the cmdline). PID-reuse safety: if the state file points at a PID belonging to an unrelated process, no signal is sent and a fresh daemon spawns. Version-mismatch refusal: if a CLI from a newer gstack version arrives while boards are still open in an older daemon, the CLI prints a user-actionable warning and exits 1 instead of silently restarting and losing history. +- **Shared daemon state utilities** (`design/src/daemon-state.ts`). Atomic state-file write (`` + `renameSync` at mode `0o600`), `fs.openSync('wx')` exclusive lock, cross-platform cmdline reader, version lookup that falls back through `DESIGN_DAEMON_VERSION` env → `design/dist/.version` baked at build time → source-tree `VERSION` → `"unknown"`. +- **End-to-end round-trip tests against a real spawned daemon** (`design/test/feedback-roundtrip-daemon.test.ts`). HTTP fetch drives publish → submit → regenerate → reload → round-2 submit, asserting `feedback.json` lands at the daemon-derived `sourceDir` with `boardId` and `publishedAt` augmented fields. + +#### Changed +- **Board JS uses relative URLs** instead of an injected `__GSTACK_SERVER_URL` global. The same generated HTML works at `/` (legacy `--no-daemon`) and `/boards//` (daemon). `location.protocol` feature-detect keeps the `file://` DOM-only fallback path working. +- **Bare `GET /boards/` returns 301** to `/boards//`. The trailing slash is load-bearing for relative-URL resolution in the board JS; without it, `fetch('./api/feedback')` would resolve to the wrong scope. +- **Reload guard rejects directory paths**. `design/src/serve.ts:200-212` previously let `resolvedReload === allowedDir` through, which then crashed `readFileSync` with `EISDIR`. Now requires `statSync(resolvedReload).isFile()` with a clear 400 instead. +- **Feedback files carry `boardId` and `publishedAt`** so agents polling `feedback.json` / `feedback-pending.json` in a multi-board world can verify which board produced what. +- **`sourceDir` is derived from `realpath(html)` server-side**, never trusted from the publish POST body. +- **Skill resolvers and templates** (`scripts/resolvers/design.ts`, `design-shotgun/SKILL.md`, `design-consultation/SKILL.md`, `plan-design-review/SKILL.md`, `office-hours/SKILL.md`) updated to parse `BOARD_URL:` from stderr and POST reloads to `${BOARD_URL}api/reload` instead of the legacy port-only `/api/reload`. Legacy `SERVE_STARTED: port=N html=...` line still emitted for back-compat. + +#### Fixed +- **Compiled design binary self-execs as the daemon** via a `--daemon-mode` flag, so the daemon lifecycle works for users installing from `design/dist/design` (not just `bun run` against the source tree). +- **Version lookup** is consistent between client and daemon. Both go through `readVersionString()`, so the version-mismatch refusal path works on the compiled binary instead of always reading `"unknown"` and matching itself. + +#### For contributors +- **Test infrastructure split**: `design/test/daemon.test.ts` (30 in-process tests against the exported `fetchHandler`, ~70ms) for fast iteration; `design/test/daemon-discovery.test.ts` (17 real-spawn tests, ~8s) for lifecycle + lock + identity guarantees. Shared helpers in `design/test/daemon-tests-fixtures.ts`. +- **Plan-review process**: this branch ran `/plan-eng-review` twice. Round 1 caught 2 architecture findings. An outside-voice Codex pass after round 1 found 17 more (URL contract self-contradiction, false test-green claim, lock semantics, identity verification, version-mismatch silent data loss, several others). Round 2 absorbed all 17 before implementation started. The full review trail is preserved in the plan file's `## GSTACK REVIEW REPORT` section. + ## [1.44.1.0] - 2026-05-24 ## **Nine community fixes ship in one bundle.** Office-hours session counter works again, iOS QA tunnels survive macOS 26.x, Windows brain-sync stops dropping artifacts, browse server tells you whether the bind failure was a port collision or a sandbox block. diff --git a/TODOS.md b/TODOS.md index fbd645e60..e8f1c9a70 100644 --- a/TODOS.md +++ b/TODOS.md @@ -1,5 +1,73 @@ # TODOS +## design daemon: follow-ups (filed v1.45.0.0 via /ship review army) + +### P3: Tighten daemon test coverage + +**What:** Three test gaps the testing specialist flagged on the v1.45.0.0 ship: + +1. `design/test/daemon.test.ts:441` (`bare GET /api/progress does NOT reset + meaningful activity`) is a smoke test pretending to be a behavioral test. + Its own body comment admits it can't read `lastMeaningfulActivity` in-process + and only asserts the endpoint stays functional. Move to + `daemon-discovery.test.ts` with `DESIGN_DAEMON_IDLE_MS=2000` + + `DESIGN_DAEMON_CHECK_MS=200`, poll `/api/progress` in a loop, wait + `IDLE_MS+CHECK_MS`, assert the daemon process actually exited. + +2. `design/test/daemon-discovery.test.ts:14` docstring claims a + "concurrent-CLIs race (two real subprocesses, one wins the lock)" test + exists. It doesn't. Add one: fire two `ensureDaemon()` calls in parallel + against the same stateFile via `Promise.all` (or two real subprocesses); + assert both resolve to the same port, exactly one `spawned=true`, exactly + one daemon process alive, no orphaned lock file. This is the primary + correctness gate for the new daemon's spawn-vs-attach race. + +3. `design/test/daemon.test.ts:462` `idleCheckTick` only has a "callable + without throwing" smoke. The 24h-extension-with-active-boards path + (`daemon.ts:240-252`) and the `MAX_EXTENSIONS` hard ceiling + (`daemon.ts:244-247`, force-shutdown after 4 extensions) are untested. + Both are load-bearing for the 24h timeout this PR ships. + +Plus: missing malformed-JSON tests for `POST /api/boards` and +`POST /boards//api/reload` (only feedback has one); missing stale-lock +reclaim test (`daemon-state.ts:208-213` PID-dead branch is only exercised +indirectly). + +**Pros:** Closes real coverage gaps that ship-time review surfaced. Each test +is small (one or two `it()` blocks). + +**Cons:** None — these are additive tests, no behavior change. + +### P3: Minor maintainability nits from /ship review + +- `design/src/cli.ts` and `design/src/serve.ts` both have a small `openBrowser` + helper with identical darwin/linux/else branches. Extract a shared + `design/src/open-browser.ts`. +- `design/src/daemon-client.ts:320` (`AbortSignal.timeout(2000)`) and `:357` + (`delay(50)`) use bare numeric literals while sibling timeouts are named + constants. Promote to `SHUTDOWN_POST_TIMEOUT_MS` and `ALIVE_POLL_INTERVAL_MS`. +- `design/src/daemon-state.ts:21` `serverPath` field is written + (`daemon.ts:541`) but never read by production code. Either remove or + document the forensic intent. + +### P3: Daemon scope deferred from v1.45.0.0 plan + +Originally listed in the plan's "TODOs surfaced for later" section: + +- Per-daemon scoped auth tokens (only relevant once a tunnel/share use case appears). +- Optional persistent board history on disk in + `~/.gstack/projects/$SLUG/designs/history/` so submitted boards survive + daemon restarts. +- Windows spawn branch lifted from browse (V1 daemon is macOS + Linux; + Windows users fall back to legacy `--no-daemon` per-process server). +- `$D board list` / `$D board stop ` per-board ops CLI (V1 has only + `$D daemon status` / `stop`). +- Cross-worktree daemon attach (conductor sibling worktrees of the same + repo currently each spawn their own daemon — matches browse; revisit + if it causes friction). + +--- + ## browse server: terminal-agent teardown follow-ups (filed v1.41 via /plan-eng-review) ### ✅ DONE (v1.44.0.0): Identity-based terminal-agent kill (replace pkill regex with PID) diff --git a/VERSION b/VERSION index fc2278642..94cf8fed1 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.44.1.0 +1.45.0.0 diff --git a/package.json b/package.json index 3bcaa0f77..a8e289a30 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gstack", - "version": "1.44.1.0", + "version": "1.45.0.0", "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module",