diff --git a/CHANGELOG.md b/CHANGELOG.md
index b32ceed4b..c41dba887 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,64 @@
 # Changelog
 
+## [1.45.0.0] - 2026-05-25
+
+## **Design boards now live 24 hours, not 10 minutes. One daemon hosts every board, one tab survives the whole day.**
+
+Run `$D compare --serve` and you get a persistent design daemon at `.gstack/design.json` instead of a fresh process per call. Open three design sessions across an afternoon and they all land at `/boards//` on the same port. The browser tab you opened first still works for the board you published an hour later. The idle timeout went from 10 minutes (the old per-process server) to 24 hours of inactivity (the daemon's lifetime). Submit a board, the URL stays accessible until the daemon idles out, so you can scroll back through the day's design history at `http://127.0.0.1:N/`.
+
+Skill invocations (`/design-shotgun`, `/design-consultation`, `/plan-design-review`, `/design-review`, `/office-hours`) keep calling `$D compare --serve` exactly the same way. The CLI shape is unchanged. What's different is the binary now self-execs into daemon mode under the hood, attaches to a running daemon if one is there, spawns a fresh one if not, and prints `BOARD_PUBLISHED: http://127.0.0.1:N/boards//` to stderr so the skill can echo the URL. The legacy `--no-daemon` flag preserves the old single-process behavior for tests and debugging.
+
+### The numbers that matter
+
+Source: `bun test design/test/` and `git diff origin/main...HEAD --stat`.
+
+| Metric                                  | Before        | After         | Δ              |
+|-----------------------------------------|---------------|---------------|----------------|
+| Idle timeout per board                  | 10 minutes    | 24 hours      | 144×           |
+| Server processes for N boards           | N             | 1             | N×             |
+| Browser tabs to keep open               | one per board | one total     | N×             |
+| Design tests in repo                    | 16            | 67            | +51            |
+| Test paths covered (failure modes)      | not enumerated| 38 / 100%     | full coverage  |
+| Plan-review findings absorbed pre-impl  | 2             | 19            | +17 from Codex |
+
+| Component                  | New lines | Tests      |
+|----------------------------|-----------|------------|
+| design/src/daemon.ts       | ~580      | 30 tests   |
+| design/src/daemon-client.ts| ~340      | 17 tests   |
+| design/src/daemon-state.ts | ~180      | (via client + daemon tests) |
+| Browser round-trip via HTTP| (existed) | 4 tests    |
+
+The compression: 51 new tests cover every endpoint, lifecycle path, LRU eviction, idle handling, identity-verified spawn, version mismatch with and without active boards, PID-reuse safety, path traversal rejection, and cross-board feedback isolation. The plan-review pass caught 2 architectural issues in-house; an outside Codex pass caught 17 more, all absorbed into the plan before any code was written. The version-mismatch path now refuses to silently kill a daemon with active boards (it prints a warning and exits 1), so upgrading gstack mid-design-session doesn't drop your in-memory board history.
+
+### What this means for the builder
+
+Open `/design-shotgun` Monday morning, work through three rounds of variants, walk away for lunch, come back, click Submit. The board is still there. Open a second `/design-shotgun` for a different feature in the afternoon, get a new URL at `/boards//`, no port churn, your morning board still works. The whole day's worth of design exploration accumulates as a browsable history at the daemon's root. Stop worrying about the 10-minute death clock.
+
+### Itemized changes
+
+#### Added
+- **Persistent design daemon** (`design/src/daemon.ts`). Bun HTTP server on `127.0.0.1` hosting many boards under `/boards//`. Per-board state machine (`serving | regenerating | done`), LRU cap of 50 boards (evicts `done` first, returns 503 when 50 non-done coexist), 24h idle timeout with 1h extensions up to a 28h ceiling when boards are still active, per-board async mutex serializing feedback POST vs reload POST. Index page at `/` lists recent boards newest first.
+- **`$D daemon status`** and **`$D daemon stop [--force]`**. The stop sub-command refuses without `--force` when active boards exist, so a casual stop doesn't drop in-flight history.
+- **Daemon client** (`design/src/daemon-client.ts`). `ensureDaemon()` handles spawn-or-attach with file-lock-protected spawn (re-reads state inside the lock to close the two-CLIs-race window) and identity-verified SIGTERM (reads `/proc/PID/cmdline` on Linux, `ps -p PID -o command=` on macOS, only signals if `gstack-design-daemon` is in the cmdline). PID-reuse safety: if the state file points at a PID belonging to an unrelated process, no signal is sent and a fresh daemon spawns. Version-mismatch refusal: if a CLI from a newer gstack version arrives while boards are still open in an older daemon, the CLI prints a user-actionable warning and exits 1 instead of silently restarting and losing history.
+- **Shared daemon state utilities** (`design/src/daemon-state.ts`). Atomic state-file write (`` + `renameSync` at mode `0o600`), `fs.openSync('wx')` exclusive lock, cross-platform cmdline reader, version lookup that falls back through `DESIGN_DAEMON_VERSION` env → `design/dist/.version` baked at build time → source-tree `VERSION` → `"unknown"`.
+- **End-to-end round-trip tests against a real spawned daemon** (`design/test/feedback-roundtrip-daemon.test.ts`). HTTP fetch drives publish → submit → regenerate → reload → round-2 submit, asserting `feedback.json` lands at the daemon-derived `sourceDir` with `boardId` and `publishedAt` augmented fields.
+
+#### Changed
+- **Board JS uses relative URLs** instead of an injected `__GSTACK_SERVER_URL` global. The same generated HTML works at `/` (legacy `--no-daemon`) and `/boards//` (daemon). `location.protocol` feature-detect keeps the `file://` DOM-only fallback path working.
+- **Bare `GET /boards/` returns 301** to `/boards//`. The trailing slash is load-bearing for relative-URL resolution in the board JS; without it, `fetch('./api/feedback')` would resolve to the wrong scope.
+- **Reload guard rejects directory paths**. `design/src/serve.ts:200-212` previously let `resolvedReload === allowedDir` through, which then crashed `readFileSync` with `EISDIR`. Now requires `statSync(resolvedReload).isFile()` with a clear 400 instead.
+- **Feedback files carry `boardId` and `publishedAt`** so agents polling `feedback.json` / `feedback-pending.json` in a multi-board world can verify which board produced what.
+- **`sourceDir` is derived from `realpath(html)` server-side**, never trusted from the publish POST body.
+- **Skill resolvers and templates** (`scripts/resolvers/design.ts`, `design-shotgun/SKILL.md`, `design-consultation/SKILL.md`, `plan-design-review/SKILL.md`, `office-hours/SKILL.md`) updated to parse `BOARD_URL:` from stderr and POST reloads to `${BOARD_URL}api/reload` instead of the legacy port-only `/api/reload`. Legacy `SERVE_STARTED: port=N html=...` line still emitted for back-compat.
+
+#### Fixed
+- **Compiled design binary self-execs as the daemon** via a `--daemon-mode` flag, so the daemon lifecycle works for users installing from `design/dist/design` (not just `bun run` against the source tree).
+- **Version lookup** is consistent between client and daemon. Both go through `readVersionString()`, so the version-mismatch refusal path works on the compiled binary instead of always reading `"unknown"` and matching itself.
+
+#### For contributors
+- **Test infrastructure split**: `design/test/daemon.test.ts` (30 in-process tests against the exported `fetchHandler`, ~70ms) for fast iteration; `design/test/daemon-discovery.test.ts` (17 real-spawn tests, ~8s) for lifecycle + lock + identity guarantees. Shared helpers in `design/test/daemon-tests-fixtures.ts`.
+- **Plan-review process**: this branch ran `/plan-eng-review` twice. Round 1 caught 2 architecture findings. An outside-voice Codex pass after round 1 found 17 more (URL contract self-contradiction, false test-green claim, lock semantics, identity verification, version-mismatch silent data loss, several others). Round 2 absorbed all 17 before implementation started. The full review trail is preserved in the plan file's `## GSTACK REVIEW REPORT` section.
+
 ## [1.44.1.0] - 2026-05-24
 
 ## **Nine community fixes ship in one bundle.** Office-hours session counter works again, iOS QA tunnels survive macOS 26.x, Windows brain-sync stops dropping artifacts, browse server tells you whether the bind failure was a port collision or a sandbox block.
diff --git a/TODOS.md b/TODOS.md
index fbd645e60..e8f1c9a70 100644
--- a/TODOS.md
+++ b/TODOS.md
@@ -1,5 +1,73 @@
 # TODOS
 
+## design daemon: follow-ups (filed v1.45.0.0 via /ship review army)
+
+### P3: Tighten daemon test coverage
+
+**What:** Three test gaps the testing specialist flagged on the v1.45.0.0 ship:
+
+1. `design/test/daemon.test.ts:441` (`bare GET /api/progress does NOT reset
+   meaningful activity`) is a smoke test pretending to be a behavioral test.
+   Its own body comment admits it can't read `lastMeaningfulActivity` in-process
+   and only asserts the endpoint stays functional. Move to
+   `daemon-discovery.test.ts` with `DESIGN_DAEMON_IDLE_MS=2000` +
+   `DESIGN_DAEMON_CHECK_MS=200`, poll `/api/progress` in a loop, wait
+   `IDLE_MS+CHECK_MS`, assert the daemon process actually exited.
+
+2. `design/test/daemon-discovery.test.ts:14` docstring claims a
+   "concurrent-CLIs race (two real subprocesses, one wins the lock)" test
+   exists. It doesn't. Add one: fire two `ensureDaemon()` calls in parallel
+   against the same stateFile via `Promise.all` (or two real subprocesses);
+   assert both resolve to the same port, exactly one `spawned=true`, exactly
+   one daemon process alive, no orphaned lock file. This is the primary
+   correctness gate for the new daemon's spawn-vs-attach race.
+
+3. `design/test/daemon.test.ts:462` `idleCheckTick` only has a "callable
+   without throwing" smoke. The 24h-extension-with-active-boards path
+   (`daemon.ts:240-252`) and the `MAX_EXTENSIONS` hard ceiling
+   (`daemon.ts:244-247`, force-shutdown after 4 extensions) are untested.
+   Both are load-bearing for the 24h timeout this PR ships.
+
+Plus: missing malformed-JSON tests for `POST /api/boards` and
+`POST /boards//api/reload` (only feedback has one); missing stale-lock
+reclaim test (`daemon-state.ts:208-213` PID-dead branch is only exercised
+indirectly).
+
+**Pros:** Closes real coverage gaps that ship-time review surfaced. Each test
+is small (one or two `it()` blocks).
+
+**Cons:** None — these are additive tests, no behavior change.
+
+### P3: Minor maintainability nits from /ship review
+
+- `design/src/cli.ts` and `design/src/serve.ts` both have a small `openBrowser`
+  helper with identical darwin/linux/else branches. Extract a shared
+  `design/src/open-browser.ts`.
+- `design/src/daemon-client.ts:320` (`AbortSignal.timeout(2000)`) and `:357`
+  (`delay(50)`) use bare numeric literals while sibling timeouts are named
+  constants. Promote to `SHUTDOWN_POST_TIMEOUT_MS` and `ALIVE_POLL_INTERVAL_MS`.
+- `design/src/daemon-state.ts:21` `serverPath` field is written
+  (`daemon.ts:541`) but never read by production code. Either remove or
+  document the forensic intent.
+
+### P3: Daemon scope deferred from v1.45.0.0 plan
+
+Originally listed in the plan's "TODOs surfaced for later" section:
+
+- Per-daemon scoped auth tokens (only relevant once a tunnel/share use case appears).
+- Optional persistent board history on disk in
+  `~/.gstack/projects/$SLUG/designs/history/` so submitted boards survive
+  daemon restarts.
+- Windows spawn branch lifted from browse (V1 daemon is macOS + Linux;
+  Windows users fall back to legacy `--no-daemon` per-process server).
+- `$D board list` / `$D board stop ` per-board ops CLI (V1 has only
+  `$D daemon status` / `stop`).
+- Cross-worktree daemon attach (conductor sibling worktrees of the same
+  repo currently each spawn their own daemon — matches browse; revisit
+  if it causes friction).
+
+---
+
 ## browse server: terminal-agent teardown follow-ups (filed v1.41 via /plan-eng-review)
 
 ### ✅ DONE (v1.44.0.0): Identity-based terminal-agent kill (replace pkill regex with PID)
diff --git a/VERSION b/VERSION
index fc2278642..94cf8fed1 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-1.44.1.0
+1.45.0.0
diff --git a/package.json b/package.json
index 3bcaa0f77..a8e289a30 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "gstack",
-  "version": "1.44.1.0",
+  "version": "1.45.0.0",
   "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
   "license": "MIT",
   "type": "module",