Commit Graph

3 Commits

Author SHA1 Message Date
Garry Tan 11de390be1
v1.58.5.0 feat: first-run activation scaffold + gstack router front door (#2078)
* feat: first-run activation — project-aware scaffold, router front door, onboarding nudges

Adds the activation system that drives a new install toward a concrete first move:
- bin/gstack-first-task-detect: local-git+filesystem repo classifier emitting one
  validated enum bucket (greenfield/code_<lang>/branch_ahead/dirty_default/clean_default),
  portable timeouts, fail-safe empty output.
- generate-first-run-guidance.ts: unified preamble section — first-run project-aware
  scaffold + returning-session plan->review->ship tip, gated on a persistent .activated
  marker and never run in headless. Detection wired lazily in generate-preamble-bash.ts.
- SKILL.md.tmpl: top-level gstack skill is now a pure router (browse body removed; it
  lives in /browse), routing any request and sending browser/QA work to /browse.
- setup: first-move nudge on first install. office-hours: closing handoff that launches
  the next review via the Skill tool.
- telemetry-ingest: accept onboarding/first_task_scaffold_shown/handoff/route event types.

* test: cover first-run detection + repoint browse-content assertions to /browse

- New unit tests for every detection bucket, the eval-safe enum contract, and the
  first-run gating (test/preamble-first-task-scaffold.test.ts); periodic E2E that runs
  the detector through the real harness (test/skill-e2e-first-task-scaffold.test.ts).
- Repoint browse-content assertions (gen-skill-docs, audit-compliance, skill-validation,
  LLM-judge eval) from the root skill to browse/SKILL.md following the router split;
  add a regression pinning that the router carries no browse body.
- Register first-task-scaffold touchfiles + periodic tier; bump parity/carve size caps
  ~1-2KB per skill for the shared first-run-guidance preamble section.
- Refresh ship golden fixtures for the preamble addition.

* chore: regenerate SKILL.md + llms.txt for first-run activation

* chore: bump version and changelog (v1.58.5.0)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(test): repoint bws skillmd-* setup-block assertions to browse/SKILL.md

The skillmd-setup-discovery / -no-local-binary / -outside-git E2E tests extracted
the `## SETUP`→`## IMPORTANT` browse binary-discovery block from the root SKILL.md.
P2 moved that block to browse/SKILL.md (end anchor is now `## Core QA Patterns`),
so the slice came back empty and the `browse/dist/browse` guard failed. Repoint to
browse/SKILL.md. Verified: 7/7 e2e-browse pass locally.

* fix(test): tolerate skill-discovery race in PTY plan-mode smoke

The e2e-pty-plan-smoke suite (office-hours / plan-mode-no-op) failed in CI with
`Unknown command: /office-hours` (claude exited ~10s) while passing locally. Root
cause: a cold CI container's overlay-FS scan of the symlinked ~/.claude/skills
registry finishes AFTER the runner's 8s boot grace, so the first `/skill` send
reaches claude before the skill is indexed and is rejected as unknown. The runner
gave up on the first "Unknown command:" line.

runPlanSkillObservation now re-sends the skill command up to 3x (6s apart),
re-marking the buffer each time so stale scrollback can't re-trip the check,
before concluding the skill is genuinely unregistered. A real dangling-symlink /
missing-skill still surfaces as 'exited' (after retries), preserving the original
diagnostic. Pure-helper contract unchanged: 95/95 unit tests pass.

This is a pre-existing harness bug (fails identically on #2077's own branch, which
introduced the suite) surfaced while shipping the activation feature.

* debug(ci): temporarily instrument pty-smoke skill discovery

Capture claude version, env, registry tree, and a claude -p discovery probe to
pin why /office-hours isn't discovered in CI (retries proved it's not a race).
Temporary — revert once the registry fix is identified.

* chore: revert pty-smoke harness experiments (race-retry + CI debug step)

Diagnosis is conclusive and the experiments aren't the fix, so restore the
harness to its original state (net-zero diff vs main for both files).

What the CI debug step proved: `claude -p` returns READY — claude v2.1.187 fully
DISCOVERS /office-hours from the symlinked registry. Only the interactive PTY TUI
rejects it as "Unknown command" (and it received the full command text). So the
e2e-pty-plan-smoke failure is a claude 2.1.187 interactive-TUI regression (skills
discovered by `claude -p` aren't exposed as TUI slash commands), pre-existing in
the #2077 harness and failing identically on its own origin branch — unrelated to
this activation PR. The race-retry can't help (the TUI genuinely lacks the
command); the debug step also tripped actionlint (shellcheck SC2012). Both reverted.

* fix(ci): copy SKILL.md as real files in pty-smoke registry (cross-mount symlink)

The e2e-pty-plan-smoke suite failed with "Unknown command: /office-hours" in CI
while passing locally. Root cause (proven, not guessed): claude 2.1.187's
interactive-TUI skill scanner does not follow the /github/home -> /__w cross-mount
symlink the registry used for per-skill SKILL.md. Evidence: a CI debug step showed
`claude -p` discovered the skill (printed READY), and a local macOS repro with the
identical symlinked registry recognized /office-hours — isolating the failure to
the container's cross-mount symlink, not registration content, claude version,
duplicate names, or a race.

Fix: register the per-skill SKILL.md + sections as REAL copies (same mount as
$HOME) so the TUI reads them directly. The gstack root stays a symlink — the
preamble's runtime bash resolves bin/* and sections/* through it and bash follows
cross-mount symlinks fine.

* fix(ci): guard rm expansion in pty-smoke registry (shellcheck SC2115)

* fix(ci): also register pty-smoke skills project-scoped (cwd/.claude/skills)

The real-file user-dir registration still left the TUI rejecting /office-hours in
the container. claude's interactive TUI surfaces /slash commands from the PROJECT
dir (<cwd>/.claude/skills); the smokes run with cwd=$REPO whose .claude/skills is
gitignored (absent on a fresh CI checkout), so the user-dir registry feeds
`claude -p` (READY) but not the TUI. Populate $REPO/.claude/skills with real
SKILL.md + sections copies (no gstack symlink there — it would point at its own
parent; runtime paths use the user-dir gstack symlink).

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 09:42:45 -07:00
Garry Tan 115d81d792
fix: security wave 1 — 14 fixes for audit #783 (v0.15.7.0) (#810)
* fix: DNS rebinding protection checks AAAA (IPv6) records too

Cherry-pick PR #744 by @Gonzih. Closes the IPv6-only DNS rebinding gap
by checking both A and AAAA records independently.

Co-Authored-By: Gonzih <gonzih@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: validateOutputPath symlink bypass — resolve real path before safe-dir check

Cherry-pick PR #745 by @Gonzih. Adds a second pass using fs.realpathSync()
to resolve symlinks after lexical path validation.

Co-Authored-By: Gonzih <gonzih@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: validate saved URLs before navigation in restoreState

Cherry-pick PR #751 by @Gonzih. Prevents navigation to cloud metadata
endpoints or file:// URIs embedded in user-writable state files.

Co-Authored-By: Gonzih <gonzih@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: telemetry-ingest uses anon key instead of service role key

Cherry-pick PR #750 by @Gonzih. The service role key bypasses RLS and
grants unrestricted database access — anon key + RLS is the right model
for a public telemetry endpoint.

Co-Authored-By: Gonzih <gonzih@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: killAgent() actually kills the sidebar claude subprocess

Cherry-pick PR #743 by @Gonzih. Implements cross-process kill signaling
via kill-file + polling pattern, tracks active processes per-tab.

Co-Authored-By: Gonzih <gonzih@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(design): bind server to localhost and validate reload paths

Cherry-pick PR #803 by @garagon. Adds hostname: '127.0.0.1' to Bun.serve()
and validates /api/reload paths are within cwd() or tmpdir(). Closes C1+C2
from security audit #783.

Co-Authored-By: garagon <garagon@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add auth gate to /inspector/events SSE endpoint (C3)

The /inspector/events endpoint had no authentication, unlike /activity/stream
which validates tokens. Now requires the same Bearer header or ?token= query
param check. Closes C3 from security audit #783.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sanitize design feedback with trust boundary markers (C4+H5)

Wrap user feedback in <user-feedback> XML markers with tag escaping to
prevent prompt injection via malicious feedback text. Cap accumulated
feedback to last 5 iterations to limit incremental poisoning.
Closes C4 and H5 from security audit #783.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden file/directory permissions to owner-only (C5+H9+M9+M10)

Add mode 0o700 to all mkdirSync calls for state/session directories.
Add mode 0o600 to all writeFileSync calls for session.json, chat.jsonl,
and log files. Add umask 077 to setup script. Prevents auth tokens, chat
history, and browser logs from being world-readable on multi-user systems.
Closes C5, H9, M9, M10 from security audit #783.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: TOCTOU race in setup symlink creation (C6)

Remove the existence check before mkdir -p (it's idempotent) and validate
the target isn't already a symlink before creating the link. Prevents a
local attacker from racing between the check and mkdir to redirect
SKILL.md writes. Closes C6 from security audit #783.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove CORS wildcard, restrict to localhost (H1)

Replace Access-Control-Allow-Origin: * with http://127.0.0.1 on sidebar
tab/chat endpoints. The Chrome extension uses manifest host_permissions
to bypass CORS entirely, so this only blocks malicious websites from
making cross-origin requests. Closes H1 from security audit #783.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: make cookie picker auth mandatory (H2)

Remove the conditional if(authToken) guard that skipped auth when
authToken was undefined. Now all cookie picker data/action routes
reject unauthenticated requests. Closes H2 from security audit #783.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: gate /health token on chrome-extension Origin header

Only return the auth token in /health response when the request Origin
starts with chrome-extension://. The Chrome extension always sends this
origin via manifest host_permissions. Regular HTTP requests (including
tunneled ones from ngrok/SSH) won't get the token. The extension also
has a fallback path through background.js that reads the token from the
state file directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: update server-auth test for chrome-extension Origin gating

The test previously checked for 'localhost-only' comment. Now checks for
'chrome-extension://' since the token is gated on Origin header.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.15.7.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Gonzih <gonzih@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: garagon <garagon@users.noreply.github.com>
2026-04-04 22:12:04 -07:00
Garry Tan 3b22fc39e6
feat: opt-in usage telemetry + community intelligence platform (v0.8.6) (#210)
* feat: add gstack-telemetry-log and gstack-analytics scripts

Local telemetry infrastructure for gstack usage tracking.
gstack-telemetry-log appends JSONL events with skill name, duration,
outcome, session ID, and platform info. Supports off/anonymous/community
privacy tiers. gstack-analytics renders a personal usage dashboard
from local data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add telemetry preamble injection + opt-in prompt + epilogue

Extends generatePreamble() with telemetry start block (config read,
timer, session ID, .pending marker), opt-in prompt (gated by
.telemetry-prompted), and epilogue instructions for Claude to log
events after skill completion. Adds 5 telemetry tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate all SKILL.md files with telemetry blocks

Automated regeneration from gen-skill-docs.ts changes. All skills
now include telemetry start block, opt-in prompt, and epilogue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Supabase schema, edge functions, and SQL views

Telemetry backend infrastructure: telemetry_events table with RLS
(insert-only), installations table for retention tracking,
update_checks for install pings. Edge functions for update-check
(version + ping), telemetry-ingest (batch insert), and
community-pulse (weekly active count). SQL views for crash
clustering and skill co-occurrence sequences.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add telemetry-sync, community-dashboard, and integration tests

gstack-telemetry-sync: fire-and-forget JSONL → Supabase sync with
privacy tier field stripping, batch limits, and cursor tracking.
gstack-community-dashboard: CLI tool querying Supabase for skill
popularity, crash clusters, and version distribution.
19 integration tests covering all telemetry scripts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: session-specific .pending markers + crash_clusters view fix

Addresses Codex review findings:
- .pending race condition: use .pending-$SESSION_ID instead of
  shared .pending file to prevent concurrent session interference
- crash_clusters view: add total_occurrences and anonymous_occurrences
  columns since anonymous tier has no installation_id
- Added test: own session pending marker is not finalized

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: dual-attempt update check with Supabase install ping

Fires a parallel background curl to Supabase during the slow-path
version fetch. Logs upgrade_prompted event only on fresh fetches
(not cached replays) to avoid overcounting. GitHub remains the
primary version source — Supabase ping is fire-and-forget.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: integrate telemetry usage stats into /retro output

Retro now reads ~/.gstack/analytics/skill-usage.jsonl and includes
gstack usage metrics (skill run counts, top skills, success rate)
in the weekly retrospective output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: move 'Skill usage telemetry' to Completed in TODOS.md

Implemented in this branch: local JSONL logging, opt-in prompt,
privacy tiers, Supabase backend, community dashboard, /retro
integration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: wire Supabase credentials and expose tables via Data API

Add supabase/config.sh with project URL and publishable key (safe to
commit — RLS restricts to INSERT only). Update telemetry-sync,
community-dashboard, and update-check to source the config and
include proper auth headers for the Supabase REST API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add SELECT RLS policies to migration for community dashboard reads

All telemetry data is anonymous (no PII), so public reads via the
publishable key are safe. Needed for the community dashboard to
query skill popularity, crash clusters, and version distribution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.8.6)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: analytics backward-compatible with old JSONL format

Handle old-format events (no event_type field) alongside new format.
Skip hook_fire events. Fix grep -c whitespace issues and unbound
variable errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: map JSONL field names to Postgres columns in telemetry-sync

Local JSONL uses short names (v, ts, sessions) but the Supabase
table expects full names (schema_version, event_timestamp,
concurrent_sessions). Add sed mapping during field stripping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address Codex adversarial findings — cursor, opt-out, queries

- Sync cursor now advances on HTTP 2xx (not grep for "inserted")
- Update-check respects telemetry opt-out before pinging Supabase
- Dashboard queries use correct view column names (total_occurrences)
- Sync strips old-format "repo" field to prevent privacy leak

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Privacy & Telemetry section to README

Transparent disclosure of what telemetry collects, what it never sends,
how to opt out, and a link to the schema so users can verify.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:21:05 -07:00