gstack/sync-gbrain/SKILL.md.tmpl

381 lines
16 KiB
Cheetah

---
name: sync-gbrain
preamble-tier: 2
version: 1.0.0
description: |
Keep gbrain current with this repo's code and refresh agent search
guidance in CLAUDE.md. Wraps the gstack-gbrain-sync orchestrator with
state probing, native code-surface registration, capability checks,
and a verdict block. Re-runnable, idempotent. Use when: "sync gbrain",
"refresh gbrain", "re-index this repo", "gbrain search isn't finding
things". (gstack)
triggers:
- sync gbrain
- refresh gbrain
- reindex repo
- update gbrain
allowed-tools:
- Bash
- Read
- Write
- Edit
- Glob
- Grep
- AskUserQuestion
---
{{PREAMBLE}}
# /sync-gbrain — Keep gbrain current and teach the agent to use it
You are running the canonical "keep this brain up to date" verb. /setup-gbrain
installs gbrain once; /sync-gbrain runs every time the user wants the brain
refreshed against this repo's current state, and refreshes the agent-side
guidance in CLAUDE.md so the coding agent knows when to prefer `gbrain`
search over Grep.
**Architecture (post-codex review):** This skill uses gbrain v0.20.0+'s
**native code surfaces** (`gbrain sources add`, `gbrain sync --strategy code`,
`gbrain reindex-code`, `gbrain code-def/code-refs/code-callers/code-callees`).
It does NOT use `gbrain import` (that path is for markdown directories).
It does NOT touch `~/.gstack/` indexing (the existing `gstack-gbrain-source-wireup`
owns that — never double-store).
## User-invocable
When the user types `/sync-gbrain`, run this skill. Argument modes (parsed by
the skill itself, not a dispatcher binary):
- `/sync-gbrain` — incremental sync (default; mtime fast-path; ~50ms steady-state)
- `/sync-gbrain --full` — full code reindex via `gbrain reindex-code` (~25-35 min on a big repo)
- `/sync-gbrain --code-only` — only run the code stage; skip memory + brain-sync
- `/sync-gbrain --dry-run` — preview what would sync; no writes anywhere
- `/sync-gbrain --no-memory` / `--no-brain-sync` — selectively skip stages
- `/sync-gbrain --quiet` — suppress per-stage output
- `/sync-gbrain --refresh-cache` — force-rebuild brain-aware planning cache (v1.48; replaces /brain-refresh-context per D1 fold). Skips code + memory stages; routes to `gstack-brain-cache refresh --project <slug>`.
- `/sync-gbrain --audit` — emit summary of gstack-owned pages per project + sensitive-content audit (v1.48 / D10 lifecycle). Read-only.
Pass-through args go straight to the orchestrator at
`{{BIN_DIR}}/gstack-gbrain-sync.ts`.
**`--refresh-cache` short-circuit:** when this flag is present, the skill
runs ONLY the cache refresh (`gstack-brain-cache refresh --project <slug>`
for the current worktree's slug, plus a cross-project refresh of
user-profile if `gstack/user-profile/<user-slug>` exists). Code +
memory + brain-sync stages are skipped. Useful when the user knows the
brain has new info gstack should pick up before the next planning skill.
**`--audit` short-circuit:** when this flag is present, the skill runs
`gstack-brain-cache list --project <slug> --json`, summarizes by page
type, then scans for any cached salience entries that ended up outside
the SALIENCE_DEFAULT_ALLOWLIST (T17 / D9 leak check). Read-only; no
modifications to brain or cache.
---
## Step 1: State probe
Before doing anything, check that /setup-gbrain has been run on this Mac.
```bash
~/.claude/skills/gstack/bin/gstack-gbrain-detect 2>/dev/null
```
**Brain trust policy gate (v1.48 / Phase 1.5 / D4 — added by T13+T5c):**
If `gbrain_mcp_mode == "remote-http"` from the detect output AND the per-
endpoint policy is `unset`, the policy question MUST fire here before
the orchestrator runs. Local engines auto-set to `personal` silently per
the per-transport default table.
```bash
_HASH=$(~/.claude/skills/gstack/bin/gstack-config endpoint-hash 2>/dev/null)
_POLICY=$(~/.claude/skills/gstack/bin/gstack-config get brain_trust_policy@$_HASH 2>/dev/null || echo unset)
echo "BRAIN_TRUST_POLICY[$_HASH]: $_POLICY"
```
If `_POLICY == "unset"` AND `_HASH != "local"`, AskUserQuestion per the
Step 9.5 wording in `/setup-gbrain` (personal vs shared, with persistence
to `brain_trust_policy@<hash>` and conditional `artifacts_sync_mode=full`
flip for personal). Then continue.
If `_POLICY == "unset"` AND `_HASH == "local"`, auto-set personal:
```bash
~/.claude/skills/gstack/bin/gstack-config set brain_trust_policy@$_HASH personal
```
**Split-engine model (v1.34.0.0+).** Code stage runs locally against the
per-machine gbrain engine (PGLite or whatever `gbrain config` points to),
with each worktree of a repo registered as its own source. **Memory stage
also runs locally** in local-stdio MCP mode — `gstack-memory-ingest` shells
out to `gbrain import` against the same local engine. In remote-http MCP
mode (Path 4), the memory stage instead persists staged markdown to
`~/.gstack/transcripts/<run-id>/` and the artifacts pipeline pushes it to
the brain admin's pull job (plan D11). Brain-sync (the `gstack-brain-sync`
push to git) is the one stage that never touches local engine and runs
regardless of mode.
Practically: local PGLite stays code-only on remote-http machines; the
remote brain holds everything else. Local-stdio machines mix code +
transcripts in one local engine, as they always have.
Also check the per-repo trust policy. If `gstack-gbrain-repo-policy get` for
this repo returns `deny`, STOP:
> "This repo's gbrain trust policy is `deny`. Run `/setup-gbrain --repo` to
> change it before syncing."
---
## Step 1.5: Local engine pre-flight (plan D12)
Read `gbrain_local_status` from the Step 1 detect output. Branch as follows
BEFORE invoking the orchestrator:
- **`ok`**: proceed to Step 2 normally.
- **`no-cli`**: STOP. "Local gbrain CLI not installed. Run `/setup-gbrain`
first."
- **`missing-config`** AND `gbrain_mcp_mode == "remote-http"`: tell the user
"Your brain queries (the `mcp__gbrain__*` tools) work via remote MCP, but
symbol code search needs a local PGLite. Run `/setup-gbrain` and pick
'Yes' at the new 'local code index' prompt (Step 4.5), or run
`gbrain init --pglite --json --embedding-model voyage:voyage-code-3 --embedding-dimensions 1024`
directly (drop the voyage flags if `VOYAGE_API_KEY` isn't set). Continuing
without code stage."
Then proceed to Step 2 — the orchestrator's `runCodeImport()` and
`runMemoryIngest()` will return SKIP per plan D12; only `runBrainSyncPush()`
will run. Do NOT abort.
- **`missing-config`** AND `gbrain_mcp_mode != "remote-http"`: STOP. "Local
gbrain CLI is installed but no engine config. Run `/setup-gbrain` first."
- **`broken-config`** OR **`broken-db`**: STOP with a clear message:
```
Local gbrain config at ~/.gbrain/config.json points at an unreachable
engine (status: {gbrain_local_status}). Two options:
1. Re-run /setup-gbrain — Step 1.5 offers Retry / Switch to PGLite /
Switch brain mode / Quit (plan D4).
2. Repair manually: mv ~/.gbrain/config.json ~/.gbrain/config.json.bak
&& gbrain init --pglite --json --embedding-model voyage:voyage-code-3 \
--embedding-dimensions 1024 (drop voyage flags if VOYAGE_API_KEY unset)
Re-run /sync-gbrain after.
```
Do NOT continue — the orchestrator would skip code+memory and only run
brain-sync, which is a degraded state the user should fix explicitly.
This pre-flight short-circuits the orchestrator before it spends ~80ms
probing the engine again. The orchestrator independently runs the same
classifier for defense-in-depth, but Step 1.5's STOP is where the user
gets the actionable remediation message.
---
## Step 2: Run the orchestrator
Pass user args to the orchestrator. Do not paraphrase them — pass through
as-is.
```bash
bun run ~/.claude/skills/gstack/bin/gstack-gbrain-sync.ts <user-args>
```
The orchestrator runs three stages: code → memory → brain-sync (per the
plan's storage tiering). Each stage failure is non-fatal; subsequent stages
still run. State is persisted to `~/.gstack/.gbrain-sync-state.json` via
tmp-file + atomic rename. Concurrent runs are blocked by a lock file at
`~/.gstack/.sync-gbrain.lock` (5-min stale-takeover).
---
## Step 3: Code-index health check
After the sync run, query gbrain for the cwd source's page_count:
```bash
SOURCE_ID=$(grep -o '"source_id":"[^"]*"' ~/.gstack/.gbrain-sync-state.json 2>/dev/null \
| head -1 | sed 's/.*"source_id":"//;s/".*//')
PAGES=$(gbrain sources list --json 2>/dev/null \
| jq -r --arg id "$SOURCE_ID" '.sources[] | select(.id==$id) | .page_count' 2>/dev/null \
|| echo 0)
echo "cwd source: $SOURCE_ID, page_count: $PAGES"
```
If `PAGES` is 0 or empty AND the user did NOT pass `--no-code` AND mode was
not `--full`, AskUserQuestion via the format in the preamble:
> D1 — This repo has 0 indexed pages in gbrain. Run a full code reindex now?
>
> ELI10: gbrain hasn't indexed this repo's code yet. The semantic search
> tools (`gbrain search`, `code-def`, `code-refs`) will return nothing
> until we run a full pass. Takes ~25-35 minutes on a big Mac.
>
> Recommendation: A — the brain is unusable for code search until indexed,
> and Step 2 of this skill already verified gbrain is configured correctly.
>
> Note: options differ in kind, not coverage — no completeness score.
>
> A) Run /sync-gbrain --full now (recommended)
> B) Skip — I'll run it later
If A: re-invoke the orchestrator with `--full --code-only`.
If B: continue to Step 4 with the empty-corpus state recorded.
---
## Step 4: Refresh `## GBrain Search Guidance` block in CLAUDE.md
Capability check (per /plan-eng-review §6):
```bash
SLUG="_capability_check_$$"
CAPABILITY_OK=0
if [ -f ~/.gbrain/config.json ] && \
gbrain --version 2>/dev/null | grep -q '^gbrain '; then
# GBRAIN_PREPARE=true ensures prepared statements stay enabled when
# connecting through a PgBouncer transaction-mode pooler (port 6543).
# Without it, search silently returns no results (#1435).
export GBRAIN_PREPARE=true
if echo "ping" | gbrain put "$SLUG" >/dev/null 2>&1; then
# Retry search up to 3 times with 1s delay — under transaction-mode
# pooling the search index may not be visible on the next connection
# immediately after the put.
for _attempt in 1 2 3; do
if gbrain search "ping" 2>/dev/null | grep -q "$SLUG"; then
CAPABILITY_OK=1
break
fi
sleep 1
done
fi
fi
gbrain delete "$SLUG" 2>/dev/null || true
```
Then update CLAUDE.md based on capability state:
**If `CAPABILITY_OK=1`** — write or update the block. Idempotent: find the
HTML-comment-delimited block; replace its body if it exists; append at the
end of CLAUDE.md if it doesn't. NEVER duplicate. Block is machine-AGNOSTIC
(no engine, no page counts, no last-sync time — those are in the existing
`## GBrain Configuration` block).
Verbatim block content (copy exactly):
```markdown
## GBrain Search Guidance (configured by /sync-gbrain)
<!-- gstack-gbrain-search-guidance:start -->
GBrain is set up and synced on this machine. The agent should prefer gbrain
over Grep when the question is semantic or when you don't know the exact
identifier yet.
**This worktree is pinned to a worktree-scoped code source** via the
`.gbrain-source` file in the repo root (kubectl-style context). Any
`gbrain code-def`, `code-refs`, `code-callers`, `code-callees`, or `query`
call from anywhere under this worktree routes to that source by default —
no `--source` flag needed. Conductor sibling worktrees of the same repo
each have their own pin and their own indexed pages, so semantic results
match the actual code on disk in this worktree.
Two indexed corpora available via the `gbrain` CLI:
- This worktree's code (auto-pinned via `.gbrain-source`).
- `~/.gstack/` curated memory (registered as `gstack-brain-<user>` source via
the existing federation pipeline).
Prefer gbrain when:
- "Where is X handled?" / semantic intent, no exact string yet:
`gbrain search "<terms>"` or `gbrain query "<question>"`
- "Where is symbol Y defined?" / symbol-based code questions:
`gbrain code-def <symbol>` or `gbrain code-refs <symbol>`
- "What calls Y?" / "What does Y depend on?":
`gbrain code-callers <symbol>` / `gbrain code-callees <symbol>`
- "What did we decide last time?" / past plans, retros, learnings:
`gbrain search "<terms>" --source gstack-brain-<user>`
Grep is still right for known exact strings, regex, multiline patterns, and
file globs. Run `/sync-gbrain` after meaningful code changes; for ongoing
auto-sync across all worktrees, run `gbrain autopilot --install` once per
machine — gbrain's daemon handles incremental refresh on a schedule.
Safety: don't run `/sync-gbrain` while `gbrain autopilot` is active — the
orchestrator refuses destructive source ops when it detects a running autopilot
to avoid racing it (#1734). Prefer registering user repos with `gbrain sources
add --path <dir>` (no `--url`): URL-managed sources can auto-reclone, and the
sync code walk for them requires an explicit `--allow-reclone` opt-in.
<!-- gstack-gbrain-search-guidance:end -->
```
Use the Read + Edit tools. The find-and-replace target is the entire region
from `<!-- gstack-gbrain-search-guidance:start -->` through
`<!-- gstack-gbrain-search-guidance:end -->`. If those markers are missing,
search for `## GBrain Search Guidance (configured by /sync-gbrain)` heading
and replace from there to the next `## ` or EOF. If no heading exists, append
the entire block at the end of CLAUDE.md.
**Atomic write:** write the new CLAUDE.md content to a tmp file alongside it
(e.g., `CLAUDE.md.sync-gbrain.tmp`) then `mv` to atomic-rename, so a crash
mid-write never leaves the file half-modified.
**If `CAPABILITY_OK=0`** — REMOVE the block entirely if present. Use the same
Edit tool to strip the start/end-marker region. The `## GBrain Configuration`
block stays in place (it's a record of the install, not a capability claim).
Do NOT crash if CLAUDE.md is missing or unwritable — log a warning and
continue.
---
## Step 5: Verdict block (idempotent doctor output)
Print a status block matching `/setup-gbrain` Step 10 conventions. Each row
is `[OK]/[FIX]/[WARN]/[ERR]`. Reuse `gbrain doctor --json --fast` for
informational rows but DO NOT gate the guidance block on doctor (per
/plan-eng-review §6 — doctor is too strict for unrelated reasons).
```
gbrain status: GREEN
CLI ............. OK <gbrain version>
Engine .......... OK <pglite|supabase>
Capability ...... OK write+search round-trip
CWD source ...... OK <gstack-code-{repo_slug}> (page_count=<N>)
~/.gstack source. OK <gstack-brain-{user}> (page_count=<N>) — managed by /setup-gbrain
Memory sync ..... OK <artifacts_sync_mode>
CLAUDE.md ....... OK ## GBrain Search Guidance present
Last sync ....... OK <last_sync from state file>
Run `/sync-gbrain` again any time gbrain feels off; safe and idempotent.
```
If any row is YELLOW or RED, the verdict line says so and the failing rows
surface a one-line "next action" (e.g., `Capability ...... ERR capability
check failed; CLAUDE.md guidance block REMOVED — run /setup-gbrain to repair`).
---
## Concurrency note
This skill is safe to run concurrently from multiple terminals on the same
Mac. The orchestrator acquires a lock at `~/.gstack/.sync-gbrain.lock` before
any state-file or CLAUDE.md mutation and exits with code 2 if another sync is
in flight. Stale locks (process died) auto-clear after 5 minutes.
## Cross-machine note
The `## GBrain Search Guidance` block is committed to the repo's CLAUDE.md
and travels with `git push`/`git pull` — NOT through `~/.gstack/.brain-allowlist`
(which is for `~/.gstack/` brain-sync only). On a different Mac with a synced
CLAUDE.md but no local gbrain, /sync-gbrain detects the mismatch via the
capability check and REMOVES the block (the local agent shouldn't be told to
use a tool that isn't installed).
## Status reporting
End with a Completion Status (per the preamble protocol):
- **DONE** — all stages green, CLAUDE.md guidance block present, verdict GREEN.
- **DONE_WITH_CONCERNS** — sync ran but at least one stage failed or capability
check failed. List which.
- **BLOCKED** — could not acquire lock, gbrain not on PATH, or per-repo policy
is deny. State the blocker.
- **NEEDS_CONTEXT** — /setup-gbrain has not been run, or `gbrain doctor` shows
a state that requires user decision (e.g., engine migration).