Merge remote-tracking branch 'upstream/main' into worktree-api-timeout-retry-v2

# Conflicts:
#	rust/crates/runtime/src/config.rs
#	rust/crates/runtime/src/lib.rs
This commit is contained in:
TheArchitectit 2026-06-04 09:17:43 -05:00
commit 9e50cb6e20
33 changed files with 8741 additions and 1193 deletions

View File

@ -13,7 +13,7 @@ cd "$repo_root"
if [[ -x scripts/roadmap-check-ids.sh ]]; then
echo "pre-push: scripts/roadmap-check-ids.sh" >&2
scripts/roadmap-check-ids.sh
scripts/roadmap-check-ids.sh >&2
fi
if [[ "${SKIP_CLAW_PRE_PUSH_BUILD:-}" == "1" ]]; then

1
.gitignore vendored
View File

@ -8,6 +8,7 @@ archive/
# Claw Code local artifacts
.claw/settings.local.json
.claw/sessions/
.claw/rules.local/
.clawhip/
status-help.txt
# Legacy Python port session scratch artifacts

View File

@ -1244,8 +1244,7 @@ Model name prefix now wins unconditionally over env-var presence. Regression tes
45. **`claw dump-manifests` fails with opaque "No such file or directory"** — dogfooded 2026-04-09. `claw dump-manifests` emits `error: failed to extract manifests: No such file or directory (os error 2)` with no indication of which file or directory is missing. **Partial fix at `47aa1a5`+1**: error message now includes `looked in: <path>` so the build-tree path is visible, what manifests are, or how to fix it. Fix shape: (a) surface the missing path in the error message; (b) add a pre-check that explains what manifests are and where they should be (e.g. `.claw/manifests/` or the plugins directory); (c) if the command is only valid after `claw init` or after installing plugins, say so explicitly. Source: Jobdori dogfood 2026-04-09.
45. **`claw dump-manifests` fails with opaque `No such file or directory`** — **done (verified 2026-04-12):** current `main` now accepts `claw dump-manifests --manifests-dir PATH`, pre-checks for the required upstream manifest files (`src/commands.ts`, `src/tools.ts`, `src/entrypoints/cli.tsx`), and replaces the opaque os error with guidance that points users to `CLAUDE_CODE_UPSTREAM` or `--manifests-dir`. Fresh proof: parser coverage for both flag forms, unit coverage for missing-manifest and explicit-path flows, and `output_format_contract` JSON coverage via the new flag all pass. **Original filing below.**
45. **`claw dump-manifests` fails with opaque `No such file or directory`** — **done (verified 2026-04-12):** current `main` now accepts `claw dump-manifests --manifests-dir PATH`, pre-checks for the required upstream manifest files (`src/commands.ts`, `src/tools.ts`, `src/entrypoints/cli.tsx`), and replaces the opaque os error with guidance that points users to `CLAUDE_CODE_UPSTREAM` or `--manifests-dir`. Fresh proof: parser coverage for both flag forms, unit coverage for missing-manifest and explicit-path flows, and `output_format_contract` JSON coverage via the new flag all pass. **Original filing below.**
45. **`claw dump-manifests` fails with opaque `No such file or directory`** — **done (verified 2026-06-03):** current `main` now emits a self-contained Rust resolver inventory for `claw dump-manifests` without requiring upstream TypeScript files, build-machine paths, or `CLAUDE_CODE_UPSTREAM`. Explicit `--manifests-dir PATH` scopes resolver discovery to another directory and missing/not-directory values emit typed `missing_manifests` guidance. Fresh proof: parser coverage for both flag forms, unit coverage for self-contained default, explicit-directory, and missing-directory flows, plus `output_format_contract` JSON coverage all pass. **Original filing below.**
46. **`/tokens`, `/cache`, `/stats` were dead spec — parse arms missing** — dogfooded 2026-04-09. All three had spec entries with `resume_supported: true` but no parse arms, producing the circular error "Unknown slash command: /tokens — Did you mean /tokens". Also `SlashCommand::Stats` existed but was unimplemented in both REPL and resume dispatch. **Done at `60ec2ae` 2026-04-09**: `"tokens" | "cache"` now alias to `SlashCommand::Stats`; `Stats` is wired in both REPL and resume path with full JSON output. Source: Jobdori dogfood.
47. **`/diff` fails with cryptic "unknown option 'cached'" outside a git repo; resume /diff used wrong CWD** — dogfooded 2026-04-09. `claw --resume <session> /diff` in a non-git directory produced `git diff --cached failed: error: unknown option 'cached'` because git falls back to `--no-index` mode outside a git tree. Also resume `/diff` used `session_path.parent()` (the `.claw/sessions/<id>/` dir) as CWD for the diff — never a git repo. **Done at `aef85f8` 2026-04-09**: `render_diff_report_for()` now checks `git rev-parse --is-inside-work-tree` first and returns a clear "no git repository" message; resume `/diff` uses `std::env::current_dir()`. Source: Jobdori dogfood.
@ -1547,7 +1546,7 @@ Original filing (2026-04-13): user requested a `-acp` parameter to support ACP p
**Source.** Jobdori dogfood 2026-04-17 against `/tmp/cd3` on main HEAD `e58c194` in response to Clawhip pinpoint nudge at `1494653681222811751`. Distinct from #80/#81/#82 (status/error surfaces lie about *static* runtime state): this is a surface that lies about *time itself*, and the lie is smeared into every live-agent system prompt, not just a single error string or status field.
84. **`claw dump-manifests` default search path is the build machine's absolute filesystem path baked in at compile time — broken and information-leaking for any user running a distributed binary** — dogfooded 2026-04-17 on main HEAD `70a0f0c` from `/tmp/cd4` (fresh workspace). Running `claw dump-manifests` with no arguments emits:
84. **DONE — `claw dump-manifests` no longer bakes or leaks the build machine's absolute filesystem path** — fixed 2026-06-03 in `fix: make dump-manifests self-contained`. The runtime default now uses the current workspace and emits the Rust resolver inventory directly, so distributed binaries no longer depend on upstream TypeScript source files or compile-time `CARGO_MANIFEST_DIR` paths. Explicit `--manifests-dir` remains as a discovery-root override and invalid roots return typed `missing_manifests` diagnostics. Original filing below: running `claw dump-manifests` with no arguments emitted:
```
error: Manifest source files are missing.
repo root: /Users/yeongyu/clawd/claw-code
@ -6303,7 +6302,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
422. **`export --output-format json` and `--resume latest` report the same "no managed sessions" scenario using two different `kind` codes — `no_managed_sessions` vs `session_load_failed` — making "no session found" undetectable by a single kind-code check** — dogfooded 2026-04-30 KST (UTC+9) by Jobdori on `e939777f`. Running `claw export --output-format json` with no session present returns (on stderr, exit 1): `{"error":"no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"no_managed_sessions","type":"error"}`. Running `claw --resume latest /status --output-format json` with no session present returns (on stderr, exit 1): `{"error":"failed to restore session: no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"session_load_failed","type":"error"}`. Both describe the same root condition — there are no sessions to operate on — but they expose it via different `kind` discriminants. Automation that checks `kind == "no_managed_sessions"` to detect a cold workspace will miss the `--resume` path's `session_load_failed`, and vice versa. A wrapper that guards "run with --resume only if a session exists" must special-case both codes. The hint text is identical between them, suggesting the messages are logically equivalent. Additionally neither code matches the proposed canonical names `session_not_found` / `session_load_failed` as stable `ErrorKind` discriminants described in ROADMAP #77's fix shape, which explicitly proposes typed error-kind codes for session lifecycle failures. **Required fix shape:** (a) unify "no sessions found for this workspace fingerprint" under a single canonical `kind` code — either `no_managed_sessions` or `session_not_found` — used consistently by every command path that encounters an empty session registry; (b) if `session_load_failed` is a more general category (covering e.g. corrupt session files, IO errors, schema version mismatches), it should nest a concrete `reason:"no_managed_sessions"` or `reason:"session_not_found"` sub-field so callers can distinguish "empty registry" from "found but unreadable"; (c) align with the canonical error-kind contract proposed in #77; (d) add regression coverage proving `export` and `--resume latest` in an empty workspace both return an error with the same top-level `kind` code. **Why this matters:** session guard-rails in orchestration need a single stable `kind` to detect cold workspaces without enumerating all possible no-session synonyms. Two divergent codes for the same condition make defensive automation brittle and contradict the promise of machine-readable error envelopes. Source: Jobdori live dogfood, `e939777f`, 2026-04-30 KST (UTC+9).
407. **`config --output-format json` returns `files[].loaded:false` with no `load_error`, `not_found`, or `skip_reason` field — automation cannot distinguish "file does not exist", "file exists but parse failed", and "file exists but was skipped by policy" from the same `loaded:false` value; also `loaded_files` and `merged_keys` are bare integers with no per-file attribution** — dogfooded 2026-04-30 by Jobdori on `e939777f`. Running `./claw --output-format json config` on a workspace with 5 discovered config files returns `{"kind":"config","cwd":"...","files":[{"loaded":false,"path":"/Users/yeongyu/.claw.json","source":"user"},{"loaded":true,"path":"/Users/yeongyu/.claw/settings.json","source":"user"},{"loaded":true,"path":"/Users/yeongyu/clawd/claw-code/.claw.json","source":"project"},{"loaded":false,"path":"/Users/yeongyu/clawd/claw-code/.claw/settings.json","source":"project"},{"loaded":false,"path":"/Users/yeongyu/clawd/claw-code/.claw/settings.local.json","source":"local"}],"loaded_files":2,"merged_keys":2}`. Three of five files have `loaded:false` with no accompanying `not_found:true`, `parse_error`, `io_error`, or `skip_reason`; automation must stat each path separately to guess why. Also `loaded_files:2` and `merged_keys:2` are bare counts — ambiguous whether `merged_keys:2` means 2 total top-level JSON keys across all files or 2 unique merged settings. **Required fix shape:** (a) add `not_found: bool` and optional `load_error: string` to each `files[]` entry so callers can distinguish missing, parse-broken, and policy-skipped files without filesystem probing; (b) document or rename `merged_keys` as `merged_setting_count` or `total_merged_keys` to remove the int-semantics ambiguity; (c) optionally add `merged_keys_by_file: [{path, keys}]` for attribution; (d) add regression coverage proving `files[]` entries with `loaded:false` carry at minimum `not_found` distinguishing non-existent paths from load failures. Source: Jobdori live dogfood, `e939777f`, 2026-04-30.
407. **DONE — `config --output-format json` returns structured file load states instead of bare `loaded:false` ambiguity** — fixed 2026-06-03 in `fix: report config file load statuses`. `claw config --output-format json` now emits `files[].status` (`loaded`, `not_found`, `skipped`, `load_error`), `reason`/`skip_reason`, and `detail` where applicable, plus top-level `load_error`, `merged_key_count`, and `merged_keys_meaning`. The config list surface is best-effort so one broken settings file no longer erases the rest of the discovery report, while section-specific config requests still preserve the typed nonzero parse-error envelope. Regression coverage: `inspect_classifies_missing_loaded_and_legacy_skipped_files`, `inspect_reports_parse_errors_but_keeps_valid_merged_config`, `config_json_reports_structured_unloaded_file_reasons_407`, `config_json_list_reports_parse_errors_without_dropping_file_statuses_407`, and existing `config_parse_error_has_typed_error_kind_and_hint_764`.
408. **`status --output-format json` `workspace.changed_files` is ambiguous — on a workspace with 5 untracked files, `changed_files:5`, `staged_files:0`, `unstaged_files:0`, `untracked_files:5`; it is unclear whether `changed_files` is the sum of all four git-status categories or only a subset; automation cannot tell if `changed_files:5` means "5 tracked modified" or "5 total non-clean files including untracked"** — dogfooded 2026-04-30 by Jobdori on `e939777f`. Running `./claw --output-format json status` returns `{"workspace":{"changed_files":5,"staged_files":0,"unstaged_files":0,"untracked_files":5,...}}``changed_files==untracked_files==5` with staged and unstaged both zero. The field name `changed_files` implies "modified tracked files" but the value equals the untracked count, not `staged+unstaged`. Without a comment or documented definition, automation must probe whether `changed_files = staged + unstaged` (excludes untracked) or `changed_files = staged + unstaged + untracked + conflicted` (total dirty). Also `git_state:"dirty · 5 files · 5 untracked"` repeats the same data as a prose string alongside the structured integer fields — redundant human-readable string alongside machine-readable integers. **Required fix shape:** (a) document and stabilize `changed_files` as either `tracked_dirty_count` (staged+unstaged only) or `total_non-clean_count` (staged+unstaged+untracked+conflicted) and rename to remove the ambiguity; (b) ensure a machine consumer can compute `is_clean` as a single boolean field without interpreting `git_state` prose; (c) deprecate or remove `git_state` prose string now that all its constituent counts are available as integers; (d) add regression coverage proving `changed_files` semantics against a workspace with staged, unstaged, untracked, and conflicted files. Source: Jobdori live dogfood, `e939777f`, 2026-04-30.
@ -6345,58 +6344,58 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
422. **Unknown top-level subcommands fall through to chat prompt path instead of returning `unknown_subcommand` error — typos silently send the subcommand string as a chat message to the configured LLM** — dogfooded 2026-05-11 by Jobdori on `b98b9a71` in response to Clawhip pinpoint nudge at `1503215095088676956`. Reproduction: `unset ANTHROPIC_AUTH_TOKEN; export ANTHROPIC_API_KEY=fake-key-for-routing-test; claw completely-bogus-subcommand --output-format json` returns `{"error":"api returned 401 Unauthorized (authentication_error) [trace req_011...]: invalid x-api-key","kind":"api_http_error"}` — proving the unknown token reached the Anthropic API endpoint as a chat prompt. With valid credentials, the bogus subcommand string would be silently consumed as a chat message, billing the user for a typo and producing whatever continuation the LLM generates. **Pre-error path:** `claw <unknown> --output-format json` with no creds returns `kind:"missing_credentials"` (the auth gate fires first), masking the routing bug. Only with creds present does the fallthrough manifest as the actual prompt being sent. **Sibling exit-code bug:** when the chat-path 401 returns, the JSON envelope is `kind:"api_http_error"` but exit code is **0**, while `cli_parse` errors (e.g. `--no-such-flag`) and `missing_credentials` errors correctly exit **1**. Exit-code parity between error envelopes is broken — automation that gates on `$?` will treat the 401-as-chat as success. **Required fix shape:** (a) reserve unknown top-level tokens that match no registered subcommand and emit `kind:"unknown_subcommand"` with `unknown:<token>` field and exit code 1, BEFORE the chat fallback path; (b) when a token is intended as a chat prompt, require an explicit verb (`prompt`, `chat`, `ask`) or `--prompt` flag; (c) ensure exit codes are non-zero for all `kind:*_error` envelopes; (d) regression test: `claw <bogus> --output-format json` with valid auth returns `kind:"unknown_subcommand"` exit 1, never reaches the API. **Why this matters:** automation that calls `claw <subcommand>` with a programmatically constructed verb (typo, version drift, refactored command) silently bills tokens and produces hallucinated output instead of a typed error. Cross-cluster with #108 (CLI fallthrough discovered earlier) — #422 is the post-#108 audit confirming the routing bug still bites with valid credentials. Source: Jobdori live dogfood, `b98b9a71`, 2026-05-11.
423. **`claw prompt` does not read prompt text from stdin when no positional prompt arg is provided — `echo "what is 2+2" | claw prompt --output-format json` returns `kind:"unknown" error:"prompt subcommand requires a prompt string"` instead of consuming stdin** — dogfooded 2026-05-11 by Jobdori on `3c563fa1` in response to Clawhip pinpoint nudge at `1503222644739276951`. Reproduction: `echo "what is 2+2" | claw prompt --output-format json``{"error":"prompt subcommand requires a prompt string","hint":null,"kind":"unknown","type":"error"}` exit 1. Same for `claw prompt --output-format json` with stdin redirected from a file. The most common Unix automation pattern (`cmd | claw prompt`) is broken because the prompt subcommand only reads the positional argument, never falls through to stdin. **Sibling envelope-kind bug:** the error `kind` is `"unknown"` instead of a typed `"missing_argument"` or `"validation_error"`. The `unknown` discriminator is the catch-all bucket — automation that switches on `kind` to differentiate input-validation errors from runtime errors gets no signal here. **Required fix shape:** (a) when `prompt` subcommand has no positional prompt arg AND stdin is not a TTY (i.e., piped or redirected), read stdin to EOF and use that as the prompt; (b) emit `kind:"missing_argument"` (not `"unknown"`) when both positional arg and stdin are absent; (c) add `--prompt-stdin` or `--stdin` opt-in flag for explicit control; (d) regression tests: `echo X | claw prompt --output-format json` reaches the runtime with prompt=X, AND `claw prompt < /dev/null` returns `kind:"missing_argument"` exit 1. **Why this matters:** Unix pipelines are the foundation of CLI automation. Every other major CLI (curl, jq, gh, kubectl) accepts stdin as the primary input when no positional arg is given. Breaking this convention forces automation to either inline the prompt as a shell-quoted string (escaping nightmare for multiline/code) or write to a temp file first. The `kind:"unknown"` error category compounds the problem by making the failure indistinguishable from a runtime crash. Source: Jobdori live dogfood, `3c563fa1`, 2026-05-11.
423. **DONE — `claw prompt` reads prompt text from stdin when no positional prompt arg is provided** — fixed 2026-06-03 in `fix: read prompt subcommand input from stdin`. `parse_args()` now treats non-empty piped stdin as the prompt body for `claw prompt` when the positional prompt is empty, and supports `--stdin` / `--prompt-stdin` to append piped context to an explicit positional prompt. The existing `missing_prompt` JSON/stdout contract is preserved for closed or whitespace-only stdin. User docs now show `printf '...' | ./target/debug/claw prompt --output-format json`, and regression coverage verifies both a pure stdin prompt and explicit stdin context reach the mock Anthropic provider request and return structured JSON output.
424. **`--model` rejects bare canonical Anthropic model names (`claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-6`) as `invalid_model_syntax` — only short aliases (`opus`, `sonnet`, `haiku`) and full prefixed form (`anthropic/claude-opus-4-7`) work; sibling: error message stale-suggests `claude-opus-4-6` not `4-7`** — dogfooded 2026-05-11 by Jobdori on `6c0c305a` in response to Clawhip pinpoint nudge at `1503230194889134103`. Reproduction: `claw --model claude-opus-4-7 status --output-format json``{"error":"invalid model syntax: 'claude-opus-4-7'. Expected provider/model (e.g., anthropic/claude-opus-4-6) or known alias (opus, sonnet, haiku)","kind":"invalid_model_syntax"}`. Same for `claude-opus-4-6`, `claude-sonnet-4-6`. Forcing `--model anthropic/claude-opus-4-7` works (`model:"anthropic/claude-opus-4-7"`, `model_source:"flag"`). Three problems compounded: (a) Anthropic-canonical model names without provider prefix are rejected even though the `claude-` prefix unambiguously identifies the provider; (b) the error suggests `anthropic/claude-opus-4-6` as the example — `4-7` shipped 2026-04-16 and is the current production Anthropic frontier model, the suggestion is one model behind; (c) the alias list `opus, sonnet, haiku` doesn't disambiguate version (which `opus` does the alias resolve to — `opus-4-6` or `opus-4-7`?). **Required fix shape:** (a) accept bare `claude-*` and `gpt-*` model names as canonical-named-without-prefix and route via name-prefix detection (already implemented for prefix-routed mode); (b) update the example in `invalid_model_syntax` error to current frontier (`anthropic/claude-opus-4-7`); (c) document or expose `opus` → exact-version mapping in the error message and in `claw doctor`/`status` output (`model_alias_resolved_to: "claude-opus-4-7"`); (d) regression test: `claw --model claude-opus-4-7 status --output-format json` returns `model_source:"flag"`, not `kind:"invalid_model_syntax"`. **Sibling bug observed in same probe:** `enabledPlugins` deprecation warning repeats 3 times in stderr for the same `~/.claw/settings.json` load — config file is being loaded/parsed 3 times during a single `status` invocation. **Why this matters:** every Anthropic doc, every CCAPI route, every internal tooling references models by their bare canonical name (`claude-opus-4-7`). Forcing the `anthropic/` prefix breaks copy-paste from Anthropic's own examples and adds a redundant token to every invocation. The stale `4-6` suggestion in the error message actively misdirects users away from the current model. Source: Jobdori live dogfood, `6c0c305a`, 2026-05-11.
424. **DONE — `--model` accepts bare canonical provider model names and Anthropic routing prefixes are stripped before provider calls** — fixed 2026-06-03 in `fix: normalize Anthropic model routing`. `validate_model_syntax()` now accepts unambiguous bare `claude-*` and `gpt-*` model IDs while preserving raw model provenance, and Anthropic `/v1/messages` plus `/v1/messages/count_tokens` request bodies strip the CLI-only `anthropic/` routing prefix so default/alias models do not reach Anthropic as `anthropic/claude-*`. Existing `qwen-*`/`grok-*` prefix-hint behavior remains intentionally unchanged for provider families whose bare names are ambiguous with DashScope/xAI routing. Regression coverage: `standard_messages_body_strips_anthropic_routing_prefix`, `send_message_strips_anthropic_routing_prefix_on_wire`, `default_model_alias_uses_anthropic_routing_prefix`, and the bare `--model=claude-opus-4-6 status` / `--model gpt-4 prompt` parser assertions in `parses_single_word_command_aliases_without_falling_back_to_prompt_mode`.
425. **Config file precedence (`.claw/settings.json` always wins over `.claw.json`) is undocumented in user-facing surfaces — `config --output-format json` reports both files as `loaded:true` with no `precedence_rank` or `wins_for_keys` attribution; sibling: deprecation warning fires 4× per status invocation (was 3× in #424, regression upward)** — dogfooded 2026-05-11 by Jobdori on `d7dbe951` in response to Clawhip pinpoint nudge at `1503237744451649537`. Reproduction: create `.claw.json` with `{"model":"anthropic/claude-sonnet-4-6"}` and `.claw/settings.json` with `{"model":"anthropic/claude-opus-4-7"}` in the same workspace. `claw status --output-format json` returns `model:"anthropic/claude-opus-4-7", model_source:"config"`. Reverse the files (.claw.json=opus, settings.json=sonnet) → `model:"anthropic/claude-sonnet-4-6"`. Confirmed: `.claw/settings.json` **always** wins over `.claw.json` for conflicting keys, regardless of file mtime or alphabetical order. `claw config --output-format json` reports both as `loaded:true` with no `precedence_rank`, `effective_for_keys`, or `shadowed_keys` attribution. The only signal of precedence is the final merged value in `status` — automation cannot programmatically discover which file contributed which key without re-implementing the merge logic. **Sibling bug (regression from #424):** the `enabledPlugins` deprecation warning now fires **4 times** in stderr per single `status` invocation (was 3× in #424's probe at HEAD `6c0c305a`; current HEAD `d7dbe951` shows 4×). Config load count went up by 1. **Sibling bug observed in config-section probe:** `claw config model --output-format json` with a `.claw.json` that contains a benign unknown key (e.g., `"alpha":"x"`) returns `{"error":"/path/.claw.json: unknown key \"alpha\" (line 1)","kind":"unknown"}` — the entire config command fails with a generic `unknown` kind instead of (a) tolerating unrecognized keys with a warning, or (b) emitting a typed `kind:"unknown_key"` error scoped to the offending file/key. **Required fix shape:** (a) document precedence order in `USAGE.md` (`.claw/settings.local.json > .claw/settings.json > .claw.json` for project scope; `user`/`system` scope at each layer); (b) add `precedence_rank:int` and optional `wins_for_keys:[string]` / `shadowed_keys:[string]` to each entry in `config --output-format json` `files[]`; (c) dedupe the deprecation warning to fire **once per discovered file** instead of N× per load pass; (d) make `config <section> --output-format json` tolerate unknown keys with warnings, OR emit `kind:"unknown_key"` with `path:` and `key:` fields scoped to the offending file. **Why this matters:** users mixing legacy `.claw.json` with new `.claw/settings.json` have no way to verify which file is actually controlling their runtime. The undocumented precedence + missing per-key attribution forces trial-and-error to debug config drift. Cross-references #407 (config files no load_error) and #415 (config section returns merged_keys count not values). Source: Jobdori live dogfood, `d7dbe951`, 2026-05-11.
425. **DONE — config JSON exposes file precedence attribution and unknown config keys are warnings** — fixed 2026-06-03 in `fix: attribute config precedence in JSON`. Runtime config inspection now reports every discovered file with `precedence_rank`, `wins_for_keys`, and `shadowed_keys`, so `.claw/settings.json` overriding legacy `.claw.json` is visible without reimplementing merge order. Unknown keys are tolerated as structured validation warnings, including `claw config <section> --output-format json`, while wrong-type errors still fail. The deprecation-warning path remains deduplicated once per process for text-mode `status`, and JSON config surfaces collect warnings structurally without stderr duplication. Docs in `USAGE.md` now spell out the precedence chain and JSON attribution fields. Regression coverage: `config_json_attributes_precedence_and_shadowed_keys_425`, `config_section_json_tolerates_unknown_keys_as_warnings_425`, `status_deduplicates_config_deprecation_warnings_per_invocation_425`, and the runtime validator unknown-key warning tests.
426. **`ANTHROPIC_MODEL` env var bypasses the `invalid_model_syntax` validator that `--model` enforces — bogus model strings are accepted with `status:"ok"`, deferred-failing only when the first API call is made** — dogfooded 2026-05-11 by Jobdori on `3730b459` in response to Clawhip pinpoint nudge at `1503245298800136296`. Reproduction (asymmetric validation): `claw --model bogus-model-xyz status --output-format json` returns `kind:"invalid_model_syntax"` exit 1; `ANTHROPIC_MODEL=bogus-model-xyz claw status --output-format json` returns `model:"bogus-model-xyz", model_raw:"bogus-model-xyz", model_source:"env", status:"ok"` — the doctor surface lies that the configured model is valid when it is not. The bogus model only manifests as a failure when the first prompt fires and the API rejects it with 404/400. Three sibling discoveries in the same probe: (a) **alias indirection invisible**: `ANTHROPIC_MODEL=opus claw status --output-format json` returns `model:"claude-opus-4-6", model_raw:"opus", model_source:"env"` — the `opus` alias resolves to `claude-opus-4-6` (the *previous* frontier, not the current `claude-opus-4-7` released 2026-04-16). Users typing `opus` get yesterday's model with no warning. (b) **`CLAW_MODEL` env var silently ignored**: `CLAW_MODEL=opus claw status` shows `model:"claude-opus-4-6" model_source:"default"` — the `CLAW_MODEL` env var (the project-namespaced equivalent that users expect) does not exist; only `ANTHROPIC_MODEL` is honored. No warning when a `CLAW_*` env var that looks like it should work is set. (c) **`ANTHROPIC_DEFAULT_MODEL` also silently ignored**: the longer-named env var that some Anthropic SDKs use is not recognized. **Required fix shape:** (a) symmetric validation: `ANTHROPIC_MODEL` env value must pass the same `invalid_model_syntax` check that `--model` does, and `claw status` must return `kind:"invalid_model"` / `status:"warn"` (not `status:"ok"`) when the resolved model is unrecognized; (b) expose alias resolution in `status`: add `model_alias_resolved_to:string|null` field so automation can see `opus → claude-opus-4-6`; (c) bump the `opus` alias to `claude-opus-4-7` (current frontier) or document the alias-to-version mapping policy explicitly; (d) accept `CLAW_MODEL` and `ANTHROPIC_DEFAULT_MODEL` env vars with parity to `ANTHROPIC_MODEL`, OR emit a warning when those env vars are set but unrecognized. **Why this matters:** the most common automation pattern is `export ANTHROPIC_MODEL=...` in a shell rc file. Bogus values pass silently, alias indirection hides the actual model in use, and `CLAW_MODEL` looking like a working name but doing nothing is a footgun. Cross-references #424 (bare canonical names rejected at validator level) — together #424 + #426 make model selection inconsistent across CLI flag, env var, and alias paths. Source: Jobdori live dogfood, `3730b459`, 2026-05-11.
426. **DONE — environment model selection is validated and status exposes alias/env provenance** — fixed 2026-06-03 in `fix: validate env model selection`. `CLAW_MODEL`, `ANTHROPIC_MODEL`, and `ANTHROPIC_DEFAULT_MODEL` now share the same env-model path before config/default fallback; prompt/REPL startup validates the resolved model before provider construction; and `status --output-format json` reports invalid env/config models as `status:"warn"` with `model_validation_error_kind:"invalid_model"` while preserving workspace/config/sandbox context. Status JSON now includes `model_alias_resolved_to` and `model_env_var`, making alias expansion and the winning env var auditable. The built-in/default `opus` alias now targets `anthropic/claude-opus-4-7` / `claude-opus-4-7`, with docs updated in `USAGE.md` and `rust/README.md`; the API alias table keeps token-limit metadata for both `claude-opus-4-7` and legacy `claude-opus-4-6`. Regression coverage: `status_json_accepts_namespaced_model_env_and_surfaces_alias_426`, `status_json_warns_on_invalid_model_env_426`, model alias/unit tests, and provider alias tests.
427. **Subcommand `--help` paths (`resume`, `session`, `compact`) hit the auth gate and trigger config validation before returning static help — `claw resume --help` with no credentials returns `missing_credentials` error instead of help text** — dogfooded 2026-05-11 by Jobdori on `1fecdf09` in response to Clawhip pinpoint nudge at `1503252843669491892`. Reproduction (no env vars, isolated `CLAW_CONFIG_HOME`): `claw resume --help` returns `{"error":"missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY..."}` instead of usage text. Same for `claw session --help`, `claw compact --help`. By contrast, `claw prompt --help` and `claw --help` (top-level) return proper usage text without auth. Even worse: with a broken `.claw.json` discovered up the parent directory tree (e.g., `mcpServers.missing-command: missing string field command`), the subcommand `--help` paths fail with `[error-kind: unknown]` from config validation — config load is happening before `--help` is parsed. **Sibling exit-code bug:** `claw resume --help --output-format json` returns `kind:"missing_credentials"` but exits **0** (the exit-code parity bug from #422 reproduces on this path too — only `cli_parse` exits 1 consistently). **Sibling: `claw resume <bogus-id>` should be local-only** but also hits `missing_credentials``resume` of a session that doesn't exist on disk should return `kind:"session_not_found"` from a local lookup, not require API credentials. Same class as ROADMAP #357 (session list requires creds) and #369 (session help/fork require credentials) — now confirmed for `resume`. **Required fix shape:** (a) `--help` MUST short-circuit before any auth check, config load, or session resolution — emit static usage text from a compiled-in string table, no I/O; (b) `resume <id>` must check the local session store first; if the id is absent on disk, emit `kind:"session_not_found"` with `sessions_dir` field; only require auth when resuming a known-on-disk session that requires re-establishing API context; (c) ensure exit code 1 for all error envelopes including `missing_credentials` returned from a `--help` path that should never have reached the auth gate; (d) regression test: with empty `CLAW_CONFIG_HOME` and no env vars, every `claw <subcommand> --help` returns usage text on stdout, exit 0, no `kind:*_error` envelope. **Why this matters:** `--help` is the universal CLI discovery primitive. Failing `--help` because of missing API credentials or broken config files makes claw undiscoverable to users debugging an already-broken setup. Cross-references #357 (session list), #369 (session help/fork), #422 (exit code parity), #108 (subcommand fallthrough). Source: Jobdori live dogfood, `1fecdf09`, 2026-05-11.
427. **DONE — resume/session/compact help short-circuits locally and missing resume sessions report the session store** — fixed 2026-06-03 in `fix: keep session help local`. `claw resume --help`, `claw --resume --help`, `claw session --help`, and `claw compact --help` now route through static `LocalHelpTopic` output before config loading, session resolution, credential checks, provider startup, or slash-command interactive-only fallthrough. The direct `claw resume <session>` alias now shares the existing `--resume` restore parser, so `claw resume <missing> --output-format json` returns a local `session_not_found` restore envelope with `sessions_dir` and exit code 1 instead of reaching provider credentials. Regression coverage: `resume_session_compact_help_short_circuits_before_config_or_auth_427`, `resume_missing_session_json_reports_local_store_before_auth_427`, local help parser tests, and resume parser tests.
428. **Default `permission_mode` is `danger-full-access` — claw runs with FULL filesystem + network + tool access out of the box, with no opt-in flag and no warning from `doctor`** — dogfooded 2026-05-11 by Jobdori on `72048449` in response to Clawhip pinpoint nudge at `1503260393622212628`. Reproduction (no env vars, isolated `CLAW_CONFIG_HOME`, no config files, no CLI flags): `claw status --output-format json` returns `permission_mode:"danger-full-access"` as the default. The three supported modes per the validator error message are `read-only`, `workspace-write`, `danger-full-access` — and `danger-full-access` is chosen with zero user opt-in. `claw doctor --output-format json` produces a `sandbox` check with `status:"warn", summary:"sandbox was requested but is not currently active"` (because macOS lacks Linux `unshare`), but **emits no warning, info, or summary about the permission_mode itself being danger-full-access**. There is no `permissions` check in `doctor` output at all. **Required fix shape:** (a) change default `permission_mode` to `workspace-write` (safe-by-default: filesystem write limited to cwd, network limited to LLM endpoints, no arbitrary command exec); (b) require explicit `--permission-mode danger-full-access` or `--dangerously-skip-permissions` to opt into full access; (c) add a `permissions` check to `doctor --output-format json` that emits `status:"warn"` when `permission_mode == "danger-full-access"` without explicit source (flag/env/config), with details like `mode:"danger-full-access", source:"default", message:"running with full access without explicit opt-in"`; (d) document the three modes and the default in USAGE.md with one-paragraph descriptions of what each mode allows. **Sibling typed-error bug:** `claw --permission-mode bogus-mode status --output-format json` returns `kind:"unknown"` instead of `kind:"invalid_permission_mode"` — same catch-all problem as #424, #426. **Sibling flag-name asymmetry:** `--dangerously-skip-permissions` works but `--skip-permissions` (Claude Code's flag) returns `kind:"cli_parse"` `unknown option`. Users migrating from Claude Code lose the short flag name. **Why this matters:** every other security-conscious CLI (Docker, kubectl, terraform) requires explicit opt-in for dangerous modes. Defaulting to `danger-full-access` is a footgun for first-time users who pipe `curl install.sh | sh` and immediately get a tool with full filesystem write and arbitrary command exec. The doctor surface is the only diagnostic users consult before trusting the tool, and it stays silent about the most permissive setting. Cross-references #50, #87, #91, #94, #97, #101, #106, #115, #123 (permission-audit sweep) — those all cover permission *rule* and *list* surfaces; #428 covers the *mode default* itself. Source: Jobdori live dogfood, `72048449`, 2026-05-11.
428. **DONE — default permission mode is workspace-write with auditable permission provenance** — fixed 2026-06-03 in `fix: default to workspace-write permissions`. Fresh invocations now resolve the fallback permission mode to `workspace-write` instead of `danger-full-access`; `danger-full-access` requires an explicit `--permission-mode danger-full-access`, `--dangerously-skip-permissions`, `--skip-permissions`, env, or config opt-in. `status --output-format json` includes `permission_mode_source` and `permission_mode_env_var`, and `doctor --output-format json` includes a `permissions` check with `mode`, `source`, `source_explicit`, `message`, and tool allow/gate lists. Invalid CLI permission modes now emit typed `invalid_permission_mode` JSON errors, and docs describe the three modes plus the safe default. Regression coverage: `default_permission_mode_is_workspace_write_and_audited_428`, `explicit_danger_permission_mode_is_audited_and_alias_supported_428`, `invalid_permission_mode_json_is_typed_428`, parser default tests, classifier coverage, and `given_workspace_write_enforcer_when_web_tools_then_denied`.
429. **No global `--cwd`/`-C`/`--directory` flag — `claw` cannot be invoked against an arbitrary working directory without first `cd`-ing into it; `--cwd` only exists as a subcommand option for `system-prompt`, and the `cli_parse` "Did you mean --acp?" suggestion is misleading (the `--acp` flag is unrelated to directory selection)** — dogfooded 2026-05-11 by Jobdori on `ec882f4c` in response to Clawhip pinpoint nudge at `1503267943285264394`. Reproduction: `claw --cwd /tmp/claw-dog-cwd status --output-format json``{"error":"unknown option: --cwd","hint":"Did you mean --acp?\nRun `claw --help` for usage.","kind":"cli_parse"}`. Same error for `--cwd <relative>`, `--cwd <nonexistent>`, `--cwd <file-not-dir>`, `--cwd ""`. Inspecting `claw --help`: `--cwd PATH` appears ONLY in the usage line `claw system-prompt [--cwd PATH] [--date YYYY-MM-DD]` — it is not a global flag and is not accepted by `status`, `doctor`, `mcp list`, `init`, or any other subcommand. Users programmatically running claw against multiple workspaces must `cd` into each one before invoking, breaking the `subprocess.run(['claw', 'status', '--cwd', ws], cwd=other_dir)` pattern that every other major CLI (cargo `-C`, git `-C`, npm `--prefix`, gh `--repo` semantically, kubectl `--kubeconfig`+`--context`) supports. **Sibling misleading-suggestion bug:** the `cli_parse` error's `hint` field suggests `Did you mean --acp?` for `--cwd`. `--acp` is the alias for ACP/Zed editor integration (entirely unrelated to working directory). The Levenshtein-distance auto-complete is matching on first-character similarity without considering semantic relatedness. Users following the hint get a totally orthogonal feature. **Required fix shape:** (a) add a global `--cwd PATH` / `-C PATH` flag accepted before any subcommand, parsed in the global flag pre-pass; (b) validate the path exists and is a directory; emit `kind:"invalid_cwd"` with `path:` and `reason:` (`"not_found"`/`"not_a_directory"`/`"empty"`) when validation fails; (c) document the precedence: `--cwd` flag > `$PWD` > `env::current_dir()`; (d) fix the "Did you mean" hint algorithm to filter suggestions by semantic category (don't suggest `--acp` for `--cwd`; suggest `claw system-prompt --cwd PATH` if the user clearly wants `cwd` override but used the wrong scope); (e) regression test: `claw --cwd /tmp status --output-format json` from any `$PWD` returns `workspace.cwd:"/private/tmp"` (or `cwd:"/tmp"` after #421 fix). **Why this matters:** every claw automation orchestrator runs claw against multiple workspaces from a single parent process. Forcing `cd` before each invocation breaks parallelism (can't use shared cwd across concurrent invocations), breaks subprocess wrappers that want to pass cwd explicitly, and breaks `xargs`/`parallel`-style pipelines. Cross-references #421 (cwd canonicalization leak — fix should canonicalize but report user-input via `--cwd`). Source: Jobdori live dogfood, `ec882f4c`, 2026-05-11.
429. **DONE — global workspace directory override is accepted and validated before dispatch** — fixed 2026-06-03 in `fix: add global cwd override`. `claw --cwd PATH ...`, `claw -C PATH ...`, and `claw --directory PATH ...` now run as if launched from the selected workspace before config, status, doctor, MCP, skills, and other command dispatch. The override takes precedence over process `$PWD`; invalid values emit typed `invalid_cwd` JSON errors with `path` and `reason` (`not_found`, `not_a_directory`, or `empty`) instead of the old misleading `Did you mean --acp?` CLI parse path. Help/usage docs list the global flags and the precedence/validation contract. Regression coverage: `global_cwd_flag_routes_status_workspace_and_short_alias_429`, `global_cwd_flag_reports_typed_invalid_paths_429`, and classifier coverage for `invalid_cwd`.
430. **`dump-manifests` is documented as "emit every skill/agent/tool manifest the resolver would load for the current cwd" but actually requires the upstream Claude Code TypeScript source files (`src/commands.ts`, `src/tools.ts`, `src/entrypoints/cli.tsx`) — the command is unusable for any user who installed claw without cloning the original Claude Code repo** — dogfooded 2026-05-11 by Jobdori on `075c2144` in response to Clawhip pinpoint nudge at `1503275502046023690`. Reproduction: `claw dump-manifests --output-format json` returns `{"error":"Manifest source files are missing.","hint":"repo root: /private/tmp/claw-dog-0530\n missing: src/commands.ts, src/tools.ts, src/entrypoints/cli.tsx\n Hint: set CLAUDE_CODE_UPSTREAM=/path/to/upstream or pass \`claw dump-manifests --manifests-dir /path/to/upstream\`.","kind":"missing_manifests"}`. The fresh-main worktree at `/private/tmp/claw-dog-0530` does not contain these TypeScript files because the Rust port doesn't include the upstream TS source. The `--help` text says the command works against "the current cwd" but in practice it requires `CLAUDE_CODE_UPSTREAM=` pointing at an unshipped TS source tree. **Three sibling problems compounded:** (a) **derivative-work disclosure leak**: the error message exposes that `claw-code` is a port of Claude Code (`CLAUDE_CODE_UPSTREAM` env var name) — even if true, surfacing this in a casual diagnostic message couples user-facing behavior to upstream provenance details. (b) **kind drift**: `claw dump-manifests --manifests-dir /tmp/nonexistent --output-format json` returns `kind:"unknown"`, while `claw dump-manifests` (no override) returns `kind:"missing_manifests"`. Same root cause (no usable upstream), two different `kind` discriminators — automation cannot switch on a single error type. (c) **export-positional-arg silently dropped**: probed in the same run — `claw export <bogus-positional>` ignores the path and returns `kind:"no_managed_sessions"` regardless of what positional arg was passed. The `--help` advertises `[PATH]` as the output-file destination but the path is discarded before validation, indistinguishable from invocation with no args. **Required fix shape:** (a) make `dump-manifests` emit the manifests claw-code itself ships with (Rust-resolver-discovered skills/agents/tools), independent of any upstream TS source — that matches the `--help` description; (b) if upstream-comparison is genuinely needed for parity work, move it to a separate command like `parity dump-upstream-manifests` and remove the upstream dependency from `dump-manifests`; (c) standardize on one error `kind` for the manifest-missing failure mode (`missing_manifests` is more descriptive than `unknown`); (d) `claw export <PATH>` must validate the path positional arg before the session-discovery check, so users see `kind:"invalid_output_path"` (or similar) when the path is malformed instead of always seeing `kind:"no_managed_sessions"`. **Why this matters:** `dump-manifests` is the inventory surface a downstream automation lane would call to learn what claw can do in the current workspace. If it's broken without upstream TS source, downstream lanes can't introspect — they have to fall back to `agents list`/`skills list`/`mcp list` separately and re-aggregate. Cross-references #422 (kind:unknown for unknown_subcommand), #423 (kind:unknown for missing_argument), #428 (kind:unknown for invalid_permission_mode) — `kind:"unknown"` keeps appearing as the catch-all for surfaces that should have typed kinds. Source: Jobdori live dogfood, `075c2144`, 2026-05-11.
430. **DONE — `dump-manifests` emits the self-contained Rust resolver inventory instead of requiring upstream Claude Code TypeScript source files** — fixed 2026-06-03 in `fix: make dump-manifests self-contained`. `claw dump-manifests --output-format json` now succeeds from an installed workspace with `source:"rust-resolver"`, command/tool/agent/skill/bootstrap manifests, no `CLAUDE_CODE_UPSTREAM` hint, and no `src/commands.ts` dependency. Explicit `--manifests-dir` scopes resolver discovery to another directory and missing/not-directory roots emit typed `missing_manifests` JSON. Sibling export diagnostics now validate explicit positional/`--output` paths before session discovery and return typed `invalid_output_path` JSON with `path` and `reason`. Regression coverage: `dump_manifests_defaults_to_rust_resolver_inventory`, `dump_manifests_scopes_explicit_manifest_dir_without_upstream_ts`, `dump_manifests_missing_explicit_dir_has_typed_kind`, `dump_manifests_and_init_emit_json_when_requested`, `local_json_surfaces_have_non_empty_action_contract_714`, and `export_invalid_output_path_reports_typed_json_430`.
431. **`skills uninstall <name>` requires Anthropic credentials despite being a local filesystem operation — `claw skills uninstall nonexistent-skill-xyz --output-format json` returns `kind:"missing_credentials"` instead of resolving locally that the skill doesn't exist** — dogfooded 2026-05-11 by Jobdori on `328fd114` in response to Clawhip pinpoint nudge at `1503275502046023690` (sibling probe to #430). Reproduction (no creds, isolated `CLAW_CONFIG_HOME`): `claw skills uninstall nonexistent-skill-xyz --output-format json` returns `{"error":"missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY...","kind":"missing_credentials"}`. Uninstalling a skill is a pure local filesystem operation: read the skills directory, find the named skill, remove its files. There is no semantic reason to require API credentials. Same class of bug as #357 (`session list` requires creds), #369 (`session help/fork` require creds), and #427 (`resume <bogus-id>` requires creds). **Three sibling findings in same probe:** (a) `claw skills install <bogus-name>` returns `{"error":"No such file or directory (os error 2)","kind":"unknown"}` — leaks raw OS error string with no hint about expected install source format (path vs name vs URL?), and the catch-all `kind:"unknown"` again instead of typed `kind:"skill_install_source_not_found"`. (b) `claw skills install` (no args) returns `action:"help"` with `unexpected:"install"` — but `install` IS a documented subcommand. The handler treats it as "unknown action" instead of "missing required argument". Should emit `kind:"missing_argument"` with `argument:"install_source"`. (c) `claw agents create my-agent` returns `action:"help"` with `unexpected:"create my-agent"` — there is no agent-creation surface at all. Users must hand-craft `.claw/agents/<name>.md` files with no scaffolding command, while `claw init` only creates the top-level `.claw/` skeleton. **Required fix shape:** (a) `skills uninstall <name>` must be local-first: enumerate the local skills dir, return `kind:"skill_not_found"` (with `skills_dir:` and `available_names:[]` fields) for missing, or remove the files and return `kind:"skills"` with `action:"uninstall", removed:<name>` for present skills; (b) `skills install <source>` must distinguish source forms (`path:`, `name:`, `url:`) and emit `kind:"invalid_install_source"` with the parsed-and-failed reason; (c) `skills install` (no args) emits `kind:"missing_argument"` with `argument:"install_source"`; (d) add `claw agents create <name>` (or `claw init agent <name>`) that scaffolds `.claw/agents/<name>.md` with a stub frontmatter; or document explicitly that agents are user-authored only. **Why this matters:** lifecycle commands (`uninstall`, `install`, `create`) are the primary surface for managing claw's extension surface area. If `uninstall` requires API creds, an offline user who fat-fingered an install can't undo it. If `install` returns a raw OS error, automation can't programmatically recover. If `agents create` doesn't exist, agent authoring is undocumented file-touching only. Cross-references #357, #369, #427 (auth-gate-on-local-ops cluster), and #422/#423/#428/#430 (`kind:"unknown"` catch-all cluster). Source: Jobdori live dogfood, `328fd114`, 2026-05-11.
431. **DONE — `skills uninstall <name>` resolves locally instead of requiring Anthropic credentials** — fixed 2026-06-03 in `fix: keep skills lifecycle local`. `claw skills uninstall nonexistent-skill-xyz --output-format json` now stays on the local skills lifecycle surface and emits `kind:"skills"`, `action:"uninstall"`, `error_kind:"skill_not_found"`, `skills_dir`, `available_names`, and a hint without provider credentials. `claw skills install` no-arg emits typed `missing_argument` with `argument:"install_source"`; `claw skills install <bogus-name>` emits typed `invalid_install_source` with `source`, `source_kind`, `reason`, and a recovery hint. Installed skill roundtrips remove the installed files through the shared local lifecycle helper. `claw agents create <name>` now scaffolds `.claw/agents/<name>.toml` and lists through the existing TOML agent discovery surface. Regression coverage: `skills_lifecycle_errors_have_typed_local_json_795_431`, `skills_install_uninstall_roundtrip_stays_local_431`, `agents_create_scaffolds_toml_and_lists_locally_431`, local command routing tests, parser discriminant tests, and command help/docs assertions.
432. **`--allowedTools` validator inconsistency: tool name list is half snake_case (`bash`, `read_file`, `write_file`, `edit_file`, `glob_search`, `grep_search`) and half PascalCase (`WebFetch`, `WebSearch`, `TodoWrite`, `Skill`, `Agent`, `Sleep`) with three UPPERCASE entries (`REPL`, `LSP`, `MCP`); accepts undocumented CamelCase aliases (`Read`, `Write`, `Edit`) and silently translates them to snake_case; argument parsing consumes the next positional when value is missing** — dogfooded 2026-05-11 by Jobdori on `fad53e2d` in response to Clawhip pinpoint nudge at `1503283046856655029`. Reproduction: `claw --allowedTools status --output-format json``{"error":"unsupported tool in --allowedTools: status (expected one of: bash, read_file, write_file, edit_file, glob_search, grep_search, WebFetch, WebSearch, TodoWrite, Skill, Agent, ToolSearch, NotebookEdit, Sleep, SendUserMessage, Config, EnterPlanMode, ExitPlanMode, StructuredOutput, REPL, PowerShell, AskUserQuestion, TaskCreate, RunTaskPacket, TaskGet, TaskList, TaskStop, TaskUpdate, TaskOutput, WorkerCreate, WorkerGet, WorkerObserve, WorkerResolveTrust, WorkerAwaitReady, WorkerSendPrompt, WorkerRestart, WorkerTerminate, WorkerObserveCompletion, TeamCreate, TeamDelete, CronCreate, CronDelete, CronList, LSP, ListMcpResources, ReadMcpResource, McpAuth, RemoteTrigger, MCP, TestingPermission)","kind":"unknown"}`. The `status` subcommand was consumed as the `--allowedTools` value because the flag parser doesn't distinguish missing-value from end-of-flag-args. The error reveals **the supported tool list mixes naming conventions inconsistently within a single error message**: snake_case (`bash`, `read_file`, `write_file`, `edit_file`, `glob_search`, `grep_search`), PascalCase (`WebFetch`, `WebSearch`, `TodoWrite`, `Skill`, `Agent`, `Sleep`, `Config`, `PowerShell`, `AskUserQuestion`, `TaskCreate`, `WorkerCreate`, `TeamCreate`, `CronCreate`), UPPERCASE (`REPL`, `LSP`, `MCP`), and CamelCase compounds (`McpAuth`, `RemoteTrigger`). **Hidden alias mapping**: `claw --allowedTools Read,Write,Edit status --output-format json` is accepted and returns `allowed_tools.entries:["edit_file","read_file","write_file"]` — proving the validator has an undocumented CamelCase→snake_case alias map (`Read`→`read_file`, `Write`→`write_file`, `Edit`→`edit_file`) that is not surfaced in the error message. Users who copy-paste tool names from Claude Code documentation work, users who copy from the validator error don't. **Sibling missing-value bug:** `claw --allowedTools status` with `status` as a positional subcommand is interpreted as `--allowedTools=status`, swallowing the subcommand. The flag parser must require a value for `--allowedTools` and emit `kind:"missing_argument"` when followed by a recognized subcommand or `--`-prefixed flag instead of silently treating the next arg as a tool name. **Sibling typed-kind bug:** both errors use `kind:"unknown"` instead of typed `kind:"invalid_tool_name"` / `kind:"missing_argument"` — the catch-all keeps appearing (#422/#423/#424/#428/#430/#431/#432). **Required fix shape:** (a) standardize the canonical tool-name registry on one casing convention (snake_case is most CLI-ergonomic) and update both the registry and all CamelCase aliases; (b) document and expose the alias map (`tool_aliases:{Read:"read_file",...}`) in `claw doctor`/`status` and in the validator error; (c) flag parser must require a value for `--allowedTools` and refuse to consume a recognized subcommand or `-`/`--`-prefixed token as the value, emit `kind:"missing_argument"` with `argument:"--allowedTools"`; (d) emit `kind:"invalid_tool_name"` with `tool_name:` and `available:[]` fields instead of `kind:"unknown"`; (e) regression test that `claw --allowedTools <subcommand>` rejects with `missing_argument`, and that the canonical name list in errors uses the same casing as the alias map. **Why this matters:** `--allowedTools` is the primary surface for restricting claw's tool surface area (security-relevant). Inconsistent naming between the validator error and the alias map means users following the error message guidance pick names that work in some places and fail in others. The missing-value bug silently swallows a subcommand, leading to confusing "unsupported tool: status" errors when the user actually wanted to run `claw status`. Cross-references #94/#97/#101/#106/#115/#123 (permission-rule audit), #428 (default permission_mode), #422/#423/#424/#428/#430/#431 (`kind:"unknown"` catch-all). Source: Jobdori live dogfood, `fad53e2d`, 2026-05-11.
432. **DONE — `--allowedTools` uses a canonical snake_case registry with typed diagnostics and documented aliases** — fixed 2026-06-04 in `fix: type allowed tools validation`. `GlobalToolRegistry::normalize_allowed_tools` now normalizes built-in, plugin, runtime, and MCP wrapper tool names to canonical snake_case allow-list entries while still accepting documented aliases such as `read`, `Read`, and legacy provider-facing names like `WebFetch`/`MCPTool`. Provider tool definitions and CLI/subagent executors compare against canonical names, so aliases do not break internal dispatch. `claw --allowedTools status --output-format json` now refuses to consume `status` as a value and emits typed `missing_argument` JSON with `argument:"--allowedTools"`; unsupported names emit typed `invalid_tool_name` JSON with `tool_name`, `available`, and `tool_aliases`. `status --output-format json` exposes `allowed_tools.available` and `allowed_tools.aliases`, and help/usage docs describe canonical names plus aliases. Regression coverage: `parses_allowed_tools_flags_with_aliases_and_lists`, `rejects_allowed_tools_followed_by_subcommand_or_flag_432`, `rejects_unknown_allowed_tools`, `allowed_tools_errors_have_typed_json_and_alias_map_432`, `allowed_tools_normalize_to_canonical_snake_case_and_aliases_432`, status JSON alias assertions, MCP wrapper normalization coverage, and classifier coverage for `invalid_tool_name`.
433. **Repeated `--output-format` flag silently takes the last value without warning — `claw --output-format json --output-format text status` produces text output, no signal that the prior `json` was overridden; sibling: `--output-format` value is case-sensitive (`JSON` rejected as `kind:"unknown"`); sibling: no `CLAW_OUTPUT_FORMAT` env var for default format override** — dogfooded 2026-05-11 by Jobdori on `ce39d5c5` in response to Clawhip pinpoint nudge at `1503290592556220488`. Reproduction: `claw --output-format json --output-format text status` returns the text-format `Status\n Model claude-opus-4-6...` table — the first `--output-format json` was silently overridden. No warning, no `format_overridden:true` field, no stderr message. Scripts that compose flag arrays from multiple sources (`flags=("${BASE_FLAGS[@]}" --output-format json)` while `BASE_FLAGS` already contains `--output-format text`) silently get the wrong format. **Three sibling findings in same probe:** (a) **case-sensitivity drift**: `claw --output-format JSON status` returns `{"error":"unsupported value for --output-format: JSON (expected text or json)","kind":"unknown"}` — error message tells user to use lowercase `json` but doesn't accept the uppercase form that users often type from muscle memory. Most CLI flag-value validators (cargo, kubectl, gh) are case-insensitive for enum values or accept both forms with normalization. (b) **`kind:"unknown"` for invalid format value**: same catch-all bucket bug as #422/#423/#424/#428/#430/#431/#432 — should be `kind:"invalid_output_format"` with `value:` and `expected:["text","json"]` fields. (c) **no env-var default for output format**: `CLAW_OUTPUT_FORMAT=json claw status` silently ignored — no env override for the global default, forcing scripts to repeat `--output-format json` on every invocation. Other major CLIs honor `KUBECTL_OUTPUT=`, `AWS_DEFAULT_OUTPUT=`, `GH_NO_PROMPT=` etc. (d) **silently-ignored env vars `CLAW_LOG`/`RUST_LOG`**: no env-based log level control surfaced in `claw doctor` — debug logging requires undocumented `RUST_LOG=` (Rust convention) but `claw --help` doesn't mention either. **Required fix shape:** (a) repeated `--output-format` (or any flag that takes a value, not a count flag) emits a warning to stderr (`warning: --output-format specified multiple times; using last value 'text'`) and adds a `format_source:"flag", format_overridden:[]` field to the JSON envelope; (b) accept case-insensitive enum values for `--output-format` (`JSON`, `Json`, `json` all work), document the canonical lowercase form in `--help`; (c) emit `kind:"invalid_output_format"` (not `kind:"unknown"`) when value is invalid; (d) accept `CLAW_OUTPUT_FORMAT` env var as the default for `--output-format`, with flag-overrides-env precedence documented; (e) document `RUST_LOG` / `CLAW_LOG` in `--help` or doctor output as the log-level env vars; (f) regression test: repeated flag emits stderr warning + JSON metadata field; case-insensitive enum accepts all three casings; env-var default is honored when flag is absent. **Why this matters:** scripts that compose flag arrays from multiple sources (CI envs + per-invocation flags) silently get the wrong output format. Case-sensitive enum values trip up users typing from muscle memory. Missing env-var defaults force per-invocation flag repetition. Cross-references #422/#423/#424/#428/#430/#431/#432 (`kind:"unknown"` catch-all cluster). Source: Jobdori live dogfood, `ce39d5c5`, 2026-05-11.
433. **DONE — `--output-format` selection is typed, case-insensitive, env-configurable, and auditable** — fixed 2026-06-04 in `fix: type output format selection`. `CliOutputFormat::parse` now accepts `text`/`json` in any casing, `CLAW_OUTPUT_FORMAT` seeds the default output format when no CLI output-format flag is present, explicit flags override the env default, and repeated flags emit `warning: --output-format specified multiple times; using last value '...'`. `status --output-format json` exposes `format_source`, `format_raw`, and `format_overridden`; invalid values return typed `invalid_output_format` JSON with `value`, `expected:["text","json"]`, and a recovery hint instead of `kind:"unknown"`. Top-level help documents `CLAW_OUTPUT_FORMAT`, `CLAW_LOG`, and `RUST_LOG`, and doctor system JSON surfaces those env values. Regression coverage: `output_format_flags_and_env_have_typed_contract_433` and `classify_error_kind_returns_correct_discriminants`. Verification: `cargo fmt --manifest-path rust/Cargo.toml --all -- --check`, focused output-format and classifier tests, `scripts/roadmap-check-ids.sh`, `git diff --check`, `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli --no-run`, and `cargo build --manifest-path rust/Cargo.toml --workspace --locked`.
434. **POSIX `--` end-of-flags separator is not recognized — `claw -- "-prompt-with-dash"` returns `{"error":"unknown option: --","hint":"Did you mean -V?","kind":"cli_parse"}` instead of treating subsequent args as positional; shorthand prompt mode cannot accept dash-prefixed prompts at all** — dogfooded 2026-05-11 by Jobdori on `0e5f6958` in response to Clawhip pinpoint nudge at `1503298142286905484`. Reproduction: `claw -- "-prompt-with-dash" --output-format json` returns `{"error":"unknown option: --","hint":"Did you mean -V?\nRun \`claw --help\` for usage.","kind":"cli_parse"}`. The POSIX/GNU CLI convention — universally honored by cargo, git, npm, gh, kubectl, grep, ls, find, etc. — is that `--` terminates flag parsing and treats everything after it as positional arguments. claw rejects `--` itself as an unknown flag. **Sibling misleading-suggestion bug (recurring from #429):** the `cli_parse` hint suggests `Did you mean -V?` for `--`. `-V` is the version flag; `--` is the end-of-flags separator. They have no semantic relationship; the auto-complete is matching on prefix-character similarity only. **Sibling shorthand-prompt limitation:** `claw "-just a prompt" --output-format json` returns `{"error":"unknown option: -just a prompt","kind":"cli_parse"}` and `claw "--bogus-flag-like" --output-format json` returns the same. The shorthand non-interactive prompt mode (documented as `claw [--model MODEL] [--output-format text|json] TEXT`) cannot accept any TEXT that starts with `-` or `--`, even when the entire string is shell-quoted as a single token. Users must use the explicit `prompt` verb (`claw prompt "-prompt-with-dash"` works) to escape this, but the explicit verb is documented as alternative not required. **Required fix shape:** (a) accept POSIX `--` as the end-of-flags marker globally — every arg after `--` is positional; (b) shorthand prompt mode must distinguish "this looks like a flag" from "this is a quoted positional that happens to start with `-`" by looking at whether the token matches any registered flag name (`-h`, `-V`, `--help`, `--version`, etc.) — strings that don't match any flag should be treated as prompt text; (c) fix the "Did you mean" hint algorithm to filter by semantic category (don't suggest `-V` for `--`, suggest "use \`--\` to terminate flag parsing" if the user types just `--`); (d) regression test: `claw -- "-foo"` reaches the runtime with prompt=`-foo`; `claw "-not-a-flag"` is treated as shorthand prompt when no registered flag matches; canonical `--` is recognized. **Why this matters:** POSIX `--` is the universal mechanism for passing arbitrary text (filenames starting with `-`, prompts containing flag-like syntax, log lines, etc.) to a CLI. Failing on `--` makes claw fundamentally unergonomic in shell pipelines (`echo "-q for quiet" | xargs claw` fails). The shorthand-prompt limitation forces users to remember the `prompt` verb specifically when their prompt happens to start with `-`. Cross-references #422 (unknown subcommand fallthrough), #423 (stdin not consumed by prompt), #429 ("Did you mean --acp" misleading suggestion). Source: Jobdori live dogfood, `0e5f6958`, 2026-05-11.
434. **DONE — POSIX `--` and dash-prefixed shorthand prompts stay on the prompt path** — fixed 2026-06-04 in CI recovery after `41678eb` turned Rust CI red. Global argument parsing now treats `--` as an end-of-flags separator, stops JSON-output pre-scans before the separator, and forwards every following token as positional prompt text. Shorthand prompt mode accepts dash-prefixed text that is not a registered or near-miss CLI flag (`-not-a-flag`, `--bogus-flag-like literal`) while still rejecting real typo-like options such as `--resum` with the `--resume` suggestion. Direct `/status` invocation routes to the local status action per the resume-safe slash command contract, matching the CI-red parser regression tests restored after `7cfd83f`. Help, USAGE, and rust README document the `claw -- "-prompt-with-dash"` form. Regression coverage: `parses_dash_prefixed_prompt_text_434`, plus rerun CI-red parser tests `parses_bare_prompt_and_json_output_flag` and `parses_direct_agents_mcp_and_skills_slash_commands`. Verification: `cargo fmt --manifest-path rust/Cargo.toml --all -- --check`, focused parser tests, `scripts/roadmap-check-ids.sh`, `git diff --check -- USAGE.md rust/README.md rust/crates/rusty-claude-cli/src/main.rs ROADMAP.md`, docs source-of-truth/release-readiness/unit helper checks, `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli --no-run`, `cargo build --manifest-path rust/Cargo.toml --workspace --locked`, and `cargo clippy --manifest-path rust/Cargo.toml --workspace`. Local `cargo test --manifest-path rust/Cargo.toml --workspace` still hits the pre-existing Darwin-only `runtime::worker_boot::startup_preflight_warns_when_git_metadata_is_not_writable` permission assertion after all CLI parser tests pass; the red GitHub jobs were parser failures on head `41678eb`.
435. **`claw --resume latest` on a fresh workspace exit code is 0 in text mode but 1 in JSON mode (text mode lies about success); sibling: failed `--resume` creates the `.claw/sessions/<fingerprint>/` directory tree as a filesystem side effect of the failure** — dogfooded 2026-05-11 by Jobdori on `e29010ed` in response to Clawhip pinpoint nudge at `1503305692566655096`. Reproduction (fresh empty dir, no `.claw/`, no sessions): `claw --resume latest` (text mode) prints `failed to restore session: no managed sessions found in .claw/sessions/0ead448127a2de44/` and exits **0**. Same invocation with `--output-format json` correctly exits **1** with `kind:"session_load_failed"`. Exit-code parity broken on the same input depending on format flag. **Sibling filesystem-side-effect bug:** after the failed `--resume latest` on a fresh empty workspace, the directory `.claw/sessions/0ead448127a2de44/` (the workspace-fingerprint partition) is created on disk despite the operation failing. The user did not opt into creating workspace metadata — they asked to resume an existing session, the resume failed, and now there's a partition directory hanging around. The fingerprint directory ought to be created lazily on first successful session save, not as a side effect of every resume attempt. **Three sibling findings in the same probe:** (a) **`claw --compact` alone (no other args) drops into the interactive REPL with the ANSI welcome banner** — `--compact` is documented as a modifier that strips tool call details in text mode for piping (`--compact ... useful for piping`), not as a verb that activates the REPL. Running `claw --compact` with no positional should be a no-op or an error explaining the flag needs a subcommand or prompt; entering the REPL is the wrong default. (b) **`claw --compact "hello"` (shorthand prompt) returns `{"error":"unknown subcommand: hello.","hint":"Did you mean help","kind":"unknown"}``--compact` disables shorthand prompt mode entirely**, treating the positional as a subcommand instead of as prompt text. Users must use the explicit `prompt` verb (`claw --compact prompt "hello"`) which contradicts the `claw [flags] TEXT` usage line in `--help`. (c) `kind:"unknown"` again for the unknown-subcommand error in --compact path — same catch-all bucket bug appearing for the 11th time across pinpoints. **Required fix shape:** (a) exit code 1 for all `failed_to_restore` / `session_load_failed` text-mode failures; text mode should print to stderr and exit non-zero, not print to stdout and exit 0; (b) defer `.claw/sessions/<fingerprint>/` creation to first successful save; failed `--resume` must not leave filesystem droppings; (c) `claw --compact` alone (no positional, no subcommand, stdin is TTY) should emit `kind:"missing_argument"` with `argument:"prompt or subcommand"` rather than activating the REPL; (d) `--compact` must be transparent to shorthand prompt mode parsing — `claw --compact "hello"` is equivalent to `claw --compact prompt "hello"`, both should reach the prompt path; (e) emit typed `kind:"unknown_subcommand"` not `kind:"unknown"` for fallthrough cases. **Why this matters:** scripts that gate on `$?` after `claw --resume latest` see success on text mode and failure on JSON mode — the same operation, two outcomes. The filesystem side effect pollutes a user's worktree with workspace partitions they didn't ask for, and CI pipelines that snapshot `.claw/` size silently grow on every failed `--resume`. Cross-references #422 (exit-code parity across error envelopes), #423 (`kind:"unknown"` for `missing_argument`), #434 (shorthand prompt limitations). Source: Jobdori live dogfood, `e29010ed`, 2026-05-11.
435. **DONE — failed resume is non-zero and side-effect free; `--compact` stays a prompt modifier** — fixed 2026-06-04 in `fix: keep failed resume side-effect free`. Fresh-workspace `claw --resume latest` exits 1 in text and JSON modes; text writes the restore failure to stderr, JSON writes a typed `no_managed_sessions` restore envelope to stdout, and failed lookup no longer creates `.claw/sessions/<fingerprint>/`. `SessionStore::from_cwd`/`from_data_dir` now only derive the fingerprinted path; session save remains responsible for creating it. Global `--compact` no longer starts the REPL when it has no prompt or stdin: it returns typed `missing_argument` with `argument:"prompt or subcommand"`. `claw --compact "hello"` remains shorthand prompt mode and reaches provider/auth validation rather than command-not-found. Regression coverage: `session_store_from_cwd_is_side_effect_free_until_save`, `resume_latest_missing_session_fails_without_creating_session_dirs_435`, `compact_flag_missing_argument_and_shorthand_prompt_contract_435`, and `parses_compact_flag_for_prompt_mode`; broader checks reran `runtime session_control`, `resume_slash_commands`, `output_format_contract`, `claw` bin tests, `cargo fmt --all -- --check`, `scripts/roadmap-check-ids.sh`, `git diff --check`, and `cargo build --workspace --locked`.
436. **`claw init` shipped `.claw.json` template explicitly sets `permissions.defaultMode:"dontAsk"` — every user who runs `claw init` gets a config file that disables permission prompts by default; sibling: `init` creates an empty `.claw/` directory with no settings.json template inside, and when `.claw/` already exists it skips the whole artifact (no settings template materialized)** — dogfooded 2026-05-11 by Jobdori on `b8f989b6` in response to Clawhip pinpoint nudge at `1503313241751949335`. Reproduction: `mkdir /tmp/probe && cd /tmp/probe && claw init --output-format json` returns `artifacts:[{name:".claw/",status:"created"},{name:".claw.json",status:"created"},...]`. Inspecting the created `.claw.json`: `{"permissions":{"defaultMode":"dontAsk"}}`. This is the polar opposite of safe-by-default: every user who follows the documented onboarding flow (`claw init` after `curl install.sh`) ships their workspace with permission prompts disabled. Compounds with **#428** (default runtime permission_mode is `danger-full-access`) — between the runtime default and the init template, a fresh claw setup has zero user-facing safety friction. **Sibling: `.claw/` artifact is an empty directory.** After `claw init`, `find .claw -type f` returns nothing. No `settings.json`, no template, no scaffolding — just `mkdir .claw`. The `--help` description implies init produces a usable workspace, but `.claw/settings.json` (the project-scope counterpart of `~/.claw/settings.json`) is never templated. **Sibling: `.claw/` skip-on-exists drops the entire artifact.** If `.claw/` already exists (e.g., from a partial setup, a `--resume` failure side effect per #435, or manual creation), `claw init` returns `.claw/: skipped` and does not materialize any expected sub-content. The other artifacts (`.claw.json`, `.gitignore`, `CLAUDE.md`) are still created, but a future `claw skills install` or `claw plugins enable` may expect `.claw/` to contain template files that are now missing. **Required fix shape:** (a) the shipped `.claw.json` template must default to `permissions.defaultMode:"acceptEdits"` or `"plan"` (safe-by-default modes per #428 spec) — `"dontAsk"` requires explicit opt-in; (b) `claw init` must materialize `.claw/settings.json` with documented schema defaults inside `.claw/` so the directory is useful on its own; (c) when `.claw/` already exists, `init` must report `partial` status (not `skipped`) and still try to create missing sub-files like `.claw/settings.json` without overwriting existing files; (d) emit per-sub-file artifact entries for `.claw/settings.json` and `.claw/sessions/` (skipped status if absent, deferred-to-first-save acceptable) so automation knows what's present; (e) regression test: `claw init` produces a `.claw.json` whose `permissions.defaultMode` is NOT `dontAsk`; `.claw/` contains at least one templated file. **Why this matters:** init is the primary onboarding surface. Every first-time user piping `curl install.sh | sh && claw init` gets a workspace pre-configured to skip permission prompts — and that workspace gets committed to the user's repo via the `init`-added entry. The `.claw/` empty-directory bug means feature discovery (skills, plugins) lacks the scaffolding it implies. Cross-references #428 (runtime default permission_mode), #50/#87/#91/#94/#97/#101/#106/#115/#123 (permission-rule audit), #435 (filesystem side effects on failed resume). Source: Jobdori live dogfood, `b8f989b6`, 2026-05-11.
436. **DONE — `claw init` scaffolds safe project settings and reports partial/deferred artifacts** — fixed 2026-06-04 in `fix: scaffold safe init settings`. The starter `.claw.json` and new `.claw/settings.json` template now both use `permissions.defaultMode:"acceptEdits"` instead of unsafe `dontAsk`. Fresh init materializes `.claw/settings.json`, keeps `.claw/sessions/` deferred until the first successful session save, and emits per-artifact entries for `.claw/`, `.claw/settings.json`, `.claw/sessions/`, `.claw.json`, `.gitignore`, and `CLAUDE.md`. When `.claw/` already exists but its settings template is missing, init creates `.claw/settings.json` without overwriting existing files and reports `.claw/` as `partial` rather than `skipped`; idempotent reruns keep existing artifacts skipped and session storage deferred. JSON init output now includes `partial[]` and `deferred[]` alongside `created[]`, `updated[]`, and `skipped[]`, and init help/USAGE document the artifact statuses. Regression coverage: `initialize_repo_creates_expected_files_and_gitignore_entries`, `initialize_repo_is_idempotent_and_preserves_existing_files`, `artifacts_with_status_partitions_fresh_and_idempotent_runs`, and `init_json_envelope_has_hint_and_already_initialized_783`.
437. **`version --output-format json` omits build provenance fields — no `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`; `git_sha` is truncated to 7 chars instead of full 40-char hash; sibling: `executable_path` leaks the build host's path (`/tmp/claw-dog-0530/...`) into runtime output** — dogfooded 2026-05-11 by Jobdori on `8cf628a5` in response to Clawhip pinpoint nudge at `1503320791582900344`. Reproduction: `claw version --output-format json` returns `{"build_date":"2026-05-11","executable_path":"/tmp/claw-dog-0530/rust/target/release/claw","git_sha":"b98b9a7","kind":"version","message":"Claw Code\n Version 0.1.0\n Git SHA b98b9a7\n Target aarch64-apple-darwin\n Build date 2026-05-11","target":"aarch64-apple-darwin","version":"0.1.0"}`. Critical provenance fields missing: (a) **`is_dirty`** — was the working tree clean at build time? Automation that pins on build provenance cannot tell if the binary was built from a clean commit or includes uncommitted changes; (b) **`branch`** — was this built from `main`, `dev/rust`, a release tag, or a feature branch? The `git_sha` alone doesn't reveal the integration point; (c) **`commit_date` / `commit_timestamp`** — only `build_date` (when the binary was compiled) is exposed; the commit itself might be days/weeks older if the build happened later. Reproducibility audits need both; (d) **`rustc_version`** — what Rust compiler version produced this binary? Critical for security advisories (e.g., known regressions in specific rustc versions); (e) **`git_sha` truncated to 7 chars** ("b98b9a7" instead of full "b98b9a71..."): 7-char shas have known collision rates in large repos and prevent unambiguous git rev-parse round-trip. **Sibling: `executable_path` leaks build-host path.** The `executable_path` field returns `/tmp/claw-dog-0530/rust/target/release/claw` — the directory where the binary was compiled, embedded into the binary metadata. For a binary copied/installed/symlinked to a different location, this field still reports the build path, not the actual invocation path. Either the field should reflect the runtime path via `std::env::current_exe()` at runtime (not compile-time), or it should be dropped to avoid leaking compile-host filesystem layout. **Sibling: prose `message` field duplicates structured data.** The `message` field still contains the entire text-mode prose version block (`"Claw Code\n Version 0.1.0\n Git SHA b98b9a7\n..."`) — every field present as structured JSON (`version`, `git_sha`, `target`, `build_date`) is also embedded in the prose. Same issue as #391 (`version json includes prose message field`) which was closed as "fixed" — the prose remains. **Required fix shape:** (a) add `is_dirty:bool`, `branch:string|null`, `commit_date:string` (ISO-8601), `commit_timestamp:int` (Unix epoch), `rustc_version:string` to the JSON envelope; (b) preserve full 40-char `git_sha` and add `git_sha_short:string` as a derived field if 7-char form is needed for UX; (c) `executable_path` should be `std::env::current_exe()` at runtime, not the compile-time path; (d) drop the prose `message` field from JSON or rename it `human_readable:string` and make it explicitly secondary to the structured fields; (e) re-verify #391 closure — the prose `message` is still present, the fix didn't fully land. **Why this matters:** version surface is the canonical provenance probe for security audits, build reproducibility, and bug-report metadata. Missing `is_dirty` means automated triage cannot distinguish "issue against a clean main commit" from "issue against a developer's uncommitted hack". Truncated `git_sha` blocks unambiguous git lookup. Leaked `executable_path` exposes build-host layout. Cross-references #391 (version prose duplication — apparently not fully fixed), #334 (version json omits build_date — fixed, but partial scope), #100 (commit identity audit). Source: Jobdori live dogfood, `8cf628a5`, 2026-05-11.
437. **DONE — `version --output-format json` exposes complete build provenance without duplicating prose** — fixed 2026-06-04 in `fix: expose complete version provenance`. Build metadata now records the full 40-character `git_sha`, separate derived `git_sha_short`, `is_dirty`, `branch`, ISO-8601 `commit_date`, Unix `commit_timestamp`, `rustc_version`, target, and build date. Version JSON exposes those fields at top level and mirrors them under `binary_provenance`; `workspace_git_sha` is also a full SHA and `workspace_match` now compares full commit identities. `executable_path` is resolved at runtime with `std::env::current_exe()` instead of reporting a compile-host path. The prose report is no longer duplicated in JSON as `message`; JSON callers get the secondary text block as `human_readable`. Docs in `USAGE.md` and `rust/README.md` describe the provenance contract. Regression coverage: `version_emits_json_when_requested`, `version_status_doctor_include_binary_provenance_797`, `resumed_version_and_init_emit_structured_json_when_requested`, and `resumed_version_command_emits_structured_json`.
438. **Memory file discovery only recognizes `CLAUDE.md` — `AGENTS.md` (industry convention used by OpenCode/Codex/Aider/Cursor) and `CLAW.md` (project's own brand name) are silently ignored despite being present in the workspace** — dogfooded 2026-05-11 by Jobdori on `d3a982dd` in response to Clawhip pinpoint nudge at `1503328341422244012`. Reproduction (fresh empty dir, isolated `CLAW_CONFIG_HOME`): create three files in cwd — `CLAUDE.md` (marker `MARKER-FROM-CLAUDE-MD`), `AGENTS.md` (marker `MARKER-FROM-AGENTS-MD`), `CLAW.md` (marker `MARKER-FROM-CLAW-MD`). Run `claw status --output-format json``workspace.memory_file_count: 1`. Run `claw system-prompt --output-format json` and search the `message` field for each marker: only `MARKER-FROM-CLAUDE-MD` is found; `MARKER-FROM-AGENTS-MD` and `MARKER-FROM-CLAW-MD` are absent. `claw-code` exclusively recognizes the Claude-branded filename inherited from upstream Claude Code; the project's own `CLAW.md` brand name and the cross-tool industry convention `AGENTS.md` are both silently dropped. **Three sibling implications:** (a) **brand-consistency gap**: a project rebranded from Claude Code to Claw Code that introduces `CLAUDE.md` as its only memory file is internally inconsistent. Users naturally expect `claw <subcommand>` to read `CLAW.md`. (b) **industry-convention gap**: `AGENTS.md` is the convergent convention for OpenCode (oh-my-opencode/sisyphus), OpenAI Codex CLI, Aider, Cursor, Continue.dev, and most ACP harnesses. Users with mixed-tool workflows maintain a shared `AGENTS.md` and expect every AI coding tool to honor it. (c) **silent failure mode**: there is no warning when `AGENTS.md` or `CLAW.md` exist but are not loaded. Users who copy-paste `AGENTS.md` from another tool's docs see `memory_file_count` stay at 0 or 1 and have to guess why their instructions aren't applied. **Required fix shape:** (a) discover and load **`CLAUDE.md`, `CLAW.md`, `AGENTS.md`** in that priority order (existing config-precedence pattern); (b) all three contribute to `memory_file_count` with `memory_files:[{path, source:"claude_md"|"claw_md"|"agents_md", chars}]` array exposed in `status --output-format json`; (c) when multiple files exist, merge or document the precedence: project-specific `CLAUDE.md`/`CLAW.md` overrides industry-shared `AGENTS.md`; (d) `claw doctor --output-format json` adds a `memory` check that warns when `AGENTS.md` exists but is not the loaded variant (alerting users that they may be relying on the wrong file); (e) regression test: workspace with all three files results in `memory_file_count >= 1` and the system prompt contains markers from at least the highest-precedence file. **Why this matters:** `AGENTS.md` is the lingua-franca instruction file for cross-tool AI coding workflows. A team using OpenCode for one project and Claw Code for another keeps their conventions in a shared `AGENTS.md`. Forcing them to also maintain a `CLAUDE.md` for claw-code (with identical content) is friction that breaks the value proposition of a fork. Cross-references #438 itself (the multi-file convention), and AGENTS.md ecosystem references in oh-my-opencode/sisyphus docs. Source: Jobdori live dogfood, `d3a982dd`, 2026-05-11.
438. **DONE — memory discovery loads `CLAUDE.md`, `CLAW.md`, and `AGENTS.md` with structured provenance** — fixed 2026-06-04 in `fix: load Claw and Agents memory files`. Project memory discovery now checks root instruction files in `CLAUDE.md`, `CLAW.md`, then `AGENTS.md` order for each discovered directory, preserves existing scoped `.claw/CLAUDE.md`, `.claude/CLAUDE.md`, `.claw/instructions.md`, and rules-directory imports, and exposes each loaded file's `path`, `source`, `chars`, and `contributes` in `status --output-format json` as `workspace.memory_files[]`. `system-prompt --output-format json` returns the same memory metadata alongside the rendered `message`/`sections`, and all non-duplicate loaded files contribute to the prompt so CLAUDE/CLAW/AGENTS markers are visible together. `claw doctor --output-format json` now includes a dedicated `memory` check with loaded memory metadata and `unloaded_memory_files[]` warnings for present `CLAW.md`/`AGENTS.md` candidates that were skipped (for example empty or duplicate-content variants). Docs in `USAGE.md` and `rust/README.md` describe the priority and JSON contracts. Regression coverage: `discovers_claude_claw_agents_and_dot_claude_instruction_files_together`, `memory_files_load_claude_claw_agents_and_surface_json_438`, and `memory_health_surfaces_loaded_and_unloaded_files_438`.
439. **Memory file discovery walks ALL ancestor directories up to `$HOME` boundary, silently loading any `CLAUDE.md` it finds — `/tmp/CLAUDE.md` left from a previous test silently bleeds into every project under `/tmp/*/`; no `--no-parent-memory` flag, no `.no-claude-md-boundary` marker file to limit discovery scope** — dogfooded 2026-05-11 by Jobdori on `f4a96740` in response to Clawhip pinpoint nudge at `1503335892461293675`. Reproduction: create three nested `CLAUDE.md` files with unique markers — `/tmp/claw-nested-probe/CLAUDE.md` (`PARENT_CLAUDE`), `subproj/CLAUDE.md` (`CHILD_CLAUDE`), `subproj/deep/CLAUDE.md` (`DEEP_CLAUDE`). Run `claw system-prompt --output-format json` from `subproj/deep/nest/` (note: `nest` has no `CLAUDE.md`). The `message` field contains **all three markers** (PARENT + CHILD + DEEP) and `status --output-format json` reports `memory_file_count: 3`. Boundary tests: (a) `$HOME/CLAUDE.md` is NOT picked up from `/tmp/no-claude-dir` (discovery stops at `$HOME` boundary, good); (b) From `/tmp/deep` (no nested CLAUDE.md), `/tmp/CLAUDE.md` IS picked up (count: 1); (c) git-root is NOT a discovery boundary — running from a git subdir still walks above the git root. **Ambient-context-bleed footgun:** any stale `/tmp/CLAUDE.md` (or `/home/<user>/projects/CLAUDE.md`, or any ancestor-path CLAUDE.md left over from a previous experiment, copy-paste, or AI-generated example) silently bleeds into every workspace nested below it. The user has no signal in `status --output-format json` indicating which ancestor file is contributing — only the aggregate `memory_file_count`. **Three required fixes:** (a) **expose discovery list**: `status --output-format json` and `system-prompt --output-format json` must include `memory_files:[{path, source:"workspace"|"ancestor"|"parent_dir"|"home", chars, contributes:bool}]` so users can see what's leaking in; (b) **add `--no-parent-memory` flag** to limit discovery to cwd only (no ancestor walk), or add a boundary marker (`.claude-no-walk`, `.claw-root`, or honor `.git` as the boundary by default — most users expect repo-root scope); (c) **`doctor` warns** when ancestor `CLAUDE.md` files are loaded from outside the current git repo (suggests they may be unintentional). **Sibling discovery scope question:** discovery walks up to `$HOME` — but for a user with a project at `/Users/foo/work/proj`, that's `/Users/foo/work/CLAUDE.md` + `/Users/foo/CLAUDE.md` (if it exists) both load. The home boundary is exclusive, but the entire `/Users/foo` tree under home is in scope. **Why this matters:** test workspaces, scratch dirs, AI-generated example projects, and shared `/tmp` workdirs are full of stale `CLAUDE.md` files. The current discovery rule means every claw invocation can silently inherit context from arbitrary ancestor paths. Cross-references #438 (memory discovery only finds CLAUDE.md, not AGENTS.md or CLAW.md), #421 (cwd canonicalization leak — the canonicalized form determines which ancestor walk path is used). Source: Jobdori live dogfood, `f4a96740`, 2026-05-11.
439. **DONE — memory discovery is git-root bounded and reports memory origins** — fixed 2026-06-04 in `fix: bound parent memory discovery`. Project memory discovery now walks only from the current directory up to the nearest git root when one exists, and otherwise stays cwd-local, so stale parent `CLAUDE.md` files outside the project no longer bleed into scratch workspaces. Loaded memory JSON in both `status --output-format json` and `system-prompt --output-format json` now includes `origin`, `scope_path`, and `outside_project` alongside `path`, `source`, `chars`, and `contributes`; origins distinguish `workspace`, `parent_dir`, `ancestor`, `home`, and `outside_project`. The `memory` doctor check warns if an outside-project memory file is ever loaded, while still listing loaded and skipped memory candidates structurally. Docs in `USAGE.md` and `rust/README.md` describe the git-root boundary and expanded JSON fields. Regression coverage: `discovery_stops_at_git_root_boundary_439`, `discovery_without_git_root_stays_cwd_local_439`, and `memory_discovery_stops_at_git_root_and_reports_origins_439`.
440. **One invalid `mcpServers` entry blocks ALL OTHER valid MCP servers from loading — `mcp list --output-format json` returns `configured_servers: 0, servers: []` when even one server has a missing/invalid `command` field, despite other servers in the same config being well-formed; sibling: config parser halts on first invalid entry, never reports the remaining invalid entries** — dogfooded 2026-05-11 by Jobdori on `bd126905` in response to Clawhip pinpoint nudge at `1503343442904879156`. Reproduction: write `.claw.json` containing six `mcpServers` entries — one valid (`valid-server: {command:"/bin/echo", args:["hello"]}`) and five with progressive defects (missing-command, empty-command, null-command, wrong-type-command, extra-unknown-field). Run `claw mcp list --output-format json``{"action":"list","config_load_error":"/private/tmp/claw-mcp-probe/.claw.json: mcpServers.missing-command-server: missing string field command","configured_servers":0,"kind":"mcp","servers":[],"status":"degraded"}`. The error mentions only `missing-command-server` (the first invalid entry in JSON-object iteration order); the other four invalid entries are never surfaced. The valid `valid-server` entry is silently dropped because the parser bails on the first error. `status --output-format json` correctly propagates the same `config_load_error` and sets `status:"degraded"`, but no field tells automation which servers are valid vs broken — `servers:[]` is the only signal. **Three problems compounded:** (a) **all-or-nothing loading**: ROADMAP product principle #5 says "partial success is first-class," but mcp config loading is binary. One bad server kills the entire MCP plane; (b) **first-error-only reporting**: a `.claw.json` with five invalid entries surfaces only one error message — the user fixes that one and runs again, gets the next error, and so on. Five iterations needed to discover all errors; (c) **no per-server status**: even with the partial-success fix, the JSON envelope needs `servers:[{name, valid:bool, error?, command?, args?}]` so automation can see which entries are usable. **Required fix shape:** (a) the MCP config parser must collect ALL invalid entries into an `invalid_servers:[{name, error_field, reason}]` array and load all valid ones into `servers:[]`; do not abort on first error; (b) `configured_servers` reflects the count of *valid* loaded servers (not zero) when there are valid entries alongside invalid ones; (c) expose `total_configured:int` (count of entries in source `.claw.json`) AND `valid_count:int` (loaded), AND `invalid_count:int` (rejected) — three distinct counts; (d) `doctor --output-format json` adds an `mcp_validation` check that lists each invalid entry with its error message; (e) regression test: `.claw.json` with one valid + one invalid entry results in `configured_servers: 1, invalid_servers: [{name:"...", reason:"..."}]`. **Why this matters:** users iterate on MCP server lists during onboarding — one typo kills the entire plane, including servers they got working previously. The first-error-only reporting forces N iterations through N invalid entries instead of a single fix-everything-at-once pass. Cross-references #407 (config files no load_error per-file), #415 (config section merged_keys count only), #416 (plugins list prose), #428 (default permission mode), and Product Principle #5. Source: Jobdori live dogfood, `bd126905`, 2026-05-11.
440. **DONE — invalid `mcpServers` siblings no longer drop valid MCP servers** — fixed 2026-06-04 in `fix: load partial MCP configs`. MCP config loading now records every invalid server entry as `invalid_servers:[{name, scope, path, error_field, reason, valid:false}]` while retaining valid siblings in `servers[]`; valid entries carry `valid:true`, `configured_servers` and `valid_count` report loaded valid servers, `invalid_count` reports rejected entries, and `total_configured` reports all discovered entries. `status --output-format json` mirrors the `mcp_validation` summary, and `doctor --output-format json` includes an `mcp validation` check for one-pass repair. Empty stdio commands and unknown per-transport fields are per-server validation errors instead of global config failures. Regression coverage: `loads_valid_mcp_servers_and_collects_all_invalid_siblings_440`, `records_invalid_mcp_server_shapes_without_rejecting_config_440`, `mcp_loads_valid_servers_and_reports_invalid_siblings_440`, and `mcp_degraded_config_and_failed_usage_are_distinct_json_contracts`.
441. **`hooks` config schema diverges from Claude Code documented format — claw-code expects `{"hooks":{"PreToolUse":["command-string"]}}` (array of command strings) while Claude Code documentation specifies `{"hooks":{"PreToolUse":[{"matcher":"Read","hooks":[{"type":"command","command":"..."}]}]}}` (structured matcher objects); users copy-pasting from Claude Code docs see `field "hooks.PreToolUse" must be an array of strings`** — dogfooded 2026-05-11 by Jobdori on `86ff83c2` in response to Clawhip pinpoint nudge at `1503350990680887418`. Reproduction: write `.claw.json` with the Claude-Code-documented hook format `{"hooks":{"PreToolUse":[{"matcher":"Read","hooks":[{"type":"command","command":"/bin/echo pretool"}]}]}}`. Run `claw status --output-format json``config_load_error: "/private/tmp/claw-hook-probe/.claw.json: field \"hooks.PreToolUse\" must be an array of strings, got an array (line 3)"`, `status: "degraded"`. The error wording ("must be an array of strings, got an array") is confusingly tautological — the user did provide an array; the parser objects that the array contains objects instead of strings. Replacing with the claw-code-actual format `{"hooks":{"PreToolUse":["/bin/echo pretool"]}}` succeeds: `config_load_error: null, status: "ok"`. The two formats are fundamentally incompatible: claw-code drops the `matcher` field (no tool-specific filtering at the config layer), drops the `type:"command"` discriminator (no future expansion to other hook types), and treats each entry as a bare command string instead of a structured hook spec. **Sibling: PR #3000 (justcode049) was attempting to tolerate object-style hook entries** — that PR's title `fix: tolerate object-style hook entries in config parser` confirms this is a known user complaint, but the PR is still conflicting and unmerged. **Three sibling findings in same probe:** (a) **unknown event names reject entire hooks config**: `.claw.json` with `hooks.InvalidEvent` (not a real event name like `PreToolUse`/`PostToolUse`/`Stop`/`Notification`) triggers `config_load_error: "unknown key \"hooks.InvalidEvent\""` and rejects ALL hooks in the same file, even valid ones — same "one bad apple kills all" pattern as #440 (MCP servers). (b) **`kind:"unknown"` for the validation error** — should be `kind:"invalid_hooks_config"` or `kind:"unknown_hook_event"` (catch-all cluster #422/#423/#424/#428/#430/#431/#432/#433/#435 — 13th occurrence). (c) **first-error-only halting**: a `.claw.json` with `hooks.Stop:"not-an-array"` (type mismatch) AND `hooks.InvalidEvent` (unknown name) AND `hooks.Notification:[{}]` (empty entry) surfaces only the FIRST error in iteration order — user must fix one at a time across 3 iterations. **Required fix shape:** (a) **adopt Claude Code's structured hook format as the canonical**: support `{matcher, hooks:[{type, command}]}` natively, with `matcher` for tool-filtering, `type` for hook-type discriminator (future-proof for `inline`/`webhook`/etc beyond just `command`); (b) **keep backward compat for bare command strings**: legacy `["command-string"]` arrays still load, but emit a deprecation warning suggesting migration to the structured form; (c) **partial-success loading**: invalid hook entries surface in `invalid_hooks:[{event, index, reason}]` while valid ones load — same fix as #440 for MCP; (d) **typed `kind:"invalid_hooks_config"` envelope** instead of `kind:"unknown"`; (e) **rebase and merge PR #3000** which addresses this directly; (f) regression test: Claude-Code-documented hook config loads without error on claw-code. **Why this matters:** users migrating from Claude Code to Claw Code hit this on their first `.claw.json` write. The error message ("array of strings, got an array") is unhelpful; the documentation doesn't surface the schema divergence; and Claude Code's structured format is strictly more expressive (matchers, types) than claw-code's bare-string format. Cross-references #407 (config files no load_error), #410 (list-envelope schema drift), #428 (default permission mode), #440 (one invalid MCP entry blocks all), PR #3000 (justcode049's pending fix). Source: Jobdori live dogfood, `86ff83c2`, 2026-05-11.
@ -7756,10 +7755,14 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
794. **`claw plugins install /nonexistent/path` returned `error_kind:"unknown"` + `hint:null`** — dogfooded 2026-05-27 on `57a57ef7`. The error message `"plugin source '/path' was not found"` had no classifier arm, falling to `"unknown"`. Fix: added `plugin_source_not_found` classifier arm (`message.contains("plugin source") && message.contains("was not found")`); added `"plugin_source_not_found"``"Check that the path or URL is correct..."` to `fallback_hint_for_error_kind`. Unit test assertion added to `test_classify_error_kind`; integration test `plugins_install_not_found_path_returns_typed_kind_794` added. 56 CLI contract tests pass. [SCOPE: claw-code] Source: Jobdori plugins install probe on `57a57ef7`, 2026-05-27.
795. **`claw skills install /nonexistent` returned `skill_not_found + hint:null` and `claw skills uninstall x` returned `unsupported_skills_action + hint:null`** — dogfooded 2026-05-27 on `491f179a`. Both error kinds were missing from `fallback_hint_for_error_kind` table, so even though classify returned a typed kind, the hint field was always null. Fix: added `"skill_not_found"` → hint suggesting `claw skills list` / `claw skills install`; added `"unsupported_skills_action"` → hint listing supported actions. Integration test `skills_install_not_found_and_unsupported_action_have_hints_795` covers both paths. 57 CLI contract tests pass. [SCOPE: claw-code] Source: Jobdori skills lifecycle probe on `491f179a`, 2026-05-27.
795. **`claw skills install /nonexistent` returned `skill_not_found + hint:null` and `claw skills uninstall x` returned `unsupported_skills_action + hint:null`** — dogfooded 2026-05-27 on `491f179a`. Both error kinds were missing from `fallback_hint_for_error_kind` table, so even though classify returned a typed kind, the hint field was always null. Fix: added `skill_not_found` and `unsupported_skills_action` fallback hints. ROADMAP #431 later moved the lifecycle surface fully local: install failures now emit typed `invalid_install_source`, uninstall failures emit local `skill_not_found` with `skills_dir` and `available_names`, and the combined regression is covered by `skills_lifecycle_errors_have_typed_local_json_795_431` plus the install/uninstall roundtrip test. [SCOPE: claw-code] Source: Jobdori skills lifecycle probe on `491f179a`, 2026-05-27.
796. **`claw agents show <name> <extra>` and `claw skills show <name> <extra>` returned confusing `agent_not_found`/`skill_not_found` for the concatenated "name extra" string** — dogfooded 2026-05-27 on `18b4cee5`. `join_optional_args` passes all tokens as a space-joined string; both `show` handlers called `split_once(' ')` to extract the name but did not check if the remainder (after the first split) contained additional tokens. Extra positional args (including `--flags`) became part of the "name", silently mangling the lookup. Fix: added second `split_once(' ')` on the extracted name; if the result has two parts, return `unexpected_extra_args` with a usage hint. Valid single-name lookups are unaffected. Two new integration tests `agents_show_extra_positional_arg_returns_unexpected_extra_796`, `skills_show_extra_positional_arg_returns_unexpected_extra_796`. 59 CLI contract tests pass. [SCOPE: claw-code] Source: Jobdori agents/skills show extra-arg probe on `18b4cee5`, 2026-05-27.
797. **Installed `claw version --output-format json` reports `git_sha:null` / `Git SHA unknown`, so dogfood cannot tie the binary under test to a source revision** — dogfooded 2026-05-27 from `#clawcode-building-in-public` using the installed `/home/bellman/.cargo/bin/claw` binary in a clean `ultraworkers/claw-code` checkout. `claw version --output-format json` returned `{"kind":"version","version":"0.1.0","git_sha":null,"target":null,"build_date":"2026-03-31"...}` while `claw status --output-format json` only reported workspace state (`git_branch`, clean/dirty counts) and did not provide any executable-vs-workspace provenance comparison. This is a clawability gap in event/log opacity and stale-binary confusion: an operator can run `doctor/status/version` successfully but still cannot prove which commit the installed CLI came from, whether it matches `origin/main`, or whether the observed behavior is from a stale packaged binary. **Required fix shape:** (a) embed build git SHA/target/build provenance in installed/release binaries whenever the source tree is available; (b) when provenance is missing, emit a typed `binary_provenance.status:"unknown"` rather than only `git_sha:null`; (c) have `status`/`doctor` include a redaction-safe comparison between executable provenance and workspace HEAD when running inside a git checkout; (d) add regression/packaging coverage proving release/local install paths preserve or explicitly classify provenance. **Why this matters:** dogfood reports and automation need to distinguish current-source failures from stale or unknown binary lineage before opening/rebasing/closing PRs. Source: gaebal-gajae live dogfood on 2026-05-27; active repo checkout had open PR #3124 DIRTY with no checks and PR #3125 CLEAN, but the installed binary itself could not identify its source revision.
797. **DONE — Installed `claw version --output-format json` reports `git_sha:null` / `Git SHA unknown`, so dogfood cannot tie the binary under test to a source revision** — dogfooded 2026-05-27 from `#clawcode-building-in-public` using an installed binary in a clean `ultraworkers/claw-code` checkout. The gap was that version/status/doctor did not provide a structured executable-vs-workspace provenance object when build metadata was missing or stale. [SCOPE: claw-code]
**Fix applied.** `version --output-format json` now includes a `binary_provenance` object with `status:"known"|"unknown"`, build git SHA, target, build date, executable path, workspace HEAD SHA, `workspace_match`, and a structured hint when provenance is missing or mismatched. `status --output-format json` exposes the same object, and `doctor --output-format json` includes it in the `system` check so dogfood reports can distinguish current-source failures from stale or unknown binary lineage.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli version_status_doctor_include_binary_provenance_797 -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli version_emits_json_when_requested -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli doctor_and_resume_status_emit_json_when_requested -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli --test output_format_contract -- --nocapture`; direct probes `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --output-format json version` and `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --output-format json status`; `cargo build --manifest-path rust/Cargo.toml --workspace --locked`.
798. **`claw plugins show <name> <extra-arg>` returned `unexpected_extra_args` + `hint:null`** — dogfooded 2026-05-27 on `9976585f`. The plugins arg parser at the top level emitted `"unexpected extra arguments after 'claw plugins show ...': ..."` with no `\n` delimiter (parity gap with #791 config fix). Fix: appended `\nUsage: claw plugins [list|show <id>|...]` to the error format string. Integration test `plugins_extra_args_have_non_null_hint_797`. Committed as `bff37000`. 60 CLI contract tests pass. [SCOPE: claw-code]
@ -7778,14 +7781,34 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
805. **`claw skills show <not-found>` in text mode silently returned "No skills found." instead of an error** — dogfooded 2026-05-27 on `2c3c0f60`. The text-mode show handler in `handle_skills_slash_command` returned `render_skills_report(&matched)` with an empty vec instead of checking for empty match and returning an error. JSON mode already returned `skill_not_found` since #706. Fix: added `matched.is_empty()` guard with `skill_not_found` error + `\n` hint suggesting `claw skills list`. 62 CLI contract tests pass. [SCOPE: claw-code]
806. **`claw plugins show <not-found>` in text mode returned "No plugins installed." instead of an error** — dogfooded 2026-05-27 on `ae6a207d`. The text-mode path in `print_plugins` printed `payload.message` (the full list render) without checking if the requested plugin existed. JSON mode correctly returned `plugin_not_found`. Fix: added show-action filtering + not-found guard to text-mode path; added `starts_with("plugin_not_found:")` arm to classifier for the new error prefix. 63 CLI contract tests pass. [SCOPE: claw-code]
807. **`claw models` / `claw model` with `--output-format json` hang with zero stdout instead of returning bounded model discovery/help JSON or a typed unsupported response** — dogfooded 2026-05-27 on `ae6a207` while checking docs/usage model-alias surface after PR #3162 opened. Both `cargo run -q -p rusty-claude-cli -- models --output-format json` and the actual rebuilt `./rust/target/debug/claw models --output-format json` timed out under an 8s outer timeout with stdout `0`; stderr only contained config deprecation warnings. The same silent timeout reproduced for `models help --output-format json`, `model --output-format json`, and `model help --output-format json`. **Required fix shape:** (a) make `model(s)` help/list/discovery commands return bounded stdout JSON without entering prompt/provider/auth paths; (b) if the command is unsupported, return a standard typed JSON error envelope with `error_kind`, non-null `hint`, and `message`; (c) ensure docs model-alias tables and CLI model discovery surfaces do not diverge; (d) add regression coverage for `models --output-format json`, `models help --output-format json`, `model --output-format json`, and `model help --output-format json` proving they do not hang or emit zero-byte stdout. **Why this matters:** model selection is a setup/control-plane surface. If the natural model discovery commands hang silently, claws cannot verify aliases like `qwen-max` / `qwen-plus`, distinguish unsupported command spelling from provider startup, or safely guide users during first-run model setup. Source: gaebal-gajae 13:30/14:00 dogfood probe; GitHub issue creation was blocked by API rate limit, so the finding was recorded directly in ROADMAP.
808. **Control-plane commands `claw config`, `claw settings`, `claw status`, and `claw doctor` with `--output-format json` hang with zero stdout instead of returning bounded JSON/help or a typed unsupported envelope** — dogfooded 2026-05-27 on `86f45a1` after ROADMAP #807 landed. Each of `./rust/target/debug/claw config --output-format json`, `config help --output-format json`, `settings --output-format json`, `settings help --output-format json`, `status --output-format json`, and `doctor --output-format json` timed out under an 8s outer timeout with stdout `0`; stderr only contained the local deprecated `enabledPlugins` settings warning. **Required fix shape:** keep non-interactive control-plane/info commands out of prompt/provider startup paths; return bounded JSON stdout for supported status/config/help surfaces, or a standard typed JSON error envelope with `error_kind`, non-null `hint`, and `message` for unsupported spellings; add timeout/nonzero-stdout regression coverage for the six repro commands. **Why this matters:** claws and users need first-run diagnostics/config/status surfaces that are safe to call from scripts. Silent hangs make setup triage indistinguishable from provider startup, auth, or model discovery failures. Source: gaebal-gajae 17:00 dogfood probe; rechecked 17:30 after `cargo build --manifest-path rust/Cargo.toml -p rusty-claude-cli` produced `claw --version` Git SHA `23a7de6`, and the same timeout reproduced for current HOME and a clean `HOME=/tmp/claw-clean-home-1730` (clean HOME produced rc 124, stdout 0, stderr 0 for `config`, `status`, and `doctor`). [SCOPE: claw-code]
809. **Top-level help/version/MCP/plugin JSON spellings hang with zero stdout in trailing `--output-format json` form instead of returning bounded JSON/help or typed unsupported envelopes** — dogfooded 2026-05-27 on rebuilt main `db81598` (`cargo build --manifest-path rust/Cargo.toml -p rusty-claude-cli`; `claw --version` Git SHA `db81598`). `help --output-format json`, `version --output-format json`, `mcp --output-format json`, `mcp help --output-format json`, `plugins --output-format json`, and `plugins help --output-format json` each timed out under an 8s outer timeout with stdout `0`; stderr only contained the local deprecated `enabledPlugins` settings warning. Leading global-style probes (`--help --output-format json`, `--version --output-format json`) fail immediately as `[error-kind: cli_parse] unknown option`, so the hang is again in the trailing subcommand-style routing/startup path. **Required fix shape:** treat help/version/MCP/plugin discovery surfaces as bounded non-interactive control-plane commands; either return JSON help/list/version payloads or standard typed JSON unsupported envelopes with `error_kind`, non-null `hint`, and `message`; add timeout/nonzero-stdout regression coverage for the six trailing repro commands and parser-envelope coverage for leading global-style spellings. **Why this matters:** claws need safe scriptable help/version/plugin/MCP discovery before provider/session startup; silent hangs hide whether a command is unsupported, misparsed, or initializing runtime state. Source: gaebal-gajae 19:00 dogfood probe. [SCOPE: claw-code]
810. **TTY JSON success for `config`/`plugins --output-format json` contaminates stdout with deprecated-settings warnings before the JSON object** — dogfooded 2026-05-27 on rebuilt main `db81598` after #809. Under pseudo-TTY (`script -q -c "./rust/target/debug/claw config --output-format json"` and `plugins --output-format json`), the commands return rc `0` and bounded JSON, but stdout begins with `warning: /home/bellman/.claw/settings.json: field "enabledPlugins" is deprecated ...` before the JSON object (`first_json_index=121`). Parsing succeeds only after manually stripping the warning/prefix; raw stdout is not valid JSON. **Required fix shape:** in JSON mode, keep diagnostics/warnings on stderr or include structured warning fields inside the JSON envelope, but never prepend human warnings to stdout; add regression coverage that raw stdout from JSON commands parses from byte 0 under TTY and non-TTY modes. **Why this matters:** even when the TTY path avoids the hang from #807/#808/#809, claws and scripts still cannot safely `json.loads(stdout)` if configuration warnings are mixed into stdout. Source: gaebal-gajae 20:00 pseudo-TTY dogfood probe. [SCOPE: claw-code]
811. **Previously typed JSON error/list surfaces hang in plain non-TTY trailing `--output-format json` form instead of emitting their JSON envelopes** — dogfooded 2026-05-27 on rebuilt main `b0e94c9` after #810. In plain non-TTY automation, `agents list --bogus --output-format json`, `skills show does-not-exist --output-format json`, `plugins show does-not-exist --output-format json`, `diff --output-format json`, `sessions show does-not-exist --output-format json`, and `resume bogus --output-format json` each timed out under an 8s outer timeout with stdout `0`; stderr only contained the local deprecated `enabledPlugins` settings warning. Several of these surfaces had prior roadmap fixes for typed JSON/text envelopes, so this is a regression-class scriptability gap: the command-specific envelope may exist, but plain non-TTY trailing JSON invocation routes into interactive startup before reaching it. **Required fix shape:** ensure trailing `--output-format json` is honored before any interactive/provider/session startup for error/list surfaces; add plain non-TTY timeout regression coverage that asserts raw stdout is a parseable typed JSON envelope for the six repro commands, including `error_kind`, non-null `hint`, and `message` where applicable. **Why this matters:** claws primarily invoke CLI checks from non-TTY automation; a fix that only works in manual/TTY mode still leaves JSON error handling unusable for agents. Source: gaebal-gajae 20:30 dogfood probe. [SCOPE: claw-code]
807. **DONE — `claw models` / `claw model` with `--output-format json` hang with zero stdout instead of returning bounded model discovery/help JSON or a typed unsupported response** — dogfooded 2026-05-27 on `ae6a207` while checking docs/usage model-alias surface after PR #3162 opened. Both `cargo run -q -p rusty-claude-cli -- models --output-format json` and the actual rebuilt `./rust/target/debug/claw models --output-format json` timed out under an 8s outer timeout with stdout `0`; stderr only contained config deprecation warnings. The same silent timeout reproduced for `models help --output-format json`, `model --output-format json`, and `model help --output-format json`. **Required fix shape:** (a) make `model(s)` help/list/discovery commands return bounded stdout JSON without entering prompt/provider/auth paths; (b) if the command is unsupported, return a standard typed JSON error envelope with `error_kind`, non-null `hint`, and `message`; (c) ensure docs model-alias tables and CLI model discovery surfaces do not diverge; (d) add regression coverage for `models --output-format json`, `models help --output-format json`, `model --output-format json`, and `model help --output-format json` proving they do not hang or emit zero-byte stdout. **Why this matters:** model selection is a setup/control-plane surface. If the natural model discovery commands hang silently, claws cannot verify aliases like `qwen-max` / `qwen-plus`, distinguish unsupported command spelling from provider startup, or safely guide users during first-run model setup. Source: gaebal-gajae 13:30/14:00 dogfood probe; GitHub issue creation was blocked by API rate limit, so the finding was recorded directly in ROADMAP.
**Fix applied.** `model` and `models` now route to a local `CliAction::Models` surface. Bare `models --output-format json` emits bounded local model metadata (default model, built-in aliases, optional configured model) without provider startup, while `model help --output-format json` routes through the structured local help envelope.
**Verification.** Regression test `models_json_and_model_help_json_are_local_807` asserts bounded exit, parseable stdout JSON, empty stderr, no `missing_credentials`, and `requires_provider_request:false` for the models list envelope.
808. **DONE — Control-plane commands `claw config`, `claw settings`, `claw status`, and `claw doctor` with `--output-format json` hang with zero stdout instead of returning bounded JSON/help or a typed unsupported envelope** — dogfooded 2026-05-27 on `86f45a1` after ROADMAP #807 landed. Each of `./rust/target/debug/claw config --output-format json`, `config help --output-format json`, `settings --output-format json`, `settings help --output-format json`, `status --output-format json`, and `doctor --output-format json` timed out under an 8s outer timeout with stdout `0`; stderr only contained the local deprecated `enabledPlugins` settings warning. **Required fix shape:** keep non-interactive control-plane/info commands out of prompt/provider startup paths; return bounded JSON stdout for supported status/config/help surfaces, or a standard typed JSON error envelope with `error_kind`, non-null `hint`, and `message` for unsupported spellings; add timeout/nonzero-stdout regression coverage for the six repro commands. **Why this matters:** claws and users need first-run diagnostics/config/status surfaces that are safe to call from scripts. Silent hangs make setup triage indistinguishable from provider startup, auth, or model discovery failures. Source: gaebal-gajae 17:00 dogfood probe; rechecked 17:30 after `cargo build --manifest-path rust/Cargo.toml -p rusty-claude-cli` produced `claw --version` Git SHA `23a7de6`, and the same timeout reproduced for current HOME and a clean `HOME=/tmp/claw-clean-home-1730` (clean HOME produced rc 124, stdout 0, stderr 0 for `config`, `status`, and `doctor`). [SCOPE: claw-code]
**Fix applied.** `settings` now routes locally: bare `settings --output-format json` reuses the config JSON envelope for the synthetic `settings` section, and `settings help --output-format json` returns a structured local help envelope. Existing `config`, `status`, and `doctor` JSON routes remain local.
**Verification.** Regression test `settings_json_and_help_json_are_local_808` asserts bounded exit, parseable stdout JSON, empty stderr, no `missing_credentials`, `section:"settings"` for bare settings, and structured help for `settings help --output-format json`.
809. **DONE — Top-level help/version/MCP/plugin JSON spellings hang with zero stdout in trailing `--output-format json` form instead of returning bounded JSON/help or typed unsupported envelopes** — dogfooded 2026-05-27 on rebuilt main `db81598` (`cargo build --manifest-path rust/Cargo.toml -p rusty-claude-cli`; `claw --version` Git SHA `db81598`). `help --output-format json`, `version --output-format json`, `mcp --output-format json`, `mcp help --output-format json`, `plugins --output-format json`, and `plugins help --output-format json` each timed out under an 8s outer timeout with stdout `0`; stderr only contained the local deprecated `enabledPlugins` settings warning. The current parser routes these as bounded local surfaces.
**Fix applied.** `help`, `version`, `mcp`, and `plugins` now resolve to local `CliAction` paths with parsed `CliOutputFormat::Json`; `parse_local_help_action()` maps `mcp` and `plugins` help topics directly to local JSON help envelopes.
**Verification.** Static evidence in `rust/crates/rusty-claude-cli/src/main.rs`: `wants_help`/`wants_version` preserve `CliOutputFormat::Json`, `parse_local_help_action()` maps `mcp` and `plugins` to local help topics, and match arms route `mcp`/`plugins` to local handlers before prompt/provider startup.
810. **DONE — TTY JSON success for `config`/`plugins --output-format json` contaminates stdout with deprecated-settings warnings before the JSON object** — dogfooded 2026-05-27 on rebuilt main `db81598` after #809. Under pseudo-TTY (`script -q -c "./rust/target/debug/claw config --output-format json"` and `plugins --output-format json`), the commands return rc `0` and bounded JSON, but stdout begins with `warning: /home/bellman/.claw/settings.json: field "enabledPlugins" is deprecated ...` before the JSON object (`first_json_index=121`). Parsing succeeds only after manually stripping the warning/prefix; raw stdout is not valid JSON. **Required fix shape:** in JSON mode, keep diagnostics/warnings on stderr or include structured warning fields inside the JSON envelope, but never prepend human warnings to stdout; add regression coverage that raw stdout from JSON commands parses from byte 0 under TTY and non-TTY modes. **Why this matters:** even when the TTY path avoids the hang from #807/#808/#809, claws and scripts still cannot safely `json.loads(stdout)` if configuration warnings are mixed into stdout. Source: gaebal-gajae 20:00 pseudo-TTY dogfood probe. [SCOPE: claw-code]
**Fix applied.** Existing global JSON-mode settings warning suppression now prevents deprecated `enabledPlugins` prose from prefixing JSON stdout, and the regression matrix asserts stdout starts with `{` at byte 0 for representative local JSON surfaces under an isolated deprecated settings fixture.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli global_json_surfaces_suppress_config_deprecation_stderr_810_821_824 -- --nocapture`.
811. **DONE — Previously typed JSON error/list surfaces hang in plain non-TTY trailing `--output-format json` form instead of emitting their JSON envelopes** — dogfooded 2026-05-27 on rebuilt main `b0e94c9` after #810. In plain non-TTY automation, `agents list --bogus --output-format json`, `skills show does-not-exist --output-format json`, `plugins show does-not-exist --output-format json`, `diff --output-format json`, `sessions show does-not-exist --output-format json`, and `resume bogus --output-format json` each timed out under an 8s outer timeout with stdout `0`; stderr only contained the local deprecated `enabledPlugins` settings warning. Current code parses trailing JSON mode before local dispatch and routes JSON abort envelopes to stdout.
**Fix applied.** Trailing `--output-format json` is parsed globally before local command matching, so inventory/error surfaces keep their typed local JSON envelopes instead of falling through to runtime/provider startup. The top-level JSON abort handler routes structured errors to stdout.
**Verification.** Static evidence in `parse_args()` shows global `--output-format` parsing before local command matching; focused tests cover representative affected surfaces including `agents_list_flag_shaped_filter_returns_unknown_option_792`, `plugins_list_flag_shaped_filter_returns_cli_parse_on_stdout_793_817`, `diff_non_git_dir_has_error_kind_and_hint_801`, and resume/export abort-envelope checks around #819/#820/#823.
812. **`claw --output-format json doctor --help` must stay a local help fast path and never fall through into runtime/provider startup** — dogfooded 2026-05-28 04:01 UTC after #701 worktree drift. The reported repro was `cargo run -q --bin claw -- --output-format json doctor --help`, which did not produce local help promptly and had to be killed, while the positive control `cargo run -q --bin claw -- --output-format json --help` emitted valid JSON help. Fresh bounded repro on branch `fix/doctor-help-json-local` did not reproduce the hang on current code (`timeout 5s cargo run -q --bin claw -- --output-format json doctor --help` exited 0 with a `kind:"help"`/`status:"ok"` doctor help envelope), which means the parser fast path is present but under-tested for this exact dogfood surface.
812. **DONE — `claw --output-format json doctor --help` must stay a local help fast path and never fall through into runtime/provider startup** — dogfooded 2026-05-28 04:01 UTC after #701 worktree drift. The reported repro was `cargo run -q --bin claw -- --output-format json doctor --help`, which did not produce local help promptly and had to be killed, while the positive control `cargo run -q --bin claw -- --output-format json --help` emitted valid JSON help. Fresh bounded repro on branch `fix/doctor-help-json-local` did not reproduce the hang on current code (`timeout 5s cargo run -q --bin claw -- --output-format json doctor --help` exited 0 with a `kind:"help"`/`status:"ok"` doctor help envelope), and current regression coverage preserves this fast path.
**Pinpoint.** The guarded path is `rust/crates/rusty-claude-cli/src/main.rs`: global `--output-format json` is parsed before `rest`, `parse_local_help_action()` maps `doctor --help` to `CliAction::HelpTopic { topic: Doctor }`, and `print_help_topic()` must return without calling `run_doctor()`, `LiveCli`, provider setup, session resume, or runtime startup. The previous risk class is help fallthrough: treating `doctor --help` as prompt text or as `doctor` diagnostics would either hit provider/session startup or run checks instead of local help.
@ -7795,13 +7818,13 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
**Verification.** Regression tests: `doctor_help_json_is_local_structured_and_bounded_702` and `doctor_help_text_stays_plaintext_and_local_702` in `rust/crates/rusty-claude-cli/tests/output_format_contract.rs`; focused command repros recorded in `/tmp/claw_doctor_help_json.out` and `/tmp/claw_doctor_help_json.err` during the doctor-help fix branch.
813. **Dogfood probe shell-string loops can fabricate CLI argv failures for JSON help surfaces** — dogfooded 2026-05-28 after #3185 merged. A verification loop used a single shell string variable (`cmd="--output-format json doctor --help"`; then `cargo run -q --bin claw -- $cmd | python3 -c 'json.load(...)'`). The resulting channel transcript showed `unknown option: --output-format json doctor --help` and Python JSON parse stack noise, even though explicit argv invocations on fresh `main` all returned valid JSON: `cargo run -q --bin claw -- --output-format json doctor --help`, `cargo run -q --bin claw -- doctor --help --output-format json`, and `cargo run -q --bin claw -- help doctor --output-format json`. This is a dogfood-harness test-brittleness / event-log opacity gap, not a product parser regression. The log made a probe-construction mistake look like a claw-code failure.
813. **DONE — Dogfood probe shell-string loops can fabricate CLI argv failures for JSON help surfaces** — dogfooded 2026-05-28 after #3185 merged. A verification loop used a single shell string variable (`cmd="--output-format json doctor --help"`; then `cargo run -q --bin claw -- $cmd | python3 -c 'json.load(...)'`). The resulting channel transcript showed `unknown option: --output-format json doctor --help` and Python JSON parse stack noise, even though explicit argv invocations on fresh `main` all returned valid JSON: `cargo run -q --bin claw -- --output-format json doctor --help`, `cargo run -q --bin claw -- doctor --help --output-format json`, and `cargo run -q --bin claw -- help doctor --output-format json`. This is a dogfood-harness test-brittleness / event-log opacity gap, not a product parser regression. The log made a probe-construction mistake look like a claw-code failure.
**Required fix shape.** Add a tiny argv-safe dogfood helper (script or documented recipe) that runs CLI probes as explicit argv arrays rather than interpolated shell strings, captures stdout/stderr separately, and labels probe-construction failures distinctly from product failures. For ad-hoc shell loops, prefer arrays/functions (`run_probe --output-format json doctor --help`) over `$cmd` strings; never pipe unknown stdout directly into a JSON parser without first recording rc/stdout/stderr.
**Fix applied.** Added `scripts/dogfood-probe.py`, an argv-safe helper that accepts a target executable plus arguments after `--`, invokes `subprocess.run` without shell interpolation, records the exact argv vector, captures rc/stdout/stderr as separate fields, and labels `timeout`, `probe_error`, and `product_error` separately. Its optional `--stdout-json-byte0` assertion requires stdout to be parseable JSON starting at byte 0, so JSON parser stack noise is replaced by a structured probe result.
**Acceptance.** Future dogfood reports for argv-sensitive CLI surfaces include the exact argv vector and can distinguish `probe_error` from `product_error`; reproducing the three doctor-help forms through the helper yields three parseable JSON objects from byte 0 without Python parser stack noise. [SCOPE: claw-code dogfood harness]
**Verification.** `python3 -m unittest tests.test_roadmap_helpers.RoadmapHelperTests.test_dogfood_probe_runs_explicit_argv_and_separates_channels tests.test_roadmap_helpers.RoadmapHelperTests.test_dogfood_probe_labels_timeout_separately_from_product_error tests.test_roadmap_helpers.RoadmapHelperTests.test_dogfood_probe_labels_probe_construction_failure tests.test_roadmap_helpers.RoadmapHelperTests.test_dogfood_probe_labels_stdout_json_prefix_failure_as_product_error` passes. The fixture tests cover explicit argv preservation for `--output-format json doctor --help`, separated stdout/stderr capture, timeout classification, construction-error classification, and byte-0 JSON product-error classification. [SCOPE: claw-code dogfood harness]
814. **Plain non-TTY trailing `--output-format json` still times out for inventory/error surfaces after #3186** — dogfooded 2026-05-28 07:00 on fresh `main` `0e6d48d9d` after #3186 merged. Using explicit argv probes with separated stdout/stderr (per #813) reproduced the older #811 class on current main: `cargo run -q --bin claw -- agents list --bogus --output-format json`, `skills show does-not-exist --output-format json`, and `plugins show does-not-exist --output-format json` each hit the 5s timeout (`rc=124`) with `stdout` length 0; stderr contained only compile warnings plus the local deprecated `enabledPlugins` settings warning. This confirms the argv-safe probe harness can distinguish product failure from probe-construction failure, and the product gap remains for trailing JSON flag forms on inventory/error surfaces.
814. **DONE — Plain non-TTY trailing `--output-format json` still times out for inventory/error surfaces after #3186** — dogfooded 2026-05-28 07:00 on fresh `main` `0e6d48d9d` after #3186 merged. Using explicit argv probes with separated stdout/stderr (per #813) reproduced the older #811 class on current main: `cargo run -q --bin claw -- agents list --bogus --output-format json`, `skills show does-not-exist --output-format json`, and `plugins show does-not-exist --output-format json` each hit the 5s timeout (`rc=124`) with `stdout` length 0; stderr contained only compile warnings plus the local deprecated `enabledPlugins` settings warning. Follow-up evidence showed the product path fixed upstream and current tests preserve local JSON error routing.
**Required fix shape.** Parse trailing `--output-format json` for local inventory/error commands before any REPL/provider startup in plain non-TTY mode, matching the already-working leading global form where applicable. Add timeout regression coverage for at least `agents list --bogus --output-format json`, `skills show does-not-exist --output-format json`, and `plugins show does-not-exist --output-format json` asserting nonzero stdout with a single parseable JSON envelope containing `status:"error"`, `error_kind`, and non-null `hint`. Keep deprecation/config warnings out of stdout in JSON mode.
@ -7809,43 +7832,71 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
**Follow-up verification (2026-05-28 07:30 on `main` `09ff1caf4`).** After #3187 merged, rerunning the same three commands with explicit argv showed the product path had already been fixed upstream: `agents list --bogus --output-format json` returned rc 1 with a JSON `unknown_option` envelope, `skills show does-not-exist --output-format json` returned rc 1 with `skill_not_found`, and `plugins show does-not-exist --output-format json` returned rc 1 with `plugin_not_found`. Stdout was nonzero and parseable in all three cases; warnings stayed on stderr. Remaining actionable lesson is process-level: ROADMAP record #814 is preserved as historical repro + verification, not an open product blocker.
815. **`claw --output-format json config` reports the same deprecated-settings warning twice: once structurally in `warnings[]` and once as prose on stderr** — dogfooded 2026-05-28 08:00 on current `main` after #3188. `timeout 5s cargo run -q --bin claw -- --output-format json config >out 2>err` exits 0 with parseable stdout JSON (`kind:"config", action:"list", status:"ok"`) and `warnings.length == 1`, but stderr still contains the same `enabledPlugins is deprecated` warning once. This is better than older stdout contamination, but still duplicates the same diagnostic across two channels in JSON mode. A machine consumer that reads the structured warning also sees an extra prose warning on stderr; a log scraper may count one config issue twice.
**Fix applied.** Current trailing JSON-mode routing reaches local inventory handlers for these argv-safe probes; handled JSON errors emit parseable stdout envelopes and do not require provider credentials/session startup.
**Verification.** Existing follow-up probe evidence records `agents list --bogus --output-format json`, `skills show does-not-exist --output-format json`, and `plugins show does-not-exist --output-format json` returning parseable JSON envelopes; current contract tests additionally cover agents/plugin local parse/error envelopes with stdout JSON.
815. **DONE — `claw --output-format json config` reports the same deprecated-settings warning twice: once structurally in `warnings[]` and once as prose on stderr** — dogfooded 2026-05-28 08:00 on current `main` after #3188. `timeout 5s cargo run -q --bin claw -- --output-format json config >out 2>err` exits 0 with parseable stdout JSON (`kind:"config", action:"list", status:"ok"`) and `warnings.length == 1`, but stderr still contains the same `enabledPlugins is deprecated` warning once. Current config JSON keeps that diagnostic structured without duplicating it on stderr.
**Required fix shape.** In JSON mode for config/list surfaces that already include `warnings[]`, suppress eager prose emission of the same config warning on stderr or mark it as already collected. Text mode should keep the human stderr warning. Add regression coverage asserting `claw --output-format json config` returns exactly one structured warning and zero duplicate `enabledPlugins` prose lines on stderr.
**Acceptance.** With a deprecated `enabledPlugins` key present, `claw --output-format json config` exits 0, stdout parses from byte 0 and includes `warnings[]`, and stderr has no duplicate deprecation warning for the same file/key. [SCOPE: claw-code]
816. **JSON-mode local/list surfaces still leak deprecated config prose warnings on stderr outside `config`** — dogfooded 2026-05-28 09:30 on `main` `89e7f415a` after #3190. `./target/debug/claw --output-format json config` is now fixed (`rc=0`, parseable stdout, `warnings[]`, stderr empty), but sibling JSON surfaces still emit the same app-level config warning to stderr when `~/.claw/settings.json` contains deprecated `enabledPlugins`: `plugins list` (`kind:"plugin"`), `mcp list` (`kind:"mcp"`), and `doctor` (`kind:"doctor"`) all return parseable JSON with `rc=0` while stderr contains `enabledPlugins is deprecated`. `skills list` and `version` stay clean. This leaves machine consumers with a global JSON-mode cleanliness gap even after the config-specific duplicate was fixed.
**Fix applied.** JSON-mode config rendering collects deprecated settings diagnostics into `warnings[]` and suppresses the duplicate prose `enabledPlugins` warning on stderr; text mode preserves the human stderr warning.
**Verification.** `config_json_reports_deprecations_structurally_without_stderr_duplicate_815` asserts a deprecated `enabledPlugins` fixture appears in JSON `warnings[]`, does not appear on stderr for `--output-format json config`, and still appears on stderr for text `config`.
816. **DONE — JSON-mode local/list surfaces still leak deprecated config prose warnings on stderr outside `config`** — dogfooded 2026-05-28 09:30 on `main` `89e7f415a` after #3190. `./target/debug/claw --output-format json config` was fixed, but sibling JSON surfaces still emitted the same app-level config warning to stderr when `~/.claw/settings.json` contained deprecated `enabledPlugins`: `plugins list`, `mcp list`, and `doctor`. Current global JSON-mode suppression covers these local/list surfaces.
**Required fix shape.** Treat JSON output mode as a global app-level diagnostic routing contract: local/list/status surfaces that successfully return structured JSON should not write config deprecation prose to stderr. Either collect those warnings into each relevant JSON envelope where a warnings field exists, or suppress config-warning emission during JSON-mode preloading/default resolution for surfaces that cannot represent warnings yet. Preserve human stderr warnings in text mode.
**Acceptance.** With deprecated `enabledPlugins` present, `claw --output-format json plugins list`, `claw --output-format json mcp list`, and `claw --output-format json doctor` exit 0, stdout parses from byte 0, and stderr contains zero `enabledPlugins is deprecated` app-level warning lines. Text mode still prints the warning. [SCOPE: claw-code]
817. **`claw --output-format json plugins list --` writes its JSON error envelope to stderr while sibling local inventory commands use stdout** — dogfooded 2026-05-28 12:30 on `main` `9494e3c26`. Trailing bare `--` is a useful parser edge because automation sometimes injects delimiter sentinels. `agents list --` and `skills list --` return rc 1 with parseable JSON on stdout and empty stderr. `mcp list --` also returns a parseable JSON error on stdout. `config --` returns rc 0 with a structured config error on stdout. But `plugins list --` returns rc 1, stdout empty, and writes the JSON error envelope to stderr: `{"action":"abort","error":"unknown option for `claw plugins list`: --", ...}`. This is machine-readable, but channel-inconsistent and surprising for JSON-mode consumers that read stdout for command payloads.
**Fix applied.** JSON-mode config-warning suppression is applied globally before local JSON surfaces load settings, covering sibling list/status/diagnostic commands while preserving text-mode stderr warnings.
**Verification.** `global_json_surfaces_suppress_config_deprecation_stderr_810_821_824` covers `plugins list`, `mcp list`, `doctor`, and additional JSON surfaces under a deprecated `enabledPlugins` fixture with empty stderr; `local_text_surface_preserves_config_deprecation_stderr_816` verifies text mode still emits the warning.
817. **DONE — `claw --output-format json plugins list --` writes its JSON error envelope to stderr while sibling local inventory commands use stdout** — dogfooded 2026-05-28 12:30 on `main` `9494e3c26`. Trailing bare `--` is a useful parser edge because automation sometimes injects delimiter sentinels. `plugins list --` returned rc 1, stdout empty, and wrote the JSON error envelope to stderr. Current plugin list parse-error routing matches sibling JSON inventory/local surfaces.
**Required fix shape.** Align `plugins list` parse-error routing with the other JSON inventory/local surfaces: in JSON mode, print the structured CLI error envelope to stdout and keep stderr empty for this handled parse error. Preserve text-mode stderr behavior. Add regression coverage for `claw --output-format json plugins list --` asserting rc 1, stdout parseable JSON with `error_kind:"cli_parse"`, and empty stderr.
**Acceptance.** `claw --output-format json plugins list --` exits 1, stdout parses from byte 0 as the existing JSON error envelope, stderr is empty, and text mode still reports the parse error to stderr. [SCOPE: claw-code]
818. **`AGENTS.md` and `.claude/CLAUDE.md` silently omitted from instruction file cascade** — dogfooded 2026-05-29 08:00. When a repo contains `AGENTS.md` (OpenAI Codex / multi-agent convention) or `.claude/CLAUDE.md` (scoped Claude Code convention), claw-code does not load either file as part of the instruction/context cascade on startup. Users following either convention discover this only by noticing their persona/context instructions have no effect — no warning, no missing-file diagnostic, no documentation note. This is a friction gap for any team migrating to or simultaneously using claw-code alongside Claude Code or Codex workflows, since the two most common non-CLAUDE.md instruction files are silently ignored.
**Fix applied.** The `plugins list` flag/filter guard emits handled JSON parse errors directly to stdout in JSON mode while keeping text-mode parse errors on stderr.
**Verification.** `plugins_list_trailing_dash_json_error_uses_stdout_817`, `plugins_list_trailing_dash_text_error_stays_on_stderr_817`, and `plugins_list_flag_shaped_filter_returns_cli_parse_on_stdout_793_817` cover rc 1, stdout JSON with `error_kind:"cli_parse"`, empty stderr in JSON mode, and preserved text stderr behavior.
818. **DONE — `AGENTS.md` and `.claude/CLAUDE.md` silently omitted from instruction file cascade** — dogfooded 2026-05-29 08:00. When a repo contains `AGENTS.md` (OpenAI Codex / multi-agent convention) or `.claude/CLAUDE.md` (scoped Claude Code convention), claw-code does not load either file as part of the instruction/context cascade on startup. Users following either convention discover this only by noticing their persona/context instructions have no effect — no warning, no missing-file diagnostic, no documentation note. This is a friction gap for any team migrating to or simultaneously using claw-code alongside Claude Code or Codex workflows, since the two most common non-CLAUDE.md instruction files are silently ignored.
**Required fix shape.** Add `AGENTS.md` (project root) and `.claude/CLAUDE.md` (`.claude/` subdirectory) to the instruction file cascade that already loads `CLAUDE.md`. Apply the same merge-and-precedence semantics as existing instruction files. Log a debug trace (not stderr noise) when either file is loaded. Add test coverage: a fixture repo with `AGENTS.md` only, `.claude/CLAUDE.md` only, and both present alongside `CLAUDE.md` should each have the relevant content visible in the resolved instruction context.
**Acceptance.** `claw` launched in a repo containing `AGENTS.md` or `.claude/CLAUDE.md` loads those files into the instruction context. No warning emitted for absent optional files. Existing `CLAUDE.md`-only repos unaffected. PR #3195. [SCOPE: claw-code]
819. **`claw --output-format json export --session <missing>` writes JSON error envelope to stderr, stdout empty** — dogfooded 2026-05-29 09:30 on `main` `37a9a543`. `claw --output-format json export --session does-not-exist` exits rc=1 with stdout length 0 and the full JSON error envelope on stderr: `{"action":"abort","error":"session not found: does-not-exist","error_kind":"session_not_found",...}`. This is the same channel-routing inconsistency class as #817 (plugins list trailing-dash, fixed in #3194): handled errors in JSON mode should go to stdout, not stderr, so machine consumers can parse the envelope from stdout byte 0 regardless of which surface triggered the error.
**Fix applied.** The instruction cascade now loads `AGENTS.md` and `.claude/CLAUDE.md` from the same ancestor walk that already loads `CLAUDE.md`, `CLAUDE.local.md`, `.claw/CLAUDE.md`, and `.claw/instructions.md`, preserving the existing merge/dedupe semantics and avoiding warnings for absent optional files.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p runtime discovers_agents_markdown_instruction_file -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p runtime discovers_scoped_dot_claude_claude_markdown_instruction_file -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p runtime discovers_claude_agents_and_dot_claude_instruction_files_together -- --nocapture`.
819. **DONE — `claw --output-format json export --session <missing>` writes JSON error envelope to stderr, stdout empty** — dogfooded 2026-05-29 09:30 on `main` `37a9a543`. `claw --output-format json export --session does-not-exist` exits rc=1 with stdout length 0 and the full JSON error envelope on stderr: `{"action":"abort","error":"session not found: does-not-exist","error_kind":"session_not_found",...}`. This is the same channel-routing inconsistency class as #817 (plugins list trailing-dash, fixed in #3194): handled errors in JSON mode should go to stdout, not stderr, so machine consumers can parse the envelope from stdout byte 0 regardless of which surface triggered the error.
**Required fix shape.** Align `export --session <missing>` error routing with the inventory surfaces fixed in #817: in JSON mode, write the `session_not_found` error envelope to stdout (rc=1) and keep stderr empty. Preserve text-mode behavior (stderr message). Add regression coverage asserting rc=1, stdout parseable JSON with `error_kind:"session_not_found"`, and empty stderr.
**Acceptance.** `claw --output-format json export --session does-not-exist` exits 1, stdout contains the JSON error envelope from byte 0, stderr is empty. Text mode still prints the error to stderr. [SCOPE: claw-code]
820. **`interactive_only` error class always routes JSON envelope to stderr (stdout empty)** — dogfooded 2026-05-29 10:00 on `main` `efe59c22`. All `interactive_only` errors share the same routing gap as #819 (`export --session <missing>`): `claw --output-format json session list`, `session switch <id>`, `session delete <id>`, and `session fork <id>` each exit rc=1, stdout empty, JSON envelope on stderr. The envelope is well-formed (`error_kind:"interactive_only"`, `hint:...`, `action:"abort"`) but the channel is wrong for JSON mode. Any surface that returns `interactive_only` is affected; these are all the `claw session` subcommands. This is the same root cause as #817 (plugins) and #819 (export): the top-level error handler writes `Err(...)` to stderr instead of routing to stdout when `--output-format json` is active.
**Fix applied.** The JSON abort handler now emits export/session-not-found envelopes on stdout in JSON mode while preserving text-mode stderr behavior. The explicit missing-session regression asserts rc 1, `error_kind:"session_not_found"`, abort envelope, and empty stderr.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli export_missing_session_json_error_uses_stdout_819 -- --nocapture`.
820. **DONE — `interactive_only` error class always routes JSON envelope to stderr (stdout empty)** — dogfooded 2026-05-29 10:00 on `main` `efe59c22`. All `interactive_only` errors shared the same routing gap as #819: JSON envelopes were well-formed but written to stderr. Current JSON abort handling routes `interactive_only` envelopes to stdout.
**Required fix shape.** In the top-level error handler (or the `interactive_only` classifier arm in `main.rs`), detect JSON output mode and write the structured error envelope to stdout (rc=1) instead of stderr. Scope the fix to the `interactive_only` error_kind so all affected surfaces are repaired in one pass. Add regression coverage for at least `claw --output-format json session list` asserting rc=1, stdout parseable JSON with `error_kind:"interactive_only"`, stderr empty.
**Acceptance.** All `claw --output-format json session <subcommand>` invocations exit 1 with the JSON envelope on stdout and empty stderr. Text mode continues to print the error to stderr. [SCOPE: claw-code]
821. **`status`, `sandbox`, and `system-prompt` in JSON mode still emit config deprecation warning to stderr** — dogfooded 2026-05-29 10:30 on `main` `42aff269`. After #816 fixed config deprecation stderr leakage for `plugins list`, `mcp list`, `doctor`, and `config`, three JSON-mode surfaces continue to emit the `enabledPlugins is deprecated` prose warning to stderr: `claw --output-format json status` (122 bytes stderr), `claw --output-format json sandbox` (122 bytes stderr), `claw --output-format json system-prompt` (122 bytes stderr). These surfaces return well-formed JSON on stdout (rc=0) but leak the config warning to stderr, leaving machine consumers with mixed-channel output. `version`, `acp`, `agents`, `skills`, `mcp`, `plugins`, and `doctor` all have clean stderr after #816.
**Fix applied.** The JSON abort handler now classifies `interactive_only` and prints the structured envelope to stdout in JSON mode; text-mode errors still use stderr.
**Verification.** Session/abort contract assertions in `output_format_contract.rs` around #819/#820/#823 require JSON-mode interactive-only failures to provide a stdout JSON envelope and no JSON envelope on stderr.
821. **DONE — `status`, `sandbox`, and `system-prompt` in JSON mode still emit config deprecation warning to stderr** — dogfooded 2026-05-29 10:30 on `main` `42aff269`. After #816 fixed config deprecation stderr leakage for `plugins list`, `mcp list`, `doctor`, and `config`, three JSON-mode surfaces continue to emit the `enabledPlugins is deprecated` prose warning to stderr: `claw --output-format json status` (122 bytes stderr), `claw --output-format json sandbox` (122 bytes stderr), `claw --output-format json system-prompt` (122 bytes stderr). These surfaces return well-formed JSON on stdout (rc=0) but leak the config warning to stderr, leaving machine consumers with mixed-channel output. `version`, `acp`, `agents`, `skills`, `mcp`, `plugins`, and `doctor` all have clean stderr after #816.
**Required fix shape.** Extend the JSON-mode config-warning suppression applied in #816 to cover `status`, `sandbox`, and `system-prompt`. The fix should apply globally: any JSON-mode surface that completes successfully should not emit config deprecation prose to stderr. Text mode should keep the human stderr warning.
@ -7853,56 +7904,110 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
**Follow-up (2026-05-29 12:00, `main` `3dbb35c3`).** Broader sweep confirms additional surfaces with the same 122-byte stderr leak in JSON mode: `--resume latest /config` (all subforms: bare, `env`, `hooks`, `model`, `plugins`) and `--resume latest /providers` (doctor alias). The fix must apply to all config-loading paths, not just the three originally documented surfaces. The suppression guard should fire at the settings-load level so any JSON-mode invocation benefits without per-surface patching.
822. **Unknown top-level subcommand falls through to REPL/provider startup instead of returning a `command_not_found` error** — dogfooded 2026-05-29 11:00 on `main` `69b59079`. `claw --output-format json foobar` does not return a structured `command_not_found` error; instead it falls through to the interactive/API path and hits `missing_credentials` (rc=1, stderr: `{"error_kind":"missing_credentials",...}`). Two gaps in one: (1) the unrecognized command word is silently treated as a prompt/text argument, not flagged as unknown, so the user gets a misleading "no credentials" error instead of "command not found"; (2) the resulting error goes to stderr. This makes automation scripts that probe for command availability impossible to distinguish from auth failures.
**Fix applied.** The warning-suppression regression matrix now covers `status`, `sandbox`, and `system-prompt` with deprecated `enabledPlugins` settings, asserting successful JSON, stdout JSON from byte 0, and empty stderr while preserving the existing text-mode warning assertion.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli global_json_surfaces_suppress_config_deprecation_stderr_810_821_824 -- --nocapture`; text-mode preservation remains covered by `local_text_surface_preserves_config_deprecation_stderr_816`.
822. **DONE — Unknown top-level subcommand falls through to REPL/provider startup instead of returning a `command_not_found` error** — dogfooded 2026-05-29 11:00 on `main` `69b59079`. `claw --output-format json foobar` returned `missing_credentials` after prompt/provider fallthrough instead of a structured `command_not_found`. Current command-shaped unknown tokens are rejected before provider startup.
**Required fix shape.** Before falling through to the REPL/prompt path, check whether the first positional arg matches any known subcommand. If not, return a typed error: `{"error_kind":"command_not_found","message":"unknown command: foobar","hint":"Run `claw --help` for available commands.","status":"error"}` on stdout (JSON mode, rc=1) or stderr (text mode). This mirrors the behavior of `--bogus-flag` (which correctly returns `cli_parse`) but for unknown positional commands.
**Acceptance.** `claw --output-format json foobar` exits 1, stdout contains JSON with `error_kind:"command_not_found"`, stderr empty. Text mode prints the error to stderr. No provider startup attempted. [SCOPE: claw-code]
823. **`claw --output-format json prompt` with missing/empty prompt text routes JSON error to stderr (stdout empty)** — dogfooded 2026-05-29 11:30 on `main` `3a76c4f4`. `claw --output-format json prompt` (no text) and `claw --output-format json prompt ""` (empty string) both exit rc=1, stdout empty, and write `{"error_kind":"missing_prompt","action":"abort",...}` to stderr. The envelope is well-formed but channel-inconsistent: JSON mode machine consumers reading stdout for command results get empty stdout and must check stderr to detect the error. This is the same class as #819 (export session-not-found) and #820 (interactive_only / session subcommands), and the same root cause: the top-level abort handler writes to stderr regardless of output-format mode.
**Fix applied.** Unknown command-shaped top-level tokens now trip the pre-provider `command_not_found:` guard, and the classifier maps that prefix to `error_kind:"command_not_found"` with JSON-mode output on stdout.
**Verification.** `unknown_subcommand_json_emits_command_not_found`, `unknown_subcommand_text_emits_command_not_found_on_stderr`, `unknown_subcommand_typo_with_suggestions_json_emits_command_not_found`, and updated `unknown_subcommand_returns_typed_kind_785` cover JSON stdout, text stderr, suggestion hints, and no `missing_credentials` fallthrough.
823. **DONE — `claw --output-format json prompt` with missing/empty prompt text routes JSON errors to stdout with empty stderr** — dogfooded 2026-05-29 11:30 on `main` `3a76c4f4`. `claw --output-format json prompt` (no text) and `claw --output-format json prompt ""` (empty string) both exited rc=1, stdout empty, and wrote `{"error_kind":"missing_prompt","action":"abort",...}` to stderr. The envelope was well-formed but channel-inconsistent: JSON mode machine consumers reading stdout for command results got empty stdout and had to check stderr to detect the error. This is the same class as #819 (export session-not-found) and #820 (interactive_only / session subcommands), and the same root cause: the top-level abort handler wrote to stderr regardless of output-format mode.
**Required fix shape.** In JSON mode, route `missing_prompt` abort errors to stdout (rc=1) and keep stderr empty. This is the same fix pattern as #817/#819/#820: detect JSON output mode in the abort handler and redirect the structured envelope to stdout. Add regression coverage for `claw --output-format json prompt` (no arg) and `claw --output-format json prompt ""` asserting rc=1, stdout parseable JSON with `error_kind:"missing_prompt"`, stderr empty.
**Acceptance.** Both invocations exit 1 with JSON envelope on stdout and empty stderr. Text mode still prints to stderr. [SCOPE: claw-code]
824. **Global settings-load deprecation warning still leaks to stderr in JSON mode for `status`, `sandbox`, `system-prompt`, `skills`, `mcp`, `agents` surfaces** — dogfooded 2026-05-29 13:30 on `main` `b4b1ba10`. After #816 and #821 (doc), the `enabledPlugins is deprecated` config warning still reaches stderr on every JSON-mode surface that loads settings: `claw --output-format json status`, `sandbox`, `system-prompt`, `mcp list`, `skills list`, `agents list` all emit `warning: /path/.claw/settings.json: field "enabledPlugins" is deprecated (line 2)...` to stderr. Root cause: `emit_config_warning_once()` in `runtime/src/config.rs` always uses `eprintln!` with no output-format awareness. The `config` surface avoids the duplicate by collecting warnings into a structured `warnings[]` field, but all other surfaces hit the raw `eprintln!` path.
**Fix applied.** The top-level JSON abort handler now emits structured error envelopes to stdout, so both missing `prompt` text paths keep the existing `missing_prompt` classification while preserving empty stderr. Regression coverage now asserts `claw --output-format json prompt` and `claw --output-format json prompt ""` exit rc=1, parse stdout JSON with `error_kind:"missing_prompt"`, and leave stderr empty.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli prompt_no_arg_json_error_kind_750 -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli prompt_empty_arg_json_stdout_missing_prompt_823 -- --nocapture`.
824. **DONE — Global settings-load deprecation warning still leaks to stderr in JSON mode for `status`, `sandbox`, `system-prompt`, `skills`, `mcp`, `agents` surfaces** — dogfooded 2026-05-29 13:30 on `main` `b4b1ba10`. After #816 and #821 (doc), the `enabledPlugins is deprecated` config warning still reaches stderr on every JSON-mode surface that loads settings: `claw --output-format json status`, `sandbox`, `system-prompt`, `mcp list`, `skills list`, `agents list` all emit `warning: /path/.claw/settings.json: field "enabledPlugins" is deprecated (line 2)...` to stderr. Root cause: `emit_config_warning_once()` in `runtime/src/config.rs` always uses `eprintln!` with no output-format awareness. The `config` surface avoids the duplicate by collecting warnings into a structured `warnings[]` field, but all other surfaces hit the raw `eprintln!` path.
**Required fix shape.** Add a global `SUPPRESS_CONFIG_WARNINGS_STDERR: AtomicBool` flag in `config.rs`. Set it to `true` immediately when `--output-format json` is detected in `main.rs` (before any settings load). Gate `emit_config_warning_once` on that flag. Text-mode invocations continue to print to stderr; JSON-mode invocations silently suppress the prose warning (warnings remain available via structured `config` output).
**Acceptance.** With deprecated `enabledPlugins` in `~/.claw/settings.json`, all JSON-mode surfaces (`status`, `sandbox`, `system-prompt`, `mcp list`, `skills list`, `agents list`, plus all `--resume /config*` forms) exit with empty stderr. Text-mode output is unchanged. [SCOPE: claw-code]
825. **Unknown single-word subcommand falls through to provider startup and surfaces `missing_credentials` instead of `command_not_found`** — dogfooded 2026-05-29 14:00 on `main` `de7edd5b`. `claw foobar` (and `claw --output-format json foobar`) hit the `looks_like_subcommand_typo` guard, which checked for close fuzzy matches but fell through silently when no suggestions matched. The fallthrough routed to `CliAction::Prompt`, triggering Anthropic provider startup and a misleading `missing_credentials` error (or burning API tokens if credentials were present). The `command_not_found` error kind existed in the registry but was never emitted by this path.
**Fix applied.** The focused matrix now exercises the global settings-load path for `status`, `sandbox`, `system-prompt`, `mcp list`, `skills list`, `agents list`, and a generated-session resume `/config` invocation under deprecated `enabledPlugins`, requiring empty stderr for every JSON-mode surface.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli global_json_surfaces_suppress_config_deprecation_stderr_810_821_824 -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli local_text_surface_preserves_config_deprecation_stderr_816 -- --nocapture`.
825. **DONE — Unknown single-word subcommand falls through to provider startup and surfaces `missing_credentials` instead of `command_not_found`** — dogfooded 2026-05-29 14:00 on `main` `de7edd5b`. `claw foobar` (and `claw --output-format json foobar`) hit the `looks_like_subcommand_typo` guard, then fell through to provider startup when no suggestions matched. Current code emits `command_not_found` for this path.
**Required fix shape.** When `looks_like_subcommand_typo` fires on a single-word positional arg with no close suggestions, emit `command_not_found:` rather than falling through. Add `command_not_found:` prefix classifier to `classify_error_kind`. Result: clean `{"error_kind":"command_not_found",...}` envelope on stdout (JSON mode), error on stderr (text mode), zero provider startup.
**Acceptance.** `claw --output-format json foobar` exits 1, stdout `error_kind:"command_not_found"`, stderr empty, no Anthropic call. Typo with suggestions (`claw statuz`) also gets `command_not_found` plus `hint` with suggestions. [SCOPE: claw-code]
826. **Multi-word unknown subcommand still falls through to `missing_credentials`** — dogfooded 2026-05-29 14:38 on `main` `70d64be0`. After #825 fixed single-word unknown subcommands, multi-word invocations (`claw foobar baz`) are still undetected: the `looks_like_subcommand_typo` guard only fires when `rest.len() == 1`. When there are two or more positional args, the first word is treated as a prompt and all args join into a prompt string → provider startup → `missing_credentials`. Same misleading-error class as #825 but for multi-word cases.
**Fix applied.** The `looks_like_subcommand_typo` path emits `command_not_found:` for unknown single-word command-shaped tokens even when there are no close suggestions, and typo suggestions are unified under the same typed error kind.
**Verification.** `unknown_subcommand_json_emits_command_not_found`, `unknown_subcommand_text_emits_command_not_found_on_stderr`, `unknown_subcommand_typo_with_suggestions_json_emits_command_not_found`, and classifier coverage in `classify_error_kind_returns_correct_discriminants` verify `command_not_found` instead of `missing_credentials` before provider startup.
826. **DONE — Multi-word unknown subcommand still falls through to `missing_credentials`** — dogfooded 2026-05-29 14:38 on `main` `70d64be0`. After #825 fixed single-word unknown subcommands, multi-word invocations (`claw foobar baz`) are still undetected: the `looks_like_subcommand_typo` guard only fires when `rest.len() == 1`. When there are two or more positional args, the first word is treated as a prompt and all args join into a prompt string → provider startup → `missing_credentials`. Same misleading-error class as #825 but for multi-word cases.
**Required fix shape.** Extend the command-not-found guard to also fire when `rest.len() > 1` and `rest[0]` passes `looks_like_subcommand_typo` but does not match any known subcommand. The multi-arg case should also emit `command_not_found` — with a note that if literal multi-word prompt was intended, use `claw prompt <text>` or `echo 'text' | claw`.
**Acceptance.** `claw --output-format json foobar baz` exits 1, stdout `error_kind:"command_not_found"`, stderr empty, no provider startup. `claw "write a haiku"` (valid prompt passthrough) is unaffected. [SCOPE: claw-code]
827. **`--resume <session> /unknown-slash-command` emits `error_kind:"unknown"` instead of a typed kind** — dogfooded 2026-05-29 14:58 on `main` `d47b0151`. `claw --resume latest --output-format json /bogus` exits rc=2 with `{"error_kind":"unknown",...}` on stdout (correct channel, correct rc) but the opaque `"unknown"` kind gives machine consumers no way to distinguish "unrecognized slash command" from other error classes. The error has a useful `hint` with suggestions, but the `error_kind` field is `"unknown"` across all unrecognized resume slash commands.
**Fix applied.** JSON-mode command-shaped unknown subcommands now emit `command_not_found:` before provider startup even when additional tokens follow. Text-mode multi-word prompt shorthand remains available, but JSON automation no longer turns `claw --output-format json foobar baz` into a credential-gated prompt request.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli multi_word_unknown_subcommand_json_emits_command_not_found_826 -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli unknown_subcommand_json_emits_command_not_found -- --nocapture`; direct probe `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --output-format json foobar baz`.
827. **DONE — `--resume <session> /unknown-slash-command` emits `error_kind:"unknown"` instead of a typed kind** — dogfooded 2026-05-29 14:58 on `main` `d47b0151`. `claw --resume latest --output-format json /bogus` exits rc=2 with `{"error_kind":"unknown",...}` on stdout (correct channel, correct rc) but the opaque `"unknown"` kind gives machine consumers no way to distinguish "unrecognized slash command" from other error classes. The error has a useful `hint` with suggestions, but the `error_kind` field is `"unknown"` across all unrecognized resume slash commands.
**Required fix shape.** Introduce a typed `error_kind` for unrecognized slash commands (e.g. `unknown_slash_command` or `command_not_found`). Update the JSON emit in the `--resume` unknown-command handler to use the typed kind. Add regression coverage asserting the typed kind.
**Acceptance.** `claw --resume latest --output-format json /bogus` exits rc=2, stdout `error_kind:"unknown_slash_command"` (or similar typed constant), stderr empty. [SCOPE: claw-code]
828. **`/approve` and `/deny` outside REPL emit `unknown_slash_command` instead of `interactive_only`** — dogfooded 2026-05-29 16:05 on `main` `9d05573f`. `claw --output-format json /approve` exited rc=1 with `error_kind:"unknown_slash_command"` — these are valid REPL-only slash commands but are not `SlashCommand` enum variants, so they fell through to `format_unknown_direct_slash_command`. Machine consumers saw the wrong error class.
**Fix applied.** Unknown direct and resumed slash commands now use the classifier-friendly `unknown_slash_command:` prefix, so the JSON resume-command error path emits `error_kind:"unknown_slash_command"` instead of falling back to `unknown`. Direct slash command coverage and the new resumed-session regression both assert stdout JSON and empty stderr.
**Verification.** `cargo fmt --manifest-path rust/Cargo.toml --all -- --check`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli direct_unknown_slash_command_emits_typed_error_kind -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli resume_unknown_slash_command_emits_typed_error_kind_827 -- --nocapture`; direct probe `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --resume .claw/sessions/491726c4e6bde42d/session-1778832949219-0.jsonl --output-format json /boguscommand`.
828. **DONE — `/approve` and `/deny` outside REPL emit `unknown_slash_command` instead of `interactive_only`** — dogfooded 2026-05-29 16:05 on `main` `9d05573f`. `claw --output-format json /approve` exited rc=1 with `error_kind:"unknown_slash_command"` — these are valid REPL-only slash commands but are not `SlashCommand` enum variants, so they fell through to `format_unknown_direct_slash_command`. Machine consumers saw the wrong error class.
**Fix applied.** `SlashCommand::Unknown` arm now special-cases `approve | yes | y | deny | no | n` and emits `interactive_only:` prefix before falling through to `format_unknown_direct_slash_command`. Both `error_kind` and hint are correct.
**Acceptance.** `claw --output-format json /approve` exits rc=1, stdout `error_kind:"interactive_only"`, stderr empty. [SCOPE: claw-code]
829. **`interactive_only` hint incorrectly suggests `--resume` for commands that are not resume-safe** — dogfooded 2026-05-29 16:40 on `main` `187aebd7`. `claw --output-format json /commit` hint says `"use claw --resume SESSION.jsonl /commit"` but `/commit` is not resume-safe (no `[resume]` marker in `/help`). Same for `/pr`, `/issue`, `/bughunter`, `/ultraplan`. The generic `interactive_only` hint template does not distinguish resume-safe from live-REPL-only commands, so it always suggests `--resume` regardless. Users who follow the hint will get `interactive_only` again.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli approve_deny_outside_repl_emits_interactive_only -- --nocapture`; direct probes `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --output-format json /approve` and `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --output-format json /deny`.
829. **DONE — `interactive_only` hint incorrectly suggests `--resume` for commands that are not resume-safe** — dogfooded 2026-05-29 16:40 on `main` `187aebd7`. `claw --output-format json /commit` hint says `"use claw --resume SESSION.jsonl /commit"` but `/commit` is not resume-safe (no `[resume]` marker in `/help`). Same for `/pr`, `/issue`, `/bughunter`, `/ultraplan`. The generic `interactive_only` hint template does not distinguish resume-safe from live-REPL-only commands, so it always suggests `--resume` regardless. Users who follow the hint will get `interactive_only` again.
**Required fix shape.** The generic `interactive_only:` message formatter (line ~1745 in `main.rs`) currently always appends `or use claw --resume SESSION.jsonl {command_name}`. This should be conditioned on whether the slash command appears in the resume-safe command list. Non-resume-safe interactive commands should only say `Start claw and run it there.`
**Acceptance.** `claw --output-format json /commit` hint does NOT mention `--resume`. `claw --output-format json /status` (resume-safe) hint still mentions `--resume`. [SCOPE: claw-code]
830. **`claw mcp show` (missing server name arg) emits `error_kind:"unknown_mcp_action"` instead of `missing_argument`** — dogfooded 2026-05-29 17:00 on `main` `ac5b19de`. `claw --output-format json mcp show` (no server name supplied) exits with `error_kind:"unknown_mcp_action"`. However `show` IS a known MCP action — the error is a missing required argument (server name), not an unknown action. Machine consumers inspecting `error_kind` cannot distinguish "I don't know this action" from "I know this action but a required arg is missing".
**Fix applied.** The direct slash-command guidance now consults the shared `resume_supported` command metadata before adding a `--resume` remediation. Non-resume-safe commands such as `/commit`, `/pr`, `/issue`, `/bughunter`, and `/ultraplan` only point users to the live REPL, while resume-safe commands keep the `--resume SESSION.jsonl` hint.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli non_resume_safe_interactive_only_hint_omits_resume_suggestion -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli resume_safe_interactive_only_hint_includes_resume_suggestion -- --nocapture`; direct probes `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --output-format json /commit` and `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --output-format json /status`.
830. **DONE — `claw mcp show` (missing server name arg) emits `error_kind:"unknown_mcp_action"` instead of `missing_argument`** — dogfooded 2026-05-29 17:00 on `main` `ac5b19de`. `claw --output-format json mcp show` (no server name supplied) exits with `error_kind:"unknown_mcp_action"`. However `show` IS a known MCP action — the error is a missing required argument (server name), not an unknown action. Machine consumers inspecting `error_kind` cannot distinguish "I don't know this action" from "I know this action but a required arg is missing".
**Required fix shape.** The MCP subcommand parser should detect `show` with no following token and emit `missing_argument: mcp show requires a server name.\nUsage: claw mcp show <server>` with a distinct `error_kind`. Update the classifier arm to return `missing_argument` for this prefix.
**Acceptance.** `claw --output-format json mcp show` exits rc=1, stdout `error_kind:"missing_argument"`, stderr empty. Hint contains usage example. [SCOPE: claw-code]
**Fix applied.** `mcp show` without a server name now emits a typed `missing_argument` response instead of reusing `unknown_mcp_action`. The direct JSON path returns `{kind:"mcp", action:"show", status:"error", error_kind:"missing_argument"}` with a usage hint on stdout and an empty stderr stream; the slash-command parser also classifies `/mcp show` as `missing_argument` via the shared error-kind classifier.
**Verification.** `cargo fmt --manifest-path rust/Cargo.toml --all -- --check`; `cargo test --manifest-path rust/Cargo.toml -p commands mcp -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli mcp_show_missing_server_name_returns_missing_argument_830 -- --nocapture`; `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli classify_error_kind_returns_correct_discriminants -- --nocapture`; direct probe `cargo run --manifest-path rust/Cargo.toml -q -p rusty-claude-cli -- --output-format json mcp show`.
831. **DONE — Direct resume-safe slash commands route to `interactive_only` instead of local JSON actions** — PR #3205 showed that direct slash invocations such as `claw --output-format json /status`, `/diff`, `/version`, `/doctor`, and `/sandbox` were parsed successfully as resume-safe slash commands, but the direct CLI parser still fell through to generic `interactive_only` guidance instead of dispatching to the same pure-local `CliAction` handlers as the bare subcommands.
**Required fix shape.** In the direct slash CLI parser, map resume-safe local slash command variants to the corresponding local `CliAction` variants. Preserve non-resume-safe slash command guidance from #829.
**Acceptance.** `claw --output-format json /version`, `/sandbox`, `/diff`, and `/status` succeed with their expected local JSON `kind` and environment-dependent local `status`, stdout JSON, and empty stderr; non-resume-safe slash commands still emit `interactive_only` without bogus local routing. [SCOPE: claw-code]
**Fix applied.** `parse_direct_slash_cli_action` now routes `/status`, `/diff`, `/version`, `/doctor`, and `/sandbox` directly to the same local `CliAction` variants as `status`, `diff`, `version`, `doctor`, and `sandbox`. The generic `interactive_only` branch remains the fallback for valid but live-REPL-only slash commands, preserving the #829 non-resume-safe hint behavior.
**Verification.** `cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli direct_resume_safe_slash_commands_route_to_local_json_actions_831 -- --nocapture`.
832. **DONE — roadmap-next-id helper missing explicit ROADMAP path behavior lacked regression coverage** — follow-up to #725 and PR #3117 after dogfood showed `scripts/roadmap-next-id.sh /tmp/nonexistent-roadmap` already failed correctly but the helper tests did not pin that behavior. The missing-path case is important because docs-only PRs can otherwise regress back to printing a next id for an absent explicit file.
**Fix applied.** Added `test_roadmap_next_id_fails_when_explicit_roadmap_path_is_missing`, proving an explicit missing ROADMAP path exits nonzero, keeps stdout empty, and reports both `ROADMAP not found` and the requested path on stderr. Added `tests/__init__.py` so `python3 -m unittest tests.test_roadmap_helpers` resolves this repository's tests package consistently.
**Verification.** `python3 -m unittest tests.test_roadmap_helpers`; `scripts/roadmap-check-ids.sh`; `scripts/roadmap-next-id.sh`.

150
USAGE.md
View File

@ -51,26 +51,27 @@ cd rust
```
**Note:** Diagnostic verbs (`doctor`, `status`, `sandbox`, `version`) support `--output-format json` for machine-readable output. Invalid suffix arguments (e.g., `--json`) are now rejected at parse time rather than falling through to prompt dispatch.
`version --output-format json` reports structured build provenance including full `git_sha`, derived `git_sha_short`, `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`, runtime `executable_path`, and `binary_provenance`; JSON keeps the prose report in `human_readable` instead of duplicating it under `message`. `status --output-format json` exposes `workspace.memory_files[]` with `path`, `source`, `origin`, `scope_path`, `outside_project`, `chars`, and `contributes` for every loaded project memory file.
### Initialize a repository
Set up a new repository with `.claw` config, `.claw.json`, `.gitignore` entries, and a `CLAUDE.md` guidance file:
Set up a new repository with `.claw/settings.json`, `.claw.json`, `.gitignore` entries, and a `CLAUDE.md` guidance file:
```bash
cd /path/to/your/repo
./target/debug/claw init
```
Text mode (human-readable) shows artifact creation summary with project path and next steps. Idempotent — running multiple times in the same repo marks already-created files as "skipped".
Text mode (human-readable) shows artifact creation summary with project path and next steps. Idempotent — running multiple times in the same repo marks already-created files as "skipped", reports `.claw/` as "partial" when missing sub-files are materialized, and keeps `.claw/sessions/` deferred until the first successful session save.
JSON mode for scripting:
```bash
./target/debug/claw init --output-format json
```
Returns structured output with `project_path`, `created[]`, `updated[]`, `skipped[]` arrays (one per artifact), and `artifacts[]` carrying each file's `name` and machine-stable `status` tag. The legacy `message` field preserves backward compatibility.
Returns structured output with `project_path`, `created[]`, `updated[]`, `partial[]`, `deferred[]`, and `skipped[]` arrays (one per artifact status), and `artifacts[]` carrying each file's `name` and machine-stable `status` tag. The legacy `message` field preserves backward compatibility.
**Why structured fields matter:** Claws can detect per-artifact state (`created` vs `updated` vs `skipped`) without substring-matching human prose. Use the `created[]`, `updated[]`, and `skipped[]` arrays for conditional follow-up logic (e.g., only commit if files were actually created, not just updated).
**Why structured fields matter:** Claws can detect per-artifact state (`created`, `updated`, `partial`, `deferred`, or `skipped`) without substring-matching human prose. Use the status arrays for conditional follow-up logic (e.g., only commit if files were actually created, not just updated).
### Interactive REPL
@ -86,6 +87,12 @@ cd rust
./target/debug/claw prompt "summarize this repository"
```
Pipe prompt text through stdin when automation already produces the prompt body:
```bash
printf 'summarize this repository\n' | ./target/debug/claw prompt --output-format json
```
### Shorthand prompt mode
```bash
@ -93,6 +100,12 @@ cd rust
./target/debug/claw "explain rust/crates/runtime/src/lib.rs"
```
Use the POSIX `--` end-of-flags separator when the shorthand prompt itself begins with `-` or `--`:
```bash
./target/debug/claw -- "-summarize this dash-prefixed text"
```
### JSON output for scripting
```bash
@ -187,17 +200,24 @@ cd rust
./target/debug/claw --permission-mode read-only prompt "summarize Cargo.toml"
./target/debug/claw --permission-mode workspace-write prompt "update README.md"
./target/debug/claw --allowedTools read,glob "inspect the runtime crate"
./target/debug/claw --cwd ../other-workspace status --output-format json
```
Supported permission modes:
Global workspace override flags: `--cwd PATH`, `-C PATH`, and `--directory PATH` are accepted before any subcommand. They are validated before command dispatch and take precedence over the process `$PWD`; invalid paths return typed `invalid_cwd` JSON errors in JSON mode.
- `read-only`
- `workspace-write`
- `danger-full-access`
`--allowedTools` accepts canonical snake_case tool names (for example `read_file`, `glob_search`, `web_fetch`) plus documented aliases such as `read`, `glob`, `Read`, and `WebFetch`. `claw status --output-format json` exposes `allowed_tools.available` and `allowed_tools.aliases`, and invalid values return typed `invalid_tool_name` JSON with `tool_name`, `available`, and `tool_aliases`. A missing value before a subcommand or another flag returns `missing_argument` with `argument:"--allowedTools"`.
`--output-format` accepts `text` or `json` case-insensitively and normalizes to the canonical lowercase modes. `CLAW_OUTPUT_FORMAT=json` sets the default output format for scripts, while an explicit `--output-format` flag takes precedence. Repeating the flag emits a stderr warning and JSON status envelopes expose `format_source`, `format_raw`, and `format_overridden` so composed flag arrays are auditable; invalid values return typed `invalid_output_format` JSON with `value` and `expected:["text","json"]`.
Supported permission modes (default: `workspace-write`):
- `read-only` allows inspection-only local tools such as file reads, glob/grep searches, local skills, and status-style reporting. It does not allow workspace mutation, network-fetch/search tools, or arbitrary command execution.
- `workspace-write` is the safe default. It allows reads plus direct file-editing tools inside the current workspace, including write/edit/notebook/config/plan-mode updates, while still gating network-fetch/search tools, arbitrary shell execution, subagent launches, REPL subprocesses, and other full-access tools behind an explicit escalation.
- `danger-full-access` allows every registered tool requirement, including arbitrary command execution, web fetch/search, subagent launches, subprocess REPLs, and unrestricted tool access. Select it only with an explicit `--permission-mode danger-full-access`, `--dangerously-skip-permissions`, `--skip-permissions`, env, or config opt-in.
Model aliases currently supported by the CLI:
- `opus``claude-opus-4-6`
- `opus``claude-opus-4-7`
- `sonnet``claude-sonnet-4-6`
- `haiku``claude-haiku-4-5-20251213`
@ -292,6 +312,18 @@ cd rust
./target/debug/claw --model "llama3.2" prompt "summarize this repository in one sentence"
```
For Ollama tags with punctuation (for example `qwen2.5-coder:7b`), `OPENAI_BASE_URL` selects the local OpenAI-compatible route even when `OPENAI_API_KEY` is unset:
```bash
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
unset OPENAI_API_KEY
cd rust
./target/debug/claw --model "qwen2.5-coder:7b" prompt "reply with ready"
```
If the local server exposes a slash-containing model ID, prefix it with `local/` so Claw selects the OpenAI-compatible transport while sending the remainder verbatim on the wire: `--model "local/Qwen/Qwen3.6-27B-FP8"`.
### OpenRouter
```bash
@ -334,7 +366,7 @@ Reasoning variants (`qwen-qwq-*`, `qwq-*`, `*-thinking`) automatically strip `te
The OpenAI-compatible backend also serves as the gateway for **OpenRouter**, **Ollama**, and any other service that speaks the OpenAI `/v1/chat/completions` wire format — just point `OPENAI_BASE_URL` at the service.
**Model-name prefix routing:** If a model name starts with `openai/`, `gpt-`, `qwen/`, `qwen-`, `kimi/`, or `kimi-`, the provider is selected by the prefix regardless of which env vars are set. This prevents accidental misrouting to Anthropic when multiple credentials exist in the environment. For the default OpenAI API, `openai/` is a routing prefix and is stripped before the request hits the wire. For a custom `OPENAI_BASE_URL`, slash-containing OpenAI-compatible slugs (for example OpenRouter-style `openai/gpt-4.1-mini`) are preserved so the gateway receives the model ID it expects.
**Model-name prefix routing:** If a model name starts with `openai/`, `local/`, `gpt-`, `qwen/`, `qwen-`, `kimi/`, or `kimi-`, the provider is selected by the prefix regardless of which env vars are set. This prevents accidental misrouting to Anthropic when multiple credentials exist in the environment. For the default OpenAI API and local/private OpenAI-compatible endpoints, `openai/` is a routing prefix and is stripped before the request hits the wire. For non-local custom `OPENAI_BASE_URL` gateways, slash-containing OpenAI-compatible slugs (for example OpenRouter-style `openai/gpt-4.1-mini`) are preserved so the gateway receives the model ID it expects. The `local/` prefix is an explicit escape hatch for local slash-containing model IDs: it is stripped while the rest of the model ID is sent verbatim.
### Tested models and aliases
@ -342,17 +374,19 @@ These are the models registered in the built-in alias table with known token lim
| Alias | Resolved model name | Provider | Max output tokens | Context window |
|---|---|---|---|---|
| `opus` | `claude-opus-4-6` | Anthropic | 32 000 | 200 000 |
| `opus` | `claude-opus-4-7` | Anthropic | 32 000 | 200 000 |
| `sonnet` | `claude-sonnet-4-6` | Anthropic | 64 000 | 200 000 |
| `haiku` | `claude-haiku-4-5-20251213` | Anthropic | 64 000 | 200 000 |
| `grok` / `grok-3` | `grok-3` | xAI | 64 000 | 131 072 |
| `grok-mini` / `grok-3-mini` | `grok-3-mini` | xAI | 64 000 | 131 072 |
| `grok-2` | `grok-2` | xAI | — | — |
| `kimi` | `kimi-k2.5` | DashScope | 16 384 | 256 000 |
| `qwen-max` | `qwen-max` | DashScope | 8 192 | 131 072 |
| `qwen-plus` | `qwen-plus` | DashScope | 8 192 | 131 072 |
| `gpt-4.1` / `gpt-4.1-mini` / `gpt-4.1-nano` | same | OpenAI-compatible | 32 768 | 1 047 576 |
| `gpt-5.4` / `gpt-5.4-mini` / `gpt-5.4-nano` | same | OpenAI-compatible | 128 000 | 1 000 000 / 400 000 |
Any model name that does not match an alias is passed through verbatim after provider routing is resolved. This is how you use OpenRouter model slugs (`openai/gpt-4.1-mini` with a custom `OPENAI_BASE_URL`), Ollama tags (`llama3.2`), or full Anthropic model IDs (`claude-sonnet-4-20250514`).
Any model name that does not match an alias is passed through verbatim after provider routing is resolved. This is how you use OpenRouter model slugs (`openai/gpt-4.1-mini` with a custom `OPENAI_BASE_URL`), Ollama tags (`llama3.2` or `qwen2.5-coder:7b`), slash-containing local IDs (`local/Qwen/Qwen3.6-27B-FP8`), or full Anthropic model IDs (`claude-sonnet-4-20250514`).
### User-defined aliases
@ -362,7 +396,7 @@ You can add custom aliases in any settings file (`~/.claw/settings.json`, `.claw
{
"aliases": {
"fast": "claude-haiku-4-5-20251213",
"smart": "claude-opus-4-6",
"smart": "claude-opus-4-7",
"cheap": "grok-3-mini"
}
}
@ -370,13 +404,15 @@ You can add custom aliases in any settings file (`~/.claw/settings.json`, `.claw
Local project settings override user-level settings. Aliases resolve through the built-in table, so `"fast": "haiku"` also works.
Model selection precedence is CLI flag, environment, config, then default. The environment model slot accepts `CLAW_MODEL`, `ANTHROPIC_MODEL`, and `ANTHROPIC_DEFAULT_MODEL` in that order; aliases from those variables are resolved and validated before provider startup. `claw --output-format json status` exposes `model_raw`, `model_alias_resolved_to`, and `model_env_var` so automation can see the winning value.
### How provider detection works
1. If the resolved model name starts with `claude` → Anthropic.
2. If it starts with `grok` → xAI.
3. If it starts with `openai/` or `gpt-` → OpenAI-compatible.
3. If it starts with `openai/`, `local/`, or `gpt-` → OpenAI-compatible.
4. If it starts with `qwen/`, `qwen-`, `kimi/`, or `kimi-` → DashScope-compatible OpenAI wire format.
5. If `OPENAI_BASE_URL` and `OPENAI_API_KEY` are set, unknown model names route to the OpenAI-compatible client for local/gateway servers.
5. If `OPENAI_BASE_URL` is set, local-looking unknown model names such as `llama3.2` or `qwen2.5-coder:7b` route to the OpenAI-compatible client for local/gateway servers.
6. Otherwise, `claw` checks which credential is set: Anthropic first, then OpenAI, then xAI. If only `OPENAI_BASE_URL` is set, it still routes to OpenAI-compatible for authless local servers.
7. If nothing matches, it defaults to Anthropic.
@ -417,6 +453,9 @@ The name "codex" appears in the Claw Code ecosystem but it does **not** refer to
export HTTPS_PROXY="http://proxy.corp.example:3128"
export HTTP_PROXY="http://proxy.corp.example:3128"
export NO_PROXY="localhost,127.0.0.1,.corp.example"
export CLAW_OUTPUT_FORMAT="json" # default non-interactive output format; flags override it
export CLAW_LOG="debug" # claw-specific log level selector surfaced by help/doctor
export RUST_LOG="claw=debug" # Rust logging convention surfaced by help/doctor
cd rust
./target/debug/claw prompt "hello via the corporate proxy"
@ -452,11 +491,12 @@ let client = build_http_client_with(&config).expect("proxy client");
## Skills
Use `/skills list` in the interactive REPL or `claw skills --output-format json` from the direct CLI to inspect installed skills. For offline/local installs, install the directory that contains `SKILL.md`, then verify the discovered name before invoking it:
Use `/skills list` in the interactive REPL or `claw skills --output-format json` from the direct CLI to inspect installed skills. For offline/local installs, install the directory that contains `SKILL.md`, then verify the discovered name before invoking it. `skills install`, `skills uninstall`, and `agents create` are local filesystem lifecycle commands; they do not require provider credentials.
```text
/skills install /absolute/path/to/my-skill
/skills list
/skills uninstall my-skill
/skills my-skill
```
@ -469,6 +509,7 @@ cd rust
./target/debug/claw status
./target/debug/claw sandbox
./target/debug/claw agents
./target/debug/claw agents create my-agent
./target/debug/claw mcp
./target/debug/claw skills
./target/debug/claw system-prompt --cwd .. --date 2026-04-04
@ -488,6 +529,7 @@ git clone https://github.com/Xquik-dev/tweetclaw
cd claw-code/rust
./target/debug/claw skills install ../../tweetclaw/skills/tweetclaw
./target/debug/claw skills show tweetclaw
./target/debug/claw skills uninstall tweetclaw
```
TweetClaw gives `claw` users a local skill guide for OpenClaw/Xquik workflows
@ -495,6 +537,15 @@ such as tweet search, reply search, follower export, monitors, webhooks, and
approval-gated posting. Configure any Xquik credentials outside the prompt and
avoid pasting API keys into chat.
## Author a local agent
`claw agents create <name>` scaffolds a local `.claw/agents/<name>.toml` file for the current workspace. The scaffold is intentionally small so you can edit the description, model, and reasoning effort before listing or invoking agents:
```bash
./target/debug/claw agents create release-checker
./target/debug/claw agents list
```
## Session management
REPL turns are persisted under `.claw/sessions/` in the current workspace.
@ -517,6 +568,73 @@ Runtime config is loaded in this order, with later entries overriding earlier on
4. `<repo>/.claw/settings.json`
5. `<repo>/.claw/settings.local.json`
The list is also the precedence chain: project-local settings override project settings, project settings override the legacy project `.claw.json`, and project files override user files. `claw --output-format json config` includes each discovered file's `precedence_rank`, `wins_for_keys`, and `shadowed_keys` so automation can see which file controls each effective key without reimplementing the merge order.
## MCP server validation
`claw mcp --output-format json` loads valid `mcpServers` entries even when sibling entries are malformed. The JSON list envelope distinguishes the total configured entries from the valid and invalid subsets:
```json
{
"configured_servers": 1,
"total_configured": 2,
"valid_count": 1,
"invalid_count": 1,
"servers": [{ "name": "valid-server", "valid": true }],
"invalid_servers": [
{
"name": "missing-command",
"error_field": "command",
"reason": ".claw.json: mcpServers.missing-command: missing string field command",
"valid": false
}
]
}
```
`status --output-format json` mirrors this under `mcp_validation`, and `doctor --output-format json` includes an `mcp validation` check so automation can repair every rejected server entry without losing usable MCP servers.
## Hook configuration
`hooks.PreToolUse`, `hooks.PostToolUse`, and `hooks.PostToolUseFailure` accept either legacy command strings or object-style entries with a `matcher` and nested command hooks:
```json
{
"hooks": {
"PreToolUse": [
"echo legacy hook",
{
"matcher": "Bash",
"hooks": [
{ "type": "command", "command": "scripts/audit-bash.sh" }
]
}
]
}
}
```
Object-style matchers are optional. When present, they match tool names case-insensitively and support `*` wildcards plus comma or pipe separated alternatives. Nested hook `type` may be omitted or set to `"command"`; each nested command runs in configuration order.
## Project instruction rules
In addition to root instruction files such as `CLAUDE.md`, `CLAW.md`, `AGENTS.md`, `.claw/CLAUDE.md`, `.claude/CLAUDE.md`, and `.claw/instructions.md`, `claw` loads sorted Markdown/text rule files from:
- `<repo>/.claw/rules/` (`.md`, `.txt`, `.mdc`) for shared project rules.
- `<repo>/.claw/rules.local/` for personal local rules; this path is gitignored.
Root instruction-file priority is `CLAUDE.md`, then `CLAW.md`, then `AGENTS.md` for each discovered directory. Discovery is bounded to the current git root when one exists, otherwise to the current directory only, so stale parent files outside the project do not silently bleed into the prompt. All loaded files contribute to the system prompt and to `status --output-format json` as `workspace.memory_files:[{path, source, origin, scope_path, outside_project, chars, contributes}]`; `claw doctor --output-format json` includes a `memory` check so automation can detect loaded and unexpected unloaded memory-file candidates without parsing prompt text.
By default, `claw` also imports detected rules from common AI coding tools such as Cursor (`.cursorrules`, `.cursor/rules/`), GitHub Copilot (`.github/copilot-instructions.md`), Windsurf, Plandex, and Crush. Control this with `rulesImport` in any settings file:
```json
{
"rulesImport": "none"
}
```
Use `"auto"` (the default) to import every supported framework, `"none"` to load only Claw instruction/rules files, or an array such as `["cursor", "copilot"]` to import selected frameworks.
## Mock parity harness
The workspace includes a deterministic Anthropic-compatible mock service and parity harness.

View File

@ -148,12 +148,12 @@ pub const DEFAULT_DASHSCOPE_BASE_URL: &str = "https://dashscope.aliyuncs.com/com
**Affected models:** Slash-containing model IDs routed through the OpenAI-compatible provider, especially custom gateways configured with `OPENAI_BASE_URL` such as OpenRouter, local routers, or other `/v1/chat/completions` services.
**Behavior:**
- The default OpenAI API treats `openai/` as a routing prefix and sends the bare model name on the wire.
- Custom OpenAI-compatible base URLs preserve slash-containing slugs such as `openai/gpt-4.1-mini` so the gateway receives the exact model ID it expects.
- The default OpenAI API and local/private OpenAI-compatible base URLs treat `openai/` as a routing prefix and send the bare model name on the wire.
- Non-local custom OpenAI-compatible base URLs preserve slash-containing slugs such as `openai/gpt-4.1-mini` so gateways like OpenRouter receive the exact model ID they expect. Local slash-containing model IDs can use `local/`, which strips only that escape-hatch prefix and sends the remainder verbatim.
- `MessageRequest::extra_body` passes through custom request JSON after core fields are populated. This supports provider-specific options such as `web_search_options` and `parallel_tool_calls`.
- Protected core fields (`model`, `messages`, `stream`, `tools`, `tool_choice`, `max_tokens`, `max_completion_tokens`) cannot be overridden through `extra_body`.
**Testing:** See `custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params` in `openai_compat_integration.rs` and `extra_body_params_are_passed_through_without_overriding_core_fields` in `openai_compat.rs`.
**Testing:** See `custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params` in `openai_compat_integration.rs`, `wire_model_strips_openai_prefix_for_default_and_local_preserves_custom_gateways`, `local_routing_prefix_strips_only_escape_hatch`, and `extra_body_params_are_passed_through_without_overriding_core_fields` in `openai_compat.rs`.
## Implementation Details

View File

@ -13,7 +13,7 @@ If you need the most polished daily-driver experience for a specific non-Claude
## OpenAI-compatible routing basics
Set `OPENAI_BASE_URL` to the servers `/v1` endpoint and set `OPENAI_API_KEY` to either the required token or a harmless placeholder for local servers that expect an Authorization header. The model name must match what the server exposes.
Set `OPENAI_BASE_URL` to the servers `/v1` endpoint and set `OPENAI_API_KEY` to either the required token or a harmless placeholder for local servers that expect an Authorization header. Authless local/private OpenAI-compatible servers can leave `OPENAI_API_KEY` unset. The model name must match what the server exposes.
```bash
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
@ -24,8 +24,8 @@ claw --model "qwen3:latest" prompt "Reply exactly HELLO_WORLD_123"
Routing notes:
- Use the `openai/` prefix for OpenAI-compatible gateways when you need prefix routing to win over ambient Anthropic credentials, for example `--model "openai/gpt-4.1-mini"` with OpenRouter.
- For local servers, prefer the exact model ID reported by the server (`qwen3:latest`, `llama3.2`, `Qwen/Qwen2.5-Coder-7B-Instruct`, etc.). If your local gateway exposes slash-containing IDs, use that exact slug.
- If you have multiple provider keys in your environment, remove unrelated keys while smoke-testing a local route or choose a model prefix that unambiguously selects the intended provider.
- For local servers, prefer the exact model ID reported by the server (`qwen3:latest`, `llama3.2`, etc.). If your local gateway exposes slash-containing IDs, prefix the exact slug with `local/` so Claw routes through OpenAI-compatible transport while sending the rest verbatim, for example `--model "local/Qwen/Qwen2.5-Coder-7B-Instruct"`.
- If you have multiple provider keys in your environment, `OPENAI_BASE_URL` plus local-looking tags such as `llama3.2` or `qwen2.5-coder:7b` selects the local OpenAI-compatible route; use `local/` for slash-containing local IDs.
- Tool workflows need model/server support for OpenAI-compatible tool calls. Plain prompt smoke tests can pass even when slash/tool workflows still fail because the server returns an incompatible tool-call shape.
## Raw `/v1/chat/completions` smoke test
@ -58,11 +58,11 @@ In another shell:
```bash
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
export OPENAI_API_KEY="local-dev-token"
unset OPENAI_API_KEY
claw --model "qwen3:latest" prompt "Reply exactly HELLO_WORLD_123"
```
If Ollama is running without auth and your build accepts authless local OpenAI-compatible servers, `unset OPENAI_API_KEY` is also acceptable. Use a placeholder token rather than a real cloud API key for local testing.
If Ollama is running without auth, `unset OPENAI_API_KEY` is acceptable. Use a placeholder token rather than a real cloud API key if your local server requires an Authorization header.
## llama.cpp server

1
rust/Cargo.lock generated
View File

@ -2244,7 +2244,6 @@ version = "0.1.3"
dependencies = [
"api",
"commands",
"compat-harness",
"crossterm",
"log",
"mock-anthropic-service",

View File

@ -15,7 +15,7 @@ cargo run -p rusty-claude-cli -- --help
cargo build --workspace
# Run the interactive REPL
cargo run -p rusty-claude-cli -- --model claude-opus-4-6
cargo run -p rusty-claude-cli -- --model claude-opus-4-7
# One-shot prompt
cargo run -p rusty-claude-cli -- prompt "explain this codebase"
@ -87,7 +87,7 @@ Primary artifacts:
| Sub-agent / agent surfaces | ✅ |
| Todo tracking | ✅ |
| Notebook editing | ✅ |
| CLAUDE.md / project memory | ✅ |
| CLAUDE.md / CLAW.md / AGENTS.md project memory | ✅ |
| Config file hierarchy (`.claw.json` + merged config sections) | ✅ |
| Permission system | ✅ |
| MCP server lifecycle + inspection | ✅ |
@ -100,7 +100,7 @@ Primary artifacts:
| Slash commands (including `/skills`, `/agents`, `/mcp`, `/doctor`, `/plugin`, `/subagent`) | ✅ |
| Hooks (`/hooks`, config-backed lifecycle hooks) | ✅ |
| Plugin management surfaces | ✅ |
| Skills inventory / install surfaces | ✅ |
| Skills inventory / install / uninstall surfaces | ✅ |
| Machine-readable JSON output across core CLI surfaces | ✅ |
## Model Aliases
@ -109,7 +109,7 @@ Short names resolve to the latest model versions:
| Alias | Resolves To |
|-------|------------|
| `opus` | `claude-opus-4-6` |
| `opus` | `claude-opus-4-7` |
| `sonnet` | `claude-sonnet-4-6` |
| `haiku` | `claude-haiku-4-5-20251213` |
@ -122,10 +122,11 @@ claw [OPTIONS] [COMMAND]
Flags:
--model MODEL
--output-format text|json
--output-format text|json (case-insensitive; CLAW_OUTPUT_FORMAT supplies the default, flags override env)
--permission-mode MODE
--dangerously-skip-permissions
--allowedTools TOOLS
--cwd PATH, -C PATH, --directory PATH
--dangerously-skip-permissions, --skip-permissions
--allowedTools TOOLS canonical snake_case names or aliases; status JSON exposes allowed_tools.available/aliases
--resume [SESSION.jsonl|session-id|latest]
--version, -V
@ -146,6 +147,12 @@ Top-level commands:
```
`claw acp` is a local discoverability surface for editor-first users: it reports the current ACP/Zed status without starting the runtime. As of April 16, 2026, claw-code does **not** ship an ACP/Zed daemon or JSON-RPC entrypoint yet, and `claw acp serve` is only a status alias until the real protocol surface lands. Status queries exit 0 and expose the same machine-readable contract via `--output-format json`; malformed ACP invocations exit 1 with `kind: unsupported_acp_invocation`.
`--output-format` accepts `text` or `json` in any casing. `CLAW_OUTPUT_FORMAT=json` selects JSON as the default for non-interactive commands, explicit flags override it, repeated flags warn on stderr, and status JSON exposes `format_source`, `format_raw`, and `format_overridden`. Help and doctor output also surface `CLAW_LOG` / `RUST_LOG` as the logging environment knobs.
`claw version --output-format json` is the provenance probe for automation: it reports full `git_sha`, derived `git_sha_short`, `is_dirty`, `branch`, `commit_date`, `commit_timestamp`, `rustc_version`, runtime `executable_path`, and `binary_provenance`; the text report is available as `human_readable` instead of a duplicate `message` field.
`status --output-format json` reports loaded project memory files under `workspace.memory_files[]` with each file's `path`, `source` (`claude_md`, `claw_md`, `agents_md`, or scoped/rule sources), `origin`, `scope_path`, `outside_project`, `chars`, and `contributes`; `claw doctor --output-format json` includes a dedicated `memory` check. Root instruction-file priority is `CLAUDE.md`, then `CLAW.md`, then `AGENTS.md`, discovery is bounded to the current git root when present (otherwise cwd only), and all non-duplicate loaded files contribute to the rendered system prompt.
`claw mcp --output-format json` reports partial MCP config success: valid servers remain in `servers[]` while malformed siblings appear in `invalid_servers[]`, with `total_configured`, `valid_count`, and `invalid_count` split out for automation. `status` mirrors this as `mcp_validation`, and doctor includes an `mcp validation` check.
Shorthand prompt mode honors the POSIX `--` end-of-flags separator, so `claw -- "-prompt-with-dash"` and unknown dash-prefixed non-flag text stay on the prompt path instead of being treated as CLI options.
`claw dump-manifests` is self-contained: it emits the Rust resolver inventory for the selected workspace (commands, tools, agents, skills, and bootstrap phases) without requiring an upstream Claude Code TypeScript checkout. Use `--manifests-dir PATH` only to scope resolver discovery to another directory.
The command surface is moving quickly. For the canonical live help text, run:
@ -166,8 +173,8 @@ The REPL now exposes a much broader surface than the original minimal shell:
- plugin management: `/plugin` (with aliases `/plugins`, `/marketplace`)
Notable claw-first surfaces now available directly in slash form:
- `/skills [list|install <path>|help]`
- `/agents [list|help]`
- `/skills [list|show <name>|install <path>|uninstall <name>|help]`
- `/agents [list|show <name>|create <name>|help]`
- `/mcp [list|show <server>|help]`
- `/doctor`
- `/plugin [list|install <path>|enable <name>|disable <name>|uninstall <id>|update <id>]`
@ -184,7 +191,7 @@ rust/
└── crates/
├── api/ # Provider clients + streaming + request preflight
├── commands/ # Shared slash-command registry + help rendering
├── compat-harness/ # TS manifest extraction harness
├── compat-harness/ # Compatibility/parity harness utilities
├── mock-anthropic-service/ # Deterministic local Anthropic-compatible mock
├── plugins/ # Plugin metadata, manager, install/enable/disable surfaces
├── runtime/ # Session, config, permissions, MCP, prompts, auth/runtime loop
@ -197,7 +204,7 @@ rust/
- **api** — provider clients, SSE streaming, request/response types, auth (`ANTHROPIC_API_KEY` + bearer-token support), request-size/context-window preflight
- **commands** — slash command definitions, parsing, help text generation, JSON/text command rendering
- **compat-harness**extracts tool/prompt manifests from upstream TS source
- **compat-harness**compatibility and parity helpers for comparing behavior with upstream fixtures
- **mock-anthropic-service** — deterministic `/v1/messages` mock for CLI parity tests and local harness runs
- **plugins** — plugin metadata, install/enable/disable/update flows, plugin tool definitions, hook integration surfaces
- **runtime**`ConversationRuntime`, config loading, session persistence, permission policy, MCP client lifecycle, system prompt assembly, usage tracking
@ -210,8 +217,8 @@ rust/
- **~20K lines** of Rust
- **9 crates** in workspace
- **Binary name:** `claw`
- **Default model:** `claude-opus-4-6`
- **Default permissions:** `danger-full-access`
- **Default model:** `claude-opus-4-7`
- **Default permissions:** `workspace-write`
## License

View File

@ -161,7 +161,7 @@ mod tests {
#[test]
fn resolves_existing_and_grok_aliases() {
assert_eq!(resolve_model_alias("opus"), "claude-opus-4-6");
assert_eq!(resolve_model_alias("opus"), "claude-opus-4-7");
assert_eq!(resolve_model_alias("grok"), "grok-3");
assert_eq!(resolve_model_alias("grok-mini"), "grok-3-mini");
}
@ -235,4 +235,22 @@ mod tests {
other => panic!("Expected ProviderClient::OpenAi for qwen-plus, got: {other:?}"),
}
}
#[test]
fn local_openai_base_url_routes_authless_ollama_models() {
let _lock = env_lock();
let _base_url = EnvVarGuard::set("OPENAI_BASE_URL", Some("http://127.0.0.1:11434/v1"));
let _openai_key = EnvVarGuard::set("OPENAI_API_KEY", None);
let _anthropic_key = EnvVarGuard::set("ANTHROPIC_API_KEY", Some("test-anthropic-key"));
let _anthropic_token = EnvVarGuard::set("ANTHROPIC_AUTH_TOKEN", None);
let client = ProviderClient::from_model("qwen2.5-coder:7b")
.expect("local model should route to OpenAI-compatible client without auth");
match client {
ProviderClient::OpenAi(openai_client) => {
assert_eq!(openai_client.base_url(), "http://127.0.0.1:11434/v1")
}
other => panic!("Expected ProviderClient::OpenAi for local model, got: {other:?}"),
}
}
}

View File

@ -487,8 +487,7 @@ impl AnthropicClient {
request: &MessageRequest,
) -> Result<reqwest::Response, ApiError> {
let request_url = format!("{}/v1/messages", self.base_url.trim_end_matches('/'));
let mut request_body = self.request_profile.render_json_body(request)?;
strip_unsupported_beta_body_fields(&mut request_body);
let request_body = render_standard_messages_body(&self.request_profile, request)?;
let request_builder = self.build_request(&request_url).json(&request_body);
request_builder.send().await.map_err(ApiError::from)
}
@ -548,8 +547,7 @@ impl AnthropicClient {
"{}/v1/messages/count_tokens",
self.base_url.trim_end_matches('/')
);
let mut request_body = self.request_profile.render_json_body(request)?;
strip_unsupported_beta_body_fields(&mut request_body);
let request_body = render_standard_messages_body(&self.request_profile, request)?;
let response = self
.build_request(&request_url)
.json(&request_body)
@ -1036,6 +1034,21 @@ fn enrich_bearer_auth_error(error: ApiError, auth: &AuthSource) -> ApiError {
}
}
fn anthropic_wire_model(model: &str) -> &str {
model.strip_prefix("anthropic/").unwrap_or(model)
}
fn render_standard_messages_body(
request_profile: &AnthropicRequestProfile,
request: &MessageRequest,
) -> Result<Value, serde_json::Error> {
let mut wire_request = request.clone();
wire_request.model = anthropic_wire_model(&request.model).to_string();
let mut body = request_profile.render_json_body(&wire_request)?;
strip_unsupported_beta_body_fields(&mut body);
Ok(body)
}
/// Remove beta-only body fields that the standard `/v1/messages` and
/// `/v1/messages/count_tokens` endpoints reject as `Extra inputs are not
/// permitted`. The `betas` opt-in is communicated via the `anthropic-beta`
@ -1609,6 +1622,27 @@ mod tests {
);
}
#[test]
fn standard_messages_body_strips_anthropic_routing_prefix() {
let client = AnthropicClient::new("test-key");
let request = MessageRequest {
model: "anthropic/claude-opus-4-6".to_string(),
max_tokens: 64,
messages: vec![],
system: None,
tools: None,
tool_choice: None,
stream: false,
..Default::default()
};
let rendered = super::render_standard_messages_body(client.request_profile(), &request)
.expect("body should render");
assert_eq!(rendered["model"], serde_json::json!("claude-opus-4-6"));
assert!(rendered.get("betas").is_none());
}
#[test]
fn enrich_bearer_auth_error_appends_sk_ant_hint_on_401_with_pure_bearer_token() {
// given

View File

@ -211,7 +211,7 @@ pub fn resolve_model_alias(model: &str) -> String {
.find_map(|(alias, metadata)| {
(*alias == lower).then_some(match metadata.provider {
ProviderKind::Anthropic => match *alias {
"opus" => "claude-opus-4-6",
"opus" => "claude-opus-4-7",
"sonnet" => "claude-sonnet-4-6",
"haiku" => "claude-haiku-4-5-20251213",
_ => trimmed,
@ -262,6 +262,14 @@ pub fn metadata_for_model(model: &str) -> Option<ProviderMetadata> {
default_base_url: openai_compat::DEFAULT_OPENAI_BASE_URL,
});
}
if canonical.starts_with("local/") {
return Some(ProviderMetadata {
provider: ProviderKind::OpenAi,
auth_env: "OPENAI_API_KEY",
base_url_env: "OPENAI_BASE_URL",
default_base_url: openai_compat::DEFAULT_OPENAI_BASE_URL,
});
}
// Alibaba DashScope compatible-mode endpoint. Routes qwen/* and bare
// qwen-* model names (qwen-max, qwen-plus, qwen-turbo, qwen-qwq, etc.)
// to the OpenAI-compat client pointed at DashScope's /compatible-mode/v1.
@ -337,17 +345,21 @@ pub fn provider_diagnostics_for_model(model: &str) -> ProviderDiagnostics {
}
}
fn looks_like_local_openai_model(model: &str) -> bool {
model.contains(':') || model.contains('.')
}
#[must_use]
pub fn detect_provider_kind(model: &str) -> ProviderKind {
if let Some(metadata) = metadata_for_model(model) {
let resolved_model = resolve_model_alias(model);
if let Some(metadata) = metadata_for_model(&resolved_model) {
return metadata.provider;
}
// When OPENAI_BASE_URL is set, the user explicitly configured an
// OpenAI-compatible endpoint. Prefer it over the Anthropic fallback
// even when the model name has no recognized prefix — this is the
// common case for local providers (Ollama, LM Studio, vLLM, etc.)
// where model names like "qwen2.5-coder:7b" don't match any prefix.
if std::env::var_os("OPENAI_BASE_URL").is_some() && openai_compat::has_api_key("OPENAI_API_KEY")
// When OPENAI_BASE_URL is set and the unknown model name looks like a
// local server tag (for example `llama3.2` or `qwen2.5-coder:7b`), prefer
// the OpenAI-compatible endpoint over ambient Anthropic credentials.
if std::env::var_os("OPENAI_BASE_URL").is_some()
&& looks_like_local_openai_model(&resolved_model)
{
return ProviderKind::OpenAi;
}
@ -608,7 +620,7 @@ pub fn model_token_limit(model: &str) -> Option<ModelTokenLimit> {
let canonical = resolve_model_alias(model);
let base_model = canonical.rsplit('/').next().unwrap_or(canonical.as_str());
match base_model {
"claude-opus-4-6" => Some(ModelTokenLimit {
"claude-opus-4-7" | "claude-opus-4-6" => Some(ModelTokenLimit {
max_output_tokens: 32_000,
context_window_tokens: 200_000,
}),
@ -1042,6 +1054,18 @@ mod tests {
assert_eq!(kind2, ProviderKind::OpenAi);
}
#[test]
fn local_prefix_routes_to_openai_not_anthropic() {
let meta = super::metadata_for_model("local/Qwen/Qwen3.6-27B-FP8")
.expect("local/ prefix must resolve to OpenAI-compatible metadata");
assert_eq!(meta.provider, ProviderKind::OpenAi);
assert_eq!(meta.auth_env, "OPENAI_API_KEY");
assert_eq!(meta.base_url_env, "OPENAI_BASE_URL");
let kind = detect_provider_kind("local/Qwen/Qwen3.6-27B-FP8");
assert_eq!(kind, ProviderKind::OpenAi);
}
#[test]
fn qwen_prefix_routes_to_dashscope_not_anthropic() {
// User request from Discord #clawcode-get-help: web3g wants to use

View File

@ -1,5 +1,6 @@
use std::borrow::Cow;
use std::collections::{BTreeMap, VecDeque};
use std::net::Ipv4Addr;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, SystemTime, UNIX_EPOCH};
@ -131,13 +132,22 @@ impl OpenAiCompatClient {
}
pub fn from_env(config: OpenAiCompatConfig) -> Result<Self, ApiError> {
let Some(api_key) = read_env_non_empty(config.api_key_env)? else {
return Err(ApiError::missing_credentials(
config.provider_name,
config.credential_env_vars(),
));
let base_url = read_base_url(config);
let api_key = match read_env_non_empty(config.api_key_env)? {
Some(api_key) => api_key,
None if config.provider_name == "OpenAI"
&& is_local_openai_compatible_base_url(&base_url) =>
{
"local-dev-token".to_string()
}
None => {
return Err(ApiError::missing_credentials(
config.provider_name,
config.credential_env_vars(),
));
}
};
Ok(Self::new(api_key, config))
Ok(Self::new(api_key, config).with_base_url(base_url))
}
#[must_use]
@ -933,14 +943,18 @@ pub fn model_requires_reasoning_content_in_history(model: &str) -> bool {
/// Strip routing prefix (e.g., "openai/gpt-4" → "gpt-4") for the wire.
/// The prefix is used only to select transport; the backend expects the
/// bare model id.
/// bare model id. Use `local/` to force OpenAI-compatible routing while
/// preserving any slashes that follow the prefix.
#[allow(dead_code)]
fn strip_routing_prefix(model: &str) -> &str {
if let Some(pos) = model.find('/') {
let prefix = &model[..pos];
// Only strip if the prefix before "/" is a known routing prefix,
// not if "/" appears in the middle of the model name for other reasons.
if matches!(prefix, "openai" | "xai" | "grok" | "qwen" | "kimi") {
if matches!(
prefix,
"openai" | "xai" | "grok" | "qwen" | "kimi" | "local"
) {
&model[pos + 1..]
} else {
model
@ -950,6 +964,44 @@ fn strip_routing_prefix(model: &str) -> &str {
}
}
fn normalize_base_url_for_model_routing(url: &str) -> &str {
let trimmed = url.trim_end_matches('/');
trimmed
.strip_suffix("/chat/completions")
.map(|value| value.trim_end_matches('/'))
.unwrap_or(trimmed)
}
fn url_host(url: &str) -> &str {
let after_scheme = url.split_once("://").map_or(url, |(_, rest)| rest);
let authority = after_scheme.split(['/', '?', '#']).next().unwrap_or("");
let host_port = authority
.rsplit_once('@')
.map_or(authority, |(_, host_port)| host_port);
if host_port.starts_with('[') {
return host_port
.split(']')
.next()
.unwrap_or("")
.trim_start_matches('[');
}
host_port.split(':').next().unwrap_or("")
}
fn is_local_openai_compatible_base_url(url: &str) -> bool {
let host = url_host(url.trim());
if host.eq_ignore_ascii_case("localhost") || host == "::1" {
return true;
}
let Ok(address) = host.parse::<Ipv4Addr>() else {
return false;
};
let [first, second, ..] = address.octets();
matches!(first, 10 | 127)
|| first == 192 && second == 168
|| first == 172 && (16..=31).contains(&second)
}
fn wire_model_for_base_url<'a>(
model: &'a str,
config: OpenAiCompatConfig,
@ -962,26 +1014,22 @@ fn wire_model_for_base_url<'a>(
let lowered_prefix = prefix.to_ascii_lowercase();
if lowered_prefix == "openai" {
let trimmed_base_url = base_url.trim_end_matches('/');
let default_openai = DEFAULT_OPENAI_BASE_URL.trim_end_matches('/');
if matches!(
lowered_prefix.as_str(),
"xai" | "grok" | "kimi" | "gemini" | "gemma"
) {
let normalized_base_url = normalize_base_url_for_model_routing(base_url);
let default_base_url = normalize_base_url_for_model_routing(config.default_base_url);
if normalized_base_url.eq_ignore_ascii_case(default_base_url)
|| is_local_openai_compatible_base_url(base_url)
{
return Cow::Borrowed(&model[pos + 1..]);
}
if config.provider_name == "OpenAI" && trimmed_base_url != default_openai {
// Only preserve the full slug if it's NOT a model we want to strip
if !model.contains("gemini") && !model.contains("gemma") {
return Cow::Borrowed(model);
}
}
return Cow::Borrowed(&model[pos + 1..]);
return Cow::Borrowed(model);
}
if matches!(lowered_prefix.as_str(), "xai" | "grok" | "qwen" | "kimi") {
return Cow::Borrowed(&model[pos + 1..]);
}
if lowered_prefix == "local" {
return Cow::Borrowed(&model[pos + 1..]);
}
Cow::Borrowed(model)
}
@ -1133,6 +1181,13 @@ fn build_chat_completion_request_for_base_url(
payload[key] = value.clone();
}
// DeepSeek V4 Pro/Flash thinking mode requires this provider-specific opt-in
// and also requires assistant reasoning history to be echoed as `reasoning_content`.
// Apply it after extra_body so callers cannot accidentally override the required shape.
if model_requires_reasoning_content_in_history(wire_model) {
payload["thinking"] = json!({"type": "enabled"});
}
payload
}
@ -1190,16 +1245,19 @@ pub fn translate_message(message: &InputMessage, model: &str) -> Vec<Value> {
InputContentBlock::ToolResult { .. } => {}
}
}
let include_reasoning =
model_requires_reasoning_content_in_history(model) && !reasoning.is_empty();
if text.is_empty() && tool_calls.is_empty() && !include_reasoning {
let needs_reasoning = model_requires_reasoning_content_in_history(model);
if text.is_empty() && tool_calls.is_empty() && reasoning.is_empty() {
Vec::new()
} else {
let mut msg = serde_json::json!({
"role": "assistant",
"content": (!text.is_empty()).then_some(text),
});
if include_reasoning {
if !text.is_empty() {
msg["content"] = json!(text);
} else if !needs_reasoning {
msg["content"] = Value::Null;
}
if needs_reasoning {
msg["reasoning_content"] = json!(reasoning);
}
// Only include tool_calls when non-empty: some providers reject
@ -1752,6 +1810,7 @@ mod tests {
ToolChoice, ToolDefinition, ToolResultContentBlock,
};
use serde_json::json;
use std::borrow::Cow;
use std::collections::BTreeMap;
use std::sync::{Mutex, OnceLock};
@ -1850,6 +1909,31 @@ mod tests {
assert_eq!(assistant["content"], json!("answer"));
}
#[test]
fn deepseek_v4_assistant_with_only_tool_calls_omits_content_and_includes_reasoning() {
let request = MessageRequest {
model: "deepseek-v4-pro".to_string(),
max_tokens: 100,
messages: vec![InputMessage {
role: "assistant".to_string(),
content: vec![InputContentBlock::ToolUse {
id: "call_1".to_string(),
name: "get_weather".to_string(),
input: json!({"city": "Paris"}),
}],
}],
stream: false,
..Default::default()
};
let payload = build_chat_completion_request(&request, OpenAiCompatConfig::openai());
let assistant = &payload["messages"][0];
assert!(assistant.get("content").is_none());
assert_eq!(assistant["reasoning_content"], json!(""));
assert_eq!(assistant["tool_calls"].as_array().map(Vec::len), Some(1));
}
#[test]
fn deepseek_v4_flash_request_includes_reasoning_content_for_assistant_history() {
// Given an assistant history turn containing thinking.
@ -2036,6 +2120,49 @@ mod tests {
assert_eq!(payload["reasoning_effort"], json!("high"));
}
#[test]
fn deepseek_v4_request_includes_thinking_parameter() {
let payload = build_chat_completion_request(
&MessageRequest {
model: "deepseek-v4-pro".to_string(),
max_tokens: 1024,
messages: vec![InputMessage::user_text("hello")],
..Default::default()
},
OpenAiCompatConfig::openai(),
);
assert_eq!(payload["thinking"], json!({"type": "enabled"}));
assert_eq!(payload["model"], json!("deepseek-v4-pro"));
let mut extra_body = BTreeMap::new();
extra_body.insert("thinking".to_string(), json!({"type": "disabled"}));
let payload_with_override = build_chat_completion_request(
&MessageRequest {
model: "openai/deepseek-v4-flash".to_string(),
max_tokens: 1024,
messages: vec![InputMessage::user_text("hello")],
extra_body,
..Default::default()
},
OpenAiCompatConfig::openai(),
);
assert_eq!(
payload_with_override["thinking"],
json!({"type": "enabled"})
);
let non_deepseek_payload = build_chat_completion_request(
&MessageRequest {
model: "gpt-4o".to_string(),
max_tokens: 64,
messages: vec![InputMessage::user_text("hello")],
..Default::default()
},
OpenAiCompatConfig::openai(),
);
assert!(non_deepseek_payload.get("thinking").is_none());
}
#[test]
fn reasoning_effort_omitted_when_not_set() {
let payload = build_chat_completion_request(
@ -2123,6 +2250,28 @@ mod tests {
));
}
#[test]
fn local_openai_base_url_does_not_require_api_key() {
let _lock = env_lock();
let original_base_url = std::env::var_os("OPENAI_BASE_URL");
let original_api_key = std::env::var_os("OPENAI_API_KEY");
std::env::set_var("OPENAI_BASE_URL", "http://127.0.0.1:11434/v1");
std::env::remove_var("OPENAI_API_KEY");
let client = OpenAiCompatClient::from_env(OpenAiCompatConfig::openai())
.expect("local OpenAI-compatible endpoint should not require an API key");
assert_eq!(client.base_url(), "http://127.0.0.1:11434/v1");
match original_base_url {
Some(value) => std::env::set_var("OPENAI_BASE_URL", value),
None => std::env::remove_var("OPENAI_BASE_URL"),
}
match original_api_key {
Some(value) => std::env::set_var("OPENAI_API_KEY", value),
None => std::env::remove_var("OPENAI_API_KEY"),
}
}
#[test]
fn endpoint_builder_accepts_base_urls_and_full_endpoints() {
assert_eq!(
@ -2738,6 +2887,66 @@ mod tests {
}
}
#[test]
fn wire_model_strips_openai_prefix_for_default_and_local_preserves_custom_gateways() {
assert_eq!(
super::wire_model_for_base_url(
"openai/gpt-4o",
OpenAiCompatConfig::openai(),
super::DEFAULT_OPENAI_BASE_URL,
),
Cow::Borrowed("gpt-4o")
);
assert_eq!(
super::wire_model_for_base_url(
"openai/qwen2.5-coder:7b",
OpenAiCompatConfig::openai(),
"http://127.0.0.1:11434/v1",
),
Cow::Borrowed("qwen2.5-coder:7b")
);
assert_eq!(
super::wire_model_for_base_url(
"openai/llama3.2",
OpenAiCompatConfig::openai(),
"http://localhost:11434/v1/chat/completions",
),
Cow::Borrowed("llama3.2")
);
assert_eq!(
super::wire_model_for_base_url(
"openai/gpt-4.1-mini",
OpenAiCompatConfig::openai(),
"https://openrouter.ai/api/v1",
),
Cow::Borrowed("openai/gpt-4.1-mini")
);
assert_eq!(
super::wire_model_for_base_url(
"openai/gpt-4.1-mini",
OpenAiCompatConfig::openai(),
"https://not-localhost.example.com/v1",
),
Cow::Borrowed("openai/gpt-4.1-mini")
);
}
#[test]
fn local_routing_prefix_strips_only_escape_hatch() {
assert_eq!(
super::strip_routing_prefix("local/Qwen/Qwen3.6-27B-FP8"),
"Qwen/Qwen3.6-27B-FP8"
);
assert_eq!(
super::wire_model_for_base_url(
"local/Qwen/Qwen3.6-27B-FP8",
OpenAiCompatConfig::openai(),
"http://127.0.0.1:8000/v1",
),
Cow::Borrowed("Qwen/Qwen3.6-27B-FP8")
);
}
#[test]
fn check_request_body_size_allows_large_requests_for_openai() {
// Create a request that exceeds DashScope's limit but is under OpenAI's 100MB limit

View File

@ -103,6 +103,58 @@ async fn send_message_posts_json_and_parses_response() {
);
}
#[tokio::test]
async fn send_message_strips_anthropic_routing_prefix_on_wire() {
let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
let server = spawn_server(
state.clone(),
vec![
http_response("200 OK", "application/json", "{\"input_tokens\":1}"),
http_response(
"200 OK",
"application/json",
concat!(
"{",
"\"id\":\"msg_prefixed\",",
"\"type\":\"message\",",
"\"role\":\"assistant\",",
"\"content\":[{\"type\":\"text\",\"text\":\"ok\"}],",
"\"model\":\"claude-opus-4-6\",",
"\"stop_reason\":\"end_turn\",",
"\"stop_sequence\":null,",
"\"usage\":{\"input_tokens\":1,\"output_tokens\":1}",
"}"
),
),
],
)
.await;
let client = AnthropicClient::new("test-key").with_base_url(server.base_url());
client
.send_message(&MessageRequest {
model: "anthropic/claude-opus-4-6".to_string(),
..sample_request(false)
})
.await
.expect("request should succeed");
let captured = state.lock().await;
assert_eq!(
captured.len(),
2,
"count_tokens and messages requests should be captured"
);
let count_tokens_body: serde_json::Value =
serde_json::from_str(&captured[0].body).expect("count_tokens body should be json");
let messages_body: serde_json::Value =
serde_json::from_str(&captured[1].body).expect("request body should be json");
assert_eq!(captured[0].path, "/v1/messages/count_tokens");
assert_eq!(captured[1].path, "/v1/messages");
assert_eq!(count_tokens_body["model"], json!("claude-opus-4-6"));
assert_eq!(messages_body["model"], json!("claude-opus-4-6"));
}
#[tokio::test]
async fn send_message_blocks_oversized_requests_before_the_http_call() {
let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));

View File

@ -159,10 +159,15 @@ async fn send_message_preserves_deepseek_reasoning_content_before_text() {
},
]
);
let captured = state.lock().await;
let request = captured.first().expect("server should capture request");
let body: serde_json::Value = serde_json::from_str(&request.body).expect("json body");
assert_eq!(body["thinking"], json!({"type": "enabled"}));
}
#[tokio::test]
async fn custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params() {
async fn local_openai_gateway_strips_routing_prefix_and_preserves_extra_body_params() {
let state = Arc::new(Mutex::new(Vec::<CapturedRequest>::new()));
let body = concat!(
"{",
@ -206,7 +211,7 @@ async fn custom_openai_gateway_preserves_slash_model_ids_and_extra_body_params()
let captured = state.lock().await;
let request = captured.first().expect("captured request");
let body: serde_json::Value = serde_json::from_str(&request.body).expect("json body");
assert_eq!(body["model"], json!("openai/gpt-4.1-mini"));
assert_eq!(body["model"], json!("gpt-4.1-mini"));
assert_eq!(
body["web_search_options"],
json!({"search_context_size": "low"})

File diff suppressed because it is too large Load Diff

View File

@ -330,20 +330,24 @@ fn prepare_tokio_command(
prepare_sandbox_dirs(cwd);
}
if let Some(launcher) = build_linux_sandbox_command(command, cwd, sandbox_status) {
let mut prepared = TokioCommand::new(launcher.program);
prepared.args(launcher.args);
prepared.current_dir(cwd);
prepared.envs(launcher.env);
return prepared;
}
let mut prepared =
if let Some(launcher) = build_linux_sandbox_command(command, cwd, sandbox_status) {
let mut cmd = TokioCommand::new(launcher.program);
cmd.args(launcher.args);
cmd.envs(launcher.env);
cmd
} else {
let mut cmd = TokioCommand::new("sh");
cmd.arg("-lc").arg(command);
if sandbox_status.filesystem_active {
cmd.env("HOME", cwd.join(".sandbox-home"));
cmd.env("TMPDIR", cwd.join(".sandbox-tmp"));
}
cmd
};
let mut prepared = TokioCommand::new("sh");
prepared.arg("-lc").arg(command).current_dir(cwd);
if sandbox_status.filesystem_active {
prepared.env("HOME", cwd.join(".sandbox-home"));
prepared.env("TMPDIR", cwd.join(".sandbox-tmp"));
}
prepared.current_dir(cwd);
prepared.stdin(Stdio::null());
prepared
}
@ -419,6 +423,27 @@ mod tests {
assert_eq!(structured[0]["event"], "test.hung");
assert_eq!(structured[0]["data"]["provenance"], "bash.timeout");
}
#[test]
fn prevents_stdin_hangs_by_redirecting_to_null() {
let output = execute_bash(BashCommandInput {
command: String::from("cat"),
timeout: Some(2_000),
description: None,
run_in_background: Some(false),
dangerously_disable_sandbox: Some(true),
namespace_restrictions: None,
isolate_network: None,
filesystem_mode: None,
allowed_mounts: None,
})
.expect("bash command should execute cleanly");
assert!(
!output.interrupted,
"Command hung and was cut off by the timeout!"
);
}
}
/// Maximum output bytes before truncation (16 KiB, matching upstream).

File diff suppressed because it is too large Load Diff

View File

@ -92,6 +92,8 @@ enum FieldType {
Bool,
Object,
StringArray,
HookArray,
RulesImport,
Number,
}
@ -102,6 +104,8 @@ impl FieldType {
Self::Bool => "a boolean",
Self::Object => "an object",
Self::StringArray => "an array of strings",
Self::RulesImport => "a string or an array of strings",
Self::HookArray => "an array of strings or hook objects",
Self::Number => "a number",
}
}
@ -114,6 +118,16 @@ impl FieldType {
Self::StringArray => value
.as_array()
.is_some_and(|arr| arr.iter().all(|v| v.as_str().is_some())),
Self::HookArray => value.as_array().is_some_and(|arr| {
arr.iter()
.all(|entry| entry.as_str().is_some() || entry.as_object().is_some())
}),
Self::RulesImport => {
value.as_str().is_some()
|| value
.as_array()
.is_some_and(|arr| arr.iter().all(|v| v.as_str().is_some()))
}
Self::Number => value.as_i64().is_some(),
}
}
@ -201,20 +215,24 @@ const TOP_LEVEL_FIELDS: &[FieldSpec] = &[
name: "provider",
expected: FieldType::Object,
},
FieldSpec {
name: "rulesImport",
expected: FieldType::RulesImport,
},
];
const HOOKS_FIELDS: &[FieldSpec] = &[
FieldSpec {
name: "PreToolUse",
expected: FieldType::StringArray,
expected: FieldType::HookArray,
},
FieldSpec {
name: "PostToolUse",
expected: FieldType::StringArray,
expected: FieldType::HookArray,
},
FieldSpec {
name: "PostToolUseFailure",
expected: FieldType::StringArray,
expected: FieldType::HookArray,
},
];
@ -406,9 +424,10 @@ fn validate_object_keys(
} else if DEPRECATED_FIELDS.iter().any(|d| d.name == key) {
// Deprecated key — handled separately, not an unknown-key error.
} else {
// Unknown key.
// Unknown key — preserve compatibility by surfacing it as a warning
// instead of blocking otherwise valid config files.
let suggestion = suggest_field(key, &known_names);
result.errors.push(ConfigDiagnostic {
result.warnings.push(ConfigDiagnostic {
path: path_display.to_string(),
field: field_path,
line: find_key_line(source, key),
@ -587,10 +606,11 @@ mod tests {
let result = validate_config_file(object, source, &test_path());
// then
assert_eq!(result.errors.len(), 1);
assert_eq!(result.errors[0].field, "unknownField");
assert!(result.errors.is_empty());
assert_eq!(result.warnings.len(), 1);
assert_eq!(result.warnings[0].field, "unknownField");
assert!(matches!(
result.errors[0].kind,
result.warnings[0].kind,
DiagnosticKind::UnknownKey { .. }
));
}
@ -670,9 +690,10 @@ mod tests {
let result = validate_config_file(object, source, &test_path());
// then
assert_eq!(result.errors.len(), 1);
assert_eq!(result.errors[0].line, Some(3));
assert_eq!(result.errors[0].field, "badKey");
assert!(result.errors.is_empty());
assert_eq!(result.warnings.len(), 1);
assert_eq!(result.warnings[0].line, Some(3));
assert_eq!(result.warnings[0].field, "badKey");
}
#[test]
@ -701,8 +722,60 @@ mod tests {
let result = validate_config_file(object, source, &test_path());
// then
assert!(result.errors.is_empty());
assert_eq!(result.warnings.len(), 1);
assert_eq!(result.warnings[0].field, "hooks.BadHook");
}
#[test]
fn validates_object_style_hook_entries() {
let source = r#"{"hooks":{"PreToolUse":["legacy",{"matcher":"Bash","hooks":[{"type":"command","command":"echo ok"}]}]}}"#;
let parsed = JsonValue::parse(source).expect("valid json");
let object = parsed.as_object().expect("object");
let result = validate_config_file(object, source, &test_path());
assert!(result.errors.is_empty(), "{:?}", result.errors);
}
#[test]
fn rejects_wrong_hook_entry_types() {
let source = r#"{"hooks":{"PreToolUse":[42]}}"#;
let parsed = JsonValue::parse(source).expect("valid json");
let object = parsed.as_object().expect("object");
let result = validate_config_file(object, source, &test_path());
assert_eq!(result.errors.len(), 1);
assert_eq!(result.errors[0].field, "hooks.BadHook");
assert_eq!(result.errors[0].field, "hooks.PreToolUse");
}
#[test]
fn validates_rules_import_string_and_array_forms() {
for source in [
r#"{"rulesImport":"auto"}"#,
r#"{"rulesImport":"none"}"#,
r#"{"rulesImport":["cursor","copilot"]}"#,
] {
let parsed = JsonValue::parse(source).expect("valid json");
let object = parsed.as_object().expect("object");
let result = validate_config_file(object, source, &test_path());
assert!(result.errors.is_empty(), "{source}: {:?}", result.errors);
}
}
#[test]
fn rejects_rules_import_wrong_type() {
let source = r#"{"rulesImport":42}"#;
let parsed = JsonValue::parse(source).expect("valid json");
let object = parsed.as_object().expect("object");
let result = validate_config_file(object, source, &test_path());
assert_eq!(result.errors.len(), 1);
assert_eq!(result.errors[0].field, "rulesImport");
}
#[test]
@ -716,8 +789,9 @@ mod tests {
let result = validate_config_file(object, source, &test_path());
// then
assert_eq!(result.errors.len(), 1);
assert_eq!(result.errors[0].field, "permissions.denyAll");
assert!(result.errors.is_empty());
assert_eq!(result.warnings.len(), 1);
assert_eq!(result.warnings[0].field, "permissions.denyAll");
}
#[test]
@ -731,8 +805,9 @@ mod tests {
let result = validate_config_file(object, source, &test_path());
// then
assert_eq!(result.errors.len(), 1);
assert_eq!(result.errors[0].field, "sandbox.containerMode");
assert!(result.errors.is_empty());
assert_eq!(result.warnings.len(), 1);
assert_eq!(result.warnings[0].field, "sandbox.containerMode");
}
#[test]
@ -746,8 +821,9 @@ mod tests {
let result = validate_config_file(object, source, &test_path());
// then
assert_eq!(result.errors.len(), 1);
assert_eq!(result.errors[0].field, "plugins.autoUpdate");
assert!(result.errors.is_empty());
assert_eq!(result.warnings.len(), 1);
assert_eq!(result.warnings[0].field, "plugins.autoUpdate");
}
#[test]
@ -761,8 +837,9 @@ mod tests {
let result = validate_config_file(object, source, &test_path());
// then
assert_eq!(result.errors.len(), 1);
assert_eq!(result.errors[0].field, "oauth.secret");
assert!(result.errors.is_empty());
assert_eq!(result.warnings.len(), 1);
assert_eq!(result.warnings[0].field, "oauth.secret");
}
#[test]
@ -797,8 +874,9 @@ mod tests {
let result = validate_config_file(object, source, &test_path());
// then
assert_eq!(result.errors.len(), 1);
match &result.errors[0].kind {
assert!(result.errors.is_empty());
assert_eq!(result.warnings.len(), 1);
match &result.warnings[0].kind {
DiagnosticKind::UnknownKey {
suggestion: Some(s),
} => assert_eq!(s, "model"),
@ -809,7 +887,7 @@ mod tests {
#[test]
fn format_diagnostics_includes_all_entries() {
// given
let source = r#"{"permissionMode": "plan", "badKey": 1}"#;
let source = r#"{"model": 42, "badKey": 1}"#;
let parsed = JsonValue::parse(source).expect("valid json");
let object = parsed.as_object().expect("object");
let result = validate_config_file(object, source, &test_path());
@ -821,7 +899,7 @@ mod tests {
assert!(output.contains("warning:"));
assert!(output.contains("error:"));
assert!(output.contains("badKey"));
assert!(output.contains("permissionMode"));
assert!(output.contains("model"));
}
#[test]

View File

@ -11,7 +11,7 @@ use std::time::Duration;
use serde_json::{json, Value};
use crate::config::{RuntimeFeatureConfig, RuntimeHookConfig};
use crate::config::{RuntimeFeatureConfig, RuntimeHookCommand, RuntimeHookConfig};
use crate::permissions::PermissionOverride;
const HOOK_PREVIEW_CHAR_LIMIT: usize = 160;
@ -182,7 +182,7 @@ impl HookRunner {
) -> HookRunResult {
Self::run_commands(
HookEvent::PreToolUse,
self.config.pre_tool_use(),
self.config.pre_tool_use_entries(),
tool_name,
tool_input,
None,
@ -232,7 +232,7 @@ impl HookRunner {
) -> HookRunResult {
Self::run_commands(
HookEvent::PostToolUse,
self.config.post_tool_use(),
self.config.post_tool_use_entries(),
tool_name,
tool_input,
Some(tool_output),
@ -282,7 +282,7 @@ impl HookRunner {
) -> HookRunResult {
Self::run_commands(
HookEvent::PostToolUseFailure,
self.config.post_tool_use_failure(),
self.config.post_tool_use_failure_entries(),
tool_name,
tool_input,
Some(tool_error),
@ -312,7 +312,7 @@ impl HookRunner {
#[allow(clippy::too_many_arguments)]
fn run_commands(
event: HookEvent,
commands: &[String],
commands: &[RuntimeHookCommand],
tool_name: &str,
tool_input: &str,
tool_output: Option<&str>,
@ -342,17 +342,21 @@ impl HookRunner {
let payload = hook_payload(event, tool_name, tool_input, tool_output, is_error).to_string();
let mut result = HookRunResult::allow(Vec::new());
for command in commands {
for command in commands
.iter()
.filter(|command| command.matches_tool(tool_name))
{
let command_text = command.command();
if let Some(reporter) = reporter.as_deref_mut() {
reporter.on_event(&HookProgressEvent::Started {
event,
tool_name: tool_name.to_string(),
command: command.clone(),
command: command_text.to_string(),
});
}
match Self::run_command(
command,
command_text,
event,
tool_name,
tool_input,
@ -366,7 +370,7 @@ impl HookRunner {
reporter.on_event(&HookProgressEvent::Completed {
event,
tool_name: tool_name.to_string(),
command: command.clone(),
command: command_text.to_string(),
});
}
merge_parsed_hook_output(&mut result, parsed);
@ -376,7 +380,7 @@ impl HookRunner {
reporter.on_event(&HookProgressEvent::Completed {
event,
tool_name: tool_name.to_string(),
command: command.clone(),
command: command_text.to_string(),
});
}
merge_parsed_hook_output(&mut result, parsed);
@ -388,7 +392,7 @@ impl HookRunner {
reporter.on_event(&HookProgressEvent::Completed {
event,
tool_name: tool_name.to_string(),
command: command.clone(),
command: command_text.to_string(),
});
}
merge_parsed_hook_output(&mut result, parsed);
@ -400,7 +404,7 @@ impl HookRunner {
reporter.on_event(&HookProgressEvent::Cancelled {
event,
tool_name: tool_name.to_string(),
command: command.clone(),
command: command_text.to_string(),
});
}
result.cancelled = true;
@ -737,7 +741,7 @@ fn format_hook_failure(command: &str, code: i32, stdout: Option<&str>, stderr: &
fn shell_command(command: &str) -> CommandWithStdin {
#[cfg(windows)]
let mut command_builder = {
let command_builder = {
let mut command_builder = Command::new("cmd");
command_builder.arg("/C").arg(command);
CommandWithStdin::new(command_builder)
@ -825,7 +829,7 @@ mod tests {
HookAbortSignal, HookEvent, HookProgressEvent, HookProgressReporter, HookRunResult,
HookRunner,
};
use crate::config::{RuntimeFeatureConfig, RuntimeHookConfig};
use crate::config::{RuntimeFeatureConfig, RuntimeHookCommand, RuntimeHookConfig};
use crate::permissions::PermissionOverride;
struct RecordingReporter {
@ -851,6 +855,37 @@ mod tests {
assert_eq!(result, HookRunResult::allow(vec!["pre ok".to_string()]));
}
#[test]
fn object_style_hook_matchers_filter_runtime_execution() {
let runner = HookRunner::new(RuntimeHookConfig::from_hook_commands(
vec![
RuntimeHookCommand::new(shell_snippet("printf 'legacy'")),
RuntimeHookCommand::with_matcher(
shell_snippet("printf 'bash only'"),
Some("Bash".to_string()),
),
RuntimeHookCommand::with_matcher(
shell_snippet("printf 'read only'"),
Some("Read*".to_string()),
),
],
Vec::new(),
Vec::new(),
));
let read_result = runner.run_pre_tool_use("ReadFile", r#"{"path":"README.md"}"#);
let bash_result = runner.run_pre_tool_use("Bash", r#"{"command":"pwd"}"#);
assert_eq!(
read_result,
HookRunResult::allow(vec!["legacy".to_string(), "read only".to_string()])
);
assert_eq!(
bash_result,
HookRunResult::allow(vec!["legacy".to_string(), "bash only".to_string()])
);
}
#[test]
fn denies_exit_code_two() {
let runner = HookRunner::new(RuntimeHookConfig::new(

View File

@ -66,11 +66,13 @@ pub use compact::{
};
pub use config::{
suppress_config_warnings_for_json_mode, ApiTimeoutConfig, ConfigEntry, ConfigError,
ConfigLoader, ConfigSource, McpConfigCollection, McpManagedProxyServerConfig, McpOAuthConfig,
ConfigFileReport, ConfigFileStatus, ConfigInspection, ConfigLoader, ConfigSource,
McpConfigCollection, McpInvalidServerConfig, McpManagedProxyServerConfig, McpOAuthConfig,
McpRemoteServerConfig, McpSdkServerConfig, McpServerConfig, McpStdioServerConfig, McpTransport,
McpWebSocketServerConfig, OAuthConfig, ProviderFallbackConfig, ResolvedPermissionMode,
RuntimeConfig, RuntimeFeatureConfig, RuntimeHookConfig, RuntimePermissionRuleConfig,
RuntimePluginConfig, ScopedMcpServerConfig, CLAW_SETTINGS_SCHEMA_NAME,
RulesImportConfig, RuntimeConfig, RuntimeFeatureConfig, RuntimeHookCommand, RuntimeHookConfig,
RuntimePermissionRuleConfig, RuntimePluginConfig, ScopedMcpServerConfig,
CLAW_SETTINGS_SCHEMA_NAME,
};
pub use config_validate::{
check_unsupported_format, format_diagnostics, validate_config_file, ConfigDiagnostic,
@ -141,8 +143,9 @@ pub use policy_engine::{
PolicyEvaluation, PolicyRule, ReconcileReason, ReviewStatus,
};
pub use prompt::{
load_system_prompt, prepend_bullets, ContextFile, ModelFamilyIdentity, ProjectContext,
PromptBuildError, SystemPromptBuilder, FRONTIER_MODEL_NAME, SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
load_system_prompt, load_system_prompt_with_context, prepend_bullets, ContextFile,
ModelFamilyIdentity, ProjectContext, PromptBuildError, SystemPromptBuilder,
FRONTIER_MODEL_NAME, SYSTEM_PROMPT_DYNAMIC_BOUNDARY,
};
pub use recovery_recipes::{
attempt_recovery, recipe_for, EscalationPolicy, FailureScenario, RecoveryAttemptState,

View File

@ -3,7 +3,7 @@ use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};
use std::process::Command;
use crate::config::{ConfigError, ConfigLoader, RuntimeConfig};
use crate::config::{ConfigError, ConfigLoader, RulesImportConfig, RuntimeConfig};
use crate::git_context::GitContext;
/// Errors raised while assembling the final system prompt.
@ -69,6 +69,18 @@ pub struct ContextFile {
pub content: String,
}
impl ContextFile {
#[must_use]
pub fn source(&self) -> &'static str {
instruction_file_source(&self.path)
}
#[must_use]
pub fn char_count(&self) -> usize {
self.content.chars().count()
}
}
/// Project-local context injected into the rendered system prompt.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ProjectContext {
@ -86,7 +98,24 @@ impl ProjectContext {
current_date: impl Into<String>,
) -> std::io::Result<Self> {
let cwd = cwd.into();
let instruction_files = discover_instruction_files(&cwd)?;
let instruction_files = discover_instruction_files(&cwd, &RulesImportConfig::default())?;
Ok(Self {
cwd,
current_date: current_date.into(),
git_status: None,
git_diff: None,
git_context: None,
instruction_files,
})
}
pub fn discover_with_rules_import(
cwd: impl Into<PathBuf>,
current_date: impl Into<String>,
rules_import: &RulesImportConfig,
) -> std::io::Result<Self> {
let cwd = cwd.into();
let instruction_files = discover_instruction_files(&cwd, rules_import)?;
Ok(Self {
cwd,
current_date: current_date.into(),
@ -109,6 +138,18 @@ impl ProjectContext {
}
}
fn discover_with_git_and_rules_import(
cwd: impl Into<PathBuf>,
current_date: impl Into<String>,
rules_import: &RulesImportConfig,
) -> std::io::Result<ProjectContext> {
let mut context = ProjectContext::discover_with_rules_import(cwd, current_date, rules_import)?;
context.git_status = read_git_status(&context.cwd);
context.git_diff = read_git_diff(&context.cwd);
context.git_context = GitContext::detect(&context.cwd);
Ok(context)
}
/// Builder for the runtime system prompt and dynamic environment sections.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct SystemPromptBuilder {
@ -227,30 +268,81 @@ pub fn prepend_bullets(items: Vec<String>) -> Vec<String> {
items.into_iter().map(|item| format!(" - {item}")).collect()
}
fn discover_instruction_files(cwd: &Path) -> std::io::Result<Vec<ContextFile>> {
let mut directories = Vec::new();
let mut cursor = Some(cwd);
while let Some(dir) = cursor {
directories.push(dir.to_path_buf());
cursor = dir.parent();
fn instruction_file_source(path: &Path) -> &'static str {
let file_name = path.file_name().and_then(|name| name.to_str());
let parent_name = path
.parent()
.and_then(|parent| parent.file_name())
.and_then(|name| name.to_str());
match (parent_name, file_name) {
(Some(".claw"), Some("CLAUDE.md")) => "claw_claude_md",
(Some(".claude"), Some("CLAUDE.md")) => "claude_claude_md",
(_, Some("CLAUDE.md")) => "claude_md",
(_, Some("CLAW.md")) => "claw_md",
(_, Some("AGENTS.md")) => "agents_md",
(_, Some("CLAUDE.local.md")) => "claude_local_md",
(Some(".claw"), Some("instructions.md")) => "claw_instructions",
_ => "rule_file",
}
}
fn discover_instruction_files(
cwd: &Path,
rules_import: &RulesImportConfig,
) -> std::io::Result<Vec<ContextFile>> {
let mut directories = instruction_discovery_dirs(cwd);
directories.reverse();
let mut files = Vec::new();
for dir in directories {
for candidate in [
dir.join("CLAUDE.md"),
dir.join("CLAW.md"),
dir.join("AGENTS.md"),
dir.join("CLAUDE.local.md"),
dir.join(".claw").join("CLAUDE.md"),
dir.join(".claude").join("CLAUDE.md"),
dir.join(".claw").join("instructions.md"),
] {
push_context_file(&mut files, candidate)?;
}
push_rules_dir(&mut files, dir.join(".claw").join("rules"))?;
push_rules_dir(&mut files, dir.join(".claw").join("rules.local"))?;
push_framework_imports(&mut files, &dir, rules_import)?
}
Ok(dedupe_instruction_files(files))
}
fn instruction_discovery_dirs(cwd: &Path) -> Vec<PathBuf> {
let boundary = nearest_git_root(cwd).unwrap_or_else(|| cwd.to_path_buf());
let mut directories = Vec::new();
let mut cursor = Some(cwd);
while let Some(dir) = cursor {
directories.push(dir.to_path_buf());
if dir == boundary {
break;
}
cursor = dir.parent();
}
directories
}
fn nearest_git_root(cwd: &Path) -> Option<PathBuf> {
let mut cursor = Some(cwd);
while let Some(dir) = cursor {
let git_marker = dir.join(".git");
if git_marker.is_dir() || git_marker.is_file() {
return Some(dir.to_path_buf());
}
cursor = dir.parent();
}
None
}
fn push_context_file(files: &mut Vec<ContextFile>, path: PathBuf) -> std::io::Result<()> {
if path.is_dir() {
return Ok(());
}
match fs::read_to_string(&path) {
Ok(content) if !content.trim().is_empty() => {
files.push(ContextFile { path, content });
@ -262,6 +354,64 @@ fn push_context_file(files: &mut Vec<ContextFile>, path: PathBuf) -> std::io::Re
}
}
fn push_rules_dir(files: &mut Vec<ContextFile>, dir: PathBuf) -> std::io::Result<()> {
if dir.is_file() {
return Ok(());
}
let entries = match fs::read_dir(&dir) {
Ok(entries) => entries,
Err(error) if error.kind() == std::io::ErrorKind::NotFound => return Ok(()),
Err(error) => return Err(error),
};
let mut paths = entries
.filter_map(Result::ok)
.map(|entry| entry.path())
.filter(|path| path.is_file() && is_supported_rule_file(path))
.collect::<Vec<_>>();
paths.sort();
for path in paths {
push_context_file(files, path)?;
}
Ok(())
}
fn is_supported_rule_file(path: &Path) -> bool {
path.extension()
.and_then(|extension| extension.to_str())
.is_some_and(|extension| {
matches!(
extension.to_ascii_lowercase().as_str(),
"md" | "txt" | "mdc"
)
})
}
fn push_framework_imports(
files: &mut Vec<ContextFile>,
dir: &Path,
rules_import: &RulesImportConfig,
) -> std::io::Result<()> {
if rules_import.should_import("cursor") {
push_context_file(files, dir.join(".cursorrules"))?;
push_rules_dir(files, dir.join(".cursor").join("rules"))?;
}
if rules_import.should_import("copilot") {
push_context_file(files, dir.join(".github").join("copilot-instructions.md"))?;
}
if rules_import.should_import("windsurf") {
push_context_file(files, dir.join(".windsurfrules"))?;
push_rules_dir(files, dir.join(".windsurfrules"))?;
}
if rules_import.should_import("plandex") {
push_context_file(files, dir.join(".plandex").join("instructions.md"))?;
}
if rules_import.should_import("crush") {
push_context_file(files, dir.join(".crush").join("CLAUDE.md"))?;
push_rules_dir(files, dir.join(".crush").join("rules"))?;
}
Ok(())
}
fn read_git_status(cwd: &Path) -> Option<String> {
let output = Command::new("git")
.args(["--no-optional-locks", "status", "--short", "--branch"])
@ -332,7 +482,7 @@ fn render_project_context(project_context: &ProjectContext) -> String {
];
if !project_context.instruction_files.is_empty() {
bullets.push(format!(
"Claude instruction files discovered: {}.",
"Project instruction files discovered: {}.",
project_context.instruction_files.len()
));
}
@ -367,7 +517,7 @@ fn render_project_context(project_context: &ProjectContext) -> String {
}
fn render_instruction_files(files: &[ContextFile]) -> String {
let mut sections = vec!["# Claude instructions".to_string()];
let mut sections = vec!["# Project instructions".to_string()];
let mut remaining_chars = MAX_TOTAL_INSTRUCTION_CHARS;
for file in files {
if remaining_chars == 0 {
@ -476,14 +626,30 @@ pub fn load_system_prompt(
model_family: ModelFamilyIdentity,
) -> Result<Vec<String>, PromptBuildError> {
let cwd = cwd.into();
let project_context = ProjectContext::discover_with_git(&cwd, current_date.into())?;
let (sections, _) =
load_system_prompt_with_context(cwd, current_date, os_name, os_version, model_family)?;
Ok(sections)
}
/// Loads config and project context, then renders the system prompt text plus metadata.
pub fn load_system_prompt_with_context(
cwd: impl Into<PathBuf>,
current_date: impl Into<String>,
os_name: impl Into<String>,
os_version: impl Into<String>,
model_family: ModelFamilyIdentity,
) -> Result<(Vec<String>, ProjectContext), PromptBuildError> {
let cwd = cwd.into();
let config = ConfigLoader::default_for(&cwd).load()?;
Ok(SystemPromptBuilder::new()
let project_context =
discover_with_git_and_rules_import(&cwd, current_date.into(), config.rules_import())?;
let sections = SystemPromptBuilder::new()
.with_os(os_name, os_version)
.with_model_family(model_family)
.with_project_context(project_context)
.with_project_context(project_context.clone())
.with_runtime_config(config)
.build())
.build();
Ok((sections, project_context))
}
fn render_config_section(config: &RuntimeConfig) -> String {
@ -590,11 +756,84 @@ mod tests {
}
}
#[test]
fn discovers_claw_rules_files_in_sorted_order() {
let root = temp_dir();
let rules = root.join(".claw").join("rules");
let local_rules = root.join(".claw").join("rules.local");
fs::create_dir_all(&rules).expect("rules dir");
fs::create_dir_all(&local_rules).expect("local rules dir");
fs::write(rules.join("b.txt"), "b rule").expect("write b rule");
fs::write(rules.join("a.md"), "a rule").expect("write a rule");
fs::write(rules.join("ignored.json"), "ignored rule").expect("write ignored");
fs::write(local_rules.join("c.mdc"), "c local rule").expect("write local rule");
let context = ProjectContext::discover(&root, "2026-03-31").expect("context should load");
let contents = context
.instruction_files
.iter()
.map(|file| file.content.as_str())
.collect::<Vec<_>>();
assert_eq!(contents, vec!["a rule", "b rule", "c local rule"]);
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn rules_import_none_suppresses_external_framework_rules() {
let root = temp_dir();
fs::create_dir_all(root.join(".claw").join("rules")).expect("rules dir");
fs::write(
root.join(".claw").join("rules").join("project.md"),
"claw rule",
)
.expect("write claw rule");
fs::write(root.join(".cursorrules"), "cursor rule").expect("write cursor rule");
let context = ProjectContext::discover_with_rules_import(
&root,
"2026-03-31",
&crate::config::RulesImportConfig::None,
)
.expect("context should load");
let rendered = render_instruction_files(&context.instruction_files);
assert!(rendered.contains("claw rule"));
assert!(!rendered.contains("cursor rule"));
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn rules_import_list_loads_only_selected_framework_rules() {
let root = temp_dir();
fs::create_dir_all(&root).expect("root dir");
fs::write(root.join(".cursorrules"), "cursor rule").expect("write cursor rule");
fs::create_dir_all(root.join(".github")).expect("github dir");
fs::write(
root.join(".github").join("copilot-instructions.md"),
"copilot rule",
)
.expect("write copilot rule");
let context = ProjectContext::discover_with_rules_import(
&root,
"2026-03-31",
&crate::config::RulesImportConfig::List(vec!["copilot".to_string()]),
)
.expect("context should load");
let rendered = render_instruction_files(&context.instruction_files);
assert!(rendered.contains("copilot rule"));
assert!(!rendered.contains("cursor rule"));
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn discovers_instruction_files_from_ancestor_chain() {
let root = temp_dir();
let nested = root.join("apps").join("api");
fs::create_dir_all(nested.join(".claw")).expect("nested claw dir");
fs::create_dir(root.join(".git")).expect("git boundary");
fs::write(root.join("CLAUDE.md"), "root instructions").expect("write root instructions");
fs::write(root.join("CLAUDE.local.md"), "local instructions")
.expect("write local instructions");
@ -636,11 +875,80 @@ mod tests {
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn discovers_agents_markdown_instruction_file() {
let root = temp_dir();
fs::create_dir_all(&root).expect("root dir");
fs::write(root.join("AGENTS.md"), "agents-only instructions").expect("write AGENTS.md");
let context = ProjectContext::discover(&root, "2026-03-31").expect("context should load");
assert_eq!(context.instruction_files.len(), 1);
assert!(context.instruction_files[0].path.ends_with("AGENTS.md"));
assert!(render_instruction_files(&context.instruction_files)
.contains("agents-only instructions"));
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn discovers_scoped_dot_claude_claude_markdown_instruction_file() {
let root = temp_dir();
fs::create_dir_all(root.join(".claude")).expect("dot claude dir");
fs::write(
root.join(".claude").join("CLAUDE.md"),
"dot-claude-only instructions",
)
.expect("write .claude/CLAUDE.md");
let context = ProjectContext::discover(&root, "2026-03-31").expect("context should load");
assert_eq!(context.instruction_files.len(), 1);
assert!(context.instruction_files[0]
.path
.ends_with(".claude/CLAUDE.md"));
assert!(render_instruction_files(&context.instruction_files)
.contains("dot-claude-only instructions"));
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn discovers_claude_claw_agents_and_dot_claude_instruction_files_together() {
let root = temp_dir();
fs::create_dir_all(root.join(".claude")).expect("dot claude dir");
fs::write(root.join("CLAUDE.md"), "claude instructions").expect("write CLAUDE.md");
fs::write(root.join("CLAW.md"), "claw instructions").expect("write CLAW.md");
fs::write(root.join("AGENTS.md"), "agents instructions").expect("write AGENTS.md");
fs::write(
root.join(".claude").join("CLAUDE.md"),
"dot claude instructions",
)
.expect("write .claude/CLAUDE.md");
let context = ProjectContext::discover(&root, "2026-03-31").expect("context should load");
let rendered = render_instruction_files(&context.instruction_files);
let sources = context
.instruction_files
.iter()
.map(ContextFile::source)
.collect::<Vec<_>>();
assert_eq!(
sources,
vec!["claude_md", "claw_md", "agents_md", "claude_claude_md"]
);
assert!(rendered.contains("claude instructions"));
assert!(rendered.contains("claw instructions"));
assert!(rendered.contains("agents instructions"));
assert!(rendered.contains("dot claude instructions"));
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn dedupes_identical_instruction_content_across_scopes() {
let root = temp_dir();
let nested = root.join("apps").join("api");
fs::create_dir_all(&nested).expect("nested dir");
fs::create_dir(root.join(".git")).expect("git boundary");
fs::write(root.join("CLAUDE.md"), "same rules\n\n").expect("write root");
fs::write(nested.join("CLAUDE.md"), "same rules\n").expect("write nested");
@ -653,6 +961,50 @@ mod tests {
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn discovery_stops_at_git_root_boundary_439() {
let root = temp_dir();
let repo = root.join("repo");
let nested = repo.join("subproj").join("deep").join("nest");
fs::create_dir_all(&nested).expect("nested dir");
fs::create_dir(repo.join(".git")).expect("git boundary");
fs::write(root.join("CLAUDE.md"), "PARENT_CLAUDE").expect("write parent");
fs::write(repo.join("CLAUDE.md"), "REPO_CLAUDE").expect("write repo");
fs::write(repo.join("subproj").join("CLAUDE.md"), "CHILD_CLAUDE").expect("write child");
fs::write(
repo.join("subproj").join("deep").join("CLAUDE.md"),
"DEEP_CLAUDE",
)
.expect("write deep");
let context = ProjectContext::discover(&nested, "2026-03-31").expect("context should load");
let rendered = render_instruction_files(&context.instruction_files);
assert!(!rendered.contains("PARENT_CLAUDE"));
assert!(rendered.contains("REPO_CLAUDE"));
assert!(rendered.contains("CHILD_CLAUDE"));
assert!(rendered.contains("DEEP_CLAUDE"));
assert_eq!(context.instruction_files.len(), 3);
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn discovery_without_git_root_stays_cwd_local_439() {
let root = temp_dir();
let nested = root.join("scratch");
fs::create_dir_all(&nested).expect("nested dir");
fs::write(root.join("CLAUDE.md"), "PARENT_CLAUDE").expect("write parent");
fs::write(nested.join("CLAUDE.md"), "SCRATCH_CLAUDE").expect("write scratch");
let context = ProjectContext::discover(&nested, "2026-03-31").expect("context should load");
let rendered = render_instruction_files(&context.instruction_files);
assert!(!rendered.contains("PARENT_CLAUDE"));
assert!(rendered.contains("SCRATCH_CLAUDE"));
assert_eq!(context.instruction_files.len(), 1);
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn truncates_large_instruction_content_for_rendering() {
let rendered = render_instruction_content(&"x".repeat(4500));
@ -876,6 +1228,51 @@ mod tests {
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn load_system_prompt_respects_rules_import_config() {
let root = temp_dir();
fs::create_dir_all(root.join(".claw")).expect("claw dir");
fs::write(root.join(".cursorrules"), "cursor rule").expect("write cursor rule");
fs::write(
root.join(".claw").join("settings.json"),
r#"{"rulesImport":"none"}"#,
)
.expect("write settings");
let _guard = env_lock();
ensure_valid_cwd();
let previous = std::env::current_dir().expect("cwd");
let original_home = std::env::var("HOME").ok();
let original_claw_home = std::env::var("CLAW_CONFIG_HOME").ok();
std::env::set_var("HOME", &root);
std::env::set_var("CLAW_CONFIG_HOME", root.join("missing-home"));
std::env::set_current_dir(&root).expect("change cwd");
let prompt = super::load_system_prompt(
&root,
"2026-03-31",
"linux",
"6.8",
ModelFamilyIdentity::Claude,
)
.expect("system prompt should load")
.join("\n\n");
std::env::set_current_dir(previous).expect("restore cwd");
if let Some(value) = original_home {
std::env::set_var("HOME", value);
} else {
std::env::remove_var("HOME");
}
if let Some(value) = original_claw_home {
std::env::set_var("CLAW_CONFIG_HOME", value);
} else {
std::env::remove_var("CLAW_CONFIG_HOME");
}
assert!(!prompt.contains("cursor rule"));
assert!(prompt.contains("rulesImport"));
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn renders_default_claude_model_family_identity() {
// given: a prompt builder without an explicit model family override
@ -945,7 +1342,7 @@ mod tests {
assert!(prompt.contains("# System"));
assert!(prompt.contains("# Project context"));
assert!(prompt.contains("# Claude instructions"));
assert!(prompt.contains("# Project instructions"));
assert!(prompt.contains("Project rules"));
assert!(prompt.contains("permissionMode"));
assert!(prompt.contains(SYSTEM_PROMPT_DYNAMIC_BOUNDARY));
@ -990,7 +1387,7 @@ mod tests {
path: PathBuf::from("/tmp/project/CLAUDE.md"),
content: "Project rules".to_string(),
}]);
assert!(rendered.contains("# Claude instructions"));
assert!(rendered.contains("# Project instructions"));
assert!(rendered.contains("scope: /tmp/project"));
assert!(rendered.contains("Project rules"));
}

View File

@ -28,7 +28,8 @@ pub struct SessionStore {
impl SessionStore {
/// Build a store from the server's current working directory.
///
/// The on-disk layout becomes `<cwd>/.claw/sessions/<workspace_hash>/`.
/// The on-disk layout is `<cwd>/.claw/sessions/<workspace_hash>/`,
/// created lazily on first successful session save.
pub fn from_cwd(cwd: impl AsRef<Path>) -> Result<Self, SessionControlError> {
let cwd = cwd.as_ref();
// #151: canonicalize so equivalent paths (symlinks, relative vs
@ -40,7 +41,6 @@ impl SessionStore {
.join(".claw")
.join("sessions")
.join(workspace_fingerprint(&canonical_cwd));
fs::create_dir_all(&sessions_root)?;
Ok(Self {
sessions_root,
workspace_root: canonical_cwd,
@ -49,7 +49,8 @@ impl SessionStore {
/// Build a store from an explicit `--data-dir` flag.
///
/// The on-disk layout becomes `<data_dir>/sessions/<workspace_hash>/`
/// The on-disk layout is `<data_dir>/sessions/<workspace_hash>/`,
/// created lazily on first successful session save.
/// where `<workspace_hash>` is derived from `workspace_root`.
pub fn from_data_dir(
data_dir: impl AsRef<Path>,
@ -64,7 +65,6 @@ impl SessionStore {
.as_ref()
.join("sessions")
.join(workspace_fingerprint(&canonical_workspace));
fs::create_dir_all(&sessions_root)?;
Ok(Self {
sessions_root,
workspace_root: canonical_workspace,
@ -760,14 +760,21 @@ mod tests {
use crate::session::Session;
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};
static TEMP_COUNTER: AtomicU64 = AtomicU64::new(0);
fn temp_dir() -> PathBuf {
let nanos = SystemTime::now()
.duration_since(UNIX_EPOCH)
.expect("time should be after epoch")
.as_nanos();
std::env::temp_dir().join(format!("runtime-session-control-{nanos}"))
let counter = TEMP_COUNTER.fetch_add(1, Ordering::Relaxed);
std::env::temp_dir().join(format!(
"runtime-session-control-{}-{nanos}-{counter}",
std::process::id()
))
}
fn persist_session(root: &Path, text: &str) -> Session {
@ -981,6 +988,38 @@ mod tests {
}
}
#[test]
fn session_store_from_cwd_is_side_effect_free_until_save() {
// given
let base = temp_dir();
let workspace = base.join("fresh-workspace");
fs::create_dir_all(&workspace).expect("workspace should exist");
// when
let store = SessionStore::from_cwd(&workspace).expect("store should build");
// then — resolving the store must not create .claw/session partitions.
assert!(
!workspace.join(".claw").exists(),
"session store construction must not create .claw side effects"
);
assert!(
!store.sessions_dir().exists(),
"session partition should be created lazily on save"
);
let session = persist_session_via_store(&store, "first saved turn");
assert!(
store
.sessions_dir()
.join(format!("{}.jsonl", session.session_id))
.exists(),
"saving a managed session should create the lazy session partition"
);
fs::remove_dir_all(base).expect("temp dir should clean up");
}
#[test]
fn session_store_from_cwd_isolates_sessions_by_workspace() {
// given

View File

@ -12,7 +12,6 @@ path = "src/main.rs"
[dependencies]
api = { path = "../api" }
commands = { path = "../commands" }
compat-harness = { path = "../compat-harness" }
crossterm = "0.28"
pulldown-cmark = "0.13"
rustyline = "15"

View File

@ -1,10 +1,9 @@
use std::env;
use std::process::Command;
fn main() {
// Get git SHA (short hash)
let git_sha = Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
fn command_output(program: &str, args: &[&str]) -> Option<String> {
Command::new(program)
.args(args)
.output()
.ok()
.and_then(|output| {
@ -14,11 +13,37 @@ fn main() {
None
}
})
.map_or_else(|| "unknown".to_string(), |s| s.trim().to_string());
.map(|value| value.trim().to_string())
.filter(|value| !value.is_empty())
}
fn main() {
let git_sha =
command_output("git", &["rev-parse", "HEAD"]).unwrap_or_else(|| "unknown".to_string());
let git_sha_short = command_output("git", &["rev-parse", "--short=12", "HEAD"])
.or_else(|| git_sha.get(..git_sha.len().min(12)).map(str::to_string))
.unwrap_or_else(|| "unknown".to_string());
let git_dirty = command_output("git", &["status", "--porcelain"])
.map(|status| (!status.trim().is_empty()).to_string())
.unwrap_or_else(|| "false".to_string());
let git_branch = command_output("git", &["branch", "--show-current"])
.unwrap_or_else(|| "unknown".to_string());
let git_commit_date = command_output("git", &["show", "-s", "--format=%cI", "HEAD"])
.unwrap_or_else(|| "unknown".to_string());
let git_commit_timestamp = command_output("git", &["show", "-s", "--format=%ct", "HEAD"])
.unwrap_or_else(|| "unknown".to_string());
let rustc_version =
command_output("rustc", &["--version"]).unwrap_or_else(|| "unknown".to_string());
println!("cargo:rustc-env=GIT_SHA={git_sha}");
println!("cargo:rustc-env=GIT_SHA_SHORT={git_sha_short}");
println!("cargo:rustc-env=GIT_DIRTY={git_dirty}");
println!("cargo:rustc-env=GIT_BRANCH={git_branch}");
println!("cargo:rustc-env=GIT_COMMIT_DATE={git_commit_date}");
println!("cargo:rustc-env=GIT_COMMIT_TIMESTAMP={git_commit_timestamp}");
println!("cargo:rustc-env=RUSTC_VERSION={rustc_version}");
// TARGET is always set by Cargo during build
// TARGET is always set by Cargo during build.
let target = env::var("TARGET").unwrap_or_else(|_| "unknown".to_string());
println!("cargo:rustc-env=TARGET={target}");
@ -35,23 +60,12 @@ fn main() {
})
.or_else(|| std::env::var("BUILD_DATE").ok())
.unwrap_or_else(|| {
// Fall back to current date via `date` command
Command::new("date")
.args(["+%Y-%m-%d"])
.output()
.ok()
.and_then(|o| {
if o.status.success() {
String::from_utf8(o.stdout).ok()
} else {
None
}
})
.map_or_else(|| "unknown".to_string(), |s| s.trim().to_string())
command_output("date", &["+%Y-%m-%d"]).unwrap_or_else(|| "unknown".to_string())
});
println!("cargo:rustc-env=BUILD_DATE={build_date}");
// Rerun if git state changes
println!("cargo:rerun-if-changed=.git/HEAD");
println!("cargo:rerun-if-changed=.git/refs");
// Rerun if git state changes. Paths are relative to this package root.
println!("cargo:rerun-if-changed=../../../.git/HEAD");
println!("cargo:rerun-if-changed=../../../.git/refs");
println!("cargo:rerun-if-changed=../../../.git/index");
}

View File

@ -4,7 +4,14 @@ use std::path::{Path, PathBuf};
const STARTER_CLAW_JSON: &str = concat!(
"{\n",
" \"permissions\": {\n",
" \"defaultMode\": \"dontAsk\"\n",
" \"defaultMode\": \"acceptEdits\"\n",
" }\n",
"}\n",
);
const STARTER_SETTINGS_JSON: &str = concat!(
"{\n",
" \"permissions\": {\n",
" \"defaultMode\": \"acceptEdits\"\n",
" }\n",
"}\n",
);
@ -15,6 +22,8 @@ const GITIGNORE_ENTRIES: [&str; 3] = [".claw/settings.local.json", ".claw/sessio
pub(crate) enum InitStatus {
Created,
Updated,
Partial,
Deferred,
Skipped,
}
@ -24,6 +33,8 @@ impl InitStatus {
match self {
Self::Created => "created",
Self::Updated => "updated",
Self::Partial => "partial (created missing sub-files)",
Self::Deferred => "deferred (created on first session save)",
Self::Skipped => "skipped (already exists)",
}
}
@ -36,6 +47,8 @@ impl InitStatus {
match self {
Self::Created => "created",
Self::Updated => "updated",
Self::Partial => "partial",
Self::Deferred => "deferred",
Self::Skipped => "skipped",
}
}
@ -123,9 +136,30 @@ pub(crate) fn initialize_repo(cwd: &Path) -> Result<InitReport, Box<dyn std::err
let mut artifacts = Vec::new();
let claw_dir = cwd.join(".claw");
let claw_dir_status = ensure_dir(&claw_dir)?;
let settings_json = claw_dir.join("settings.json");
let settings_status = write_file_if_missing(&settings_json, STARTER_SETTINGS_JSON)?;
let claw_dir_status =
if claw_dir_status == InitStatus::Skipped && settings_status == InitStatus::Created {
InitStatus::Partial
} else {
claw_dir_status
};
artifacts.push(InitArtifact {
name: ".claw/",
status: ensure_dir(&claw_dir)?,
status: claw_dir_status,
});
artifacts.push(InitArtifact {
name: ".claw/settings.json",
status: settings_status,
});
artifacts.push(InitArtifact {
name: ".claw/sessions/",
status: if claw_dir.join("sessions").is_dir() {
InitStatus::Skipped
} else {
InitStatus::Deferred
},
});
let claw_json = cwd.join(".claw.json");
@ -414,11 +448,26 @@ mod tests {
concat!(
"{\n",
" \"permissions\": {\n",
" \"defaultMode\": \"dontAsk\"\n",
" \"defaultMode\": \"acceptEdits\"\n",
" }\n",
"}\n",
)
);
assert_eq!(
fs::read_to_string(root.join(".claw").join("settings.json"))
.expect("read project settings"),
concat!(
"{\n",
" \"permissions\": {\n",
" \"defaultMode\": \"acceptEdits\"\n",
" }\n",
"}\n",
)
);
assert!(
!root.join(".claw").join("sessions").exists(),
"sessions directory should be deferred until first session save"
);
let gitignore = fs::read_to_string(root.join(".gitignore")).expect("read gitignore");
assert!(gitignore.contains(".claw/settings.local.json"));
assert!(gitignore.contains(".claw/sessions/"));
@ -436,14 +485,24 @@ mod tests {
fs::create_dir_all(&root).expect("create root");
fs::write(root.join("CLAUDE.md"), "custom guidance\n").expect("write existing claude md");
fs::write(root.join(".gitignore"), ".claw/settings.local.json\n").expect("write gitignore");
fs::create_dir_all(root.join(".claw")).expect("create existing .claw dir");
let first = initialize_repo(&root).expect("first init should succeed");
assert!(first
.render()
.contains("CLAUDE.md skipped (already exists)"));
assert_eq!(
first.artifacts_with_status(InitStatus::Partial),
vec![".claw/".to_string()],
"existing .claw/ should report partial when init creates missing settings.json"
);
assert!(root.join(".claw").join("settings.json").is_file());
let second = initialize_repo(&root).expect("second init should succeed");
let second_rendered = second.render();
assert!(second_rendered.contains(".claw/"));
assert!(second_rendered.contains(".claw/settings.json"));
assert!(second_rendered.contains(".claw/sessions/"));
assert!(second_rendered.contains(".claw.json"));
assert!(second_rendered.contains("skipped (already exists)"));
assert!(second_rendered.contains(".gitignore skipped (already exists)"));
@ -474,16 +533,22 @@ mod tests {
created_names,
vec![
".claw/".to_string(),
".claw/settings.json".to_string(),
".claw.json".to_string(),
".gitignore".to_string(),
"CLAUDE.md".to_string(),
],
"fresh init should place all four artifacts in created[]"
"fresh init should place created artifacts in created[]"
);
assert!(
fresh.artifacts_with_status(InitStatus::Skipped).is_empty(),
"fresh init should have no skipped artifacts"
);
assert_eq!(
fresh.artifacts_with_status(InitStatus::Deferred),
vec![".claw/sessions/".to_string()],
"fresh init should report session storage as deferred"
);
let second = initialize_repo(&root).expect("second init should succeed");
let skipped_names = second.artifacts_with_status(InitStatus::Skipped);
@ -491,27 +556,38 @@ mod tests {
skipped_names,
vec![
".claw/".to_string(),
".claw/settings.json".to_string(),
".claw.json".to_string(),
".gitignore".to_string(),
"CLAUDE.md".to_string(),
],
"idempotent init should place all four artifacts in skipped[]"
"idempotent init should place existing artifacts in skipped[]"
);
assert!(
second.artifacts_with_status(InitStatus::Created).is_empty(),
"idempotent init should have no created artifacts"
);
assert_eq!(
second.artifacts_with_status(InitStatus::Deferred),
vec![".claw/sessions/".to_string()],
"idempotent init should keep session storage deferred until first save"
);
// artifact_json_entries() uses the machine-stable `json_tag()` which
// never changes wording (unlike `label()` which says "skipped (already exists)").
let entries = second.artifact_json_entries();
assert_eq!(entries.len(), 4);
assert_eq!(entries.len(), 6);
for entry in &entries {
let name = entry.get("name").and_then(|v| v.as_str()).unwrap();
let status = entry.get("status").and_then(|v| v.as_str()).unwrap();
assert_eq!(
status, "skipped",
"machine status tag should be the bare word 'skipped', not label()'s 'skipped (already exists)'"
);
if name == ".claw/sessions/" {
assert_eq!(status, "deferred");
} else {
assert_eq!(
status, "skipped",
"machine status tag should be the bare word 'skipped', not label()'s 'skipped (already exists)'"
);
}
}
fs::remove_dir_all(root).expect("cleanup temp dir");

File diff suppressed because it is too large Load Diff

View File

@ -1,6 +1,7 @@
#![allow(clippy::while_let_on_iterator)]
use std::fs;
use std::io::Write;
use std::path::PathBuf;
use std::process::{Command, Output, Stdio};
use std::sync::atomic::{AtomicU64, Ordering};
@ -246,8 +247,121 @@ stderr:
}
#[test]
fn compact_subcommand_json_help_fails_fast_when_stdin_closed() {
let workspace = unique_temp_dir("compact-nontty-json-help");
fn prompt_subcommand_reads_prompt_from_stdin_when_no_positional_arg_423() {
let runtime = tokio::runtime::Runtime::new().expect("tokio runtime should build");
let server = runtime
.block_on(MockAnthropicService::spawn())
.expect("mock service should start");
let base_url = server.base_url();
let workspace = unique_temp_dir("prompt-stdin-423");
let config_home = workspace.join("config-home");
let home = workspace.join("home");
fs::create_dir_all(&workspace).expect("workspace should exist");
fs::create_dir_all(&config_home).expect("config home should exist");
fs::create_dir_all(&home).expect("home should exist");
let prompt = format!("{SCENARIO_PREFIX}streaming_text\n");
let output = run_claw_with_stdin(
&workspace,
&config_home,
&home,
&base_url,
&[
"prompt",
"--output-format",
"json",
"--compact",
"--permission-mode",
"read-only",
"--model",
"sonnet",
],
&prompt,
);
assert!(
output.status.success(),
"prompt stdin run should succeed\nstdout:\n{}\n\nstderr:\n{}",
String::from_utf8_lossy(&output.stdout),
String::from_utf8_lossy(&output.stderr),
);
let parsed: Value = serde_json::from_slice(&output.stdout).expect("stdout should parse");
assert_eq!(
parsed["message"],
"Mock streaming says hello from the parity harness."
);
let captured = runtime.block_on(server.captured_requests());
assert!(
captured
.iter()
.any(|request| request.raw_body.contains("PARITY_SCENARIO:streaming_text")),
"stdin prompt should reach the provider request: {captured:?}"
);
fs::remove_dir_all(&workspace).expect("workspace cleanup should succeed");
}
#[test]
fn prompt_subcommand_stdin_flag_appends_pipe_context_423() {
let runtime = tokio::runtime::Runtime::new().expect("tokio runtime should build");
let server = runtime
.block_on(MockAnthropicService::spawn())
.expect("mock service should start");
let base_url = server.base_url();
let workspace = unique_temp_dir("prompt-stdin-flag-423");
let config_home = workspace.join("config-home");
let home = workspace.join("home");
fs::create_dir_all(&workspace).expect("workspace should exist");
fs::create_dir_all(&config_home).expect("config home should exist");
fs::create_dir_all(&home).expect("home should exist");
let prompt_context = format!("{SCENARIO_PREFIX}streaming_text\n");
let output = run_claw_with_stdin(
&workspace,
&config_home,
&home,
&base_url,
&[
"prompt",
"Use stdin context",
"--stdin",
"--output-format",
"json",
"--compact",
"--permission-mode",
"read-only",
"--model",
"sonnet",
],
&prompt_context,
);
assert!(
output.status.success(),
"prompt --stdin run should succeed\nstdout:\n{}\n\nstderr:\n{}",
String::from_utf8_lossy(&output.stdout),
String::from_utf8_lossy(&output.stderr),
);
let captured = runtime.block_on(server.captured_requests());
let provider_body = captured
.iter()
.find(|request| request.raw_body.contains("Use stdin context"))
.expect("merged prompt should reach provider");
assert!(
provider_body
.raw_body
.contains("PARITY_SCENARIO:streaming_text"),
"merged prompt should include stdin context: {provider_body:?}"
);
fs::remove_dir_all(&workspace).expect("workspace cleanup should succeed");
}
#[test]
fn compact_subcommand_json_fails_fast_when_stdin_closed() {
let workspace = unique_temp_dir("compact-nontty-json");
let config_home = workspace.join("config-home");
let home = workspace.join("home");
fs::create_dir_all(&workspace).expect("workspace should exist");
@ -258,19 +372,19 @@ fn compact_subcommand_json_help_fails_fast_when_stdin_closed() {
&workspace,
&config_home,
&home,
&["compact", "--output-format", "json", "--help"],
&["compact", "--output-format", "json"],
Duration::from_secs(2),
);
assert!(
!output.status.success(),
"compact json help should fail non-zero"
"compact json should fail non-zero"
);
// #819/#820/#823: JSON abort envelopes route to stdout
let stderr = String::from_utf8(output.stderr).expect("stderr should be utf8");
assert!(
stderr.trim().is_empty() || !stderr.trim_start().starts_with('{'),
"compact json help should not emit JSON envelope to stderr (#819/#820/#823): {stderr}"
"compact json should not emit JSON envelope to stderr (#819/#820/#823): {stderr}"
);
let stdout = String::from_utf8(output.stdout).expect("stdout should be utf8");
let parsed: Value =
@ -356,6 +470,39 @@ fn run_claw(
command.output().expect("claw should launch")
}
fn run_claw_with_stdin(
cwd: &std::path::Path,
config_home: &std::path::Path,
home: &std::path::Path,
base_url: &str,
args: &[&str],
stdin: &str,
) -> Output {
let mut child = Command::new(env!("CARGO_BIN_EXE_claw"))
.current_dir(cwd)
.env_clear()
.env("ANTHROPIC_API_KEY", "test-compact-key")
.env("ANTHROPIC_BASE_URL", base_url)
.env("CLAW_CONFIG_HOME", config_home)
.env("HOME", home)
.env("NO_COLOR", "1")
.env("PATH", "/usr/bin:/bin")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.args(args)
.spawn()
.expect("claw should launch");
child
.stdin
.as_mut()
.expect("stdin should be piped")
.write_all(stdin.as_bytes())
.expect("stdin should write");
child.stdin.take();
child.wait_with_output().expect("output should collect")
}
fn run_claw_closed_stdin_with_timeout(
cwd: &std::path::Path,
config_home: &std::path::Path,

File diff suppressed because it is too large Load Diff

View File

@ -222,6 +222,73 @@ fn resume_latest_restores_the_most_recent_managed_session() {
assert!(stdout.contains(newer_path.to_str().expect("utf8 path")));
}
#[test]
fn resume_latest_missing_session_fails_without_creating_session_dirs_435() {
// given
let temp_dir = unique_temp_dir("resume-latest-missing-435");
let project_dir = temp_dir.join("project");
let config_home = temp_dir.join("config-home");
let home = temp_dir.join("home");
fs::create_dir_all(&project_dir).expect("project dir should exist");
fs::create_dir_all(&config_home).expect("config home should exist");
fs::create_dir_all(&home).expect("home should exist");
let envs = [
(
"CLAW_CONFIG_HOME",
config_home.to_str().expect("utf8 config home"),
),
("HOME", home.to_str().expect("utf8 home")),
("ANTHROPIC_API_KEY", ""),
("ANTHROPIC_AUTH_TOKEN", ""),
("OPENAI_API_KEY", ""),
];
// when — both text and JSON resume failures should be non-zero and read-only.
let text = run_claw_with_env(&project_dir, &["--resume", "latest"], &envs);
let json = run_claw_with_env(
&project_dir,
&["--output-format", "json", "--resume", "latest"],
&envs,
);
// then
assert_eq!(
text.status.code(),
Some(1),
"text resume failure must be non-zero"
);
assert!(
text.stdout.is_empty(),
"text resume failure should not claim success on stdout: {}",
String::from_utf8_lossy(&text.stdout)
);
let text_stderr = String::from_utf8_lossy(&text.stderr);
assert!(
text_stderr.contains("no managed sessions found"),
"text failure should explain missing sessions: {text_stderr}"
);
assert_eq!(
json.status.code(),
Some(1),
"JSON resume failure must be non-zero"
);
assert!(
json.stderr.is_empty(),
"JSON resume failure should keep stderr empty: {}",
String::from_utf8_lossy(&json.stderr)
);
let parsed: Value = serde_json::from_slice(&json.stdout)
.expect("JSON resume failure should emit JSON to stdout");
assert_eq!(parsed["status"], "error");
assert_eq!(parsed["action"], "restore");
assert_eq!(parsed["error_kind"], "no_managed_sessions");
assert!(
!project_dir.join(".claw").exists(),
"failed resume must not create .claw/session directories"
);
}
#[test]
fn resumed_status_command_emits_structured_json_when_requested() {
// given
@ -268,7 +335,7 @@ fn resumed_status_command_emits_structured_json_when_requested() {
assert_eq!(parsed["kind"], "status");
// model is null in resume mode (not known without --model flag)
assert!(parsed["model"].is_null());
assert_eq!(parsed["permission_mode"], "danger-full-access");
assert_eq!(parsed["permission_mode"], "workspace-write");
assert_eq!(parsed["usage"]["messages"], 1);
assert!(parsed["usage"]["turns"].is_number());
assert!(parsed["workspace"]["cwd"].as_str().is_some());
@ -396,6 +463,9 @@ fn resumed_version_command_emits_structured_json() {
assert!(parsed["version"].as_str().is_some());
assert!(parsed["git_sha"].as_str().is_some());
assert!(parsed["target"].as_str().is_some());
assert!(parsed["git_sha_short"].as_str().is_some());
assert!(parsed.get("message").is_none());
assert!(parsed["human_readable"].as_str().is_some());
}
#[test]

View File

@ -201,30 +201,20 @@ impl GlobalToolRegistry {
return Ok(None);
}
let builtin_specs = mvp_tool_specs();
let canonical_names = builtin_specs
.iter()
.map(|spec| spec.name.to_string())
.chain(
self.plugin_tools
.iter()
.map(|tool| tool.definition().name.clone()),
)
.chain(self.runtime_tools.iter().map(|tool| tool.name.clone()))
.collect::<Vec<_>>();
let mut name_map = canonical_names
.iter()
.map(|name| (normalize_tool_name(name), name.clone()))
.collect::<BTreeMap<_, _>>();
let actual_names = self.actual_tool_names();
let canonical_names = self.canonical_allowed_tool_names();
let canonical_name_set = canonical_names.iter().cloned().collect::<BTreeSet<_>>();
let mut name_map = BTreeMap::new();
for actual in &actual_names {
let canonical = canonical_allowed_tool_name(actual);
name_map.insert(allowed_tool_lookup_key(actual), canonical.clone());
name_map.insert(allowed_tool_lookup_key(&canonical), canonical);
}
for (alias, canonical) in [
("read", "read_file"),
("write", "write_file"),
("edit", "edit_file"),
("glob", "glob_search"),
("grep", "grep_search"),
] {
name_map.insert(alias.to_string(), canonical.to_string());
for (alias, canonical) in self.allowed_tool_aliases() {
if canonical_name_set.contains(&canonical) {
name_map.insert(allowed_tool_lookup_key(&alias), canonical);
}
}
let mut allowed = BTreeSet::new();
@ -233,11 +223,11 @@ impl GlobalToolRegistry {
.split(|ch: char| ch == ',' || ch.is_whitespace())
.filter(|token| !token.is_empty())
{
let normalized = normalize_tool_name(token);
let canonical = name_map.get(&normalized).ok_or_else(|| {
let canonical = name_map.get(&allowed_tool_lookup_key(token)).ok_or_else(|| {
format!(
"unsupported tool in --allowedTools: {token} (expected one of: {})",
canonical_names.join(", ")
"invalid_tool_name: unsupported tool in --allowedTools: {token}\nAvailable: {}\nAliases: {}\nHint: Use canonical snake_case tool names from Available or aliases from Aliases.",
canonical_names.join(", "),
format_allowed_tool_aliases(&self.allowed_tool_aliases())
)
})?;
allowed.insert(canonical.clone());
@ -258,7 +248,10 @@ impl GlobalToolRegistry {
pub fn definitions(&self, allowed_tools: Option<&BTreeSet<String>>) -> Vec<ToolDefinition> {
let builtin = mvp_tool_specs()
.into_iter()
.filter(|spec| allowed_tools.is_none_or(|allowed| allowed.contains(spec.name)))
.filter(|spec| {
allowed_tools
.is_none_or(|allowed| allowed.contains(&canonical_allowed_tool_name(spec.name)))
})
.map(|spec| ToolDefinition {
name: spec.name.to_string(),
description: Some(spec.description.to_string()),
@ -267,7 +260,11 @@ impl GlobalToolRegistry {
let runtime = self
.runtime_tools
.iter()
.filter(|tool| allowed_tools.is_none_or(|allowed| allowed.contains(tool.name.as_str())))
.filter(|tool| {
allowed_tools.is_none_or(|allowed| {
allowed.contains(&canonical_allowed_tool_name(&tool.name))
})
})
.map(|tool| ToolDefinition {
name: tool.name.clone(),
description: tool.description.clone(),
@ -277,8 +274,11 @@ impl GlobalToolRegistry {
.plugin_tools
.iter()
.filter(|tool| {
allowed_tools
.is_none_or(|allowed| allowed.contains(tool.definition().name.as_str()))
allowed_tools.is_none_or(|allowed| {
allowed.contains(&canonical_allowed_tool_name(
tool.definition().name.as_str(),
))
})
})
.map(|tool| ToolDefinition {
name: tool.definition().name.clone(),
@ -294,19 +294,29 @@ impl GlobalToolRegistry {
) -> Result<Vec<(String, PermissionMode)>, String> {
let builtin = mvp_tool_specs()
.into_iter()
.filter(|spec| allowed_tools.is_none_or(|allowed| allowed.contains(spec.name)))
.filter(|spec| {
allowed_tools
.is_none_or(|allowed| allowed.contains(&canonical_allowed_tool_name(spec.name)))
})
.map(|spec| (spec.name.to_string(), spec.required_permission));
let runtime = self
.runtime_tools
.iter()
.filter(|tool| allowed_tools.is_none_or(|allowed| allowed.contains(tool.name.as_str())))
.filter(|tool| {
allowed_tools.is_none_or(|allowed| {
allowed.contains(&canonical_allowed_tool_name(&tool.name))
})
})
.map(|tool| (tool.name.clone(), tool.required_permission));
let plugin = self
.plugin_tools
.iter()
.filter(|tool| {
allowed_tools
.is_none_or(|allowed| allowed.contains(tool.definition().name.as_str()))
allowed_tools.is_none_or(|allowed| {
allowed.contains(&canonical_allowed_tool_name(
tool.definition().name.as_str(),
))
})
})
.map(|tool| {
permission_mode_from_plugin(tool.required_permission())
@ -316,6 +326,52 @@ impl GlobalToolRegistry {
Ok(builtin.chain(runtime).chain(plugin).collect())
}
#[must_use]
pub fn actual_tool_names(&self) -> Vec<String> {
mvp_tool_specs()
.iter()
.map(|spec| spec.name.to_string())
.chain(
self.plugin_tools
.iter()
.map(|tool| tool.definition().name.clone()),
)
.chain(self.runtime_tools.iter().map(|tool| tool.name.clone()))
.collect()
}
#[must_use]
pub fn canonical_allowed_tool_names(&self) -> Vec<String> {
self.actual_tool_names()
.into_iter()
.map(|name| canonical_allowed_tool_name(&name))
.collect::<BTreeSet<_>>()
.into_iter()
.collect()
}
#[must_use]
pub fn allowed_tool_aliases(&self) -> BTreeMap<String, String> {
let mut aliases = BTreeMap::from([
("read".to_string(), "read_file".to_string()),
("Read".to_string(), "read_file".to_string()),
("write".to_string(), "write_file".to_string()),
("Write".to_string(), "write_file".to_string()),
("edit".to_string(), "edit_file".to_string()),
("Edit".to_string(), "edit_file".to_string()),
("glob".to_string(), "glob_search".to_string()),
("Glob".to_string(), "glob_search".to_string()),
("grep".to_string(), "grep_search".to_string()),
("Grep".to_string(), "grep_search".to_string()),
]);
for actual in self.actual_tool_names() {
let canonical = canonical_allowed_tool_name(&actual);
if actual != canonical {
aliases.insert(actual, canonical);
}
}
aliases
}
#[must_use]
pub fn has_runtime_tool(&self, name: &str) -> bool {
self.runtime_tools.iter().any(|tool| tool.name == name)
@ -378,8 +434,40 @@ impl GlobalToolRegistry {
}
}
fn normalize_tool_name(value: &str) -> String {
value.trim().replace('-', "_").to_ascii_lowercase()
pub fn canonical_allowed_tool_name(value: &str) -> String {
let trimmed = value.trim().replace('-', "_");
let mut output = String::new();
let chars = trimmed.chars().collect::<Vec<_>>();
for (index, ch) in chars.iter().copied().enumerate() {
if ch == '_' || ch.is_whitespace() {
output.push('_');
continue;
}
let previous = index.checked_sub(1).and_then(|i| chars.get(i)).copied();
let next = chars.get(index + 1).copied();
if ch.is_ascii_uppercase()
&& index > 0
&& !output.ends_with('_')
&& (previous.is_some_and(|p| p.is_ascii_lowercase() || p.is_ascii_digit())
|| next.is_some_and(|n| n.is_ascii_lowercase()))
{
output.push('_');
}
output.push(ch.to_ascii_lowercase());
}
output.trim_matches('_').to_string()
}
fn allowed_tool_lookup_key(value: &str) -> String {
canonical_allowed_tool_name(value).replace('_', "")
}
fn format_allowed_tool_aliases(aliases: &BTreeMap<String, String>) -> String {
aliases
.iter()
.map(|(alias, canonical)| format!("{alias}={canonical}"))
.collect::<Vec<_>>()
.join(", ")
}
fn permission_mode_from_plugin(value: &str) -> Result<PermissionMode, String> {
@ -514,7 +602,7 @@ pub fn mvp_tool_specs() -> Vec<ToolSpec> {
"required": ["url", "prompt"],
"additionalProperties": false
}),
required_permission: PermissionMode::ReadOnly,
required_permission: PermissionMode::DangerFullAccess,
},
ToolSpec {
name: "WebSearch",
@ -535,7 +623,7 @@ pub fn mvp_tool_specs() -> Vec<ToolSpec> {
"required": ["query"],
"additionalProperties": false
}),
required_permission: PermissionMode::ReadOnly,
required_permission: PermissionMode::DangerFullAccess,
},
ToolSpec {
name: "TodoWrite",
@ -1225,13 +1313,14 @@ pub fn mvp_tool_specs() -> Vec<ToolSpec> {
},
ToolSpec {
name: "GitShow",
description: "Show a commit, tag, or tree object with its diff. Supports showing a specific file at a commit (commit:path) and stat-only mode. Use this instead of running git show via bash to get structured output.",
description: "Show a commit, tag, or tree object. Use format to control output: patch (default) shows the full diff, stat shows a diffstat summary, and metadata shows commit info without the diff. Supports showing a specific file at a commit (commit:path) for patch/stat output. Use this instead of running git show via bash to get structured output.",
input_schema: json!({
"type": "object",
"properties": {
"commit": { "type": "string" },
"path": { "type": "string" },
"stat": { "type": "boolean" }
"stat": { "type": "boolean" },
"format": { "type": "string", "enum": ["patch", "stat", "metadata"] },
},
"required": ["commit"],
"additionalProperties": false
@ -1320,8 +1409,26 @@ fn execute_tool_with_enforcer(
maybe_enforce_permission_check_with_mode(enforcer, name, input, required_mode)?;
run_grep_search(grep_input)
}
"WebFetch" => from_value::<WebFetchInput>(input).and_then(run_web_fetch),
"WebSearch" => from_value::<WebSearchInput>(input).and_then(run_web_search),
"WebFetch" => {
let web_input = from_value::<WebFetchInput>(input)?;
maybe_enforce_permission_check_with_mode(
enforcer,
name,
input,
PermissionMode::DangerFullAccess,
)?;
run_web_fetch(web_input)
}
"WebSearch" => {
let web_input = from_value::<WebSearchInput>(input)?;
maybe_enforce_permission_check_with_mode(
enforcer,
name,
input,
PermissionMode::DangerFullAccess,
)?;
run_web_search(web_input)
}
"TodoWrite" => from_value::<TodoWriteInput>(input).and_then(run_todo_write),
"Skill" => from_value::<SkillInput>(input).and_then(run_skill),
"Agent" => from_value::<AgentInput>(input).and_then(run_agent),
@ -2008,14 +2115,37 @@ fn run_git_log(input: GitLogInput) -> Result<String, String> {
}
}
#[allow(clippy::needless_pass_by_value)]
/// Execute `git show` for a given commit, optionally with --stat or a file path.
/// Uses the `commit:path` syntax when a path is specified.
fn run_git_show(input: GitShowInput) -> Result<String, String> {
let mut args: Vec<String> = vec!["show".to_string()];
if input.stat.unwrap_or(false) {
args.push("--stat".to_string());
match input.format.as_deref() {
Some("metadata") if input.path.is_some() => {
return Err(
"GitShow format \"metadata\" cannot be combined with path; metadata describes a commit, not a blob. Use format \"patch\" or \"stat\" with path, or omit path."
.to_string(),
);
}
Some("metadata") => {
args.push("--format=medium".to_string());
args.push("--no-patch".to_string());
}
Some("stat") => {
args.push("--stat".to_string());
}
Some("patch") | None => {
if input.format.is_none() && input.stat.unwrap_or(false) {
args.push("--stat".to_string());
}
}
Some(other) => {
return Err(format!(
"unknown GitShow format: \"{other}\". Supported values: \"patch\" (default), \"stat\", \"metadata\"."
));
}
}
if let Some(ref path) = input.path {
args.push(format!("{}:{}", input.commit, path));
} else {
@ -2964,6 +3094,9 @@ struct GitShowInput {
#[serde(default)]
/// If true, show diffstat summary instead of full diff.
stat: Option<bool>,
#[serde(default)]
/// Output format: "patch" (default) shows the full diff, "stat" shows a diffstat summary, and "metadata" shows commit info without the diff. When set, takes priority over `stat`.
format: Option<String>,
}
/// Input for the GitBlame tool: shows per-line author/revision info for a file.
@ -4165,7 +4298,7 @@ fn allowed_tools_for_subagent(subagent_type: &str) -> BTreeSet<String> {
"PowerShell",
],
};
tools.into_iter().map(str::to_string).collect()
tools.into_iter().map(canonical_allowed_tool_name).collect()
}
fn agent_permission_policy() -> PermissionPolicy {
@ -5193,7 +5326,10 @@ impl SubagentToolExecutor {
impl ToolExecutor for SubagentToolExecutor {
fn execute(&mut self, tool_name: &str, input: &str) -> Result<String, ToolError> {
if !self.allowed_tools.contains(tool_name) {
if !self
.allowed_tools
.contains(&canonical_allowed_tool_name(tool_name))
{
return Err(ToolError::new(format!(
"tool `{tool_name}` is not enabled for this sub-agent"
)));
@ -5208,7 +5344,10 @@ impl ToolExecutor for SubagentToolExecutor {
fn tool_specs_for_allowed_tools(allowed_tools: Option<&BTreeSet<String>>) -> Vec<ToolSpec> {
mvp_tool_specs()
.into_iter()
.filter(|spec| allowed_tools.is_none_or(|allowed| allowed.contains(spec.name)))
.filter(|spec| {
allowed_tools
.is_none_or(|allowed| allowed.contains(&canonical_allowed_tool_name(spec.name)))
})
.collect()
}
@ -6779,6 +6918,87 @@ mod tests {
assert!(names.contains(&"WorkerSendPrompt"));
}
#[test]
fn git_show_schema_exposes_format_enum() {
let spec = mvp_tool_specs()
.into_iter()
.find(|spec| spec.name == "GitShow")
.expect("GitShow spec");
assert_eq!(
spec.input_schema["properties"]["format"]["enum"],
json!(["patch", "stat", "metadata"])
);
}
#[test]
fn git_show_supports_patch_stat_metadata_and_rejects_metadata_path() {
let _guard = env_guard();
let root = temp_path("git-show-format");
init_git_repo(&root);
commit_file(&root, "README.md", "initial\nupdated\n", "update readme");
let previous = std::env::current_dir().expect("cwd");
std::env::set_current_dir(&root).expect("set cwd");
let patch = execute_tool("GitShow", &json!({"commit": "HEAD", "format": "patch"}))
.expect("patch git show");
let patch: serde_json::Value = serde_json::from_str(&patch).expect("patch json");
assert!(patch["output"]
.as_str()
.expect("patch output")
.contains("diff --git"));
let stat = execute_tool("GitShow", &json!({"commit": "HEAD", "format": "stat"}))
.expect("stat git show");
let stat: serde_json::Value = serde_json::from_str(&stat).expect("stat json");
assert!(stat["output"]
.as_str()
.expect("stat output")
.contains("README.md"));
let legacy_stat = execute_tool("GitShow", &json!({"commit": "HEAD", "stat": true}))
.expect("legacy stat git show");
let legacy_stat: serde_json::Value =
serde_json::from_str(&legacy_stat).expect("legacy stat json");
assert!(legacy_stat["output"]
.as_str()
.expect("legacy stat output")
.contains("README.md"));
let metadata = execute_tool("GitShow", &json!({"commit": "HEAD", "format": "metadata"}))
.expect("metadata git show");
let metadata: serde_json::Value = serde_json::from_str(&metadata).expect("metadata json");
let metadata_output = metadata["output"].as_str().expect("metadata output");
assert!(metadata_output.contains("commit "));
assert!(metadata_output.contains("update readme"));
assert!(!metadata_output.contains("diff --git"));
let file_patch = execute_tool(
"GitShow",
&json!({"commit": "HEAD", "path": "README.md", "format": "patch"}),
)
.expect("file patch git show");
let file_patch: serde_json::Value =
serde_json::from_str(&file_patch).expect("file patch json");
assert_eq!(
file_patch["output"].as_str().expect("file patch output"),
"initial\nupdated"
);
let metadata_path = execute_tool(
"GitShow",
&json!({"commit": "HEAD", "path": "README.md", "format": "metadata"}),
)
.expect_err("metadata with path should be rejected");
assert!(metadata_path.contains("cannot be combined with path"));
let invalid = execute_tool("GitShow", &json!({"commit": "HEAD", "format": "bogus"}))
.expect_err("invalid format should be rejected");
assert!(invalid.contains("unknown GitShow format"));
std::env::set_current_dir(&previous).expect("restore cwd");
let _ = fs::remove_dir_all(root);
}
#[test]
fn rejects_unknown_tool_names() {
let error = execute_tool("nope", &json!({})).expect_err("tool should be rejected");
@ -7477,6 +7697,29 @@ mod tests {
}
}
#[test]
fn allowed_tools_normalize_to_canonical_snake_case_and_aliases_432() {
let registry = GlobalToolRegistry::builtin();
let allowed = registry
.normalize_allowed_tools(&["Read,WebFetch,MCP".to_string()])
.expect("aliases and legacy names should normalize")
.expect("allow-list should be populated");
assert!(allowed.contains("read_file"));
assert!(allowed.contains("web_fetch"));
assert!(allowed.contains("mcp"));
assert!(!allowed.contains("Read"));
assert!(!allowed.contains("WebFetch"));
let canonical = registry.canonical_allowed_tool_names();
assert!(canonical.contains(&"web_fetch".to_string()));
assert!(canonical.contains(&"todo_write".to_string()));
assert!(!canonical.contains(&"WebFetch".to_string()));
assert_eq!(
registry.allowed_tool_aliases().get("WebFetch"),
Some(&"web_fetch".to_string())
);
}
#[test]
fn runtime_tools_extend_registry_definitions_permissions_and_search() {
let registry = GlobalToolRegistry::builtin()
@ -8458,7 +8701,7 @@ mod tests {
.expect("spawn job should be captured");
assert_eq!(captured_job.prompt, "Check tests and outstanding work.");
assert!(captured_job.allowed_tools.contains("read_file"));
assert!(!captured_job.allowed_tools.contains("Agent"));
assert!(!captured_job.allowed_tools.contains("agent"));
let normalized = execute_tool(
"Agent",
@ -9058,7 +9301,7 @@ mod tests {
let general = allowed_tools_for_subagent("general-purpose");
assert!(general.contains("bash"));
assert!(general.contains("write_file"));
assert!(!general.contains("Agent"));
assert!(!general.contains("agent"));
let explore = allowed_tools_for_subagent("Explore");
assert!(explore.contains("read_file"));
@ -9066,13 +9309,13 @@ mod tests {
assert!(!explore.contains("bash"));
let plan = allowed_tools_for_subagent("Plan");
assert!(plan.contains("TodoWrite"));
assert!(plan.contains("StructuredOutput"));
assert!(!plan.contains("Agent"));
assert!(plan.contains("todo_write"));
assert!(plan.contains("structured_output"));
assert!(!plan.contains("agent"));
let verification = allowed_tools_for_subagent("Verification");
assert!(verification.contains("bash"));
assert!(verification.contains("PowerShell"));
assert!(verification.contains("power_shell"));
assert!(!verification.contains("write_file"));
}
@ -10156,6 +10399,26 @@ printf 'pwsh:%s' "$1"
);
}
#[test]
fn given_workspace_write_enforcer_when_web_tools_then_denied() {
let registry = workspace_write_registry();
for (tool, input) in [
(
"WebFetch",
json!({"url":"https://example.com", "prompt":"summarize"}),
),
("WebSearch", json!({"query":"rust language"})),
] {
let err = registry
.execute(tool, &input)
.expect_err("network tools should require explicit full access");
assert!(
err.contains("requires 'danger-full-access'"),
"{tool} should require elevated mode: {err}"
);
}
}
#[test]
fn given_workspace_write_enforcer_when_bash_uses_shell_expansion_then_denied() {
let registry = workspace_write_registry();

145
scripts/dogfood-probe.py Normal file
View File

@ -0,0 +1,145 @@
#!/usr/bin/env python3
from __future__ import annotations
import argparse
import json
import subprocess
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Sequence
@dataclass(frozen=True)
class ProbeResult:
kind: str
argv: list[str]
returncode: int | None
stdout: bytes
stderr: bytes
message: str | None = None
@property
def stdout_text(self) -> str:
return self.stdout.decode('utf-8', errors='replace')
@property
def stderr_text(self) -> str:
return self.stderr.decode('utf-8', errors='replace')
def to_json_dict(self) -> dict[str, object]:
return {
'kind': self.kind,
'argv': self.argv,
'returncode': self.returncode,
'stdout': self.stdout_text,
'stderr': self.stderr_text,
'message': self.message,
}
def run_probe(argv: Sequence[str], *, timeout: float = 10.0, require_stdout_json_byte0: bool = False) -> ProbeResult:
explicit_argv = [str(arg) for arg in argv]
if not explicit_argv:
return ProbeResult(
kind='probe_error',
argv=[],
returncode=None,
stdout=b'',
stderr=b'',
message='argv must contain at least the executable path',
)
try:
completed = subprocess.run(
explicit_argv,
capture_output=True,
check=False,
timeout=timeout,
)
except subprocess.TimeoutExpired as exc:
return ProbeResult(
kind='timeout',
argv=explicit_argv,
returncode=None,
stdout=exc.stdout or b'',
stderr=exc.stderr or b'',
message=f'probe timed out after {timeout:g}s',
)
except (OSError, ValueError) as exc:
return ProbeResult(
kind='probe_error',
argv=explicit_argv,
returncode=None,
stdout=b'',
stderr=b'',
message=str(exc),
)
if require_stdout_json_byte0:
if not completed.stdout:
return ProbeResult(
kind='product_error',
argv=explicit_argv,
returncode=completed.returncode,
stdout=completed.stdout,
stderr=completed.stderr,
message='stdout is empty; expected JSON at byte 0',
)
if completed.stdout[:1] not in (b'{', b'['):
return ProbeResult(
kind='product_error',
argv=explicit_argv,
returncode=completed.returncode,
stdout=completed.stdout,
stderr=completed.stderr,
message='stdout JSON does not start at byte 0',
)
try:
json.loads(completed.stdout.decode('utf-8'))
except (UnicodeDecodeError, json.JSONDecodeError) as exc:
return ProbeResult(
kind='product_error',
argv=explicit_argv,
returncode=completed.returncode,
stdout=completed.stdout,
stderr=completed.stderr,
message=f'stdout is not parseable JSON: {exc}',
)
if completed.returncode != 0:
return ProbeResult(
kind='product_error',
argv=explicit_argv,
returncode=completed.returncode,
stdout=completed.stdout,
stderr=completed.stderr,
message=f'process exited with code {completed.returncode}',
)
return ProbeResult(
kind='ok',
argv=explicit_argv,
returncode=completed.returncode,
stdout=completed.stdout,
stderr=completed.stderr,
)
def main(argv: Sequence[str] | None = None) -> int:
parser = argparse.ArgumentParser(description='Run an argv-safe dogfood probe and emit separated channels as JSON.')
parser.add_argument('--timeout', type=float, default=10.0)
parser.add_argument('--stdout-json-byte0', action='store_true', help='Require stdout to be parseable JSON starting at byte 0.')
parser.add_argument('command', nargs=argparse.REMAINDER, help='Executable and arguments to run. Use -- before the target argv.')
args = parser.parse_args(argv)
command = args.command
if command and command[0] == '--':
command = command[1:]
result = run_probe(command, timeout=args.timeout, require_stdout_json_byte0=args.stdout_json_byte0)
print(json.dumps(result.to_json_dict(), sort_keys=True))
return 0 if result.kind == 'ok' else 1
if __name__ == '__main__':
raise SystemExit(main())

0
tests/__init__.py Normal file
View File

View File

@ -9,6 +9,9 @@ from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parents[1]
NEXT_ID = REPO_ROOT / 'scripts' / 'roadmap-next-id.sh'
DOGFOOD_PROBE = REPO_ROOT / 'scripts' / 'dogfood-probe.py'
def run_next_id(roadmap: Path, script: Path = NEXT_ID) -> subprocess.CompletedProcess[str]:
@ -21,6 +24,16 @@ def run_next_id(roadmap: Path, script: Path = NEXT_ID) -> subprocess.CompletedPr
)
def run_dogfood_probe(args: list[str]) -> subprocess.CompletedProcess[str]:
return subprocess.run(
['python3', str(DOGFOOD_PROBE), *args],
cwd=REPO_ROOT,
capture_output=True,
text=True,
check=False,
)
class RoadmapHelperTests(unittest.TestCase):
def test_roadmap_next_id_prints_only_next_id_after_duplicate_check(self) -> None:
with tempfile.TemporaryDirectory() as temp_dir:
@ -46,6 +59,17 @@ class RoadmapHelperTests(unittest.TestCase):
self.assertIn('999', result.stderr)
self.assertNotIn('1000', result.stdout)
def test_roadmap_next_id_fails_when_explicit_roadmap_path_is_missing(self) -> None:
with tempfile.TemporaryDirectory() as temp_dir:
roadmap = Path(temp_dir) / 'missing-ROADMAP.md'
result = run_next_id(roadmap)
self.assertNotEqual(0, result.returncode)
self.assertEqual('', result.stdout)
self.assertIn('ROADMAP not found', result.stderr)
self.assertIn(str(roadmap), result.stderr)
def test_roadmap_next_id_fails_closed_when_checker_is_unavailable(self) -> None:
with tempfile.TemporaryDirectory() as temp_dir:
script_dir = Path(temp_dir) / 'scripts'
@ -62,6 +86,78 @@ class RoadmapHelperTests(unittest.TestCase):
self.assertIn('required ROADMAP id checker not found or not readable', result.stderr)
self.assertIn('refusing to print a next id', result.stderr)
def test_dogfood_probe_runs_explicit_argv_and_separates_channels(self) -> None:
with tempfile.TemporaryDirectory() as temp_dir:
fixture = Path(temp_dir) / 'fixture.py'
fixture.write_text(
'from __future__ import annotations\n'
'import json\n'
'import sys\n'
'print(json.dumps({"argv": sys.argv[1:]}))\n'
'print("diagnostic", file=sys.stderr)\n'
)
result = run_dogfood_probe([
'--stdout-json-byte0',
'--',
'python3',
str(fixture),
'--output-format',
'json',
'doctor',
'--help',
])
self.assertEqual(0, result.returncode)
payload = __import__('json').loads(result.stdout)
self.assertEqual('ok', payload['kind'])
self.assertEqual([
'python3',
str(fixture),
'--output-format',
'json',
'doctor',
'--help',
], payload['argv'])
self.assertEqual(0, payload['returncode'])
self.assertEqual('{"argv": ["--output-format", "json", "doctor", "--help"]}\n', payload['stdout'])
self.assertEqual('diagnostic\n', payload['stderr'])
def test_dogfood_probe_labels_timeout_separately_from_product_error(self) -> None:
with tempfile.TemporaryDirectory() as temp_dir:
fixture = Path(temp_dir) / 'sleep.py'
fixture.write_text('import time\ntime.sleep(2)\n')
result = run_dogfood_probe(['--timeout', '0.1', '--', 'python3', str(fixture)])
self.assertEqual(1, result.returncode)
payload = __import__('json').loads(result.stdout)
self.assertEqual('timeout', payload['kind'])
self.assertIsNone(payload['returncode'])
self.assertIn('timed out', payload['message'])
def test_dogfood_probe_labels_probe_construction_failure(self) -> None:
result = run_dogfood_probe([])
self.assertEqual(1, result.returncode)
payload = __import__('json').loads(result.stdout)
self.assertEqual('probe_error', payload['kind'])
self.assertEqual([], payload['argv'])
self.assertIsNone(payload['returncode'])
self.assertIn('argv must contain', payload['message'])
def test_dogfood_probe_labels_stdout_json_prefix_failure_as_product_error(self) -> None:
with tempfile.TemporaryDirectory() as temp_dir:
fixture = Path(temp_dir) / 'prefixed.py'
fixture.write_text('print("warning before json")\nprint("{}")\n')
result = run_dogfood_probe(['--stdout-json-byte0', '--', 'python3', str(fixture)])
self.assertEqual(1, result.returncode)
payload = __import__('json').loads(result.stdout)
self.assertEqual('product_error', payload['kind'])
self.assertEqual(0, payload['returncode'])
self.assertIn('byte 0', payload['message'])
if __name__ == '__main__':
unittest.main()