Cherry-picked from PR #2816 onto current upstream/main, resolving
conflicts from PR #3015's merge (which added retry_after to ApiError
but some construction sites were missing it).
Commits preserved:
- ade85398: API timeout config, Retry-After header, configurable retry
- TimeoutConfig in HTTP client builder (connect 30s, request 5min)
- CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
- Retry-After header parsing on 429 responses
- ApiTimeoutConfig in runtime config (settings.json)
- 8a883430: retry 400 responses with transient gateway error bodies
- Detects known gateway phrases in 400 response bodies
- Marks them as retryable instead of hard-failing
- ed91a61e: add 'no parseable body' to CONTEXT_WINDOW_ERROR_MARKERS
- Some providers return 400 with 'no parseable body' for oversized
requests instead of a proper context_length_exceeded error
Commits skipped (already in upstream via PR #3015):
- 453ab642: optional id field (already merged)
- baa8d1ba: HTML detection in streaming (already merged)
- 33d2f789: JSON error detection in streaming (already merged)
8 files changed, 299 insertions, 80 deletions
- Add TimeoutConfig to HTTP client builder with connect_timeout (30s)
and request_timeout (5min) defaults, configurable via
CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
- Add with_timeout() builder to both AnthropicClient and
OpenAiCompatClient for per-client timeout configuration
- Parse Retry-After header on 429 responses and use it to override
exponential backoff delay when present
- Add ApiTimeoutConfig to runtime config with apiTimeout settings
in ~/.claw/settings.json (connectTimeout, requestTimeout, maxRetries)
- Add retry_after field to ApiError::Api for propagating rate limit
backoff hints through the retry pipeline
Add SUPPRESS_CONFIG_WARNINGS_STDERR AtomicBool flag in runtime/config.rs
and expose suppress_config_warnings_for_json_mode() via runtime crate.
In main.rs, scan raw argv for --output-format json before parse_args
and activate the flag so no settings-load warnings reach stderr on any
JSON-mode surface (status, sandbox, system-prompt, mcp list, skills list,
agents list, --resume /config*, etc.).
Text-mode surfaces are unaffected; prose deprecation warnings continue
to appear on stderr.
All 572+ tests pass (one pre-existing worker_boot failure unrelated).
JSON config output already carries collected config diagnostics in warnings[], so prose stderr emission must be reserved for text/local paths. Lazy permission-mode default resolution prevents an earlier config load from leaking the same deprecation before the JSON renderer runs.\n\nConstraint: ROADMAP #815 requires text mode to keep human stderr warnings while JSON config/list suppresses duplicate app-level config prose.\nRejected: Filtering all stderr in JSON mode | would hide cargo/compiler or unrelated diagnostics outside the app config warning path.\nConfidence: high\nScope-risk: narrow\nDirective: Keep load_collecting_warnings side-effect-free; use load() for human stderr emission.\nTested: cargo fmt; cargo test -p rusty-claude-cli --test output_format_contract config_json_reports_deprecations_structurally_without_stderr_duplicate_815; cargo test -p rusty-claude-cli --test output_format_contract; manual target/debug/claw JSON config fixture.\nNot-tested: cargo clippy -p rusty-claude-cli --all-targets -- -D warnings is blocked by pre-existing runtime dead_code/trident warnings.
Use git rev-parse --git-dir so startup preflight follows worktree .git indirections to the real metadata directory, then check directory permission metadata without creating probe files. Add a regression that verifies both the warning kind and structured event path for a read-only external gitdir.
Constraint: ROADMAP #695 requires early startup/worktree diagnostics without destructive writes or broad sandbox redesign.
Rejected: write-probe detection | it mutates git metadata during a diagnostic path.
Confidence: high
Scope-risk: narrow
Directive: Keep startup preflight warnings non-destructive and structured by warning kind/path.
Tested: cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo test --manifest-path rust/Cargo.toml -p runtime startup_preflight -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p runtime worker_boot -- --nocapture; cargo check --manifest-path rust/Cargo.toml --workspace
Not-tested: full cargo test --manifest-path rust/Cargo.toml --workspace
Preserve session-list consumers from depending on encoded session IDs by carrying persisted creation timestamps through managed summaries and JSON detail output, with an ID timestamp fallback only for legacy metadata that lacks created_at_ms.\n\nConstraint: ROADMAP #335 requires created_at_ms in session_details with the same millisecond unit as updated_at_ms.\nRejected: Making callers parse session IDs | undocumented ID structure is brittle and was the issue being fixed.\nRejected: Session storage redesign | scope is limited to detail metadata propagation and legacy compatibility.\nConfidence: high\nScope-risk: narrow\nDirective: Keep ID timestamp parsing fallback-only; persisted session_meta.created_at_ms remains the source of truth when present.\nTested: cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo check --manifest-path rust/Cargo.toml --workspace; cargo build --manifest-path rust/Cargo.toml --workspace; cargo test --manifest-path rust/Cargo.toml --workspace; cargo clippy --manifest-path rust/Cargo.toml -p runtime -p rusty-claude-cli --all-targets\nNot-tested: live interactive /session list against a real provider-backed REPL
Adds session_store_lifecycle_regression_160 test that verifies the full
SessionStore CRUD lifecycle. Also fixes pre-existing non-exhaustive match
errors in trident.rs for the ContentBlock::Thinking variant.
- Remove retry_after: None from ApiError::Api structs in openai_compat.rs (field was removed)
- Remove SlashCommand::Team parse arm (variant was removed from enum)
- Add config_load_error_kind: None to doctor path StatusContext initializer
- Add Thinking arm to all ContentBlock match blocks in trident.rs
- Remove cargo fmt drift across commands, config, compact, tools, trident
Instead of re-nesting prior highlights under '- Previously compacted context:', flatten them directly into the top-level list with '- ' prefix. This prevents each compaction cycle from adding a nesting layer, which inflated the summary by ~depth * overhead per turn.
Adds MAX_GIT_DIFF_CHARS (50_000) limit and truncate_diff() function to prevent oversized git diffs from blowing up the system prompt. Truncation respects UTF-8 character boundaries and appends a clear truncation notice. Includes unit tests.
Resolve the G012 evidence gate by fixing permission-mode regressions, platform-sensitive tests, and the clippy surface that blocked an all-targets verification run.
Constraint: G012 final gate required docs, board, full workspace tests, and clippy -D warnings evidence before checkpointing.
Rejected: documenting the worker-2 gate failure as an accepted gap | the failing tests and lints were locally reproducible and fixable.
Confidence: high
Scope-risk: moderate
Directive: Preserve read-only permission requirements for read/glob/grep tools; write/edit remain workspace-write or danger-full-access when outside the workspace.
Tested: python3 .github/scripts/check_doc_source_of_truth.py; python3 .github/scripts/check_release_readiness.py; python3 scripts/validate_cc2_board.py --board .omx/cc2/board.json; python3 .omx/cc2/validate_issue_parity_intake.py .omx/cc2/issue-parity-intake.json; cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo check --manifest-path rust/Cargo.toml --workspace; cargo test --manifest-path rust/Cargo.toml --workspace -- --nocapture; cargo clippy --manifest-path rust/Cargo.toml --workspace --all-targets -- -D warnings
Not-tested: live network provider smoke tests and remote PR/issue mutations.
Expose MCP server requiredness through config parsing, inventory reports, config hashes, and degraded startup failure context so orchestrators can distinguish optional degradation from required startup breakage.
Constraint: G007-plugin-mcp Task 3 requires required vs optional MCP behavior and must not mutate .omx/ultragoal.
Rejected: Treating all MCP failures as equivalent | it preserves the existing opacity that prevents required-server failures from being escalated differently.
Confidence: high
Scope-risk: moderate
Directive: Preserve required=false as the backward-compatible default; keep required surfaced in JSON/text inventory and degraded failure context when extending MCP lifecycle states.
Tested: cargo test -p runtime parses_typed_mcp_and_oauth_config -- --nocapture; cargo test -p runtime manager_discovery_report_keeps_healthy_servers_when_one_server_fails -- --nocapture; cargo test -p runtime manager_records_unsupported_non_stdio_servers_without_panicking -- --nocapture; cargo test -p commands renders_mcp_reports -- --nocapture; cargo check --workspace; cargo fmt --all -- --check
Not-tested: cargo clippy -p runtime -p commands -- -D warnings is blocked by pre-existing runtime/src/policy_engine.rs LaneContext clippy::struct_excessive_bools.
Co-authored-by: OmX <omx@oh-my-codex.dev>