Commit Graph

145 Commits

Author SHA1 Message Date
bellman be8112f5f5 feat: add native Ollama provider support via OLLAMA_HOST env var
- OLLAMA_HOST takes priority over OPENAI_BASE_URL for local Ollama instances
- No API key required; placeholder token used for Authorization header
- Model names like 'qwen3:8b' bypass strict provider/model syntax validation
- detect_provider_kind() checks OLLAMA_HOST first in routing cascade
- ProviderClient dispatch uses from_ollama_env() when OLLAMA_HOST is set
- Updated USAGE.md and docs with OLLAMA_HOST as preferred env var
- Added OLLAMA_CONFIG constant and from_ollama_env() to openai_compat
- Added test_ollama_host_bypasses_model_validation unit test
- Supersedes PR #3213 (which had a duplicate if-let bug in mod.rs)
2026-06-05 12:12:56 +09:00
Bellman b90f18f75a
Merge pull request #3214 from TheArchitectit/worktree-api-timeout-retry-v2
feat: API timeout config, Retry-After header, configurable retry, and 400 transient retry
2026-06-05 10:33:35 +09:00
bellman d4aad7103e fix: add actionable auth hint to 401/403 API errors (#28)
401 and 403 errors now include a hint explaining which env vars to
check for each provider (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
and suggesting claw doctor for credential verification.

Generated with https://github.com/Yeachan-Heo/gajae-code
Co-authored-by: Gajae Code <dev@gajae-code.com>
2026-06-05 10:10:24 +09:00
TheArchitectit 9e50cb6e20 Merge remote-tracking branch 'upstream/main' into worktree-api-timeout-retry-v2
# Conflicts:
#	rust/crates/runtime/src/config.rs
#	rust/crates/runtime/src/lib.rs
2026-06-04 09:17:43 -05:00
TheArchitectit 76783377ec fix: address CI failures and reviewer feedback on #3214
- Add missing retry_after: None field to ApiError::Api construction
  in main.rs test. This field was introduced by the Retry-After
  header support but was not added to the test's error initializer,
  causing a compile error under CI's strict mode.

- Remove duplicate #[must_use] attribute on retry_after() method
  in error.rs (lines 134+138 both had it; kept the outer one
  above the doc comment per convention).

- Cargo fmt --all run.

- Reviewer question "Are defaults preserved?" — answered yes:
  ApiTimeoutConfig defaults to 30s connect / 300s request / 8 retries.
  with_retry_policy() is opt-in. No behavior change without explicit
  configuration.
2026-06-03 13:19:25 -05:00
bellman fa35018769 fix: validate env model selection 2026-06-04 00:30:13 +09:00
bellman bcc5bfde9c fix: route local OpenAI-compatible models 2026-06-03 23:16:46 +09:00
bellman c91a3062d5 fix: normalize Anthropic model routing 2026-06-03 22:20:23 +09:00
bellman 54d785d0c0 fix: preserve DeepSeek V4 thinking history 2026-06-03 21:53:54 +09:00
TheArchitectit 04bc5f5788 feat: API timeout config, Retry-After header, configurable retry, and 400 transient retry
Cherry-picked from PR #2816 onto current upstream/main, resolving
conflicts from PR #3015's merge (which added retry_after to ApiError
but some construction sites were missing it).

Commits preserved:
- ade85398: API timeout config, Retry-After header, configurable retry
  - TimeoutConfig in HTTP client builder (connect 30s, request 5min)
  - CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
  - Retry-After header parsing on 429 responses
  - ApiTimeoutConfig in runtime config (settings.json)
- 8a883430: retry 400 responses with transient gateway error bodies
  - Detects known gateway phrases in 400 response bodies
  - Marks them as retryable instead of hard-failing
- ed91a61e: add 'no parseable body' to CONTEXT_WINDOW_ERROR_MARKERS
  - Some providers return 400 with 'no parseable body' for oversized
    requests instead of a proper context_length_exceeded error

Commits skipped (already in upstream via PR #3015):
- 453ab642: optional id field (already merged)
- baa8d1ba: HTML detection in streaming (already merged)
- 33d2f789: JSON error detection in streaming (already merged)

8 files changed, 299 insertions, 80 deletions
2026-06-02 15:35:29 -05:00
TheArchitectit 571d3cdc0f fix: add "no parseable body" to CONTEXT_WINDOW_ERROR_MARKERS
Some OpenAI-compat backends (e.g. glm-5.1-fast) return 400 with
"no parseable body" when the request payload is too large to parse,
rather than a proper context_length_exceeded error. Without this marker,
is_context_window_error() returns false and the auto-compact retry
loop never triggers — the user just sees an opaque 400 error.

💘 Generated with Crush

Assisted-by: GLM 5.1 FP8 via Crush <crush@charm.land>
2026-06-02 15:31:04 -05:00
TheArchitectit 414a1aca4f fix: retry 400 responses with transient gateway error bodies
Some providers/proxies return HTTP 400 with bodies like "no parseable
body" or "connection reset" during transient network blips. These are
not real bad requests — they're gateway errors wearing a 400 mask.
Detect known gateway error phrases in 400 response bodies and mark
them as retryable so the existing exponential backoff handles them.
2026-06-02 15:30:41 -05:00
TheArchitectit d8c57ed317 feat: API timeout config, Retry-After header support, and configurable retry
- Add TimeoutConfig to HTTP client builder with connect_timeout (30s)
  and request_timeout (5min) defaults, configurable via
  CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
- Add with_timeout() builder to both AnthropicClient and
  OpenAiCompatClient for per-client timeout configuration
- Parse Retry-After header on 429 responses and use it to override
  exponential backoff delay when present
- Add ApiTimeoutConfig to runtime config with apiTimeout settings
  in ~/.claw/settings.json (connectTimeout, requestTimeout, maxRetries)
- Add retry_after field to ApiError::Api for propagating rate limit
  backoff hints through the retry pipeline
2026-06-02 15:30:22 -05:00
YeonGyu-Kim c70312bd04 fix(#754): missing_credentials hint now newline-delimited so JSON hint field is non-null 2026-05-26 21:23:03 +09:00
YeonGyu-Kim 63a5a87471 fix(#696): exit with typed error when stdin is not a TTY and no prompt piped; fix anthropic/ prefix detection in metadata_for_model 2026-05-25 13:16:12 +09:00
YeonGyu-Kim 78a0ff615a
Merge pull request #3014 from wangguan1995/fix_qwen
Add Qwen model token limits for DashScope compatibility
2026-05-25 12:58:59 +09:00
Yeachan-Heo fdbc789694 fix(api): skip preflight for unknown model limits 2026-05-25 12:49:36 +09:00
Yeachan-Heo 779cf1c234 test(api): fill thinking in stream chunk fixtures 2026-05-25 12:49:36 +09:00
YeonGyu-Kim 495e7a015c fix: remove stale retry_after field, Team variant, config_load_error_kind, denied_tools initializer errors
- Remove retry_after: None from ApiError::Api structs in openai_compat.rs (field was removed)
- Remove SlashCommand::Team parse arm (variant was removed from enum)
- Add config_load_error_kind: None to doctor path StatusContext initializer
- Add Thinking arm to all ContentBlock match blocks in trident.rs
- Remove cargo fmt drift across commands, config, compact, tools, trident
2026-05-25 12:01:09 +09:00
YeonGyu-Kim 3364dc4bee chore: fix conflict markers and cargo fmt drift in main (commands, openai_compat, trident, config, tools) 2026-05-25 11:51:44 +09:00
TheArchitectit 7149bbc3d9
fix: streaming robustness — OpenAI parsing, error detection, reasoning content
Improves SSE parsing with raw JSON error detection, HTML response detection (for misconfigured endpoints), thinking/reasoning content from provider-specific delta fields, #[serde(default)] on streaming types for lenient deserialization, compact session boundary guard, and /team slash command. Adds install.sh convenience script.
2026-05-25 11:22:47 +09:00
Ajinkya Kardile b071fac2cf
feat: add native Gemini support to openai_compat provider
Adds early return in wire_model_for_base_url for Gemini/Gemma/XAI/Kimi/Grok model prefixes to ensure the provider prefix is preserved correctly when routing through the OpenAI-compatible provider path.
2026-05-25 11:21:37 +09:00
Luke 739488f613
fix: return conservative token limits for unspecified models
Changes the catch-all arm in model_token_limit() from None to conservative defaults (max_output_tokens: 16_384, context_window_tokens: 131_072) to prevent crashes when an unknown model is used.
2026-05-25 11:21:22 +09:00
Luke a61d023583
fix: unify user_agent to 'clawd-rust-tools/0.1'
Sets user_agent on both build_http_client_or_default() and build_http_client_with() to 'clawd-rust-tools/0.1' for consistent HTTP client identification.
2026-05-25 11:21:13 +09:00
bellman 04c2abb412 Stabilize final gate before release checkpoint
Resolve the G012 evidence gate by fixing permission-mode regressions, platform-sensitive tests, and the clippy surface that blocked an all-targets verification run.

Constraint: G012 final gate required docs, board, full workspace tests, and clippy -D warnings evidence before checkpointing.

Rejected: documenting the worker-2 gate failure as an accepted gap | the failing tests and lints were locally reproducible and fixable.

Confidence: high

Scope-risk: moderate

Directive: Preserve read-only permission requirements for read/glob/grep tools; write/edit remain workspace-write or danger-full-access when outside the workspace.

Tested: python3 .github/scripts/check_doc_source_of_truth.py; python3 .github/scripts/check_release_readiness.py; python3 scripts/validate_cc2_board.py --board .omx/cc2/board.json; python3 .omx/cc2/validate_issue_parity_intake.py .omx/cc2/issue-parity-intake.json; cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo check --manifest-path rust/Cargo.toml --workspace; cargo test --manifest-path rust/Cargo.toml --workspace -- --nocapture; cargo clippy --manifest-path rust/Cargo.toml --workspace --all-targets -- -D warnings

Not-tested: live network provider smoke tests and remote PR/issue mutations.
2026-05-15 13:34:57 +09:00
bellman 8c9a05e71b Restore provider compatibility diagnostics as API types
Keep the G008 capability and diagnostic helpers compile-ready by restoring the public report/support/severity types that team integrations referenced after merge reconciliation.

Constraint: Final G008 verification failed on missing provider capability and diagnostic type definitions.
Confidence: high
Scope-risk: narrow
Directive: Keep provider diagnostics exported as typed API surfaces; do not replace them with ad-hoc JSON-only status fields.
Tested: cargo fmt --manifest-path rust/Cargo.toml --all -- --check; git diff --check; cargo test --manifest-path rust/Cargo.toml -p api providers:: -- --nocapture --test-threads=1; cargo test --manifest-path rust/Cargo.toml -p api --test openai_compat_integration -- --nocapture --test-threads=1
Not-tested: full workspace clippy; known unrelated runtime policy_engine struct_excessive_bools remains outside G008.

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 10:37:20 +09:00
bellman ea95bf2576 omx(team): auto-checkpoint worker-3 [unknown] 2026-05-15 10:30:16 +09:00
bellman dec8efa5c8 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:09 +09:00
bellman ce02ace3a2 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:06 +09:00
bellman bc32639ce3 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:03 +09:00
bellman a212c662e5 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:00 +09:00
bellman 2cac66cd38 Stabilize provider compatibility integration verification
Keep integrated G008 provider changes formatted and compile-ready so worker follow-up commits can merge against a clean leader baseline.

Constraint: G008 provider verification must pass before ultragoal checkpointing.
Confidence: high
Scope-risk: narrow
Directive: Keep provider compatibility follow-ups rebased on this formatted baseline before retrying failed cherry-picks.
Tested: cargo test --manifest-path rust/Cargo.toml -p api providers:: -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p api --test openai_compat_integration -- --nocapture --test-threads=1
Not-tested: full workspace clippy; known pre-existing runtime policy_engine LaneContext clippy warning remains outside this change.

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 10:28:50 +09:00
bellman 685f078204 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:23:37 +09:00
bellman 82ec223ed4 omx(team): auto-checkpoint worker-2 [unknown] 2026-05-15 10:21:55 +09:00
bellman a6ca5c489b omx(team): auto-checkpoint worker-4 [unknown] 2026-05-15 10:21:28 +09:00
bellman 3ff8743e79 omx(team): auto-checkpoint worker-2 [unknown] 2026-05-15 10:21:23 +09:00
bellman 29029bfc14 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:21:18 +09:00
wangguan1995 8cada12c48 Add Qwen model token limits for DashScope compatibility 2026-05-10 13:09:07 +00:00
YeonGyu-Kim 75c08bc982 fix: REPL display, /compact panic, identity leak, DeepSeek reasoning, thinking blocks
Five interrelated fixes from parallel Hephaestus sessions:

1. fix(repl): display assistant text after spinner (#2981, #2982, #2937)
   - Added final_assistant_text() call after run_turn spinner completes
   - REPL now shows response text like run_prompt_json does

2. fix(compact): handle Thinking content blocks (#2985)
   - Added ContentBlock::Thinking variant throughout compact summarizer
   - Prevents panic when /compact encounters thinking blocks

3. fix(prompt): provider-aware model identity (#2822)
   - New ModelFamilyIdentity enum (Claude vs Generic)
   - Non-Anthropic models no longer say 'I am Claude'
   - model_family_identity_for() detects provider and sets identity

4. fix(openai): preserve DeepSeek reasoning_content (#2821)
   - Stream parser now captures reasoning_content from OpenAI-compat
   - Emits ThinkingDelta/SignatureDelta events for reasoning models
   - Thinking blocks included in conversation history for re-send

5. feat(runtime): Thinking block support across codebase
   - AssistantEvent::Thinking variant in conversation.rs
   - ContentBlock::Thinking in session serialization
   - Thinking-aware compact summarization
   - Tests for thinking block ordering and content

Closes #2981, #2982, #2937, #2985, #2822, #2821
2026-05-06 15:32:34 +09:00
Andreas Haida 9a512633a5 Cap OpenAI default output tokens using model metadata 2026-05-03 22:16:12 +02:00
Andreas Haida 6ac13ffdad Handle OpenAI token-limit errors as context-window failures 2026-05-03 22:16:12 +02:00
Yeachan-Heo 74ea754d29 Restore Rust formatting compliance
Run rustfmt from the Rust workspace so CI format checks pass without changing behavior.

Constraint: Scope is formatting-only across tracked Rust files

Confidence: high

Scope-risk: narrow

Tested: cd rust && cargo fmt --check

Tested: git diff --check
2026-04-28 09:19:16 +00:00
Yeachan-Heo 00d0eb61d4 US-024: Add token limit metadata for kimi models
Add ModelTokenLimit entries for kimi-k2.5 and kimi-k1.5 to enable
preflight context window validation. Per Moonshot AI documentation:
- Context window: 256,000 tokens
- Max output: 16,384 tokens

Includes 3 unit tests:
- returns_context_window_metadata_for_kimi_models
- kimi_alias_resolves_to_kimi_k25_token_limits
- preflight_blocks_oversized_requests_for_kimi_models

All tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 04:15:38 +00:00
Yeachan-Heo d037f9faa8 Fix strip_routing_prefix to handle kimi provider prefix (US-023)
Add "kimi" to the strip_routing_prefix matches so that models like
"kimi/kimi-k2.5" have their prefix stripped before sending to the
DashScope API (consistent with qwen/openai/xai/grok handling).

Also add unit test strip_routing_prefix_strips_kimi_provider_prefix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:50:15 +00:00
Yeachan-Heo cec8d17ca8 Implement US-023: Add automatic routing for kimi models to DashScope
Changes in rust/crates/api/src/providers/mod.rs:
- Add 'kimi' alias to MODEL_REGISTRY resolving to 'kimi-k2.5' with DashScope config
- Add kimi/kimi- prefix routing to DashScope endpoint in metadata_for_model()
- Add resolve_model_alias() handling for kimi -> kimi-k2.5
- Add unit tests: kimi_prefix_routes_to_dashscope, kimi_alias_resolves_to_kimi_k2_5

Users can now use:
- --model kimi (resolves to kimi-k2.5)
- --model kimi-k2.5 (auto-routes to DashScope)
- --model kimi/kimi-k2.5 (explicit provider prefix)

All 127 tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:44:21 +00:00
Yeachan-Heo 4cb1db9faa Implement US-022: Enhanced error context for API failures
Add structured error context to API failures:
- Request ID tracking across retries with full context in error messages
- Provider-specific error code mapping with actionable suggestions
- Suggested user actions for common error types (401, 403, 413, 429, 500, 502-504)
- Added suggested_action field to ApiError::Api variant
- Updated enrich_bearer_auth_error to preserve suggested_action

Files changed:
- rust/crates/api/src/error.rs: Add suggested_action field, update Display
- rust/crates/api/src/providers/openai_compat.rs: Add suggested_action_for_status()
- rust/crates/api/src/providers/anthropic.rs: Update error handling

All tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:15:00 +00:00
Yeachan-Heo 5e65b33042 US-021: Add request body size pre-flight check for OpenAI-compatible provider 2026-04-16 17:41:57 +00:00
Yeachan-Heo 87b982ece5 US-011: Performance optimization for API request serialization
Added criterion benchmarks and optimized flatten_tool_result_content:
- Added criterion dev-dependency and request_building benchmark suite
- Optimized flatten_tool_result_content to pre-allocate capacity and avoid
  intermediate Vec construction (was collecting to Vec then joining)
- Made key functions public for benchmarking: translate_message,
  build_chat_completion_request, flatten_tool_result_content,
  is_reasoning_model, model_rejects_is_error_field

Benchmark results:
- flatten_tool_result_content/single_text: ~17ns
- translate_message/text_only: ~200ns
- build_chat_completion_request/10 messages: ~16.4µs
- is_reasoning_model detection: ~26-42ns

All 119 unit tests and 29 integration tests pass.
cargo clippy passes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:11:45 +00:00
Yeachan-Heo 3e4e1585b5 US-009: Add comprehensive unit tests for kimi model compatibility fix
Added 4 unit tests to verify is_error field handling for kimi models:
- model_rejects_is_error_field_detects_kimi_models: Detects kimi-k2.5, kimi-k1.5, dashscope/kimi-k2.5 (case insensitive)
- translate_message_includes_is_error_for_non_kimi_models: Verifies gpt-4o, grok-3, claude include is_error
- translate_message_excludes_is_error_for_kimi_models: Verifies kimi models exclude is_error (prevents 400 Bad Request)
- build_chat_completion_request_kimi_vs_non_kimi_tool_results: Full integration test for request building

All 119 unit tests and 29 integration tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:54:48 +00:00
Yeachan-Heo 124e8661ed Remove the deprecated Claude subscription login path and restore a green Rust workspace
ROADMAP #37 was still open even though several earlier backlog items were
already closed. This change removes the local login/logout surface, stops
startup auth resolution from treating saved OAuth credentials as a supported
path, and updates diagnostics/help to point users at ANTHROPIC_API_KEY or
ANTHROPIC_AUTH_TOKEN only.

While proving the change with the user-requested workspace gates, clippy
surfaced additional pre-existing warning failures across the Rust workspace.
Those were cleaned up in-place so the required `cargo fmt`, `cargo clippy
--workspace --all-targets -- -D warnings`, and `cargo test --workspace`
sequence now passes end to end.

Constraint: User explicitly required full-workspace fmt/clippy/test before commit/push
Constraint: Existing dirty leader worktree had to be stashed before attempted OMX team worktree launch
Rejected: Keep login/logout but hide them from help | left unsupported auth flow and saved OAuth fallback intact
Rejected: Stop after ROADMAP #37 targeted tests | did not satisfy required full-workspace verification gate
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Do not reintroduce saved OAuth as a silent Anthropic startup fallback without an explicit supported auth policy
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Remote push effects beyond origin/main update
2026-04-11 17:24:44 +00:00