# Research & Design Decisions: graphiti-ollama-embedder

## Summary

- **Feature**: `graphiti-ollama-embedder`
- **Discovery Scope**: Extension (small, narrowly scoped change to an existing adapter + supporting docs)
- **Key Findings**:
  - The Graphiti `OpenAIEmbedder` already accepts an arbitrary `base_url` and `api_key`. Pointing it at Ollama's OpenAI-compatible `/v1/embeddings` endpoint requires **no code change**, only documentation.
  - The silent placeholder-UUID fallback in `_GraphNamespace.add_batch` violates the project's existing background-task error-handling contract (`error-handling.md`: "Long-running tasks must always reach a terminal state"). The plumbing to surface a failure already exists in `_build_graph_worker`.
  - `mxbai-embed-large` is the only widely-available local embedder that matches Graphiti's hard-coded `EMBEDDING_DIM = 1024`. Smaller models (`nomic-embed-text` at 768) would silently mis-fit Neo4j vector indexes and are out of scope.

## Research Log

### Ollama's OpenAI-compatible embeddings API

- **Context**: Verify that no Ollama-specific Graphiti embedder class is required.
- **Sources Consulted**: Existing code at `backend/app/services/graphiti_adapter.py:92–115` (`OpenAIEmbedderConfig` accepts arbitrary `base_url`); ticket #18 description; Graphiti `embedder/client.py:22` (`EMBEDDING_DIM = 1024`).
- **Findings**:
  - Ollama exposes `POST /v1/embeddings` mirroring the OpenAI shape.
  - The current `_build_llm_and_embedder("openai")` branch already uses `EMBEDDING_API_KEY or LLM_API_KEY` and `EMBEDDING_BASE_URL or LLM_BASE_URL`, so any OpenAI-compatible endpoint just works.
  - Ollama ignores the auth header, but `OpenAIEmbedderConfig` requires a non-empty `api_key`; the literal string `"ollama"` is the de-facto convention.
- **Implications**: This is a documentation-only ask for R1. No new provider literal, no new factory branch.
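The fallback and request shape described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the adapter's actual code: the helper names (`resolve_embedder_settings`, `build_embeddings_request`, `check_dimension`) are hypothetical, `EMBEDDING_MODEL` is an assumed name for the model variable, and `http://localhost:11434/v1` is Ollama's default local address, which may differ in a given deployment (for example `host.docker.internal` from inside a container).

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Mapping, Sequence

EXPECTED_DIM = 1024  # mirrors the EMBEDDING_DIM value cited in the findings


@dataclass(frozen=True)
class EmbedderSettings:
    api_key: str
    base_url: str
    model: str


def resolve_embedder_settings(env: Mapping[str, str]) -> EmbedderSettings:
    """Apply the `EMBEDDING_* or LLM_*` fallback to a mapping of env vars."""
    api_key = env.get("EMBEDDING_API_KEY") or env.get("LLM_API_KEY") or ""
    base_url = env.get("EMBEDDING_BASE_URL") or env.get("LLM_BASE_URL") or ""
    model = env.get("EMBEDDING_MODEL") or ""  # assumed variable name
    if not api_key:
        # Ollama ignores the key, but the config still needs a non-empty value.
        raise ValueError("api_key must be non-empty; 'ollama' is the convention")
    if not base_url or not model:
        raise ValueError("base_url and model must both be set")
    return EmbedderSettings(api_key, base_url.rstrip("/"), model)


def build_embeddings_request(
    settings: EmbedderSettings, text: str
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-shaped POST /v1/embeddings request."""
    body = json.dumps({"model": settings.model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{settings.base_url}/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {settings.api_key}",
        },
        method="POST",
    )


def check_dimension(vector: Sequence[float]) -> None:
    """Reject vectors that would not fit a 1024-dimension vector index."""
    if len(vector) != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} dimensions, got {len(vector)}")
```

Sending the request needs a running Ollama instance with the model pulled. In the OpenAI response shape the vector sits at `data[0].embedding`, and passing it through `check_dimension` would catch a 768-dimension model such as `nomic-embed-text` before it reaches Neo4j.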

### Failure-propagation contract

- **Context**: Confirm that removing the broad `except` in `_GraphNamespace.add_batch` will result in `Task.status = FAILED` in the UI.
- **Sources Consulted**:
  - `.kiro/steering/error-handling.md` § Background Task Errors: the outer `except Exception` in the worker calls `fail_task(task_id, str(e))`.
  - `backend/app/services/graph_builder.py:289–308`: `add_text_batches` already wraps `client.graph.add_batch` in `try/except` and re-raises after a localized progress message.
  - `backend/app/services/graph_builder.py:231–234`: `_build_graph_worker` catches every exception and calls `self.task_manager.fail_task(task_id, error_msg)` with a full traceback.
- **Findings**: The chain `add_episode → _GraphNamespace.add_batch → add_text_batches → _build_graph_worker → fail_task` is intact except for the swallow at the adapter layer. Removing the swallow is sufficient; no caller-side change is required.
- **Implications**: R2.3 / R2.5 are realized for free as soon as R2.2 is implemented.

### Single vs. batch ingestion path

- **Context**: Determine whether the single-episode `_GraphNamespace.add(...)` (line 441) needs a parallel fix.
- **Sources Consulted**: `graphiti_adapter.py:441–453`. No `try/except`; exceptions bubble naturally.
- **Findings**: Only the batch path swallows. The single path already complies.
- **Implications**: Fix is local to `add_batch`. Do not introduce symmetric handling in `add(...)`.

### Logging level

- **Context**: Decide between `WARNING` and `ERROR` for the failure log line.
- **Sources Consulted**: `.kiro/steering/error-handling.md` § Logging:
  - `ERROR`: task failure, unrecoverable exception
  - `WARNING`: retry triggered, transient failure, recovered state
- **Findings**: A failure that terminates the surrounding task is unrecoverable from the task's perspective, so `ERROR` is correct. The current `WARNING` is mislabelled.
- **Implications**: R2.4: change the log call to `logger.exception(...)`, which logs at ERROR with a traceback.

### Documentation surfaces

- **Context**: Decide which files need updating to satisfy R1.
- **Sources Consulted**: `.env.example` (canonical config), `CLAUDE.md` lines 60–82, `README.md` lines 148–165, `docker-compose.yml` lines 21–37.
- **Findings**: All four are appropriate. `README.md` already has a placeholder for "non-OpenAI provider" and is the natural home for the `curl` smoke test snippet. `docker-compose.yml` benefits from one additional comment about `host.docker.internal`.
- **Implications**: Update all four; keep edits minimal and additive.

## Architecture Pattern Evaluation

| Option | Description | Strengths | Risks / Limitations | Notes |
|--------|-------------|-----------|---------------------|-------|
| A. Drop swallow + docs | Remove `except` block in `add_batch`; update four docs files | Smallest surface; honors steering rules; symmetric with `add()` | Loses (broken) "best effort" intent | Recommended |
| B. Narrow + retry | Catch only transient classes (`httpx.TimeoutException`, `openai.APIConnectionError`); use `retry_with_backoff` from `app/utils/retry.py`; raise everything else | Adds resilience to genuine network blips | More moving parts; would also need to update `add()` for symmetry | Defer to follow-up |
| C. New `ollama` provider literal | Extend `_build_llm_and_embedder` with a third branch | Symmetric with `openai`/`gemini` | Explicitly out of scope per ticket; duplicate code path (Ollama is OpenAI-SDK with custom `base_url`) | Rejected |

## Design Decisions

### Decision: Adopt Option A (drop the placeholder fallback entirely; documentation only for Ollama support)

- **Context**: R2 mandates that embedding failures during graph build surface as visible task failures. R1 mandates documentation for an Ollama embedder. The adapter already supports any OpenAI-compatible base URL.
- **Alternatives Considered**:
  1. **Option B (narrow + retry)**: keep a small `except` clause for transient errors and use the project's `retry_with_backoff`.
  2. **Option C (new provider literal)**: add an `ollama` branch in `_build_llm_and_embedder`.
- **Selected Approach**:
  - In `_GraphNamespace.add_batch`, remove the placeholder fallback from the `try/except Exception` block so that failures from `_run(self._g.add_episode(...))` propagate to the caller.
  - Logging immediately before the re-raise is not strictly necessary, because `_build_graph_worker` already calls `logger.exception(f"task {task_id} failed")` per the error-handling steering. To honor R2.4 explicitly, still wrap the call in a narrow `try/except` that calls `logger.exception(...)` and then re-raises, so the adapter-level context (`group_id`, episode index) is captured before the exception bubbles. The resulting second log line at the task layer is accepted.
  - Update `.env.example`, `CLAUDE.md`, `README.md`, and `docker-compose.yml` to document Ollama configuration (R1).
- **Rationale**:
  - The ticket explicitly lists transient-retry behavior and per-provider factory as out of scope.
  - Steering's error-handling chapter forbids catch-and-continue in service code.
  - Smaller surface = lower regression risk.
- **Trade-offs**:
  - +Visibility: real config errors now surface at the UI.
  - +Code symmetry: `add()` and `add_batch()` behave the same on failure.
  - −One-time noise: operators whose graph builds were "succeeding" only because of the silent fallback will now see a failed task. This is the intended correction; mention in PR body.
- **Follow-up**:
  - If transient blips become an operational issue, revisit Option B in a separate ticket using `retry_with_backoff` against `_g.add_episode`.

### Decision: Use `logger.exception(...)` not `logger.error(...)`

- **Context**: R2.4 requires ERROR-level logging of the underlying exception.
- **Alternatives Considered**: `logger.error(str(e))` (no traceback), `logger.warning(...)` (current behavior).
- **Selected Approach**: `logger.exception("Episode add failed (group_id=%s)", graph_id)` then `raise`.
- **Rationale**: `logger.exception` logs at ERROR with the full traceback, which is what the steering doc prescribes for unrecoverable adapter failures.
- **Trade-offs**: A small amount of duplication, since `_build_graph_worker` also logs via `logger.exception`. Acceptable: the two log lines describe different layers (adapter vs. task) and carry different identifying context.

### Decision: Document Ollama under the existing OpenAI provider, not as a separate provider literal

- **Context**: The ticket lists "per-provider embedder factory" as out of scope; Ollama is already reachable via the existing `openai` branch.
- **Selected Approach**: Document Ollama as a configuration *choice* of the existing `openai` Graphiti provider (set the three `EMBEDDING_*` env vars).
- **Rationale**: Avoids code duplication and matches the ticket's scope.

## Risks & Mitigations

- **Risk**: Operators currently relying on the silent fallback see new failed tasks. **Mitigation**: PR body calls this out explicitly with a "what changed" note pointing at the embedder env vars.
- **Risk**: With the `except` removed, a transient timeout can intermittently fail the entire graph build. **Mitigation**: Documented as a known follow-up (Option B). Acceptable today because the alternative was an empty graph that *looked* successful.
- **Risk**: Documentation drifts between `.env.example`, `CLAUDE.md`, `README.md`, and `docker-compose.yml`. **Mitigation**: Keep all four edits in this PR and reference the same env-var triple verbatim.

## References

- Ticket #18: `.ticket/18.md` (snapshot in this repo)
- Steering: `.kiro/steering/error-handling.md` § Background Task Errors and § Logging
- Steering: `.kiro/steering/tech.md` § Key Libraries (`graphiti-core` adapter rule)
- Code: `backend/app/services/graphiti_adapter.py:92–115, :441–475`
- Code: `backend/app/services/graph_builder.py:143–234, :256–310`
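
## Appendix: failure-path sketch

The log-then-re-raise behavior selected for `add_batch` can be sketched with stand-ins. Everything here is hypothetical and only illustrates the pattern: `add_batch` and `run_build_task` are simplified free functions, not the real `_GraphNamespace.add_batch` or `_build_graph_worker`; the `add_episode` callable, the `task` dict, and the `"COMPLETED"` status string are assumptions made for the example.

```python
import logging
from typing import Callable, Dict, Sequence

logger = logging.getLogger("graphiti_adapter_sketch")


def add_batch(
    add_episode: Callable[[str, str], None],
    graph_id: str,
    episodes: Sequence[str],
) -> int:
    """Add episodes in order; on failure, log adapter context and re-raise."""
    for index, episode in enumerate(episodes):
        try:
            add_episode(graph_id, episode)
        except Exception:
            # ERROR level with traceback; no placeholder is substituted.
            logger.exception(
                "Episode add failed (group_id=%s, episode_index=%d)",
                graph_id,
                index,
            )
            raise
    return len(episodes)


def run_build_task(
    task: Dict[str, str],
    add_episode: Callable[[str, str], None],
    graph_id: str,
    episodes: Sequence[str],
) -> Dict[str, str]:
    """Stand-in worker: the task always ends in a terminal state."""
    try:
        add_batch(add_episode, graph_id, episodes)
        task["status"] = "COMPLETED"  # assumed status name
    except Exception as exc:
        task["status"] = "FAILED"
        task["error"] = str(exc)
    return task
```

With a callable that raises on the second episode, the third episode is never attempted and the task ends as `FAILED` with the original error message, which is the visible behavior R2 asks for.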