Research & Design Decisions — graphiti-ollama-embedder
Summary
- Feature: `graphiti-ollama-embedder`
- Discovery Scope: Extension (small, narrowly scoped change to an existing adapter + supporting docs)
- Key Findings:
  - The Graphiti `OpenAIEmbedder` already accepts an arbitrary `base_url` and `api_key`. Pointing it at Ollama's OpenAI-compatible `/v1/embeddings` endpoint requires no code change, only documentation.
  - The silent placeholder-UUID fallback in `_GraphNamespace.add_batch` violates the project's existing background-task error-handling contract (`error-handling.md`: "Long-running tasks must always reach a terminal state"). The plumbing to surface a failure already exists in `_build_graph_worker`.
  - `mxbai-embed-large` is the only widely-available local embedder that matches Graphiti's hard-coded `EMBEDDING_DIM = 1024`. Smaller models (`nomic-embed-text` at 768) would silently mis-fit Neo4j vector indexes and are out of scope.
Research Log
Ollama's OpenAI-compatible embeddings API
- Context: Verify that no Ollama-specific Graphiti embedder class is required.
- Sources Consulted: Existing code at `backend/app/services/graphiti_adapter.py:92–115` (`OpenAIEmbedderConfig` accepts arbitrary `base_url`); ticket #18 description; Graphiti `embedder/client.py:22` (`EMBEDDING_DIM = 1024`).
- Findings:
  - Ollama exposes `POST /v1/embeddings` mirroring the OpenAI shape.
  - The current `_build_llm_and_embedder("openai")` branch already uses `EMBEDDING_API_KEY or LLM_API_KEY` and `EMBEDDING_BASE_URL or LLM_BASE_URL`, so any OpenAI-compatible endpoint just works.
  - Ollama ignores the auth header but `OpenAIEmbedderConfig` requires a non-empty `api_key`; the literal string `"ollama"` is the de-facto convention.
- Implications: This is a documentation-only ask for R1. No new provider literal, no new factory branch.
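The fallback described in the findings can be illustrated with a minimal sketch. The helper name `resolve_embedder_settings` and the `EMBEDDING_MODEL` variable are assumptions made for illustration; only the `EMBEDDING_API_KEY`, `EMBEDDING_BASE_URL`, `LLM_API_KEY`, and `LLM_BASE_URL` names come from the findings above. The URL assumes Ollama's default port (11434) on the local host.

```python
from typing import Mapping


def resolve_embedder_settings(env: Mapping[str, str]) -> dict:
    """Hypothetical helper mirroring the `X or Y` fallback described above."""
    # Empty strings are falsy, so an unset or blank EMBEDDING_* value
    # falls back to the corresponding LLM_* value.
    api_key = env.get("EMBEDDING_API_KEY") or env.get("LLM_API_KEY")
    base_url = env.get("EMBEDDING_BASE_URL") or env.get("LLM_BASE_URL")
    if not api_key:
        # The findings note that the embedder config needs a non-empty api_key.
        raise ValueError("embedder api_key must be non-empty")
    return {
        "api_key": api_key,
        "base_url": base_url,
        # EMBEDDING_MODEL is an assumed variable name, not taken from the source.
        "embedding_model": env.get("EMBEDDING_MODEL"),
    }


ollama_env = {
    "LLM_API_KEY": "sk-example",
    "LLM_BASE_URL": "https://api.openai.com/v1",
    "EMBEDDING_API_KEY": "ollama",  # placeholder; Ollama ignores the auth header
    "EMBEDDING_BASE_URL": "http://localhost:11434/v1",
    "EMBEDDING_MODEL": "mxbai-embed-large",
}
settings = resolve_embedder_settings(ollama_env)
```

With the `EMBEDDING_*` values set, the embedder points at Ollama while the LLM settings stay untouched; with them unset, the embedder reuses the LLM endpoint.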
Failure-propagation contract
- Context: Confirm that removing the broad `except` in `_GraphNamespace.add_batch` will result in `Task.status = FAILED` in the UI.
- Sources Consulted:
  - `.kiro/steering/error-handling.md` § Background Task Errors: outer `except Exception` in worker calls `fail_task(task_id, str(e))`.
  - `backend/app/services/graph_builder.py:289–308`: `add_text_batches` already wraps `client.graph.add_batch` in `try/except` and re-raises after a localized progress message.
  - `backend/app/services/graph_builder.py:231–234`: `_build_graph_worker` catches every exception and calls `self.task_manager.fail_task(task_id, error_msg)` with a full traceback.
- Findings: The chain `add_episode → _GraphNamespace.add_batch → add_text_batches → _build_graph_worker → fail_task` is intact except for the swallow at the adapter layer. Removing the swallow is sufficient; no caller-side change is required.
- Implications: R2.3 / R2.5 are realized for free as soon as R2.2 is implemented.
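The propagation chain can be modelled with a small, self-contained sketch. Everything here is a stand-in: `FakeTaskManager`, `complete_task`, and the simplified function signatures are hypothetical and only mimic the layering described above, not the project's actual code.

```python
from enum import Enum


class TaskStatus(Enum):
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


class FakeTaskManager:
    """Stand-in for the project's task manager (hypothetical shape)."""

    def __init__(self) -> None:
        self.status = TaskStatus.PROCESSING
        self.error = None

    def complete_task(self, task_id: str) -> None:
        self.status = TaskStatus.COMPLETED

    def fail_task(self, task_id: str, error: str) -> None:
        self.status = TaskStatus.FAILED
        self.error = error


def add_batch(episodes, add_episode):
    # Adapter layer after the fix: no broad except, so failures propagate.
    return [add_episode(episode) for episode in episodes]


def add_text_batches(episodes, add_episode):
    # Caller layer: may report progress, but always re-raises.
    try:
        return add_batch(episodes, add_episode)
    except Exception:
        raise


def build_graph_worker(task_id, episodes, add_episode, tasks):
    # Outer worker: every exception ends in a terminal FAILED state.
    try:
        add_text_batches(episodes, add_episode)
        tasks.complete_task(task_id)
    except Exception as exc:
        tasks.fail_task(task_id, str(exc))


def broken_embedder(episode):
    raise ConnectionError("embedder unreachable")


tasks = FakeTaskManager()
build_graph_worker("task-1", ["chunk a", "chunk b"], broken_embedder, tasks)
```

In this model the task ends as `FAILED` with the embedder's message attached, which is the behavior the findings expect once the adapter-level swallow is gone.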
Single vs. batch ingestion path
- Context: Determine whether the single-episode `_GraphNamespace.add(...)` (line 441) needs a parallel fix.
- Sources Consulted: `graphiti_adapter.py:441–453`. No `try/except`; exceptions bubble naturally.
- Findings: Only the batch path swallows. The single path already complies.
- Implications: Fix is local to `add_batch`. Do not introduce symmetric handling in `add(...)`.
Logging level
- Context: Decide between `WARNING` and `ERROR` for the failure log line.
- Sources Consulted: `.kiro/steering/error-handling.md` § Logging:
  - `ERROR`: task failure, unrecoverable exception
  - `WARNING`: retry triggered, transient failure, recovered state
- Findings: A failure that terminates the surrounding task is unrecoverable from the task's perspective, so `ERROR` is correct. The current `WARNING` is mislabelled.
- Implications: R2.4 requires a change to `logger.exception(...)` (which logs at ERROR with traceback).
Documentation surfaces
- Context: Decide which files need updating to satisfy R1.
- Sources Consulted: `.env.example` (canonical config), `CLAUDE.md` lines 60–82, `README.md` lines 148–165, `docker-compose.yml` lines 21–37.
- Findings: All four are appropriate. `README.md` already has a placeholder for "non-OpenAI provider" and is the natural home for the `curl` smoke test snippet. `docker-compose.yml` benefits from one additional comment about `host.docker.internal`.
- Implications: Update all four; keep edits minimal and additive.
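A smoke test against the embeddings endpoint is most useful if it also checks the vector width, since a 768-dimension model would not match `EMBEDDING_DIM = 1024`. The sketch below validates an OpenAI-shaped embeddings response that has already been parsed into a dict. The function name `check_embeddings_payload` is hypothetical, and the sample payload is hand-built rather than real model output.

```python
EXPECTED_DIM = 1024  # matches the EMBEDDING_DIM value cited in this document


def check_embeddings_payload(payload: dict, expected_dim: int = EXPECTED_DIM) -> int:
    """Validate an OpenAI-shaped embeddings response; return the vector count."""
    data = payload.get("data")
    if not data:
        raise ValueError("response contains no embeddings")
    for item in data:
        vector = item["embedding"]
        if len(vector) != expected_dim:
            raise ValueError(
                f"embedding has {len(vector)} dimensions, expected {expected_dim}"
            )
    return len(data)


# Hand-built sample in the OpenAI response shape (not real model output).
sample = {
    "object": "list",
    "data": [{"object": "embedding", "index": 0, "embedding": [0.0] * 1024}],
}
count = check_embeddings_payload(sample)
```

Feeding the parsed JSON body of a `POST /v1/embeddings` call into this check would flag a mis-sized model before any graph build starts.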
Architecture Pattern Evaluation
| Option | Description | Strengths | Risks / Limitations | Notes |
|---|---|---|---|---|
| A. Drop swallow + docs | Remove `except` block in `add_batch`; update four docs files | Smallest surface; honors steering rules; symmetric with `add()` | Loses (broken) "best effort" intent | Recommended |
| B. Narrow + retry | Catch only transient classes (`httpx.TimeoutException`, `openai.APIConnectionError`); use `retry_with_backoff` from `app/utils/retry.py`; raise everything else | Adds resilience to genuine network blips | More moving parts; would also need to update `add()` for symmetry | Defer to follow-up |
| C. New `ollama` provider literal | Extend `_build_llm_and_embedder` with a third branch | Symmetric with `openai`/`gemini` | Explicitly out of scope per ticket; duplicate code path (Ollama is OpenAI-SDK with custom `base_url`) | Rejected |
Design Decisions
Decision: Adopt Option A (drop the placeholder fallback entirely; documentation only for Ollama support)
- Context: R2 mandates that embedding failures during graph build surface as visible task failures. R1 mandates documentation for an Ollama embedder. The adapter already supports any OpenAI-compatible base URL.
- Alternatives Considered:
  - Option B (narrow + retry): keep a small `except` clause for transient errors and use the project's `retry_with_backoff`.
  - Option C (new provider literal): add an `ollama` branch in `_build_llm_and_embedder`.
- Selected Approach:
  - In `_GraphNamespace.add_batch`, replace the swallowing `try/except Exception` block with a straightforward call, so that failures from `_run(self._g.add_episode(...))` propagate to the caller.
  - Logging immediately before the re-raise is not strictly necessary, because `_build_graph_worker` already calls `logger.exception(f"task {task_id} failed")` per the error-handling steering. To honor R2.4 explicitly, wrap the call in a narrow `try/except` that calls `logger.exception(...)` and then re-raises, so the adapter-level context (group_id, episode index) is captured before the exception bubbles up.
  - Update `.env.example`, `CLAUDE.md`, `README.md`, and `docker-compose.yml` to document Ollama configuration (R1).
- Rationale:
  - The ticket explicitly lists transient-retry behavior and per-provider factory as out of scope.
  - Steering's error-handling chapter forbids catch-and-continue in service code.
  - Smaller surface = lower regression risk.
- Trade-offs:
  - +Visibility: real config errors now surface at the UI.
  - +Code symmetry: `add()` and `add_batch()` behave the same on failure.
  - −One-time noise: operators whose graph builds were "succeeding" only because of the silent fallback will now see a failed task. This is the intended correction; mention in PR body.
- Follow-up:
  - If transient blips become an operational issue, revisit Option B in a separate ticket using `retry_with_backoff` against `_g.add_episode`.
Decision: Use logger.exception(...) not logger.error(...)
- Context: R2.4 requires ERROR-level logging of the underlying exception.
- Alternatives Considered: `logger.error(str(e))` (no traceback), `logger.warning(...)` (current behavior).
- Selected Approach: `logger.exception("Episode add failed (group_id=%s)", graph_id)` then `raise`.
- Rationale: `logger.exception` logs at ERROR with the full traceback, which is what the steering doc prescribes for unrecoverable adapter failures.
- Trade-offs: A small amount of duplication if `_build_graph_worker` also logs via `logger.exception`. This is acceptable: the two log lines describe different layers (adapter vs. task) and have different identifying context.
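The log-then-re-raise pattern can be sketched with the standard `logging` module alone. The simplified `add_batch` signature, the `ListHandler` class, and the logger name are illustrative assumptions; only the log message format is taken from the selected approach above.

```python
import logging

logger = logging.getLogger("graphiti_adapter_sketch")


class ListHandler(logging.Handler):
    """Collects records in memory so the sketch can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


handler = ListHandler()
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger


def add_batch(graph_id, episodes, add_episode):
    # Narrow wrapper: log adapter-level context at ERROR, then re-raise unchanged.
    for episode in episodes:
        try:
            add_episode(episode)
        except Exception:
            logger.exception("Episode add failed (group_id=%s)", graph_id)
            raise


def failing_add_episode(episode):
    raise RuntimeError("embedding request rejected")


try:
    add_batch("graph-42", ["chunk a"], failing_add_episode)
    propagated = False
except RuntimeError:
    propagated = True
```

The bare `raise` preserves the original exception and traceback, so the task-level handler still sees the same error while the adapter log line carries the `group_id`.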
Decision: Document Ollama under the existing OpenAI provider, not as a separate provider literal
- Context: The ticket lists "per-provider embedder factory" as out of scope; Ollama is already reachable via the existing `openai` branch.
- Selected Approach: Document Ollama as a configuration choice of the existing `openai` Graphiti provider (set the three `EMBEDDING_*` env vars).
- Rationale: Avoids code duplication and matches the ticket's scope.
Risks & Mitigations
- Risk: Operators currently relying on the silent fallback see new failed tasks. Mitigation: PR body calls this out explicitly with a "what changed" note pointing at the embedder env vars.
- Risk: With the `except` removed, a transient timeout can intermittently fail the entire graph build. Mitigation: Documented as a known follow-up (Option B). Acceptable today because the alternative was an empty graph that looked successful.
- Risk: Documentation drifts between `.env.example`, `CLAUDE.md`, `README.md`, and `docker-compose.yml`. Mitigation: Keep all four edits in this PR and reference the same env-var triple verbatim.
References
- Ticket #18: `.ticket/18.md` (snapshot in this repo)
- Steering: `.kiro/steering/error-handling.md` § Background Task Errors and § Logging
- Steering: `.kiro/steering/tech.md` § Key Libraries (`graphiti-core` adapter rule)
- Code: `backend/app/services/graphiti_adapter.py:92–115`, `:441–475`
- Code: `backend/app/services/graph_builder.py:143–234`, `:256–310`