Gap Analysis — graphiti-ollama-embedder
1. Current State Investigation
Domain assets touched by this feature
- `backend/app/services/graphiti_adapter.py`
  - Lines 92–139: `_build_llm_and_embedder(provider)`. Builds an `OpenAIEmbedder` (when `provider == "openai"`) using `EMBEDDING_API_KEY or LLM_API_KEY`, `EMBEDDING_BASE_URL or LLM_BASE_URL`, and `EMBEDDING_MODEL`. Already supports pointing the embedder at any OpenAI-compatible endpoint, so no code change is needed for Ollama support. This is a documentation gap, not a code gap.
  - Lines 455–475: `_GraphNamespace.add_batch`. Iterates episodes, calls `add_episode`, and on `except Exception as e` logs a one-line `WARNING` and substitutes a fresh placeholder UUID. This is the silent-swallow path.
  - Lines 441–453: `_GraphNamespace.add(...)`. Single-episode path. Already raises naturally because there is no `try/except`.
  - Lines 504–506: `_GraphNamespace.search(...)`. Has its own `except Exception` that logs and returns empty results. Per `error-handling.md` ("for non-fatal search failures, log and return empty results") this is the documented contract; out of scope.
- `backend/app/services/graph_builder.py`
  - Lines 256–310: `add_text_batches(...)`. Already wraps `client.graph.add_batch(...)` in `try/except Exception` and re-raises after a progress message. So if `_GraphNamespace.add_batch` raises, the exception propagates correctly.
  - Lines 143–234: `_build_graph_worker`. Outer `try/except Exception` calls `self.task_manager.fail_task(task_id, error_msg)` with `f"{str(e)}\n{traceback.format_exc()}"`. This already implements the "task always terminates" rule from `error-handling.md`.
- `backend/app/config.py`
  - Lines 40, 50–51: defines `EMBEDDING_MODEL`, `EMBEDDING_API_KEY`, `EMBEDDING_BASE_URL`. No change required.
- `.env.example` (project root): currently only documents the OpenAI/Gemini path with commented-out `EMBEDDING_API_KEY` / `EMBEDDING_BASE_URL` lines.
- `CLAUDE.md` lines 60–82: "Required Environment Variables" section lists `EMBEDDING_MODEL` with a note about Gemini overrides only.
- `README.md` lines 148–165: "Required Environment Variables" section, mentions "uncomment if using a non-OpenAI provider, e.g. Gemini" but no Ollama example.
- `docker-compose.yml` lines 21–37: `mirofish` service uses `env_file: .env` and overrides `NEO4J_URI`. No Ollama hint, but the standard `host.docker.internal` route works.
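The `X or Y` fallbacks listed for `_build_llm_and_embedder` can be illustrated with a small standalone sketch. The helper name `resolve_embedder_settings` and the sample values are hypothetical and are not taken from the adapter; the `"ollama"` API key is a placeholder, on the assumption that a local Ollama server does not validate it.

```python
from typing import Mapping, Optional


def resolve_embedder_settings(env: Mapping[str, str]) -> dict:
    """Hypothetical helper mirroring the `EMBEDDING_* or LLM_*` fallbacks."""

    def pick(primary: str, fallback: str) -> Optional[str]:
        # An empty string counts as unset, matching Python's `a or b` semantics.
        return env.get(primary) or env.get(fallback) or None

    return {
        "api_key": pick("EMBEDDING_API_KEY", "LLM_API_KEY"),
        "base_url": pick("EMBEDDING_BASE_URL", "LLM_BASE_URL"),
        "model": env.get("EMBEDDING_MODEL"),
    }


# Illustrative values only: the LLM stays on its own endpoint while the
# embedder is pointed at a local Ollama server.
ollama_env = {
    "LLM_API_KEY": "sk-llm",
    "LLM_BASE_URL": "https://api.openai.com/v1",
    "EMBEDDING_API_KEY": "ollama",  # placeholder (assumption: not validated by Ollama)
    "EMBEDDING_BASE_URL": "http://localhost:11434/v1",
    "EMBEDDING_MODEL": "mxbai-embed-large",
}
settings = resolve_embedder_settings(ollama_env)
```

With only the `LLM_*` variables set, the same helper falls back to the LLM key and base URL, which is why the OpenAI and Gemini paths would be unaffected by adding the `EMBEDDING_*` overrides.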
Conventions extracted from steering
- `tech.md`: "All graph reads/writes go through the `graphiti_adapter`; do not call Neo4j drivers directly from feature code." The adapter is therefore the right place for the fix.
- `error-handling.md`: "Long-running tasks must always reach a terminal state (`COMPLETED` or `FAILED`)". The silent placeholder UUID violates this.
- `error-handling.md`: "Don't catch `Exception` inside an API handler just to log and continue". The adapter shows the same anti-pattern today.
- `error-handling.md` § Logging: `WARNING` is for "retry triggered, transient failure, recovered state"; `ERROR` is for "task failure, unrecoverable exception". The current `WARNING` mislabels what is actually an unrecoverable failure for the task.
- `tech.md`: Ollama is not currently an officially listed provider. `CLAUDE.md` only enumerates OpenAI and Gemini.
- `commits.md` / `dev-guidelines.md`: 4-space indent, max 120 chars/line, double-quoted Python strings, snake_case, conventional commits.
Integration surfaces
- The `OpenAIEmbedder` from `graphiti_core.embedder.openai` already accepts an arbitrary `base_url`. Ollama exposes `/v1/embeddings` at `http://localhost:11434/v1`. No new client class is required.
- Background-task lifecycle: API handler → `GraphBuilderService.build_graph_async()` → background thread → `_build_graph_worker` → `fail_task(task_id, msg)`. Already in place; this feature just needs to stop short-circuiting it.
2. Requirements Feasibility Analysis
| Req | Need | Maps to | Gap |
|---|---|---|---|
| R1.1 | `.env.example` Ollama block | `.env.example` | Missing (docs) |
| R1.2 | `CLAUDE.md` lists OpenAI/Gemini/Ollama, dim constraint | `CLAUDE.md` | Missing (docs) |
| R1.3 | Docker-compose / README note about `host.docker.internal:11434` | `docker-compose.yml` comments / `README.md` | Missing (docs) |
| R1.4 | `curl` smoke-test snippet | `README.md` | Missing (docs) |
| R1.5 | Pipeline works end-to-end with `mxbai-embed-large` | adapter is already provider-agnostic via OpenAI-SDK | No code gap: already supported, just undocumented |
| R2.1 | Drop placeholder-UUID fallback | `graphiti_adapter.py:471–473` | Constraint: narrow change only |
| R2.2 | Propagate ingest exception | `graphiti_adapter.py:471–473` + caller | Missing: adapter swallows; caller re-raises if it sees an exception |
| R2.3 | Task transitions to `FAILED` with non-empty error | `graph_builder.py:231–234` | Already implemented; relies on R2.2 |
| R2.4 | Log at `ERROR` level | `graphiti_adapter.py:472` | Missing: currently `WARNING` |
| R2.5 | UI shows error, no fake-success placeholder | downstream of R2.3 | Already implemented via task polling |
| R2.6 | Preserve happy-path UUID contract | `graphiti_adapter.py:455–474` | Constraint: keep return shape on success |
| R3.1 | OpenAI/Gemini behavior unchanged | `_build_llm_and_embedder` | No change needed; branch untouched |
| R3.2 | No new env var | scope rule | Constraint |
| R3.3 | Document 1024-dim constraint | `CLAUDE.md` | Missing (docs) |
Research needed
- None for this feature: `OpenAIEmbedder` already supports custom `base_url`, and Ollama's `/v1/embeddings` is OpenAI-compatible (well-known and used in many projects). The 1024-dim constraint comes from `graphiti_core/embedder/client.py:22` (`EMBEDDING_DIM = 1024`) and is documented in the ticket itself.
- One mild unknown: whether to narrow the `except` to a transient subset (e.g., `httpx.TimeoutException`, `httpx.NetworkError`) and retry, or simply drop the catch entirely. Decided in design phase, not blocking.
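The 1024-dim constraint lends itself to a quick offline check. The sketch below assumes the OpenAI-style embeddings response shape (a `data` list whose items carry an `embedding` vector); the function name `check_embedding_response` and the fake payloads are hypothetical and only illustrate the idea.

```python
EXPECTED_DIM = 1024  # value quoted above from graphiti_core/embedder/client.py:22


def check_embedding_response(payload: dict, expected_dim: int = EXPECTED_DIM) -> int:
    """Hypothetical check: raise if any returned vector has the wrong length.

    Returns the number of vectors that passed the check.
    """
    vectors = [item["embedding"] for item in payload["data"]]
    if not vectors:
        raise ValueError("embedding response contained no vectors")
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(f"vector {index} has {len(vector)} dimensions, expected {expected_dim}")
    return len(vectors)


# Fake payloads standing in for real /v1/embeddings responses.
fake_ok = {"data": [{"embedding": [0.0] * 1024}]}
fake_bad = {"data": [{"embedding": [0.0] * 768}]}

checked = check_embedding_response(fake_ok)
```

Running the same check against `fake_bad` raises a `ValueError` naming the offending dimension, which is the kind of message an operator would want when an incompatible embedding model is configured.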
Complexity signal
- Mostly documentation. The code change is 5 lines in one method.
3. Implementation Approach Options
Option A — Pure narrow fix in _GraphNamespace.add_batch + docs only (RECOMMENDED)
- What: delete the `except Exception` block in `add_batch` (or replace with `logger.exception(...)` + `raise`); update `.env.example`, `CLAUDE.md`, `README.md`, `docker-compose.yml` comments.
- Files: `backend/app/services/graphiti_adapter.py`, `.env.example`, `CLAUDE.md`, `README.md`, `docker-compose.yml`.
- Trade-offs:
  - ✅ Minimal blast radius: adapter behavior outside `add_batch` is untouched.
  - ✅ Existing background-task contract carries the failure to the UI for free.
  - ✅ Honors steering rules: don't catch `Exception` to log-and-continue; tasks must terminate; ERROR-level logging for unrecoverable failures.
  - ❌ Loses the (currently broken) "best effort, keep going on a partial failure" intent. In practice that intent never produced a usable graph anyway, so the loss is theoretical.
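A minimal sketch of the `logger.exception(...)` + `raise` variant of Option A follows. It is a simplified, synchronous stand-in and not the adapter's actual code: `add_batch_fail_loud`, `fake_add_episode`, and the logger name are hypothetical, and the real `add_episode` call may well be asynchronous.

```python
import logging
from typing import Callable, Iterable, List

logger = logging.getLogger("graphiti_adapter_sketch")  # hypothetical logger name


def add_batch_fail_loud(episodes: Iterable[str], add_episode: Callable[[str], str]) -> List[str]:
    """Return one UUID per episode on success; log and re-raise on the first failure."""
    uuids: List[str] = []
    for index, episode in enumerate(episodes):
        try:
            uuids.append(add_episode(episode))
        except Exception:
            # logger.exception logs at ERROR level and attaches the traceback.
            logger.exception("add_episode failed for episode %d", index)
            raise  # no placeholder UUID: the caller sees the original exception
    return uuids


def fake_add_episode(text: str) -> str:
    """Stand-in ingest call: fails on empty input to simulate an embedder error."""
    if not text:
        raise RuntimeError("embedding request failed")
    return f"uuid-for-{text}"


ok_uuids = add_batch_fail_loud(["a", "b"], fake_add_episode)
```

On success the return shape is unchanged (a list of UUIDs), while a failing episode now surfaces as an exception that the existing caller can pass on to `fail_task`.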
Option B — Narrow the catch to transient errors and retry, fail loud on the rest
- What: keep a `try/except`, but only catch a small set of transient classes (`httpx.TimeoutException`, `httpx.NetworkError`, `openai.APIConnectionError`), wrap the whole `add_episode` call in `retry_with_backoff` from `app/utils/retry.py`, and re-raise everything else immediately.
- Trade-offs:
  - ✅ Adds small resilience for genuinely transient blips.
  - ✅ Aligns with the existing `retry_with_backoff` pattern.
  - ❌ More moving parts; broader change for a bug fix.
  - ❌ Single-episode `add()` would also need the same treatment to avoid two divergent retry semantics.
  - ❌ Out-of-scope creep: ticket is focused on stopping the silent swallow + documenting Ollama.
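For comparison, the shape of Option B can be sketched as a retry loop that only catches transient errors. This is not the project's `retry_with_backoff` (its signature is not shown here); `call_with_retry` is a hypothetical helper, and the built-in `TimeoutError` and `ConnectionError` stand in for the `httpx` and `openai` classes named above so the example stays self-contained.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

# Stand-ins for the transient classes named in Option B (assumption for this sketch).
TRANSIENT: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError)


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    transient: Tuple[Type[BaseException], ...] = TRANSIENT,
) -> T:
    """Retry `func` on transient errors with exponential backoff; re-raise anything else."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except transient:
            if attempt == attempts:
                raise  # retries exhausted: fail loud
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky() -> str:
    """Fails twice with a simulated transient error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated blip")
    return "episode-uuid"


result = call_with_retry(flaky, attempts=3, base_delay=0.0)
```

Non-transient exceptions are never caught by the loop, so they propagate on the first attempt, which keeps the fail-loud behavior for everything outside the transient set.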
Option C — Per-provider embedder factory + Option A
- What: extend `_build_llm_and_embedder` with a third provider literal (`"ollama"`) that uses `OpenAIEmbedder` under the hood with hardcoded sensible defaults.
- Trade-offs:
  - ✅ Symmetric with `openai` / `gemini`.
  - ❌ The ticket explicitly lists "per-provider embedder factory" as out of scope.
  - ❌ Duplicate code path: Ollama is just OpenAI-SDK with a different base URL.
4. Effort & Risk
- Effort: S (≤1 day). One file, one method, ~5 LOC delta plus 4 doc edits.
- Risk: Low. The change makes a previously-silent failure loud; it cannot break the happy path because the happy-path branch is the same return statement. Documentation changes are not load-bearing.
One non-zero risk: if there are real-world users today whose graph builds succeed only by accident (i.e., the fallback hides intermittent embedding failures), they will start seeing failed tasks instead of (broken) successful ones. This is the intended correction — but worth noting in the PR description so the operator can re-check their embedder credentials.
5. Recommendations for design phase
- Preferred approach: Option A. Smallest correct fix; documentation reflects the already-supported configuration; follows steering's error-handling philosophy literally.
- Key decisions to lock in design:
  - Drop the `except` entirely, or narrow it? Default: drop. Rationale: the only retry path that matters is transient network blips, and those would also kill the surrounding `_run` loop today; addressing them would be a follow-up using the project's `retry_with_backoff` decorator on the underlying graph driver call, not a band-aid in `add_batch`.
  - Which docs files mention Ollama? Default: `.env.example`, `CLAUDE.md`, `README.md`, `docker-compose.yml` comment. Two-file or three-file split?
- Carry-forward research: none.