9.5 KiB

Raw Blame History

Research & Design Decisions — graphiti-ollama-embedder

Summary

Feature: graphiti-ollama-embedder
Discovery Scope: Extension (small, narrowly scoped change to an existing adapter + supporting docs)
Key Findings:
- The Graphiti OpenAIEmbedder already accepts an arbitrary base_url and api_key. Pointing it at Ollama's OpenAI-compatible /v1/embeddings endpoint requires no code change — only documentation.
- The silent placeholder-UUID fallback in _GraphNamespace.add_batch violates the project's existing background-task error-handling contract (error-handling.md: "Long-running tasks must always reach a terminal state"). The plumbing to surface a failure already exists in _build_graph_worker.
- mxbai-embed-large is the only widely-available local embedder that matches Graphiti's hard-coded EMBEDDING_DIM = 1024. Smaller models (nomic-embed-text at 768) would silently mis-fit Neo4j vector indexes and are out of scope.

Research Log

Ollama's OpenAI-compatible embeddings API

Context: Verify that no Ollama-specific Graphiti embedder class is required.
Sources Consulted: Existing code at backend/app/services/graphiti_adapter.py:92–115 (OpenAIEmbedderConfig accepts arbitrary base_url); ticket #18 description; Graphiti embedder/client.py:22 (EMBEDDING_DIM = 1024).
Findings:
- Ollama exposes POST /v1/embeddings mirroring the OpenAI shape.
- The current _build_llm_and_embedder("openai") branch already uses EMBEDDING_API_KEY or LLM_API_KEY and EMBEDDING_BASE_URL or LLM_BASE_URL, so any OpenAI-compatible endpoint just works.
- Ollama ignores the auth header but OpenAIEmbedderConfig requires a non-empty api_key; the literal string "ollama" is the de-facto convention.
Implications: This is a documentation-only ask for R1. No new provider literal, no new factory branch.

Failure-propagation contract

Context: Confirm that removing the broad except in _GraphNamespace.add_batch will result in Task.status = FAILED in the UI.
Sources Consulted:
- .kiro/steering/error-handling.md § Background Task Errors — outer except Exception in worker calls fail_task(task_id, str(e)).
- backend/app/services/graph_builder.py:289–308 — add_text_batches already wraps client.graph.add_batch in try/except and re-raises after a localized progress message.
- backend/app/services/graph_builder.py:231–234 — _build_graph_worker catches every exception and calls self.task_manager.fail_task(task_id, error_msg) with a full traceback.
Findings: The chain add_episode → _GraphNamespace.add_batch → add_text_batches → _build_graph_worker → fail_task is intact except for the swallow at the adapter layer. Removing the swallow is sufficient; no caller-side change is required.
Implications: R2.3 / R2.5 are realized for free as soon as R2.2 is implemented.

Single vs. batch ingestion path

Context: Determine whether the single-episode _GraphNamespace.add(...) (line 441) needs a parallel fix.
Sources Consulted: graphiti_adapter.py:441–453. No try/except; exceptions bubble naturally.
Findings: Only the batch path swallows. The single path already complies.
Implications: Fix is local to add_batch. Do not introduce symmetric handling in add(...).

Logging level

Context: Decide between WARNING and ERROR for the failure log line.
Sources Consulted: .kiro/steering/error-handling.md § Logging:
- ERROR — task failure, unrecoverable exception
- WARNING — retry triggered, transient failure, recovered state
Findings: A failure that terminates the surrounding task is unrecoverable from the task's perspective, so ERROR is correct. The current WARNING is mislabelled.
Implications: R2.4 — change to logger.exception(...) (which logs at ERROR with traceback).

Documentation surfaces

Context: Decide which files need updating to satisfy R1.
Sources Consulted: .env.example (canonical config), CLAUDE.md lines 60–82, README.md lines 148–165, docker-compose.yml lines 21–37.
Findings: All four are appropriate. README.md already has a placeholder for "non-OpenAI provider" and is the natural home for the curl smoke test snippet. docker-compose.yml benefits from one additional comment about host.docker.internal.
Implications: Update all four; keep edits minimal and additive.

Architecture Pattern Evaluation

Option	Description	Strengths	Risks / Limitations	Notes
A. Drop swallow + docs	Remove `except` block in `add_batch`; update four docs files	Smallest surface; honors steering rules; symmetric with `add()`	Loses (broken) "best effort" intent	Recommended
B. Narrow + retry	Catch only transient classes (`httpx.TimeoutException`, `openai.APIConnectionError`); use `retry_with_backoff` from `app/utils/retry.py`; raise everything else	Adds resilience to genuine network blips	More moving parts; would also need to update `add()` for symmetry	Defer to follow-up
C. New `ollama` provider literal	Extend `_build_llm_and_embedder` with a third branch	Symmetric with `openai`/`gemini`	Explicitly out of scope per ticket; duplicate code path (Ollama is OpenAI-SDK with custom `base_url`)	Rejected

Design Decisions

Decision: Adopt Option A (drop the placeholder fallback entirely; documentation only for Ollama support)

Context: R2 mandates that embedding failures during graph build surface as visible task failures. R1 mandates documentation for an Ollama embedder. The adapter already supports any OpenAI-compatible base URL.
Alternatives Considered:
1. Option B (narrow + retry) — keep a small except clause for transient errors and use the project's retry_with_backoff.
2. Option C (new provider literal) — add an ollama branch in _build_llm_and_embedder.
Selected Approach:
- In _GraphNamespace.add_batch, replace the try/except Exception block with a straightforward call. Failures from _run(self._g.add_episode(...)) propagate to the caller.
- Use logger.exception(...) immediately before re-raise is unnecessary — _build_graph_worker already calls logger.exception(f"task {task_id} failed") per the error-handling steering. To honor R2.4 explicitly without double-logging, wrap the call in a narrow try/except: logger.exception(...); raise so the adapter-level context (group_id, episode index) is captured before bubbling.
- Update .env.example, CLAUDE.md, README.md, and docker-compose.yml to document Ollama configuration (R1).
Rationale:
- The ticket explicitly lists transient-retry behavior and per-provider factory as out of scope.
- Steering's error-handling chapter forbids catch-and-continue in service code.
- Smaller surface = lower regression risk.
Trade-offs:
- +Visibility: real config errors now surface at the UI.
- +Code symmetry: add() and add_batch() behave the same on failure.
- −One-time noise: operators whose graph builds were "succeeding" only because of the silent fallback will now see a failed task. This is the intended correction; mention in PR body.
Follow-up:
- If transient blips become an operational issue, revisit Option B in a separate ticket using retry_with_backoff against _g.add_episode.

Decision: Use `logger.exception(...)` not `logger.error(...)`

Context: R2.4 requires ERROR-level logging of the underlying exception.
Alternatives Considered: logger.error(str(e)) (no traceback), logger.warning(...) (current behavior).
Selected Approach: logger.exception("Episode add failed (group_id=%s)", graph_id) then raise.
Rationale: logger.exception logs at ERROR with the full traceback, which is what the steering doc prescribes for unrecoverable adapter failures.
Trade-offs: A small amount of duplication if _build_graph_worker also logs via logger.exception. Acceptable — the two log lines describe different layers (adapter vs. task) and have different identifying context.

Decision: Document Ollama under the existing OpenAI provider, not as a separate provider literal

Context: The ticket lists "per-provider embedder factory" as out of scope; Ollama is already reachable via the existing openai branch.
Selected Approach: Document Ollama as a configuration choice of the existing openai Graphiti provider (set the three EMBEDDING_* env vars).
Rationale: Avoids code duplication and matches the ticket's scope.

Risks & Mitigations

Risk: Operators currently relying on the silent fallback see new failed tasks. Mitigation: PR body calls this out explicitly with a "what changed" note pointing at the embedder env vars.
Risk: The except is removed but a transient timeout intermittently fails the entire graph build. Mitigation: Documented as a known follow-up (Option B). Acceptable today because the alternative was an empty graph that looked successful.
Risk: Documentation drifts between .env.example, CLAUDE.md, README.md. Mitigation: Keep all four edits in this PR and reference the same env-var triple verbatim.

References

Ticket #18 — .ticket/18.md (snapshot in this repo)
Steering — .kiro/steering/error-handling.md § Background Task Errors and § Logging
Steering — .kiro/steering/tech.md § Key Libraries (graphiti-core adapter rule)
Code — backend/app/services/graphiti_adapter.py:92–115, :441–475
Code — backend/app/services/graph_builder.py:143–234, :256–310

9.5 KiB Raw Blame History Unescape Escape

Research & Design Decisions — graphiti-ollama-embedder

Summary

Research Log

Ollama's OpenAI-compatible embeddings API

Failure-propagation contract

Single vs. batch ingestion path

Logging level

Documentation surfaces

Architecture Pattern Evaluation

Design Decisions

Decision: Adopt Option A (drop the placeholder fallback entirely; documentation only for Ollama support)

Decision: Use logger.exception(...) not logger.error(...)

Decision: Document Ollama under the existing OpenAI provider, not as a separate provider literal

Risks & Mitigations

References

9.5 KiB

Raw Blame History

Decision: Use `logger.exception(...)` not `logger.error(...)`