MicroFish/.kiro/specs/graphiti-ollama-embedder/research.md

Research & Design Decisions — graphiti-ollama-embedder

Summary

  • Feature: graphiti-ollama-embedder
  • Discovery Scope: Extension (small, narrowly scoped change to an existing adapter + supporting docs)
  • Key Findings:
    • The Graphiti OpenAIEmbedder already accepts an arbitrary base_url and api_key. Pointing it at Ollama's OpenAI-compatible /v1/embeddings endpoint requires no code change — only documentation.
    • The silent placeholder-UUID fallback in _GraphNamespace.add_batch violates the project's existing background-task error-handling contract (error-handling.md: "Long-running tasks must always reach a terminal state"). The plumbing to surface a failure already exists in _build_graph_worker.
    • mxbai-embed-large is the only widely available local embedder that matches Graphiti's hard-coded EMBEDDING_DIM = 1024. Smaller models (nomic-embed-text at 768) would silently mis-fit Neo4j vector indexes and are out of scope.

Research Log

Ollama's OpenAI-compatible embeddings API

  • Context: Verify that no Ollama-specific Graphiti embedder class is required.
  • Sources Consulted: Existing code at backend/app/services/graphiti_adapter.py:92-115 (OpenAIEmbedderConfig accepts arbitrary base_url); ticket #18 description; Graphiti embedder/client.py:22 (EMBEDDING_DIM = 1024).
  • Findings:
    • Ollama exposes POST /v1/embeddings mirroring the OpenAI shape.
    • The current _build_llm_and_embedder("openai") branch already resolves its key as EMBEDDING_API_KEY or LLM_API_KEY and its endpoint as EMBEDDING_BASE_URL or LLM_BASE_URL, so any OpenAI-compatible endpoint works without a code change.
    • Ollama ignores the auth header but OpenAIEmbedderConfig requires a non-empty api_key; the literal string "ollama" is the de-facto convention.
  • Implications: This is a documentation-only ask for R1. No new provider literal, no new factory branch.
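
The fallback described above can be sketched as follows. This is a minimal illustration rather than the adapter's actual code: the helper name resolve_embedder_settings and the EmbedderSettings type are hypothetical, the LLM_* values are placeholders, and the base URL assumes Ollama listening on its default port on the local host.

```python
from typing import Mapping, NamedTuple


class EmbedderSettings(NamedTuple):
    api_key: str
    base_url: str


def resolve_embedder_settings(env: Mapping[str, str]) -> EmbedderSettings:
    """Hypothetical helper mirroring the `EMBEDDING_* or LLM_*` fallback."""
    # Empty strings are falsy, so an unset or blank EMBEDDING_* value
    # falls back to the corresponding LLM_* value.
    api_key = env.get("EMBEDDING_API_KEY") or env.get("LLM_API_KEY") or ""
    base_url = env.get("EMBEDDING_BASE_URL") or env.get("LLM_BASE_URL") or ""
    if not api_key:
        # Mirrors the finding that a non-empty api_key is required,
        # even though Ollama ignores the auth header.
        raise ValueError("an embedder api_key is required, even for Ollama")
    return EmbedderSettings(api_key=api_key, base_url=base_url)


ollama_env = {
    "LLM_API_KEY": "sk-example",  # placeholder, not a real key
    "LLM_BASE_URL": "https://api.openai.com/v1",
    "EMBEDDING_API_KEY": "ollama",  # the conventional dummy value
    "EMBEDDING_BASE_URL": "http://localhost:11434/v1",  # assumed local Ollama
}
settings = resolve_embedder_settings(ollama_env)
```

With both EMBEDDING_* variables set, the embedder talks to Ollama while the LLM settings stay untouched; with neither set, the embedder reuses the LLM endpoint and key.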

Failure-propagation contract

  • Context: Confirm that removing the broad except in _GraphNamespace.add_batch will result in Task.status = FAILED in the UI.
  • Sources Consulted:
    • .kiro/steering/error-handling.md § Background Task Errors — outer except Exception in worker calls fail_task(task_id, str(e)).
    • backend/app/services/graph_builder.py:289-308, where add_text_batches already wraps client.graph.add_batch in try/except and re-raises after a localized progress message.
    • backend/app/services/graph_builder.py:231-234, where _build_graph_worker catches every exception and calls self.task_manager.fail_task(task_id, error_msg) with a full traceback.
  • Findings: The chain add_episode → _GraphNamespace.add_batch → add_text_batches → _build_graph_worker → fail_task is intact except for the swallow at the adapter layer. Removing the swallow is sufficient; no caller-side change is required.
  • Implications: R2.3 / R2.5 are realized for free as soon as R2.2 is implemented.
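
The effect of the swallow on the task's terminal state can be shown with a small simulation. Everything here is a stand-in: FakeTaskManager, complete_task, and the two add_batch variants are hypothetical names used only to contrast catch-and-continue with propagation, and the real worker does more than this sketch.

```python
from enum import Enum


class TaskStatus(Enum):
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


class FakeTaskManager:
    """Stand-in for the real task manager; records the terminal state only."""

    def __init__(self) -> None:
        self.status = TaskStatus.PROCESSING
        self.error = None

    def complete_task(self, task_id: str) -> None:
        self.status = TaskStatus.COMPLETED

    def fail_task(self, task_id: str, error: str) -> None:
        self.status = TaskStatus.FAILED
        self.error = error


def add_episode(text: str) -> str:
    # Simulates an embedder that cannot be reached.
    raise ConnectionError("embedder unreachable")


def add_batch_swallowing(texts):
    # Catch-and-continue: every failure becomes a placeholder id.
    ids = []
    for text in texts:
        try:
            ids.append(add_episode(text))
        except Exception:
            ids.append("placeholder-uuid")
    return ids


def add_batch_propagating(texts):
    # No swallow: the first failure reaches the caller.
    return [add_episode(text) for text in texts]


def build_graph_worker(manager, add_batch, texts, task_id="t1"):
    # Outer catch-all that guarantees a terminal state for the task.
    try:
        add_batch(texts)
        manager.complete_task(task_id)
    except Exception as exc:
        manager.fail_task(task_id, str(exc))


old, new = FakeTaskManager(), FakeTaskManager()
build_graph_worker(old, add_batch_swallowing, ["a", "b"])
build_graph_worker(new, add_batch_propagating, ["a", "b"])
```

In this simulation the swallowing variant ends as COMPLETED even though nothing was embedded, while the propagating variant ends as FAILED with the underlying error message attached.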

Single vs. batch ingestion path

  • Context: Determine whether the single-episode _GraphNamespace.add(...) (line 441) needs a parallel fix.
  • Sources Consulted: graphiti_adapter.py:441-453. No try/except; exceptions bubble naturally.
  • Findings: Only the batch path swallows. The single path already complies.
  • Implications: Fix is local to add_batch. Do not introduce symmetric handling in add(...).

Logging level

  • Context: Decide between WARNING and ERROR for the failure log line.
  • Sources Consulted: .kiro/steering/error-handling.md § Logging:
    • ERROR — task failure, unrecoverable exception
    • WARNING — retry triggered, transient failure, recovered state
  • Findings: A failure that terminates the surrounding task is unrecoverable from the task's perspective, so ERROR is correct. The current WARNING is mislabelled.
  • Implications: R2.4 — change to logger.exception(...) (which logs at ERROR with traceback).
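
The difference between the two calls can be checked with the standard library alone. The sketch below uses an in-memory handler (ListHandler, a name made up for this example) to inspect what each call records; the logger name and message text are illustrative.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


logger = logging.getLogger("adapter_logging_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

try:
    raise RuntimeError("embedding request failed")
except RuntimeError:
    # WARNING level; carries no traceback unless exc_info is passed.
    logger.warning("Episode add failed")
    # ERROR level; attaches the active exception's traceback automatically.
    logger.exception("Episode add failed")

warning_record, error_record = handler.records
```

logger.exception is meant to be called from inside an except block, which is where the adapter-level call would sit.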

Documentation surfaces

  • Context: Decide which files need updating to satisfy R1.
  • Sources Consulted: .env.example (canonical config), CLAUDE.md lines 60-82, README.md lines 148-165, docker-compose.yml lines 21-37.
  • Findings: All four are appropriate. README.md already has a placeholder for "non-OpenAI provider" and is the natural home for the curl smoke test snippet. docker-compose.yml benefits from one additional comment about host.docker.internal.
  • Implications: Update all four; keep edits minimal and additive.
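
For readers who prefer Python to curl, an equivalent smoke test could look like the sketch below. It is an assumption-laden illustration: the base URL assumes a local Ollama on its default port, the helper names build_embeddings_request and embedding_dim are hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

EXPECTED_DIM = 1024  # the EMBEDDING_DIM value cited in the findings


def build_embeddings_request(
    base_url: str, api_key: str, model: str, text: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-compatible embeddings route."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def embedding_dim(response_json: dict) -> int:
    """Return the vector length from an OpenAI-shaped embeddings response."""
    return len(response_json["data"][0]["embedding"])


request = build_embeddings_request(
    "http://localhost:11434/v1", "ollama", "mxbai-embed-large", "hello"
)
# To run the check against a live server (not executed in this sketch):
#   with urllib.request.urlopen(request) as resp:
#       assert embedding_dim(json.load(resp)) == EXPECTED_DIM
```

Checking the returned vector length against 1024 catches a mismatched model before it reaches the Neo4j vector index.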

Architecture Pattern Evaluation

| Option | Description | Strengths | Risks / Limitations | Notes |
|---|---|---|---|---|
| A. Drop swallow + docs | Remove except block in add_batch; update four docs files | Smallest surface; honors steering rules; symmetric with add() | Loses (broken) "best effort" intent | Recommended |
| B. Narrow + retry | Catch only transient classes (httpx.TimeoutException, openai.APIConnectionError); use retry_with_backoff from app/utils/retry.py; raise everything else | Adds resilience to genuine network blips | More moving parts; would also need to update add() for symmetry | Defer to follow-up |
| C. New ollama provider literal | Extend _build_llm_and_embedder with a third branch | Symmetric with openai/gemini | Explicitly out of scope per ticket; duplicate code path (Ollama is OpenAI-SDK with custom base_url) | Rejected |

Design Decisions

Decision: Adopt Option A (drop the placeholder fallback entirely; documentation only for Ollama support)

  • Context: R2 mandates that embedding failures during graph build surface as visible task failures. R1 mandates documentation for an Ollama embedder. The adapter already supports any OpenAI-compatible base URL.
  • Alternatives Considered:
    1. Option B (narrow + retry) — keep a small except clause for transient errors and use the project's retry_with_backoff.
    2. Option C (new provider literal) — add an ollama branch in _build_llm_and_embedder.
  • Selected Approach:
    • In _GraphNamespace.add_batch, remove the catch-and-continue try/except Exception block and its placeholder-UUID fallback. Failures from _run(self._g.add_episode(...)) propagate to the caller.
    • An adapter-level logger.exception(...) before the re-raise is not strictly required, because _build_graph_worker already calls logger.exception(f"task {task_id} failed") per the error-handling steering. To honor R2.4 explicitly, keep a narrow try/except around the call that does only logger.exception(...) followed by raise, so the adapter-level context (group_id, episode index) is captured before the exception bubbles up.
    • Update .env.example, CLAUDE.md, README.md, and docker-compose.yml to document Ollama configuration (R1).
  • Rationale:
    • The ticket explicitly lists transient-retry behavior and per-provider factory as out of scope.
    • Steering's error-handling chapter forbids catch-and-continue in service code.
    • Smaller surface = lower regression risk.
  • Trade-offs:
    • + Visibility: real config errors now surface at the UI.
    • + Code symmetry: add() and add_batch() behave the same on failure.
    • - One-time noise: operators whose graph builds were "succeeding" only because of the silent fallback will now see a failed task. This is the intended correction; mention in PR body.
  • Follow-up:
    • If transient blips become an operational issue, revisit Option B in a separate ticket using retry_with_backoff against _g.add_episode.
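
If that follow-up is ever picked up, the narrow-retry idea might look roughly like the sketch below. It does not use the project's retry_with_backoff, whose signature is not shown in this document; retry_transient is a hypothetical helper, and the built-in TimeoutError stands in for the transient classes named under Option B.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_transient(
    call: Callable[[], T],
    transient: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on the listed transient exceptions; anything else propagates at once."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except transient:
            if attempt == attempts:
                raise  # retries exhausted: surface the failure to the task
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff


calls = {"n": 0}
delays = []


def flaky() -> str:
    # Fails twice with a transient error, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient blip")
    return "episode-id"


# `sleep` is injected so the example records delays instead of waiting.
result = retry_transient(flaky, (TimeoutError,), sleep=delays.append)
```

Because only the listed classes are caught, a configuration error such as a wrong model name would still fail the task on the first attempt.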

Decision: Use logger.exception(...) not logger.error(...)

  • Context: R2.4 requires ERROR-level logging of the underlying exception.
  • Alternatives Considered: logger.error(str(e)) (no traceback), logger.warning(...) (current behavior).
  • Selected Approach: logger.exception("Episode add failed (group_id=%s)", graph_id) then raise.
  • Rationale: logger.exception logs at ERROR with the full traceback, which is what the steering doc prescribes for unrecoverable adapter failures.
  • Trade-offs: A small amount of duplication if _build_graph_worker also logs via logger.exception. Acceptable — the two log lines describe different layers (adapter vs. task) and have different identifying context.
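
A minimal sketch of the log-then-re-raise shape follows. GraphNamespaceSketch is an illustrative stand-in for _GraphNamespace, not the real class: the add_episode callable is injected so the example runs without Graphiti, and the episode_index field in the message is an assumption based on the adapter-level context mentioned earlier.

```python
import logging

logger = logging.getLogger("graphiti_adapter_sketch")


class GraphNamespaceSketch:
    """Illustrative stand-in for _GraphNamespace; not the real class."""

    def __init__(self, add_episode):
        # Injected callable so the sketch runs without Graphiti installed.
        self._add_episode = add_episode

    def add_batch(self, graph_id: str, episodes: list) -> list:
        ids = []
        for index, text in enumerate(episodes):
            try:
                ids.append(self._add_episode(graph_id, text))
            except Exception:
                # Record adapter-level context at ERROR with traceback,
                # then let the worker decide the task's fate.
                logger.exception(
                    "Episode add failed (group_id=%s, episode_index=%d)",
                    graph_id,
                    index,
                )
                raise
        return ids
```

The except clause never returns a value, so there is no path on which a failed episode yields an id.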

Decision: Document Ollama under the existing OpenAI provider, not as a separate provider literal

  • Context: The ticket lists "per-provider embedder factory" as out of scope; Ollama is already reachable via the existing openai branch.
  • Selected Approach: Document Ollama as a configuration choice of the existing openai Graphiti provider (set the three EMBEDDING_* env vars).
  • Rationale: Avoids code duplication and matches the ticket's scope.

Risks & Mitigations

  • Risk: Operators currently relying on the silent fallback see new failed tasks. Mitigation: PR body calls this out explicitly with a "what changed" note pointing at the embedder env vars.
  • Risk: The except is removed but a transient timeout intermittently fails the entire graph build. Mitigation: Documented as a known follow-up (Option B). Acceptable today because the alternative was an empty graph that looked successful.
  • Risk: Documentation drifts between .env.example, CLAUDE.md, README.md, and docker-compose.yml. Mitigation: Keep all four edits in this PR and reference the same env-var triple verbatim.

References

  • Ticket #18 — .ticket/18.md (snapshot in this repo)
  • Steering — .kiro/steering/error-handling.md § Background Task Errors and § Logging
  • Steering — .kiro/steering/tech.md § Key Libraries (graphiti-core adapter rule)
  • Code: backend/app/services/graphiti_adapter.py:92-115, :441-475
  • Code: backend/app/services/graph_builder.py:143-234, :256-310