11 KiB
Requirements Document
Project Description (Input)
Replace the no-op _PassthroughReranker in backend/app/services/graphiti_adapter.py with a real reranker that uses an Ollama-available model, so Graphiti search results are properly reranked for the SearchResult / InsightForge / Panorama / Interview report tools. Add RERANKER_PROVIDER / RERANKER_MODEL / RERANKER_BASE_URL env knobs (defaults: ollama / a small Ollama chat model / EMBEDDING_BASE_URL), keep _PassthroughReranker only when RERANKER_PROVIDER=none, and update .env.example, CLAUDE.md, and the README accordingly. Source ticket: #39 (.ticket/39.md).
Introduction
The Graphiti adapter currently injects a _PassthroughReranker into the Graphiti(...) constructor to bypass the upstream default (OpenAIRerankerClient with a hard-coded gpt-4.1-nano and OpenAI-specific logprobs/logit_bias), which would 401 against Qwen/Dashscope keys and is unavailable through Ollama. The passthrough is a no-op: it returns passages in original order with synthetic descending scores, so search results consumed by the ReportAgent tools (SearchResult, InsightForge, Panorama, Interview) are not actually reranked.
This feature replaces the no-op with a real reranker backed by a model available through the local Ollama stack (matching the existing EMBEDDING_MODEL=mxbai-embed-large precedent). A small set of environment variables makes the provider, model, and endpoint overridable. An explicit none provider preserves the passthrough behavior for CI / lightweight setups that cannot pull the reranker model.
Boundary Context
- In scope:
- A new
CrossEncoderClientimplementation inbackend/app/services/that scores passages against a query by calling an Ollama model through its OpenAI-compatible endpoint. - New
RERANKER_PROVIDER,RERANKER_MODEL,RERANKER_BASE_URL, andRERANKER_API_KEYsettings inbackend/app/config.py, with sensible Ollama defaults. - Provider selection inside
_get_graphiti()soollamaselects the new client andnonekeeps_PassthroughReranker. - Documentation updates in
.env.example,CLAUDE.md(Required Environment Variables), and the projectREADME.md(Ollama prerequisites). - Graceful failure when the configured reranker model is not pulled (clear error, no Flask crash; graph search either falls back to original order or surfaces a logged warning consistent with the existing
_GraphNamespace.searchexception path).
- A new
- Out of scope:
- Changing
LLM_MODEL_NAMEorEMBEDDING_MODELdefaults. - Building OpenAI-only or Dashscope-only reranker clients; this spec is specifically the Ollama path (plus the
noneescape hatch). - Upstream changes to
graphiti-core. - Adding any non-Python reranker library (e.g.
sentence-transformers); the new client must reuse the OpenAI SDK already in the dependency set.
- Changing
- Adjacent expectations:
graphiti_adapter._get_graphiti()continues to be the single Graphiti factory; the new reranker must be wired through it, not at call sites.- All Graphiti reads remain scoped by
group_id— the reranker operates on passages already filtered per project; it does not change isolation rules. - The reranker integrates with
_GraphNamespace.search, which is the path used bySearchResult,InsightForge,Panorama, andInterviewtools; behavior changes propagate to those tools automatically and do not need per-tool code changes.
Requirements
Requirement 1: Default reranker is Ollama-backed, not the OpenAI default
Objective: As a backend developer running MiroFish against the default local Ollama stack, I want Graphiti to rerank search results without requiring an OpenAI key, so that report-tool relevance reflects a real model and not an arbitrary insertion order.
Acceptance Criteria
- The Graphiti Adapter shall instantiate Graphiti with a non-passthrough
CrossEncoderClientwheneverRERANKER_PROVIDERresolves toollama(the default). - The Graphiti Adapter shall not depend on
graphiti_core.cross_encoder.openai_reranker_client.OpenAIRerankerClientfor the default code path. - When
RERANKER_PROVIDERis unset, the Graphiti Adapter shall behave as ifRERANKER_PROVIDER=ollama. - The Graphiti Adapter shall not reference the model name
gpt-4.1-nanoin any reranker code path.
Requirement 2: Ollama-backed reranker scores passages via an OpenAI-compatible chat endpoint
Objective: As a backend developer, I want a reranker that talks to a locally hosted model so that the local-first stack stays self-contained and no remote LLM key is required.
Acceptance Criteria
- The Ollama Reranker shall expose a class that subclasses
graphiti_core.cross_encoder.client.CrossEncoderClientand implements the asynchronousrank(query, passages) -> list[tuple[passage, score]]contract. - The Ollama Reranker shall call its configured chat-completions endpoint through the
openaiSDK usingRERANKER_BASE_URLandRERANKER_API_KEY, so no second SDK is introduced. - The Ollama Reranker shall return passages sorted by descending score (highest relevance first) with one score per input passage.
- When
passagesis empty, the Ollama Reranker shall return an empty list without issuing any model call. - The Ollama Reranker shall preserve passage strings byte-for-byte; it shall not rewrite, truncate, or reorder content within an individual passage.
- If the model response cannot be parsed into a numeric score for a passage, the Ollama Reranker shall assign that passage a deterministic fallback score lower than every successfully-parsed score so the passage still appears in the output exactly once.
Requirement 3: Reranker is configurable via environment variables
Objective: As an operator deploying MiroFish, I want to override the reranker provider, model, and endpoint via environment variables so that I can target a different Ollama host, a different model, or disable reranking entirely.
Acceptance Criteria
- The Configuration module shall expose
RERANKER_PROVIDERwith defaultollamaand accept the valuesollamaandnone. - The Configuration module shall expose
RERANKER_MODELwhose default is a small Ollama-available chat model selected during design (e.g.qwen2.5:3borllama3.2:3b). - The Configuration module shall expose
RERANKER_BASE_URLwhose default is the value ofEMBEDDING_BASE_URL(so the same Ollama host is reused by default). - The Configuration module shall expose
RERANKER_API_KEYwhose default is the value ofEMBEDDING_API_KEY(so Ollama's ignored-token defaultollamaworks without explicit configuration). - If
RERANKER_PROVIDERis set to a value other thanollamaornone, the Graphiti Adapter shall raise a clearValueErrorat startup naming the offending value and listing accepted values. - The Configuration module shall read all four reranker variables from the process environment via the same
os.environ.getpattern used by the surrounding settings, with no additional dependencies.
Requirement 4: none provider preserves the passthrough fallback for CI / lightweight setups
Objective: As a developer running tests or a slim container that cannot pull the reranker model, I want to disable reranking explicitly so the Flask app still boots and graph search still works.
Acceptance Criteria
- Where
RERANKER_PROVIDER=none, the Graphiti Adapter shall continue to inject_PassthroughRerankerand shall not attempt any model call at startup. - While
RERANKER_PROVIDER=none, graph search shall return results in the order Graphiti supplies them with the existing synthetic-descending-score behavior. - The Graphiti Adapter shall log at INFO level the selected reranker provider during initialization so operators can confirm whether reranking is active.
Requirement 5: Graceful degradation when the configured Ollama model is unreachable
Objective: As an operator who forgot to run ollama pull <model> (or whose Ollama service is down), I want the Flask backend to keep serving requests with a clear log signal rather than crashing.
Acceptance Criteria
- If the Ollama Reranker fails to score passages for a given query (e.g. connection refused, 404 model not found, timeout, or unparseable response), the Graphiti Adapter shall log a warning that names the failing model and the error class.
- If the Ollama Reranker raises during a
rankcall, the calling_GraphNamespace.searchshall not propagate the exception to HTTP callers; existing search-error handling already swallows reranker errors into a logged warning, and this behavior shall be preserved. - When the Ollama Reranker fails for a query, the rerank-failure path shall return the passages in their original Graphiti order so search remains functional.
- The Ollama Reranker shall not raise during construction (i.e.
_get_graphiti()must succeed even if the Ollama service is unavailable); failures are deferred until the firstrankcall.
Requirement 6: Documentation reflects the new reranker configuration
Objective: As a new contributor reading the docs, I want the reranker env vars, defaults, and prerequisites documented in the same places the other LLM/embedder settings live so configuration is discoverable.
Acceptance Criteria
- The Environment Example file (
.env.example) shall include entries forRERANKER_PROVIDER,RERANKER_MODEL,RERANKER_BASE_URL, andRERANKER_API_KEY, each commented with its default and accepted values. - The CLAUDE.md document shall list the four reranker variables in its "Required Environment Variables" section with the same level of detail used for
EMBEDDING_MODEL. - The README.md document shall mention the
ollama pull <reranker model>prerequisite alongside the existingollama pull mxbai-embed-largenote (or wherever Ollama setup is documented). - Where the
.kiro/specs/graphiti-neo4j-finalizedocuments state that the reranker is a passthrough no-op, those documents shall either be updated to point at this spec or left untouched (decided in design); the constraint is that no documentation shall continue to claim "a real per-provider reranker is a follow-up" once this spec is implemented.
Requirement 7: Report-tool integration verifies reranked output reaches consumers
Objective: As a developer using the ReportAgent tools, I want SearchResult, InsightForge, Panorama, and Interview to receive properly reranked edges/nodes so their report output reflects model-judged relevance, not Graphiti's hybrid-search ordering alone.
Acceptance Criteria
- When
RERANKER_PROVIDER=ollamais active and the configured model is available, the_GraphNamespace.searchshall return passages whose order is determined by the Ollama Reranker, not Graphiti's default RRF ordering. - The ReportAgent tools (
SearchResult,InsightForge,Panorama,Interview) shall require no changes for this feature; the rerank improvement reaches them transparently through_GraphNamespace.search. - While the Ollama Reranker is active, the per-project
group_idscoping of all Graphiti queries shall remain unchanged.