# Implementation Plan

> Foundation tasks introduce the four `RERANKER_*` configuration knobs.
> Core tasks add the new `OllamaReranker` and the factory selection branch.
> Integration tasks wire documentation parity.
> Validation closes the loop with a structural sweep.

## Foundation

- [x] 1. Add reranker configuration surface
- [x] 1.1 Introduce four `RERANKER_*` settings on the `Config` class
  - Add `RERANKER_PROVIDER` with default `ollama`, read via `os.environ.get('RERANKER_PROVIDER', 'ollama')`.
  - Add `RERANKER_MODEL` with default `qwen2.5:3b`, read via `os.environ.get('RERANKER_MODEL', 'qwen2.5:3b')`.
  - Add `RERANKER_BASE_URL` with default that chains to the embedding host: `os.environ.get('RERANKER_BASE_URL', os.environ.get('EMBEDDING_BASE_URL', 'http://localhost:11434/v1'))`. Do not reference `Config.EMBEDDING_BASE_URL` directly; use the env-lookup form so behaviour stays consistent under reload patterns.
  - Add `RERANKER_API_KEY` with default that chains to the embedding key the same way (`os.environ.get('RERANKER_API_KEY', os.environ.get('EMBEDDING_API_KEY', 'ollama'))`).
  - Do not add the reranker to `Config.validate()`; the provider has no mandatory credentials.
  - Observable completion: a Python REPL that imports `Config` shows the four attributes with the documented defaults, and overriding `EMBEDDING_BASE_URL` in the environment is visible on `Config.RERANKER_BASE_URL` too.
  - _Requirements: 1.3, 3.1, 3.2, 3.3, 3.4, 3.6_

## Core

- [x] 2. Implement the Ollama-backed reranker
- [x] 2.1 Create the new reranker module with the `CrossEncoderClient` subclass
  - Define a new module under `backend/app/services/` that hosts the reranker class. The class subclasses `graphiti_core.cross_encoder.client.CrossEncoderClient` and implements only the async `rank` method.
  - Constructor accepts `model`, `base_url`, `api_key` as keyword arguments; it instantiates `openai.AsyncOpenAI(base_url=..., api_key=...)` but performs no network I/O so the Flask app can boot when Ollama is unreachable.
  - `rank(query, passages)` short-circuits on empty `passages` and returns `[]` without any model call.
  - For each passage, send a single chat-completion request with `temperature=0.0` and a deterministic system prompt asking for a JSON object `{"score": <0.0..1.0>}` describing the passage's relevance to the query. Use `asyncio.gather` to run all per-passage requests concurrently.
  - Parse each model response defensively: strip any `...` block, strip markdown code fences, attempt `json.loads`, fall back to regex-extract the first floating-point number, clip the value to `[0.0, 1.0]`. On any per-passage failure, assign a deterministic fallback score of `-0.001 * passage_index` and log at DEBUG once per failure naming the model and error class. The passage string is echoed byte-for-byte regardless of parse outcome.
  - Wrap the whole call in a `try/except`. On a whole-call failure (connection refused, 404, timeout, etc.), log a single WARNING naming the model and error class, then return `[(p, 1.0 - 0.01 * i) for i, p in enumerate(passages)]` so search remains functional. The method must not raise.
  - Sort the returned list by score descending before returning.
  - Observable completion: instantiating the new class with a deliberately bad `base_url` does not raise; an async call to `rank("q", [])` returns `[]`; an async call with two non-empty passages against a reachable Ollama returns two `(passage, float)` tuples in descending-score order, with every input passage byte-identical in the output.
  - _Requirements: 1.4, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 5.1, 5.2, 5.3, 5.4, 7.1_
  - _Boundary: OllamaReranker module_

## Integration

- [x] 3. Wire the new reranker into the Graphiti factory
- [x] 3.1 Select the reranker inside `_get_graphiti()` based on `Config.RERANKER_PROVIDER`
  - Introduce a small allow-list constant alongside `_ALLOWED_GRAPHITI_PROVIDERS` enumerating `("ollama", "none")`.
  - Read `Config.RERANKER_PROVIDER`, lowercase it, and validate against the allow-list. If the value is not in the allow-list, raise `ValueError` with a message that names the offending value and lists the accepted values, the same shape as the existing `GRAPHITI_LLM_PROVIDER` validation.
  - For `ollama`, construct the new `OllamaReranker(model=Config.RERANKER_MODEL, base_url=Config.RERANKER_BASE_URL, api_key=Config.RERANKER_API_KEY)` and pass it as the `cross_encoder=` argument to `Graphiti(...)`.
  - For `none`, continue to pass `_PassthroughReranker()` as today; do not change the passthrough class.
  - Add one INFO log line at construction time that announces the selected reranker provider (sibling of the existing "Initializing Graphiti client (provider=...)" log).
  - Preserve the double-checked locking and singleton pattern exactly. The provider is read once at first construction; do not re-read at runtime.
  - Observable completion: with `RERANKER_PROVIDER` unset, app startup logs `Initializing Graphiti reranker (provider=ollama)...` and Graphiti is constructed with the `OllamaReranker`. With `RERANKER_PROVIDER=none`, the log reports `none` and Graphiti uses `_PassthroughReranker`. With `RERANKER_PROVIDER=banana`, `_get_graphiti()` raises `ValueError` listing `('ollama', 'none')`.
  - _Requirements: 1.1, 1.2, 3.5, 4.1, 4.2, 4.3_
  - _Depends: 1.1, 2.1_

- [ ] 4. Update operator-facing documentation
- [ ] 4.1 (P) Add the new env knobs to `.env.example` *(deferred: sandbox hook blocks all `.env*` access; see HANDOFF.md)*
  - Insert a four-line `RERANKER_*` block adjacent to the existing `EMBEDDING_*` block, mirroring the comment style (default, accepted values, and a one-line note that `RERANKER_PROVIDER=none` disables reranking).
  - Observable completion: opening `.env.example` shows the four new variables with documented defaults, positioned next to the embedding block.
  - _Requirements: 6.1_
  - _Boundary: .env.example_
  - _Depends: 1.1_
- [x] 4.2 (P) Extend the `Required Environment Variables` snippet in `CLAUDE.md`
  - Add the four `RERANKER_*` variables to the existing fenced code block under "Required Environment Variables" in `CLAUDE.md`, keeping the same comment style used for the `EMBEDDING_*` block.
  - Observable completion: `CLAUDE.md` documents the four reranker variables next to the embedding block and includes a note that `RERANKER_PROVIDER=none` keeps the previous passthrough behaviour.
  - _Requirements: 6.2_
  - _Boundary: CLAUDE.md_
  - _Depends: 1.1_
- [x] 4.3 (P) Document the Ollama pull prerequisite and env block in `README.md`
  - In the existing "Install Ollama and pull the default embedding model" section, add a parallel `ollama pull qwen2.5:3b` step (or note that the model used for reranking must be pulled, using the documented default).
  - In the `.env` snippet under "Configure Environment Variables", add the four `RERANKER_*` lines with brief comments mirroring the embedding-block style.
  - Treat `README-EN.md` and `README-ZH.md` translations as out of scope for this ticket; translation belongs to the active i18n workstream and would otherwise drift.
  - Observable completion: `README.md` shows the `ollama pull qwen2.5:3b` step and the four reranker env lines in the `.env` snippet.
  - _Requirements: 6.3_
  - _Boundary: README.md_
  - _Depends: 1.1_
- [x] 4.4 (P) Update the stale follow-up claim in the prior spec
  - In `.kiro/specs/graphiti-neo4j-finalize/research.md`, find the "A real per-provider reranker is a follow-up" text and either replace it with a pointer to this spec or note that the follow-up has shipped under `graphiti-ollama-reranker`. The constraint is that no remaining documentation continues to claim the reranker is still a deferred passthrough.
  - Observable completion: a grep for "real per-provider reranker is a follow-up" across `.kiro/specs/` returns either zero hits or a pointer note to `graphiti-ollama-reranker`.
  - _Requirements: 6.4_
  - _Boundary: .kiro/specs/graphiti-neo4j-finalize/research.md_

## Validation

- [x] 5. Structural verification sweep
- [x] 5.1 Grep for legacy reranker references and verify the new wiring is reachable
  - Grep `backend/app/services/` for `gpt-4.1-nano` and `OpenAIRerankerClient`; both must return zero hits in code paths owned by this spec.
  - Grep `backend/app/services/graphiti_adapter.py` for the symbol of the new reranker class; confirm there is exactly one import site and one use site (the `_get_graphiti()` branch).
  - Confirm the four ReportAgent tools (`SearchResult`, `InsightForge`, `Panorama`, `Interview`) require no source changes by grepping for `client.graph.search(` call sites and verifying the kwarg shape is unchanged.
  - Confirm `_GraphNamespace.search` still filters by `group_id` (no regression to project isolation).
  - Observable completion: a short verification summary captured during implementation lists each grep outcome with the expected zero / single hit, and the report-tool call sites are unchanged.
  - _Requirements: 1.4, 7.1, 7.2, 7.3_
  - _Depends: 3.1_
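
### Illustrative sketch: scoring and fallback rules from task 2.1

The parsing, fallback, and ordering rules in task 2.1 are easier to check against a small example. The following is a minimal sketch under stated assumptions, not the shipped `OllamaReranker`: `parse_score`, `rank_safely`, and the injected `complete` coroutine are hypothetical names, and `complete` stands in for the per-passage chat-completion call so the sketch runs without Ollama or the `openai` package. The "strip any `...` block" step is omitted because the plan does not spell out its delimiters, and the log messages omit the model name because the sketch has none.

```python
import asyncio
import json
import logging
import re
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)
_FLOAT_RE = re.compile(r"-?\d+(?:\.\d+)?")


def parse_score(raw: str, passage_index: int) -> float:
    """Turn one model reply into a score in [0.0, 1.0], or the indexed fallback."""
    # Strip a leading ```lang fence and a trailing ``` fence, if present.
    text = re.sub(r"^```[A-Za-z0-9]*\s*", "", raw.strip())
    text = re.sub(r"\s*```$", "", text).strip()
    try:
        value = float(json.loads(text)["score"])
    except (ValueError, KeyError, TypeError, OverflowError) as exc:
        # Not the expected JSON object: fall back to the first number in the text.
        match = _FLOAT_RE.search(text)
        if match is None:
            logger.debug("unparseable reply for passage %d (%s)",
                         passage_index, type(exc).__name__)
            return -0.001 * passage_index  # deterministic per-passage fallback
        value = float(match.group())
    return min(1.0, max(0.0, value))  # clip to [0.0, 1.0]


async def rank_safely(
    complete: Callable[[str, str], Awaitable[str]],  # hypothetical model call
    query: str,
    passages: list[str],
) -> list[tuple[str, float]]:
    """Score passages concurrently; never raise, always echo passages unchanged."""
    if not passages:
        return []  # short-circuit: no model call at all
    try:
        replies = await asyncio.gather(*(complete(query, p) for p in passages))
    except Exception as exc:  # whole-call failure: refused, 404, timeout, ...
        logger.warning("reranker call failed (%s); using positional scores",
                       type(exc).__name__)
        return [(p, 1.0 - 0.01 * i) for i, p in enumerate(passages)]
    scored = [(p, parse_score(r, i))
              for i, (p, r) in enumerate(zip(passages, replies))]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In this sketch a transport error raised by `complete` propagates through `asyncio.gather` into the outer `except`, which is what separates the whole-call fallback from the per-passage parse fallback. Whether the real class draws the line in the same place is an assumption.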
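
### Illustrative sketch: env chaining (task 1.1) and provider validation (task 3.1)

The default chaining in task 1.1 and the allow-list check in task 3.1 can be sketched together. This is a hedged illustration rather than the project's code: `reranker_settings`, `resolve_reranker_provider`, `_ALLOWED_RERANKER_PROVIDERS`, and the error wording are hypothetical. The sketch also reads from an injectable mapping so it can be exercised without touching the process environment, whereas the plan puts the lookups on the `Config` class.

```python
import os
from typing import Mapping, Optional

# Hypothetical name for the allow-list described in task 3.1.
_ALLOWED_RERANKER_PROVIDERS = ("ollama", "none")


def reranker_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the four RERANKER_* values with the defaults from task 1.1."""
    env = os.environ if env is None else env
    return {
        "RERANKER_PROVIDER": env.get("RERANKER_PROVIDER", "ollama"),
        "RERANKER_MODEL": env.get("RERANKER_MODEL", "qwen2.5:3b"),
        # Nested lookup: an unset RERANKER_BASE_URL follows EMBEDDING_BASE_URL.
        "RERANKER_BASE_URL": env.get(
            "RERANKER_BASE_URL",
            env.get("EMBEDDING_BASE_URL", "http://localhost:11434/v1"),
        ),
        # Same chaining for the key.
        "RERANKER_API_KEY": env.get(
            "RERANKER_API_KEY", env.get("EMBEDDING_API_KEY", "ollama")
        ),
    }


def resolve_reranker_provider(raw: str) -> str:
    """Lowercase the configured provider and reject values outside the allow-list."""
    provider = raw.lower()
    if provider not in _ALLOWED_RERANKER_PROVIDERS:
        raise ValueError(
            f"Unsupported RERANKER_PROVIDER {raw!r}; "
            f"accepted values: {_ALLOWED_RERANKER_PROVIDERS}"
        )
    return provider


if __name__ == "__main__":
    settings = reranker_settings({"EMBEDDING_BASE_URL": "http://gpu-box:11434/v1"})
    print(settings["RERANKER_BASE_URL"])  # follows the embedding host
    print(resolve_reranker_provider(settings["RERANKER_PROVIDER"]))
```

Because the inner `get` is evaluated on every call, an override of `EMBEDDING_BASE_URL` shows up in the reranker URL without any reference to another `Config` attribute, which is the behaviour the observable-completion check in task 1.1 describes.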