# Design — graphiti-ollama-reranker
## Overview
**Purpose**: Replace the no-op `_PassthroughReranker` injected into Graphiti with a real Ollama-backed `CrossEncoderClient`, so that hybrid search results consumed by the ReportAgent tools (`SearchResult`, `InsightForge`, `Panorama`, `Interview`) are ordered by model-judged relevance rather than Graphiti's RRF fallback ordering. Configuration is env-driven (`RERANKER_PROVIDER`, `RERANKER_MODEL`, `RERANKER_BASE_URL`, `RERANKER_API_KEY`) with Ollama-aligned defaults; an explicit `RERANKER_PROVIDER=none` preserves the passthrough for CI and slim containers.
**Users**: Backend developers running the local-first stack against Ollama; operators deploying MiroFish behind any OpenAI-compatible reranker endpoint; CI users who explicitly disable reranking.
**Impact**: Adds one new module under `backend/app/services/`, four `Config` attributes, a small selection branch in `_get_graphiti()`, and documentation in `.env.example`, `CLAUDE.md`, `README.md`. No data schema, no API, no UI changes. Behavior under `RERANKER_PROVIDER=none` is identical to today.
### Goals
- Default Ollama-backed reranker producing one `(passage, score)` tuple per input passage, sorted descending by score.
- Env-driven configuration with sensible Ollama defaults inherited from existing `EMBEDDING_*` settings.
- Graceful degradation: Flask boots and graph search keeps working even when the Ollama service or the configured model is unavailable.
- Documentation parity with `EMBEDDING_*` knobs in `.env.example`, `CLAUDE.md`, and `README.md`.
### Non-Goals
- Building a Dashscope/OpenAI/Gemini reranker (out of scope per ticket #39).
- Changing `LLM_MODEL_NAME` or `EMBEDDING_MODEL` defaults.
- Upstream contributions to `graphiti-core`.
- Adding a `sentence-transformers` or other non-`openai` reranker dependency.
## Boundary Commitments
### This Spec Owns
- The Ollama reranker implementation and its prompt/parse logic.
- The `RERANKER_PROVIDER`, `RERANKER_MODEL`, `RERANKER_BASE_URL`, `RERANKER_API_KEY` settings and their defaults.
- The branch in `_get_graphiti()` that selects between the Ollama reranker and the passthrough.
- The startup INFO log line that announces the selected reranker.
- Documentation entries in `.env.example`, `CLAUDE.md` "Required Environment Variables", and `README.md` Ollama prerequisites.
### Out of Boundary
- Graphiti's own search ranking, hybrid retrieval, or embedding pipeline.
- Per-passage retrieval (still owned by `_GraphNamespace.search` and Graphiti).
- The `group_id` scoping rules.
- Any change to the four ReportAgent tools (`SearchResult`, `InsightForge`, `Panorama`, `Interview`) — they receive reranked output transparently.
- Implementation of additional reranker providers; this design covers only `ollama` and `none`.
### Allowed Dependencies
- Upstream library: `graphiti_core.cross_encoder.client.CrossEncoderClient` (P0).
- In-repo: `Config` (`backend/app/config.py`), `get_logger` (`backend/app/utils/logger.py`), `openai.AsyncOpenAI` (already installed).
- Existing factory: `_get_graphiti()` continues to be the singleton chokepoint.
### Revalidation Triggers
- If `graphiti-core` changes the `CrossEncoderClient.rank` signature, this design must be revisited.
- If a future spec adds a third reranker provider, the inline branch should be considered for promotion to a registry (Option C in `research.md`).
- If `Config.GRAPHITI_LLM_PROVIDER` semantics change in a way that re-couples LLM and reranker, this design must be checked.
## Architecture
### Existing Architecture Analysis
- `_get_graphiti()` already injects an explicit `cross_encoder=_PassthroughReranker()` (line 156). The pattern of double-checked-locking singleton with provider switch (`GRAPHITI_LLM_PROVIDER`) is mature and must be preserved.
- The persistent event loop (`_get_loop`, `_run`) is used for Graphiti async calls from the synchronous Flask layer. The reranker itself runs inside Graphiti's own awaited path; the new reranker therefore does **not** need to schedule work onto `_get_loop()`.
- All four ReportAgent tools call `_GraphNamespace.search`, which already swallows reranker exceptions into a logged warning. The new reranker tightens this further by handling its own errors internally so it never raises.
### Architecture Pattern & Boundary Map
```mermaid
graph LR
subgraph Config
EnvVars[RERANKER_*\nenv vars]
ConfigCls[Config attributes]
EnvVars --> ConfigCls
end
subgraph Adapter
Factory[_get_graphiti]
Passthrough[_PassthroughReranker]
OllamaCls[OllamaReranker]
Factory -->|provider=none| Passthrough
Factory -->|provider=ollama| OllamaCls
end
subgraph Graphiti
GraphitiCore[Graphiti instance]
Search[_GraphNamespace.search]
Tools[Report tools\nSearchResult, InsightForge,\nPanorama, Interview]
end
ConfigCls --> Factory
Passthrough -->|injected as cross_encoder| GraphitiCore
OllamaCls -->|injected as cross_encoder| GraphitiCore
GraphitiCore --> Search
Search --> Tools
OllamaCls -->|chat.completions| Ollama[Ollama OpenAI\n-compatible endpoint]
```
**Architecture Integration**:
- **Selected pattern**: Strategy pattern with two implementations selected at factory time. Same shape as the existing `GRAPHITI_LLM_PROVIDER` branch.
- **Domain/feature boundaries**: Reranker construction and prompt/parse live in `ollama_reranker.py`. Wiring lives in `graphiti_adapter.py`. Config lives in `config.py`. No overlap.
- **Existing patterns preserved**: Double-checked-locking singleton; explicit `cross_encoder` injection (Graphiti never falls back to its OpenAI default); persistent event loop unchanged; `Config` reads via `os.environ.get(..., default)`.
- **New components rationale**: `OllamaReranker` is a new boundary because it owns external I/O against a different endpoint (the Ollama chat surface), separate from the existing OpenAI embedder/LLM clients.
- **Steering compliance**: Single OpenAI-SDK convention preserved; per-project `group_id` scoping unaffected; no new dependency.
### Technology Stack
| Layer | Choice / Version | Role in Feature | Notes |
|-------|------------------|-----------------|-------|
| Backend / Services | Python ≥3.11, async via `asyncio` | Hosts the new reranker class. | Inherits project minimum. |
| LLM client | `openai` SDK (already pinned, v2.x) | `AsyncOpenAI` chat completions against Ollama's `/v1`. | No new dependency. |
| Model | Ollama-served chat model, default `qwen2.5:3b` | Produces a numeric relevance score per passage. | Operator may override via `RERANKER_MODEL`. |
| Endpoint | Ollama's OpenAI-compatible `/v1` | Default `http://localhost:11434/v1`. | Reuses `EMBEDDING_BASE_URL` semantics. |
| Graph layer | `graphiti-core ≥ 0.3` | Consumes the new `CrossEncoderClient`. | No upstream change. |
## File Structure Plan
### Directory Structure
```
backend/app/
├── services/
│   ├── graphiti_adapter.py   # MODIFIED: factory branches on RERANKER_PROVIDER
│   └── ollama_reranker.py    # NEW: OllamaReranker(CrossEncoderClient)
├── config.py                 # MODIFIED: adds RERANKER_* attrs
└── utils/
    └── logger.py             # unchanged
repo-root/
├── .env.example              # MODIFIED: adds RERANKER_* block
├── CLAUDE.md                 # MODIFIED: Required Environment Variables
└── README.md                 # MODIFIED: Ollama prerequisites note
```
### Modified Files
- `backend/app/services/graphiti_adapter.py` — Add small branch in `_get_graphiti()` that picks `OllamaReranker()` or `_PassthroughReranker()` based on `Config.RERANKER_PROVIDER`. Log the selection at INFO. `_PassthroughReranker` class is unchanged.
- `backend/app/config.py` — Add four new class attributes with documented defaults. No change to existing `validate()` (reranker has no mandatory key).
- `.env.example` — Add a four-line `RERANKER_*` block with comments mirroring the `EMBEDDING_*` style.
- `CLAUDE.md` — Extend the "Required Environment Variables" code block under "Architecture" with the four new vars.
- `README.md` — Update the Ollama prerequisite section to mention `ollama pull qwen2.5:3b` alongside the existing `ollama pull mxbai-embed-large`.
> `_PassthroughReranker` stays in `graphiti_adapter.py` (unchanged contract); only the wiring around it changes.
## System Flows
```mermaid
sequenceDiagram
participant Search as _GraphNamespace.search
participant Graphiti as graphiti-core
participant Reranker as OllamaReranker.rank
participant Ollama as Ollama /v1/chat/completions
Search->>Graphiti: search(query, group_ids=[gid], num_results=N)
Graphiti->>Graphiti: hybrid retrieval (RRF)
Graphiti->>Reranker: rank(query, [p1..pN])
par per-passage scoring
Reranker->>Ollama: chat.completions(prompt p1, temp=0)
Reranker->>Ollama: chat.completions(prompt p2, temp=0)
Reranker->>Ollama: chat.completions(prompt pN, temp=0)
end
alt call completed (unparseable passages get fallback scores)
Reranker-->>Graphiti: sorted [(p, score), ...]
else whole-call failure
Reranker->>Reranker: log WARNING, return passthrough order
Reranker-->>Graphiti: original order with synthetic scores
end
Graphiti-->>Search: ranked edges/nodes
Search-->>Tools: ranked results
```
**Decision points after diagram**:
- `temperature=0.0` makes the score deterministic per (query, passage, model) tuple.
- Per-passage failures (one bad parse out of N) downrank that passage to `-0.001 * (index + 1)` and continue; the offset keeps even the passage at index 0 strictly below a parsed score of `0.0`. Only whole-call exceptions degrade to passthrough.
- The reranker never raises; this isolates Graphiti from upstream noise even if `_GraphNamespace.search`'s existing exception swallow is removed in a future refactor.
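The per-passage fallback described above can be sketched as a pure function. This is a minimal illustration, not the module's actual code: `assemble_ranking` and the convention of passing `None` for an unparseable score are hypothetical, and the fallback is offset by one so that index 0 stays strictly negative.

```python
from typing import Optional


def assemble_ranking(
    passages: list[str],
    scores: list[Optional[float]],
) -> list[tuple[str, float]]:
    """Merge parsed scores with deterministic fallbacks, then sort descending.

    `scores[i]` is None when passage i could not be scored (a convention
    assumed for this sketch only).
    """
    ranked: list[tuple[str, float]] = []
    for index, (passage, score) in enumerate(zip(passages, scores)):
        if score is None:
            # Assumption: offset by one so index 0 is strictly below 0.0.
            score = -0.001 * (index + 1)
        ranked.append((passage, score))
    # sorted() is stable, so equal scores keep their original relative order.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Because every fallback is negative and every parsed score is clipped to `[0.0, 1.0]`, unscored passages always sort after scored ones while still appearing exactly once.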
## Requirements Traceability
| Requirement | Summary | Components | Interfaces | Flows |
|-------------|---------|------------|------------|-------|
| 1.1 | Default reranker is Ollama-backed | `_get_graphiti()`, `OllamaReranker` | Inline factory branch | Adapter init |
| 1.2 | No dependency on `OpenAIRerankerClient` | `_get_graphiti()` | Explicit `cross_encoder=` injection (unchanged behavior) | — |
| 1.3 | Unset → defaults to `ollama` | `Config.RERANKER_PROVIDER` | `os.environ.get('RERANKER_PROVIDER', 'ollama')` | — |
| 1.4 | No `gpt-4.1-nano` reference | All new files | — | — |
| 2.1 | Subclass `CrossEncoderClient.rank` | `OllamaReranker` | `async rank(query, passages) -> list[tuple[str, float]]` | Per-passage scoring |
| 2.2 | Uses `openai.AsyncOpenAI` | `OllamaReranker.__init__` | `AsyncOpenAI(base_url, api_key)` | — |
| 2.3 | Returns passages sorted descending | `OllamaReranker.rank` | Postcondition: descending by score | — |
| 2.4 | Empty input → empty output, no model call | `OllamaReranker.rank` | Guard at method entry | — |
| 2.5 | Preserves passage strings byte-for-byte | `OllamaReranker.rank` | Strings are echoed, never rewritten | — |
| 2.6 | Unparseable score → deterministic low fallback | `OllamaReranker.rank` | Internal `_parse_score` helper | Failure branch |
| 3.1 | `RERANKER_PROVIDER` env knob | `Config` | Class attr, default `ollama`, validated `{ollama, none}` | Adapter init |
| 3.2 | `RERANKER_MODEL` env knob | `Config` | Class attr, default `qwen2.5:3b` | — |
| 3.3 | `RERANKER_BASE_URL` defaults to `EMBEDDING_BASE_URL` | `Config` | Class attr resolved once at import | — |
| 3.4 | `RERANKER_API_KEY` defaults to `EMBEDDING_API_KEY` | `Config` | Class attr | — |
| 3.5 | Unknown value → `ValueError` | `_get_graphiti()` | `_ALLOWED_RERANKER_PROVIDERS` validation | Adapter init |
| 3.6 | Reads via `os.environ.get` only | `Config` | — | — |
| 4.1 | `none` keeps `_PassthroughReranker` | `_get_graphiti()` | Factory branch | Adapter init |
| 4.2 | Graph search remains functional under `none` | `_PassthroughReranker.rank` (unchanged) | — | — |
| 4.3 | INFO log announces selected provider | `_get_graphiti()` | `logger.info` line | Adapter init |
| 5.1 | WARNING log on rerank failure | `OllamaReranker.rank` | `logger.warning` with model + error class | Failure branch |
| 5.2 | No exception propagation to HTTP callers | `OllamaReranker.rank` (never raises) | — | — |
| 5.3 | Original order on whole-call failure | `OllamaReranker.rank` | Passthrough fallback inside method | Failure branch |
| 5.4 | `__init__` never raises | `OllamaReranker.__init__` | `AsyncOpenAI()` lazy I/O | Adapter init |
| 6.1 | `.env.example` documents the four vars | `.env.example` | — | — |
| 6.2 | `CLAUDE.md` lists the four vars | `CLAUDE.md` | — | — |
| 6.3 | `README.md` mentions `ollama pull <model>` | `README.md` | — | — |
| 6.4 | Old "follow-up" claim updated | `graphiti-neo4j-finalize/research.md` (or design.md) | — | — |
| 7.1 | Reranked order reaches `_GraphNamespace.search` | `OllamaReranker`, `_get_graphiti()` | Through Graphiti's own `search()` | End-to-end |
| 7.2 | No changes to report tools | n/a | n/a | — |
| 7.3 | `group_id` scoping unchanged | `_GraphNamespace.search` (unchanged) | — | — |
## Components and Interfaces
| Component | Domain/Layer | Intent | Req Coverage | Key Dependencies (P0/P1) | Contracts |
|-----------|--------------|--------|--------------|--------------------------|-----------|
| `OllamaReranker` | Backend / Services | Score passages against a query via Ollama chat completions. | 1.1, 1.4, 2.1-2.6, 5.1-5.4, 7.1 | `graphiti_core.cross_encoder.client.CrossEncoderClient` (P0); `openai.AsyncOpenAI` (P0); `Config` (P0); `get_logger` (P1) | Service |
| `Config` (extended) | Backend / Config | Expose four new reranker attrs with documented defaults. | 1.3, 3.1-3.6, 4.1 | `os.environ.get` (P0) | State (configuration) |
| `_get_graphiti()` (extended) | Backend / Adapter | Pick reranker implementation; validate provider; log selection. | 1.1, 1.2, 3.5, 4.1, 4.3 | `Config` (P0); `OllamaReranker` (P0); `_PassthroughReranker` (P0); `Graphiti` (P0) | Service |
| `.env.example`, `CLAUDE.md`, `README.md` | Docs | Communicate new knobs and Ollama prerequisite. | 6.1-6.4 | — | — |
---
### Backend / Services
#### `OllamaReranker`
| Field | Detail |
|-------|--------|
| Intent | Score each passage's relevance to a query via an Ollama-served chat model, returning passages sorted descending by score. |
| Requirements | 1.1, 1.4, 2.1-2.6, 5.1-5.4, 7.1 |
**Responsibilities & Constraints**
- Subclass `graphiti_core.cross_encoder.client.CrossEncoderClient`; implement only `rank`.
- Use `openai.AsyncOpenAI`; no second SDK; no top-level network I/O in `__init__`.
- Preserve passage strings byte-for-byte; never rewrite or truncate.
- Never raise from `rank()`. On a whole-call failure, log once at WARNING and fall back to passthrough order with deterministic synthetic scores; on a per-passage failure, log at DEBUG and assign that passage a deterministic fallback score.
- Deterministic scoring: `temperature=0.0`, no randomness in fallback scores.
- Thread-safety: stateless beyond the immutable `AsyncOpenAI` client and string config; safe under Graphiti's concurrent search.
**Dependencies**
- Inbound: `_get_graphiti()` — instantiates a single instance and passes it as `cross_encoder=` to `Graphiti(...)` (P0).
- Outbound: `Ollama /v1/chat/completions` via `openai.AsyncOpenAI` (P0).
- External: `graphiti_core.cross_encoder.client.CrossEncoderClient` (P0); `openai` SDK (P0).
**Contracts**: Service [x]
##### Service Interface
```python
from graphiti_core.cross_encoder.client import CrossEncoderClient


class OllamaReranker(CrossEncoderClient):
    def __init__(
        self,
        *,
        model: str,
        base_url: str,
        api_key: str,
    ) -> None: ...

    async def rank(
        self,
        query: str,
        passages: list[str],
    ) -> list[tuple[str, float]]:
        """
        Score each passage's relevance to `query` and return
        `(passage, score)` tuples sorted in descending order of score.

        Preconditions:
            - `passages` is a (possibly empty) list of strings.
        Postconditions:
            - len(return) == len(passages).
            - return is sorted by score descending.
            - For all i, return[i][0] is byte-identical to one of the inputs.
            - For any rank() call, this method does not raise.
        Invariants:
            - Successfully-parsed scores fall in [0.0, 1.0].
            - Fallback scores assigned to unparseable passages fall in [-1.0, 0.0)
              and are strictly less than every successfully-parsed score.
        """
```
**Implementation Notes**
- **Integration**: Constructed inside `_get_graphiti()` when `Config.RERANKER_PROVIDER == "ollama"`; injected into `Graphiti(..., cross_encoder=...)`.
- **Validation**:
  - Return `[]` immediately for empty `passages`, without calling the model.
  - Clip parsed `score` to `[0.0, 1.0]`.
  - Treat any uncaught per-passage exception as a parse failure and assign the deterministic fallback `-0.001 * (passage_index + 1)`; the offset keeps index 0 inside `[-1.0, 0.0)`, as the `rank` invariants require.
  - Treat any whole-call exception (e.g. connection refused) as a graceful degrade: return `[(p, 1.0 - 0.01 * i) for i, p in enumerate(passages)]`.
- **Risks**: Default `qwen2.5:3b` must be `ollama pull`-ed by operators; documented in README. If the model is absent, the graceful-degradation path (Requirement 5) kicks in.
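The parse-and-clip step could look roughly like the sketch below. The reply format is not fixed by this section, so the sketch assumes the model answers with text containing a plain decimal number; the name `parse_score` and the regular expression are illustrative stand-ins for the internal `_parse_score` helper.

```python
import re
from typing import Optional

# Assumption: the model reply contains a plain decimal such as "0.73".
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_score(raw: str) -> Optional[float]:
    """Return the first number found in `raw`, clipped to [0.0, 1.0].

    Returns None when no number is present, so the caller can assign
    the deterministic per-passage fallback score instead.
    """
    match = _NUMBER.search(raw)
    if match is None:
        return None
    return min(1.0, max(0.0, float(match.group())))
```

Returning `None` rather than raising keeps the failure local to one passage, which matches the "never raise from `rank()`" constraint.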
---
### Backend / Config
#### `Config` (extended)
| Field | Detail |
|-------|--------|
| Intent | Surface env-driven configuration for the reranker with Ollama-aligned defaults. |
| Requirements | 1.3, 3.1-3.6, 4.1 |
**Responsibilities & Constraints**
- Read from `os.environ.get` only; no new dependency.
- `RERANKER_PROVIDER` default `ollama`; valid values: `ollama`, `none`.
- `RERANKER_MODEL` default `qwen2.5:3b`.
- `RERANKER_BASE_URL` default = `EMBEDDING_BASE_URL` value at module load time.
- `RERANKER_API_KEY` default = `EMBEDDING_API_KEY` value at module load time.
- Validation of `RERANKER_PROVIDER` happens in `_get_graphiti()` (not `Config.validate()`) to keep the validate-at-boot list focused on credential presence.
**Contracts**: State [x]
##### State Management
- **State model**: Read-only class attributes resolved once at import.
- **Persistence & consistency**: None; values come from environment.
- **Concurrency strategy**: Immutable after import; safe.
**Implementation Notes**
- **Integration**: Defaults for `RERANKER_BASE_URL` / `RERANKER_API_KEY` should reference the corresponding `EMBEDDING_*` env vars (not the resolved `Config.EMBEDDING_BASE_URL` constant) so an operator setting only `EMBEDDING_BASE_URL` still gets the reranker pointed at the same Ollama host without needing to set `RERANKER_BASE_URL` explicitly. Implementation reads `os.environ.get('RERANKER_BASE_URL', os.environ.get('EMBEDDING_BASE_URL', 'http://localhost:11434/v1'))`.
- **Validation**: None at config-load time. Provider value is validated by `_get_graphiti()`.
- **Risks**: An operator who overrides `EMBEDDING_BASE_URL` but not `RERANKER_BASE_URL` will silently retarget the reranker too. This is intentional (single-host Ollama deployment) and documented.
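The fallback chain for the four settings can be sketched as follows. The design places these values on `Config` as class attributes; the function form here exists only to make the resolution order easy to exercise. `resolve_reranker_settings` is a hypothetical name, and the final `"ollama"` placeholder for the API key is an assumption, since this section does not state the last-resort default of `EMBEDDING_API_KEY`.

```python
import os
from typing import Mapping

OLLAMA_DEFAULT_URL = "http://localhost:11434/v1"


def resolve_reranker_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Resolve RERANKER_* values, falling back to the EMBEDDING_* env vars."""
    return {
        "provider": env.get("RERANKER_PROVIDER", "ollama"),
        "model": env.get("RERANKER_MODEL", "qwen2.5:3b"),
        # Reads the raw EMBEDDING_BASE_URL env var, not a resolved constant.
        "base_url": env.get(
            "RERANKER_BASE_URL",
            env.get("EMBEDDING_BASE_URL", OLLAMA_DEFAULT_URL),
        ),
        # Assumption: "ollama" is a placeholder key for this sketch only.
        "api_key": env.get(
            "RERANKER_API_KEY",
            env.get("EMBEDDING_API_KEY", "ollama"),
        ),
    }
```

With only `EMBEDDING_BASE_URL` set, the reranker follows the embedder to the same host; setting `RERANKER_BASE_URL` explicitly overrides that.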
---
### Backend / Adapter
#### `_get_graphiti()` (extended)
| Field | Detail |
|-------|--------|
| Intent | Select and inject the appropriate `CrossEncoderClient` based on `Config.RERANKER_PROVIDER`; log the choice. |
| Requirements | 1.1, 1.2, 3.5, 4.1, 4.3 |
**Responsibilities & Constraints**
- Preserve double-checked locking and singleton semantics exactly.
- Read `Config.RERANKER_PROVIDER` once at construction; do not re-read.
- For `ollama`: construct `OllamaReranker(model=..., base_url=..., api_key=...)`.
- For `none`: construct `_PassthroughReranker()` (current behavior preserved).
- For any other value: raise `ValueError("Unknown RERANKER_PROVIDER=%r; allowed: ('ollama', 'none')")` — mirrors the existing `_ALLOWED_GRAPHITI_PROVIDERS` validation pattern.
- Log at INFO once: `f"Initializing Graphiti reranker (provider={provider})..."`.
**Contracts**: Service [x]
##### Service Interface
```python
def _get_graphiti() -> Graphiti:
    """Singleton Graphiti factory; selects reranker via Config.RERANKER_PROVIDER."""
```
**Implementation Notes**
- **Integration**: Replaces the unconditional `cross_encoder=_PassthroughReranker()` at `graphiti_adapter.py:156` with a `cross_encoder=_build_reranker(provider)` call. The factory helper lives next to `_build_llm_and_embedder` in the same file.
- **Validation**: Provider validation raises before constructing the Graphiti instance, so misconfiguration fails fast and obvious.
- **Risks**: A differently-cased value such as `RERANKER_PROVIDER=Ollama` would raise under a case-sensitive comparison; the helper therefore lowercases the value before comparing, matching `_get_graphiti`'s existing `(... or "openai").lower()` pattern.
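A minimal sketch of the selection helper is shown below. The two stub classes stand in for `_PassthroughReranker` and `OllamaReranker` so the example runs without `graphiti-core`; the function name `build_reranker` and its keyword parameters are assumptions for illustration, not the adapter's actual signature.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

_ALLOWED_RERANKER_PROVIDERS = ("ollama", "none")


class PassthroughStub:
    """Stand-in for `_PassthroughReranker` (hypothetical, sketch only)."""


class OllamaStub:
    """Stand-in for `OllamaReranker` (hypothetical, sketch only)."""

    def __init__(self, *, model: str, base_url: str, api_key: str) -> None:
        self.model, self.base_url, self.api_key = model, base_url, api_key


def build_reranker(
    provider: Optional[str], *, model: str, base_url: str, api_key: str
) -> object:
    """Validate the provider, log the choice once, and build the reranker."""
    normalized = (provider or "ollama").lower()
    if normalized not in _ALLOWED_RERANKER_PROVIDERS:
        # Fail fast, before any Graphiti instance is constructed.
        raise ValueError(
            f"Unknown RERANKER_PROVIDER={provider!r}; "
            f"allowed: {_ALLOWED_RERANKER_PROVIDERS}"
        )
    logger.info("Initializing Graphiti reranker (provider=%s)...", normalized)
    if normalized == "none":
        return PassthroughStub()
    return OllamaStub(model=model, base_url=base_url, api_key=api_key)
```

Validating before construction means a misconfigured provider stops startup with a message that names the allowed values.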
---
### Documentation
| File | Change | Requirements |
|------|--------|--------------|
| `.env.example` | Add commented block with the four `RERANKER_*` vars and their defaults. Position adjacent to the existing `EMBEDDING_*` block. | 6.1 |
| `CLAUDE.md` | Extend the "Required Environment Variables" code fence under "Architecture" → "Required Environment Variables" with the four new vars and a one-line note about `RERANKER_PROVIDER=none`. | 6.2 |
| `README.md` | In the "Install Ollama and pull the default embedding model" section, add `ollama pull qwen2.5:3b` step (or reference the model variable). In the `.env` snippet, add the four `RERANKER_*` lines with brief comments. | 6.3 |
| `.kiro/specs/graphiti-neo4j-finalize/research.md` | Update the "A real per-provider reranker is a follow-up" claim to point at this spec. | 6.4 |
> README also has `README-EN.md` and `README-ZH.md` — the canonical user-facing README is `README.md` per the existing structure. Other localized READMEs are out of scope unless a quick parity edit fits without translation work; if a Chinese translation already exists for the embedder section, the Chinese README receives the same one-line addition.
## Data Models
Not applicable. No persistent storage, no schema changes, no API payloads. The only structured value flowing through the system is the `list[tuple[str, float]]` already defined by `CrossEncoderClient.rank`.
## Error Handling
### Error Strategy
- **Construction errors**: None possible (no network in `__init__`; no required keys to validate).
- **Per-passage errors**: Caught inside `OllamaReranker.rank`. Logged at DEBUG once per failed passage (suppress spam). Passage receives a deterministic fallback score that places it after all successfully-scored passages but keeps it in the output exactly once.
- **Whole-call errors** (connection refused, 404 model not found, timeout, OpenAI SDK exception): Caught at the outermost `try/except` in `rank`. Logged at WARNING with model name and error class. Returns `[(p, 1.0 - 0.01 * i) for i, p in enumerate(passages)]` — same shape as `_PassthroughReranker` so consumers cannot tell the difference structurally.
- **Configuration errors**: `_get_graphiti()` raises `ValueError` at startup if `RERANKER_PROVIDER` is unknown. The Flask app fails to boot — preferred over silent misconfiguration.
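The whole-call degrade path can be sketched with an injected scorer so it runs without a live Ollama endpoint. `rank_with_degradation` and the `ScoreFn` type are hypothetical; the sketch assumes the scorer handles its own parse failures, so any exception that escapes it is treated as a whole-call failure.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)

# Hypothetical scorer type: (query, passage) -> score in [0.0, 1.0].
ScoreFn = Callable[[str, str], Awaitable[float]]


async def rank_with_degradation(
    query: str,
    passages: list[str],
    score_one: ScoreFn,
    model: str = "qwen2.5:3b",
) -> list[tuple[str, float]]:
    """Score passages concurrently; fall back to input order on failure."""
    if not passages:
        return []  # empty input: no model call at all
    try:
        # If any request raises, gather propagates the first exception.
        scores = await asyncio.gather(*(score_one(query, p) for p in passages))
    except Exception as exc:
        logger.warning(
            "Rerank failed (model=%s, error=%s); keeping original order",
            model,
            type(exc).__name__,
        )
        # Same shape as the passthrough: original order, descending scores.
        return [(p, 1.0 - 0.01 * i) for i, p in enumerate(passages)]
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
```

In both branches the caller receives one tuple per input passage, so consumers cannot tell structurally whether reranking succeeded.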
### Error Categories and Responses
| Category | Trigger | Response |
|----------|---------|----------|
| System (5xx-equivalent) | Ollama unreachable, timeout | WARNING log; passthrough order; search succeeds. |
| User input (4xx-equivalent) | Unknown `RERANKER_PROVIDER` value | `ValueError` at startup; clear message naming allowed values. |
| Business rule | Model emits unparseable score | DEBUG log; per-passage fallback score; passage retained. |
### Monitoring
- INFO log at startup states the selected provider.
- WARNING log on whole-call failure includes model and error class; aggregation systems can alert on rate.
- No metrics surface yet; can be added if the reranker becomes a hot path.
## Testing Strategy
This project intentionally keeps the test surface minimal (`backend/scripts/test_profile_format.py` is the lone pytest target). Per `steering/tech.md`, do **not** add a heavy test harness.
- **Unit-level verification** (manual, by the implementer, no committed test files unless small and clearly worth keeping):
  1. Constructing `OllamaReranker` with a bad host does not raise; first `rank()` call logs WARNING and returns passthrough output.
  2. `rank(query, [])` returns `[]` and does not call the client.
  3. Successful path returns the correct number of passages, sorted descending, every input echoed byte-for-byte.
  4. Bad JSON output for one passage out of N leaves that passage at the bottom; other passages keep their parsed scores.
- **Integration smoke** (manual): With `qwen2.5:3b` pulled, run a graph build and a report-tool search; confirm the WARNING log is absent and the result order changes vs. `RERANKER_PROVIDER=none`.
- **Boundary verification**: Grep that `gpt-4.1-nano` and `OpenAIRerankerClient` do not appear in any new code path.
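For the manual checks above, a throwaway helper along these lines could confirm the `rank` postconditions on any returned list. It is a sketch for ad-hoc use in a REPL, not a proposed test file, and `rank_violations` is a hypothetical name.

```python
from collections import Counter


def rank_violations(
    passages: list[str],
    ranked: list[tuple[str, float]],
) -> list[str]:
    """Return human-readable postcondition violations (empty list if none)."""
    problems: list[str] = []
    if len(ranked) != len(passages):
        problems.append("length mismatch")
    # Multiset comparison: every input must be echoed exactly once, unchanged.
    if Counter(p for p, _ in ranked) != Counter(passages):
        problems.append("passages were altered, dropped, or duplicated")
    scores = [s for _, s in ranked]
    if any(a < b for a, b in zip(scores, scores[1:])):
        problems.append("scores are not sorted descending")
    return problems
```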
## Supporting References
- `research.md` — Discovery findings, alternative scoring strategies, model-choice rationale, defensive parse pattern.
- `gap-analysis.md` — Requirement-to-asset map.
- `.ticket/39.md` — Source ticket text.