23 KiB
Design Document — i18n-report-agent-prompts
Overview
Purpose: Translate every Chinese string-literal that flows into the LLM message stream of backend/app/services/report_agent.py into English so that, under Accept-Language: en, the Report-Agent produces English-flavoured analytical reports and chat replies — and not the Chinese-biased output that today's Chinese-base prompts produce despite the get_language_instruction() English postfix.
Users: MiroFish operators running the 5-step pipeline under English locale; reviewers tracking the i18n epic (#11); developers maintaining sibling i18n issues (#6, #7, #8, #10) downstream of this change.
Impact: Behavioural — under Accept-Language: en, the report's section titles, section bodies, embedded quotations, and chat replies become English-flavoured. No public-API change. No Report.to_dict() shape change. No new dependencies.
Goals
- Replace every Chinese string-literal in
report_agent.pythat is sent to the LLM (system prompt, user prompt, ReACT loop messages, tool descriptions,_define_toolsparameter hints,_execute_toolerror returns,plan_outlinedefaults) with English equivalents. - Preserve every variable interpolation, every JSON schema key, every literal trigger string (
Final Answer:,<tool_call>, tool-name strings), everyget_language_instruction()call site. - Keep the public surface of
ReportAgent,ReportManager,Report,ReportOutline,ReportSection,ReportStatusbyte-for-byte equivalent in shape.
Non-Goals
- Logger calls (
logger.info,logger.warning,logger.error,logger.debug) inside the same file — owned by issue #6. Notably, the single raw-Chineselogger.debug(f"LLM响应: ...")at line 1322 is left untouched. - Module docstring (lines 1–11), class docstrings, dataclass docstrings, method docstrings, inline
#comments — owned by issue #7. - Refactoring prompt structure, the JSON output schema of
PLAN_SYSTEM_PROMPT, the ReACT loop control flow, conflict-resolution branches, or the chat tool-budget caps. - Externalizing prompts into
/locales/*.json. - Live end-to-end report generation under both
enandzh(deferred to fixture-based static checks; reviewer trust on quality parity, matching the precedent of issues #2/#3/#4).
Boundary Commitments
This Spec Owns
- The string-literal content of all LLM-facing regions in
backend/app/services/report_agent.py:- Tool description constants
TOOL_DESC_INSIGHT_FORGE(476–492),TOOL_DESC_PANORAMA_SEARCH(494–509),TOOL_DESC_QUICK_SEARCH(511–521),TOOL_DESC_INTERVIEW_AGENTS(523–548). - PLAN-phase prompts
PLAN_SYSTEM_PROMPT(552–589),PLAN_USER_PROMPT_TEMPLATE(591–611). - EXEC-phase prompts
SECTION_SYSTEM_PROMPT_TEMPLATE(615–767),SECTION_USER_PROMPT_TEMPLATE(769–792), including the embedded "Correct Example" / "Wrong Example" code blocks. - ReACT loop conversation templates
REACT_OBSERVATION_TEMPLATE(796–806),REACT_INSUFFICIENT_TOOLS_MSG(808–811),REACT_INSUFFICIENT_TOOLS_MSG_ALT(813–816),REACT_TOOL_LIMIT_MSG(818–821),REACT_UNUSED_TOOLS_HINT(823),REACT_FORCE_FINAL_MSG(825). - CHAT-phase prompts
CHAT_SYSTEM_PROMPT_TEMPLATE(829–855),CHAT_OBSERVATION_SUFFIX(857). - The
_define_toolsparameter-description dict values (925–952) and the_get_tools_descriptionleader"可用工具:"(1129). - The
_execute_toolerror returns at lines 1058 and 1062. - The inline LLM-visible strings inside
_generate_section_react:report_contextf-string (1294), empty-response retry (1316–1317), conflict-handling block (1342–1346), inlineunused_hintliterals (1380, 1476). - The inline LLM-visible strings inside
chat: report-truncated marker (1799), no-report fallback (1805), observation joiner (1861). - The default / fallback outline content in
plan_outline: success-path default title (1197), exception-path fallbackReportOutline(1212–1218).
- Tool description constants
- The
unused_tools_strjoin separator at line 1454 — switch from"、"to", "for natural English rendering inside the now-English ReACT templates.
Out of Boundary
- All
logger.*calls in this file (issue #6), including the one raw-Chineselogger.debugat line 1322. - All
"""..."""docstrings and#comments in this file (issue #7). backend/app/utils/locale.py,/locales/*.json,/locales/languages.json.backend/app/services/zep_tools.py,zep_entity_reader.py,zep_graph_memory_updater.py.backend/app/api/report.py,backend/app/api/simulation.py,backend/app/api/graph.py.backend/app/services/simulation_runner.py,simulation_ipc.py, OASIS subprocess source.backend/app/config.pyconstants.backend/pyproject.toml,backend/uv.lock.- All other files in the repository.
Allowed Dependencies
- Read access to
get_language_instruction()frombackend/app/utils/locale.py— three call sites preserved verbatim (lines 1166, 1262, 1808). - Read access to
t(...)frombackend/app/utils/locale.py— call sites preserved verbatim. - No new external dependencies.
Revalidation Triggers
- A change to the
Report.to_dict()payload shape would force the report API blueprint and the frontend report panel to re-validate. This spec does not change the shape. - A change to the
PLAN_SYSTEM_PROMPTJSON output schema (title,summary,sections[].title,sections[].description) would forceplan_outline()'s response parser to re-validate. This spec preserves the schema verbatim. - A change to the
Final Answer:literal trigger or the<tool_call>...</tool_call>XML tag would force_generate_section_react's parser branches to re-validate. This spec preserves both byte-for-byte. - A change to the four primary tool names (
insight_forge,panorama_search,quick_search,interview_agents) or the legacy aliases (search_graph,get_graph_statistics,get_entity_summary,get_simulation_context,get_entities_by_type) would force_execute_tooland_is_valid_tool_callto re-validate. This spec does not rename tools.
Architecture
Existing Architecture Analysis
ReportAgent is a single Python class in backend/app/services/report_agent.py. The three LLM invocation paths (PLAN, SECTION, CHAT) follow a uniform pattern:
system_prompt = <chinese system prompt template>
system_prompt = f"{system_prompt}\n\n{get_language_instruction()}"
user_prompt = <chinese user prompt template with {interpolations}>
response = self.llm.chat(messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt}
])
_generate_section_react extends this with a multi-turn ReACT loop where the user-role messages re-injected after each tool call (REACT_OBSERVATION_TEMPLATE, etc.) are also Chinese today. There is no abstraction layer between prompt construction and LLM invocation — the prompt text and the call site are colocated. This matches sister modules (simulation_config_generator.py, oasis_profile_generator.py, ontology_generator.py).
Architecture Pattern & Boundary Map
Selected pattern: In-place string-literal translation. No new components, no new modules, no new abstractions.
flowchart TB
subgraph Caller["Caller — api/report.py"]
api["POST /api/report/generate<br/>POST /api/report/chat"]
end
subgraph ReportAgentMod["report_agent.py — IN SCOPE"]
plan["plan_outline<br/>**translate PLAN_*, defaults**"]
sec["_generate_section_react<br/>**translate SECTION_*, REACT_*, inline strings**"]
chat["chat<br/>**translate CHAT_*, inline strings**"]
tools["_define_tools / _get_tools_description<br/>**translate TOOL_DESC_*, params, leader**"]
exec["_execute_tool<br/>**translate error returns**"]
parse["_parse_tool_calls<br/>UNCHANGED (matches literals)"]
manager["ReportManager<br/>UNCHANGED (persistence)"]
end
subgraph Locale["utils/locale.py — UNCHANGED"]
gli[get_language_instruction]
tr[t]
end
subgraph ZepTools["services/zep_tools.py — UNCHANGED"]
zt[ZepTools dispatch]
end
api --> plan
api --> sec
api --> chat
plan --> gli
sec --> gli
chat --> gli
sec --> tools
chat --> tools
sec --> parse
sec --> exec
chat --> parse
chat --> exec
exec --> zt
plan --> manager
sec --> manager
Architecture Integration:
- Selected pattern: in-place string-literal translation; matches the precedent of issues #2/#3/#4.
- Domain/feature boundaries: prompt-content is the only boundary that moves. Logger / docstring / comment boundaries (issues #6, #7) and persistence-layer boundary (
ReportManager) are explicitly preserved. - Existing patterns preserved:
get_language_instruction()postfix injection at three call sites;<tool_call>XML protocol;Final Answer:literal trigger; tool-name registry; JSON output schema for outline planning. - New components rationale: none — no new components.
- Steering compliance: respects
tech.md"preserve both styles working" for comments/docstrings (those are out of scope); respectsstructure.mdper-project file isolation; respectscommits.mdConventional Commits format for the eventual commit message.
Technology Stack
| Layer | Choice / Version | Role in Feature | Notes |
|---|---|---|---|
| Frontend / CLI | n/a | Frontend renders the translated Report payload as plain text/Markdown |
No frontend change required |
| Backend / Services | Python 3.11, Flask 3.0 | Hosts ReportAgent and the report API |
Single-file edit |
| Data / Storage | Neo4j + Graphiti | Source of retrieval results consumed by zep_tools |
Unchanged |
| Messaging / Events | n/a | Report generation runs as a background Task |
Unchanged |
| Infrastructure / Runtime | uv-managed venv | Backend dependency manager | No new dependencies |
No new external dependencies, libraries, or infrastructure components are introduced. Detailed locale-resolution mechanics are documented in
research.md.
File Structure Plan
Modified Files
backend/app/services/report_agent.py— translate every Chinese string-literal that is sent to the LLM, plus the one separator literal at line 1454. No structural code changes; no new methods; no new constants. Line counts will shift due to the typically larger English character count, but the file's overall organization is unchanged.
Unmodified Files (explicitly verified)
backend/app/utils/locale.pybackend/app/services/zep_tools.py,zep_entity_reader.py,zep_graph_memory_updater.pybackend/app/api/report.py,simulation.py,graph.pybackend/app/services/simulation_runner.py,simulation_ipc.pybackend/app/config.pybackend/pyproject.toml,backend/uv.lock/locales/en.json,/locales/zh.json,/locales/languages.json- All frontend files
System Flows
The PLAN / SECTION / CHAT flows are unchanged at the control-flow level — only the string content of system / user / observation messages is translated. No new diagram is required; research.md records the relevant parser-trigger details.
Requirements Traceability
| Requirement | Summary | Components | Interfaces | Flows |
|---|---|---|---|---|
| 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7 | Translate PLAN_SYSTEM_PROMPT and PLAN_USER_PROMPT_TEMPLATE; preserve schema, count limits, interpolations, postfix call site |
PLAN_SYSTEM_PROMPT (552), PLAN_USER_PROMPT_TEMPLATE (591), plan_outline (1137) |
plan_outline() LLM chat_json invocation at line 1177 |
PLAN flow |
| 2.1–2.9 | Translate SECTION_SYSTEM_PROMPT_TEMPLATE (incl. examples) and SECTION_USER_PROMPT_TEMPLATE; preserve Final Answer: / <tool_call> literals; preserve no-headings instruction |
SECTION_SYSTEM_PROMPT_TEMPLATE (615), SECTION_USER_PROMPT_TEMPLATE (769), _generate_section_react (1221) |
_generate_section_react() LLM chat invocation at line 1305 |
SECTION ReACT flow |
| 3.1–3.7 | Translate CHAT_SYSTEM_PROMPT_TEMPLATE and CHAT_OBSERVATION_SUFFIX; preserve <tool_call> literal and prefix-injection contract |
CHAT_SYSTEM_PROMPT_TEMPLATE (829), CHAT_OBSERVATION_SUFFIX (857), chat (1766) |
chat() LLM chat invocations at lines 1828, 1868 |
CHAT flow |
| 4.1–4.6 | Translate ReACT loop conversation templates; preserve Final Answer: literal; switch separator to ", " |
REACT_* constants (796–825) |
_generate_section_react() ReACT loop branches |
SECTION ReACT flow |
| 5.1–5.7 | Translate four TOOL_DESC_* blocks, _define_tools parameter dict values, _get_tools_description leader; preserve tool names |
TOOL_DESC_* (476–548), _define_tools (919), _get_tools_description (1127) |
_define_tools() and _get_tools_description() return values |
SECTION + CHAT flows |
| 6.1–6.7 | Translate inline LLM-visible strings in _generate_section_react and chat |
Inline strings at 1294, 1316–1317, 1342–1346, 1380, 1476, 1799, 1805, 1861 | Direct messages.append(...) calls |
SECTION + CHAT flows |
| 7.1–7.3 | Translate _execute_tool error returns |
f-strings at 1058, 1062 | _execute_tool() return value |
SECTION + CHAT flows (error path) |
| 8.1–8.4 | Translate plan_outline defaults; preserve ReportOutline shape |
plan_outline defaults at 1197, 1212–1218 |
plan_outline() return value |
PLAN flow (default + fallback paths) |
| 9.1–9.5 | Locale switching continues to work | get_language_instruction() call sites at 1166, 1262, 1808 |
unchanged | All flows |
| 10.1–10.5 | Public API stable | ReportAgent, ReportManager, Report, ReportOutline, ReportSection, ReportStatus |
unchanged | All flows |
| 11.1–11.5 | End-to-end Step 4 / Step 5 parity | Verification only | unchanged | All flows |
| 12.1–12.6 | Out-of-scope guardrail | None edited | unchanged | n/a |
Components and Interfaces
| Component | Domain/Layer | Intent | Req Coverage | Key Dependencies (P0/P1) | Contracts |
|---|---|---|---|---|---|
| Tool description constants | Module-scope constants in report_agent.py |
LLM-facing tool catalog injected into SECTION + CHAT system prompts via _get_tools_description |
5.1, 5.2, 5.7 | _define_tools (P0), _get_tools_description (P0) |
State (string literals only) |
PLAN_* prompts |
Module-scope constants | Outline planning system + user prompts | 1.1, 1.2, 1.5, 1.6 | get_language_instruction (P0), plan_outline (P0) |
State |
SECTION_* prompts |
Module-scope constants | Section ReACT system + user prompts | 2.1, 2.2, 2.3, 2.4, 2.6, 2.7 | get_language_instruction (P0), _generate_section_react (P0), _get_tools_description (P1) |
State |
REACT_* templates |
Module-scope constants | ReACT loop user-role messages re-injected after tool calls | 4.1, 4.2, 4.3, 4.4, 4.5 | _generate_section_react (P0) |
State |
CHAT_* prompts |
Module-scope constants | Chat system prompt + observation suffix | 3.1, 3.2, 3.3, 3.4, 3.5, 3.6 | get_language_instruction (P0), chat (P0), _get_tools_description (P1) |
State |
_define_tools parameter dict |
ReportAgent instance method |
Catalog of tools + parameter hints, exposed to LLM via _get_tools_description |
5.3, 5.4, 5.6 | _get_tools_description (P0) |
Service |
_get_tools_description |
ReportAgent instance method |
Renders _define_tools output as a single string for SECTION + CHAT prompts |
5.5 | _define_tools (P0) |
Service |
_execute_tool error returns |
ReportAgent instance method |
Returns observation strings to the LLM for unknown-tool / execution-error paths | 7.1, 7.2, 7.3 | _execute_tool (P0) |
Service |
_generate_section_react inline strings |
ReportAgent instance method body |
LLM-visible strings appended to messages during ReACT loop |
6.1, 6.2, 6.3, 6.4 | _generate_section_react (P0) |
Service |
chat inline strings |
ReportAgent instance method body |
LLM-visible strings appended to messages during chat loop |
6.5, 6.6 | chat (P0) |
Service |
plan_outline defaults |
ReportAgent instance method body |
Default / fallback ReportOutline content emitted on success-without-title or exception path |
8.1, 8.2, 8.3, 8.4 | plan_outline (P0) |
State |
All components are existing module-scope constants or method-internal expressions. None require a full detail block — the responsibility boundary is "translate the string content; preserve the structural shape". The summary table above plus the requirement-level acceptance criteria in
requirements.mdform a complete contract.
Implementation Notes (cross-cutting)
- Translation glossary (consistent across all components — see
research.mdDecision: Standard English phrasing): 上帝视角 → "god's-eye view"; 未来预演 → "forecast simulation" / "simulated future"; 模拟需求 → "simulation requirement"; 模拟世界 → "simulated world"; 章节 → "section"; 大纲 → "outline"; 引用 → "quote"/"quotation"; 正确示例 → "Correct Example"; 错误示例 → "Wrong Example"; 注意 → "Note"; 重要 → "IMPORTANT"; 工具 → "tool"; 检索 → "retrieval". - Literal preservation:
Final Answer:,<tool_call>,</tool_call>, all tool names (insight_forge,panorama_search,quick_search,interview_agents, plus legacy aliases), all{interpolation}tokens, all JSON schema keys, all emoji / box-drawing characters (💡,═). - Locale-agnostic strings:
_execute_toolerror returns andplan_outlinedefault / fallback outline content are returned regardless of locale (noget_language_instruction()injection at those sites). They become locale-agnostic English under this PR. - Separator change:
unused_tools_str = "、".join(unused_tools)at line 1454 →", ".join(unused_tools). This is the only non-string-literal code change.
Data Models
No data-model changes. Report, ReportOutline, ReportSection, ReportStatus, Task, and the report API JSON contract are all preserved verbatim. Report.to_dict() and ReportOutline.to_dict() shapes are unchanged. The persistence schema under reports/<id>/ (meta.json, outline.json, progress.json, section_NN.md, full_report.md, agent_log.jsonl, console_log.txt) is unchanged.
Error Handling
Error Strategy
No new error types or recovery strategies. The translated _execute_tool error returns and plan_outline exception-path fallback continue to behave identically — the only change is the string content.
Error Categories and Responses
- Unknown-tool error:
_execute_toolreturns a translated English string"Unknown tool: {tool_name}. Please use one of: insight_forge, panorama_search, quick_search". The string is fed back to the LLM as the next user-role observation. - Tool-execution exception:
_execute_toolreturns a translated English string"Tool execution failed: {str(e)}". Same flow. plan_outlineLLM exception: returns the translated English fallbackReportOutline(3 sections). Downstream report assembly proceeds normally.- Empty-response retry / conflict-handling / insufficient-tools: translated English messages re-injected into the LLM message stream (R6, R4 acceptance criteria). Loop control flow unchanged.
Testing Strategy
Default sections (adapted to translation work)
- Static lint:
python -m py_compile backend/app/services/report_agent.py— must pass. - Zero-Chinese assertion (in-scope regions): a verification harness (a small ad-hoc script under
scripts/if needed, deleted before PR) importsreport_agentand runsre.findall(r'[一-鿿]', literal)over each in-scope constant, expecting an empty list. The single permitted Chinese remnant is thelogger.debugf-string at line 1322 (not in scope). - Interpolation-shape parity: invoke
PLAN_USER_PROMPT_TEMPLATE.format(simulation_requirement="x", total_nodes=0, total_edges=0, entity_types=[], total_entities=0, related_facts_json="[]"),SECTION_SYSTEM_PROMPT_TEMPLATE.format(report_title="x", report_summary="y", simulation_requirement="z", section_title="t", tools_description="d"),SECTION_USER_PROMPT_TEMPLATE.format(previous_content="x", section_title="t"),CHAT_SYSTEM_PROMPT_TEMPLATE.format(simulation_requirement="x", report_content="r", tools_description="d"),REACT_OBSERVATION_TEMPLATE.format(tool_name="x", result="y", tool_calls_count=1, max_tool_calls=5, used_tools_str="a, b", unused_hint="z"), etc. — each must render without raisingKeyError. - Trigger-literal preservation: assert that
"Final Answer:"is a substring of the translatedSECTION_SYSTEM_PROMPT_TEMPLATE,SECTION_USER_PROMPT_TEMPLATE,REACT_OBSERVATION_TEMPLATE,REACT_TOOL_LIMIT_MSG, andREACT_FORCE_FINAL_MSG; assert that"<tool_call>"is a substring of the translatedSECTION_SYSTEM_PROMPT_TEMPLATEandCHAT_SYSTEM_PROMPT_TEMPLATE. - Tool-name preservation: assert that all four primary tool names appear unchanged in the translated
_define_toolskeys and in the translatedTOOL_DESC_*blocks. - End-to-end (deferred): per the precedent of issues #2/#3/#4, full pipeline runs under
Accept-Language: enandAccept-Language: zhare not part of CI for this PR. Reviewer trust applies. If feasible in the implementer's local environment, a single sample run underento confirm no Markdown headings leak into section bodies and a single sample run underzhto confirm Chinese output quality is preserved — both optional confidence boosters, not gates.
Security Considerations
No new security surface. Translated prompts do not expose new endpoints, do not add new external calls, and do not change authorization semantics. The _execute_tool error returns continue to expose str(e) from any caught exception — pre-existing behavior, unchanged by this PR.
Performance & Scalability
No performance regression expected. English prompts may be ~10–30% longer in token count than the equivalent Chinese (English requires more tokens for the same semantic content), but this is well within the 4096 max_tokens ceiling on the section LLM call and the model's overall context budget. No caching, no batching, no concurrency change.
Migration Strategy
No data or schema migration. The change is a single in-place edit. Rollback strategy: revert the single commit on feat/i18n-5-translate-report-agent-prompts if a regression is detected.
Supporting References
- Detailed discovery, alternatives evaluation, decision rationale, and risk register:
.kiro/specs/i18n-report-agent-prompts/research.md. - Sibling spec (i18n-simulation-config-generator-prompts):
.kiro/specs/i18n-simulation-config-generator-prompts/{requirements,design,gap-analysis,research}.md. - Sibling commits:
0806832(#2),9d1d29b(#3),6c2a412(#4). - Ticket snapshot:
.ticket/5.md.