Requirements Document
Introduction
The MiroFish backend currently emits Chinese strings directly from logger.{info,warning,error,debug,exception} calls and from a number of jsonify({"error|message": ...}) API responses. These hardcoded strings bypass the existing t() localization helper in backend/app/utils/locale.py, so English-speaking operators see unreadable messages in their log aggregators and API responses ignore the active locale. This spec defines the work required to externalize every Chinese log message and user-facing API error/message string in the listed backend modules into the locale dictionaries (locales/en.json and locales/zh.json), so that logs and responses honor the request locale and English-speaking operators get a fully readable pipeline.
Boundary Context
- In scope:
  - Replace Chinese string literals inside logger.{info,warning,error,debug,exception} calls in:
    - backend/app/services/report_agent.py
    - backend/app/services/zep_tools.py
    - backend/app/services/simulation_runner.py
    - backend/app/services/oasis_profile_generator.py
    - backend/app/services/simulation_config_generator.py
    - backend/app/services/zep_graph_memory_updater.py
    - backend/app/services/ontology_generator.py
    - backend/app/services/simulation_manager.py
    - backend/app/services/zep_entity_reader.py
    - backend/app/services/simulation_ipc.py
    - backend/app/services/graph_builder.py
    - backend/app/api/simulation.py
    - backend/app/api/report.py
    - backend/app/api/graph.py
  - Replace Chinese string literals inside user-facing jsonify({"error": ...}) and jsonify({"message": ...}) (or equivalent response builders) in those API modules.
  - Add the corresponding keys to both locales/en.json (English translation) and locales/zh.json (preserve original Chinese verbatim) under a domain-grouped namespace (log.<domain>.<key>, api.error.<scope>, api.message.<scope>).
  - Preserve existing interpolation by passing values through t(key, **kwargs) (using the helper's {name} placeholder syntax) instead of f-strings or %-formatting around the call.
  - Ensure t() returns a safe fallback (and emits a warning, not a crash) when a key is missing.
- Out of scope:
  - Prompt template strings (handled by tickets #2/#3/#4/#5; the report-agent prompts work is already on the current branch).
  - Chinese docstrings and inline comments (handled by ticket #7).
  - Re-architecting the t() helper, switching i18n libraries, or introducing pluralization/ICU formatting.
  - Changing log levels, log structure, or response status codes beyond the string content.
  - Frontend zh.json parity beyond the new keys this work introduces.
- Adjacent expectations:
  - The t() helper at backend/app/utils/locale.py already exposes set_locale, get_locale, and t and is wired up at request time and at background-thread entry; new code must reuse the existing helper.
  - Locale files (locales/en.json, locales/zh.json) currently coexist with frontend vue-i18n consumption; new keys must not collide with existing top-level frontend keys (menu, process, step1, etc.). All new backend keys live under the new top-level namespaces log and api (or extend them if already present).
  - Sibling spec i18n-report-agent-prompts covered the prompt portion of report_agent.py; this spec must not regress those translations.
Requirements
Requirement 1: Externalize Chinese Logger Messages
Objective: As a backend operator viewing logs in an English log aggregator, I want every Chinese log message in the listed backend modules to be emitted in the active locale, so that I can read and triage logs without translation tooling.
Acceptance Criteria
- The Backend Logging Layer shall emit log records whose message text is produced by t("log.<domain>.<key>", **fmt) for every logger.{info,warning,error,debug,exception} call in the listed in-scope modules that previously contained Chinese characters.
- When the active locale is en, the Backend Logging Layer shall emit the English translation defined in locales/en.json for each externalized log key.
- When the active locale is zh, the Backend Logging Layer shall emit the original Chinese text as preserved in locales/zh.json for each externalized log key.
- The Backend Logging Layer shall preserve all interpolated values (entity counts, identifiers, exception text) by passing them as keyword arguments to t() rather than concatenating or formatting them around the t() call.
- The Backend Logging Layer shall not contain any Chinese character (U+4E00–U+9FFF) inside the string-literal argument of any logger.{info,warning,error,debug,exception} call within the listed in-scope modules.
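To make the intended call shape concrete, here is a minimal sketch of one logger call before and after externalization. The t() function below is a hypothetical stand-in for the real helper in backend/app/utils/locale.py; only the t(key, **kwargs) shape and the {name} placeholder syntax come from this spec. The key log.graph.build_done and both message strings are invented for illustration.

```python
import logging

# Hypothetical stand-in for the helper in backend/app/utils/locale.py.
# The key name and both translations are illustrative, not taken from the codebase.
_MESSAGES = {
    "en": {"log.graph.build_done": "Graph build finished: {node_count} nodes"},
    "zh": {"log.graph.build_done": "图谱构建完成: {node_count} 个节点"},
}
_locale = "en"


def t(key, **kwargs):
    """Look up the template for the active locale and fill {name} placeholders."""
    template = _MESSAGES[_locale].get(key, key)
    return template.format(**kwargs)


logger = logging.getLogger("mirofish.graph_builder")


def report_build_done(node_count):
    # Before: logger.info(f"图谱构建完成: {node_count} 个节点")
    # After: the literal lives in the locale files and the value is a keyword argument.
    message = t("log.graph.build_done", node_count=node_count)
    logger.info(message)
    return message
```

Passing node_count as a keyword argument, rather than formatting around the t() call, keeps the whole sentence translatable, including word order around the interpolated value.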
Requirement 2: Externalize Chinese API Response Strings
Objective: As a frontend client (or external API consumer) sending the Accept-Language header, I want backend error and message responses in the listed API modules to be returned in the active locale, so that user-facing error surfaces match the rest of the localized UI.
Acceptance Criteria
- The Backend API Layer shall produce the error and message field values of jsonify({...}) responses in the listed in-scope API modules (backend/app/api/{simulation,report,graph}.py) by calling t("api.error.<scope>", **fmt) or t("api.message.<scope>", **fmt).
- When the request Accept-Language header is en, the Backend API Layer shall return the English translation for the corresponding response key.
- When the request Accept-Language header is zh or absent, the Backend API Layer shall return the original Chinese string as preserved in locales/zh.json.
- The Backend API Layer shall not contain any Chinese character inside the string value of an error or message field in any jsonify(...) (or equivalent response builder) call within the listed in-scope API modules.
- The Backend API Layer shall keep the HTTP status code, response key set, and (for non-i18n keys) value structure of every modified response unchanged.
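The criteria above can be sketched as follows. This is an illustration under stated assumptions, not the existing request wiring: set_locale and t are hypothetical stand-ins for the real helper, locale_from_header and error_response are invented names, and the key api.error.simulation_not_found and its translations are made up. The body is returned as a plain dict so the sketch runs without Flask; in a real view it would be wrapped in jsonify(...).

```python
# Hypothetical stand-ins for set_locale/t from backend/app/utils/locale.py.
# The key and both translations are illustrative only.
_MESSAGES = {
    "en": {"api.error.simulation_not_found": "Simulation not found: {simulation_id}"},
    "zh": {"api.error.simulation_not_found": "模拟不存在: {simulation_id}"},
}
_active = {"locale": "zh"}


def set_locale(locale):
    _active["locale"] = locale


def t(key, **kwargs):
    return _MESSAGES[_active["locale"]].get(key, key).format(**kwargs)


def locale_from_header(accept_language):
    """Map an Accept-Language value to 'en' or 'zh'; zh when absent (assumed rule)."""
    if not accept_language:
        return "zh"
    primary = accept_language.split(",")[0].split(";")[0].strip().lower()
    return "en" if primary.startswith("en") else "zh"


def error_response(accept_language, key, status, **fmt):
    """Build a (body, status) pair; only the string content depends on the locale."""
    set_locale(locale_from_header(accept_language))
    return {"error": t(key, **fmt)}, status
```

The status code and the key set of the body are the same for both locales, which is the invariant the last criterion asks for.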
Requirement 3: Locale Dictionary Parity and Structure
Objective: As a translator or developer adding a new locale, I want every backend log/API key to exist in both en.json and zh.json with identical nested structure, so that the locale files can be diffed and validated mechanically.
Acceptance Criteria
- The Locale Dictionary shall contain, in locales/en.json, every key introduced by Requirements 1 and 2 with an English translation.
- The Locale Dictionary shall contain, in locales/zh.json, every key introduced by Requirements 1 and 2 with the original Chinese text preserved verbatim from the previous source code.
- The Locale Dictionary shall organize new backend keys under the top-level namespaces log (grouped by domain: graph, simulation, report, agent, pipeline, etc.) and api (grouped as api.error.<scope> / api.message.<scope>).
- The Locale Dictionary shall expose a structurally identical key tree across en.json and zh.json, such that recursively diffing the key paths (ignoring values) of the two files produces an empty difference.
- The Locale Dictionary shall not collide with or overwrite any pre-existing top-level frontend i18n key when the new namespaces are added.
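One way to read "recursively diffing the key paths (ignoring values)" is sketched below. The function names key_paths and parity_diff are hypothetical; the sketch works on already-parsed dictionaries, so in practice each file would first be loaded with json.load.

```python
def key_paths(tree, prefix=""):
    """Collect dotted paths to every leaf value in a nested dict."""
    paths = set()
    for name, value in tree.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            paths |= key_paths(value, path)
        else:
            paths.add(path)
    return paths


def parity_diff(en, zh):
    """Return (paths only in en, paths only in zh); both empty means parity."""
    en_paths, zh_paths = key_paths(en), key_paths(zh)
    return sorted(en_paths - zh_paths), sorted(zh_paths - en_paths)
```

For example, parity_diff({"log": {"graph": {"a": "x"}}}, {"log": {"graph": {"a": "y", "b": "z"}}}) reports log.graph.b as present only in the second file.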
Requirement 4: Safe Fallback for Missing Keys
Objective: As a backend service author who may ship code ahead of a translation update, I want missing translation keys to produce a visible warning without crashing the request or background task, so that incomplete locale dictionaries degrade gracefully.
Acceptance Criteria
- If a t(key, ...) call references a key that exists in neither the active locale nor the zh fallback, the Locale Helper shall return a non-empty string (the key itself or an explicit placeholder) rather than None or raising.
- If a t(key, ...) call references a missing key, the Locale Helper shall emit a single warning-level log record identifying the missing key, the active locale, and (when available) the call site context.
- The Locale Helper shall not raise KeyError, AttributeError, or TypeError for any key lookup, irrespective of nesting depth or invalid path segments.
- When t() is invoked from a background thread that called set_locale(...) at entry, the Locale Helper shall resolve the locale set on that thread for the entire call chain.
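The fallback behavior described above might look like the following sketch. It is not the existing helper: the in-memory _DICTS, the _lookup function, the "<missing key>" placeholder, and the choice to warn only when the key is absent from both the active locale and zh are all assumptions made for illustration. Per-thread locale state is modeled with threading.local.

```python
import logging
import threading

logger = logging.getLogger("locale_sketch")
_state = threading.local()

# Illustrative dictionaries; the real helper reads locales/en.json and locales/zh.json.
_DICTS = {
    "en": {"log": {"graph": {"done": "Done: {count}"}}},
    "zh": {"log": {"graph": {"done": "完成: {count}", "only_zh": "仅中文"}}},
}


def set_locale(locale):
    _state.locale = locale


def get_locale():
    # Threads that never called set_locale fall back to zh (assumed default).
    return getattr(_state, "locale", "zh")


def _lookup(tree, key):
    """Walk a dotted key; return None instead of raising on any bad segment."""
    node = tree
    for segment in str(key).split("."):
        if not isinstance(node, dict) or segment not in node:
            return None
        node = node[segment]
    return node if isinstance(node, str) else None


def t(key, **kwargs):
    locale = get_locale()
    template = _lookup(_DICTS.get(locale, {}), key)
    if template is None:
        template = _lookup(_DICTS["zh"], key)
    if template is None:
        logger.warning("missing translation key %r for locale %r", key, locale)
        return str(key) or "<missing key>"
    try:
        return template.format(**kwargs)
    except (KeyError, IndexError, ValueError):
        # A missing or malformed placeholder should not crash the caller.
        return template
```

Because the locale lives in thread-local storage, a background thread that calls set_locale("zh") at entry does not affect the locale seen by the request thread.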
Requirement 5: Verification and Regression Guards
Objective: As a reviewer of this PR, I want repeatable mechanical checks that prove the in-scope files are clean of stray Chinese log/response strings, so that the acceptance criteria can be re-validated on every future change.
Acceptance Criteria
- The Verification Script shall, when run against the repository, report zero matches for the regular expression logger\.[a-z]+\(["'][^"']*[一-鿿] across the listed in-scope modules.
- The Verification Script shall, when run against the repository, report zero matches for any jsonify({"error": "<chinese>"}) or jsonify({"message": "<chinese>"}) literal in the listed in-scope API modules.
- The Verification Script shall, when run against locales/en.json and locales/zh.json, confirm that every newly introduced key path exists in both files (structural-key parity) and exit non-zero if a key is present in only one file.
- The Verification Script shall be runnable from the repository root using only tools already available in the dev environment (grep, python, or jq; no new dependencies introduced).
- The Backend Test Suite shall continue to pass (uv run python -m pytest) after the externalization changes, with no new failures introduced by the rename of message strings.
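A minimal sketch of the logger check in Python follows. It deviates from the regular expression quoted above in two labeled ways: the character range is written with \u escapes (U+4E00 to U+9FFF), and an optional string prefix such as f is allowed before the opening quote, on the assumption that f-string log calls should also be caught. The name find_violations is hypothetical, and the function takes source text by path so it can be exercised without touching the file system.

```python
import re

# Assumption: extends the spec's pattern with optional whitespace and a string
# prefix (f, r, b, u) so that calls like logger.info(f"...") are also matched.
CHINESE_LOGGER_CALL = re.compile(
    r"""logger\.[a-z]+\(\s*[fFrRbBuU]{0,2}["'][^"']*[\u4e00-\u9fff]"""
)


def find_violations(source_by_path):
    """Return (path, line number) for each logger call with a Chinese literal."""
    hits = []
    for path, text in source_by_path.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if CHINESE_LOGGER_CALL.search(line):
                hits.append((path, lineno))
    return hits
```

A wrapper script could build source_by_path with pathlib.Path(p).read_text(encoding="utf-8") for each in-scope module and exit non-zero when the returned list is non-empty.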