translate the three llm prompt blocks plus the two prompt-feeding helpers
(_build_context, _summarize_entities) in
backend/app/services/simulation_config_generator.py from chinese to english.
the chinese base prompts were biasing the model toward chinese structure
and lexical choice for content, narrative_direction, hot_topics, and
reasoning fields even when accept-language was en, because
get_language_instruction() only steers the response language as a
postfix.
translation is in-place and preserves every functional contract: the
json output schema for all three prompts, every variable interpolation,
the per-entity-type heuristic ranges in the agent-config prompt, the
trailing english IMPORTANT directives that lock poster_type to
PascalCase and stance to {supportive,opposing,neutral,observer}, and
all three get_language_instruction() postfix call sites. the two
default-path reasoning literals are translated to locale-agnostic
english so generation_reasoning no longer mixes chinese and english on
the failure path.
logger calls, docstrings, and inline comments are intentionally left
chinese (out of scope; covered by issues #6 and #7). public api,
dataclasses, class constants, and the SimulationParameters payload
shape are unchanged.
Closes#4
translate the system prompt constant and the user-message template in
backend/app/services/ontology_generator.py from chinese to english.
the chinese base prompt was biasing the model toward chinese structure
and word choice even when accept-language was en, leaving ontology
descriptions and analysis_summary fields chinese-flavoured.
translation is in-place and preserves every functional contract: the
json output schema, the entity-type and relationship-type taxonomies
verbatim, the reserved-attribute-name list, the count and length
constraints, and all f-string interpolations. the
get_language_instruction() postfix call site and the trailing english
identifier-format directive are unchanged, so zh and other locales
continue to receive locale-appropriate descriptions.
logger calls, docstrings, and inline comments are intentionally left
in chinese — they are owned by issues #6 and #7.
a small static guard script (backend/scripts/test_ontology_prompts_no_cjk.py)
ast-parses the module and asserts zero cjk in the system prompt and in
every string literal of _build_user_message except the docstring, so
the regression cannot reappear silently.
Closes#2
Adds a Neo4j service to docker-compose so `docker compose up -d` works
on a clean checkout, and unhardcodes Graphiti's LLM/embedder so the
documented default provider (Qwen via Dashscope) actually works.
- docker-compose: neo4j:5-community service with cypher-shell
healthcheck, named volumes, and `depends_on: service_healthy` on the
app container; in-Docker NEO4J_URI override leaves the host-mode
default untouched.
- Config: new GRAPHITI_LLM_PROVIDER (openai|gemini, default openai) plus
optional EMBEDDING_API_KEY / EMBEDDING_BASE_URL that fall back to the
chat LLM credentials.
- graphiti_adapter: provider switch inside the singleton factory with
lazy per-provider imports; Gemini path is preserved exactly. The
no-op `_GeminiReranker` becomes a provider-agnostic
`_PassthroughReranker`, still injected explicitly so Graphiti does
not fall back to its OpenAI-only default reranker.
- Drop the ignored `reranker=` kwarg from `_GraphNamespace.search` and
the misleading callers in `zep_tools.py` and
`oasis_profile_generator.py`.
- Refresh `.env.example` to mirror the README env section.
Spec, requirements, and design under
`.kiro/specs/graphiti-neo4j-finalize/`.
Closes#1
Background threads (graph building, simulation prep, report generation,
profile generation) now inherit the requesting user's locale preference.
Previously these fell back to 'zh' because Flask request context was
unavailable in spawned threads.
Ensure poster_type stays PascalCase English and stance stays English enum
values regardless of language setting. Only natural language fields follow
the user's language preference.
The language instruction was causing LLM to change entity/relation naming
conventions. Now explicitly enforce PascalCase/UPPER_SNAKE_CASE for technical
identifiers while only applying language preference to description fields.
- Changed error messages to reflect the new configuration requirement for Neo4j.
- Ensured consistent handling of missing credentials across multiple functions.
Replaces the paid, rate-limited Zep Cloud service with Graphiti (graphiti-core
0.11.6) backed by a local Neo4j instance — free, unlimited, and self-hosted.
Key changes:
- Add GraphitiAdapter: drop-in Zep-compatible wrapper around Graphiti with a
persistent event-loop thread to avoid asyncio/Neo4j driver conflicts
- Switch LLM client to native GeminiClient + GeminiEmbedder (text-embedding-004
fails on Gemini compat endpoint; use google-genai SDK directly)
- Add _GeminiReranker passthrough replacing OpenAIRerankerClient (which
hardcodes gpt-4.1-nano and uses logprobs unsupported by Gemini)
- Fix Cypher queries: use s.uuid/t.uuid for edge source/target instead of
r.source_node_uuid (null property in Graphiti's schema)
- Add ontology-based entity type classifier (_classify_entity_type) so nodes
get colored by type in the D3 graph visualization instead of all being Entity
- Apply classifier in ZepEntityReader so filter_defined_entities finds entities
(previously 0 personas loaded because all labels were ['Entity'])
- Add startup recovery: auto-mark graph_building projects as graph_completed
on backend restart if Neo4j already has their data
- Add resume capability to graph build: skip already-processed episodes after
a restart (断点续传)
- Add non-blocking graph data cache with background refresh in graph.py
- Add EMBEDDING_MODEL config (default: gemini-embedding-001 for Gemini users)
- Add CLAUDE.md with project architecture and dev commands
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Implemented `_get_report_id_for_simulation` to find the most recent report ID associated with a simulation ID by scanning the reports directory.
- Updated `get_simulation_history` to include the retrieved report ID in the response, enhancing the simulation data returned to the client.
- Updated simulation history retrieval to read project details directly from the Simulation file.
- Improved simulation configuration handling by reading simulation requirements from JSON.
- Added project file listing to the simulation history, displaying up to three associated files.
- Refined card layout in HistoryDatabase.vue to accommodate new file display features and improved responsiveness.
- Deleted docker-compose.yml, backend Dockerfile, frontend Dockerfile, and nginx configuration to streamline project setup.
- Updated .env.example to reorganize LLM and ZEP API configurations for clarity and ease of use.
- Enhanced README.md to reflect changes in project structure and provide clearer setup instructions.
- Created package-lock.json for dependency management.
- Updated package.json and frontend package.json to version 0.1.0.
- Adjusted backend pyproject.toml to reflect version 0.1.0.
- Introduced uv.lock for Python dependency resolution.
- Modified the backend setup script to clear the virtual environment before installation.
- Improved README.md by restructuring the prerequisites section into a table for better readability.
- Added installation instructions for the `uv` package and clarified terminal requirements post-installation.