translate the three llm prompt blocks plus the two prompt-feeding helpers
(_build_context, _summarize_entities) in
backend/app/services/simulation_config_generator.py from chinese to english.
the chinese base prompts were biasing the model toward chinese structure
and lexical choice for content, narrative_direction, hot_topics, and
reasoning fields even when accept-language was en, because
get_language_instruction() only steers the response language as a
postfix.
translation is in-place and preserves every functional contract: the
json output schema for all three prompts, every variable interpolation,
the per-entity-type heuristic ranges in the agent-config prompt, the
trailing english IMPORTANT directives that lock poster_type to
PascalCase and stance to {supportive,opposing,neutral,observer}, and
all three get_language_instruction() postfix call sites. the two
default-path reasoning literals are translated to locale-agnostic
english so generation_reasoning no longer mixes chinese and english on
the failure path.
logger calls, docstrings, and inline comments are intentionally left
chinese (out of scope; covered by issues #6 and #7). public api,
dataclasses, class constants, and the SimulationParameters payload
shape are unchanged.
Closes#4
translate the system prompt constant and the user-message template in
backend/app/services/ontology_generator.py from chinese to english.
the chinese base prompt was biasing the model toward chinese structure
and word choice even when accept-language was en, leaving ontology
descriptions and analysis_summary fields chinese-flavoured.
translation is in-place and preserves every functional contract: the
json output schema, the entity-type and relationship-type taxonomies
verbatim, the reserved-attribute-name list, the count and length
constraints, and all f-string interpolations. the
get_language_instruction() postfix call site and the trailing english
identifier-format directive are unchanged, so zh and other locales
continue to receive locale-appropriate descriptions.
logger calls, docstrings, and inline comments are intentionally left
in chinese — they are owned by issues #6 and #7.
a small static guard script (backend/scripts/test_ontology_prompts_no_cjk.py)
ast-parses the module and asserts zero cjk in the system prompt and in
every string literal of _build_user_message except the docstring, so
the regression cannot reappear silently.
Closes#2
Adds a Neo4j service to docker-compose so `docker compose up -d` works
on a clean checkout, and unhardcodes Graphiti's LLM/embedder so the
documented default provider (Qwen via Dashscope) actually works.
- docker-compose: neo4j:5-community service with cypher-shell
healthcheck, named volumes, and `depends_on: service_healthy` on the
app container; in-Docker NEO4J_URI override leaves the host-mode
default untouched.
- Config: new GRAPHITI_LLM_PROVIDER (openai|gemini, default openai) plus
optional EMBEDDING_API_KEY / EMBEDDING_BASE_URL that fall back to the
chat LLM credentials.
- graphiti_adapter: provider switch inside the singleton factory with
lazy per-provider imports; Gemini path is preserved exactly. The
no-op `_GeminiReranker` becomes a provider-agnostic
`_PassthroughReranker`, still injected explicitly so Graphiti does
not fall back to its OpenAI-only default reranker.
- Drop the ignored `reranker=` kwarg from `_GraphNamespace.search` and
the misleading callers in `zep_tools.py` and
`oasis_profile_generator.py`.
- Refresh `.env.example` to mirror the README env section.
Spec, requirements, and design under
`.kiro/specs/graphiti-neo4j-finalize/`.
Closes#1
Background threads (graph building, simulation prep, report generation,
profile generation) now inherit the requesting user's locale preference.
Previously these fell back to 'zh' because Flask request context was
unavailable in spawned threads.
Ensure poster_type stays PascalCase English and stance stays English enum
values regardless of language setting. Only natural language fields follow
the user's language preference.
The language instruction was causing LLM to change entity/relation naming
conventions. Now explicitly enforce PascalCase/UPPER_SNAKE_CASE for technical
identifiers while only applying language preference to description fields.
Replaces the paid, rate-limited Zep Cloud service with Graphiti (graphiti-core
0.11.6) backed by a local Neo4j instance — free, unlimited, and self-hosted.
Key changes:
- Add GraphitiAdapter: drop-in Zep-compatible wrapper around Graphiti with a
persistent event-loop thread to avoid asyncio/Neo4j driver conflicts
- Switch LLM client to native GeminiClient + GeminiEmbedder (text-embedding-004
fails on Gemini compat endpoint; use google-genai SDK directly)
- Add _GeminiReranker passthrough replacing OpenAIRerankerClient (which
hardcodes gpt-4.1-nano and uses logprobs unsupported by Gemini)
- Fix Cypher queries: use s.uuid/t.uuid for edge source/target instead of
r.source_node_uuid (null property in Graphiti's schema)
- Add ontology-based entity type classifier (_classify_entity_type) so nodes
get colored by type in the D3 graph visualization instead of all being Entity
- Apply classifier in ZepEntityReader so filter_defined_entities finds entities
(previously 0 personas loaded because all labels were ['Entity'])
- Add startup recovery: auto-mark graph_building projects as graph_completed
on backend restart if Neo4j already has their data
- Add resume capability to graph build: skip already-processed episodes after
a restart (断点续传)
- Add non-blocking graph data cache with background refresh in graph.py
- Add EMBEDDING_MODEL config (default: gemini-embedding-001 for Gemini users)
- Add CLAUDE.md with project architecture and dev commands
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Decreased the maximum tool calls per section from 8 to 5.
- Reduced the maximum iterations in the ReACT loop from 8 to 5, streamlining the report generation process.
- Reduced maximum tool calls per chat from 5 to 2 for improved efficiency.
- Simplified system prompt to focus on concise responses and report content.
- Implemented report content retrieval with length limitation to prevent context overflow.
- Adjusted tool call execution to limit to one call per iteration, enhancing clarity in responses.
- Updated user message prompts to encourage concise answers based on retrieved data.
- Increased the maximum tool calls per section from 4 to 8, enhancing the agent's capabilities.
- Raised the maximum reflection rounds from 2 to 3 to allow for deeper analysis.
- Adjusted the maximum tool calls per chat from 3 to 5 for improved interaction.
- Expanded the maximum agents for interviews from 5 to 20, facilitating more comprehensive data gathering.
- Increased the maximum iterations for ReACT loops from 5 to 8 and from 3 to 5 in different contexts, optimizing the report generation process.
- Updated the `to_text` method in the `PanoramaResult` class to provide complete outputs for current facts, historical facts, and involved entities, improving data visibility.
- Modified the `to_text` method in the `AgentInterview` class to display the full agent bio without truncation.
- Adjusted the `ZepToolsService` class to retrieve all related entity details and facts without limiting the output, ensuring comprehensive data representation.
- Renamed log_section_complete to log_section_content to better reflect its purpose, and added is_subsection parameter for improved logging of subsection content.
- Introduced log_section_full_complete method to log the completion of entire sections, including all subsections, enhancing tracking of report generation status.
- Adjusted maximum tool call limits for sections and chats to optimize performance during report generation.
- Updated system prompts and user prompts in the ReportAgent class to clarify the report's focus on future predictions rather than current analysis.
- Enhanced the Step3Simulation and Step4Report components for improved user experience, including UI updates and better handling of report generation states.
- Updated the AgentInterview class to display the full agent bio, truncating only if it exceeds 1000 characters for better readability.
- Enhanced the Step4Report component to include structured display for tool results, allowing users to toggle between raw and structured views for various tools, improving user experience and clarity.
- Introduced new components for parsing and displaying results from different tools, including InsightForge, PanoramaSearch, InterviewAgents, and QuickSearch, providing a comprehensive view of the data.
- Introduced a unique report ID generation mechanism to enhance tracking and management of reports.
- Implemented detailed logging for the report generation process, including agent actions, planning stages, and tool calls, improving traceability and debugging.
- Added new API endpoints for retrieving agent and console logs, allowing users to access detailed execution logs and console outputs during report generation.
- Enhanced the frontend GraphPanel component with a notification for users when simulations finish, improving user experience and feedback.
- Introduced strict formatting guidelines for chapter content, prohibiting the use of Markdown headers and emphasizing the use of bold text for section titles.
- Implemented a new method to save chapters along with their subsections into a single file, streamlining the report structure.
- Added content cleaning functionality to remove duplicate titles and ensure proper formatting before saving.
- Enhanced the report assembly process to include post-processing for title management and improved readability.
- Updated SimulationRunner to include additional files for deletion, specifically Twitter and Reddit simulation databases, and environment status files.
- Refactored Step3Simulation component to streamline the status display, removing unnecessary conditions and improving the user interface for simulation phases.
- Introduced a reset function to clear all simulation states before starting a new simulation, ensuring a clean environment for each run.
- Introduced a new JSON data file containing detailed actions and quotes related to the 武大声誉修复基金 initiative.
- Updated the OasisProfileGenerator to ensure compatibility with the new JSON format, emphasizing the inclusion of user_id.
- Modified simulation management to support independent tracking of Twitter and Reddit platforms, including completion status and round information.
- Enhanced the SimulationRunner to accurately reflect the completion state of each platform and added checks for overall simulation completion.
- Improved the GraphPanel and Step3Simulation components to provide real-time updates and better user feedback during simulations.
- Updated the simulation API to include a new 'force' parameter, allowing users to restart simulations while cleaning up previous logs.
- Enhanced the simulation status detail retrieval to include all actions and platform-specific actions for better monitoring.
- Introduced a new SimulationRunView component to manage the simulation process, providing a clear interface for users to start and monitor simulations.
- Improved the Step3Simulation component with detailed logging and progress indicators, ensuring users receive real-time updates during the simulation.
- Added new API endpoints for retrieving simulation posts and actions, enhancing the overall functionality and user engagement.
- Updated the `get_graph_data` method in `graph_builder.py` to include additional attributes such as creation time, validity periods, and episodes for nodes and edges, improving the richness of the graph data.
- Modified the detail panel in `Process.vue` to present new attributes, including properties, episodes, and timestamps, enhancing user interaction and data visibility.
- Improved styling for the detail panel to better organize and present the comprehensive information for selected nodes and edges.
- Updated the "interview_agents" tool in the Report Agent to utilize the OASIS interview API for real-time agent interviews across Twitter and Reddit, providing authentic responses.
- Improved the ZepToolsService to streamline agent selection, question generation, and interview processing, ensuring a more efficient and structured interview workflow.
- Enhanced documentation to reflect the new interview capabilities, including updated usage instructions and important operational notes regarding the OASIS environment.
- Removed deprecated methods related to simulated interviews, focusing on real API interactions for improved accuracy and reliability.
- Introduced a new "interview_agents" tool in the Report Agent to facilitate in-depth interviews with simulation agents, allowing for multi-perspective insights.
- Implemented the InterviewResult and AgentInterview data classes to structure and manage interview data effectively.
- Enhanced ZepToolsService with methods for conducting interviews, including agent selection and question generation based on user requirements.
- Updated documentation to reflect the new interview capabilities and usage instructions for the Report Agent and Zep tools.
- Introduced new core search tools in the Report Agent: InsightForge for deep insights, PanoramaSearch for comprehensive views, and QuickSearch for rapid queries.
- Updated the Report Agent to prioritize tool usage for data retrieval, ensuring all report content is based on simulation results rather than internal knowledge.
- Enhanced the ZepToolsService with methods for InsightForge and PanoramaSearch, allowing for multi-dimensional queries and historical data retrieval.
- Improved documentation to reflect the new functionalities and usage guidelines for the Report Agent and Zep tools.
- Introduced the Report Agent module to facilitate the automatic generation of simulation analysis reports using LangChain and Zep, following the ReACT model.
- Added functionality for report outline planning, segmented content generation, and user interaction through a dialogue interface.
- Implemented new API endpoints for report generation, status checking, and retrieval, enhancing the overall reporting capabilities.
- Updated README.md to include detailed instructions on the new report generation features and API usage.
- Enhanced the project structure to accommodate the new report management functionalities, including report storage and retrieval mechanisms.