Commit Graph

75 Commits

Author SHA1 Message Date
Christian Moellmann 895a5fbaee fix(interviews): accept stringified ints in all 4 subagent validators
Real LLMs (observed with anthropic/claude-haiku-4-5 on a 23-agent run)
sometimes return Likert values as JSON strings ('3' not 3). The 4 subagent
validators rejected this with isinstance(v, int), losing ~30% of agents at
N=23. Added a shared coerce_int helper in base.py that accepts ints and
numeric strings, rejects bools/floats/garbage, and is now used by:

- Longitudinal: response values 1-5
- Diversity: Q-sort placements -3..+3 and 6 Likert axes 1-7
- Delphi: R2 and R3 importance/plausibility 1-5
- Scenario: 4 dimensions 1-7

Validators now coerce in place so downstream code sees ints regardless of
the wire format. Added 8 tests (4 unit on coerce_int + 4 per-subagent
contract tests showing stringified values are accepted).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 14:03:34 +02:00
Christian Moellmann 6a53c110b7 feat(interviews): capture raw LLM output on schema-validation failures
Adds SchemaValidationFailure exception carrying both retry attempts' raw
output, so audit.jsonl preserves what the model actually said when an
agent's response can't be coerced into the instrument schema. Lets us
diagnose persona-vs-format failures without re-running. Two new tests.
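
As a rough illustration of the idea, an exception can carry the raw text of every attempt so that it can be written to the audit log as one JSONL record. Only the name `SchemaValidationFailure` comes from the commit; the constructor arguments, attribute names, the `audit_line` helper, and the record shape are all hypothetical.

```python
import json
from typing import List


class SchemaValidationFailure(Exception):
    """Raised when no attempt produced schema-valid output (fields assumed)."""

    def __init__(self, agent_id: str, raw_outputs: List[str]):
        super().__init__(
            f"agent {agent_id}: schema validation failed "
            f"after {len(raw_outputs)} attempts"
        )
        self.agent_id = agent_id
        # Keep a copy of what the model actually returned on each attempt.
        self.raw_outputs = list(raw_outputs)


def audit_line(failure: SchemaValidationFailure) -> str:
    """Serialise the failure as a single JSONL record (hypothetical shape)."""
    return json.dumps({
        "event": "schema_validation_failure",
        "agent_id": failure.agent_id,
        "raw_outputs": failure.raw_outputs,
    })
```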

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 13:40:43 +02:00
Christian Moellmann 6e1489fe08 fix(interviews): wire Zep updater/memory/hooks correctly for production runs (C1-C5)
Five tightly-coupled fixes that were causing the interview subsystem to silently
degrade in production:

- C1+C2: `_build_orchestrator` now resolves `graph_id` from
  `SimulationManager().get_simulation(sim_id).graph_id` (the real persisted
  state) instead of a `graph_id.txt` that nothing in the codebase writes.
  `ZepGraphMemoryUpdater(graph_id=...)` is now called with the correct
  positional argument; the bare `try/except Exception` that was swallowing the
  TypeError is replaced with a narrow fallback that logs explicitly.
- C3: `SimulationManager._on_ready_hooks` / `_on_completed_hooks` are now
  class-level (mirroring `SimulationRunner._on_completed_callbacks`).
  Hooks registered at app startup now survive across the per-request
  `SimulationManager()` instances created by the Flask API, so the T0
  longitudinal auto-survey actually fires.
- C4: `ZepGraphMemoryUpdater` gains an explicit `add_text_episode(graph_id, text)`
  method for synchronous text writes. `InterviewZepWriter._emit` no longer
  silently falls back to a dict-shaped `add_activity` call that the real
  implementation rejects (its `add_activity` requires an `AgentActivity`
  dataclass).
- C5: `FileSystemPersonaProvider.agent_to_entity()` builds an
  `{agent_id: zep_entity_uuid}` map from the persisted profile files; the map
  is now passed to `ZepMemoryProvider` so `get_entity_with_context` is called
  with real Zep UUIDs instead of `str(agent_id)`. To make this work,
  `OasisProfileGenerator._save_reddit_json` and `_save_twitter_csv` now persist
  `source_entity_uuid` (Reddit JSON: optional field; Twitter CSV: appended
  column).

Tests: 51 unit + 2 integration pass (was 40 + 2). New tests lock in each fix:
- `test_hooks_survive_across_instances` (C3)
- `test_build_orchestrator_reads_graph_id_from_state` (C1+C2+C5)
- `test_build_orchestrator_falls_back_when_state_missing` (C1+C2)
- `test_emit_uses_add_text_episode_with_graph_id`,
  `test_emit_raises_when_updater_lacks_add_text_episode`,
  `test_real_updater_exposes_add_text_episode` (C4)
- `test_agent_to_entity_from_reddit_json`,
  `test_agent_to_entity_empty_when_no_field`,
  `test_agent_to_entity_falls_back_to_twitter_csv`,
  `test_agent_to_entity_reddit_takes_precedence` (C5)
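
The C3 fix relies on a general Python pattern: a list stored on the class is shared by every instance, so a hook registered once at startup is still visible to instances created later. The sketch below shows only that pattern; `HookRegistry` and its method names are hypothetical stand-ins, not the actual `SimulationManager` API.

```python
from typing import Callable, List


class HookRegistry:
    """Hypothetical stand-in for a manager that is re-created per request."""

    # Class-level list: shared by all instances rather than reset in __init__.
    _on_ready_hooks: List[Callable[[str], None]] = []

    @classmethod
    def register_on_ready(cls, hook: Callable[[str], None]) -> None:
        cls._on_ready_hooks.append(hook)

    def notify_ready(self, sim_id: str) -> None:
        # Iterate over a copy so a hook may register further hooks safely.
        for hook in list(self._on_ready_hooks):
            hook(sim_id)


# Registered once at startup, fired later from a fresh instance.
seen: List[str] = []
HookRegistry.register_on_ready(seen.append)
HookRegistry().notify_ready("sim-1")
```

Had the list been assigned in `__init__`, each new instance would start with no hooks, which matches the failure the commit describes.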

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 13:27:47 +02:00
Christian Moellmann 6b04ea5c27 feat(interviews): auto-trigger lifecycle hooks + bridge SimulationRunner→Manager on COMPLETED
- Add backend/app/services/interviews/lifecycle.py with install_hooks() that
  registers on_ready (pre-survey) and on_completed (post-survey + synthesis)
  daemon-thread callbacks on a SimulationManager.
- Add SimulationRunner.register_on_completed() / _fire_on_completed() so
  external callbacks can be notified when _monitor_simulation transitions to
  COMPLETED (both exit-code-0 path and simulation_end event path).
- Wire both in app/__init__.py: create singleton SimulationManager, install
  lifecycle hooks, and register its _notify_on_completed with SimulationRunner.
- Add test_lifecycle.py: verifies install_hooks registers one callable for each
  of ready and completed.
- All 40 unit tests + 2 integration tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:51:13 +02:00
Christian Moellmann bc07170dbf feat(interviews): persona + Zep memory adapters bridging existing services to interview subsystem
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:34:03 +02:00
Christian Moellmann d79c81d2b7 feat(interviews): synthesiser emits cross-method report + tidy CSV + limitations section
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:32:14 +02:00
Christian Moellmann 3322bcb20c feat(interviews): on_ready / on_completed hook registry on SimulationManager
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:29:30 +02:00
Christian Moellmann b3e2039817 feat(interviews): orchestrator with two-phase lifecycle, parallel fan-out, isolated failures
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:27:47 +02:00
Christian Moellmann cca67365b9 feat(interviews): Zep writer adapts add_activity/add_text_episode for per-agent + aggregate episodes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:24:33 +02:00
Christian Moellmann 998cf1ac27 feat(interviews): JSONL/JSON storage layout with run_id directories and latest pointer
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:23:06 +02:00
Christian Moellmann ae4941df8e feat(interviews): scenario subagent with 4 futures × 4 dimensions + polarity matrix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:21:21 +02:00
Christian Moellmann 5d7111b54e feat(interviews): Delphi subagent (3 rounds: open, rate, revise) + convergence metrics
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:19:07 +02:00
Christian Moellmann 75762ccc18 feat(interviews): diversity subagent with Q-sort + 6 Likert axes + PCA/k-means typology
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:16:21 +02:00
Christian Moellmann 0fcb815cde feat(interviews): longitudinal subagent + 12-item Likert instrument
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:12:46 +02:00
Christian Moellmann 289a0cff56 feat(interviews): StakeholderInterviewer base with in-character prompting and schema retry
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:10:01 +02:00
Christian Moellmann 29be754ff4 feat(interviews): YAML instrument loader with pydantic validation and hash freezing
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:06:52 +02:00
BaiFu af71244974 Merge pull request #428 from Ghostubborn/feat/i18n
feat(i18n): add language switching with support for Chinese and English
2026-04-02 14:27:04 +08:00
ghostubborn 24e9bee5be feat(i18n): replace all user-visible Chinese logger messages in zep_tools.py
These are shown to users via ConsoleLogger in the report page.
2026-04-01 17:46:39 +08:00
ghostubborn e79569ab4f feat(i18n): replace all user-visible Chinese in report_agent.py
Covers ReportLogger message fields and logger messages shown via ConsoleLogger.
2026-04-01 17:44:52 +08:00
666ghj e3350a919d fix(graph): enforce PascalCase for entity names and SCREAMING_SNAKE_CASE for edge names in ontology validation
2026-04-01 17:42:27 +08:00
ghostubborn 0e55e4cf6b feat(i18n): replace remaining Chinese in config generator and profile generator
Also update simulation prompts to be locale-neutral for timezone/schedule.
2026-04-01 17:19:12 +08:00
ghostubborn 7c07237544 fix(i18n): pass locale to background threads via thread-local storage
Background threads (graph building, simulation prep, report generation,
profile generation) now inherit the requesting user's locale preference.
Previously these fell back to 'zh' because Flask request context was
unavailable in spawned threads.
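
One way to express this kind of hand-off with the standard library is to read the locale in the requesting thread and set it in thread-local storage as the first step of the background thread. This is a sketch under assumptions: `set_locale`, `get_locale`, and `spawn_with_locale` are hypothetical names, and only the `'zh'` fallback is taken from the commit message.

```python
import threading
from typing import Any, Callable

_state = threading.local()
DEFAULT_LOCALE = "zh"  # fallback mentioned in the commit message


def set_locale(locale: str) -> None:
    _state.locale = locale


def get_locale() -> str:
    # Each thread sees its own attribute; unset threads get the default.
    return getattr(_state, "locale", DEFAULT_LOCALE)


def spawn_with_locale(target: Callable[..., Any], *args: Any) -> threading.Thread:
    """Start a daemon thread that inherits the caller's locale."""
    locale = get_locale()  # captured here, in the caller's thread

    def runner() -> None:
        set_locale(locale)  # applied inside the new thread
        target(*args)

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    return thread
```

In a Flask handler the caller would first call `set_locale` with the request's preference, then use `spawn_with_locale` instead of creating a bare `threading.Thread`.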
2026-04-01 16:55:51 +08:00
ghostubborn 592ee52f59 feat(i18n): replace remaining hardcoded Chinese in progress callbacks
2026-04-01 16:53:29 +08:00
ghostubborn da2490ec31 fix(i18n): protect JSON field values from language instruction in config generator
Ensure poster_type stays PascalCase English and stance stays English enum
values regardless of language setting. Only natural language fields follow
the user's language preference.
2026-04-01 16:41:22 +08:00
ghostubborn 97aa58384e fix(i18n): ensure ontology names stay PascalCase regardless of language setting
The language instruction was causing LLM to change entity/relation naming
conventions. Now explicitly enforce PascalCase/UPPER_SNAKE_CASE for technical
identifiers while only applying language preference to description fields.
2026-04-01 16:40:17 +08:00
ghostubborn 9d43b77511 feat(i18n): replace hardcoded Chinese in backend SSE progress messages
2026-04-01 16:32:10 +08:00
ghostubborn f75c6487b3 fix(i18n): replace remaining hardcoded language directives in LLM prompts
- oasis_profile_generator: replace hardcoded "使用中文" with dynamic get_language_instruction()
- ontology_generator: remove hardcoded "(中文)" from schema annotation
- report_agent: replace Chinese-specific language consistency rules with language-neutral ones
- zep_tools: dynamically select quote style based on locale
2026-04-01 15:55:04 +08:00
ghostubborn 8f6110df0f feat(i18n): inject language instruction into LLM system prompts 2026-04-01 15:24:12 +08:00
666ghj da6548e96f feat(graph): implement pagination for fetching nodes and edges; add utility functions for streamlined data retrieval 2026-02-27 15:53:29 +08:00
666ghj 25aa4f75d2 fix(report_agent): refine tool call handling and response validation; enforce strict separation between tool calls and final answers 2026-02-24 17:47:44 +08:00
666ghj 08ec856a58 fix(report_agent): update max_agents parameter description and enforce maximum limit of 10 agents 2026-02-14 18:35:05 +08:00
666ghj ddd9ff2479 feat(report_agent): update report language consistency guidelines; ensure all quoted content is translated to the report language for clarity 2026-02-14 18:24:03 +08:00
666ghj 7601d78fd4 feat(report_agent): enhance interview text processing and response handling; improve quote extraction and formatting for better clarity 2026-02-14 16:56:48 +08:00
666ghj dc0a9261d1 feat(report_agent): add detailed tool descriptions and prompts for future prediction report generation 2026-02-14 15:16:17 +08:00
666ghj d2041f6fb8 fix(report_agent): update description of insight_forge tool to remove "最强大" and enhance clarity 2026-02-14 14:48:23 +08:00
666ghj 0a59bace92 fix(report_agent): increase minimum tool call requirement from 2 to 3 per chapter and enhance user prompts to encourage diverse tool usage 2026-02-06 19:37:52 +08:00
666ghj e004fe8f14 fix(report_agent): update tool call requirements in content generation to allow up to 5 tool calls per chapter and clarify user prompts for insufficient data 2026-02-06 18:34:19 +08:00
666ghj f9abaf8e9f refactor(report_agent, Step4Report): simplify logging and remove subsection handling; update UI to reflect changes in section content generation 2026-02-06 18:13:30 +08:00
666ghj 54f1291967 fix(report_agent): handle None responses from LLM during content generation and enforce fallback behavior 2026-01-29 17:08:39 +08:00
666ghj 56b8babf17 feat(ZepGraphMemoryUpdater): add platform display name mapping and logging enhancements. 2026-01-16 09:00:10 +08:00
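666ghj e6da45ee63 feat(history): add a history project showcase component to the home page
- Add the HistoryDatabase.vue component with an animation that expands a fanned stack into a grid
- Add history simulation data API support to backend simulation.py
- Fix hidden-file filtering in SimulationManager
- Add an API method for fetching history simulation data to frontend simulation.js
- Integrate the history project showcase component into Home.vue
- Implement a square-grid background decoration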
2025-12-31 17:54:39 +08:00
666ghj 4be144c3f2 Refactor process termination in SimulationRunner to support cross-platform handling and improve code clarity. Update development script to ensure concurrent processes are terminated correctly. 2025-12-30 17:45:27 +08:00
666ghj 8bd768718e Add SIGHUP signal handling in SimulationRunner for Unix systems 2025-12-30 15:28:26 +08:00
666ghj 067855f7b5 Add UTF-8 encoding support for Windows in simulation_runner.py and run_parallel_simulation.py to resolve character encoding issues with third-party libraries. 2025-12-26 18:14:57 +08:00
666ghj 99c1b199d5 Update ReportAgent to reduce maximum tool calls and iterations for improved efficiency
- Decreased the maximum tool calls per section from 8 to 5.
- Reduced the maximum iterations in the ReACT loop from 8 to 5, streamlining the report generation process.
2025-12-16 22:47:14 +08:00
666ghj cb47e9859c Update ReportAgent to enhance report retrieval and streamline tool call process
- Reduced maximum tool calls per chat from 5 to 2 for improved efficiency.
- Simplified system prompt to focus on concise responses and report content.
- Implemented report content retrieval with length limitation to prevent context overflow.
- Adjusted tool call execution to limit to one call per iteration, enhancing clarity in responses.
- Updated user message prompts to encourage concise answers based on retrieved data.
2025-12-16 17:59:34 +08:00
666ghj 0fa2363104 Update maximum limits for tool calls and iterations in ReportAgent class
- Increased the maximum tool calls per section from 4 to 8, enhancing the agent's capabilities.
- Raised the maximum reflection rounds from 2 to 3 to allow for deeper analysis.
- Adjusted the maximum tool calls per chat from 3 to 5 for improved interaction.
- Expanded the maximum agents for interviews from 5 to 20, facilitating more comprehensive data gathering.
- Increased the maximum iterations for ReACT loops from 5 to 8 and from 3 to 5 in different contexts, optimizing the report generation process.
2025-12-14 23:36:44 +08:00
666ghj a097de4094 Enhance text output formatting and remove truncation in zep_tools.py
- Updated the `to_text` method in the `PanoramaResult` class to provide complete outputs for current facts, historical facts, and involved entities, improving data visibility.
- Modified the `to_text` method in the `AgentInterview` class to display the full agent bio without truncation.
- Adjusted the `ZepToolsService` class to retrieve all related entity details and facts without limiting the output, ensuring comprehensive data representation.
2025-12-14 22:41:46 +08:00
666ghj 9be2c28a5d Refactor report logging and enhance report generation features
- Renamed log_section_complete to log_section_content to better reflect its purpose, and added is_subsection parameter for improved logging of subsection content.
- Introduced log_section_full_complete method to log the completion of entire sections, including all subsections, enhancing tracking of report generation status.
- Adjusted maximum tool call limits for sections and chats to optimize performance during report generation.
- Updated system prompts and user prompts in the ReportAgent class to clarify the report's focus on future predictions rather than current analysis.
- Enhanced the Step3Simulation and Step4Report components for improved user experience, including UI updates and better handling of report generation states.
2025-12-14 03:28:41 +08:00
666ghj fde79721e8 Enhance agent bio display and tool result presentation in Step4Report component
- Updated the AgentInterview class to display the full agent bio, truncating only if it exceeds 1000 characters for better readability.
- Enhanced the Step4Report component to include structured display for tool results, allowing users to toggle between raw and structured views for various tools, improving user experience and clarity.
- Introduced new components for parsing and displaying results from different tools, including InsightForge, PanoramaSearch, InterviewAgents, and QuickSearch, providing a comprehensive view of the data.
2025-12-14 01:29:57 +08:00