Commit Graph

292 Commits

Author SHA1 Message Date
Christian Möllmann 609ea6c5bd
Merge 895a5fbaee into 96096ea0ff 2026-05-25 00:06:17 +02:00
BaiFu 96096ea0ff
Merge pull request #640 from lllopic/fix/add-type-hints-and-helper-method
refactor: add type hints and FileParser.is_supported() helper
2026-05-25 00:48:57 +08:00
666ghj 3f4d56116c fix(backend): constrain Python version to 3.11-3.12 2026-05-24 22:59:36 +08:00
BaiFu db1bc144ff
Merge pull request #641 from YunyueLi/docs/readme-polish
docs(readme): fix typo in Shanda logo alt text
2026-05-24 02:02:46 +08:00
Christian Moellmann 895a5fbaee fix(interviews): accept stringified ints in all 4 subagent validators
Real LLMs (observed with anthropic/claude-haiku-4-5 on a 23-agent run)
sometimes return Likert values as JSON strings ('3' not 3). The 4 subagent
validators rejected this with isinstance(v, int), losing ~30% of agents at
N=23. Added a shared coerce_int helper in base.py that accepts ints and
numeric strings, rejects bools/floats/garbage, and is now used by:

- Longitudinal: response values 1-5
- Diversity: Q-sort placements -3..+3 and 6 Likert axes 1-7
- Delphi: R2 and R3 importance/plausibility 1-5
- Scenario: 4 dimensions 1-7

Validators now coerce in place so downstream code sees ints regardless of
the wire format. Added 8 tests (4 unit on coerce_int + 4 per-subagent
contract tests showing stringified values are accepted).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 14:03:34 +02:00
Christian Moellmann 6a53c110b7 feat(interviews): capture raw LLM output on schema-validation failures
Adds SchemaValidationFailure exception carrying both retry attempts' raw
output, so audit.jsonl preserves what the model actually said when an
agent's response can't be coerced into the instrument schema. Lets us
diagnose persona-vs-format failures without re-running. Two new tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 13:40:43 +02:00
Christian Moellmann 6e1489fe08 fix(interviews): wire Zep updater/memory/hooks correctly for production runs (C1-C5)
Five tightly-coupled fixes that were causing the interview subsystem to silently
degrade in production:

- C1+C2: `_build_orchestrator` now resolves `graph_id` from
  `SimulationManager().get_simulation(sim_id).graph_id` (the real persisted
  state) instead of a `graph_id.txt` that nothing in the codebase writes.
  `ZepGraphMemoryUpdater(graph_id=...)` is now called with the correct
  positional argument; the bare `try/except Exception` that was swallowing the
  TypeError is replaced with a narrow fallback that logs explicitly.
- C3: `SimulationManager._on_ready_hooks` / `_on_completed_hooks` are now
  class-level (mirroring `SimulationRunner._on_completed_callbacks`).
  Hooks registered at app startup now survive across the per-request
  `SimulationManager()` instances created by the Flask API, so the T0
  longitudinal auto-survey actually fires.
- C4: `ZepGraphMemoryUpdater` gains an explicit `add_text_episode(graph_id, text)`
  method for synchronous text writes. `InterviewZepWriter._emit` no longer
  silently falls back to a dict-shaped `add_activity` call that the real
  implementation rejects (its `add_activity` requires an `AgentActivity`
  dataclass).
- C5: `FileSystemPersonaProvider.agent_to_entity()` builds an
  `{agent_id: zep_entity_uuid}` map from the persisted profile files; the map
  is now passed to `ZepMemoryProvider` so `get_entity_with_context` is called
  with real Zep UUIDs instead of `str(agent_id)`. To make this work,
  `OasisProfileGenerator._save_reddit_json` and `_save_twitter_csv` now persist
  `source_entity_uuid` (Reddit JSON: optional field; Twitter CSV: appended
  column).

Tests: 51 unit + 2 integration pass (was 40 + 2). New tests lock in each fix:
- `test_hooks_survive_across_instances` (C3)
- `test_build_orchestrator_reads_graph_id_from_state` (C1+C2+C5)
- `test_build_orchestrator_falls_back_when_state_missing` (C1+C2)
- `test_emit_uses_add_text_episode_with_graph_id`,
  `test_emit_raises_when_updater_lacks_add_text_episode`,
  `test_real_updater_exposes_add_text_episode` (C4)
- `test_agent_to_entity_from_reddit_json`,
  `test_agent_to_entity_empty_when_no_field`,
  `test_agent_to_entity_falls_back_to_twitter_csv`,
  `test_agent_to_entity_reddit_takes_precedence` (C5)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 13:27:47 +02:00
Christian Moellmann 6b04ea5c27 feat(interviews): auto-trigger lifecycle hooks + bridge SimulationRunner→Manager on COMPLETED
- Add backend/app/services/interviews/lifecycle.py with install_hooks() that
  registers on_ready (pre-survey) and on_completed (post-survey + synthesis)
  daemon-thread callbacks on a SimulationManager.
- Add SimulationRunner.register_on_completed() / _fire_on_completed() so
  external callbacks can be notified when _monitor_simulation transitions to
  COMPLETED (both exit-code-0 path and simulation_end event path).
- Wire both in app/__init__.py: create singleton SimulationManager, install
  lifecycle hooks, and register its _notify_on_completed with SimulationRunner.
- Add test_lifecycle.py: verifies install_hooks registers one callable for each
  of ready and completed.
- All 40 unit tests + 2 integration tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:51:13 +02:00
Christian Moellmann acaa06170e feat(interviews): d3 visualisations for longitudinal Δ, diversity PCA, Delphi, scenario polarity, synthesis
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:47:34 +02:00
Christian Moellmann fede66cac3 feat(interviews): Step4b Vue scaffold with five-tab navigation, API client, i18n keys
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:44:37 +02:00
Christian Moellmann 61f13a806d test(interviews): end-to-end pipeline test + content-aware LLM stubs for all 4 subagents
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:40:53 +02:00
Christian Moellmann 52bae0a3da feat(interviews): Flask blueprint /api/interview with task-based async + CSV export
Add /api/interview blueprint with POST pre/post/rerun, GET status/results/synthesis/export.csv endpoints. Background tasks tracked by UUID in module-level dict. Add register_blueprints() helper to api/__init__.py and wire app factory through it. Add UPLOADS_DIR to Config with env-override default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:37:04 +02:00
Christian Moellmann bc07170dbf feat(interviews): persona + Zep memory adapters bridging existing services to interview subsystem
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:34:03 +02:00
Christian Moellmann d79c81d2b7 feat(interviews): synthesiser emits cross-method report + tidy CSV + limitations section
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:32:14 +02:00
Christian Moellmann 3322bcb20c feat(interviews): on_ready / on_completed hook registry on SimulationManager
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:29:30 +02:00
Christian Moellmann b3e2039817 feat(interviews): orchestrator with two-phase lifecycle, parallel fan-out, isolated failures
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:27:47 +02:00
Christian Moellmann cca67365b9 feat(interviews): Zep writer adapts add_activity/add_text_episode for per-agent + aggregate episodes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:24:33 +02:00
Christian Moellmann 998cf1ac27 feat(interviews): JSONL/JSON storage layout with run_id directories and latest pointer
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:23:06 +02:00
Christian Moellmann ae4941df8e feat(interviews): scenario subagent with 4 futures × 4 dimensions + polarity matrix
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:21:21 +02:00
Christian Moellmann 5d7111b54e feat(interviews): Delphi subagent (3 rounds: open, rate, revise) + convergence metrics
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:19:07 +02:00
Christian Moellmann 75762ccc18 feat(interviews): diversity subagent with Q-sort + 6 Likert axes + PCA/k-means typology
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:16:21 +02:00
Christian Moellmann 0fcb815cde feat(interviews): longitudinal subagent + 12-item Likert instrument
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:12:46 +02:00
Christian Moellmann 289a0cff56 feat(interviews): StakeholderInterviewer base with in-character prompting and schema retry
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:10:01 +02:00
Christian Moellmann eb3c3629c1 feat(interviews): LLM stub mode for deterministic CI tests
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:08:29 +02:00
Christian Moellmann 29be754ff4 feat(interviews): YAML instrument loader with pydantic validation and hash freezing
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:06:52 +02:00
Christian Moellmann f1898b4eac feat(interviews): add pydantic models for instruments and responses
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:04:45 +02:00
Christian Moellmann 071f8b5c4c feat(interviews): add interview config keys (token budget, workers, language, stub mode)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:02:33 +02:00
Christian Moellmann f63bc5542a chore(interviews): add deps and pytest scaffold for interview subsystem
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:00:09 +02:00
Christian Moellmann 815e4758b2 docs(plan): stakeholder interview subagents implementation plan
Bite-sized TDD plan covering 21 tasks across 7 phases: setup → foundation
(models, YAML loader, LLM stub, base interviewer) → 4 subagents
(longitudinal, diversity Q-sort+PCA, Delphi 3-round, scenario) → storage
+ Zep writer → orchestrator + sim lifecycle hooks + synthesiser →
Flask /api/interview blueprint → end-to-end integration test → Vue Step4b
with d3 visualisations. Each task lists exact files, failing test code,
implementation code, run commands, and commit message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:19:02 +02:00
Christian Moellmann bf058080ac docs(spec): stakeholder interview subagents design
Approved design for a four-subagent post-simulation interview system
(Longitudinal, Diversity, Delphi, Scenario) over MiroFish-simulated
German fisheries stakeholders, with cross-method synthesiser. Includes
architecture, instrument design, data flow, API surface, error handling,
validation, testing, and methodological caveats.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 10:53:28 +02:00
YunyueLi faa151131c docs(readme): fix typo in Shanda logo alt text
The Shanda sponsor logo's alt text was `666ghj%2MiroFish | Shanda`,
missing the `F` from the URL-encoded `/`. Every other badge in both
READMEs uses the correct `666ghj%2FMiroFish`. Bring this one in line
with the rest.
2026-05-23 16:30:55 +08:00
lllopic daec4b6be4 refactor: add type hints and FileParser.is_supported() helper
- Add return type annotation (list[str]) to Config.validate()
- Add type annotations (msg: str, -> None) to logger convenience functions
- Add FileParser.is_supported() classmethod for checking file format support
2026-05-23 14:57:46 +08:00
666ghj fa0f6519b1 docs: rename README-EN.md to README.md as default English documentation 2026-04-02 16:52:29 +08:00
666ghj 0e9420e0f8 docs: rename README.md to README-ZH.md for Chinese documentation 2026-04-02 16:52:29 +08:00
BaiFu 7d07fb7f03
Merge pull request #440 from Ghostubborn/fix/security-deps
fix(security): 修复前端 3 个高危依赖漏洞
2026-04-02 15:17:46 +08:00
ghostubborn 223b283da7 fix(security): upgrade axios, rollup, picomatch to fix 3 high severity vulnerabilities 2026-04-02 15:00:33 +08:00
BaiFu af71244974
Merge pull request #428 from Ghostubborn/feat/i18n
feat(i18n): 添加多语言切换功能,支持中英文
2026-04-02 14:27:04 +08:00
ghostubborn ed465908db fix(i18n): set HTML lang attribute before Vue mounts via inline script 2026-04-02 14:21:09 +08:00
ghostubborn 65df257e19 chore(deps): upgrade vue-i18n from v9 to v11 2026-04-02 14:20:50 +08:00
ghostubborn f2404903d6 fix(i18n): validate Accept-Language header against registered locales 2026-04-02 14:20:15 +08:00
ghostubborn 2421010fe1 fix(i18n): fix English workflow desc font size with correct CSS selectors 2026-04-01 19:11:22 +08:00
ghostubborn 3929c3ade2 fix(i18n): further shorten English metrics and improve workflow layout 2026-04-01 19:07:19 +08:00
ghostubborn 21922da6cc fix(i18n): improve English layout for homepage left-pane and report title
- Add sans-serif font for English left-pane (status, workflow sections)
- Shorten English workflow step descriptions
- Reduce English report title font-size from 36px to 28px
2026-04-01 19:04:38 +08:00
ghostubborn c6cafdd532 fix(i18n): translate world1/world2 platform labels in interview tool display 2026-04-01 18:38:22 +08:00
ghostubborn 5072a2eaa8 feat(i18n): replace Chinese UI text in Step4Report.vue render functions
Only UI display text is replaced. Regex parsing patterns are kept as-is
since they match the backend output format.
2026-04-01 18:35:18 +08:00
ghostubborn 6db3f98a48 fix(i18n): fix English homepage layout with proper font and shorter copy
- Use sans-serif font for English titles, descriptions and navbar
- Shorten English hero text to avoid overflow
- Fix :global() scoped CSS issue that was setting root font-size to 3.5rem
- Use separate unscoped style block for html[lang] selectors
2026-04-01 18:04:05 +08:00
ghostubborn 24e9bee5be feat(i18n): replace all user-visible Chinese logger messages in zep_tools.py
These are shown to users via ConsoleLogger in the report page.
2026-04-01 17:46:39 +08:00
ghostubborn e79569ab4f feat(i18n): replace all user-visible Chinese in report_agent.py
Covers ReportLogger message fields and logger messages shown via ConsoleLogger.
2026-04-01 17:44:52 +08:00
ghostubborn 1d358fc492 feat(i18n): replace expand/collapse Chinese text in Step4Report.vue 2026-04-01 17:44:45 +08:00
666ghj e3350a919d fix(graph): enforce PascalCase for entity names and SCREAMING_SNAKE_CASE for edge names in ontology validation 2026-04-01 17:42:27 +08:00