fix: remove response_format=json_object from chat_json, increase ontology max_tokens

Bug 1: chat_json() passed response_format={'type': 'json_object'} to
the LLM, which constrains output to JSON grammar from the first token.
Reasoning models (Qwen3, DeepSeek-R1, etc.) emit a <think>...</think>
block before the JSON, so the constraint garbled their output. The fix
removes the response_format parameter: the system prompt already
requests JSON output, and the existing <think> cleanup strips any
remaining tags.
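
For illustration, the cleanup path that chat_json() now relies on could
look roughly like the sketch below. It is not the repository's code:
parse_json_reply, the regexes, and the sample reply are hypothetical,
and the real cleanup in LLMClient may differ in detail.

```python
import json
import re
from typing import Any

# Built at runtime so the example never contains a literal code fence.
FENCE = "`" * 3

# Hypothetical patterns; the repository's own cleanup may differ.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)
FENCE_MARKERS = re.compile(rf"^{FENCE}(?:json)?\s*|\s*{FENCE}\s*$", re.IGNORECASE)


def parse_json_reply(raw: str) -> Any:
    """Drop reasoning blocks and markdown fence markers, then parse the JSON."""
    text = THINK_BLOCK.sub("", raw).strip()
    text = FENCE_MARKERS.sub("", text).strip()
    return json.loads(text)


# A reply shaped like a reasoning model's output (sample data, not a real log).
reply = (
    "<think>The user wants entity types.</think>\n"
    + FENCE + "json\n"
    + '{"entity_types": ["Person"]}\n'
    + FENCE
)
parsed = parse_json_reply(reply)
print(parsed)
```

Because the <think> block is removed after generation rather than
forbidden during it, the model can reason freely and the caller still
receives plain JSON.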

Bug 2: ontology_generator hardcoded max_tokens=4096, which truncated
long responses even on models whose context windows allow more output.
Increased to 16384 to accommodate reasoning model outputs.
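
A truncated reply usually shows up as JSON that no longer parses. The
sketch below is one way a caller might detect that case. It assumes an
OpenAI-compatible API, where finish_reason is "length" when the
max_tokens cap was hit; looks_truncated is a hypothetical helper, not
part of this commit.

```python
import json
from typing import Optional


def looks_truncated(raw: str, finish_reason: Optional[str] = None) -> bool:
    """Heuristic: True if the reply hit the token cap or is not valid JSON."""
    # Assumption: OpenAI-compatible APIs report "length" when max_tokens is reached.
    if finish_reason == "length":
        return True
    try:
        json.loads(raw)
    except json.JSONDecodeError:
        # Invalid JSON is only a hint of truncation, not proof.
        return True
    return False


# Sample data: a reply cut off mid-string, as a 4096-token cap might produce.
cut_off = '{"entity_types": ["Per'
print(looks_truncated(cut_off))
print(looks_truncated('{"entity_types": ["Person"]}'))
```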

Fixes #642
Md_Mushfiqur Rahim 2026-05-27 02:36:58 +00:00
parent 96096ea0ff
commit 0a3272197b
2 changed files with 1 addition and 2 deletions


@@ -217,7 +217,7 @@ class OntologyGenerator:
         result = self.llm_client.chat_json(
             messages=messages,
             temperature=0.3,
-            max_tokens=4096
+            max_tokens=16384
         )
         # Validate and post-process


@@ -88,7 +88,6 @@ class LLMClient:
             messages=messages,
             temperature=temperature,
             max_tokens=max_tokens,
-            response_format={"type": "json_object"}
         )
         # Strip markdown code-fence markers
         cleaned_response = response.strip()