fix: remove response_format=json_object from chat_json, increase ontology max_tokens
Bug 1: chat_json() passed response_format={'type': 'json_object'}
to the LLM, which constrains decoding to JSON grammar from the
first token. Reasoning models (Qwen3, DeepSeek-R1, etc.) emit a
<think>...</think> block before the JSON, so the constraint
garbled their output. The fix removes the response_format
parameter: the system prompt already requests JSON output, and
the existing <think> cleanup strips any remaining tags.
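The cleanup that Bug 1 relies on can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the helper name parse_llm_json and both regular expressions are assumptions, and it assumes the model returns a single top-level JSON value, optionally wrapped in a markdown fence.

```python
import json
import re

# Hypothetical helper for illustration; not taken from the repository.
_TICKS = "`" * 3  # markdown code-fence marker

# Matches a complete <think>...</think> block, including newlines inside it.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)

# Matches an opening fence (optionally tagged "json") at the start of the
# text, or a closing fence at the end of the text.
_FENCE_RE = re.compile(rf"^{_TICKS}(?:json)?\s*|\s*{_TICKS}$", re.IGNORECASE)


def parse_llm_json(raw: str):
    """Strip reasoning blocks and markdown fences, then parse the JSON."""
    text = _THINK_RE.sub("", raw).strip()
    text = _FENCE_RE.sub("", text).strip()
    # json.loads raises json.JSONDecodeError if the remainder is not JSON.
    return json.loads(text)
```

With a post-processing step like this, the reasoning block no longer has to be valid JSON, which is why the grammar constraint can be dropped.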
Bug 2: ontology_generator hardcoded max_tokens=4096, which
truncated the output of models that generate longer responses.
Increased to 16384 to accommodate reasoning model outputs.
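To illustrate why truncation matters for Bug 2: a JSON object cut off at the token limit is missing its closing brace and no longer parses. The sketch below uses a hypothetical helper, is_complete_json, which is not part of the repository.

```python
import json


def is_complete_json(text: str) -> bool:
    """Return True if text parses as JSON, False if it is cut off or malformed."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# A full response parses; the same response cut in half does not.
full = json.dumps({"entity_types": ["Person", "Organization"], "edge_types": ["WORKS_AT"]})
truncated = full[: len(full) // 2]
```

Here is_complete_json(full) is True and is_complete_json(truncated) is False, because any proper prefix of a JSON object is missing at least its closing brace.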
Fixes #642
parent 96096ea0ff
commit 0a3272197b
@@ -217,7 +217,7 @@ class OntologyGenerator:
         result = self.llm_client.chat_json(
             messages=messages,
             temperature=0.3,
-            max_tokens=4096
+            max_tokens=16384
         )
 
         # Validate and post-process
@@ -88,7 +88,6 @@ class LLMClient:
             messages=messages,
             temperature=temperature,
             max_tokens=max_tokens,
-            response_format={"type": "json_object"}
         )
         # Strip markdown code-fence markers
         cleaned_response = response.strip()