refactor: native tool calling + generic forced-retry + terminal exit
- llm_client: switch tool_call_loop from text-based <tool_call> regex to OpenAI-native tools=[...] / structured tool_calls field; accumulate delta.reasoning_content for DeepSeek thinking-mode echo-back; fold preserves system msg and aligns boundary to never orphan role:tool
- base_agent: generic forced-retry via mandatory_record_tools class attr (filesystem -> add_phenomenon, timeline -> add_temporal_edge, hypothesis -> add_hypothesis, report -> save_report); count via executor wrapper
- terminal_tools class attr + loop short-circuit: when a terminal tool is called, loop exits with its raw return as final_text. ReportAgent declares save_report as terminal - replaces the <answer>-tag stop signal that native tool calling broke
- _execute_*: return (raw, formatted) - terminal exit uses untruncated raw, conversation history uses 3000-char-capped formatted
- evidence_graph + orchestrator: LLM-derived InvestigationArea support (hypothesis-driven coverage check, replaces hardcoded _AREA_KEYWORDS / _AREA_TOOLS); manual yaml block kept as optional seed
- strip <answer> references from agent prompts (no longer load-bearing)

Verified on CFReDS image across 4 smoke runs: 0 JSON parse failures (was 3); 22 temporal edges from Phase 4 (was 0); ReportAgent exits via save_report (was max_iterations regression). 78/78 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
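The terminal-exit and (raw, formatted) behaviour described in the message can be illustrated with a minimal sketch. This is not the code from this commit: `run_loop`, `execute_tool`, `scripted_step`, and `TOOLS` are hypothetical stand-ins, and the LLM call is replaced by a scripted function. Only the names `terminal_tools`, `save_report`, `list_phenomena`, the `(raw, formatted)` return shape, and the 3000-character cap are taken from the message.

```python
MAX_FORMATTED_CHARS = 3000  # history cap quoted in the commit message


def execute_tool(tools, name, args):
    """Run one tool and return (raw, formatted).

    raw is the untruncated return value; formatted is capped for history.
    """
    raw = tools[name](**args)
    return raw, raw[:MAX_FORMATTED_CHARS]


def run_loop(step, tools, terminal_tools=(), max_iterations=10):
    """Hypothetical tool loop.

    step(history) stands in for the LLM call and returns (text, calls),
    where calls is a list of (call_id, tool_name, args) tuples.
    """
    history = []
    for _ in range(max_iterations):
        text, calls = step(history)
        if not calls:
            return text, history  # model answered without calling a tool
        for call_id, name, args in calls:
            raw, formatted = execute_tool(tools, name, args)
            history.append(
                {"role": "tool", "tool_call_id": call_id, "content": formatted}
            )
            if name in terminal_tools:
                return raw, history  # short-circuit: raw return is final_text
    return "", history  # iteration budget exhausted


# Scripted stand-in for the model: one lookup, then the terminal tool.
def scripted_step(history):
    if not history:
        return "", [("call_1", "list_phenomena", {})]
    return "", [("call_2", "save_report", {"body": "x" * 5000})]


TOOLS = {
    "list_phenomena": lambda: "2 phenomena",
    "save_report": lambda body: body,
}

final_text, history = run_loop(scripted_step, TOOLS, terminal_tools=("save_report",))
print(len(final_text), len(history[-1]["content"]))  # prints "5000 3000"
```

In this sketch the caller receives the full 5000-character report, while the conversation history keeps only the capped copy, which is the split the `_execute_*` change describes.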
@@ -24,6 +24,7 @@ class HypothesisAgent(BaseAgent):
         "and formulate investigative hypotheses about what happened on this system. "
         "Your ultimate goal: build the most complete picture of events that occurred."
     )
+    mandatory_record_tools = ("add_hypothesis",)
 
     def __init__(self, llm: LLMClient, graph: EvidenceGraph) -> None:
         super().__init__(llm, graph)
@@ -68,7 +69,7 @@ class HypothesisAgent(BaseAgent):
             f"WORKFLOW:\n"
             f"1. Call list_phenomena and search_graph to review existing findings.\n"
             f"2. For each hypothesis you want to record, call add_hypothesis (title + description).\n"
-            f"3. Wrap a short summary in <answer> when you have generated 3-7 hypotheses.\n\n"
+            f"3. STOP after you have generated 3-7 hypotheses. Do not call any more tools.\n\n"
             f"STRICT BOUNDARIES:\n"
             f"- Your only mutation tool is add_hypothesis. Do NOT attempt list_directory, "
             f"parse_registry_key, extract_file, or any disk-image investigation tools — "
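One way to read the `mandatory_record_tools` attribute added in the first hunk is sketched below. It is a hedged illustration, not the repository's `BaseAgent`: `ToyAgent`, `_wrap`, `run`, the nudge text, and the retry condition (retry while any mandatory tool has zero calls) are all assumptions. Only the attribute name, `add_hypothesis`, `list_phenomena`, and the idea of counting calls through an executor wrapper come from the commit.

```python
from collections import Counter


class ToyAgent:
    """Hypothetical base class; not the BaseAgent from this repository."""

    mandatory_record_tools = ()

    def __init__(self, tools):
        self.calls = Counter()
        # Wrap every executor so each invocation is counted by tool name.
        self.tools = {name: self._wrap(name, fn) for name, fn in tools.items()}

    def _wrap(self, name, fn):
        def counted(*args, **kwargs):
            self.calls[name] += 1
            return fn(*args, **kwargs)
        return counted

    def run(self, attempt, max_retries=2):
        """attempt(tools, nudge) stands in for one full LLM tool loop."""
        nudge = None
        for _ in range(max_retries + 1):
            attempt(self.tools, nudge)
            missing = [t for t in self.mandatory_record_tools if self.calls[t] == 0]
            if not missing:
                return True
            # Assumed retry prompt; the real wording is not shown in the diff.
            nudge = "You must call: " + ", ".join(missing)
        return False


class ToyHypothesisAgent(ToyAgent):
    mandatory_record_tools = ("add_hypothesis",)


recorded = []


def lazy_then_compliant(tools, nudge):
    # First pass only reads; after the nudge it records a hypothesis.
    tools["list_phenomena"]()
    if nudge is not None:
        tools["add_hypothesis"](title="H1", description="example")


agent = ToyHypothesisAgent({
    "list_phenomena": lambda: [],
    "add_hypothesis": lambda title, description: recorded.append(title),
})
ok = agent.run(lazy_then_compliant)
print(ok, agent.calls["add_hypothesis"], agent.calls["list_phenomena"])  # True 1 2
```

Declaring the requirement as a class attribute keeps the retry logic in one place, so each agent subclass only names the tool it must call.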