fix: address agent boundary / JSON robustness / Phase 4 no-op from CFReDS run

Issues found running the system end-to-end on the NIST CFReDS Hacking Case
disk image (SCHARDT.001, Mr. Evil). Four interconnected fixes:

1. HypothesisAgent boundary leak (two layers)

   B.1 Tool set: BaseAgent._register_graph_tools was registering
       add_phenomenon / add_lead / link_to_entity for every agent. With an
       empty graph in Phase 2, HypothesisAgent "compensated" by inventing
       phenomena, dispatching leads, and linking entities.

   B.2 Prompt leak: BaseAgent's shared system prompt hard-coded "Call
       investigation tools (list_directory, parse_registry_key, etc.)".
       HypothesisAgent hallucinated list_directory and wasted 2 LLM rounds
       on 'unknown tool' errors before backing off.

   Fix:
   - Split _register_graph_tools into _register_graph_read_tools +
     _register_graph_write_tools.
   - HypothesisAgent, ReportAgent, TimelineAgent override
     _register_graph_tools to skip write tools.
   - HypothesisAgent and TimelineAgent override _build_system_prompt with
     focused, role-specific workflows (no Phase A-D investigation
     boilerplate).

2. JSON parse failures in Phase 3 lead generation (5/6 hypotheses lost)

   DeepSeek emits JSON with stray backslashes (Windows path references) and
   occasional minor syntax slips. Old single-stage sanitize couldn't
   recover; per-hypothesis fallback silently swallowed each failure.

   Fix:
   - _safe_json_loads: progressive. Stage 0 parses as-is, stage 1 escapes
     stray \X (anything not in the valid JSON escape set), and the raw
     input is logged on final failure for diagnosis.
   - New _call_llm_for_json helper: on parse failure, append the error to
     the prompt and re-call LLM (self-correcting retry, up to 2).
   - All 4 LLM-JSON callsites in orchestrator refactored to use it.

3. Phase 1 sometimes skipped add_phenomenon (LLM treated <answer> as
   deliverable)

   Strengthen BaseAgent's RECORDING REQUIREMENT: explicit "your <answer> is
   DISCARDED; only graph mutations propagate" plus a new rule: negative
   findings (searched X, found nothing) MUST also be recorded as phenomena,
   since they constrain the hypothesis space.

4. Phase 4 Timeline was a no-op

   TimelineAgent inherited BaseAgent's Phase A-D prompt and never called
   add_temporal_edge, so it produced 0 temporal edges. Override the prompt
   with concrete workflow (build_filesystem_timeline ->
   get_timestamped_phenomena -> 15-40 add_temporal_edge calls) and restrict
   tool set to read-only + its 3 temporal tools.

Verified end-to-end: HypothesisAgent now 8 tools (no writes), ReportAgent 13
(no graph writes), TimelineAgent 10 (read + temporal + timeline).
All 60 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
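The two-stage parse and the self-correcting retry described under fix 2 can be sketched roughly as follows. This is an illustration only, not the code from this commit: the names safe_json_loads and call_llm_for_json, the synchronous llm callable, and the wording of the retry message are all assumptions, and the real helpers are not shown here.

```python
import json
import logging
import re
from typing import Any, Callable

logger = logging.getLogger(__name__)

# Matches a backslash plus what follows it: either a full \uXXXX escape
# or a single character. Scanning in pairs keeps a valid "\\" intact.
_ESCAPE = re.compile(r"\\(u[0-9a-fA-F]{4}|.)", re.DOTALL)


def _escape_stray(match: re.Match) -> str:
    body = match.group(1)
    if len(body) == 5 or body in '"\\/bfnrt':
        return match.group(0)      # already a valid JSON escape
    return "\\\\" + body           # stray backslash: double it


def safe_json_loads(text: str) -> Any:
    """Stage 0: parse as-is. Stage 1: escape stray backslashes, parse again."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    try:
        return json.loads(_ESCAPE.sub(_escape_stray, text))
    except json.JSONDecodeError:
        logger.warning("JSON parse failed after repair; raw input: %r", text)
        raise


def call_llm_for_json(
    llm: Callable[[str], str], prompt: str, max_retries: int = 2
) -> Any:
    """Call the (hypothetical) llm callable; on parse failure, re-ask with the error."""
    current = prompt
    for attempt in range(max_retries + 1):
        raw = llm(current)
        try:
            return safe_json_loads(raw)
        except json.JSONDecodeError as exc:
            if attempt == max_retries:
                raise
            # Feed the parser error back so the model can correct itself.
            current = (
                f"{prompt}\n\nYour previous reply was not valid JSON ({exc}). "
                "Reply again with valid JSON only."
            )
```

One limitation of any repair of this kind: a Windows path such as "C:\temp" already parses at stage 0, because \t is a legal JSON escape, so the repair stage never sees it and the value silently contains a tab.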
2026-05-12 17:14:16 +08:00
parent 0a966d8476
commit 893f5b5de2
5 changed files with 251 additions and 82 deletions
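The read/write split from fix 1 follows a plain override pattern, sketched below under stated assumptions. The two-argument register_tool, the ReadOnlyAgent class, and the assumption that BaseAgent.__init__ calls _register_graph_tools are simplifications for illustration; the real register_tool also takes an input schema and an executor, and the real BaseAgent is not part of this diff.

```python
from __future__ import annotations


class BaseAgent:
    """Simplified stand-in for the project's BaseAgent (hypothetical)."""

    def __init__(self) -> None:
        self.tools: dict[str, str] = {}
        # Assumed: the base constructor registers graph tools via this hook.
        self._register_graph_tools()

    def register_tool(self, name: str, description: str) -> None:
        self.tools[name] = description

    def _register_graph_read_tools(self) -> None:
        self.register_tool("list_phenomena", "Read phenomena from the graph.")

    def _register_graph_write_tools(self) -> None:
        for name in ("add_phenomenon", "add_lead", "link_to_entity"):
            self.register_tool(name, "Mutate the evidence graph.")

    def _register_graph_tools(self) -> None:
        # Default: investigating agents get both read and write tools.
        self._register_graph_read_tools()
        self._register_graph_write_tools()


class ReadOnlyAgent(BaseAgent):
    """Agent that may inspect the graph but not mutate it."""

    def _register_graph_tools(self) -> None:
        # Skip the write tools so the LLM is never offered them.
        self._register_graph_read_tools()
```

Because the write tools are never registered for the read-only subclass, the model cannot call them at all, which is a stronger guarantee than a prompt instruction telling it not to.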
--- a/agents/timeline.py
+++ b/agents/timeline.py
@@ -1,14 +1,21 @@
-"""Timeline Agent — correlates evidence across time."""
+"""Timeline Agent — connects existing phenomena with temporal edges.
+
+Operates on phenomena already in the graph. Does NOT investigate the disk
+image itself. The agent's only useful output is the temporal edges it
+creates between phenomena.
+"""

 from __future__ import annotations

-import json
+import logging

 from base_agent import BaseAgent
 from evidence_graph import EvidenceGraph
 from llm_client import LLMClient
 from tool_registry import TOOL_CATALOG

+logger = logging.getLogger(__name__)
+

 class TimelineAgent(BaseAgent):
    name = "timeline"
@@ -22,24 +29,33 @@ class TimelineAgent(BaseAgent):
        super().__init__(llm, graph)
        self._register_tools()

+    def _register_graph_tools(self) -> None:
+        """Restrict to read-only graph tools — Timeline does not add phenomena."""
+        self._register_graph_read_tools()
+
    def _register_tools(self) -> None:
-        # Filesystem timeline tool from catalog
        td = TOOL_CATALOG.get("build_filesystem_timeline")
        if td:
            self.register_tool(td.name, td.description, td.input_schema, td.executor)

-        # Custom tool to get all phenomena with timestamps for correlation
        self.register_tool(
            name="get_timestamped_phenomena",
-            description="Get all phenomena that have timestamps, sorted chronologically. Use for timeline correlation.",
+            description=(
+                "Get all phenomena that have timestamps, sorted chronologically. "
+                "Returns each phenomenon's id, category, title, and a short description "
+                "preview. Use this as your primary input for temporal correlation."
+            ),
            input_schema={"type": "object", "properties": {}},
            executor=self._get_timestamped_phenomena,
        )

-        # Tool to add temporal edges between phenomena
        self.register_tool(
            name="add_temporal_edge",
-            description="Add a temporal relationship between two phenomena (before, after, or concurrent).",
+            description=(
+                "Add a temporal relationship edge between two existing phenomena. "
+                "Use 'before' when source phenomenon happened before target, "
+                "'concurrent' when they occurred within seconds of each other."
+            ),
            input_schema={
                "type": "object",
                "properties": {
@@ -56,6 +72,42 @@ class TimelineAgent(BaseAgent):
            executor=self._add_temporal_edge,
        )

+    def _build_system_prompt(self, task: str) -> str:
+        """Focused prompt — Timeline connects existing phenomena, doesn't investigate."""
+        return (
+            f"You are {self.name}, a forensic timeline correlation analyst.\n"
+            f"Role: {self.role}\n\n"
+            f"Image: {self.graph.image_path}\n"
+            f"Current state: {self.graph.stats_summary()}\n\n"
+            f"Your task: {task}\n\n"
+            f"WORKFLOW:\n"
+            f"1. Call build_filesystem_timeline once to materialize MAC times for the disk.\n"
+            f"2. Call get_timestamped_phenomena to see all phenomena with timestamps, "
+            f"sorted chronologically. THIS IS YOUR PRIMARY INPUT.\n"
+            f"3. For each meaningful temporal relationship between phenomena, call "
+            f"add_temporal_edge(source_id, target_id, relation). Use 'before' when "
+            f"source happened first (the common case); 'concurrent' for events within "
+            f"a few seconds of each other.\n"
+            f"   Examples of meaningful connections:\n"
+            f"     - 'Cain installer executed' (before) 'Cain.exe first execution'\n"
+            f"     - 'WHOIS first lookup'      (before) 'WHOIS second lookup'\n"
+            f"     - 'Recon tool cluster'      (before) 'Anti-forensics defrag'\n"
+            f"     - 'Tool installation'       (before) 'Tool execution'\n"
+            f"4. Aim for 15-40 temporal edges that connect the major events into a "
+            f"forensic story.\n"
+            f"5. Wrap a short summary in <answer> when done.\n\n"
+            f"STRICT BOUNDARIES:\n"
+            f"- Your job is to CONNECT existing phenomena, NOT to discover new ones. "
+            f"You CANNOT call add_phenomenon — the tool isn't yours.\n"
+            f"- Use ONLY phenomenon IDs returned by get_timestamped_phenomena or "
+            f"list_phenomena. NEVER fabricate IDs.\n"
+            f"- Connect events that tell a forensic story (recon -> exploit -> cover-up). "
+            f"Do not exhaustively pair every two phenomena; focus on causally-relevant "
+            f"sequences.\n"
+            f"- The orchestrator handles report writing in the next phase. Your only "
+            f"output that propagates is the temporal edges you create."
+        )
+
    async def _get_timestamped_phenomena(self) -> str:
        items = [
            ph for ph in self.graph.phenomena.values()