fix: address agent boundary / JSON robustness / Phase 4 no-op from CFReDS run

Issues found running the system end-to-end on the NIST CFReDS Hacking Case
disk image (SCHARDT.001, Mr. Evil). Four interconnected fixes:

1. HypothesisAgent boundary leak (two layers)

   B.1 Tool set: BaseAgent._register_graph_tools was registering
   add_phenomenon / add_lead / link_to_entity for every agent. With an empty
   graph in Phase 2, HypothesisAgent "compensated" by inventing phenomena,
   dispatching leads, and linking entities.

   B.2 Prompt leak: BaseAgent's shared system prompt hard-coded "Call
   investigation tools (list_directory, parse_registry_key, etc.)".
   HypothesisAgent hallucinated list_directory and wasted 2 LLM rounds on
   'unknown tool' errors before backing off.

   Fix:
   - Split _register_graph_tools into _register_graph_read_tools +
     _register_graph_write_tools.
   - HypothesisAgent, ReportAgent, TimelineAgent override
     _register_graph_tools to skip write tools.
   - HypothesisAgent and TimelineAgent override _build_system_prompt with
     focused, role-specific workflows (no Phase A-D investigation
     boilerplate).

2. JSON parse failures in Phase 3 lead generation (5/6 hypotheses lost)

   DeepSeek emits JSON with stray backslashes (Windows path references) and
   occasional minor syntax slips. The old single-stage sanitize couldn't
   recover; the per-hypothesis fallback silently swallowed each failure.

   Fix:
   - _safe_json_loads: progressive. Stage 0 parses as-is; stage 1 escapes
     stray \X (anything not in the valid JSON escape set); the raw input is
     logged on final failure for diagnosis.
   - New _call_llm_for_json helper: on parse failure, append the error to
     the prompt and re-call the LLM (self-correcting retry, up to 2).
   - All 4 LLM-JSON callsites in orchestrator refactored to use it.

3. Phase 1 sometimes skipped add_phenomenon (LLM treated <answer> as
   deliverable)

   Strengthen BaseAgent's RECORDING REQUIREMENT: explicit "your <answer> is
   DISCARDED; only graph mutations propagate" plus a new rule: negative
   findings (searched X, found nothing) MUST also be recorded as phenomena,
   since they constrain the hypothesis space.

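The progressive parse and self-correcting retry from fix 2 can be sketched roughly as below. This is an illustrative sketch only, not the code in this commit: the function names, the synchronous `call_llm` callable, and the wording of the retry message are assumptions, and logging of the raw input is omitted.

```python
import json
import re

# A backslash followed by either a full \uXXXX escape or any single character.
_ESCAPE = re.compile(r'\\(u[0-9a-fA-F]{4}|.)')


def escape_stray_backslashes(text):
    """Double every backslash that does not start a valid JSON escape."""
    def fix(match):
        tail = match.group(1)
        # Keep \uXXXX and the single-character escapes JSON allows.
        if len(tail) == 5 or tail in '"\\/bfnrt':
            return match.group(0)
        return "\\\\" + tail  # stray: emit an escaped backslash, keep the char
    return _ESCAPE.sub(fix, text)


def safe_json_loads_sketch(text):
    """Stage 0: parse as-is. Stage 1: escape stray backslashes and retry."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    return json.loads(escape_stray_backslashes(text))


def call_llm_for_json_sketch(call_llm, prompt, max_retries=2):
    """Call the model; on a parse failure, feed the error back and re-call."""
    attempt_prompt = prompt
    last_error = None
    for _ in range(max_retries + 1):
        raw = call_llm(attempt_prompt)
        try:
            return safe_json_loads_sketch(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
            attempt_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({exc}). Reply with valid JSON only."
            )
    raise last_error
```

One limitation of this kind of repair: a path segment such as `\temp` already starts with a valid JSON escape (`\t`), so it parses at stage 0 and silently becomes a tab rather than being repaired.
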
4. Phase 4 Timeline was a no-op

   TimelineAgent inherited BaseAgent's Phase A-D prompt and never called
   add_temporal_edge, so it produced 0 temporal edges. Override the prompt
   with a concrete workflow (build_filesystem_timeline ->
   get_timestamped_phenomena -> 15-40 add_temporal_edge calls) and restrict
   the tool set to read-only + its 3 temporal tools.

Verified end-to-end: HypothesisAgent now 8 tools (no writes), ReportAgent 13
(no graph writes), TimelineAgent 10 (read + temporal + timeline). All 60
unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:14:16 +08:00
parent 0a966d8476
commit 893f5b5de2
5 changed files with 251 additions and 82 deletions
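
The base_agent.py change splits graph tool registration into a read half and a write half so that a subclass can opt out of mutations. A minimal sketch of that override pattern follows; the class names, the simplified `register_tool` signature, and the placeholder executors are assumptions for illustration, and the real subclasses live in files not shown in this excerpt.

```python
class BaseAgentSketch:
    """Stand-in for BaseAgent with a simplified tool registry."""

    def __init__(self):
        self.tools = {}
        self._register_graph_tools()

    def register_tool(self, name, executor):
        # The real register_tool also takes description and input_schema.
        self.tools[name] = executor

    def _register_graph_tools(self):
        # Default: every agent gets both read and write tools.
        self._register_graph_read_tools()
        self._register_graph_write_tools()

    def _register_graph_read_tools(self):
        for name in ("list_phenomena", "list_assets", "find_extracted_file"):
            self.register_tool(name, lambda **kwargs: "")

    def _register_graph_write_tools(self):
        for name in ("add_phenomenon", "add_lead", "link_to_entity"):
            self.register_tool(name, lambda **kwargs: "")


class ReadOnlyAgentSketch(BaseAgentSketch):
    """Hypothetical read-only agent: skips the write tools entirely."""

    def _register_graph_tools(self):
        self._register_graph_read_tools()


if __name__ == "__main__":
    print(sorted(ReadOnlyAgentSketch().tools))
```

Because the write tools are never registered, the model cannot call them at all, which is a stronger guarantee than a prompt instruction telling it not to.
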
--- a/base_agent.py
+++ b/base_agent.py
@@ -93,12 +93,20 @@ class BaseAgent:
            f"  NEVER guess or fabricate a phenomenon ID. If an ID is not in list_phenomena output, it does not exist.\n\n"
            f"Phase D — ANSWER:\n"
            f"  Only give your <answer> AFTER completing Phases B and C.\n\n"
-            f"IMPORTANT:\n"
-            f"- You MUST call add_phenomenon at least once before finishing\n"
-            f"- Complete each phase before starting the next\n"
-            f"- Other agents can ONLY see what you write to the graph\n"
-            f"- If you don't record findings, they are LOST\n"
-            f"- Include relevant file paths, inode numbers, timestamps, and raw data\n\n"
+            f"CRITICAL — RECORDING REQUIREMENT:\n"
+            f"- Your <answer> block is DISCARDED by the orchestrator. Only graph mutations propagate.\n"
+            f"- Other agents and the final report read ONLY the evidence graph "
+            f"(phenomena, entities, edges).\n"
+            f"- You MUST call add_phenomenon for EVERY significant finding BEFORE you end.\n"
+            f"- NEGATIVE findings count too. If you searched X (a directory, a pattern, "
+            f"a registry key) and found NOTHING, that absence IS evidence — call "
+            f"add_phenomenon with a 'No matches for X' title and the search scope in "
+            f"raw_data. Negative findings constrain the hypothesis space and prevent "
+            f"the next agent from wasting time re-searching.\n"
+            f"- If you produce <answer> without having called add_phenomenon at least once, "
+            f"the task is FAILED regardless of what you wrote in <answer>.\n"
+            f"- Include exact file paths, inode numbers, timestamps, and the source_tool "
+            f"that produced each finding.\n\n"
            f"ANTI-HALLUCINATION RULES — STRICTLY ENFORCED:\n"
            f"- ONLY record findings that appear VERBATIM in tool results you received\n"
            f"- NEVER invent or guess timestamps, file paths, inode numbers, or program names\n"
@@ -145,9 +153,17 @@ class BaseAgent:
    # ---- Graph interaction tools --------------------------------------------

    def _register_graph_tools(self) -> None:
-        """Register tools for querying and writing to the evidence graph."""
+        """Register graph query + mutation tools.

-        # --- Read tools ---
+        Subclasses can override to restrict the toolset. For example, a
+        read-only agent (hypothesis, report) overrides this to skip
+        _register_graph_write_tools.
+        """
+        self._register_graph_read_tools()
+        self._register_graph_write_tools()
+
+    def _register_graph_read_tools(self) -> None:
+        """Register read-only graph + asset query tools."""

        self.register_tool(
            name="list_phenomena",
@@ -213,7 +229,49 @@ class BaseAgent:
            executor=self._get_hypothesis_status,
        )

-        # --- Write tools ---
+        self.register_tool(
+            name="list_assets",
+            description=(
+                "List all files extracted from the disk image. "
+                "Shows filename, category, size, local path, and inode. "
+                "Check this before calling extract_file to avoid re-extraction."
+            ),
+            input_schema={
+                "type": "object",
+                "properties": {
+                    "category": {
+                        "type": "string",
+                        "enum": [
+                            "registry_hive", "chat_log", "prefetch", "network_capture",
+                            "config_file", "address_book", "recycle_bin", "executable",
+                            "text_log", "other",
+                        ],
+                        "description": "Filter by category. Omit to list all.",
+                    },
+                },
+            },
+            executor=self._list_assets,
+        )
+
+        self.register_tool(
+            name="find_extracted_file",
+            description=(
+                "Find an already-extracted file by inode or filename. "
+                "Returns the local path so you can use it directly with "
+                "parse_registry_key, read_text_file, etc. without re-extracting."
+            ),
+            input_schema={
+                "type": "object",
+                "properties": {
+                    "inode": {"type": "string", "description": "Inode to look up."},
+                    "filename": {"type": "string", "description": "Filename or partial name to search."},
+                },
+            },
+            executor=self._find_extracted_file,
+        )
+
+    def _register_graph_write_tools(self) -> None:
+        """Register graph mutation tools (add_phenomenon, add_lead, link_to_entity)."""

        self.register_tool(
            name="add_phenomenon",
@@ -282,49 +340,6 @@ class BaseAgent:
            executor=self._link_to_entity,
        )

-        # --- Asset library tools ---
-
-        self.register_tool(
-            name="list_assets",
-            description=(
-                "List all files extracted from the disk image. "
-                "Shows filename, category, size, local path, and inode. "
-                "Check this before calling extract_file to avoid re-extraction."
-            ),
-            input_schema={
-                "type": "object",
-                "properties": {
-                    "category": {
-                        "type": "string",
-                        "enum": [
-                            "registry_hive", "chat_log", "prefetch", "network_capture",
-                            "config_file", "address_book", "recycle_bin", "executable",
-                            "text_log", "other",
-                        ],
-                        "description": "Filter by category. Omit to list all.",
-                    },
-                },
-            },
-            executor=self._list_assets,
-        )
-
-        self.register_tool(
-            name="find_extracted_file",
-            description=(
-                "Find an already-extracted file by inode or filename. "
-                "Returns the local path so you can use it directly with "
-                "parse_registry_key, read_text_file, etc. without re-extracting."
-            ),
-            input_schema={
-                "type": "object",
-                "properties": {
-                    "inode": {"type": "string", "description": "Inode to look up."},
-                    "filename": {"type": "string", "description": "Filename or partial name to search."},
-                },
-            },
-            executor=self._find_extracted_file,
-        )
-
    # ---- Tool executors -----------------------------------------------------

    async def _list_phenomena(self, category: str | None = None) -> str: