feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

Consolidates the long-running refit work (DESIGN.md as authoritative spec) into a single baseline commit. Six stages landed together:

S1  Case + EvidenceSource abstraction; tools parameterised by source_id (case.py, main.py multi-source bootstrap, .bin extension support).

S2  Grounding gateway in add_phenomenon: verified_facts cite real ToolInvocation ids; substring / normalised match enforced; agent + task scope checked. Phenomenon.description split into verified_facts (grounded) + interpretation (free text). [invocation: inv-xxx] prefix on every wrapped tool result so the LLM can cite.

S3  Confidence as additive log-odds: edge_type → log10(LR) calibration table; commutative updates; supported / refuted thresholds derived from log_odds; hypothesis × evidence matrix view.

S4  iOS plugin: unzip_archive + parse_plist / sqlite_tables / sqlite_query / parse_ios_keychain / read_idevice_info; IOSArtifactAgent; SOURCE_TYPE_AGENTS routing.

S5  Cross-source entity resolution: typed identifiers on Entity, observe_identity gateway, auto coref hypothesis with shared / conflicting strong/weak LR edges, reversible same_as edges, actor_clusters() view.

S6  Android partition probe + AndroidArtifactAgent; MediaAgent with OCR fallback; orchestrator Phase 1 iterates every analysable source; platform-aware get_triage_agent_type; ReportAgent renders actor clusters + per-source breakdown.

142 unit tests / 1 skipped; full coverage of the new gateway, log-odds math, coref hypothesis fall-out, and orchestrator multi-source dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 02:12:10 -10:00
parent 444d58726a
commit 81ade8f7ac
24 changed files with 5137 additions and 244 deletions
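The additive log-odds scheme named in S3 of the commit message can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the edge type names, the log10(LR) values, the thresholds, and the function names are hypothetical and are not taken from this commit or from DESIGN.md. The point is only that summing log10 likelihood ratios makes updates order-independent.

```python
import math

# Hypothetical calibration table: edge_type -> log10(likelihood ratio).
# The names and values are illustrative, not the ones used in the repository.
LOG10_LR = {
    "strong_support": 1.0,
    "weak_support": 0.5,
    "weak_contradict": -0.5,
    "strong_contradict": -1.0,
}

# Hypothetical decision thresholds, expressed directly in log10-odds.
SUPPORTED_AT = 1.5
REFUTED_AT = -1.5


def log_odds(edge_types, prior=0.0):
    """Sum the prior and every edge's log10(LR).

    math.fsum returns a correctly rounded sum, so the result does not
    depend on the order in which evidence edges arrive.
    """
    return math.fsum([prior, *(LOG10_LR[e] for e in edge_types)])


def probability(lo):
    """Convert log10-odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + 10.0 ** (-lo))


def status(lo):
    """Derive a hypothesis status purely from its log-odds."""
    if lo >= SUPPORTED_AT:
        return "supported"
    if lo <= REFUTED_AT:
        return "refuted"
    return "open"


if __name__ == "__main__":
    edges = ["strong_support", "weak_support", "weak_contradict"]
    lo = log_odds(edges)
    print(lo, status(lo), round(probability(lo), 3))
```

Because the status is derived from the running sum rather than stored separately, removing an edge (for example a reversible same_as link) amounts to subtracting its contribution and re-deriving the status.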
--- a/agents/media.py
+++ b/agents/media.py
@@ -0,0 +1,52 @@
+"""Media Agent — OCR-based analysis of screenshot/photo evidence.
+
+DESIGN.md §4.7: the LLM backend has no vision capability, so JPEG/PNG
+evidence must go through tesseract first. The agent runs OCR, then
+records extracted strings — especially identifiers (wallet addresses,
+phone numbers, usernames) — via the grounded observe_identity gateway so
+they participate in cross-source coref the same way iOS keychain entries
+or Windows account names do.
+
+If the OCR runtime is missing on the host, ocr_image returns an explicit
+install hint; the agent should record that as a negative finding ("no
+text extracted — tesseract not installed") rather than guessing.
+"""
+
+from __future__ import annotations
+
+from base_agent import BaseAgent
+from evidence_graph import EvidenceGraph
+from llm_client import LLMClient
+from tool_registry import TOOL_CATALOG
+
+
+class MediaAgent(BaseAgent):
+    name = "media"
+    role = (
+        "Media / OCR forensic analyst. You analyse screenshots, photos, and "
+        "scanned documents — any pixel-based evidence the LLM cannot read "
+        "directly. Workflow: list_extracted_dir to enumerate images, "
+        "ocr_image on each promising one, then add_phenomenon (with the "
+        "OCR'd text as the verified_fact value) and observe_identity for "
+        "any wallet addresses, phone numbers, email addresses, or "
+        "usernames the text contains. If OCR fails because tesseract is "
+        "missing, RECORD that as a negative finding instead of fabricating "
+        "image content — the absence is a real fact about this run."
+    )
+
+    def __init__(self, llm: LLMClient, graph: EvidenceGraph) -> None:
+        super().__init__(llm, graph)
+        self._register_tools()
+
+    def _register_tools(self) -> None:
+        tool_names = [
+            "ocr_image",
+            "list_extracted_dir", "find_files",
+            "read_binary_preview",
+            "read_text_file",
+            "search_text_file",
+        ]
+        for name in tool_names:
+            td = TOOL_CATALOG.get(name)
+            if td:
+                self.register_tool(td.name, td.description, td.input_schema, td.executor)
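The module docstring above says ocr_image returns an explicit install hint when the OCR runtime is missing. The ocr_image implementation is not part of this diff, so the following is only a sketch of how such a fallback might look. The function name ocr_image_sketch, the result dictionary shape, and the hint text are hypothetical; the only external facts relied on are that shutil.which locates an executable on PATH and that the tesseract CLI writes recognised text to standard output when the output base is given as stdout.

```python
from __future__ import annotations

import shutil
import subprocess
from typing import Callable, Optional

# Hypothetical hint text; the real tool's wording is not shown in this diff.
INSTALL_HINT = "tesseract not installed; install Tesseract OCR and re-run"


def ocr_image_sketch(
    path: str,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> dict:
    """Run tesseract on one image, or report why no text could be extracted.

    `which` is injectable so the missing-runtime branch can be exercised
    on hosts that do have tesseract installed.
    """
    binary = which("tesseract")
    if binary is None:
        # Explicit negative result: the caller records this as a finding
        # instead of guessing at the image content.
        return {"ok": False, "text": "", "hint": INSTALL_HINT}

    # `tesseract <image> stdout` prints the recognised text to stdout.
    proc = subprocess.run(
        [binary, path, "stdout"],
        capture_output=True,
        text=True,
        check=False,
    )
    if proc.returncode != 0:
        return {"ok": False, "text": "", "hint": proc.stderr.strip()}
    return {"ok": True, "text": proc.stdout.strip(), "hint": ""}


if __name__ == "__main__":
    # Simulate a host without tesseract.
    print(ocr_image_sketch("screenshot.png", which=lambda _name: None))
```

Returning a structured failure rather than raising keeps the missing runtime visible to the agent as data, which matches the role prompt's instruction to record it as a negative finding.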