refactor: lead provenance, unified link path, SSOT cleanup, configurable weights

Five interrelated cleanups:

1. Lead -> Phenomenon provenance
   - Phenomenon.from_lead_id field on the dataclass
   - BaseAgent.run(lead_id=...) writes self._current_lead_id
   - _add_phenomenon auto-injects from agent state (LLM unaware)
   - Orchestrator dispatch passes lead.id; Phase 1/2-auto/4/5 stay None
   - Merge path preserves the first non-None lead_id on collision

2. Unified Phenomenon <-> Hypothesis link path
   - HypothesisAgent only adds hypotheses, never links
   - link_phenomenon_to_hypothesis tool + executor removed
   - All links go through Orchestrator._judge_new_phenomena
   - Phase 2 unconditionally judges after hypothesis generation
   - Gap Analysis judges after each dispatch round
   (Three previously missing judge calls are now in place.)

3. SSOT in agent subclasses
   - Remove RoleTemplate dataclass, ROLE_TEMPLATES dict,
     _instantiate_from_template method
   - Each agent subclass owns name, role, and tool list
   - agent_factory.py shrinks from 299 to 153 lines
   - All 7 agents now route through _AGENT_CLASSES (filesystem,
     registry, communication, network, timeline were previously dead
     subclasses overridden by templates)

4. Configurable edge weights
   - HYPOTHESIS_EDGE_WEIGHTS -> _DEFAULT_EDGE_WEIGHTS (private default)
   - EvidenceGraph(edge_weights=...) override via config.yaml
   - hypothesis_edge_weights section in config.yaml (commented example)
   - main.py and regenerate_report.py read and pass through

5. regenerate_report.py auto-picks the latest run/*/graph_state.json
   when no CLI arg is given (was a hardcoded date path)
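
The auto-pick in item 5 could be sketched roughly as follows. This is a sketch under assumptions, not the code in regenerate_report.py: it takes "latest" to mean most recently modified (the message does not say whether the script sorts by mtime or by directory name), and the helper names `find_latest_graph_state` and `resolve_graph_state` are hypothetical.

```python
from __future__ import annotations

from pathlib import Path


def find_latest_graph_state(runs_root: Path = Path("run")) -> Path | None:
    """Return the most recently modified <runs_root>/*/graph_state.json, or None.

    Assumption: "latest" is decided by file modification time.
    """
    candidates = list(runs_root.glob("*/graph_state.json"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def resolve_graph_state(argv: list[str]) -> Path:
    """Prefer an explicit CLI path; otherwise fall back to the latest run."""
    if len(argv) > 1:
        return Path(argv[1])
    latest = find_latest_graph_state()
    if latest is None:
        raise SystemExit("no run/*/graph_state.json found; pass a path explicitly")
    return latest
```

Called as `resolve_graph_state(sys.argv)`, an explicit argument always wins, so the old behaviour of pointing at one specific run is still available.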

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BattleTag
2026-05-12 14:10:15 +08:00
parent fde96c7d9f
commit 74e6bde13a
7 changed files with 92 additions and 254 deletions

@@ -1,12 +1,15 @@
-"""Hypothesis Agent — analyzes phenomena and generates investigative hypotheses."""
+"""Hypothesis Agent — generates investigative hypotheses from phenomena.
+Generates hypotheses only. Phenomenon→Hypothesis linking is handled centrally
+by Orchestrator._judge_new_phenomena, so all link logic lives in one place.
+"""
 from __future__ import annotations
 import json
 import logging
 from base_agent import BaseAgent
-from evidence_graph import EvidenceGraph, HYPOTHESIS_EDGE_WEIGHTS
+from evidence_graph import EvidenceGraph
 from llm_client import LLMClient
 logger = logging.getLogger(__name__)
@@ -17,8 +20,7 @@ class HypothesisAgent(BaseAgent):
     role = (
         "Hypothesis analyst. You review all phenomena discovered so far "
         "and formulate investigative hypotheses about what happened on this system. "
-        "Your ultimate goal: build the most complete picture of events that occurred. "
-        "For each hypothesis, identify which existing phenomena support or contradict it."
+        "Your ultimate goal: build the most complete picture of events that occurred."
     )
     def __init__(self, llm: LLMClient, graph: EvidenceGraph) -> None:
@@ -26,10 +28,6 @@ class HypothesisAgent(BaseAgent):
         self._register_hypothesis_tools()
     def _register_hypothesis_tools(self) -> None:
         """Register hypothesis-specific tools."""
-        valid_edge_types = list(HYPOTHESIS_EDGE_WEIGHTS.keys())
         self.register_tool(
             name="add_hypothesis",
             description=(
@@ -53,44 +51,6 @@ class HypothesisAgent(BaseAgent):
             executor=self._add_hypothesis,
         )
-        self.register_tool(
-            name="link_phenomenon_to_hypothesis",
-            description=(
-                "Link an existing phenomenon to a hypothesis with a relationship type. "
-                f"Valid relationship types: {', '.join(valid_edge_types)}. "
-                "direct_evidence = the phenomenon IS the hypothesis. "
-                "supports = consistent with the hypothesis. "
-                "prerequisite_met = a necessary condition is satisfied. "
-                "consequence_observed = an expected result of the hypothesis is found. "
-                "contradicts = directly contradicts the hypothesis. "
-                "weakens = makes the hypothesis less likely."
-            ),
-            input_schema={
-                "type": "object",
-                "properties": {
-                    "phenomenon_id": {
-                        "type": "string",
-                        "description": "ID of the phenomenon (e.g. 'ph-a1b2c3d4').",
-                    },
-                    "hypothesis_id": {
-                        "type": "string",
-                        "description": "ID of the hypothesis (e.g. 'hyp-e5f6g7h8').",
-                    },
-                    "edge_type": {
-                        "type": "string",
-                        "enum": valid_edge_types,
-                        "description": "The edge_type of the relationship.",
-                    },
-                    "reason": {
-                        "type": "string",
-                        "description": "The reason this relationship holds (1-2 sentences).",
-                    },
-                },
-                "required": ["phenomenon_id", "hypothesis_id", "edge_type", "reason"],
-            },
-            executor=self._link_phenomenon_to_hypothesis,
-        )
     async def _add_hypothesis(self, title: str, description: str) -> str:
         hid = await self.graph.add_hypothesis(
             title=title,
@@ -98,33 +58,3 @@ class HypothesisAgent(BaseAgent):
             created_by=self.name,
         )
         return f"Hypothesis created: {hid}{title} (confidence: 0.50)"
-    async def _link_phenomenon_to_hypothesis(
-        self,
-        phenomenon_id: str,
-        hypothesis_id: str,
-        edge_type: str = "",
-        reason: str = "",
-        # Common LLM misnaming — accept as fallbacks
-        relationship: str = "",
-        note: str = "",
-    ) -> str:
-        edge_type = edge_type or relationship
-        reason = reason or note
-        if not edge_type:
-            return "Error: edge_type is required."
-        try:
-            new_conf = await self.graph.update_hypothesis_confidence(
-                hyp_id=hypothesis_id,
-                phenomenon_id=phenomenon_id,
-                edge_type=edge_type,
-                reason=reason,
-            )
-            weight = HYPOTHESIS_EDGE_WEIGHTS[edge_type]
-            direction = "+" if weight > 0 else ""
-            return (
-                f"Linked: {phenomenon_id} —[{edge_type}]→ {hypothesis_id} "
-                f"(weight: {direction}{weight}, new confidence: {new_conf:.3f})"
-            )
-        except ValueError as e:
-            return f"Error linking: {e}"