feat(strategist) S2: graph_overview / source_coverage / marginal_yield / budget_status

DESIGN_STRATEGIST.md §2. Four read-only view tools the strategist uses
to ground its decision each round.

  graph_overview()      — hypotheses table (log_odds, conf, edges_in,
                          distinct_sources, recent_flip), sources table,
                          pending leads. distinct_sources is the
                          critical signal: a hypothesis with 23 edges
                          but only 1 distinct_source has fragile cross-
                          source independence and is a candidate for
                          a corroboration-seeking lead.
  source_coverage(src)  — per-source ✓/✗ against an expected-artefact
                          catalogue. Catalogue is heuristic hints,
                          NOT a forced checklist. Footer reminds the
                          strategist to investigate ✗ items only when
                          an active hypothesis depends on them. This is
                          the "test-taking ability is there, but not
                          binding" guardrail.
  marginal_yield(N)     — new phenomena / edges / status flips per
                          recent round. Two consecutive zero-yield
                          rounds = strong signal to declare complete.
  budget_status()       — usage vs caps (tool_calls, rounds, wall
                          clock). Pacing warnings at 70% / 90%.
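
The three decision signals described above can be sketched roughly as follows. This is an illustration only, not the code in tools/strategy.py: HypothesisRow, needs_corroboration, is_exhausted, and pacing_level are hypothetical names, and the min_sources default of 2 is an assumption. The 70% / 90% thresholds and the two-round window come from the descriptions above.

```python
from dataclasses import dataclass


@dataclass
class HypothesisRow:
    """Hypothetical row shape; fields mirror two graph_overview columns."""
    name: str
    edges_in: int
    distinct_sources: int


def needs_corroboration(row: HypothesisRow, min_sources: int = 2) -> bool:
    """Flag a hypothesis whose support traces back to too few sources."""
    return row.edges_in > 0 and row.distinct_sources < min_sources


def is_exhausted(yields_per_round: list[int], window: int = 2) -> bool:
    """True when each of the last `window` rounds added nothing new."""
    if len(yields_per_round) < window:
        return False
    return all(y == 0 for y in yields_per_round[-window:])


def pacing_level(used: int, cap: int) -> str:
    """Map usage against a cap to a hint at the 70% / 90% thresholds."""
    if cap <= 0:
        return "uncapped"
    # Integer comparisons avoid floating-point edge cases at the boundary.
    if used * 10 >= cap * 9:
        return "critical"
    if used * 10 >= cap * 7:
        return "warn"
    return "ok"
```

With these helpers, a hypothesis with 23 edges from a single source is flagged for corroboration, and a yield history ending in two zeros reports exhaustion.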

tools/strategy.py also exports EXPECTED_ARTEFACTS, a per-source-type
table of (name, detector, value_for) entries. Detectors are
substring patterns on tool name + args; the matcher resolves at
call time against graph.tool_invocations. Catalogue covers iOS /
Android / Windows disk / media-collection / archive source types.
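
A minimal sketch of how substring detectors might be matched against logged invocations. The catalogue entries, the Artefact tuple, and the coverage helper are all hypothetical; the real table is EXPECTED_ARTEFACTS in tools/strategy.py, and the list of (tool_name, args) pairs below only stands in for graph.tool_invocations.

```python
from typing import NamedTuple


class Artefact(NamedTuple):
    name: str       # category shown in the coverage report
    detector: str   # substring searched for in "tool name + args"
    value_for: str  # what kind of hypothesis the artefact informs


# Hypothetical entries for illustration; not the shipped catalogue.
EXPECTED_ARTEFACTS_SKETCH = {
    "ios": [
        Artefact("messages", "sms.db", "communication between parties"),
        Artefact("call history", "CallHistory", "contact timing"),
    ],
}


def coverage(source_type: str, invocations: list[tuple[str, dict]]) -> list[tuple[str, bool]]:
    """Return (artefact name, touched?) pairs for one source type.

    Matching is resolved at call time: each detector is checked against
    the tool name plus the repr of its arguments for every invocation.
    """
    haystacks = [f"{tool} {args!r}" for tool, args in invocations]
    return [
        (a.name, any(a.detector in h for h in haystacks))
        for a in EXPECTED_ARTEFACTS_SKETCH.get(source_type, [])
    ]


def render(rows: list[tuple[str, bool]]) -> str:
    """Render one line per artefact with a ✓ or ✗ marker."""
    return "\n".join(f"{'✓' if hit else '✗'} {name}" for name, hit in rows)
```

Because matching happens on every call, a ✗ flips to ✓ as soon as a matching invocation is logged, with no separate bookkeeping.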

All four tools are registered in tool_registry and listed as read-only
in llm_client.READ_ONLY_TOOLS so they can run in parallel. They go
through the invocation-logging wrapper so the strategist's reads are
themselves auditable (the wrapper does NOT cache them, because graph
state changes between calls).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BattleTag
2026-05-21 02:19:54 -10:00
parent ca96f29849
commit 6ebbc675c1
4 changed files with 660 additions and 0 deletions


@@ -24,6 +24,7 @@ from tools import mobile_ios as ios
from tools import parsers
from tools import registry as reg
from tools import sleuthkit as tsk
from tools import strategy as strat
logger = logging.getLogger(__name__)
@@ -985,6 +986,97 @@ def register_all_tools(graph: Any) -> None:
        tags=["media", "ocr", "image"],
    )

    # ---- Strategist-loop view tools (DESIGN_STRATEGIST.md §2) ----
    # Pure read-only renders over graph state. The strategist agent uses
    # these to decide whether to keep investigating or to declare complete.
    # They go through invocation logging like every other tool (so the
    # strategist's reads are auditable) but are NOT cacheable — graph
    # state changes between calls and a stale snapshot would mislead.
    async def _exec_graph_overview() -> str:
        return strat.graph_overview(graph)

    TOOL_CATALOG["graph_overview"] = ToolDefinition(
        name="graph_overview",
        description=(
            "Top-level investigation state: hypotheses (with log-odds, "
            "confidence, edges_in, distinct_sources contributing, recent "
            "status flips), sources (phenomena/identity counts, last-touched "
            "round), and pending leads. Always call this first when deciding "
            "the next strategist action."
        ),
        input_schema={"type": "object", "properties": {}},
        executor=_exec_graph_overview,
        module="strategy",
        tags=["strategy", "overview", "read-only"],
    )

    async def _exec_source_coverage(source_id: str) -> str:
        return strat.source_coverage(graph, source_id)

    TOOL_CATALOG["source_coverage"] = ToolDefinition(
        name="source_coverage",
        description=(
            "Per-source artefact coverage report: which expected categories "
            "have been touched (✓) vs not (✗) on the given source. Coverage "
            "items are heuristic hints, not requirements — investigate ✗ "
            "items only when an active hypothesis depends on them."
        ),
        input_schema={
            "type": "object",
            "properties": {
                "source_id": {"type": "string", "description": "Source id, e.g. 'src-ios-chan'."},
            },
            "required": ["source_id"],
        },
        executor=_exec_source_coverage,
        module="strategy",
        tags=["strategy", "coverage", "read-only"],
    )

    async def _exec_marginal_yield(last_n_rounds: int = 2) -> str:
        return strat.marginal_yield(graph, int(last_n_rounds))

    TOOL_CATALOG["marginal_yield"] = ToolDefinition(
        name="marginal_yield",
        description=(
            "How much information the last N investigation rounds added: "
            "new phenomena, new edges, and hypothesis status flips per round. "
            "Two consecutive zero-yield rounds means diminishing returns are "
            "decisive — declare_investigation_complete with reason "
            "marginal_yield_zero."
        ),
        input_schema={
            "type": "object",
            "properties": {
                "last_n_rounds": {"type": "integer", "description": "How many recent rounds to summarise (default 2)."},
            },
        },
        executor=_exec_marginal_yield,
        module="strategy",
        tags=["strategy", "yield", "read-only"],
    )

    async def _exec_budget_status() -> str:
        return strat.budget_status(
            graph,
            getattr(graph, "budgets", None),
            getattr(graph, "run_start_monotonic", None),
        )

    TOOL_CATALOG["budget_status"] = ToolDefinition(
        name="budget_status",
        description=(
            "Budget vs caps: tool_calls, strategist_rounds, wall_clock_minutes. "
            "Includes pacing hints when usage crosses 70% / 90% thresholds. "
            "Use this to decide whether to keep proposing leads or to wind down."
        ),
        input_schema={"type": "object", "properties": {}},
        executor=_exec_budget_status,
        module="strategy",
        tags=["strategy", "budget", "read-only"],
    )

    # ---- Wrap every executor with invocation logging (+ cache + auto-record) ----
    # Must run AFTER all tools are registered. Every tool call now produces
    # a ToolInvocation entry on the graph (provenance for grounding), and