feat(strategist) S2: graph_overview / source_coverage / marginal_yield / budget_status

DESIGN_STRATEGIST.md §2. Four read-only view tools the strategist uses
to ground its decision each round.

  graph_overview()      Hypotheses table (log_odds, conf, edges_in,
                        distinct_sources, recent_flip), sources table,
                        and pending leads. distinct_sources is the
                        critical signal: a hypothesis with 23 edges
                        but only 1 distinct source lacks independent
                        cross-source corroboration and is a candidate
                        for a corroboration-seeking lead.

  source_coverage(src)  Per-source ✓/✗ against an expected-artefact
                        catalogue. The catalogue is a set of heuristic
                        hints, NOT a forced checklist. The footer
                        reminds the strategist to investigate ✗ items
                        only when an active hypothesis depends on
                        them. This is the "exam skills are present
                        but not binding" guardrail.

  marginal_yield(N)     New phenomena / edges / status flips per
                        recent round. Two consecutive zero-yield
                        rounds are a strong signal to declare the
                        investigation complete.

  budget_status()       Usage vs caps (tool_calls, rounds, wall
                        clock). Pacing warnings at 70% / 90%.
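
The three decision signals described above can be sketched as plain predicates. This is an illustration only, not the code in tools/strategy.py: the row shapes (HypothesisRow, RoundYield), the min_edges cutoff, and the hint strings are all assumptions made for the example. Only the 70% / 90% thresholds and the "two consecutive zero-yield rounds" rule come from this message.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HypothesisRow:
    """Hypothetical row shape; the real graph_overview() output may differ."""
    hyp_id: str
    edges_in: int
    distinct_sources: int


@dataclass
class RoundYield:
    """Hypothetical per-round counters, mirroring what marginal_yield() reports."""
    new_phenomena: int
    new_edges: int
    status_flips: int

    @property
    def total(self) -> int:
        return self.new_phenomena + self.new_edges + self.status_flips


def needs_corroboration(row: HypothesisRow, min_edges: int = 5) -> bool:
    # Many edges from a single source: well supported on paper, but with
    # no independent corroboration. min_edges is an illustrative cutoff.
    return row.edges_in >= min_edges and row.distinct_sources <= 1


def yield_exhausted(rounds: List[RoundYield], window: int = 2) -> bool:
    # True only when each of the last `window` rounds added nothing.
    recent = rounds[-window:]
    return len(recent) == window and all(r.total == 0 for r in recent)


def pacing_hint(used: int, cap: int) -> Optional[str]:
    # Integer arithmetic avoids float rounding at the exact thresholds.
    if cap <= 0:
        return None
    if used * 100 >= cap * 90:
        return "wind down"
    if used * 100 >= cap * 70:
        return "pace"
    return None
```

With these definitions, the 23-edge / 1-source hypothesis from the example above is flagged by needs_corroboration, and a single zero-yield round is not enough for yield_exhausted to fire.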

tools/strategy.py also exports EXPECTED_ARTEFACTS, a per-source-type
table of (name, detector, value_for) entries. Detectors are
substring patterns matched against tool name + args; the matcher
resolves them at call time against graph.tool_invocations. The
catalogue covers the iOS / Android / Windows disk / media-collection /
archive source types.
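
A minimal sketch of how substring detectors could be matched against logged invocations. The catalogue entries, the (tool_name, args) invocation shape, and the function name coverage are assumptions for illustration; the shipped EXPECTED_ARTEFACTS table and matcher may look different, and a real matcher would presumably also filter invocations by source.

```python
import json
from typing import Dict, List, NamedTuple, Tuple


class ExpectedArtefact(NamedTuple):
    name: str       # human-readable artefact category
    detector: str   # substring searched for in "tool_name + args"
    value_for: str  # which kind of hypothesis the artefact informs


# Illustrative entries only; not the catalogue shipped in tools/strategy.py.
EXAMPLE_CATALOGUE: Dict[str, List[ExpectedArtefact]] = {
    "ios": [
        ExpectedArtefact("SMS database", "sms.db", "communication hypotheses"),
        ExpectedArtefact("Safari history", "History.db", "browsing hypotheses"),
    ],
}


def coverage(source_type: str,
             invocations: List[Tuple[str, dict]]) -> List[Tuple[str, bool]]:
    """Return (artefact name, touched?) pairs for one source type."""
    # One searchable string per invocation: tool name plus serialized args.
    haystacks = [
        f"{tool} {json.dumps(args, sort_keys=True)}".lower()
        for tool, args in invocations
    ]
    return [
        (art.name, any(art.detector.lower() in h for h in haystacks))
        for art in EXAMPLE_CATALOGUE.get(source_type, [])
    ]


def render(rows: List[Tuple[str, bool]]) -> str:
    # Same tick / cross convention as the source_coverage report.
    return "\n".join(f"{'✓' if hit else '✗'} {name}" for name, hit in rows)
```

Because matching happens at call time against the invocation log, the catalogue needs no per-source state of its own: an artefact counts as touched as soon as any logged call mentions it.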

All four tools are registered in tool_registry and listed as read-only
in llm_client.READ_ONLY_TOOLS for parallel execution. They go through
the invocation-logging wrapper so the strategist's reads are
themselves auditable. The wrapper does NOT cache them, because graph
state changes between calls.

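
One way a dispatcher could use a read-only set for parallel execution is sketched below. This is an assumption about how such a set might be consumed, not a description of llm_client: run_tool_calls, the local READ_ONLY_TOOLS stand-in, and the demo executors are all hypothetical. Consecutive read-only calls are gathered concurrently; any other call runs alone and in order.

```python
import asyncio

# Stand-in for llm_client.READ_ONLY_TOOLS (hypothetical contents).
READ_ONLY_TOOLS = {"graph_overview", "budget_status"}


async def run_tool_calls(calls, executors):
    """calls: list of (tool_name, kwargs). Returns results in call order."""
    results = [None] * len(calls)
    batch = []  # indices of consecutive read-only calls awaiting dispatch

    async def flush():
        if batch:
            outs = await asyncio.gather(
                *(executors[calls[i][0]](**calls[i][1]) for i in batch)
            )
            for i, out in zip(batch, outs):
                results[i] = out
            batch.clear()

    for i, (name, kwargs) in enumerate(calls):
        if name in READ_ONLY_TOOLS:
            batch.append(i)       # safe to run alongside other reads
        else:
            await flush()         # finish pending reads before a write
            results[i] = await executors[name](**kwargs)
    await flush()
    return results


async def _demo():
    async def overview():
        return "overview"

    async def budget():
        return "budget"

    async def add_edge(src, dst):  # hypothetical mutating tool
        return f"{src}->{dst}"

    executors = {"graph_overview": overview, "budget_status": budget,
                 "add_edge": add_edge}
    calls = [("graph_overview", {}), ("budget_status", {}),
             ("add_edge", {"src": "a", "dst": "b"})]
    return await run_tool_calls(calls, executors)


demo_results = asyncio.run(_demo())
```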
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -24,6 +24,7 @@ from tools import mobile_ios as ios
 from tools import parsers
 from tools import registry as reg
 from tools import sleuthkit as tsk
+from tools import strategy as strat
 
 logger = logging.getLogger(__name__)
 
@@ -985,6 +986,97 @@ def register_all_tools(graph: Any) -> None:
         tags=["media", "ocr", "image"],
     )
 
+    # ---- Strategist-loop view tools (DESIGN_STRATEGIST.md §2) ----
+    # Pure read-only renders over graph state. The strategist agent uses
+    # these to decide whether to keep investigating or to declare complete.
+    # They go through invocation logging like every other tool (so the
+    # strategist's reads are auditable) but are NOT cacheable — graph
+    # state changes between calls and a stale snapshot would mislead.
+
+    async def _exec_graph_overview() -> str:
+        return strat.graph_overview(graph)
+
+    TOOL_CATALOG["graph_overview"] = ToolDefinition(
+        name="graph_overview",
+        description=(
+            "Top-level investigation state: hypotheses (with log-odds, "
+            "confidence, edges_in, distinct_sources contributing, recent "
+            "status flips), sources (phenomena/identity counts, last-touched "
+            "round), and pending leads. Always call this first when deciding "
+            "the next strategist action."
+        ),
+        input_schema={"type": "object", "properties": {}},
+        executor=_exec_graph_overview,
+        module="strategy",
+        tags=["strategy", "overview", "read-only"],
+    )
+
+    async def _exec_source_coverage(source_id: str) -> str:
+        return strat.source_coverage(graph, source_id)
+
+    TOOL_CATALOG["source_coverage"] = ToolDefinition(
+        name="source_coverage",
+        description=(
+            "Per-source artefact coverage report: which expected categories "
+            "have been touched (✓) vs not (✗) on the given source. Coverage "
+            "items are heuristic hints, not requirements — investigate ✗ "
+            "items only when an active hypothesis depends on them."
+        ),
+        input_schema={
+            "type": "object",
+            "properties": {
+                "source_id": {"type": "string", "description": "Source id, e.g. 'src-ios-chan'."},
+            },
+            "required": ["source_id"],
+        },
+        executor=_exec_source_coverage,
+        module="strategy",
+        tags=["strategy", "coverage", "read-only"],
+    )
+
+    async def _exec_marginal_yield(last_n_rounds: int = 2) -> str:
+        return strat.marginal_yield(graph, int(last_n_rounds))
+
+    TOOL_CATALOG["marginal_yield"] = ToolDefinition(
+        name="marginal_yield",
+        description=(
+            "How much information the last N investigation rounds added: "
+            "new phenomena, new edges, and hypothesis status flips per round. "
+            "Two consecutive zero-yield rounds means diminishing returns are "
+            "decisive — declare_investigation_complete with reason "
+            "marginal_yield_zero."
+        ),
+        input_schema={
+            "type": "object",
+            "properties": {
+                "last_n_rounds": {"type": "integer", "description": "How many recent rounds to summarise (default 2)."},
+            },
+        },
+        executor=_exec_marginal_yield,
+        module="strategy",
+        tags=["strategy", "yield", "read-only"],
+    )
+
+    async def _exec_budget_status() -> str:
+        return strat.budget_status(
+            graph,
+            getattr(graph, "budgets", None),
+            getattr(graph, "run_start_monotonic", None),
+        )
+
+    TOOL_CATALOG["budget_status"] = ToolDefinition(
+        name="budget_status",
+        description=(
+            "Budget vs caps: tool_calls, strategist_rounds, wall_clock_minutes. "
+            "Includes pacing hints when usage crosses 70% / 90% thresholds. "
+            "Use this to decide whether to keep proposing leads or to wind down."
+        ),
+        input_schema={"type": "object", "properties": {}},
+        executor=_exec_budget_status,
+        module="strategy",
+        tags=["strategy", "budget", "read-only"],
+    )
+
     # ---- Wrap every executor with invocation logging (+ cache + auto-record) ----
     # Must run AFTER all tools are registered. Every tool call now produces
     # a ToolInvocation entry on the graph (provenance for grounding), and
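
For readers unfamiliar with the wrapping step named in the trailing context lines, here is a rough sketch of a logging wrapper that always records an invocation but caches only when a cache is supplied. The wrapper shown in this repository is not visible in the hunk, so wrap_with_logging, the log-entry fields, and the cache-key format are hypothetical.

```python
import asyncio
import functools
import json
import time


def wrap_with_logging(name, executor, log, cache=None):
    """Wrap an async executor; cache=None means results are never cached."""

    @functools.wraps(executor)
    async def wrapped(**kwargs):
        key = (name, json.dumps(kwargs, sort_keys=True))
        if cache is not None and key in cache:
            result, cached = cache[key], True
        else:
            result, cached = await executor(**kwargs), False
            if cache is not None:
                cache[key] = result
        # Every call is logged, whether or not it was served from cache.
        log.append({"tool": name, "args": kwargs,
                    "cached": cached, "ts": time.time()})
        return result

    return wrapped


async def _demo():
    log, calls = [], {"n": 0}

    async def overview():
        # Simulates a view whose output changes as graph state changes.
        calls["n"] += 1
        return f"snapshot {calls['n']}"

    uncached = wrap_with_logging("graph_overview", overview, log)
    first = await uncached()
    second = await uncached()
    return first, second, len(log)


demo = asyncio.run(_demo())
```

Leaving the cache off for view tools means two identical calls produce two fresh snapshots and two log entries, which is the behaviour the commit message asks for.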