BattleTag ff3a05d7ce feat(strategist) S3: propose_lead / declare_investigation_complete

DESIGN_STRATEGIST.md §2.5. The strategist's two write actions.

propose_lead validates motivating_hypothesis exists in the graph,
validates expected_evidence_type is a real edge type, validates
source_id refers to a real source in the case — fast specific
errors so the strategist gets fixable feedback rather than a
generic crash. On success, calls graph.add_lead with proposed_by=
"strategist" and round_number=graph.current_strategist_round so
the round-completion code can collect this round's leads.

declare_investigation_complete sets graph.strategist_complete_requested
which the orchestrator inspects after each strategist run to decide
whether to break the loop. reason must come from a closed enum so
the audit log is consistent.

EvidenceGraph gains two transient run-context fields:
  current_strategist_round       — set by orchestrator at start of round
  strategist_complete_requested  — flipped by declare_complete

These are intentionally NOT persisted — they're per-run flags, not
graph state.

Both tools are required to be in
InvestigationStrategist.mandatory_record_tools (added in S4) so the
agent's forced-retry mechanism kicks in if it returns without taking a
documented decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-21 02:21:13 -10:00

agents

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

tests

feat(strategist) S3: propose_lead / declare_investigation_complete

2026-05-21 02:21:13 -10:00

tools

feat(strategist) S2: graph_overview / source_coverage / marginal_yield / budget_status

2026-05-21 02:19:54 -10:00

.python-version

Initial commit

2026-05-09 17:36:26 +08:00

agent_factory.py

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

base_agent.py

fix(base_agent): forced-retry iter cap 10→30 + narrow tools to record+read

2026-05-21 02:15:08 -10:00

case.example.yaml

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

case.py

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

DESIGN.md

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

evidence_graph.py

feat(strategist) S3: propose_lead / declare_investigation_complete

2026-05-21 02:21:13 -10:00

llm_client.py

feat(strategist) S2: graph_overview / source_coverage / marginal_yield / budget_status

2026-05-21 02:19:54 -10:00

log_config.py

Initial commit

2026-05-09 17:36:26 +08:00

main.py

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

orchestrator.py

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

pyproject.toml

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

README.md

refactor: native tool calling + generic forced-retry + terminal exit

2026-05-13 13:51:19 +08:00

regenerate_report.py

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

tool_registry.py

feat(strategist) S3: propose_lead / declare_investigation_complete

2026-05-21 02:21:13 -10:00

uv.lock

feat(refit): complete S1-S6 — case abstraction, grounding, log-odds, plugins, coref, multi-source

2026-05-21 02:12:10 -10:00

README.md

MASForensics

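Multi-Agent System for Digital Forensics: an LLM-based multi-agent system for digital forensics.

Seven specialized agents cooperate to run automated forensic analysis on a disk image and produce a structured forensic report. Agents do not communicate directly; they collaborate through a shared EvidenceGraph (an evidence knowledge graph).

Architecture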

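main.py                          Entry point: config loading, image selection, interrupted-run recovery
  │
  ├── Orchestrator               Five-phase pipeline scheduling
  │     │
  │     ├── FileSystemAgent      Partitions/file systems, directories, deleted files, Prefetch
  │     ├── HypothesisAgent      Generates hypotheses and links existing evidence
  │     ├── RegistryAgent        Registry analysis (SYSTEM/SOFTWARE/SAM/NTUSER.DAT)
  │     ├── CommunicationAgent   Email, IRC/mIRC chat logs
  │     ├── NetworkAgent         Browser history, PCAP captures
  │     ├── TimelineAgent        Cross-category timeline correlation
  │     └── ReportAgent          Consolidated report generation
  │
  ├── EvidenceGraph              Evidence knowledge graph with typed edges (auto-persisted)
  ├── AgentFactory               Role templates + dynamic agent composition
  ├── ToolRegistry               Tool catalog + result cache
  └── LLMClient                  Claude API client (async, tool use)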

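EvidenceGraph: the evidence knowledge graph

Three node types plus typed, weighted edges:

Node	Prefix	Meaning
`Phenomenon`	`ph-*`	An observable forensic artifact (one concrete finding)
`Hypothesis`	`hyp-*`	An explanatory hypothesis (a claim awaiting verification)
`Entity`	`ent-*`	A recurring entity such as a person, program, host, or IP address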

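The edge types and weights for Phenomenon → Hypothesis edges are hard-coded in HYPOTHESIS_EDGE_WEIGHTS and listed in the table under the TODO note.

TODO

Once the current pipeline runs end to end, look for an adaptive weighting scheme.

Edge type	Weight	Semantics
`direct_evidence`	+0.25	The phenomenon is itself the behavior the hypothesis describes
`supports`	+0.15	Consistent with the hypothesis but not conclusive
`consequence_observed`	+0.15	An outcome predicted by the hypothesis was observed
`prerequisite_met`	+0.10	A precondition of the hypothesis is satisfied
`weakens`	−0.10	Lowers the likelihood of the hypothesis
`contradicts`	−0.20	Directly refutes the hypothesis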
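Confidence update formula (the result always stays within [0, 1]):

Positive edge: delta = weight * (1 - old_conf)
Negative edge: delta = weight * old_conf

Status changes automatically when a threshold is crossed: ≥ 0.8 → supported, ≤ 0.2 → refuted, and a hypothesis still active at the end of the run → inconclusive. The LLM only picks the edge type (a classification task); the weight table and state transitions are decided in code, which avoids numeric hallucination.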
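
The update rule and thresholds can be sketched in a few lines of Python. This is a minimal illustration rather than the project's code: `update_confidence` and `status_for` are hypothetical names, the weight table copies the values listed in this section, and the sketch assumes the new confidence is simply `old_conf + delta`.

```python
# Illustrative sketch; only HYPOTHESIS_EDGE_WEIGHTS is a name taken from the README.
HYPOTHESIS_EDGE_WEIGHTS = {
    "direct_evidence": 0.25,
    "supports": 0.15,
    "consequence_observed": 0.15,
    "prerequisite_met": 0.10,
    "weakens": -0.10,
    "contradicts": -0.20,
}


def update_confidence(old_conf: float, edge_type: str) -> float:
    """Apply one edge to a confidence value, keeping it inside [0, 1]."""
    weight = HYPOTHESIS_EDGE_WEIGHTS[edge_type]
    if weight >= 0:
        delta = weight * (1 - old_conf)  # positive edge: move toward 1
    else:
        delta = weight * old_conf        # negative edge: move toward 0
    return old_conf + delta              # assumption: new = old + delta


def status_for(conf: float, run_finished: bool = False) -> str:
    """Map a confidence value to a hypothesis status."""
    if conf >= 0.8:
        return "supported"
    if conf <= 0.2:
        return "refuted"
    return "inconclusive" if run_finished else "active"


conf = 0.5
for edge in ("direct_evidence", "supports", "contradicts"):
    conf = update_confidence(conf, edge)
    print(f"{edge:>16}: {conf:.4f} -> {status_for(conf)}")
```

Because each delta is proportional to the remaining distance to 0 or 1, repeated evidence has a diminishing effect and the value cannot leave [0, 1].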

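When a new Phenomenon is added, it is merged with an existing one by Jaccard similarity (title > 0.6 and description > 0.4 counts as a duplicate; a merge raises confidence and appends to corroborating_agents), so the same finding is not inserted into the graph twice.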
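
The duplicate check might look roughly like the sketch below. The README does not say how text is tokenized, so this assumes lowercase, whitespace-separated word sets; `jaccard` and `is_duplicate` are hypothetical names and the sample findings are made up.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two strings over lowercase word sets."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def is_duplicate(new: dict, existing: dict,
                 title_threshold: float = 0.6,
                 desc_threshold: float = 0.4) -> bool:
    """Both the title and the description must exceed their thresholds."""
    return (jaccard(new["title"], existing["title"]) > title_threshold
            and jaccard(new["description"], existing["description"]) > desc_threshold)


# Made-up sample findings, for illustration only.
a = {"title": "Sniffer executable found in Program Files",
     "description": "Prefetch entry shows the sniffer was executed"}
b = {"title": "Sniffer executable found in Program Files",
     "description": "Prefetch shows the sniffer was executed twice"}
print(is_duplicate(a, b))
```

Requiring both thresholds keeps two findings with the same title but unrelated descriptions from being merged.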

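Five-phase pipeline

Phase	Description
Phase 1	FileSystemAgent surveys the image, identifies partitions, file systems, and key paths, and produces the first batch of Phenomena
Phase 2	Hypothesis generation: reads `config.yaml:hypotheses` first; if none are configured, HypothesisAgent generates 3-7 hypotheses from the Phase 1 phenomena
Phase 3	Hypothesis-driven investigation (5 iterations by default). Each round: produce leads for all active hypotheses in one pass → dispatch them concurrently by agent type (semaphore = 3) → judge in one pass how the new phenomena relate to each hypothesis. Exits early once all hypotheses have converged. At the end: retry failed leads once, then run Gap Analysis
Phase 4	TimelineAgent builds a MAC timeline with `build_filesystem_timeline` and correlates it with Phenomenon timestamps
Phase 5	ReportAgent combines hypotheses, evidence, and entities into a Markdown report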
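
The bounded concurrency in Phase 3 can be illustrated with `asyncio.Semaphore`. This is a sketch under assumptions: `run_lead` is a hypothetical stand-in for handing a lead to an agent, and grouping leads by agent type is omitted.

```python
import asyncio

MAX_CONCURRENT = 3  # matches "semaphore = 3" in the Phase 3 description


async def run_lead(lead: str, sem: asyncio.Semaphore, running: list[int]) -> str:
    """Hypothetical stand-in for dispatching one lead to an agent."""
    async with sem:
        running[0] += 1
        running[1] = max(running[1], running[0])  # track peak concurrency
        await asyncio.sleep(0.01)                 # placeholder for real agent work
        running[0] -= 1
        return f"{lead}: done"


async def dispatch(leads: list[str]) -> tuple[list[str], int]:
    """Run all leads, never more than MAX_CONCURRENT at the same time."""
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    running = [0, 0]  # [currently running, peak]
    results = await asyncio.gather(*(run_lead(lead, sem, running) for lead in leads))
    return list(results), running[1]


results, peak = asyncio.run(dispatch([f"lead-{i}" for i in range(8)]))
print(len(results), "leads finished; peak concurrency:", peak)
```

`asyncio.gather` returns results in the order the leads were submitted, which keeps the later judging step deterministic even though execution overlaps.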

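Investigation Areas (hypothesis-derived)

At the end of Phase 2 the orchestrator makes one LLM call to derive 5-12 InvestigationArea entries from all active hypotheses (snake_case slug, description, suggested_agent, expected_keywords, expected_tools, priority, motivating_hypothesis_ids). Areas are stored in graph.investigation_areas and serialized to runs/<ts>/investigation_areas.json. They serve two purposes:

Phase 3 main-loop prompt: each hypothesis block carries Expected areas: a, b, c; the LLM still chooses leads freely but gets soft guidance
Gap Analysis at the end of Phase 3: coverage is judged in two layers:
- Keyword match: scan Phenomenon titles/descriptions against area.expected_keywords
- Tool hit: check whether the tools in area.expected_tools were actually called

Each uncovered area automatically gets a lead (suggested_agent and priority are carried over, and motivating_hypothesis_ids[0] is passed through to Lead.hypothesis_id to preserve provenance), with at most 3 gap-filling rounds.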
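
The two-layer coverage check could be sketched as follows. How the two layers are combined is not stated here, so the sketch assumes an area counts as covered when either layer hits; `Area` and `uncovered_areas` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Area:
    slug: str
    expected_keywords: list[str] = field(default_factory=list)
    expected_tools: list[str] = field(default_factory=list)


def uncovered_areas(areas: list[Area], phenomena: list[dict],
                    called_tools: set[str]) -> list[str]:
    """Return slugs of areas that neither layer marks as covered.

    Assumption: an area is covered if the keyword layer OR the tool layer hits.
    """
    text = " ".join(f"{p['title']} {p['description']}" for p in phenomena).lower()
    missing = []
    for area in areas:
        keyword_hit = any(k.lower() in text for k in area.expected_keywords)
        tool_hit = any(t in called_tools for t in area.expected_tools)
        if not (keyword_hit or tool_hit):
            missing.append(area.slug)
    return missing


areas = [
    Area("shutdown_time", ["shutdown"], ["get_shutdown_time"]),
    Area("installed_software", ["installed"], ["parse_registry_key"]),
]
phenomena = [{"title": "Last shutdown recorded",
              "description": "Value read from SYSTEM hive"}]
print(uncovered_areas(areas, phenomena, called_tools={"list_directory"}))
```

Each slug returned would then become a gap-filling lead for the area's suggested agent.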

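Manual override: config.yaml:investigation_areas is commented out by default, so areas are purely LLM-derived. Uncomment it to add areas that must be investigated; these are written before the LLM's and are protected from being overwritten by slug-based dedupe (the LLM only augments the keyword/tool lists). This is the key to adapting across cases and platforms: Windows-specific areas are no longer hard-coded.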
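
One way to picture the slug-based dedupe is the merge below. It is a sketch, not the orchestrator's code: `merge_areas` is a hypothetical name, and the field names (`area`, `agent`, `priority`, `keywords`, `tools`) follow the commented-out config sample.

```python
def merge_areas(manual: list[dict], llm: list[dict]) -> list[dict]:
    """Merge LLM-derived areas into manual ones, keyed by slug.

    Manual entries are written first. An LLM entry with the same slug may
    only extend the keyword/tool lists; all other manual fields are kept.
    """
    merged = {
        a["area"]: {**a,
                    "keywords": list(a.get("keywords", [])),
                    "tools": list(a.get("tools", []))}
        for a in manual
    }
    for item in llm:
        slug = item["area"]
        if slug not in merged:
            merged[slug] = dict(item)  # new area: take the LLM entry as-is
            continue
        for key in ("keywords", "tools"):
            for value in item.get(key, []):
                if value not in merged[slug][key]:
                    merged[slug][key].append(value)
    return list(merged.values())


manual = [{"area": "shutdown_time", "agent": "registry", "priority": 3,
           "keywords": ["shutdown"], "tools": ["get_shutdown_time"]}]
llm = [
    {"area": "shutdown_time", "agent": "filesystem", "priority": 1,
     "keywords": ["shutdown", "last boot"], "tools": []},
    {"area": "browser_history", "agent": "network", "priority": 2,
     "keywords": ["url"], "tools": []},
]
for area in merge_areas(manual, llm):
    print(area)
```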

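Agent system

AgentFactory maintains 7 role templates (ROLE_TEMPLATES), each specifying a default tool set. HypothesisAgent and ReportAgent are subclasses of BaseAgent (they register extra dedicated tools); the other 5 agents are built directly from BaseAgent plus a tool list.

Agent workflow

BaseAgent.run enforces four stages in the system prompt:

A. INVESTIGATE   Check graph state / Asset Library first, then call forensic tools
B. RECORD        Write every finding with add_phenomenon
C. LINK          Call link_to_entity as needed; never cite a ph-id from memory, call list_phenomena first
D. ANSWER        Give the final answer only after the steps above are done

The prompt has built-in anti-hallucination rules: record only content that appears verbatim in tool output; timestamps, paths, and inodes must come from tool results; truncated output must be marked [truncated].

Dynamic agent composition

AgentFactory.create_specialized_agent() handles capability gaps: it feeds the tool catalog and the hypothesis description to the LLM, which picks 3-8 tools and writes a role description; the factory then instantiates a new agent from that and caches it.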

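Tool system

At startup, tool_registry.py calls register_all_tools(image_path, partition_offset, graph), which registers every tool in the global TOOL_CATALOG in one pass.

Tool result cache

The CACHEABLE_TOOLS set marks read-only, deterministic tools (partition_info, list_directory, parse_registry_key …). The image is read-only, so a call with the same args always yields the same output; a cache hit is reused directly, and error results are never cached.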

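Asset Library

EvidenceGraph.asset_library indexes every extracted file by inode to avoid repeated extraction. Agents query it through the list_assets / find_extracted_file tools. New files are automatically classified by file name into one of ten categories such as registry_hive / chat_log / prefetch / network_capture / recycle_bin.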
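
The inode index and name-based classification might be sketched like this. The matching rules are illustrative guesses, not the project's actual rules; only five of the ten categories are named in this README, and `"other"` is a hypothetical fallback.

```python
from pathlib import PurePosixPath

REGISTRY_HIVES = {"system", "software", "sam", "ntuser.dat"}
asset_library: dict[int, dict] = {}  # inode -> asset record


def classify_asset(filename: str) -> str:
    """Guess a category from the file name alone (illustrative rules)."""
    name = PurePosixPath(filename.replace("\\", "/")).name.lower()
    if name in REGISTRY_HIVES:
        return "registry_hive"
    if name.endswith(".pf"):
        return "prefetch"
    if name.endswith((".pcap", ".cap")):
        return "network_capture"
    if name == "info2":
        return "recycle_bin"
    if name.endswith(".log"):
        return "chat_log"
    return "other"  # hypothetical fallback category


def register_asset(inode: int, filename: str, local_path: str) -> dict:
    """Return the existing record for an inode, or create and index a new one."""
    if inode not in asset_library:
        asset_library[inode] = {
            "filename": filename,
            "path": local_path,
            "category": classify_asset(filename),
        }
    return asset_library[inode]


first = register_asset(1234, "WINDOWS/system32/config/SAM", "extracted/SAM")
again = register_asset(1234, "WINDOWS/system32/config/SAM", "extracted/SAM")
print(first["category"], first is again)
```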

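Forensic toolchain

Sleuth Kit (disk forensics): TSK is invoked through asynchronous subprocesses:

Tool	Purpose
`mmls`	Partition table analysis
`fsstat`	File system metadata
`fls`	Directory listing (including deleted files)
`icat`	Extract a file by inode
`srch_strings`	String search across the disk
`fls -m`	MAC timeline generation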
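
Calling a TSK command without blocking the event loop can be done with `asyncio.create_subprocess_exec`. The wrapper below is a sketch, not the contents of tools/sleuthkit.py: `run_command` and `list_directory_recursive` are hypothetical names, and the demo at the bottom runs the Python interpreter so that it works on machines without The Sleuth Kit installed.

```python
import asyncio
import sys


async def run_command(argv: list[str], timeout: float = 60.0) -> str:
    """Run a command asynchronously and return its decoded stdout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        proc.kill()          # do not leave a runaway process behind
        await proc.wait()
        raise
    if proc.returncode != 0:
        raise RuntimeError(f"{argv[0]} failed: {stderr.decode(errors='replace')}")
    return stdout.decode(errors="replace")


async def list_directory_recursive(image: str, offset: int) -> str:
    # fls: -r recurses into directories, -o is the partition offset in sectors
    return await run_command(["fls", "-r", "-o", str(offset), image])


# Demo that does not need TSK: run the current Python interpreter instead.
print(asyncio.run(run_command([sys.executable, "-c", "print('ok')"])).strip())
```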

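regipy (registry parsing): reads the SYSTEM / SOFTWARE / SAM / NTUSER.DAT binaries directly and extracts system information, user accounts, network configuration, installed software, email accounts, shutdown time, and more.

File parsers: Prefetch binaries (.pf), PCAP string extraction (HTTP requests / Host / Cookie / UA), generic text and binary reads, regex search, and hex dump.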

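Interrupted-run recovery and run archiving

Three layers of protection:

EvidenceGraph auto-persistence: every write operation (add_phenomenon / add_hypothesis / add_edge / add_lead, and so on) is flushed to disk automatically (atomic write to a .tmp file, then rename)
Agent-level fault tolerance: when a single agent fails, its lead is marked failed; 3 consecutive failures raise AnalysisAborted for a graceful exit; at the end of Phase 3 each failed lead is retried once (retry=True prevents an infinite loop)
Resume: on startup main.py scans runs/*/graph_state.json; any directory that has that file but lacks run_metadata.json triggers a recovery prompt, and the current graph state determines which phase to resume from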
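
The write-to-.tmp-then-rename pattern from the first layer can be sketched as follows. `save_atomic` is a hypothetical name, and the `fsync` call is an extra safety step this sketch adds; the README only mentions the temporary file and the rename.

```python
import json
import os
import tempfile
from pathlib import Path


def save_atomic(path: Path, state: dict) -> None:
    """Write JSON to path so readers never see a half-written file."""
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(state, fh, ensure_ascii=False, indent=2)
        fh.flush()
        os.fsync(fh.fileno())  # push the bytes to disk before renaming
    os.replace(tmp, path)      # atomic when tmp and path share a file system


run_dir = Path(tempfile.mkdtemp())
target = run_dir / "graph_state.json"
save_atomic(target, {"phenomena": [], "hypotheses": []})
print(json.loads(target.read_text(encoding="utf-8")))
```

If the process dies mid-write, only the .tmp file is damaged and the previous graph_state.json stays readable for a resume.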

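Run archive directory

runs/
  2026-04-02T14-30-00/
    config.yaml                    Configuration snapshot
    graph_state.json               Live graph state (used for resuming)
    phenomena.json                 Exported phenomena
    hypotheses.json                Hypotheses + confidence log
    entities.json                  Entities
    edges.json                     Edges
    leads.json                     Leads and their final status
    extracted/                     Files extracted from the image
    <image>_forensic_report.md     Forensic report
    run_metadata.json              Run metadata (duration, statistics, errors)
    masforensics.log               Run log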

Quick start

Requirements

Python >= 3.14
The Sleuth Kit (installed system-wide; provides mmls, fls, icat, and related commands)
A disk image file

Installation

uv sync

Configuration

Edit config.yaml:

agent:
  base_url: "https://your-api-proxy.com"
  api_key: "sk-your-key"
  model: "claude-sonnet-4-6"
  max_tokens: 16384

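max_investigation_rounds: 5          # Maximum number of Phase 3 iterations

# hypotheses:                        # Optional: specify initial hypotheses manually
#   - title: "The suspect actively performed network sniffing"
#     description: "..."

# investigation_areas:                 # Optional: manual override (fully LLM-derived by default)
#   - area: shutdown_time              #   via slug dedupe the LLM only augments the
#     agent: registry                  #   keyword/tool lists and never overwrites manual entries
#     priority: 3
#     keywords: [shutdown]
#     tools: [get_shutdown_time]

When hypotheses is not configured, HypothesisAgent generates them automatically.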

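Running

python main.py                       # Choose the image and partition interactively
python main.py /path/to/image/dir    # Use the given image directory

If a run is interrupted, the next launch automatically detects the unfinished run and asks whether to resume it.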
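
Detection of an unfinished run can be pictured as a scan for run directories that contain graph_state.json but no run_metadata.json. The function below is a sketch with a hypothetical name, `find_unfinished_runs`; the demo builds a throwaway runs/ tree instead of touching real data.

```python
import tempfile
from pathlib import Path


def find_unfinished_runs(runs_dir: Path) -> list[Path]:
    """Return run directories that have graph state but no final metadata."""
    unfinished = []
    for state_file in sorted(runs_dir.glob("*/graph_state.json")):
        run_dir = state_file.parent
        if not (run_dir / "run_metadata.json").exists():
            unfinished.append(run_dir)
    return unfinished


# Demo: one finished run and one interrupted run in a temporary directory.
root = Path(tempfile.mkdtemp())
for name, finished in (("2026-04-02T14-30-00", True), ("2026-04-03T09-00-00", False)):
    run = root / name
    run.mkdir()
    (run / "graph_state.json").write_text("{}", encoding="utf-8")
    if finished:
        (run / "run_metadata.json").write_text("{}", encoding="utf-8")

print([p.name for p in find_unfinished_runs(root)])
```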

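Regenerating the report only

After a completed run, if you only want to change the prompt or fix the report:

python regenerate_report.py runs/<timestamp>

This skips Phases 1-4 and reruns ReportAgent directly from the existing graph_state.json.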

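Project structure

MASForensics/
├── main.py                  Entry point, image selection, interrupted-run recovery
├── orchestrator.py          Five-phase pipeline scheduling
├── evidence_graph.py        Evidence knowledge graph + edge weight table + persistence
├── base_agent.py            Agent base class + built-in graph tools
├── agent_factory.py         Role templates + dynamic agent composition
├── tool_registry.py         Tool catalog + result cache + automatic classification
├── llm_client.py            LLM API client
├── log_config.py            Colored terminal logging + file logging
├── regenerate_report.py     Regenerates the report from an existing graph_state
├── config.yaml              Configuration + investigation areas + optional hypotheses
├── agents/
│   ├── hypothesis.py        HypothesisAgent (add_hypothesis, link)
│   ├── report.py            ReportAgent (consolidated report, ships its own read tools)
│   ├── timeline.py          TimelineAgent (kept for future extension)
│   └── ...                  filesystem/registry/communication/network (same as above)
├── tools/
│   ├── sleuthkit.py         Async TSK wrappers
│   ├── registry.py          regipy parsing
│   └── parsers.py           Prefetch / PCAP / generic file parsing
├── image/                   Disk images (supplied by the user)
├── runs/                    Run archives
└── tests/
    └── test_optimizations.py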

Dependencies

Package	Purpose
`httpx[socks]`	Async HTTP client (with SOCKS proxy support)
`pyyaml`	Config file parsing
`regipy`	Windows registry hive parsing
`pytest` / `pytest-asyncio`	Testing

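Default case

CFReDS Hacking Case (the standard NIST forensics teaching image):

Image: SCHARDT.001 (~4.6 GB, IBM hard drive, 8 segments)
OS: Windows XP
Scenario: forensic analysis of a computer suspected of being used for hacking
Full-image MD5: AEE4FCD9301C03B3B054623CA261959A (config.yaml contains per-segment MD5 hashes for verification)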
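
Checking a split image against the full-image MD5 amounts to hashing the segments in order as one stream. The helper below is a sketch with a hypothetical name, `md5_of_segments`; the demo uses two tiny temporary files rather than the real image.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_segments(segments: list[Path], chunk_size: int = 1 << 20) -> str:
    """MD5 of the segments concatenated in the given order, as uppercase hex."""
    digest = hashlib.md5()
    for segment in segments:
        with open(segment, "rb") as fh:
            while chunk := fh.read(chunk_size):  # stream to keep memory flat
                digest.update(chunk)
    return digest.hexdigest().upper()


# Demo with two small stand-in segments.
tmp_dir = Path(tempfile.mkdtemp())
part1, part2 = tmp_dir / "demo.001", tmp_dir / "demo.002"
part1.write_bytes(b"abc")
part2.write_bytes(b"def")
print(md5_of_segments([part1, part2]) == hashlib.md5(b"abcdef").hexdigest().upper())
```

For the real case, pass the segment paths in order and compare the result with the full-image MD5 listed for the default case.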

Testing

python -m pytest tests/ -v
