BattleTag 097d2ce472 Initial commit

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-09 17:36:26 +08:00

Files in this commit (every entry: "Initial commit", 2026-05-09 17:36:26 +08:00):

agents
tools
.python-version
agent_factory.py
base_agent.py
evidence_graph.py
llm_client.py
log_config.py
main.py
orchestrator.py
pyproject.toml
README.md
regenerate_report.py
tool_registry.py
uv.lock

README.md

MASForensics

Multi-Agent System for Digital Forensics: an LLM-based multi-agent system for digital forensics.

Six specialized Agents work together to analyze a disk image automatically and produce a structured forensic report.

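Architecture

main.py                          Entry point: config loading, resume detection, run management
  │
  ├── Orchestrator               Four-phase pipeline scheduling
  │     │
  │     ├── FileSystemAgent      Disk layout, file systems, deleted files, Prefetch
  │     ├── RegistryAgent        Registry analysis (system/user/network/software)
  │     ├── CommunicationAgent   Email, IRC chat logs
  │     ├── NetworkAgent         Browser history, PCAP captures
  │     ├── TimelineAgent        Cross-category timeline correlation
  │     └── ReportAgent          Consolidated report generation
  │
  ├── Blackboard                 Shared knowledge base (Evidence + Lead)
  └── LLMClient                  Claude API calls (ReAct pattern)

Agents do not talk to each other directly. They share findings (Evidence) and leads (Lead) through the Blackboard.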
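
To make the Blackboard pattern concrete, here is a minimal sketch. Only the names Blackboard, Evidence, Lead, add_evidence, and add_lead come from this README; every field name, the status values, and the pending_leads helper are assumptions for illustration and may differ from the actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    source_agent: str  # hypothetical field: which agent reported the finding
    category: str      # hypothetical field: e.g. "registry" or "network"
    summary: str


@dataclass
class Lead:
    target_agent: str         # hypothetical field: agent type that should follow up
    description: str
    status: str = "pending"   # assumed states: pending / done / failed


@dataclass
class Blackboard:
    evidence: list[Evidence] = field(default_factory=list)
    leads: list[Lead] = field(default_factory=list)

    def add_evidence(self, item: Evidence) -> None:
        self.evidence.append(item)

    def add_lead(self, lead: Lead) -> None:
        self.leads.append(lead)

    def pending_leads(self, agent: str) -> list[Lead]:
        # Agents read the board instead of calling each other.
        return [
            lead for lead in self.leads
            if lead.target_agent == agent and lead.status == "pending"
        ]


board = Blackboard()
board.add_evidence(Evidence("FileSystemAgent", "filesystem", "example finding"))
board.add_lead(Lead("RegistryAgent", "Inspect the SYSTEM hive"))
```

In this sketch the only coupling between agents is the shared board: one agent posts a Lead, and another picks it up by filtering on its own type.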

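Investigation workflow

| Phase | Description |
| --- | --- |
| Phase 1 | FileSystemAgent surveys the disk image, identifies partitions, directory structure, and key files, and produces the initial Leads |
| Phase 2 | Multi-round lead tracking: Leads are grouped by Agent type and dispatched in parallel, for at most 10 rounds |
| Phase 2.5 | Coverage gap analysis: checks the 10 investigation areas in config.yaml and fills any gaps automatically |
| Phase 3 | TimelineAgent combines all Evidence into an event timeline |
| Phase 4 | ReportAgent generates the forensic report in Markdown |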
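
The Phase 2 loop can be pictured roughly as follows. This is a sketch under stated assumptions, not the Orchestrator's actual code: run_agent is a hypothetical stand-in for a real agent, leads are plain (agent_type, text) tuples, and the toy rule for spawning follow-up leads exists only to make the example terminate. The 10-round cap is the one documented above.

```python
import asyncio
from collections import defaultdict

MAX_ROUNDS = 10  # the README caps Phase 2 at 10 rounds


async def run_agent(agent_type: str, leads: list[str]) -> list[tuple[str, str]]:
    """Hypothetical agent stand-in: returns follow-up (agent_type, lead) pairs."""
    await asyncio.sleep(0)  # placeholder for LLM and tool calls
    # Toy rule: a filesystem lead spawns one registry lead; nothing else spawns more.
    if agent_type == "filesystem":
        return [("registry", f"follow-up to {lead}") for lead in leads]
    return []


async def track_leads(initial: list[tuple[str, str]]) -> int:
    """Dispatch leads round by round and return the number of rounds executed."""
    pending = list(initial)
    rounds = 0
    while pending and rounds < MAX_ROUNDS:
        rounds += 1
        grouped: dict[str, list[str]] = defaultdict(list)
        for agent_type, lead in pending:
            grouped[agent_type].append(lead)
        # One task per agent type; the groups run concurrently.
        results = await asyncio.gather(
            *(run_agent(agent, leads) for agent, leads in grouped.items())
        )
        pending = [new for batch in results for new in batch]
    return rounds


rounds_used = asyncio.run(
    track_leads([("filesystem", "partition 1"), ("network", "pcap")])
)
```

The loop stops when no new leads appear or when the round cap is reached, whichever comes first.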

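Forensic toolchain

Sleuth Kit (disk forensics)

The TSK command-line tools are invoked through asynchronous subprocesses:

| Tool | Purpose |
| --- | --- |
| `mmls` | Partition table analysis |
| `fsstat` | File system metadata |
| `fls` | Directory listing (including deleted files) |
| `icat` | File extraction by inode |
| `srch_strings` | String search across the disk |
| `fls -m` | MAC timeline generation |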
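
An asynchronous subprocess wrapper of the kind described above might look like the sketch below. The function name run_tool, the timeout default, and the error handling are assumptions; the project's real wrapper in tools/sleuthkit.py may differ. The demo call uses the Python interpreter instead of a TSK binary so that it runs without The Sleuth Kit installed.

```python
import asyncio
import sys


async def run_tool(*argv: str, timeout: float = 60.0) -> str:
    """Run a command without a shell and return its stdout as text."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        # Do not leave a stuck tool running in the background.
        proc.kill()
        await proc.wait()
        raise
    if proc.returncode != 0:
        message = stderr.decode(errors="replace")
        raise RuntimeError(f"{argv[0]} exited with {proc.returncode}: {message}")
    return stdout.decode(errors="replace")


# With The Sleuth Kit installed, a partition listing would be requested as:
#     asyncio.run(run_tool("mmls", "image/SCHARDT.001"))  # path is an assumption
# The demo below uses the Python interpreter so it runs anywhere.
demo_output = asyncio.run(run_tool(sys.executable, "-c", "print('ok')"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with image paths and inode numbers.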

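regipy (registry parsing)

Parses Windows registry hive binaries (SYSTEM, SOFTWARE, SAM, NTUSER.DAT) directly and extracts system information, user accounts, network configuration, installed software, email accounts, the shutdown time, and more.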
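
Several of these values need decoding after they are read from the hive. The shutdown time, for example, is stored by Windows as an 8-byte FILETIME: a little-endian count of 100-nanosecond ticks since 1601-01-01 UTC. The sketch below shows only that decoding step; the function name is hypothetical, and reading the raw bytes from the hive with regipy is deliberately left out.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(raw: bytes) -> datetime:
    """Decode an 8-byte little-endian FILETIME into an aware UTC datetime."""
    if len(raw) != 8:
        raise ValueError("FILETIME must be exactly 8 bytes")
    (ticks,) = struct.unpack("<Q", raw)
    # One tick is 100 ns, so ten ticks make one microsecond.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# 116444736000000000 ticks is the Unix epoch expressed as a FILETIME.
unix_epoch = filetime_to_datetime(struct.pack("<Q", 116444736000000000))
```

Keeping the result timezone-aware makes it easier to correlate registry timestamps with other sources when building a timeline.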

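File parsers

- Prefetch: binary parsing of Windows XP .pf files (run count, last execution time)
- PCAP: extracts HTTP requests, Host, Cookie, and User-Agent from capture files
- Generic text/binary: reads at an offset, regex search, hex dump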
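
The generic text/binary helpers can be approximated with a few lines of standard-library code. This is an illustrative sketch: the function names read_at, hex_dump, and regex_offsets are assumptions and are not taken from tools/parsers.py.

```python
import re


def read_at(data: bytes, offset: int, length: int) -> bytes:
    """Return up to `length` bytes starting at `offset`."""
    return data[offset:offset + length]


def hex_dump(data: bytes, base: int = 0, width: int = 16) -> str:
    """Format bytes as offset, hex columns, and printable ASCII."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{base + i:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


def regex_offsets(data: bytes, pattern: bytes) -> list[int]:
    """Return the byte offset of every match of a bytes regex."""
    return [m.start() for m in re.finditer(pattern, data)]


sample = b"\x00\x01GET /index.html HTTP/1.1\r\nHost: example.com\r\n"
host_offsets = regex_offsets(sample, rb"Host: ")
dump = hex_dump(read_at(sample, 0, 16))
```

Searching with a bytes pattern keeps the offsets meaningful for binary files, where decoding to text first would shift positions.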

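Disconnection recovery and data archiving

The system has three layers of protection against network interruptions during long runs:

- Automatic Blackboard persistence: every add_evidence / add_lead call writes the state to disk (atomic write)
- Agent-level fault tolerance: when an Agent fails, its Lead is marked failed without affecting the other Agents, and it is retried once automatically
- Graceful exit: after 3 consecutive Agent failures, the system saves the results so far and exits cleanly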
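
One common way to get the atomic write mentioned in the first layer is to write to a temporary file in the same directory and then rename it over the target. The sketch below uses that technique; it is an assumption about how the persistence could work, not a copy of the project's code, and atomic_write_json is a hypothetical name.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, payload: dict) -> None:
    """Write JSON so readers see the old file or the new one, never a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, ensure_ascii=False, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic rename on the same filesystem
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


state_dir = Path(tempfile.mkdtemp())
state_file = state_dir / "blackboard_state.json"
atomic_write_json(state_file, {"evidence": [], "leads": [{"status": "pending"}]})
restored = json.loads(state_file.read_text(encoding="utf-8"))
```

If the process dies mid-write, the previous blackboard_state.json is still intact, which is what makes resuming from it safe.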

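Each run automatically creates a timestamped archive directory:

runs/
  2026-04-02T14-30-00/
    config.yaml              Configuration snapshot
    blackboard_state.json    Live state (used for recovery)
    evidence.json            Structured evidence export
    leads.json               Leads and their final status
    report.md                Forensic report
    run_metadata.json        Run metadata (duration, statistics, errors)
    masforensics.log         Run log

After an interruption, run python main.py again. The system detects the unfinished run and offers to resume it.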

Quick start

Requirements

- Python >= 3.14
- The Sleuth Kit (installed system-wide, providing mmls, fls, icat, and related commands)
- Disk image files placed in the image/ directory

Installation

uv sync

Configuration

Edit config.yaml and fill in the LLM API endpoint and key:

agent:
  base_url: "https://your-api-proxy.com"
  api_key: "sk-your-key"
  model: "claude-sonnet-4-6"
  max_tokens: 4096

The investigation_areas section defines the investigation areas that must be covered. Add or remove entries as needed.
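
The gap analysis that consumes investigation_areas (Phase 2.5) reduces to a set difference between the required areas and the categories that already have evidence. The following sketch illustrates that idea only; the function name and the area names are hypothetical, and the real list lives in config.yaml.

```python
def coverage_gaps(required_areas: list[str], evidence_categories: list[str]) -> list[str]:
    """Return required areas with no evidence yet, preserving config order."""
    covered = set(evidence_categories)
    return [area for area in required_areas if area not in covered]


# Hypothetical area names; the real list lives in config.yaml.
investigation_areas = ["system_info", "user_accounts", "network_config", "email"]
seen_categories = ["system_info", "email", "email"]
gaps = coverage_gaps(investigation_areas, seen_categories)
```

Each returned area could then be turned into a new Lead for the appropriate Agent, which is one way the automatic gap filling might be driven.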

Run

python main.py

The report and all structured data are saved under runs/<timestamp>/.

Project structure

MASForensics/
├── main.py              Entry point
├── orchestrator.py      Pipeline scheduling
├── blackboard.py        Shared knowledge base
├── llm_client.py        LLM API client
├── base_agent.py        Agent base class
├── config.yaml          Configuration file
├── agents/
│   ├── filesystem.py    File system Agent
│   ├── registry.py      Registry Agent
│   ├── communication.py Communication Agent
│   ├── network.py       Network Agent
│   ├── timeline.py      Timeline Agent
│   └── report.py        Report Agent
├── tools/
│   ├── sleuthkit.py     Sleuth Kit wrapper
│   ├── registry.py      Registry parsing (regipy)
│   └── parsers.py       File format parsers
├── image/               Disk images
├── extracted/           Extracted files (generated at runtime)
└── runs/                Run archives

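Dependencies

| Package | Purpose |
| --- | --- |
| `httpx[socks]` | Asynchronous HTTP client (with SOCKS proxy support) |
| `pyyaml` | Configuration file parsing |
| `regipy` | Windows registry hive parsing |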

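Current case

The default configuration analyzes the CFReDS Hacking Case (a standard NIST forensics teaching image):

- Image: SCHARDT.001 (~4.6 GB, IBM hard disk, 8 segments)
- System: Windows XP
- Scenario: forensic analysis of a computer suspected of being used for hacking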

Tests

python -m pytest tests/ -v
