Compare commits

8 Commits

Author SHA1 Message Date
BattleTag
444d58726a refactor: native tool calling + generic forced-retry + terminal exit
- llm_client: switch tool_call_loop from text-based <tool_call> regex
  to OpenAI-native tools=[...] / structured tool_calls field; accumulate
  delta.reasoning_content for DeepSeek thinking-mode echo-back; fold
  preserves system msg and aligns boundary to never orphan role:tool
- base_agent: generic forced-retry via mandatory_record_tools class attr
  (filesystem -> add_phenomenon, timeline -> add_temporal_edge,
  hypothesis -> add_hypothesis, report -> save_report); count via
  executor wrapper
- terminal_tools class attr + loop short-circuit: when a terminal tool
  is called, loop exits with its raw return as final_text. ReportAgent
  declares save_report as terminal - replaces the <answer>-tag stop
  signal that native tool calling broke
- _execute_*: return (raw, formatted) - terminal exit uses untruncated
  raw, conversation history uses 3000-char-capped formatted
- evidence_graph + orchestrator: LLM-derived InvestigationArea support
  (hypothesis-driven coverage check, replaces hardcoded _AREA_KEYWORDS /
  _AREA_TOOLS); manual yaml block kept as optional seed
- strip <answer> references from agent prompts (no longer load-bearing)

Verified on CFReDS image across 4 smoke runs: 0 JSON parse failures
(was 3); 22 temporal edges from Phase 4 (was 0); ReportAgent exits via
save_report (was max_iterations regression). 78/78 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:51:19 +08:00
BattleTag
0a2b344c84 fix: share _safe_json_loads with tool-call parser, not just orchestrator
Move _safe_json_loads from orchestrator.py to llm_client.py and have
_extract_tool_calls use it when parsing <tool_call> JSON blocks from
model output. orchestrator now imports it from llm_client.

Background: in the first full DeepSeek run (runs/2026-05-12T17-25-38),
~10 'Failed to parse tool call JSON' warnings appeared, all from regex
patterns where the LLM wrote \. or \* inside JSON string values:

  Failed to parse tool call JSON: {..., "pattern": "Outlook Express|...|\.dbx"}
  Failed to parse tool call JSON: {..., "pattern": "ethereal.*\.pcap"}
  Failed to parse tool call JSON: {..., "pattern": "lookatlan.*\.txt|..."}

These are exactly the kind of stray-backslash errors stage-1 sanitize
already handles for orchestrator JSON calls — but tool-call extraction
was using bare json.loads. Result: each failed tool call silently dropped
on the floor, the LLM never got a result, and at least one network agent
burned 14m26s spinning before hitting max_iterations=40.

Now the sanitize/log-on-failure path is shared. Verified against the
three failure cases from yesterday's log: all three now parse cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 20:29:21 +08:00
BattleTag
76df34ed79 docs: add TODO marker for adaptive edge weights
Note that the hard-coded HYPOTHESIS_EDGE_WEIGHTS table is a temporary
choice; an adaptive scheme should be explored once the full pipeline
is stable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:14:23 +08:00
BattleTag
893f5b5de2 fix: address agent boundary / JSON robustness / Phase 4 no-op from CFReDS run
Issues found running the system end-to-end on the NIST CFReDS Hacking Case
disk image (SCHARDT.001, Mr. Evil). Four interconnected fixes:

1. HypothesisAgent boundary leak (two layers)
   B.1 Tool set: BaseAgent._register_graph_tools was registering
       add_phenomenon / add_lead / link_to_entity for every agent. With
       an empty graph in Phase 2, HypothesisAgent "compensated" by
       inventing phenomena, dispatching leads, and linking entities.
   B.2 Prompt leak: BaseAgent's shared system prompt hard-coded "Call
       investigation tools (list_directory, parse_registry_key, etc.)".
       HypothesisAgent hallucinated list_directory and wasted 2 LLM
       rounds on 'unknown tool' errors before backing off.

   Fix:
   - Split _register_graph_tools into _register_graph_read_tools +
     _register_graph_write_tools.
   - HypothesisAgent, ReportAgent, TimelineAgent override
     _register_graph_tools to skip write tools.
   - HypothesisAgent and TimelineAgent override _build_system_prompt
     with focused, role-specific workflows (no Phase A-D investigation
     boilerplate).

2. JSON parse failures in Phase 3 lead generation (5/6 hypotheses lost)
   DeepSeek emits JSON with stray backslashes (Windows path references)
   and occasional minor syntax slips. Old single-stage sanitize couldn't
   recover; per-hypothesis fallback silently swallowed each failure.

   Fix:
   - _safe_json_loads: progressive — stage 0 as-is, stage 1 escape stray
     \X (anything not in valid JSON escape set), log raw input on final
     failure for diagnosis.
   - New _call_llm_for_json helper: on parse failure, append the error
     to the prompt and re-call LLM (self-correcting retry, up to 2).
   - All 4 LLM-JSON callsites in orchestrator refactored to use it.

3. Phase 1 sometimes skipped add_phenomenon (LLM treated <answer> as deliverable)
   Strengthen BaseAgent's RECORDING REQUIREMENT — explicit "your <answer>
   is DISCARDED; only graph mutations propagate" plus a new rule:
   negative findings (searched X, found nothing) MUST also be recorded
   as phenomena, since they constrain the hypothesis space.

4. Phase 4 Timeline was a no-op
   TimelineAgent inherited BaseAgent's Phase A-D prompt and never called
   add_temporal_edge — produced 0 temporal edges. Override the prompt
   with concrete workflow (build_filesystem_timeline ->
   get_timestamped_phenomena -> 15-40 add_temporal_edge calls) and
   restrict tool set to read-only + its 3 temporal tools.

Verified end-to-end: HypothesisAgent now 8 tools (no writes), ReportAgent
13 (no graph writes), TimelineAgent 10 (read + temporal + timeline).
All 60 unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:14:16 +08:00
BattleTag
0a966d8476 feat: switch LLM client to OpenAI SDK for DeepSeek compatibility
The previous LLMClient used raw httpx + Claude Messages API (/v1/messages,
x-api-key, Anthropic SSE event types). Incompatible with DeepSeek.

Rewrite LLMClient.__init__/chat/close to use openai.AsyncOpenAI:
- /v1/chat/completions endpoint, OpenAI message format
- Bearer auth, native SDK error types
- Stream chunks via async for + chunk.choices[0].delta.content

Tool calling protocol (ReAct text-based tags) and all surrounding helpers
(_apply_progressive_decay, _fold_old_messages, _partition_tool_calls,
tool_call_loop, etc.) are unchanged — endpoint-agnostic by design.

New optional config params surfaced to config.yaml.agent:
- reasoning_effort: "high" | "medium" | "low" — DeepSeek/o1-style depth
- thinking_enabled: bool — DeepSeek extra_body.thinking switch

main.py and regenerate_report.py pass these through to LLMClient.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:13:54 +08:00
BattleTag
31812a72ee test: track tests/ directory in version control
tests/test_optimizations.py — 60 pytest cases covering:
- EvidenceGraph: quality scoring, Jaccard merge, async safety,
  hypothesis confidence updates, asset library
- llm_client: tool-result truncation, parallel batch execution,
  progressive context decay, message folding
- orchestrator: parallel dispatch, batched lead generation,
  batched judging
- tool_registry: result cache key derivation

FakeAgent.run signatures updated to BaseAgent.run(task, lead_id=None).

Previously listed in .gitignore (which is itself untracked, so the
ignore rule lives only locally).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 14:10:31 +08:00
BattleTag
74e6bde13a refactor: lead provenance, unified link path, SSOT cleanup, configurable weights
Five interrelated cleanups:

1. Lead -> Phenomenon provenance
   - Phenomenon.from_lead_id field on the dataclass
   - BaseAgent.run(lead_id=...) writes self._current_lead_id
   - _add_phenomenon auto-injects from agent state (LLM unaware)
   - Orchestrator dispatch passes lead.id; Phase 1/2-auto/4/5 stay None
   - Merge path preserves the first non-None lead_id on collision

2. Unified Phenomenon <-> Hypothesis link path
   - HypothesisAgent only adds hypotheses, never links
   - link_phenomenon_to_hypothesis tool + executor removed
   - All links go through Orchestrator._judge_new_phenomena
   - Phase 2 unconditionally judges after hypothesis generation
   - Gap Analysis judges after each dispatch round
   (Three previously-missing judge calls now in place.)

3. SSOT in agent subclasses
   - Remove RoleTemplate dataclass, ROLE_TEMPLATES dict,
     _instantiate_from_template method
   - Each agent subclass owns name, role, and tool list
   - agent_factory.py shrinks from 299 to 153 lines
   - All 7 agents now route through _AGENT_CLASSES (filesystem,
     registry, communication, network, timeline were previously dead
     subclasses overridden by templates)

4. Configurable edge weights
   - HYPOTHESIS_EDGE_WEIGHTS -> _DEFAULT_EDGE_WEIGHTS (private default)
   - EvidenceGraph(edge_weights=...) override via config.yaml
   - hypothesis_edge_weights section in config.yaml (commented example)
   - main.py and regenerate_report.py read and pass through

5. regenerate_report.py auto-picks the latest run/*/graph_state.json
   when no CLI arg is given (was a hardcoded date path)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 14:10:15 +08:00
BattleTag
fde96c7d9f docs: rewrite README for EvidenceGraph + 5-phase + 7-agent architecture
Previous README described a Blackboard-based 4-phase, 6-agent system.
The actual code uses:
- EvidenceGraph with typed weighted edges (Phenomenon/Hypothesis/Entity)
- 5 phases (explicit Hypothesis Generation between survey and investigation)
- 7 agents (added HypothesisAgent)

Documents the confidence update formula, Phenomenon Jaccard merging,
Asset Library inode dedup, tool-result caching, Gap Analysis coverage
check, auto-persistence, and the resume mechanism.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 14:09:59 +08:00
14 changed files with 2992 additions and 713 deletions

238
README.md
View File

@@ -2,43 +2,120 @@
Multi-Agent System for Digital Forensics — 基于大语言模型的多智能体电子取证系统。
系统通过 6 个专业化 Agent 协同工作,对磁盘镜像进行自动化取证分析,最终生成结构化的取证报告。
系统通过 7 个专业化 Agent 协同工作,对磁盘镜像进行自动化取证分析,最终生成结构化的取证报告。Agent 之间不直接通信,通过共享的 **EvidenceGraph**(证据知识图)协作。
## 架构
```
main.py 入口:配置加载、恢复检测、运行管理
main.py 入口:配置加载、镜像选择、断连恢复
├── Orchestrator 阶段流水线调度
├── Orchestrator 阶段流水线调度
│ │
│ ├── FileSystemAgent 磁盘结构、文件系统、删除文件、Prefetch
│ ├── RegistryAgent 注册表分析(系统/用户/网络/软件)
│ ├── CommunicationAgent 邮件、IRC 聊天记录
│ ├── FileSystemAgent 分区/文件系统、目录、删除文件、Prefetch
│ ├── HypothesisAgent 生成假设,链接已有证据
│ ├── RegistryAgent 注册表分析SYSTEM/SOFTWARE/SAM/NTUSER.DAT
│ ├── CommunicationAgent 邮件、IRC/mIRC 聊天记录
│ ├── NetworkAgent 浏览器历史、PCAP 抓包
│ ├── TimelineAgent 跨类别时间线关联
│ └── ReportAgent 综合报告生成
├── Blackboard 共享知识库Evidence + Lead
── LLMClient Claude API 调用ReAct 模式)
├── EvidenceGraph 带类型边的证据知识图(自动持久化
── AgentFactory 角色模板 + 动态 Agent 组合
├── ToolRegistry 工具目录 + 结果缓存
└── LLMClient Claude API 客户端异步、tool-use
```
Agent 之间不直接通信,通过 **Blackboard黑板** 共享发现Evidence和线索Lead
## EvidenceGraph证据知识图
## 调查流程
三类节点 + 类型化加权边:
| 节点 | 前缀 | 含义 |
|---|---|---|
| `Phenomenon` | `ph-*` | 可观测的取证产物(一条具体发现) |
| `Hypothesis` | `hyp-*` | 解释性假设(待验证的论断) |
| `Entity` | `ent-*` | 人、程序、主机、IP 等可复现的实体 |
Phenomenon → Hypothesis 的边类型与权重写死在 `HYPOTHESIS_EDGE_WEIGHTS`
# TODO
当前流程跑通以后,寻找自适应方案
| 边类型 | 权重 | 语义 |
|---|---:|---|
| `direct_evidence` | +0.25 | 现象就是假设所述行为本身 |
| `supports` | +0.15 | 与假设一致但非决定性 |
| `consequence_observed` | +0.15 | 观察到假设预期的结果 |
| `prerequisite_met` | +0.10 | 满足假设的前置条件 |
| `weakens` | 0.10 | 降低假设可能性 |
| `contradicts` | 0.20 | 直接反驳假设 |
置信度更新公式(收敛于 [0, 1]
- 正向边:`delta = weight * (1 - old_conf)`
- 负向边:`delta = weight * old_conf`
跨阈值自动转状态:≥ 0.8 → `supported`,≤ 0.2 → `refuted`,跑完仍 active → `inconclusive`。LLM 只负责挑边类型(分类任务),权重表与状态转移由代码裁决,避免数值幻觉。
新增 Phenomenon 时通过 Jaccard 相似度合并title > 0.6 且 description > 0.4 即视为重复,合并后提升置信度并追加 `corroborating_agents`),避免同一发现被重复入图。
## 五阶段流水线
| 阶段 | 说明 |
|------|------|
| **Phase 1** | FileSystemAgent 勘查磁盘镜像,识别分区、目录结构、关键文件,产出初始 Lead |
| **Phase 2** | 多轮线索追踪 — Lead 按 Agent 类型分组并行派发,最多 10 轮迭代 |
| **Phase 2.5** | 覆盖率缺口分析 — 对照 config.yaml 中的 10 个调查领域,自动补漏 |
| **Phase 3** | TimelineAgent 综合所有 evidence 建立事件时间线 |
| **Phase 4** | ReportAgent 生成 Markdown 格式取证报告 |
| **Phase 1** | FileSystemAgent 勘镜像,识别分区/文件系统/关键路径,产出首批 Phenomenon |
| **Phase 2** | 假设生成 — 优先读 `config.yaml:hypotheses`;未配置则由 HypothesisAgent 从 Phase 1 现象自动生成 3-7 个 |
| **Phase 3** | 假设驱动调查(默认 5 轮迭代)。每轮:一次性为所有 active 假设产出 leads → 按 agent 类型并发派发(信号量 = 3→ 一次性判定新现象与各假设的关系。所有假设收敛即提前退出。末尾:失败 lead 重试一次 + Gap Analysis |
| **Phase 4** | TimelineAgent `build_filesystem_timeline` 生成 MAC 时间线,与 Phenomenon 时间戳关联 |
| **Phase 5** | ReportAgent 综合假设、证据、实体,生成 Markdown 报告 |
## 取证工具链
### Investigation Areashypothesis-derived
### Sleuth Kit磁盘取证
Phase 2 末尾 orchestrator 调一次 LLM 从所有 active hypothesis 派生 5-12 个 **InvestigationArea**snake_case slug、description、suggested_agent、expected_keywords、expected_tools、priority、motivating_hypothesis_ids。Areas 存进 `graph.investigation_areas`,序列化到 `runs/<ts>/investigation_areas.json`。两个用途:
通过异步子进程调用 TSK 命令行工具:
1. **Phase 3 主循环提示** — 每个 hypothesis 块附 `Expected areas: a, b, c`LLM 仍自由选 lead 但有软引导
2. **Phase 3 末尾 Gap Analysis** — 两层判定覆盖情况:
- **关键词匹配**:扫 Phenomenon 标题/描述对照 area.expected_keywords
- **工具命中**:检查 area.expected_tools 是否实际调用过
未覆盖的 area 自动派 lead`suggested_agent` + `priority` + `motivating_hypothesis_ids[0]` 透传给 `Lead.hypothesis_id` 保留 provenance最多 3 轮补漏。
**手动 override**`config.yaml:investigation_areas` 默认注释掉,纯 LLM 派生。取消注释可添加强制必查的领域,会先于 LLM 写入并通过 slug-based dedupe 保护不被覆盖LLM 只会 augment keyword/tool 列表)。这是跨案件/跨平台适配的关键 —— 不再 hardcode Windows-specific 领域。
## Agent 体系
`AgentFactory` 维护 7 个角色模板(`ROLE_TEMPLATES`),每个模板指定默认工具集。`HypothesisAgent``ReportAgent``BaseAgent` 的子类(额外注册专用工具),其余 5 个 Agent 直接由 `BaseAgent` + 工具列表生成。
### Agent 工作流
`BaseAgent.run` 在 system prompt 中强制四阶段:
```
A. INVESTIGATE 先查图状态 / Asset Library再调取证工具
B. RECORD 每条发现写 add_phenomenon
C. LINK 按需 link_to_entity但禁止凭记忆引用 ph-id必须先 list_phenomena
D. ANSWER 以上完成后再给最终答复
```
prompt 内置**反幻觉规则**:只允许记录工具输出中逐字出现的内容;时间戳/路径/inode 必须来自工具返回;输出被截断须标 `[truncated]`
### 动态 Agent 组合
`AgentFactory.create_specialized_agent()` 应对能力缺口:将工具目录与假设描述喂给 LLM由其挑 3-8 个工具并写角色描述,工厂据此实例化新 Agent 并缓存。
## 工具系统
`tool_registry.py` 启动时调用 `register_all_tools(image_path, partition_offset, graph)`,将所有工具一次性注册到全局 `TOOL_CATALOG`
### 工具结果缓存
`CACHEABLE_TOOLS` 集合标记纯读取/确定性工具partition_info、list_directory、parse_registry_key …)。镜像只读,同 args 调用产出固定,命中缓存直接复用,错误结果不入缓存。
### Asset Library
`EvidenceGraph.asset_library` 按 inode 索引所有已提取文件,避免重复 extract。Agent 通过 `list_assets` / `find_extracted_file` 工具查询。新文件按文件名自动归类到 `registry_hive` / `chat_log` / `prefetch` / `network_capture` / `recycle_bin` 等十类之一。
### 取证工具链
**Sleuth Kit磁盘取证** — 异步子进程调用 TSK
| 工具 | 用途 |
|------|------|
@@ -49,47 +126,43 @@ Agent 之间不直接通信,通过 **Blackboard黑板** 共享发现E
| `srch_strings` | 磁盘字符串搜索 |
| `fls -m` | MAC 时间线生成 |
### regipy注册表解析
**regipy注册表解析** — 直接读 SYSTEM / SOFTWARE / SAM / NTUSER.DAT 二进制,提取系统信息、用户账户、网络配置、已安装软件、邮件账户、关机时间等。
直接解析 Windows 注册表 hive 二进制文件SYSTEM、SOFTWARE、SAM、NTUSER.DAT提取系统信息、用户账户、网络配置、已安装软件、邮件账户、关机时间等
**文件解析器** — Prefetch 二进制(`.pf`、PCAP 字符串提取HTTP 请求 / Host / Cookie / UA、通用文本与二进制读取、正则搜索、Hex dump
### 文件解析器
## 断连恢复与运行归档
- **Prefetch** — 二进制解析 Windows XP .pf 文件(运行次数、最后执行时间)
- **PCAP** — 从抓包文件提取 HTTP 请求、Host、Cookie、User-Agent
- **通用文本/二进制** — 按偏移读取、正则搜索、Hex dump
三层防护:
## 断连恢复与数据归档
1. **EvidenceGraph 自动持久化** — 每次 `add_phenomenon` / `add_hypothesis` / `add_edge` / `add_lead` 等写操作均自动落盘(原子写 `.tmp` 后 rename
2. **Agent 级容错** — 单 Agent 失败 → 该 lead 标 `failed`,连续 3 次失败触发 `AnalysisAborted` 优雅退出Phase 3 末尾对失败 lead 重试一次(`retry=True` 防无限循环)
3. **续跑**`main.py` 启动时扫 `runs/*/graph_state.json`,发现存在但缺 `run_metadata.json` 的目录即提示恢复,并按 graph 当前状态决定从哪一阶段续起
系统设计了三层防护,应对长时间运行中的网络中断:
1. **Blackboard 自动持久化** — 每次 add_evidence / add_lead 自动写盘(原子写入)
2. **Agent 级容错** — 单个 Agent 失败标记 Lead 为 failed不影响其他 Agent自动重试一次
3. **优雅退出** — 连续 3 次 Agent 失败后保存现有成果并干净退出
每次运行自动创建带时间戳的归档目录:
### 运行归档目录
```
runs/
2026-04-02T14-30-00/
config.yaml 配置快照
blackboard_state.json 实时状态(用于恢复
evidence.json 结构化证据导出
leads.json 线索及最终状态
report.md 取证报告
run_metadata.json 运行元数据(时长、统计、错误)
masforensics.log 运行日志
config.yaml 配置快照
graph_state.json 实时状态(续跑用)
phenomena.json 现象导出
hypotheses.json 假设 + 置信度日志
entities.json 实体
edges.json
leads.json 线索及最终状态
extracted/ 从镜像提取的文件
<image>_forensic_report.md 取证报告
run_metadata.json 运行元数据(时长、统计、错误)
masforensics.log 运行日志
```
中断后再次运行 `python main.py`,系统自动检测未完成的运行并提示恢复。
## 快速开始
### 环境要求
- Python >= 3.14
- The Sleuth Kit系统安装提供 `mmls``fls``icat` 等命令)
- 磁盘镜像文件置于 `image/` 目录
- 磁盘镜像文件
### 安装
@@ -99,50 +172,77 @@ uv sync
### 配置
编辑 `config.yaml`,填入 LLM API 地址和密钥
编辑 `config.yaml`
```yaml
agent:
base_url: "https://your-api-proxy.com"
api_key: "sk-your-key"
model: "claude-sonnet-4-6"
max_tokens: 4096
max_tokens: 16384
max_investigation_rounds: 5 # Phase 3 最大迭代轮数
# hypotheses: # 可选:手动指定初始假设
# - title: "嫌疑人主动实施网络嗅探"
# description: "..."
# investigation_areas: # 可选:手动 override默认全 LLM 派生)
# - area: shutdown_time # LLM 通过 slug dedupe 只 augment
# agent: registry # keyword/tool 列表,不覆盖 manual
# priority: 3
# keywords: [shutdown]
# tools: [get_shutdown_time]
```
`investigation_areas` 部分定义了必须覆盖的调查领域,可按需增减
未配置 `hypotheses` 时由 HypothesisAgent 自动生成
### 运行
```bash
python main.py
python main.py # 交互式选镜像与分区
python main.py /path/to/image/dir # 指定镜像目录
```
报告和所有结构化数据将保存在 `runs/<timestamp>/` 目录下
中断后再次运行会自动检测未完成的 run 并提示是否续跑
### 仅重生成报告
跑完一次后若只想换提示词或修复报告:
```bash
python regenerate_report.py runs/<timestamp>
```
跳过 Phase 1-4直接从已有 `graph_state.json` 重跑 ReportAgent。
## 项目结构
```
MASForensics/
├── main.py 入口
├── orchestrator.py 流水线调度
├── blackboard.py 共享知识库
├── llm_client.py LLM API 客户端
├── base_agent.py Agent 基类
├── config.yaml 配置文件
├── main.py 入口、镜像选择、断连恢复
├── orchestrator.py 五阶段流水线调度
├── evidence_graph.py 证据知识图 + 边权重表 + 持久化
├── base_agent.py Agent 基类 + 内建 graph 工具
├── agent_factory.py 角色模板 + 动态 Agent 组合
├── tool_registry.py 工具目录 + 结果缓存 + 自动归类
├── llm_client.py LLM API 客户端
├── log_config.py 彩色终端日志 + 文件日志
├── regenerate_report.py 从已有 graph_state 重生成报告
├── config.yaml 配置 + 调查领域 + 可选假设
├── agents/
│ ├── filesystem.py 文件系统 Agent
│ ├── registry.py 注册表 Agent
│ ├── communication.py 通信 Agent
── network.py 网络 Agent
│ ├── timeline.py 时间线 Agent
│ └── report.py 报告 Agent
│ ├── hypothesis.py HypothesisAgentadd_hypothesis、link
│ ├── report.py ReportAgent综合报告自带读取工具
│ ├── timeline.py TimelineAgent保留以备扩展
── ... filesystem/registry/communication/network同上
├── tools/
│ ├── sleuthkit.py Sleuth Kit 封装
│ ├── registry.py 注册表解析(regipy
│ └── parsers.py 文件格式解析
├── image/ 磁盘镜像
├── extracted/ 提取的文件(运行时生成)
└── runs/ 运行归档
│ ├── sleuthkit.py TSK 异步封装
│ ├── registry.py regipy 解析
│ └── parsers.py Prefetch / PCAP / 通用文件解析
├── image/ 磁盘镜像(用户放)
├── runs/ 运行归档
└── tests/
└── test_optimizations.py
```
## 依赖
@@ -152,14 +252,16 @@ MASForensics/
| `httpx[socks]` | 异步 HTTP 客户端(支持 SOCKS 代理) |
| `pyyaml` | 配置文件解析 |
| `regipy` | Windows 注册表 hive 解析 |
| `pytest` / `pytest-asyncio` | 测试 |
## 当前案例
## 默认案例
默认配置分析 **CFReDS Hacking Case**NIST 标准取证教学镜像):
**CFReDS Hacking Case**NIST 标准取证教学镜像):
- 镜像SCHARDT.001~4.6GBIBM 硬盘8 个分段)
- 镜像SCHARDT.001~4.6 GBIBM 硬盘8 个分段)
- 系统Windows XP
- 场景:涉嫌黑客入侵的计算机取证分析
- 完整镜像 MD5`AEE4FCD9301C03B3B054623CA261959A``config.yaml` 含各分段 MD5 用于校验)
## 测试

View File

@@ -1,150 +1,50 @@
"""Agent Factory — composes agents from tool registry and role templates.
"""Agent Factory — instantiates agents from registered classes.
Provides both pre-defined agent templates (filesystem, registry, etc.)
and LLM-driven dynamic agent composition for capability gaps.
Each agent type has a dedicated subclass under agents/ that owns its name,
role description, and tool list (single source of truth). The factory just
maps agent_type → class. Also supports LLM-driven dynamic composition for
capability gaps via create_specialized_agent().
"""
from __future__ import annotations
import json
import logging
from dataclasses import dataclass, field
from base_agent import BaseAgent
from evidence_graph import EvidenceGraph
from llm_client import LLMClient
from tool_registry import TOOL_CATALOG, ToolDefinition
from tool_registry import TOOL_CATALOG
# Agent classes with custom tools — keyed by template name
_AGENT_CLASSES: dict[str, type] = {}
# Agent classes keyed by name. Populated lazily to avoid circular imports.
_AGENT_CLASSES: dict[str, type[BaseAgent]] = {}
def _load_agent_classes() -> None:
"""Lazy-import custom agent classes to avoid circular imports."""
"""Lazy-import agent classes to avoid circular imports."""
if _AGENT_CLASSES:
return
from agents.communication import CommunicationAgent
from agents.filesystem import FileSystemAgent
from agents.hypothesis import HypothesisAgent
from agents.network import NetworkAgent
from agents.registry import RegistryAgent
from agents.report import ReportAgent
from agents.timeline import TimelineAgent
_AGENT_CLASSES["filesystem"] = FileSystemAgent
_AGENT_CLASSES["registry"] = RegistryAgent
_AGENT_CLASSES["communication"] = CommunicationAgent
_AGENT_CLASSES["network"] = NetworkAgent
_AGENT_CLASSES["timeline"] = TimelineAgent
_AGENT_CLASSES["hypothesis"] = HypothesisAgent
_AGENT_CLASSES["report"] = ReportAgent
logger = logging.getLogger(__name__)
@dataclass
class RoleTemplate:
"""Pre-defined agent archetype."""
name: str
role: str
default_tools: list[str] # tool names from TOOL_CATALOG
tags: list[str] = field(default_factory=list)
# Pre-defined templates matching the original 6 agents + hypothesis agent.
ROLE_TEMPLATES: dict[str, RoleTemplate] = {
"filesystem": RoleTemplate(
name="filesystem",
role=(
"File system forensic analyst. You examine disk image partition layouts, "
"directory structures, file metadata, and recover deleted files. "
"You identify suspicious files, installed programs, and user data locations. "
"You also handle Recycle Bin forensics and Prefetch execution evidence."
),
default_tools=[
"partition_info", "filesystem_info", "list_directory",
"extract_file", "find_file", "search_strings",
"parse_prefetch", "count_deleted_files",
"read_text_file", "search_text_file", "read_binary_preview",
],
tags=["filesystem", "disk", "files", "deleted", "prefetch"],
),
"registry": RoleTemplate(
name="registry",
role=(
"Windows registry forensic analyst. You parse registry hive files "
"(SYSTEM, SOFTWARE, SAM, NTUSER.DAT) to extract system configuration, "
"user accounts, installed software, network settings, email accounts, "
"and other Windows artifacts."
),
default_tools=[
"extract_file", "list_directory",
"parse_registry_key", "list_installed_software",
"get_user_activity", "search_registry",
"get_system_info", "get_timezone_info", "get_computer_name",
"get_shutdown_time", "enumerate_users",
"get_network_interfaces", "get_email_config",
],
tags=["registry", "windows", "system", "user", "software"],
),
"communication": RoleTemplate(
name="communication",
role=(
"Communication forensic analyst. You analyze email files (.dbx, .pst), "
"IRC/mIRC chat logs, newsgroup data, and other messaging artifacts "
"to identify communication patterns and contacts."
),
default_tools=[
"list_directory", "extract_file",
"read_text_file", "read_binary_preview",
"list_extracted_dir", "search_strings",
"search_text_file", "read_text_file_section",
],
tags=["email", "chat", "irc", "messaging", "communication"],
),
"network": RoleTemplate(
name="network",
role=(
"Network forensic analyst. You analyze browser history, cookies, "
"network captures (PCAP), wireless artifacts, and other network-related "
"evidence to reconstruct online activities."
),
default_tools=[
"list_directory", "extract_file",
"read_text_file", "read_binary_preview",
"list_extracted_dir", "search_strings",
"search_text_file", "read_text_file_section",
"parse_pcap_strings",
],
tags=["network", "browser", "pcap", "http", "internet"],
),
"timeline": RoleTemplate(
name="timeline",
role=(
"Timeline correlation analyst. You build chronological timelines "
"by combining filesystem MAC times with evidence from other agents. "
"You identify temporal patterns and correlate events across categories."
),
default_tools=[
"build_filesystem_timeline",
],
tags=["timeline", "correlation", "temporal"],
),
"report": RoleTemplate(
name="report",
role=(
"Forensic report writer. You synthesize all evidence and hypotheses "
"into a comprehensive forensic analysis report with executive summary, "
"detailed findings organized by hypothesis, timeline of events, and conclusions."
),
default_tools=[], # Report agent uses only graph query tools
tags=["report", "summary", "writing"],
),
"hypothesis": RoleTemplate(
name="hypothesis",
role=(
"Hypothesis analyst. You review all phenomena discovered so far "
"and formulate investigative hypotheses about what happened on the system. "
"For each hypothesis, identify which existing phenomena support or contradict it."
),
default_tools=[], # Uses only graph query + hypothesis tools
tags=["hypothesis", "analysis", "reasoning"],
),
}
class AgentFactory:
"""Creates agents from templates or dynamically via LLM composition."""
"""Creates agents from registered classes or dynamically via LLM composition."""
def __init__(self, llm: LLMClient, graph: EvidenceGraph) -> None:
self.llm = llm
@@ -152,40 +52,20 @@ class AgentFactory:
self._cache: dict[str, BaseAgent] = {}
def get_or_create_agent(self, agent_type: str) -> BaseAgent | None:
"""Get a cached agent or create one from a template."""
"""Get a cached agent or instantiate one from its registered class."""
if agent_type in self._cache:
return self._cache[agent_type]
template = ROLE_TEMPLATES.get(agent_type)
if template is None:
logger.warning("No template for agent type: %s", agent_type)
return None
# Use custom agent class if one exists, otherwise BaseAgent
_load_agent_classes()
agent_cls = _AGENT_CLASSES.get(agent_type)
if agent_cls is not None:
agent = agent_cls(self.llm, self.graph)
else:
agent = self._instantiate_from_template(template)
if agent_cls is None:
logger.warning("No agent class for type: %s", agent_type)
return None
agent = agent_cls(self.llm, self.graph)
self._cache[agent_type] = agent
return agent
def _instantiate_from_template(self, template: RoleTemplate) -> BaseAgent:
"""Create a BaseAgent from a role template, registering tools from the catalog."""
agent = BaseAgent(self.llm, self.graph)
agent.name = template.name
agent.role = template.role
for tool_name in template.default_tools:
td = TOOL_CATALOG.get(tool_name)
if td is None:
logger.warning("Tool '%s' not in catalog (template: %s)", tool_name, template.name)
continue
agent.register_tool(td.name, td.description, td.input_schema, td.executor)
return agent
async def create_specialized_agent(
self,
hypothesis_title: str,
@@ -220,18 +100,15 @@ class AgentFactory:
messages=[{"role": "user", "content": prompt}],
)
# Parse response — try to extract JSON
try:
config = json.loads(response)
except json.JSONDecodeError:
# Try to find JSON in the response
import re
match = re.search(r'\{.*\}', response, re.DOTALL)
if match:
config = json.loads(match.group())
else:
logger.error("Failed to parse agent composition response: %s", response[:300])
# Fallback: create a generic agent with all tools
return self._create_fallback_agent(capability_gap)
agent_name = config.get("agent_name", "specialized")
@@ -239,13 +116,11 @@ class AgentFactory:
strategy = config.get("strategy", "")
tool_names = config.get("tools", [])
# Validate tool names against catalog
valid_tools = [t for t in tool_names if t in TOOL_CATALOG]
if not valid_tools:
logger.warning("No valid tools selected by LLM, using fallback")
return self._create_fallback_agent(capability_gap)
# Build agent
agent = BaseAgent(self.llm, self.graph)
agent.name = agent_name
agent.role = f"{role_text}\n\nInvestigation Strategy:\n{strategy}"

View File

@@ -1,12 +1,17 @@
"""Hypothesis Agent — analyzes phenomena and generates investigative hypotheses."""
"""Hypothesis Agent — generates investigative hypotheses from phenomena.
Generates hypotheses only. Phenomenon→Hypothesis linking is handled centrally
by Orchestrator._judge_new_phenomena. Tool set is restricted to read-only
graph queries + add_hypothesis to prevent the agent from creating phenomena,
leads, or entity links.
"""
from __future__ import annotations
import json
import logging
from base_agent import BaseAgent
from evidence_graph import EvidenceGraph, HYPOTHESIS_EDGE_WEIGHTS
from evidence_graph import EvidenceGraph
from llm_client import LLMClient
logger = logging.getLogger(__name__)
@@ -17,19 +22,19 @@ class HypothesisAgent(BaseAgent):
role = (
"Hypothesis analyst. You review all phenomena discovered so far "
"and formulate investigative hypotheses about what happened on this system. "
"Your ultimate goal: build the most complete picture of events that occurred. "
"For each hypothesis, identify which existing phenomena support or contradict it."
"Your ultimate goal: build the most complete picture of events that occurred."
)
mandatory_record_tools = ("add_hypothesis",)
def __init__(self, llm: LLMClient, graph: EvidenceGraph) -> None:
super().__init__(llm, graph)
self._register_hypothesis_tools()
def _register_graph_tools(self) -> None:
"""Restrict to read-only graph tools. add_hypothesis is registered separately."""
self._register_graph_read_tools()
def _register_hypothesis_tools(self) -> None:
"""Register hypothesis-specific tools."""
valid_edge_types = list(HYPOTHESIS_EDGE_WEIGHTS.keys())
self.register_tool(
name="add_hypothesis",
description=(
@@ -53,42 +58,30 @@ class HypothesisAgent(BaseAgent):
executor=self._add_hypothesis,
)
self.register_tool(
name="link_phenomenon_to_hypothesis",
description=(
"Link an existing phenomenon to a hypothesis with a relationship type. "
f"Valid relationship types: {', '.join(valid_edge_types)}. "
"direct_evidence = the phenomenon IS the hypothesis. "
"supports = consistent with the hypothesis. "
"prerequisite_met = a necessary condition is satisfied. "
"consequence_observed = an expected result of the hypothesis is found. "
"contradicts = directly contradicts the hypothesis. "
"weakens = makes the hypothesis less likely."
),
input_schema={
"type": "object",
"properties": {
"phenomenon_id": {
"type": "string",
"description": "ID of the phenomenon (e.g. 'ph-a1b2c3d4').",
},
"hypothesis_id": {
"type": "string",
"description": "ID of the hypothesis (e.g. 'hyp-e5f6g7h8').",
},
"edge_type": {
"type": "string",
"enum": valid_edge_types,
"description": "The edge_type of the relationship.",
},
"reason": {
"type": "string",
"description": "The reason this relationship holds (1-2 sentences).",
},
},
"required": ["phenomenon_id", "hypothesis_id", "edge_type", "reason"],
},
executor=self._link_phenomenon_to_hypothesis,
def _build_system_prompt(self, task: str) -> str:
"""Focused prompt — no INVESTIGATE/RECORD/LINK workflow."""
return (
f"You are {self.name}, a forensic hypothesis analyst.\n"
f"Role: {self.role}\n\n"
f"Image: {self.graph.image_path}\n"
f"Current investigation state: {self.graph.stats_summary()}\n\n"
f"Your task: {task}\n\n"
f"WORKFLOW:\n"
f"1. Call list_phenomena and search_graph to review existing findings.\n"
f"2. For each hypothesis you want to record, call add_hypothesis (title + description).\n"
f"3. STOP after you have generated 3-7 hypotheses. Do not call any more tools.\n\n"
f"STRICT BOUNDARIES:\n"
f"- Your only mutation tool is add_hypothesis. Do NOT attempt list_directory, "
f"parse_registry_key, extract_file, or any disk-image investigation tools — "
f"they are not yours and you will get 'unknown tool' errors.\n"
f"- You CANNOT create phenomena, leads, or entity links. The orchestrator handles "
f"all phenomenon↔hypothesis linking after you finish.\n"
f"- Each hypothesis must be specific and testable. Avoid generic templates like "
f"'Unauthorized Remote Access' or 'Malware Deployment' unless concrete phenomena "
f"in the graph already point to them.\n"
f"- If the graph is empty, generate broad starting hypotheses and mark them "
f"clearly as exploratory in their description so downstream agents know they "
f"still need evidence."
)
async def _add_hypothesis(self, title: str, description: str) -> str:
@@ -98,33 +91,3 @@ class HypothesisAgent(BaseAgent):
created_by=self.name,
)
return f"Hypothesis created: {hid}{title} (confidence: 0.50)"
async def _link_phenomenon_to_hypothesis(
self,
phenomenon_id: str,
hypothesis_id: str,
edge_type: str = "",
reason: str = "",
# Common LLM misnaming — accept as fallbacks
relationship: str = "",
note: str = "",
) -> str:
edge_type = edge_type or relationship
reason = reason or note
if not edge_type:
return "Error: edge_type is required."
try:
new_conf = await self.graph.update_hypothesis_confidence(
hyp_id=hypothesis_id,
phenomenon_id=phenomenon_id,
edge_type=edge_type,
reason=reason,
)
weight = HYPOTHESIS_EDGE_WEIGHTS[edge_type]
direction = "+" if weight > 0 else ""
return (
f"Linked: {phenomenon_id} —[{edge_type}]→ {hypothesis_id} "
f"(weight: {direction}{weight}, new confidence: {new_conf:.3f})"
)
except ValueError as e:
return f"Error linking: {e}"

View File

@@ -2,9 +2,6 @@
from __future__ import annotations
import json
import os
from base_agent import BaseAgent
from evidence_graph import EvidenceGraph
from llm_client import LLMClient
@@ -15,34 +12,46 @@ class ReportAgent(BaseAgent):
role = (
"Forensic report writer. You synthesize all findings from the investigation "
"into a structured, professional forensic analysis report organized by hypotheses.\n\n"
"IMPORTANT: Only include findings that have a source_tool attribution (marked VERIFIED). "
"Only include findings that have a source_tool attribution (marked VERIFIED). "
"If evidence lacks source attribution, mark it as UNVERIFIED. "
"Do NOT invent or fabricate any data, timestamps, or findings not present in the evidence.\n\n"
"CRITICAL: You MUST call save_report to write the final report."
"Do NOT invent or fabricate any data, timestamps, or findings not present in the evidence."
)
# Calling save_report is BOTH the recording action and the completion
# signal. tool_call_loop returns the moment save_report executes; the
# tool's return value becomes the agent's final_text. The forced-retry
# mechanism fires if save_report is never called.
mandatory_record_tools = ("save_report",)
terminal_tools = ("save_report",)
def __init__(self, llm: LLMClient, graph: EvidenceGraph) -> None:
super().__init__(llm, graph)
self._register_tools()
def _register_graph_tools(self) -> None:
"""Restrict to read-only graph tools. Report agent does not mutate state."""
self._register_graph_read_tools()
def _build_system_prompt(self, task: str) -> str:
"""Report agent gets a clean prompt — no Phase A/B/C/D workflow."""
return (
f"You are a forensic report writer.\n"
f"Role: {self.role}\n\n"
f"Investigation state:\n{self.graph.stats_summary()}\n\n"
f"Your task: {task}\n\n"
f"WORKFLOW:\n"
f"1. Call get_hypotheses_with_evidence to get all hypotheses and their linked evidence\n"
f"2. Call get_all_phenomena to get detailed findings by category\n"
f"3. Call get_entities to get people, programs, and hosts\n"
f"4. Call get_case_info for case metadata\n"
f"5. Write the complete report directly in your <answer> block\n\n"
f"1. Call get_hypotheses_with_evidence, get_all_phenomena, get_entities, get_case_info "
f" to gather all the data needed for the report. Make these calls in parallel.\n"
f"2. Assemble the complete markdown forensic report.\n"
f"3. Call save_report(content=<full markdown>, output_path=\"report.md\").\n"
f" This single call is the completion signal — the run ENDS the moment it executes.\n"
f" Do NOT call any read tools after this point; they will not run.\n"
f" Do NOT write the report as free text outside of save_report; only the\n"
f" `content` argument of save_report is persisted.\n\n"
f"RULES:\n"
f"- Write the report DIRECTLY in <answer> — do NOT use save_report tool\n"
f"- Only include findings present in the evidence graph\n"
f"- Do NOT invent timestamps, file paths, or data not in the phenomena\n"
f"- The report must be complete — do not cut off mid-section\n"
f"- The report must be the complete markdown — do not cut off mid-section.\n"
f"- Only include findings present in the evidence graph.\n"
f"- Do NOT invent timestamps, file paths, or data not in the phenomena.\n"
f"- The `content` argument can be 10K+ chars. JSON-escape inner quotes (\\\") and\n"
f" backslashes (\\\\) and newlines (\\n) correctly.\n"
)
def _register_tools(self) -> None:
@@ -182,10 +191,16 @@ class ReportAgent(BaseAgent):
return "\n".join(lines)
async def _save_report(self, content: str, output_path: str) -> str:
try:
os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
with open(output_path, "w") as f:
f.write(content)
return f"Report saved to {output_path} ({len(content)} chars)"
except Exception as e:
return f"Error saving report: {e}"
"""Save the report and return the content itself.
The content is returned (rather than a "saved to ..." status string)
so that when tool_call_loop short-circuits on this terminal tool,
`final_text` is the full markdown — orchestrator writes it to the
canonical report.md path under runs/<ts>/.
The output_path argument is kept for backward compat but the model's
chosen path is ignored — the orchestrator owns the persistence path.
"""
if not content:
return ""
return content

View File

@@ -1,14 +1,21 @@
"""Timeline Agent — correlates evidence across time."""
"""Timeline Agent — connects existing phenomena with temporal edges.
Operates on phenomena already in the graph. Does NOT investigate the disk
image itself. The agent's only useful output is the temporal edges it
creates between phenomena.
"""
from __future__ import annotations
import json
import logging
from base_agent import BaseAgent
from evidence_graph import EvidenceGraph
from llm_client import LLMClient
from tool_registry import TOOL_CATALOG
logger = logging.getLogger(__name__)
class TimelineAgent(BaseAgent):
name = "timeline"
@@ -17,29 +24,39 @@ class TimelineAgent(BaseAgent):
"MAC timestamps and correlate events across all phenomena categories in the "
"evidence graph to reconstruct the sequence of activities on the system."
)
mandatory_record_tools = ("add_temporal_edge",)
def __init__(self, llm: LLMClient, graph: EvidenceGraph) -> None:
super().__init__(llm, graph)
self._register_tools()
def _register_graph_tools(self) -> None:
"""Restrict to read-only graph tools — Timeline does not add phenomena."""
self._register_graph_read_tools()
def _register_tools(self) -> None:
# Filesystem timeline tool from catalog
td = TOOL_CATALOG.get("build_filesystem_timeline")
if td:
self.register_tool(td.name, td.description, td.input_schema, td.executor)
# Custom tool to get all phenomena with timestamps for correlation
self.register_tool(
name="get_timestamped_phenomena",
description="Get all phenomena that have timestamps, sorted chronologically. Use for timeline correlation.",
description=(
"Get all phenomena that have timestamps, sorted chronologically. "
"Returns each phenomenon's id, category, title, and a short description "
"preview. Use this as your primary input for temporal correlation."
),
input_schema={"type": "object", "properties": {}},
executor=self._get_timestamped_phenomena,
)
# Tool to add temporal edges between phenomena
self.register_tool(
name="add_temporal_edge",
description="Add a temporal relationship between two phenomena (before, after, or concurrent).",
description=(
"Add a temporal relationship edge between two existing phenomena. "
"Use 'before' when source phenomenon happened before target, "
"'concurrent' when they occurred within seconds of each other."
),
input_schema={
"type": "object",
"properties": {
@@ -56,6 +73,42 @@ class TimelineAgent(BaseAgent):
executor=self._add_temporal_edge,
)
def _build_system_prompt(self, task: str) -> str:
"""Focused prompt — Timeline connects existing phenomena, doesn't investigate."""
return (
f"You are {self.name}, a forensic timeline correlation analyst.\n"
f"Role: {self.role}\n\n"
f"Image: {self.graph.image_path}\n"
f"Current state: {self.graph.stats_summary()}\n\n"
f"Your task: {task}\n\n"
f"WORKFLOW:\n"
f"1. Call build_filesystem_timeline once to materialize MAC times for the disk.\n"
f"2. Call get_timestamped_phenomena to see all phenomena with timestamps, "
f"sorted chronologically. THIS IS YOUR PRIMARY INPUT.\n"
f"3. For each meaningful temporal relationship between phenomena, call "
f"add_temporal_edge(source_id, target_id, relation). Use 'before' when "
f"source happened first (the common case); 'concurrent' for events within "
f"a few seconds of each other.\n"
f" Examples of meaningful connections:\n"
f" - 'Cain installer executed' (before) 'Cain.exe first execution'\n"
f" - 'WHOIS first lookup' (before) 'WHOIS second lookup'\n"
f" - 'Recon tool cluster' (before) 'Anti-forensics defrag'\n"
f" - 'Tool installation' (before) 'Tool execution'\n"
f"4. Aim for 15-40 temporal edges that connect the major events into a "
f"forensic story.\n"
f"5. STOP after recording all meaningful temporal edges. Do not call any more tools.\n\n"
f"STRICT BOUNDARIES:\n"
f"- Your job is to CONNECT existing phenomena, NOT to discover new ones. "
f"You CANNOT call add_phenomenon — the tool isn't yours.\n"
f"- Use ONLY phenomenon IDs returned by get_timestamped_phenomena or "
f"list_phenomena. NEVER fabricate IDs.\n"
f"- Connect events that tell a forensic story (recon -> exploit -> cover-up). "
f"Do not exhaustively pair every two phenomena; focus on causally-relevant "
f"sequences.\n"
f"- The orchestrator handles report writing in the next phase. Your only "
f"output that propagates is the temporal edges you create."
)
async def _get_timestamped_phenomena(self) -> str:
items = [
ph for ph in self.graph.phenomena.values()

View File

@@ -31,12 +31,28 @@ class BaseAgent:
name: str = "base"
role: str = "A forensic analysis agent."
# Tools the agent MUST invoke at least once for the run to count as productive.
# If none of these were called when tool_call_loop returns, run() fires a
# forced retry with an explicit "you forgot to record" instruction.
# Subclasses override to declare their own recording responsibility
# (timeline → add_temporal_edge, hypothesis → add_hypothesis, report → save_report).
mandatory_record_tools: tuple[str, ...] = ("add_phenomenon",)
# Tools whose invocation ends the run immediately. After any terminal tool
# is called, tool_call_loop returns with that tool's result text as
# final_text. Used by agents whose "completion" is a single explicit
# action rather than "model decides to stop calling tools". For multi-call
# agents (filesystem records many phenomena) leave empty.
terminal_tools: tuple[str, ...] = ()
def __init__(self, llm: LLMClient, graph: EvidenceGraph) -> None:
self.llm = llm
self.graph = graph
self._tools: dict[str, dict] = {} # name -> schema
self._executors: dict[str, Any] = {} # name -> async callable
self._record_call_counts: dict[str, int] = {}
self._work_log: list[str] = []
self._current_lead_id: str | None = None
def register_tool(
self,
@@ -51,7 +67,18 @@ class BaseAgent:
"description": description,
"input_schema": input_schema,
}
self._executors[name] = executor
if name in self.mandatory_record_tools:
self._executors[name] = self._wrap_record_executor(name, executor)
else:
self._executors[name] = executor
def _wrap_record_executor(self, name: str, executor: Any) -> Any:
"""Wrap a mandatory-record executor to count successful invocations."""
async def wrapped(*args, **kwargs):
result = await executor(*args, **kwargs)
self._record_call_counts[name] = self._record_call_counts.get(name, 0) + 1
return result
return wrapped
def get_tool_definitions(self) -> list[dict]:
"""Get tool definitions in Claude API format."""
@@ -90,14 +117,21 @@ class BaseAgent:
f" FIRST call list_phenomena to get the current IDs — do NOT rely on memory.\n"
f" Then call link_to_entity for each relevant phenomenon.\n"
f" NEVER guess or fabricate a phenomenon ID. If an ID is not in list_phenomena output, it does not exist.\n\n"
f"Phase D — ANSWER:\n"
f" Only give your <answer> AFTER completing Phases B and C.\n\n"
f"IMPORTANT:\n"
f"- You MUST call add_phenomenon at least once before finishing\n"
f"- Complete each phase before starting the next\n"
f"- Other agents can ONLY see what you write to the graph\n"
f"- If you don't record findings, they are LOST\n"
f"- Include relevant file paths, inode numbers, timestamps, and raw data\n\n"
f"Phase D — STOP:\n"
f" Once all phenomena are recorded and entities linked, you are DONE.\n"
f" Do not call any more tools. The orchestrator picks up automatically.\n\n"
f"CRITICAL — RECORDING REQUIREMENT:\n"
f"- Only graph mutations propagate to other agents and the final report.\n"
f"- You MUST call add_phenomenon for EVERY significant finding BEFORE you stop.\n"
f"- NEGATIVE findings count too. If you searched X (a directory, a pattern, "
f"a registry key) and found NOTHING, that absence IS evidence — call "
f"add_phenomenon with a 'No matches for X' title and the search scope in "
f"raw_data. Negative findings constrain the hypothesis space and prevent "
f"the next agent from wasting time re-searching.\n"
f"- If you stop without having called add_phenomenon at least once, the task "
f"is FAILED and a forced retry will fire.\n"
f"- Include exact file paths, inode numbers, timestamps, and the source_tool "
f"that produced each finding.\n\n"
f"ANTI-HALLUCINATION RULES — STRICTLY ENFORCED:\n"
f"- ONLY record findings that appear VERBATIM in tool results you received\n"
f"- NEVER invent or guess timestamps, file paths, inode numbers, or program names\n"
@@ -107,13 +141,15 @@ class BaseAgent:
f"- Do NOT fabricate execution timestamps — only report timestamps returned by tools"
)
async def run(self, task: str) -> str:
async def run(self, task: str, lead_id: str | None = None) -> str:
"""Run this agent with a specific task."""
_log(task, event="agent_start", agent=self.name)
self.graph.agent_status[self.name] = "running"
self.graph._current_agent = self.name
self._current_lead_id = lead_id
self._register_graph_tools()
self._record_call_counts.clear()
system = self._build_system_prompt(task)
messages = [{"role": "user", "content": task}]
@@ -122,12 +158,60 @@ class BaseAgent:
ph_before = len(self.graph.phenomena)
try:
final_text, _ = await self.llm.tool_call_loop(
final_text, conversation = await self.llm.tool_call_loop(
messages=messages,
tools=self.get_tool_definitions(),
tool_executor=self._executors,
system=system,
terminal_tools=self.terminal_tools,
)
# Forced-record retry: if the agent has any mandatory recording
# tools but never invoked any of them, force one more round with
# an explicit "you forgot to record" instruction. The mandatory
# set is declared on the class — Timeline → add_temporal_edge,
# Hypothesis → add_hypothesis, ReportAgent → (). For agents with
# empty mandatory_record_tools this branch is a no-op.
registered_mandatory = [
t for t in self.mandatory_record_tools if t in self._executors
]
recorded_any = any(
self._record_call_counts.get(t, 0) > 0
for t in registered_mandatory
)
if registered_mandatory and not recorded_any:
missing = "/".join(registered_mandatory)
logger.warning(
"[%s] finished without calling any of [%s] — forcing RECORD retry",
self.name, missing,
)
conversation.append({
"role": "user",
"content": (
f"STOP. You produced an answer without ever calling "
f"{missing}. Your answer is DISCARDED — only graph "
f"mutations propagate to other agents and the final "
f"report.\n\n"
f"You MUST now call {missing} for every significant "
f"finding from your prior investigation, including "
f"exact identifiers, timestamps, and the source_tool "
f"that produced each finding. If you genuinely found "
f"NOTHING noteworthy, call the recording tool ONCE "
f"with a 'No significant findings' style entry "
f"summarizing what you searched.\n\n"
f"Do not run more investigation tools. Just record "
f"what you already found. Then end."
),
})
final_text, _ = await self.llm.tool_call_loop(
messages=conversation,
tools=self.get_tool_definitions(),
tool_executor=self._executors,
system=system,
max_iterations=10,
terminal_tools=self.terminal_tools,
)
self._work_log.append(f"[Task: {task[:80]}] -> {final_text[:150]}")
except Exception:
self.graph.agent_status[self.name] = "failed"
@@ -143,9 +227,17 @@ class BaseAgent:
# ---- Graph interaction tools --------------------------------------------
def _register_graph_tools(self) -> None:
"""Register tools for querying and writing to the evidence graph."""
"""Register graph query + mutation tools.
# --- Read tools ---
Subclasses can override to restrict the toolset. For example, a
read-only agent (hypothesis, report) overrides this to skip
_register_graph_write_tools.
"""
self._register_graph_read_tools()
self._register_graph_write_tools()
def _register_graph_read_tools(self) -> None:
"""Register read-only graph + asset query tools."""
self.register_tool(
name="list_phenomena",
@@ -211,7 +303,49 @@ class BaseAgent:
executor=self._get_hypothesis_status,
)
# --- Write tools ---
self.register_tool(
name="list_assets",
description=(
"List all files extracted from the disk image. "
"Shows filename, category, size, local path, and inode. "
"Check this before calling extract_file to avoid re-extraction."
),
input_schema={
"type": "object",
"properties": {
"category": {
"type": "string",
"enum": [
"registry_hive", "chat_log", "prefetch", "network_capture",
"config_file", "address_book", "recycle_bin", "executable",
"text_log", "other",
],
"description": "Filter by category. Omit to list all.",
},
},
},
executor=self._list_assets,
)
self.register_tool(
name="find_extracted_file",
description=(
"Find an already-extracted file by inode or filename. "
"Returns the local path so you can use it directly with "
"parse_registry_key, read_text_file, etc. without re-extracting."
),
input_schema={
"type": "object",
"properties": {
"inode": {"type": "string", "description": "Inode to look up."},
"filename": {"type": "string", "description": "Filename or partial name to search."},
},
},
executor=self._find_extracted_file,
)
def _register_graph_write_tools(self) -> None:
"""Register graph mutation tools (add_phenomenon, add_lead, link_to_entity)."""
self.register_tool(
name="add_phenomenon",
@@ -280,49 +414,6 @@ class BaseAgent:
executor=self._link_to_entity,
)
# --- Asset library tools ---
self.register_tool(
name="list_assets",
description=(
"List all files extracted from the disk image. "
"Shows filename, category, size, local path, and inode. "
"Check this before calling extract_file to avoid re-extraction."
),
input_schema={
"type": "object",
"properties": {
"category": {
"type": "string",
"enum": [
"registry_hive", "chat_log", "prefetch", "network_capture",
"config_file", "address_book", "recycle_bin", "executable",
"text_log", "other",
],
"description": "Filter by category. Omit to list all.",
},
},
},
executor=self._list_assets,
)
self.register_tool(
name="find_extracted_file",
description=(
"Find an already-extracted file by inode or filename. "
"Returns the local path so you can use it directly with "
"parse_registry_key, read_text_file, etc. without re-extracting."
),
input_schema={
"type": "object",
"properties": {
"inode": {"type": "string", "description": "Inode to look up."},
"filename": {"type": "string", "description": "Filename or partial name to search."},
},
},
executor=self._find_extracted_file,
)
# ---- Tool executors -----------------------------------------------------
async def _list_phenomena(self, category: str | None = None) -> str:
@@ -375,6 +466,7 @@ class BaseAgent:
raw_data=raw_data,
timestamp=timestamp,
source_tool=source_tool,
from_lead_id=self._current_lead_id,
)
if merged:
return f"Phenomenon merged into existing: {pid}{title} (corroboration boost)"

View File

@@ -18,10 +18,12 @@ from pathlib import Path
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Predefined edge weights for Phenomenon → Hypothesis relationships.
# Default edge weights for Phenomenon → Hypothesis relationships.
# LLM only picks the edge type (categorical); the weight is looked up here.
# Override per-graph via EvidenceGraph(edge_weights=...) or config.yaml's
# `hypothesis_edge_weights` section.
# ---------------------------------------------------------------------------
HYPOTHESIS_EDGE_WEIGHTS: dict[str, float] = {
_DEFAULT_EDGE_WEIGHTS: dict[str, float] = {
"direct_evidence": +0.25,
"supports": +0.15,
"prerequisite_met": +0.10,
@@ -94,6 +96,7 @@ class Phenomenon:
confidence: float = 1.0
source_tool: str = ""
corroborating_agents: list[str] = field(default_factory=list)
from_lead_id: str | None = None
created_at: str = ""
def to_dict(self) -> dict:
@@ -194,6 +197,41 @@ class Lead:
return cls(**d)
@dataclass
class InvestigationArea:
"""An area to investigate to confirm/refute one or more hypotheses.
Derived by the orchestrator from active hypotheses after Phase 2; also
seeded from config.yaml:investigation_areas as an optional manual
override. Each area carries its own keywords + expected tools so the
gap-analysis coverage check is generic, not tied to hard-coded constants.
"""
id: str # "area-{slug}"
area: str # snake_case slug (dedupe key)
description: str
suggested_agent: str # filesystem / registry / communication / network / timeline
expected_keywords: list[str] = field(default_factory=list)
expected_tools: list[str] = field(default_factory=list)
priority: int = 5 # 1 (highest) - 10 (lowest)
motivating_hypothesis_ids: list[str] = field(default_factory=list)
created_by: str = "" # "manual" | "llm_derive" | "fallback"
created_at: str = ""
def to_dict(self) -> dict:
return asdict(self)
@classmethod
def from_dict(cls, d: dict) -> InvestigationArea:
return cls(**d)
def summary(self) -> str:
return (
f"[{self.area}] P{self.priority} agent={self.suggested_agent} "
f"(motivating: {len(self.motivating_hypothesis_ids)})"
)
@dataclass
class ExtractedAsset:
"""A file extracted from the disk image and tracked in the asset library."""
@@ -239,8 +277,12 @@ class EvidenceGraph:
self,
case_info: dict | None = None,
persist_path: Path | None = None,
edge_weights: dict[str, float] | None = None,
) -> None:
self.case_info: dict = case_info or {}
self.edge_weights: dict[str, float] = (
dict(edge_weights) if edge_weights else dict(_DEFAULT_EDGE_WEIGHTS)
)
self.image_path: str = ""
self.partition_offset: int = 0
self.extracted_dir: str = "extracted"
@@ -263,6 +305,11 @@ class EvidenceGraph:
self.asset_library: dict[str, ExtractedAsset] = {}
self._inode_index: dict[str, str] = {} # inode → asset_id
# Investigation areas — derived from hypotheses (LLM) and/or seeded
# from config.yaml:investigation_areas (manual override). Drives the
# gap-analysis coverage check.
self.investigation_areas: dict[str, InvestigationArea] = {}
# Set by BaseAgent.run() before each agent execution
self._current_agent: str = ""
@@ -288,6 +335,9 @@ class EvidenceGraph:
"leads": [l.to_dict() for l in self.leads],
"agent_status": dict(self.agent_status),
"asset_library": {aid: a.to_dict() for aid, a in self.asset_library.items()},
"investigation_areas": {
aid: a.to_dict() for aid, a in self.investigation_areas.items()
},
"saved_at": datetime.now().isoformat(),
}
tmp = self._persist_path.with_suffix(".tmp")
@@ -304,12 +354,17 @@ class EvidenceGraph:
self._persist_path = old
@classmethod
def load_state(cls, path: Path) -> EvidenceGraph:
def load_state(
cls,
path: Path,
edge_weights: dict[str, float] | None = None,
) -> EvidenceGraph:
"""Restore an EvidenceGraph from a saved JSON state file."""
data = json.loads(path.read_text())
graph = cls(
case_info=data.get("case_info", {}),
persist_path=path,
edge_weights=edge_weights,
)
graph.image_path = data.get("image_path", "")
graph.partition_offset = data.get("partition_offset", 0)
@@ -333,6 +388,10 @@ class EvidenceGraph:
asset = ExtractedAsset.from_dict(a_data)
graph.asset_library[aid] = asset
graph._inode_index[asset.inode] = aid
graph.investigation_areas = {
aid: InvestigationArea.from_dict(a)
for aid, a in data.get("investigation_areas", {}).items()
}
graph._rebuild_adjacency()
logger.info(
"EvidenceGraph restored: %d phenomena, %d hypotheses, %d entities, "
@@ -403,6 +462,7 @@ class EvidenceGraph:
raw_data: dict | None = None,
timestamp: str | None = None,
source_tool: str = "",
from_lead_id: str | None = None,
) -> tuple[str, bool]:
"""Add a phenomenon. Returns (id, was_merged).
@@ -419,6 +479,8 @@ class EvidenceGraph:
for k, v in raw_data.items():
if k not in similar.raw_data:
similar.raw_data[k] = v
if from_lead_id and similar.from_lead_id is None:
similar.from_lead_id = from_lead_id
self._auto_save()
return similar.id, True
@@ -437,6 +499,7 @@ class EvidenceGraph:
timestamp=timestamp,
confidence=confidence,
source_tool=source_tool,
from_lead_id=from_lead_id,
created_at=datetime.now().isoformat(),
)
self.phenomena[pid] = ph
@@ -532,14 +595,14 @@ class EvidenceGraph:
) -> float:
"""Update hypothesis confidence based on a phenomenon linkage.
The edge_type must be one of HYPOTHESIS_EDGE_WEIGHTS keys.
Weight is looked up from the predefined table, NOT judged by LLM.
The edge_type must be one of self.edge_weights keys.
Weight is looked up from the configured table, NOT judged by LLM.
Returns the new confidence value.
"""
if edge_type not in HYPOTHESIS_EDGE_WEIGHTS:
if edge_type not in self.edge_weights:
raise ValueError(
f"Invalid hypothesis edge type: {edge_type}. "
f"Must be one of: {list(HYPOTHESIS_EDGE_WEIGHTS.keys())}"
f"Must be one of: {list(self.edge_weights.keys())}"
)
async with self._lock:
@@ -549,7 +612,7 @@ class EvidenceGraph:
if hyp is None:
raise ValueError(f"Hypothesis not found: {hyp_id}")
weight = HYPOTHESIS_EDGE_WEIGHTS[edge_type]
weight = self.edge_weights[edge_type]
old_conf = hyp.confidence
if weight > 0:
@@ -640,6 +703,57 @@ class EvidenceGraph:
break
self._auto_save()
# ---- Investigation areas -------------------------------------------------
async def add_investigation_area(
self,
area: str,
description: str,
suggested_agent: str,
expected_keywords: list[str] | None = None,
expected_tools: list[str] | None = None,
priority: int = 5,
motivating_hypothesis_ids: list[str] | None = None,
created_by: str = "",
) -> tuple[str, bool]:
"""Add or merge an investigation area. Dedupe key is the `area` slug.
On collision, union the three list fields (keywords / tools /
motivating_hypothesis_ids); description / suggested_agent / priority
are preserved from the first writer (manual seed wins over LLM derive).
Returns (id, was_existing).
"""
async with self._lock:
for existing in self.investigation_areas.values():
if existing.area == area:
for kw in (expected_keywords or []):
if kw not in existing.expected_keywords:
existing.expected_keywords.append(kw)
for t in (expected_tools or []):
if t not in existing.expected_tools:
existing.expected_tools.append(t)
for hid in (motivating_hypothesis_ids or []):
if hid not in existing.motivating_hypothesis_ids:
existing.motivating_hypothesis_ids.append(hid)
self._auto_save()
return existing.id, True
aid = f"area-{area}"
self.investigation_areas[aid] = InvestigationArea(
id=aid,
area=area,
description=description,
suggested_agent=suggested_agent,
expected_keywords=list(expected_keywords or []),
expected_tools=list(expected_tools or []),
priority=priority,
motivating_hypothesis_ids=list(motivating_hypothesis_ids or []),
created_by=created_by,
created_at=datetime.now().isoformat(),
)
self._auto_save()
return aid, False
# ---- Asset library -------------------------------------------------------
async def register_asset(

View File

@@ -1,8 +1,10 @@
"""Custom LLM client using httpx for Claude Messages API via third-party proxy.
"""LLM client via the OpenAI SDK (works with DeepSeek's OpenAI-compatible API).
The proxy does not support Claude's native tool_use format (it strips the `tools`
field from requests). So we embed tool definitions in the system prompt and parse
structured JSON tool calls from the model's text output (ReAct-style).
Tool calling uses the OpenAI-native `tools=[...]` parameter. The model
returns structured tool_calls via the streaming protocol; we accumulate
them, dispatch to our executors, and feed results back as `role: "tool"`
messages. This eliminates the fragile "model writes JSON inside free
text" problem of the previous ReAct text mode.
"""
from __future__ import annotations
@@ -18,6 +20,7 @@ from dataclasses import dataclass, field
from typing import Any
import httpx
from openai import APIConnectionError, APIError, APITimeoutError, AsyncOpenAI
logger = logging.getLogger(__name__)
@@ -30,69 +33,81 @@ class LLMAPIError(Exception):
self.attempts = attempts
# Markers the model uses to signal tool calls and final answers
TOOL_CALL_TAG = "<tool_call>"
TOOL_CALL_END = "</tool_call>"
TOOL_RESULT_TAG = "<tool_result>"
TOOL_RESULT_END = "</tool_result>"
# Optional answer tags — kept for backward compat with prompts that wrap
# their final response in <answer>...</answer>. Native tool calling does
# not need these (no tool_calls = final), but if the model continues to
# emit them, we strip the tags so callers see clean text.
ANSWER_TAG = "<answer>"
ANSWER_END = "</answer>"
def _build_tools_prompt(tools: list[dict]) -> str:
"""Format tool definitions for inclusion in the system prompt."""
lines = ["You have access to the following tools:\n"]
for t in tools:
schema = t.get("input_schema", {})
props = schema.get("properties", {})
required = schema.get("required", [])
def _to_openai_tools(tools: list[dict]) -> list[dict]:
"""Convert internal tool definitions to OpenAI native function-tools format."""
return [
{
"type": "function",
"function": {
"name": t["name"],
"description": t["description"],
"parameters": t.get("input_schema", {"type": "object", "properties": {}}),
},
}
for t in tools
]
params = []
for pname, pdef in props.items():
req = " (required)" if pname in required else ""
desc = pdef.get("description", "")
ptype = pdef.get("type", "string")
enum_vals = pdef.get("enum")
if enum_vals:
allowed = ", ".join(f'"{v}"' for v in enum_vals)
params.append(f" - {pname}: {ptype}{req}{desc} Allowed values: [{allowed}]")
else:
params.append(f" - {pname}: {ptype}{req}{desc}")
param_block = "\n".join(params) if params else " (no parameters)"
lines.append(f"## {t['name']}\n{t['description']}\nParameters:\n{param_block}\n")
def _extract_first_balanced(text: str, open_char: str, close_char: str) -> str | None:
"""Return the first balanced [...] or {...} substring, or None if no balanced pair.
lines.append(
"## How to use tools\n"
"To call a tool, output a JSON block wrapped in XML tags like this:\n"
f"{TOOL_CALL_TAG}\n"
'{"name": "tool_name", "arguments": {"param1": "value1"}}\n'
f"{TOOL_CALL_END}\n\n"
"You can call multiple tools in sequence. After each tool call, you will receive the result in:\n"
f"{TOOL_RESULT_TAG}\n...result...\n{TOOL_RESULT_END}\n\n"
"When you have finished your analysis and have a final answer, wrap it in:\n"
f"{ANSWER_TAG}\nyour final answer here\n{ANSWER_END}\n\n"
"Think step by step. Call tools to gather evidence before drawing conclusions.\n"
"You MUST call at least one tool before giving your final answer."
Stack-based — handles nested brackets correctly (regex with .*? would
truncate at the first inner closing bracket, regex with .* would over-eat
trailing text). Brackets inside JSON string literals are ignored by
callers because the caller passes the result through json.loads which
re-parses with proper string handling.
"""
start = text.find(open_char)
if start < 0:
return None
depth = 0
for i in range(start, len(text)):
c = text[i]
if c == open_char:
depth += 1
elif c == close_char:
depth -= 1
if depth == 0:
return text[start:i + 1]
return None
def _safe_json_loads(text: str):
"""Parse JSON with progressive sanitization for LLM-produced output.
Tries (0) as-is, (1) escape stray backslashes outside valid JSON escapes
(\\" \\\\ \\/ \\b \\f \\n \\r \\t \\uXXXX). On final failure, logs raw
input (first 600 chars) so we can diagnose what the model emitted.
Used by orchestrator JSON callsites (_call_llm_for_json) and by
tool_call_loop when parsing tool_call arguments returned by the API.
"""
try:
return json.loads(text)
except json.JSONDecodeError:
pass
stage1 = re.sub(
r'\\(?!["\\/bfnrt]|u[0-9a-fA-F]{4})',
r'\\\\',
text,
)
return "\n".join(lines)
def _extract_tool_calls(text: str) -> list[dict]:
"""Extract tool call JSON blocks from model output."""
pattern = re.compile(
re.escape(TOOL_CALL_TAG) + r"\s*(.*?)\s*" + re.escape(TOOL_CALL_END),
re.DOTALL,
)
calls = []
for match in pattern.finditer(text):
raw = match.group(1).strip()
try:
parsed = json.loads(raw)
calls.append(parsed)
except json.JSONDecodeError:
logger.warning("Failed to parse tool call JSON: %s", raw[:200])
return calls
try:
return json.loads(stage1)
except json.JSONDecodeError as e:
logger.warning(
"_safe_json_loads failed after sanitize (%s); raw head[:600]=%r",
e, text[:600],
)
raise
def _extract_answer(text: str) -> str | None:
@@ -234,50 +249,41 @@ _DECAY_TIERS: list[tuple[int, int]] = [
def _apply_progressive_decay(messages: list[dict]) -> list[dict]:
"""Truncate tool results in older messages to save context space.
"""Truncate the `content` of older `role: "tool"` messages to save context.
Operates in-place-style on a copy. Only touches user messages that
contain <tool_result> blocks (these are the tool-result messages
generated by tool_call_loop).
Each `role: "tool"` message in the conversation corresponds to one tool
call's result. We rank these messages by recency and progressively
truncate older ones according to `_DECAY_TIERS`.
"""
# Count rounds from the end. A "round" is a (assistant, user) pair.
# messages alternate: [user, assistant, user, assistant, user, ...]
# The initial user message is index 0, then pairs start at index 1.
total = len(messages)
if total <= 10: # not enough messages to bother
if total <= 10:
return messages
result = []
# Count tool-result user messages from the end
tool_result_indices = [
i for i, m in enumerate(messages)
if m["role"] == "user" and TOOL_RESULT_TAG in m.get("content", "")
tool_msg_indices = [
i for i, m in enumerate(messages) if m.get("role") == "tool"
]
# Build a set of indices that need decay, mapped to their max_chars
decay_map: dict[int, int] = {}
n_tool_msgs = len(tool_result_indices)
for rank, idx in enumerate(reversed(tool_result_indices)):
rounds_ago = rank # 0 = most recent, 1 = second most recent, ...
for rank, idx in enumerate(reversed(tool_msg_indices)):
rounds_ago = rank
for threshold, max_chars in _DECAY_TIERS:
if rounds_ago < threshold:
decay_map[idx] = max_chars
break
result = []
for i, msg in enumerate(messages):
if i in decay_map:
max_chars = decay_map[i]
content = msg["content"]
content = msg.get("content", "") or ""
if len(content) > max_chars + 200:
# Truncate but preserve the tool_result tags structure
truncated = content[:max_chars]
# Count how many tool results are in this message
n_results = content.count(TOOL_RESULT_TAG)
truncated += (
f"\n... [context compressed: {len(content)} -> {max_chars} chars, "
f"{n_results} tool result(s)]"
truncated = (
content[:max_chars]
+ f"\n... [context compressed: {len(content)} -> {max_chars} chars]"
)
result.append({"role": msg["role"], "content": truncated})
new_msg = dict(msg)
new_msg["content"] = truncated
result.append(new_msg)
else:
result.append(msg)
else:
@@ -301,44 +307,51 @@ _FOLD_SUMMARY_SYSTEM = (
class LLMClient:
"""Calls Claude Messages API through a third-party proxy using raw httpx.
"""Async LLM client via the OpenAI SDK.
Uses prompt-based tool calling (ReAct pattern) since the proxy does not
support Claude's native tool_use format.
Works with any OpenAI-compatible endpoint (OpenAI, DeepSeek, ...).
Tool calling is text-based (ReAct) — see module docstring.
"""
def __init__(
self,
base_url: str,
api_key: str,
model: str = "claude-sonnet-4-6",
model: str = "deepseek-v4-pro",
max_tokens: int = 4096,
proxy: str | None = "auto",
reasoning_effort: str | None = None,
thinking_enabled: bool = False,
) -> None:
self.base_url = base_url.rstrip("/")
self.api_key = api_key
self.model = model
self.max_tokens = max_tokens
# proxy="auto": read from env; proxy=None/""/"none": no proxy; proxy="http://...": use it
self.reasoning_effort = reasoning_effort
self.thinking_enabled = thinking_enabled
# proxy="auto": read from env; proxy=None/""/"none": no proxy
if proxy == "auto":
proxy_url = os.environ.get("https_proxy") or os.environ.get("HTTPS_PROXY")
elif proxy and proxy.lower() != "none":
proxy_url = proxy
else:
proxy_url = None
self._client = httpx.AsyncClient(
http_client = (
httpx.AsyncClient(proxy=proxy_url, timeout=300.0)
if proxy_url else None
)
self._client = AsyncOpenAI(
api_key=self.api_key,
base_url=self.base_url,
headers={
"x-api-key": self.api_key,
"anthropic-version": "2023-06-01",
"content-type": "application/json",
},
timeout=300.0,
proxy=proxy_url,
http_client=http_client,
)
async def close(self) -> None:
await self._client.aclose()
await self._client.close()
async def chat(
self,
@@ -346,81 +359,144 @@ class LLMClient:
system: str | None = None,
max_retries: int = 5,
) -> str:
"""Send a streaming chat request and return the assembled text response.
"""Send a streaming chat completion and return the assembled text."""
full_messages: list[dict] = []
if system:
full_messages.append({"role": "system", "content": system})
full_messages.extend(messages)
Uses SSE streaming to keep the connection alive and avoid gateway
timeouts (504/524) on long-running completions.
"""
import asyncio as _asyncio
payload: dict[str, Any] = {
kwargs: dict[str, Any] = {
"model": self.model,
"messages": full_messages,
"max_tokens": self.max_tokens,
"messages": messages,
"stream": True,
}
if system:
payload["system"] = system
if self.reasoning_effort:
kwargs["reasoning_effort"] = self.reasoning_effort
if self.thinking_enabled:
kwargs["extra_body"] = {"thinking": {"type": "enabled"}}
for attempt in range(max_retries):
logger.debug("LLM request (stream): %d messages (attempt %d)", len(messages), attempt + 1)
logger.debug(
"LLM request (stream): %d messages (attempt %d)",
len(messages), attempt + 1,
)
text_parts: list[str] = []
try:
async with self._client.stream(
"POST", "/v1/messages", json=payload,
) as resp:
# Check for HTTP errors before consuming stream
if resp.status_code >= 400:
body = await resp.aread()
raise httpx.HTTPStatusError(
f"Server error '{resp.status_code}' for url '{resp.url}'",
request=resp.request,
response=resp,
)
# Parse SSE events
async for line in resp.aiter_lines():
if not line.startswith("data: "):
continue
data_str = line[6:] # strip "data: " prefix
if data_str.strip() == "[DONE]":
break
try:
event = json.loads(data_str)
except json.JSONDecodeError:
continue
event_type = event.get("type", "")
if event_type == "content_block_delta":
delta = event.get("delta", {})
if delta.get("type") == "text_delta":
text_parts.append(delta["text"])
elif event_type == "message_stop":
break
elif event_type == "error":
err_msg = event.get("error", {}).get("message", "Unknown streaming error")
raise httpx.HTTPStatusError(
err_msg, request=resp.request, response=resp,
)
stream = await self._client.chat.completions.create(**kwargs)
async for chunk in stream:
if not chunk.choices:
continue
delta = chunk.choices[0].delta
if delta.content:
text_parts.append(delta.content)
text = "".join(text_parts)
logger.debug("LLM response (stream): %d chars", len(text))
return text
except (httpx.HTTPStatusError, httpx.ConnectError, httpx.ReadTimeout, httpx.RemoteProtocolError) as e:
except (APIConnectionError, APITimeoutError, APIError) as e:
if attempt < max_retries - 1:
wait = 2 ** attempt * 10
logger.warning("Request failed (%s), retrying in %ds...", e, wait)
await _asyncio.sleep(wait)
await asyncio.sleep(wait)
else:
raise LLMAPIError(
f"LLM API unreachable after {max_retries} attempts: {e}",
attempts=max_retries,
) from e
# Should not reach here, but just in case
return ""
async def _chat_with_tools(
self,
messages: list[dict],
openai_tools: list[dict],
max_retries: int = 5,
) -> tuple[str, str | None, list[dict]]:
"""Stream a chat completion with native tool calling enabled.
Returns:
(text_content, reasoning_content, raw_tool_calls).
- reasoning_content is non-None when DeepSeek thinking mode is
active; the caller MUST echo it back in the assistant message
on subsequent requests, or the API returns HTTP 400.
- raw_tool_calls is a list of {"id","name","arguments"} dicts;
arguments is the raw JSON string returned by the API.
"""
kwargs: dict[str, Any] = {
"model": self.model,
"messages": messages,
"max_tokens": self.max_tokens,
"stream": True,
"tools": openai_tools,
}
if self.reasoning_effort:
kwargs["reasoning_effort"] = self.reasoning_effort
if self.thinking_enabled:
kwargs["extra_body"] = {"thinking": {"type": "enabled"}}
for attempt in range(max_retries):
logger.debug(
"LLM request (stream+tools): %d messages, %d tools (attempt %d)",
len(messages), len(openai_tools), attempt + 1,
)
text_parts: list[str] = []
reasoning_parts: list[str] = []
tool_calls_acc: dict[int, dict] = {} # index -> {id, name, arguments}
try:
stream = await self._client.chat.completions.create(**kwargs)
async for chunk in stream:
if not chunk.choices:
continue
delta = chunk.choices[0].delta
if delta.content:
text_parts.append(delta.content)
# DeepSeek thinking-mode: reasoning_content is returned
# alongside content and MUST be echoed back on subsequent
# requests, otherwise the API rejects with HTTP 400.
rc = getattr(delta, "reasoning_content", None)
if rc:
reasoning_parts.append(rc)
if delta.tool_calls:
for tc_delta in delta.tool_calls:
idx = tc_delta.index
entry = tool_calls_acc.setdefault(
idx, {"id": None, "name": None, "arguments": ""},
)
if tc_delta.id:
entry["id"] = tc_delta.id
fn = tc_delta.function
if fn:
if fn.name:
entry["name"] = fn.name
if fn.arguments:
entry["arguments"] += fn.arguments
text = "".join(text_parts)
reasoning = "".join(reasoning_parts) or None
ordered = [tool_calls_acc[i] for i in sorted(tool_calls_acc)]
logger.debug(
"LLM response (stream+tools): %d chars, %d reasoning chars, %d tool calls",
len(text), len(reasoning or ""), len(ordered),
)
return text, reasoning, ordered
except (APIConnectionError, APITimeoutError, APIError) as e:
if attempt < max_retries - 1:
wait = 2 ** attempt * 10
logger.warning(
"Tool-call request failed (%s), retrying in %ds...", e, wait,
)
await asyncio.sleep(wait)
else:
raise LLMAPIError(
f"LLM API unreachable after {max_retries} attempts: {e}",
attempts=max_retries,
) from e
return "", None, []
async def tool_call_loop(
self,
messages: list[dict],
@@ -428,87 +504,159 @@ class LLMClient:
tool_executor: dict[str, Any],
system: str | None = None,
max_iterations: int = 40,
terminal_tools: tuple[str, ...] = (),
) -> tuple[str, list[dict]]:
"""Run a ReAct-style tool-calling loop.
"""Run a tool-calling loop using OpenAI-native tool calls.
The model outputs <tool_call> blocks which we parse and execute,
feeding results back as <tool_result> blocks until the model
outputs an <answer> block.
The model returns structured `tool_calls` in its message; we
dispatch them through our executor dict and feed each result back
as a `role: "tool"` message with the matching `tool_call_id`. The
loop ends when:
- the model returns a message with no tool_calls (normal exit), or
- any tool in `terminal_tools` is called — in that case, the loop
short-circuits with that tool's result text as final_text. This
gives agents (notably ReportAgent) an explicit completion signal
that the old `<answer>` text tag used to provide.
Returns:
(final_text, all_messages)
(final_text, full_message_history)
"""
# Build system prompt with tool definitions
tools_prompt = _build_tools_prompt(tools)
full_system = f"{system}\n\n{tools_prompt}" if system else tools_prompt
terminal_set = set(terminal_tools)
openai_tools = _to_openai_tools(tools)
messages = list(messages) # don't mutate caller's list
_folded = False # Track whether we've already folded once this loop
# The caller may pass `messages` either as raw conversation (no system)
# together with `system=...`, OR as a complete history that already
# starts with the system message (retry path). Accept both shapes.
if messages and messages[0].get("role") == "system":
full_messages: list[dict] = list(messages)
else:
full_messages = []
if system:
full_messages.append({"role": "system", "content": system})
full_messages.extend(messages)
_folded = False
for i in range(max_iterations):
for _i in range(max_iterations):
# ── Context compression before each API call ──────────────
# Stage A: progressively decay old tool results
messages = _apply_progressive_decay(messages)
# Stage B: fold oldest messages into LLM summary if too long
if not _folded and len(messages) > _FOLD_THRESHOLD:
messages = await self._fold_old_messages(messages, full_system)
full_messages = _apply_progressive_decay(full_messages)
if not _folded and len(full_messages) > _FOLD_THRESHOLD:
full_messages = await self._fold_old_messages(full_messages)
_folded = True
elif _folded and len(messages) > _FOLD_THRESHOLD + _FOLD_KEEP_RECENT:
# Allow a second fold if messages grew back significantly
messages = await self._fold_old_messages(messages, full_system)
elif _folded and len(full_messages) > _FOLD_THRESHOLD + _FOLD_KEEP_RECENT:
full_messages = await self._fold_old_messages(full_messages)
text = await self.chat(messages, system=full_system)
text, reasoning, raw_tool_calls = await self._chat_with_tools(
full_messages, openai_tools,
)
# Check for final answer
answer = _extract_answer(text)
if answer is not None:
messages.append({"role": "assistant", "content": text})
return answer, messages
if not raw_tool_calls:
# Model produced a final response. Strip optional <answer>
# tags for backward compatibility with old prompts.
final_msg: dict[str, Any] = {"role": "assistant", "content": text}
if reasoning:
final_msg["reasoning_content"] = reasoning
full_messages.append(final_msg)
answer = _extract_answer(text)
return (answer if answer is not None else text), full_messages
# Check for tool calls
tool_calls = _extract_tool_calls(text)
# Parse arguments + build internal call dicts
parsed_calls: list[dict] = []
for rc in raw_tool_calls:
args_str = rc.get("arguments", "") or ""
try:
args = _safe_json_loads(args_str) if args_str.strip() else {}
except (json.JSONDecodeError, ValueError) as e:
logger.warning(
"Failed to parse arguments for tool %s: %s",
rc.get("name"), e,
)
args = {}
parsed_calls.append({
"id": rc.get("id"),
"name": rc.get("name", ""),
"arguments": args,
})
if not tool_calls:
# No tool calls and no answer tag — treat entire text as answer
messages.append({"role": "assistant", "content": text})
return text, messages
# Append the assistant turn with the raw tool_calls (and the
# DeepSeek-mandated reasoning_content echo-back), then execute.
asst_msg: dict[str, Any] = {
"role": "assistant",
"content": text or None,
"tool_calls": [
{
"id": rc.get("id"),
"type": "function",
"function": {
"name": rc.get("name", ""),
"arguments": rc.get("arguments", "") or "",
},
}
for rc in raw_tool_calls
],
}
if reasoning:
asst_msg["reasoning_content"] = reasoning
full_messages.append(asst_msg)
# Execute tool calls — read-only tools run in parallel
messages.append({"role": "assistant", "content": text})
result_parts = []
batches = _partition_tool_calls(tool_calls)
batches = _partition_tool_calls(parsed_calls)
t_batch_start = time.monotonic()
# Each entry: (tool_call_dict, raw_result, formatted_for_llm)
executed: list[tuple[dict, str, str]] = []
for batch in batches:
if batch.is_read_only and len(batch.calls) > 1:
batch_results = await self._execute_tool_batch_parallel(
results = await self._execute_tool_batch_parallel(
batch.calls, tool_executor, tools,
)
result_parts.extend(batch_results)
for tc, (raw, formatted) in zip(batch.calls, results):
executed.append((tc, raw, formatted))
else:
for tc in batch.calls:
result_parts.append(
await self._execute_single_tool(tc, tool_executor, tools)
raw, formatted = await self._execute_single_tool(
tc, tool_executor, tools,
)
executed.append((tc, raw, formatted))
# Emit folded tool-call summary for the terminal
t_batch_elapsed = time.monotonic() - t_batch_start
_emit_tool_call_summary(tool_calls, t_batch_elapsed)
_emit_tool_call_summary(parsed_calls, t_batch_elapsed)
# Feed results back as a user message
result_message = "\n\n".join(result_parts)
messages.append({"role": "user", "content": result_message})
# Append formatted tool results to the conversation (this is
# what the LLM sees on subsequent rounds — truncated for context
# economy).
for tc, _raw, formatted in executed:
full_messages.append({
"role": "tool",
"tool_call_id": tc["id"],
"content": formatted,
})
# Terminal-tool short-circuit: if the model called any tool in
# `terminal_tools`, end the loop immediately. The terminal tool's
# RAW result (untruncated) becomes final_text — the LLM may have
# produced a 20K-char report via save_report and we must not
# truncate it just because the LLM-facing copy is truncated.
if terminal_set:
for tc, raw, _formatted in executed:
name = tc.get("name", "")
if name in terminal_set:
logger.info(
"Terminal tool %s called — exiting tool_call_loop", name,
)
return raw, full_messages
logger.warning("Tool call loop hit max iterations (%d)", max_iterations)
return "[Max tool call iterations reached]", messages
return "[Max tool call iterations reached]", full_messages
async def _execute_single_tool(
self, tc: dict, tool_executor: dict[str, Any],
tools: list[dict] | None = None,
) -> str:
"""Execute a single tool call and return the formatted result."""
) -> tuple[str, str]:
"""Execute a single tool call.
Returns (raw_result, formatted_for_llm). `raw_result` is the
unmodified executor return (used by terminal-tool short-circuit as
final_text). `formatted_for_llm` is `[tool_name] {truncated}` and
is what gets fed back to the model as the tool message content.
"""
tool_name = tc.get("name", "")
tool_args = tc.get("arguments", {})
@@ -519,72 +667,106 @@ class LLMClient:
executor = tool_executor.get(tool_name)
if executor is None:
result_text = f"Error: unknown tool '{tool_name}'"
raw = f"Error: unknown tool '{tool_name}'"
else:
try:
result_text = await executor(**tool_args)
raw = await executor(**tool_args)
except Exception as e:
logger.error("Tool %s failed: %s", tool_name, e)
result_text = f"Error executing {tool_name}: {e}"
raw = f"Error executing {tool_name}: {e}"
return (
f"{TOOL_RESULT_TAG}\n"
f"[{tool_name}] {_truncate_tool_result(result_text)}\n"
f"{TOOL_RESULT_END}"
)
formatted = f"[{tool_name}] {_truncate_tool_result(raw)}"
return raw, formatted
async def _execute_tool_batch_parallel(
self, calls: list[dict], tool_executor: dict[str, Any],
tools: list[dict] | None = None,
) -> list[str]:
"""Execute multiple read-only tool calls concurrently."""
) -> list[tuple[str, str]]:
"""Execute multiple read-only tool calls concurrently.
Returns a list of (raw_result, formatted_for_llm) tuples in the
same order as `calls`.
"""
logger.info("Executing %d read-only tools in parallel", len(calls))
async def _run_one(tc: dict) -> str:
async def _run_one(tc: dict) -> tuple[str, str]:
tool_name = tc.get("name", "")
tool_args = tc.get("arguments", {})
if tools:
tool_args = _fix_tool_args(tool_name, tool_args, tools)
logger.info("Calling tool (parallel): %s(%s)", tool_name, json.dumps(tool_args, ensure_ascii=False))
logger.info(
"Calling tool (parallel): %s(%s)",
tool_name, json.dumps(tool_args, ensure_ascii=False),
)
executor = tool_executor.get(tool_name)
if executor is None:
result_text = f"Error: unknown tool '{tool_name}'"
raw = f"Error: unknown tool '{tool_name}'"
else:
try:
result_text = await executor(**tool_args)
raw = await executor(**tool_args)
except Exception as e:
logger.error("Tool %s failed: %s", tool_name, e)
result_text = f"Error executing {tool_name}: {e}"
return (
f"{TOOL_RESULT_TAG}\n"
f"[{tool_name}] {_truncate_tool_result(result_text)}\n"
f"{TOOL_RESULT_END}"
)
raw = f"Error executing {tool_name}: {e}"
formatted = f"[{tool_name}] {_truncate_tool_result(raw)}"
return raw, formatted
results = await asyncio.gather(*[_run_one(tc) for tc in calls])
return list(results)
async def _fold_old_messages(
self, messages: list[dict], system: str,
self, messages: list[dict],
) -> list[dict]:
"""Fold old messages into an LLM-generated summary (Stage B).
Keeps the most recent _FOLD_KEEP_RECENT messages intact and
replaces earlier ones with a single summary message.
Preserves the leading system message (if any), keeps the most
recent _FOLD_KEEP_RECENT messages intact, and replaces the older
middle slice with a single summary user message.
"""
n_to_fold = len(messages) - _FOLD_KEEP_RECENT
# Pin the system message — it must NEVER be summarized away.
system_msgs: list[dict] = []
body = messages
if messages and messages[0].get("role") == "system":
system_msgs = [messages[0]]
body = messages[1:]
n_to_fold = len(body) - _FOLD_KEEP_RECENT
if n_to_fold <= 2:
return messages
old_messages = messages[:n_to_fold]
recent_messages = messages[n_to_fold:]
# Pull the fold boundary forward so we never split an assistant turn
# from its matching tool results. The API rejects (HTTP 400) any
# `role: "tool"` message that does not immediately follow an
# `assistant` message with `tool_calls`. We walk the boundary into
# `recent_messages` while its head is a `role: "tool"` message, or
# while the prior `recent` message is `assistant{tool_calls}` whose
# paired tools span the boundary.
while n_to_fold < len(body):
head = body[n_to_fold]
if head.get("role") == "tool":
n_to_fold += 1
continue
break
if n_to_fold >= len(body):
# Everything got folded — nothing recent to keep.
return system_msgs + [body[0]] if system_msgs else messages
old_messages = body[:n_to_fold]
recent_messages = body[n_to_fold:]
# Build a text dump of old messages for summarization
old_text_parts = []
for msg in old_messages:
role = msg["role"]
content = msg.get("content", "")
# Truncate each message for the summary prompt to avoid overload
role = msg.get("role", "?")
content = msg.get("content") or ""
# Render tool_calls (assistant turn) compactly.
if role == "assistant" and msg.get("tool_calls"):
tc_names = [
tc.get("function", {}).get("name", "?")
for tc in msg["tool_calls"]
]
content = (content + " " if content else "") + (
"called: " + ", ".join(tc_names)
)
if len(content) > 1000:
content = content[:1000] + "..."
old_text_parts.append(f"[{role}]: {content}")
@@ -608,7 +790,6 @@ class LLMClient:
logger.warning("Context folding failed: %s — keeping original messages", e)
return messages
# Replace old messages with a single summary
summary_message = {
"role": "user",
"content": (
@@ -616,4 +797,4 @@ class LLMClient:
f"messages in this conversation]\n\n{summary}"
),
}
return [summary_message] + recent_messages
return system_msgs + [summary_message] + recent_messages

View File

@@ -219,6 +219,8 @@ async def async_main() -> None:
model=agent_cfg["model"],
max_tokens=agent_cfg.get("max_tokens", 4096),
proxy=agent_cfg.get("proxy", "auto"),
reasoning_effort=agent_cfg.get("reasoning_effort"),
thinking_enabled=agent_cfg.get("thinking_enabled", False),
)
# Initialize evidence graph
@@ -229,6 +231,7 @@ async def async_main() -> None:
graph = EvidenceGraph(
case_info=config.get("cfreds_hacking_case", {}),
persist_path=run_dir / "graph_state.json",
edge_weights=config.get("hypothesis_edge_weights"),
)
graph.image_path = image_path
graph.partition_offset = partition_offset

View File

@@ -11,8 +11,9 @@ from datetime import datetime
from pathlib import Path
from agent_factory import AgentFactory
from evidence_graph import EvidenceGraph, HYPOTHESIS_EDGE_WEIGHTS
from llm_client import LLMClient
from evidence_graph import EvidenceGraph
from llm_client import LLMClient, _extract_first_balanced, _safe_json_loads
from tool_registry import TOOL_CATALOG
logger = logging.getLogger(__name__)
@@ -93,6 +94,14 @@ class Orchestrator:
"Omit phenomena that are unrelated. Be conservative — only link genuinely relevant evidence."
)
_AREA_DERIVE_SYSTEM = (
"You are a forensic investigation strategist. Given a set of hypotheses, "
"decompose them into a minimal aggregate set of investigation areas. An "
"area is a focused, concrete question with the keywords and tool names an "
"answering phenomenon would mention. Aggregate aggressively — when two "
"hypotheses share an area, emit it once and list both hypothesis_ids."
)
def __init__(
self,
llm: LLMClient,
@@ -149,7 +158,8 @@ class Orchestrator:
await agent.run(
f"Investigate this lead: {lead.description}\n"
f"{hyp_line}"
f"Focus area: {lead.target_agent}"
f"Focus area: {lead.target_agent}",
lead_id=lead.id,
)
await self.graph.mark_lead_completed(lead.id)
self._failure_count = 0
@@ -209,14 +219,176 @@ class Orchestrator:
"1. Specific and testable\n"
"2. About a distinct aspect of activity (e.g., hacking tools, communication, "
"network attacks, data theft)\n\n"
"For each hypothesis:\n"
"- Call add_hypothesis to create it\n"
"- Then call link_phenomenon_to_hypothesis to link relevant existing phenomena\n"
"- Choose the relationship type carefully: direct_evidence, supports, "
"prerequisite_met, consequence_observed, contradicts, or weakens\n\n"
"Call add_hypothesis for each. The orchestrator will automatically link "
"relevant existing phenomena to each hypothesis after you finish — you do "
"not need to (and cannot) create those links yourself.\n\n"
"The ultimate goal is to reconstruct a detailed timeline of what happened on this host."
)
# ---- Investigation areas (manual seed + LLM derive) ----------------------
_VALID_AGENT_TYPES = {"filesystem", "registry", "communication", "network", "timeline"}
async def _seed_manual_investigation_areas(self) -> None:
"""Import config.yaml:investigation_areas entries (manual override).
Run early in Phase 2 so manual entries are in the graph before LLM
derivation; LLM derive then augments via slug-based dedupe.
"""
for entry in self.config.get("investigation_areas", []):
area = entry.get("area")
if not area:
continue
await self.graph.add_investigation_area(
area=area,
description=entry.get("description", entry.get("task", "")),
suggested_agent=entry.get("agent", "filesystem"),
expected_keywords=entry.get("keywords", []),
expected_tools=entry.get("tools", []),
priority=entry.get("priority", 3),
created_by="manual",
)
async def _derive_investigation_areas(self) -> None:
"""Ask LLM to derive investigation areas from active hypotheses.
Manual-seeded or already-populated graph (resume) → no-op. On LLM
failure or empty output, falls back to one-area-per-hypothesis.
"""
if self.graph.investigation_areas:
return
active = [h for h in self.graph.hypotheses.values() if h.status == "active"]
if not active:
return
available_tools = sorted(TOOL_CATALOG.keys())
hyp_lines = "\n".join(
f" [{h.id}] {h.title}: {h.description}" for h in active
)
prompt = (
f"Active hypotheses:\n{hyp_lines}\n\n"
f"Available agents: {sorted(self._VALID_AGENT_TYPES)}\n"
f"Available tool names (pick 1-3 per area for expected_tools): {available_tools}\n\n"
f"Emit 5-12 distinct investigation areas covering the FULL hypothesis set.\n"
f"Each area must include:\n"
f" - area: snake_case slug (dedupe key)\n"
f" - description: one sentence on what to find\n"
f" - suggested_agent: one of the agents above\n"
f" - expected_keywords: 3-8 lowercase tokens that an answering phenomenon would mention\n"
f" - expected_tools: 1-3 tool names from the list above\n"
f" - priority: 1 (highest) to 10\n"
f" - motivating_hypothesis_ids: at least one [hyp-xxx] from above\n\n"
f"Aggregate aggressively — when two hypotheses share an area, emit it ONCE "
f"and list both ids in motivating_hypothesis_ids.\n\n"
f"Respond ONLY with JSON:\n"
f'[{{"area":"...","description":"...","suggested_agent":"...",'
f'"expected_keywords":[...],"expected_tools":[...],"priority":1-10,'
f'"motivating_hypothesis_ids":["hyp-xxx"]}}]'
)
try:
items = await self._call_llm_for_json(
system=self._AREA_DERIVE_SYSTEM,
user_prompt=prompt,
schema="array",
)
except Exception as e:
logger.warning("Area derivation LLM failed: %s — falling back", e)
await self._derive_areas_fallback(active)
return
valid_hyp_ids = set(self.graph.hypotheses.keys())
for it in items:
area = it.get("area", "").strip()
if not area:
continue
agent = it.get("suggested_agent", "filesystem")
if agent not in self._VALID_AGENT_TYPES:
agent = AGENT_ALIASES.get(agent, "filesystem")
tools = [t for t in it.get("expected_tools", []) if t in TOOL_CATALOG]
motivating = [
h for h in it.get("motivating_hypothesis_ids", [])
if h in valid_hyp_ids
]
priority = max(1, min(10, int(it.get("priority", 5))))
await self.graph.add_investigation_area(
area=area,
description=it.get("description", ""),
suggested_agent=agent,
expected_keywords=[
str(kw).lower() for kw in it.get("expected_keywords", [])
],
expected_tools=tools,
priority=priority,
motivating_hypothesis_ids=motivating,
created_by="llm_derive",
)
if not self.graph.investigation_areas:
await self._derive_areas_fallback(active)
async def _derive_areas_fallback(self, active: list) -> None:
"""One area per active hypothesis as a minimal safety net."""
for h in active:
slug = re.sub(r"[^a-z0-9_]+", "_", h.title.lower())[:40].strip("_")
if not slug:
slug = h.id.replace("-", "_")
await self.graph.add_investigation_area(
area=slug,
description=h.title,
suggested_agent="filesystem",
expected_keywords=h.title.lower().split()[:6],
expected_tools=[],
priority=5,
motivating_hypothesis_ids=[h.id],
created_by="fallback",
)
# ---- LLM JSON helper -----------------------------------------------------
async def _call_llm_for_json(
self,
system: str,
user_prompt: str,
schema: str = "array",
max_retries: int = 2,
):
"""Call LLM expecting JSON output; self-correct on parse failure.
Two layers of safety:
1. _safe_json_loads escapes stray backslashes outside valid JSON escapes.
2. On any remaining parse error, append the error to the prompt and ask
the LLM to retry, up to max_retries additional attempts.
"""
error_hint = ""
last_err: Exception | None = None
open_c, close_c = ('[', ']') if schema == "array" else ('{', '}')
for attempt in range(max_retries + 1):
messages = [{"role": "user", "content": user_prompt + error_hint}]
response = await self.llm.chat(messages=messages, system=system)
candidate = _extract_first_balanced(response, open_c, close_c) or response
try:
return _safe_json_loads(candidate)
except (json.JSONDecodeError, ValueError) as e:
last_err = e
if attempt < max_retries:
logger.info(
"JSON parse attempt %d/%d failed (%s); retrying with hint",
attempt + 1, max_retries + 1, e,
)
error_hint = (
f"\n\n[Your previous response could not be parsed as JSON: {e}]\n"
f"Output STRICT JSON only — no markdown fences, no code blocks. "
f"Do NOT include literal backslash characters in any string value; "
f"rephrase using forward slashes or describe paths in English "
f"(e.g. 'the Cain folder under Program Files' instead of "
f"'C:\\Program Files\\Cain'). "
f"The response must be a single JSON {schema}."
)
assert last_err is not None
raise last_err
# ---- Hypothesis-directed investigation -----------------------------------
async def _generate_hypothesis_leads(self) -> None:
@@ -233,10 +405,19 @@ class Orchestrator:
existing = "\n".join(
f" - {r['node']} [{r['edge_type']}]" for r in related
) or " (none yet)"
related_areas = [
a.area for a in self.graph.investigation_areas.values()
if hyp.id in a.motivating_hypothesis_ids
]
expected_line = (
f" Expected areas to investigate: {', '.join(related_areas)}\n"
if related_areas else ""
)
hyp_blocks.append(
f"Hypothesis [{hyp.id}]: {hyp.title}\n"
f" Description: {hyp.description}\n"
f" Current confidence: {hyp.confidence:.2f}\n"
f"{expected_line}"
f" Existing evidence:\n{existing}"
)
@@ -253,15 +434,11 @@ class Orchestrator:
)
try:
response = await self.llm.chat(
messages=[{"role": "user", "content": prompt}],
tasks = await self._call_llm_for_json(
system=self._LEAD_GEN_SYSTEM,
user_prompt=prompt,
schema="array",
)
match = re.search(r'\[.*?\]', response, re.DOTALL)
if match:
tasks = json.loads(match.group())
else:
tasks = json.loads(response)
for task in tasks:
hyp_id = task.get("hypothesis_id", "")
@@ -287,12 +464,21 @@ class Orchestrator:
existing_evidence = "\n".join(
f" - {r['node']} [{r['edge_type']}]" for r in related
) or " (none yet)"
related_areas = [
a.area for a in self.graph.investigation_areas.values()
if hyp.id in a.motivating_hypothesis_ids
]
expected_line = (
f"Expected areas to investigate: {', '.join(related_areas)}\n\n"
if related_areas else ""
)
prompt = (
f"Hypothesis: {hyp.title}\n"
f"Description: {hyp.description}\n"
f"Current confidence: {hyp.confidence:.2f}\n\n"
f"Existing evidence linked to this hypothesis:\n{existing_evidence}\n\n"
f"{expected_line}"
f"What additional evidence should we look for to CONFIRM or DENY this hypothesis?\n"
f"List 1-3 specific, actionable investigation tasks.\n"
f"For each, specify which agent type should handle it: "
@@ -301,12 +487,11 @@ class Orchestrator:
f'[{{"agent": "agent_type", "task": "what to investigate", "priority": 1-10}}]'
)
try:
response = await self.llm.chat(
messages=[{"role": "user", "content": prompt}],
tasks = await self._call_llm_for_json(
system=self._LEAD_GEN_SYSTEM,
user_prompt=prompt,
schema="array",
)
match = re.search(r'\[.*?\]', response, re.DOTALL)
tasks = json.loads(match.group()) if match else json.loads(response)
for task in tasks:
await self.graph.add_lead(
target_agent=task.get("agent", "filesystem"),
@@ -333,7 +518,7 @@ class Orchestrator:
if not unlinked:
return
valid_types = list(HYPOTHESIS_EDGE_WEIGHTS.keys())
valid_types = list(self.graph.edge_weights.keys())
hyp_section = "\n".join(
f" [{h.id}] {h.title}: {h.description}" for h in active
@@ -352,15 +537,11 @@ class Orchestrator:
)
try:
response = await self.llm.chat(
messages=[{"role": "user", "content": prompt}],
judgments = await self._call_llm_for_json(
system=self._JUDGE_SYSTEM,
user_prompt=prompt,
schema="array",
)
match = re.search(r'\[.*?\]', response, re.DOTALL)
if match:
judgments = json.loads(match.group())
else:
judgments = json.loads(response)
for j in judgments:
hyp_id = j.get("hypothesis_id", "")
@@ -370,7 +551,7 @@ class Orchestrator:
if (
hyp_id in self.graph.hypotheses
and ph_id in self.graph.phenomena
and edge_type in HYPOTHESIS_EDGE_WEIGHTS
and edge_type in self.graph.edge_weights
):
await self.graph.update_hypothesis_confidence(
hyp_id=hyp_id,
@@ -403,17 +584,16 @@ class Orchestrator:
f'[{{"phenomenon_id": "ph-xxx", "edge_type": "supports|contradicts|...", "reason": "brief explanation"}}]'
)
try:
response = await self.llm.chat(
messages=[{"role": "user", "content": prompt}],
judgments = await self._call_llm_for_json(
system=self._JUDGE_SYSTEM,
user_prompt=prompt,
schema="array",
)
match = re.search(r'\[.*?\]', response, re.DOTALL)
judgments = json.loads(match.group()) if match else json.loads(response)
for j in judgments:
ph_id = j.get("phenomenon_id", "")
edge_type = j.get("edge_type", "")
reason = j.get("reason", "")
if ph_id in self.graph.phenomena and edge_type in HYPOTHESIS_EDGE_WEIGHTS:
if ph_id in self.graph.phenomena and edge_type in self.graph.edge_weights:
await self.graph.update_hypothesis_confidence(
hyp_id=hyp.id,
phenomenon_id=ph_id,
@@ -429,74 +609,57 @@ class Orchestrator:
# ---- Gap analysis (coverage check) ---------------------------------------
_AREA_KEYWORDS: dict[str, list[str]] = {
"system_info": ["install date", "registered owner", "product name", "windows xp", "system information"],
"user_accounts": ["user account", "enumerate", "sam hive", "administrator", "mr. evil"],
"shutdown_time": ["shutdown"],
"network_config": ["network interface", "network adapter", "ip address", "dhcp", "mac address", "network config"],
"installed_software": ["installed software", "program files", "installed program"],
"email_config": ["smtp", "pop3", "nntp", "email account", "email config"],
"chat_logs": ["irc", "mirc", "chat log", "channel"],
"network_activity": ["packet capture", "pcap", "interception", "http request", "user-agent"],
"deleted_files": ["deleted file", "recycle", "recycler"],
"execution_evidence": ["prefetch", "execution", "run count", "last execution"],
}
def _check_coverage(self) -> set[str]:
"""Return slugs of investigation_areas already covered by phenomena.
# Deterministic coverage: if the canonical tool was called, the area is covered.
_AREA_TOOLS: dict[str, list[str]] = {
"system_info": ["get_system_info"],
"user_accounts": ["enumerate_users"],
"shutdown_time": ["get_shutdown_time"],
"network_config": ["get_network_interfaces"],
"installed_software": ["list_installed_software"],
"email_config": ["get_email_config"],
"network_activity": ["parse_pcap_strings"],
"deleted_files": ["count_deleted_files"],
"execution_evidence": ["parse_prefetch"],
}
Layer A: any expected_keyword found in evidence text (category +
title + description, lowercased).
Layer B: any expected_tool present in the source_tool set of recorded
phenomena (deterministic — the canonical tool was actually called).
"""
evidence_text = " ".join(
f"{ph.category} {ph.title} {ph.description}".lower()
for ph in self.graph.phenomena.values()
)
used_tools: set[str] = {
ph.source_tool for ph in self.graph.phenomena.values() if ph.source_tool
}
def _check_coverage(self, areas: list[dict]) -> set[str]:
# Layer 1: keyword matching on category + title + description
evidence_text = ""
for ph in self.graph.phenomena.values():
evidence_text += f" {ph.category} {ph.title} {ph.description} ".lower()
# Layer 2: collect all source_tools that produced phenomena
used_tools: set[str] = {ph.source_tool for ph in self.graph.phenomena.values() if ph.source_tool}
covered = set()
for area in areas:
area_name = area["area"]
# Check keywords
keywords = self._AREA_KEYWORDS.get(area_name, [])
if any(kw in evidence_text for kw in keywords):
covered.add(area_name)
covered: set[str] = set()
for a in self.graph.investigation_areas.values():
if any(kw.lower() in evidence_text for kw in a.expected_keywords):
covered.add(a.area)
continue
# Check source_tool
area_tools = self._AREA_TOOLS.get(area_name, [])
if any(tool in used_tools for tool in area_tools):
covered.add(area_name)
if any(t in used_tools for t in a.expected_tools):
covered.add(a.area)
return covered
async def _run_gap_analysis(self) -> None:
areas = self.config.get("investigation_areas", [])
areas = list(self.graph.investigation_areas.values())
if not areas:
return
covered = self._check_coverage(areas)
uncovered = [a for a in areas if a["area"] not in covered]
covered = self._check_coverage()
uncovered = [a for a in areas if a.area not in covered]
if not uncovered:
_log(f"All {len(areas)} investigation areas covered", event="progress")
return
uncovered_names = ", ".join(a["area"] for a in uncovered)
_log(f"{len(uncovered)}/{len(areas)} areas uncovered: {uncovered_names}", event="dispatch")
for area in uncovered:
uncovered_names = ", ".join(a.area for a in uncovered)
_log(
f"{len(uncovered)}/{len(areas)} areas uncovered: {uncovered_names}",
event="dispatch",
)
for a in uncovered:
await self.graph.add_lead(
target_agent=area["agent"],
description=area["task"],
priority=3,
target_agent=a.suggested_agent,
description=a.description,
priority=a.priority,
hypothesis_id=(
a.motivating_hypothesis_ids[0]
if a.motivating_hypothesis_ids else None
),
)
for round_num in range(3):
@@ -505,6 +668,7 @@ class Orchestrator:
break
_log(f"Gap fill round {round_num}: {len(pending)} leads", event="dispatch")
await self._dispatch_leads_parallel(pending)
await self._judge_new_phenomena()
# ---- Run archiving -------------------------------------------------------
@@ -542,6 +706,15 @@ class Orchestrator:
json.dumps(leads_data, ensure_ascii=False, indent=2)
)
# Investigation areas export
areas_data = {
aid: a.to_dict()
for aid, a in self.graph.investigation_areas.items()
}
(self.run_dir / "investigation_areas.json").write_text(
json.dumps(areas_data, ensure_ascii=False, indent=2)
)
# Run metadata
end_time = datetime.now()
metadata = {
@@ -601,18 +774,32 @@ class Orchestrator:
if resume_phase <= 2:
_log("Phase 2: Hypothesis Generation", event="phase")
t0 = time.monotonic()
# Seed manual investigation areas (if any) BEFORE LLM derive,
# so manual entries win the dedupe and LLM only augments.
await self._seed_manual_investigation_areas()
manual_hypotheses = self.config.get("hypotheses", [])
if manual_hypotheses:
await self._generate_hypotheses_manual(manual_hypotheses)
if self.graph.phenomena:
await self._judge_new_phenomena()
else:
await self._generate_hypotheses_auto()
# Unified judge step — link Phase 1 phenomena to newly-created hypotheses
if self.graph.phenomena and self.graph.hypotheses:
await self._judge_new_phenomena()
# Derive investigation areas from active hypotheses.
# No-op if manual seed already populated or resume restored areas.
await self._derive_investigation_areas()
for h in self.graph.hypotheses.values():
_log(f" {h.summary()}", event="hypothesis")
for a in self.graph.investigation_areas.values():
_log(f" {a.summary()}", event="area")
_log(
f"+{len(self.graph.hypotheses)} hypotheses generated",
f"+{len(self.graph.hypotheses)} hypotheses, "
f"{len(self.graph.investigation_areas)} areas",
event="progress", elapsed=time.monotonic() - t0,
)

View File

@@ -5,6 +5,7 @@ description = "Multi-Agent System for Digital Forensics"
requires-python = ">=3.14"
dependencies = [
"httpx[socks]>=0.28.1",
"openai>=2.36.0",
"pyyaml",
"regipy>=6.2.1",
]

View File

@@ -13,8 +13,16 @@ from tool_registry import register_all_tools
async def main() -> None:
# Find the run to regenerate from
run_dir = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("runs/2026-04-02T15-11-25")
# Find the run: CLI arg, or latest run with a graph_state.json
if len(sys.argv) > 1:
run_dir = Path(sys.argv[1])
else:
states = sorted(Path("runs").glob("*/graph_state.json"), reverse=True)
if not states:
print("No runs found in runs/")
return
run_dir = states[0].parent
print(f"Using latest run: {run_dir.name}")
state_path = run_dir / "graph_state.json"
if not state_path.exists():
@@ -24,8 +32,11 @@ async def main() -> None:
config = yaml.safe_load(open("config.yaml"))
agent_cfg = config["agent"]
# Load graph
graph = EvidenceGraph.load_state(state_path)
# Load graph (edge_weights from config — applied to the loaded graph)
graph = EvidenceGraph.load_state(
state_path,
edge_weights=config.get("hypothesis_edge_weights"),
)
print(f"Loaded: {graph.stats_summary()}")
# LLM client with larger max_tokens for report
@@ -34,6 +45,8 @@ async def main() -> None:
api_key=agent_cfg["api_key"],
model=agent_cfg["model"],
max_tokens=16384,
reasoning_effort=agent_cfg.get("reasoning_effort"),
thinking_enabled=agent_cfg.get("thinking_enabled", False),
)
register_all_tools(graph.image_path, graph.partition_offset, graph)

1508
tests/test_optimizations.py Normal file

File diff suppressed because it is too large Load Diff

172
uv.lock generated
View File

@@ -2,6 +2,15 @@ version = 1
revision = 3
requires-python = ">=3.14"
[[package]]
name = "annotated-types"
version = "0.7.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/ee/67/531ea369ba64dcff5ec9c3402f9f51bf748cec26dde048a2f973a4eea7f5/annotated_types-0.7.0.tar.gz", hash = "sha256:aff07c09a53a08bc8cfccb9c85b05f1aa9a2a6f23728d790723543408344ce89", size = 16081, upload-time = "2024-05-20T21:33:25.928Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl", hash = "sha256:1f02e8b43a8fbbc3f3e0d4f0f4bfc8131bcb4eebe8849b8e5c773f3a1c582a53", size = 13643, upload-time = "2024-05-20T21:33:24.1Z" },
]
[[package]]
name = "anyio"
version = "4.13.0"
@@ -41,6 +50,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/b2/fb/08b3f4bf05da99aba8ffea52a558758def16e8516bc75ca94ff73587e7d3/construct-2.10.70-py3-none-any.whl", hash = "sha256:c80be81ef595a1a821ec69dc16099550ed22197615f4320b57cc9ce2a672cb30", size = 63020, upload-time = "2023-11-29T08:44:46.876Z" },
]
[[package]]
name = "distro"
version = "1.9.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/fc/f8/98eea607f65de6527f8a2e8885fc8015d3e6f5775df186e443e0964a11c3/distro-1.9.0.tar.gz", hash = "sha256:2fa77c6fd8940f116ee1d6b94a2f90b13b5ea8d019b98bc8bafdcabcdd9bdbed", size = 60722, upload-time = "2023-12-24T09:54:32.31Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/12/b3/231ffd4ab1fc9d679809f356cebee130ac7daa00d6d6f3206dd4fd137e9e/distro-1.9.0-py3-none-any.whl", hash = "sha256:7bffd925d65168f85027d8da9af6bddab658135b840670a223589bc0c8ef02b2", size = 20277, upload-time = "2023-12-24T09:54:30.421Z" },
]
[[package]]
name = "h11"
version = "0.16.0"
@@ -110,12 +128,48 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/cb/b1/3846dd7f199d53cb17f49cba7e651e9ce294d8497c8c150530ed11865bb8/iniconfig-2.3.0-py3-none-any.whl", hash = "sha256:f631c04d2c48c52b84d0d0549c99ff3859c98df65b3101406327ecc7d53fbf12", size = 7484, upload-time = "2025-10-18T21:55:41.639Z" },
]
[[package]]
name = "jiter"
version = "0.14.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/6e/c1/0cddc6eb17d4c53a99840953f95dd3accdc5cfc7a337b0e9b26476276be9/jiter-0.14.0.tar.gz", hash = "sha256:e8a39e66dac7153cf3f964a12aad515afa8d74938ec5cc0018adcdae5367c79e", size = 165725, upload-time = "2026-04-10T14:28:42.01Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/4f/1e/354ed92461b165bd581f9ef5150971a572c873ec3b68a916d5aa91da3cc2/jiter-0.14.0-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:6f396837fc7577871ca8c12edaf239ed9ccef3bbe39904ae9b8b63ce0a48b140", size = 315277, upload-time = "2026-04-10T14:27:18.109Z" },
{ url = "https://files.pythonhosted.org/packages/a6/95/8c7c7028aa8636ac21b7a55faef3e34215e6ed0cbf5ae58258427f621aa3/jiter-0.14.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:a4d50ea3d8ba4176f79754333bd35f1bbcd28e91adc13eb9b7ca91bc52a6cef9", size = 315923, upload-time = "2026-04-10T14:27:19.603Z" },
{ url = "https://files.pythonhosted.org/packages/47/40/e2a852a44c4a089f2681a16611b7ce113224a80fd8504c46d78491b47220/jiter-0.14.0-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ce17f8a050447d1b4153bda4fb7d26e6a9e74eb4f4a41913f30934c5075bf615", size = 344943, upload-time = "2026-04-10T14:27:21.262Z" },
{ url = "https://files.pythonhosted.org/packages/fc/1f/670f92adee1e9895eac41e8a4d623b6da68c4d46249d8b556b60b63f949e/jiter-0.14.0-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:f4f1c4b125e1652aefbc2e2c1617b60a160ab789d180e3d423c41439e5f32850", size = 369725, upload-time = "2026-04-10T14:27:22.766Z" },
{ url = "https://files.pythonhosted.org/packages/01/2f/541c9ba567d05de1c4874a0f8f8c5e3fd78e2b874266623da9a775cf46e0/jiter-0.14.0-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:be808176a6a3a14321d18c603f2d40741858a7c4fc982f83232842689fe86dd9", size = 461210, upload-time = "2026-04-10T14:27:24.315Z" },
{ url = "https://files.pythonhosted.org/packages/ce/a9/c31cbec09627e0d5de7aeaec7690dba03e090caa808fefd8133137cf45bc/jiter-0.14.0-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:26679d58ba816f88c3849306dd58cb863a90a1cf352cdd4ef67e30ccf8a77994", size = 380002, upload-time = "2026-04-10T14:27:26.155Z" },
{ url = "https://files.pythonhosted.org/packages/50/02/3c05c1666c41904a2f607475a73e7a4763d1cbde2d18229c4f85b22dc253/jiter-0.14.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:80381f5a19af8fa9aef743f080e34f6b25ebd89656475f8cf0470ec6157052aa", size = 354678, upload-time = "2026-04-10T14:27:27.701Z" },
{ url = "https://files.pythonhosted.org/packages/7d/97/e15b33545c2b13518f560d695f974b9891b311641bdcf178d63177e8801e/jiter-0.14.0-cp314-cp314-manylinux_2_31_riscv64.whl", hash = "sha256:004df5fdb8ecbd6d99f3227df18ba1a259254c4359736a2e6f036c944e02d7c5", size = 358920, upload-time = "2026-04-10T14:27:29.256Z" },
{ url = "https://files.pythonhosted.org/packages/ad/d2/8b1461def6b96ba44530df20d07ef7a1c7da22f3f9bf1727e2d611077bf1/jiter-0.14.0-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:cff5708f7ed0fa098f2b53446c6fa74c48469118e5cd7497b4f1cd569ab06928", size = 394512, upload-time = "2026-04-10T14:27:31.344Z" },
{ url = "https://files.pythonhosted.org/packages/e3/88/837566dd6ed6e452e8d3205355afd484ce44b2533edfa4ed73a298ea893e/jiter-0.14.0-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:2492e5f06c36a976d25c7cc347a60e26d5470178d44cde1b9b75e60b4e519f28", size = 521120, upload-time = "2026-04-10T14:27:33.299Z" },
{ url = "https://files.pythonhosted.org/packages/89/6b/b00b45c4d1b4c031777fe161d620b755b5b02cdade1e316dcb46e4471d63/jiter-0.14.0-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:7609cfbe3a03d37bfdbf5052012d5a879e72b83168a363deae7b3a26564d57de", size = 553668, upload-time = "2026-04-10T14:27:34.868Z" },
{ url = "https://files.pythonhosted.org/packages/ad/d8/6fe5b42011d19397433d345716eac16728ac241862a2aac9c91923c7509a/jiter-0.14.0-cp314-cp314-win32.whl", hash = "sha256:7282342d32e357543565286b6450378c3cd402eea333fc1ebe146f1fabb306fc", size = 207001, upload-time = "2026-04-10T14:27:36.455Z" },
{ url = "https://files.pythonhosted.org/packages/e5/43/5c2e08da1efad5e410f0eaaabeadd954812612c33fbbd8fd5328b489139d/jiter-0.14.0-cp314-cp314-win_amd64.whl", hash = "sha256:bd77945f38866a448e73b0b7637366afa814d4617790ecd88a18ca74377e6c02", size = 202187, upload-time = "2026-04-10T14:27:38Z" },
{ url = "https://files.pythonhosted.org/packages/aa/1f/6e39ac0b4cdfa23e606af5b245df5f9adaa76f35e0c5096790da430ca506/jiter-0.14.0-cp314-cp314-win_arm64.whl", hash = "sha256:f2d4c61da0821ee42e0cdf5489da60a6d074306313a377c2b35af464955a3611", size = 192257, upload-time = "2026-04-10T14:27:39.504Z" },
{ url = "https://files.pythonhosted.org/packages/05/57/7dbc0ffbbb5176a27e3518716608aa464aee2e2887dc938f0b900a120449/jiter-0.14.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:1bf7ff85517dd2f20a5750081d2b75083c1b269cf75afc7511bdf1f9548beb3b", size = 323441, upload-time = "2026-04-10T14:27:41.039Z" },
{ url = "https://files.pythonhosted.org/packages/83/6e/7b3314398d8983f06b557aa21b670511ec72d3b79a68ee5e4d9bff972286/jiter-0.14.0-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c8ef8791c3e78d6c6b157c6d360fbb5c715bebb8113bc6a9303c5caff012754a", size = 348109, upload-time = "2026-04-10T14:27:42.552Z" },
{ url = "https://files.pythonhosted.org/packages/ae/4f/8dc674bcd7db6dba566de73c08c763c337058baff1dbeb34567045b27cdc/jiter-0.14.0-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e74663b8b10da1fe0f4e4703fd7980d24ad17174b6bb35d8498d6e3ebce2ae6a", size = 368328, upload-time = "2026-04-10T14:27:44.574Z" },
{ url = "https://files.pythonhosted.org/packages/3b/5f/188e09a1f20906f98bbdec44ed820e19f4e8eb8aff88b9d1a5a497587ff3/jiter-0.14.0-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1aca29ba52913f78362ec9c2da62f22cdc4c3083313403f90c15460979b84d9b", size = 463301, upload-time = "2026-04-10T14:27:46.717Z" },
{ url = "https://files.pythonhosted.org/packages/ac/f0/19046ef965ed8f349e8554775bb12ff4352f443fbe12b95d31f575891256/jiter-0.14.0-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:8b39b7d87a952b79949af5fef44d2544e58c21a28da7f1bae3ef166455c61746", size = 378891, upload-time = "2026-04-10T14:27:48.32Z" },
{ url = "https://files.pythonhosted.org/packages/c4/c3/da43bd8431ee175695777ee78cf0e93eacbb47393ff493f18c45231b427d/jiter-0.14.0-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:78d918a68b26e9fab068c2b5453577ef04943ab2807b9a6275df2a812599a310", size = 360749, upload-time = "2026-04-10T14:27:49.88Z" },
{ url = "https://files.pythonhosted.org/packages/72/26/e054771be889707c6161dbdec9c23d33a9ec70945395d70f07cfea1e9a6f/jiter-0.14.0-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:b08997c35aee1201c1a5361466a8fb9162d03ae7bf6568df70b6c859f1e654a4", size = 358526, upload-time = "2026-04-10T14:27:51.504Z" },
{ url = "https://files.pythonhosted.org/packages/c3/0f/7bea65ea2a6d91f2bf989ff11a18136644392bf2b0497a1fa50934c30a9c/jiter-0.14.0-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:260bf7ca20704d58d41f669e5e9fe7fe2fa72901a6b324e79056f5d52e9c9be2", size = 393926, upload-time = "2026-04-10T14:27:53.368Z" },
{ url = "https://files.pythonhosted.org/packages/3c/a1/b1ff7d70deef61ac0b7c6c2f12d2ace950cdeecb4fdc94500a0926802857/jiter-0.14.0-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:37826e3df29e60f30a382f9294348d0238ef127f4b5d7f5f8da78b5b9e050560", size = 521052, upload-time = "2026-04-10T14:27:55.058Z" },
{ url = "https://files.pythonhosted.org/packages/0b/7b/3b0649983cbaf15eda26a414b5b1982e910c67bd6f7b1b490f3cfc76896a/jiter-0.14.0-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:645be49c46f2900937ba0eaf871ad5183c96858c0af74b6becc7f4e367e36e06", size = 553716, upload-time = "2026-04-10T14:27:57.269Z" },
{ url = "https://files.pythonhosted.org/packages/97/f8/33d78c83bd93ae0c0af05293a6660f88a1977caef39a6d72a84afab94ce0/jiter-0.14.0-cp314-cp314t-win32.whl", hash = "sha256:2f7877ed45118de283786178eceaf877110abacd04fde31efff3940ae9672674", size = 207957, upload-time = "2026-04-10T14:27:59.285Z" },
{ url = "https://files.pythonhosted.org/packages/d6/ac/2b760516c03e2227826d1f7025d89bf6bf6357a28fe75c2a2800873c50bf/jiter-0.14.0-cp314-cp314t-win_amd64.whl", hash = "sha256:14c0cb10337c49f5eafe8e7364daca5e29a020ea03580b8f8e6c597fed4e1588", size = 204690, upload-time = "2026-04-10T14:28:00.962Z" },
{ url = "https://files.pythonhosted.org/packages/dc/2e/a44c20c58aeed0355f2d326969a181696aeb551a25195f47563908a815be/jiter-0.14.0-cp314-cp314t-win_arm64.whl", hash = "sha256:5419d4aa2024961da9fe12a9cfe7484996735dca99e8e090b5c88595ef1951ff", size = 191338, upload-time = "2026-04-10T14:28:02.853Z" },
]
[[package]]
name = "masforensics"
version = "0.1.0"
source = { virtual = "." }
dependencies = [
{ name = "httpx", extra = ["socks"] },
{ name = "openai" },
{ name = "pyyaml" },
{ name = "regipy" },
]
@@ -129,6 +183,7 @@ dev = [
[package.metadata]
requires-dist = [
{ name = "httpx", extras = ["socks"], specifier = ">=0.28.1" },
{ name = "openai", specifier = ">=2.36.0" },
{ name = "pyyaml" },
{ name = "regipy", specifier = ">=6.2.1" },
]
@@ -139,6 +194,25 @@ dev = [
{ name = "pytest-asyncio", specifier = ">=1.3.0" },
]
[[package]]
name = "openai"
version = "2.36.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio" },
{ name = "distro" },
{ name = "httpx" },
{ name = "jiter" },
{ name = "pydantic" },
{ name = "sniffio" },
{ name = "tqdm" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/f4/a1/4d5e84cf51720fc1526cc49e10ac1961abcccb55b0efb3d970db1e9a2728/openai-2.36.0.tar.gz", hash = "sha256:139dea0edd2f1b30c33d46ae1a6929e03906254140318e4608e98fe8c566f2e7", size = 753003, upload-time = "2026-05-07T17:33:17.075Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/9d/1c/5d43735b2553baae2a5e899dcbcd0670a86930d993184d72ca909bf11c9b/openai-2.36.0-py3-none-any.whl", hash = "sha256:143f6194b548dbc2c921af1f1b03b9f14c85fed8a75b5b516f5bcc11a2a50c63", size = 1302361, upload-time = "2026-05-07T17:33:15.063Z" },
]
[[package]]
name = "packaging"
version = "26.0"
@@ -157,6 +231,62 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/54/20/4d324d65cc6d9205fabedc306948156824eb9f0ee1633355a8f7ec5c66bf/pluggy-1.6.0-py3-none-any.whl", hash = "sha256:e920276dd6813095e9377c0bc5566d94c932c33b27a3e3945d8389c374dd4746", size = 20538, upload-time = "2025-05-15T12:30:06.134Z" },
]
[[package]]
name = "pydantic"
version = "2.13.4"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "annotated-types" },
{ name = "pydantic-core" },
{ name = "typing-extensions" },
{ name = "typing-inspection" },
]
sdist = { url = "https://files.pythonhosted.org/packages/18/a5/b60d21ac674192f8ab0ba4e9fd860690f9b4a6e51ca5df118733b487d8d6/pydantic-2.13.4.tar.gz", hash = "sha256:c40756b57adaa8b1efeeced5c196f3f3b7c435f90e84ea7f443901bec8099ef6", size = 844775, upload-time = "2026-05-06T13:43:05.343Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/fd/7b/122376b1fd3c62c1ed9dc80c931ace4844b3c55407b6fb2d199377c9736f/pydantic-2.13.4-py3-none-any.whl", hash = "sha256:45a282cde31d808236fd7ea9d919b128653c8b38b393d1c4ab335c62924d9aba", size = 472262, upload-time = "2026-05-06T13:43:02.641Z" },
]
[[package]]
name = "pydantic-core"
version = "2.46.4"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/9d/56/921726b776ace8d8f5db44c4ef961006580d91dc52b803c489fafd1aa249/pydantic_core-2.46.4.tar.gz", hash = "sha256:62f875393d7f270851f20523dd2e29f082bcc82292d66db2b64ea71f64b6e1c1", size = 471464, upload-time = "2026-05-06T13:37:06.98Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/8d/74/228a26ddad29c6672b805d9fd78e8d251cd04004fa7eed0e622096cd0250/pydantic_core-2.46.4-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:428e04521a40150c85216fc8b85e8d39fece235a9cf5e383761238c7fa9b96fb", size = 2102079, upload-time = "2026-05-06T13:38:41.019Z" },
{ url = "https://files.pythonhosted.org/packages/ad/1f/8970b150a4b4365623ae00fc88603491f763c627311ae8031e3111356d6e/pydantic_core-2.46.4-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:23ace664830ee0bfe014a0c7bc248b1f7f25ed7ad103852c317624a1083af462", size = 1952179, upload-time = "2026-05-06T13:36:59.812Z" },
{ url = "https://files.pythonhosted.org/packages/95/30/5211a831ae054928054b2f79731661087a2bc5c01e825c672b3a4a8f1b3e/pydantic_core-2.46.4-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ce5c1d2a8b27468f433ca974829c44060b8097eedc39933e3c206a90ee49c4a9", size = 1978926, upload-time = "2026-05-06T13:37:39.933Z" },
{ url = "https://files.pythonhosted.org/packages/57/e9/689668733b1eb67adeef047db3c2e8788fcf65a7fd9c9e2b46b7744fe245/pydantic_core-2.46.4-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7283d57845ecf5a163403eb0702dfc220cc4fbdd18919cb5ccea4f95ee1cdab4", size = 2046785, upload-time = "2026-05-06T13:38:01.995Z" },
{ url = "https://files.pythonhosted.org/packages/60/d9/6715260422ff50a2109878fd24d948a6c3446bb2664f34ee78cd972b3acd/pydantic_core-2.46.4-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8daafc69c93ee8a0204506a3b6b30f586ef54028f52aeeeb5c4cfc5184fd5914", size = 2228733, upload-time = "2026-05-06T13:40:50.371Z" },
{ url = "https://files.pythonhosted.org/packages/18/ae/fdb2f64316afca925640f8e70bb1a564b0ec2721c1389e25b8eb4bf9a299/pydantic_core-2.46.4-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:cd2213145bcc2ba85884d0ac63d222fece9209678f77b9b4d76f054c561adb28", size = 2307534, upload-time = "2026-05-06T13:37:21.531Z" },
{ url = "https://files.pythonhosted.org/packages/89/1d/8eff589b45bb8190a9d12c49cfad0f176a5cbd1534908a6b5125e2886239/pydantic_core-2.46.4-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7a5f930472650a82629163023e630d160863fce524c616f4e5186e5de9d9a49b", size = 2099732, upload-time = "2026-05-06T13:39:31.942Z" },
{ url = "https://files.pythonhosted.org/packages/06/d5/ee5a3366637fee41dee51a1fc91562dcf12ddbc68fda34e6b253da2324bb/pydantic_core-2.46.4-cp314-cp314-manylinux_2_31_riscv64.whl", hash = "sha256:c1b3f518abeca3aa13c712fd202306e145abf59a18b094a6bafb2d2bbf59192c", size = 2129627, upload-time = "2026-05-06T13:37:25.033Z" },
{ url = "https://files.pythonhosted.org/packages/94/33/2414be571d2c6a6c4d08be21f9292b6d3fdb08949a97b6dfe985017821db/pydantic_core-2.46.4-cp314-cp314-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:1a7dd0b3ee80d90150e3495a3a13ac34dbcbfd4f012996a6a1d8900e91b5c0fb", size = 2179141, upload-time = "2026-05-06T13:37:14.046Z" },
{ url = "https://files.pythonhosted.org/packages/7b/79/7daa95be995be0eecc4cf75064cb33f9bbbfe3fe0158caf2f0d4a996a5c7/pydantic_core-2.46.4-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:3fb702cd90b0446a3a1c5e470bfa0dd23c0233b676a9099ddcc964fa6ca13898", size = 2184325, upload-time = "2026-05-06T13:36:53.615Z" },
{ url = "https://files.pythonhosted.org/packages/9f/cb/d0a382f5c0de8a222dc61c65348e0ce831b1f68e0a018450d31c2cace3a5/pydantic_core-2.46.4-cp314-cp314-musllinux_1_1_armv7l.whl", hash = "sha256:b8458003118a712e66286df6a707db01c52c0f52f7db8e4a38f0da1d3b94fc4e", size = 2323990, upload-time = "2026-05-06T13:40:29.971Z" },
{ url = "https://files.pythonhosted.org/packages/05/db/d9ba624cc4a5aced1598e88c04fdbd8310c8a69b9d38b9a3d39ce3a61ed7/pydantic_core-2.46.4-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:372429a130e469c9cd698925ce5fc50940b7a1336b0d82038e63d5bbc4edc519", size = 2369978, upload-time = "2026-05-06T13:37:23.027Z" },
{ url = "https://files.pythonhosted.org/packages/f2/20/d15df15ba918c423461905802bfd2981c3af0bfa0e40d05e13edbfa48bc3/pydantic_core-2.46.4-cp314-cp314-win32.whl", hash = "sha256:85bb3611ff1802f3ee7fdd7dbff26b56f343fb432d57a4728fdd49b6ef35e2f4", size = 1966354, upload-time = "2026-05-06T13:38:03.499Z" },
{ url = "https://files.pythonhosted.org/packages/fc/b6/6b8de4c0a7d7ab3004c439c80c5c1e0a3e8d78bbae19379b01960383d9e5/pydantic_core-2.46.4-cp314-cp314-win_amd64.whl", hash = "sha256:811ff8e9c313ab425368bcbb36e5c4ebd7108c2bbf4e4089cfbb0b01eff63fac", size = 2072238, upload-time = "2026-05-06T13:39:40.807Z" },
{ url = "https://files.pythonhosted.org/packages/32/36/51eb763beec1f4cf59b1db243a7dcc39cbb41230f050a09b9d69faaf0a48/pydantic_core-2.46.4-cp314-cp314-win_arm64.whl", hash = "sha256:bfec22eab3c8cc2ceec0248aec886624116dc079afa027ecc8ad4a7e62010f8a", size = 2018251, upload-time = "2026-05-06T13:37:26.72Z" },
{ url = "https://files.pythonhosted.org/packages/e8/91/855af51d625b23aa987116a19e231d2aaef9c4a415273ddc189b79a45fee/pydantic_core-2.46.4-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:af8244b2bef6aaad6d92cda81372de7f8c8d36c9f0c3ea36e827c60e7d9467a0", size = 2099593, upload-time = "2026-05-06T13:39:47.682Z" },
{ url = "https://files.pythonhosted.org/packages/fb/1b/8784a54c65edb5f49f0a14d6977cf1b209bba85a4c77445b255c2de58ab3/pydantic_core-2.46.4-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:5a4330cdbc57162e4b3aa303f588ba752257694c9c9be3e7ebb11b4aca659b5d", size = 1935226, upload-time = "2026-05-06T13:40:40.428Z" },
{ url = "https://files.pythonhosted.org/packages/e8/e7/1955d28d1afc56dd4b3ad7cc0cf39df1b9852964cf16e5d13912756d6d6b/pydantic_core-2.46.4-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:29c61fc04a3d840155ff08e475a04809278972fe6aef51e2720554e96367e34b", size = 1974605, upload-time = "2026-05-06T13:37:32.029Z" },
{ url = "https://files.pythonhosted.org/packages/93/e2/3fedbf0ba7a22850e6e9fd78117f1c0f10f950182344d8a6c535d468fdd8/pydantic_core-2.46.4-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c50f2528cf200c5eed56faf3f4e22fcd5f38c157a8b78576e6ba3168ec35f000", size = 2030777, upload-time = "2026-05-06T13:38:55.239Z" },
{ url = "https://files.pythonhosted.org/packages/f8/61/46be275fcaaba0b4f5b9669dd852267ce1ff616592dccf7a7845588df091/pydantic_core-2.46.4-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:0cbe8b01f948de4286c74cdd6c667aceb38f5c1e26f0693b3983d9d74887c65e", size = 2236641, upload-time = "2026-05-06T13:37:08.096Z" },
{ url = "https://files.pythonhosted.org/packages/60/db/12e93e46a8bac9988be3c016860f83293daea8c716c029c9ace279036f2f/pydantic_core-2.46.4-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:617d7e2ca7dcb8c5cf6bcb8c59b8832c94b36196bbf1cbd1bfb56ed341905edd", size = 2286404, upload-time = "2026-05-06T13:40:20.221Z" },
{ url = "https://files.pythonhosted.org/packages/e2/4a/4d8b19008f38d31c53b8219cfedc2e3d5de5fe99d90076b7e767de29274f/pydantic_core-2.46.4-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7027560ee92211647d0d34e3f7cd6f50da56399d26a9c8ad0da286d3869a53f3", size = 2109219, upload-time = "2026-05-06T13:38:12.153Z" },
{ url = "https://files.pythonhosted.org/packages/88/70/3cbc40978fefb7bb09c6708d40d4ad1a5d70fd7213c3d17f971de868ec1f/pydantic_core-2.46.4-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:f99626688942fb746e545232e7726926f3be91b5975f8b55327665fafda991c7", size = 2110594, upload-time = "2026-05-06T13:40:02.971Z" },
{ url = "https://files.pythonhosted.org/packages/9d/20/b8d36736216e29491125531685b2f9e61aa5b4b2599893f8268551da3338/pydantic_core-2.46.4-cp314-cp314t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:fc3e9034a63de20e15e8ade85358bc6efc614008cab72898b4b4952bea0509ff", size = 2159542, upload-time = "2026-05-06T13:39:27.506Z" },
{ url = "https://files.pythonhosted.org/packages/1d/a2/367df868eb584dacf6bf82a389272406d7178e301c4ac82545ab98bc2dd9/pydantic_core-2.46.4-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:97e7cf2be5c77b7d1a9713a05605d49460d02c6078d38d8bef3cbe323c548424", size = 2168146, upload-time = "2026-05-06T13:38:31.93Z" },
{ url = "https://files.pythonhosted.org/packages/c1/b8/4460f77f7e201893f649a29ab355dddd3beee8a97bcb1a320db414f9a06e/pydantic_core-2.46.4-cp314-cp314t-musllinux_1_1_armv7l.whl", hash = "sha256:3bf92c5d0e00fefaab325a4d27828fe6b6e2a21848686b5b60d2d9eeb09d76c6", size = 2306309, upload-time = "2026-05-06T13:37:44.717Z" },
{ url = "https://files.pythonhosted.org/packages/64/c4/be2639293acd87dc8ddbcec41a73cee9b2ebf996fe6d892a1a74e88ad3f7/pydantic_core-2.46.4-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:3ecbc122d18468d06ca279dc26a8c2e2d5acb10943bb35e36ae92096dc3b5565", size = 2369736, upload-time = "2026-05-06T13:37:05.645Z" },
{ url = "https://files.pythonhosted.org/packages/30/a6/9f9f380dbb301f67023bf8f707aaa75daadf84f7152d95c410fd7e81d994/pydantic_core-2.46.4-cp314-cp314t-win32.whl", hash = "sha256:e846ae7835bf0703ae43f534ab79a867146dadd59dc9ca5c8b53d5c8f7c9ef02", size = 1955575, upload-time = "2026-05-06T13:38:51.116Z" },
{ url = "https://files.pythonhosted.org/packages/40/1f/f1eb9eb350e795d1af8586289746f5c5677d16043040d63710e22abc43c9/pydantic_core-2.46.4-cp314-cp314t-win_amd64.whl", hash = "sha256:2108ba5c1c1eca18030634489dc544844144ee36357f2f9f780b93e7ddbb44b5", size = 2051624, upload-time = "2026-05-06T13:38:21.672Z" },
{ url = "https://files.pythonhosted.org/packages/f6/d2/42dd53d0a85c27606f316d3aa5d2869c4e8470a5ed6dec30e4a1abe19192/pydantic_core-2.46.4-cp314-cp314t-win_arm64.whl", hash = "sha256:4fcbe087dbc2068af7eda3aa87634eba216dbda64d1ae73c8684b621d33f6596", size = 2017325, upload-time = "2026-05-06T13:40:52.723Z" },
]
[[package]]
name = "pygments"
version = "2.20.0"
@@ -243,6 +373,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/65/eb/db13ab9b8d54e04f42b6619acca417ee37b07eb141a54884d13d20d7459e/regipy-6.2.1-py3-none-any.whl", hash = "sha256:b03110e5c4e12385e1ba53c032ccd120c6dcde1b71afb8c3b7aa4717a5a24e43", size = 134861, upload-time = "2026-01-22T15:26:05.653Z" },
]
[[package]]
name = "sniffio"
version = "1.3.1"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/a2/87/a6771e1546d97e7e041b6ae58d80074f81b7d5121207425c964ddf5cfdbd/sniffio-1.3.1.tar.gz", hash = "sha256:f4324edc670a0f49750a81b895f35c3adb843cca46f0530f79fc1babb23789dc", size = 20372, upload-time = "2024-02-25T23:20:04.057Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235, upload-time = "2024-02-25T23:20:01.196Z" },
]
[[package]]
name = "socksio"
version = "1.0.0"
@@ -251,3 +390,36 @@ sdist = { url = "https://files.pythonhosted.org/packages/f8/5c/48a7d9495be3d1c65
wheels = [
{ url = "https://files.pythonhosted.org/packages/37/c3/6eeb6034408dac0fa653d126c9204ade96b819c936e136c5e8a6897eee9c/socksio-1.0.0-py3-none-any.whl", hash = "sha256:95dc1f15f9b34e8d7b16f06d74b8ccf48f609af32ab33c608d08761c5dcbb1f3", size = 12763, upload-time = "2020-04-17T15:50:31.878Z" },
]
[[package]]
name = "tqdm"
version = "4.67.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "colorama", marker = "sys_platform == 'win32'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/09/a9/6ba95a270c6f1fbcd8dac228323f2777d886cb206987444e4bce66338dd4/tqdm-4.67.3.tar.gz", hash = "sha256:7d825f03f89244ef73f1d4ce193cb1774a8179fd96f31d7e1dcde62092b960bb", size = 169598, upload-time = "2026-02-03T17:35:53.048Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/16/e1/3079a9ff9b8e11b846c6ac5c8b5bfb7ff225eee721825310c91b3b50304f/tqdm-4.67.3-py3-none-any.whl", hash = "sha256:ee1e4c0e59148062281c49d80b25b67771a127c85fc9676d3be5f243206826bf", size = 78374, upload-time = "2026-02-03T17:35:50.982Z" },
]
[[package]]
name = "typing-extensions"
version = "4.15.0"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/72/94/1a15dd82efb362ac84269196e94cf00f187f7ed21c242792a923cdb1c61f/typing_extensions-4.15.0.tar.gz", hash = "sha256:0cea48d173cc12fa28ecabc3b837ea3cf6f38c6d1136f85cbaaf598984861466", size = 109391, upload-time = "2025-08-25T13:49:26.313Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/18/67/36e9267722cc04a6b9f15c7f3441c2363321a3ea07da7ae0c0707beb2a9c/typing_extensions-4.15.0-py3-none-any.whl", hash = "sha256:f0fa19c6845758ab08074a0cfa8b7aecb71c999ca73d62883bc25cc018c4e548", size = 44614, upload-time = "2025-08-25T13:49:24.86Z" },
]
[[package]]
name = "typing-inspection"
version = "0.4.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/55/e3/70399cb7dd41c10ac53367ae42139cf4b1ca5f36bb3dc6c9d33acdb43655/typing_inspection-0.4.2.tar.gz", hash = "sha256:ba561c48a67c5958007083d386c3295464928b01faa735ab8547c5692e87f464", size = 75949, upload-time = "2025-10-01T02:14:41.687Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/dc/9b/47798a6c91d8bdb567fe2698fe81e0c6b7cb7ef4d13da4114b41d239f65d/typing_inspection-0.4.2-py3-none-any.whl", hash = "sha256:4ed1cacbdc298c220f1bd249ed5287caa16f34d44ef4e9c3d0cbad5b521545e7", size = 14611, upload-time = "2025-10-01T02:14:40.154Z" },
]