Initial commit: ER-TP-DGP research prototype

Event-Reified Temporal Provenance Dual-Granularity Prompting for
LLM-based APT detection on DARPA provenance datasets.

Includes phase 0-14 method spec, IR/graph/metapath/trimming/prompt
modules, scripts for THEIA candidate universe, landmark CSG construction,
hybrid prompting, and LLM inference. Excludes data/, reports/, and
local LLM config from version control.
BattleTag
2026-05-15 16:53:57 +08:00
commit b86ae87b75
88 changed files with 18570 additions and 0 deletions

docs/phase7_trimming.md Normal file

@@ -0,0 +1,36 @@
# Phase 7 Temporal Security-aware Metapath Trimming
Trimming selects evidence paths under each metapath before prompt construction.
It is neither random sampling nor BFS truncation: paths are ranked by the
scoring dimensions listed below.
## Main Scoring Dimensions
- structural relevance;
- metapath diffusion similarity or its current explicit scaffold;
- temporal proximity to the target;
- behavior rarity;
- semantic similarity to target process/file/network context;
- path length penalty;
- security-stage relevance;
- rare path, parent-child, endpoint, or file interaction signals;
- valid target-relative time window.
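
The dimensions above could be combined in many ways; this spec does not fix a formula. The following is a minimal sketch that assumes a plain weighted sum, treats the time window as a hard filter, and subtracts a per-hop length penalty. The feature keys, `WEIGHTS`, `LENGTH_PENALTY`, `trimming_score`, and `select_top_k` are all hypothetical names and values chosen for illustration, not part of the prototype.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical uniform weights: the spec names the dimensions, not their weights.
WEIGHTS: Dict[str, float] = {
    "structural_relevance": 1.0,
    "diffusion_similarity": 1.0,
    "temporal_proximity": 1.0,
    "behavior_rarity": 1.0,
    "semantic_similarity": 1.0,
    "security_stage_relevance": 1.0,
    "interaction_signal": 1.0,  # rare path / parent-child / endpoint / file
}
LENGTH_PENALTY = 0.1  # assumed penalty per hop


def trimming_score(
    features: Dict[str, float], path_length: int, in_time_window: bool
) -> Optional[float]:
    """Return a score, or None if the path is outside the valid time window."""
    if not in_time_window:
        return None  # assumption: the window acts as a hard filter
    weighted = sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return weighted - LENGTH_PENALTY * path_length


def select_top_k(candidates: List[dict], k: int) -> List[Tuple[str, float]]:
    """Rank candidate paths by score and keep the k best (path_id, score) pairs."""
    scored = []
    for cand in candidates:
        score = trimming_score(
            cand["features"], cand["path_length"], cand["in_time_window"]
        )
        if score is not None:
            scored.append((cand["path_id"], score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Keeping the window check separate from the weighted terms means an out-of-window path can never be rescued by a high rarity or semantic score, which is one reasonable reading of "valid target-relative time window".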
## Output Contract
Each selected evidence path must include:
- `path_id`;
- `metapath_type`;
- ordered event IDs;
- ordered entity/event node IDs;
- timestamps;
- raw actions;
- selected reason;
- trimming score;
- summary status.
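
One way to make this contract checkable is a small record type. The sketch below is illustrative only: `path_id` and `metapath_type` come from the list above, while every other field name, the `validate` method, and its rules (one timestamp and one raw action per event, non-decreasing timestamps) are assumptions rather than the prototype's actual schema.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class EvidencePath:
    path_id: str
    metapath_type: str
    event_ids: List[str]       # ordered event IDs
    node_ids: List[str]        # ordered entity/event node IDs
    timestamps: List[int]      # assumed: one per event, same order as event_ids
    raw_actions: List[str]     # assumed: one per event, same order as event_ids
    selected_reason: str
    trimming_score: float
    summary_status: str

    def validate(self) -> None:
        """Raise ValueError if the record breaks the assumed contract rules."""
        if not self.path_id or not self.metapath_type:
            raise ValueError("path_id and metapath_type must be non-empty")
        if not self.event_ids or not self.node_ids:
            raise ValueError("event_ids and node_ids must be non-empty")
        n = len(self.event_ids)
        if len(self.timestamps) != n or len(self.raw_actions) != n:
            raise ValueError("timestamps and raw_actions must align with event_ids")
        pairs = zip(self.timestamps, self.timestamps[1:])
        if any(later < earlier for earlier, later in pairs):
            raise ValueError("timestamps must be non-decreasing")
```

A record that passes `validate()` can be serialized with `asdict(path)` before it is handed to prompt construction.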
## Ablations
Random neighbors, shortest path only, BFS-only, no temporal term, and no
security-aware term are used only as ablation or baseline settings, never as
the main trimming method.
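
To keep these settings clearly separated from the main method, they can be expressed as explicit modes of a single selector. This is a hedged sketch: `TrimMode`, `select_paths`, the candidate dictionary layout, and the `"temporal"` and `"security"` term keys are hypothetical, and the BFS-only branch assumes candidates arrive in BFS discovery order.

```python
import random
from enum import Enum
from typing import List


class TrimMode(Enum):
    FULL = "full"                          # main method: all scoring terms
    RANDOM_NEIGHBORS = "random_neighbors"  # baseline
    SHORTEST_PATH_ONLY = "shortest_path_only"
    BFS_ONLY = "bfs_only"
    NO_TEMPORAL = "no_temporal"            # ablation: drop the temporal term
    NO_SECURITY = "no_security"            # ablation: drop the security term


def select_paths(
    candidates: List[dict], k: int, mode: TrimMode = TrimMode.FULL, seed: int = 0
) -> List[dict]:
    """Pick k candidate paths under the given mode."""
    if mode is TrimMode.RANDOM_NEIGHBORS:
        rng = random.Random(seed)  # seeded so baseline runs are repeatable
        return rng.sample(candidates, min(k, len(candidates)))
    if mode is TrimMode.BFS_ONLY:
        return candidates[:k]  # assumes input is already in BFS order
    if mode is TrimMode.SHORTEST_PATH_ONLY:
        return sorted(candidates, key=lambda c: c["length"])[:k]

    def score(cand: dict) -> float:
        terms = dict(cand["terms"])  # copy so the candidate is not mutated
        if mode is TrimMode.NO_TEMPORAL:
            terms.pop("temporal", None)
        if mode is TrimMode.NO_SECURITY:
            terms.pop("security", None)
        return sum(terms.values())

    return sorted(candidates, key=score, reverse=True)[:k]
```

Routing every setting through one function makes it harder to report a baseline result as if it came from the full trimming method.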