Initial commit: ER-TP-DGP research prototype

Event-Reified Temporal Provenance Dual-Granularity Prompting for
LLM-based APT detection on DARPA provenance datasets.

Includes phase 0-14 method spec, IR/graph/metapath/trimming/prompt
modules, scripts for THEIA candidate universe, landmark CSG construction,
hybrid prompting, and LLM inference. Excludes data/, reports/, and
local LLM config from version control.
This commit is contained in:
BattleTag
2026-05-15 16:53:57 +08:00
commit b86ae87b75
88 changed files with 18570 additions and 0 deletions

33
docs/phase12_metrics.md Normal file
View File

@@ -0,0 +1,33 @@
# Phase 12 Metrics
APT detection is highly imbalanced. Accuracy is not sufficient.
## Required Metrics
- AUPRC;
- AUROC;
- Macro-F1;
- Precision@K;
- Recall@K;
- FPR at fixed recall;
- attack-case recall;
- process-level recall;
- event-level recall;
- detection delay;
- token length;
- inference cost;
- prompt construction time;
- summary cache hit rate;
- evidence path hit rate;
- false positive and false negative case analysis.
## Reporting Layers
Reports must distinguish:
- candidate generation recall;
- final classification performance on candidates;
- end-to-end performance.
AUPRC is a primary metric.