Event-Reified Temporal Provenance Dual-Granularity Prompting for LLM-based APT detection on DARPA provenance datasets. Includes phase 0-14 method spec, IR/graph/metapath/trimming/prompt modules, scripts for THEIA candidate universe, landmark CSG construction, hybrid prompting, and LLM inference. Excludes data/, reports/, and local LLM config from version control.
Phase 12 Metrics
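APT detection is a highly imbalanced problem: malicious entities are a very small fraction of all observed activity. Accuracy alone is therefore not sufficient, because a detector that labels everything benign still scores near-perfect accuracy.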
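The point about accuracy can be illustrated with a minimal sketch. The class ratio (10 malicious items out of 10,000) is a hypothetical toy figure, not taken from any DARPA dataset, and the helper names are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of items whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives (label 1) that were predicted positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    total_pos = sum(y_true)
    return true_pos / total_pos if total_pos else 0.0


# Hypothetical toy data: 10 malicious items among 10,000.
y_true = [1] * 10 + [0] * 9990
# A trivial detector that predicts "benign" for everything.
y_pred = [0] * 10000

print(accuracy(y_true, y_pred))  # 0.999
print(recall(y_true, y_pred))    # 0.0
```

The trivial detector reaches 99.9% accuracy while finding no attacks, which is why the metrics listed below focus on ranking quality and recall.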
Required Metrics
- AUPRC;
- AUROC;
- Macro-F1;
- Precision@K;
- Recall@K;
- FPR at fixed recall;
- attack-case recall;
- process-level recall;
- event-level recall;
- detection delay;
- token length;
- inference cost;
- prompt construction time;
- summary cache hit rate;
- evidence path hit rate;
- false positive and false negative case analysis.
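Several of the ranking metrics above can be sketched in plain Python. This is a minimal illustration, not the project's evaluation code: the function names are hypothetical, ties between scores are broken by input order, and AUPRC is approximated by average precision (the mean of the precision values at each positive's rank).

```python
def rank_by_score(y_true, scores):
    """Return the labels sorted by descending anomaly score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [y_true[i] for i in order]


def precision_at_k(y_true, scores, k):
    """Fraction of the top-k ranked items that are true positives."""
    return sum(rank_by_score(y_true, scores)[:k]) / k


def recall_at_k(y_true, scores, k):
    """Fraction of all true positives that appear in the top-k items."""
    total_pos = sum(y_true)
    if total_pos == 0:
        return 0.0
    return sum(rank_by_score(y_true, scores)[:k]) / total_pos


def average_precision(y_true, scores):
    """Step-wise AUPRC estimate: mean precision at each positive's rank."""
    hits, total = 0, 0.0
    for rank, label in enumerate(rank_by_score(y_true, scores), start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def fpr_at_recall(y_true, scores, target_recall):
    """FPR at the shortest ranking prefix whose recall reaches the target."""
    ranked = rank_by_score(y_true, scores)
    total_pos = sum(ranked)
    total_neg = len(ranked) - total_pos
    if total_pos == 0:
        raise ValueError("recall is undefined without positive labels")
    tp = fp = 0
    for label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        if tp / total_pos >= target_recall:
            return fp / total_neg if total_neg else 0.0
    return 1.0


# Hypothetical toy scores: 2 malicious items among 8.
y_true = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

print(precision_at_k(y_true, scores, 2))           # 0.5
print(recall_at_k(y_true, scores, 2))              # 0.5
print(round(average_precision(y_true, scores), 3)) # 0.833
print(round(fpr_at_recall(y_true, scores, 1.0), 3))# 0.167
```

A library implementation may interpolate the precision-recall curve or handle tied scores differently, so values computed this way should be compared only against results produced by the same procedure.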
Reporting Layers
Reports must distinguish:
- candidate generation recall;
- final classification performance on candidates;
- end-to-end performance.
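AUPRC is a primary metric, because it reflects ranking quality on the rare positive class and is less flattered by the large benign majority than accuracy or AUROC.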
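The three reporting layers can be kept separate with a small helper. The sketch below is an assumption about how the layers relate, shown for recall only: the function name, the identifier sets, and the toy process IDs are all hypothetical.

```python
def layered_recall(attack_ids, candidate_ids, flagged_ids):
    """Report recall separately for each of the three reporting layers."""
    attack = set(attack_ids)
    candidates = set(candidate_ids)
    # Only items that survived candidate generation can be flagged.
    flagged = set(flagged_ids) & candidates

    attack_in_candidates = attack & candidates
    detected = attack & flagged

    candidate_recall = len(attack_in_candidates) / len(attack) if attack else 0.0
    classifier_recall = (
        len(detected) / len(attack_in_candidates) if attack_in_candidates else 0.0
    )
    end_to_end_recall = len(detected) / len(attack) if attack else 0.0

    return {
        "candidate_generation_recall": candidate_recall,
        "classification_recall_on_candidates": classifier_recall,
        "end_to_end_recall": end_to_end_recall,
    }


# Hypothetical toy IDs: 4 attack processes, 5 candidates, 3 flagged.
report = layered_recall(
    attack_ids=["p1", "p2", "p3", "p4"],
    candidate_ids=["p1", "p2", "p3", "b1", "b2"],
    flagged_ids=["p1", "p2", "b1"],
)
for layer, value in report.items():
    print(layer, round(value, 3))
```

In this toy case the classifier recovers two of the three attack processes it was shown, but end-to-end recall is only 0.5 because one attack process never reached the candidate set. Reporting classification performance on candidates alone would hide that loss.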