Initial commit: ER-TP-DGP research prototype
Event-Reified Temporal Provenance Dual-Granularity Prompting for LLM-based APT detection on DARPA provenance datasets. Includes phase 0-14 method spec, IR/graph/metapath/trimming/prompt modules, scripts for THEIA candidate universe, landmark CSG construction, hybrid prompting, and LLM inference. Excludes data/, reports/, and local LLM config from version control.
This commit is contained in:
33
docs/phase12_metrics.md
Normal file
33
docs/phase12_metrics.md
Normal file
@@ -0,0 +1,33 @@
|
||||
# Phase 12 Metrics
|
||||
|
||||
APT detection is highly imbalanced. Accuracy is not sufficient.
|
||||
|
||||
## Required Metrics
|
||||
|
||||
- AUPRC;
|
||||
- AUROC;
|
||||
- Macro-F1;
|
||||
- Precision@K;
|
||||
- Recall@K;
|
||||
- FPR at fixed recall;
|
||||
- attack-case recall;
|
||||
- process-level recall;
|
||||
- event-level recall;
|
||||
- detection delay;
|
||||
- token length;
|
||||
- inference cost;
|
||||
- prompt construction time;
|
||||
- summary cache hit rate;
|
||||
- evidence path hit rate;
|
||||
- false positive and false negative case analysis.
|
||||
|
||||
## Reporting Layers
|
||||
|
||||
Reports must distinguish:
|
||||
|
||||
- candidate generation recall;
|
||||
- final classification performance on candidates;
|
||||
- end-to-end performance.
|
||||
|
||||
AUPRC is a primary metric.
|
||||
|
||||
Reference in New Issue
Block a user