Files
ER-TP-DGP/docs/phase12_metrics.md
BattleTag b86ae87b75 Initial commit: ER-TP-DGP research prototype
Event-Reified Temporal Provenance Dual-Granularity Prompting for
LLM-based APT detection on DARPA provenance datasets.

Includes phase 0-14 method spec, IR/graph/metapath/trimming/prompt
modules, scripts for THEIA candidate universe, landmark CSG construction,
hybrid prompting, and LLM inference. Excludes data/, reports/, and
local LLM config from version control.
2026-05-15 16:53:57 +08:00

623 B

Phase 12 Metrics

APT detection is highly imbalanced. Accuracy is not sufficient.

Required Metrics

  • AUPRC;
  • AUROC;
  • Macro-F1;
  • Precision@K;
  • Recall@K;
  • FPR at fixed recall;
  • attack-case recall;
  • process-level recall;
  • event-level recall;
  • detection delay;
  • token length;
  • inference cost;
  • prompt construction time;
  • summary cache hit rate;
  • evidence path hit rate;
  • false positive and false negative case analysis.

Reporting Layers

Reports must distinguish:

  • candidate generation recall;
  • final classification performance on candidates;
  • end-to-end performance.

AUPRC is a primary metric.