# Phase 12 Metrics

APT detection is highly imbalanced. Accuracy is not sufficient.

## Required Metrics

- AUPRC;
- AUROC;
- Macro-F1;
- Precision@K;
- Recall@K;
- FPR at fixed recall;
- attack-case recall;
- process-level recall;
- event-level recall;
- detection delay;
- token length;
- inference cost;
- prompt construction time;
- summary cache hit rate;
- evidence path hit rate;
- false positive and false negative case analysis.

## Reporting Layers

Reports must distinguish:

- candidate generation recall;
- final classification performance on candidates;
- end-to-end performance.

AUPRC is a primary metric.
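
A minimal sketch of how a few of the required ranking metrics (AUPRC estimated as average precision, Precision@K, Recall@K, and FPR at fixed recall) could be computed from per-item scores and binary labels. The function names, the 0/1 label encoding, and the tie handling are assumptions made for illustration and are not defined by this document; a real pipeline would more likely rely on a library implementation.

```python
from typing import List, Sequence, Tuple


def _sorted_pairs(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, int]]:
    """Pair each score with its 0/1 label and sort by descending score."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    return sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)


def precision_recall_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> Tuple[float, float]:
    """Precision@K and Recall@K over the K highest-scored items.

    Assumption: ties at the cut-off are broken by input order (stable sort).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top = _sorted_pairs(scores, labels)[:k]
    true_pos = sum(label for _, label in top)
    total_pos = sum(labels)
    precision = true_pos / len(top) if top else 0.0
    recall = true_pos / total_pos if total_pos else 0.0
    return precision, recall


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Step-wise AUPRC estimate: sum of (recall gain * precision) per distinct threshold."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    pairs = _sorted_pairs(scores, labels)
    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Items sharing a score are admitted together, so ties cannot reorder the curve.
        while i < len(pairs) and pairs[i][0] == threshold:
            tp += pairs[i][1]
            fp += 1 - pairs[i][1]
            i += 1
        recall = tp / total_pos
        ap += (recall - prev_recall) * (tp / (tp + fp))
        prev_recall = recall
    return ap


def fpr_at_recall(scores: Sequence[float], labels: Sequence[int], target_recall: float) -> float:
    """False positive rate at the highest threshold whose recall reaches target_recall."""
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes must be present")
    pairs = _sorted_pairs(scores, labels)
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == threshold:
            tp += pairs[i][1]
            fp += 1 - pairs[i][1]
            i += 1
        if tp / total_pos >= target_recall:
            return fp / total_neg
    return 1.0
```

To respect the reporting layers, the same functions can be applied to three separate inputs: the full population (end-to-end), the candidate set only (final classification on candidates), and the set of true attack items checked against candidate membership (candidate generation recall). Keeping those inputs separate avoids reporting candidate-only scores as end-to-end results.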