Event-Reified Temporal Provenance Dual-Granularity Prompting for LLM-based APT detection on DARPA provenance datasets. Includes phase 0-14 method spec, IR/graph/metapath/trimming/prompt modules, scripts for THEIA candidate universe, landmark CSG construction, hybrid prompting, and LLM inference. Excludes data/, reports/, and local LLM config from version control.
Phase 5 Candidate Target Generation
Candidate generation reduces LLM call volume. It is not the final detector.
Allowed Signals
Signals must be label-free:
- rare parent-child process relation;
- rare process path;
- rare file path;
- first-seen external endpoint;
- write-then-execute behavior;
- read-then-send behavior;
- unusual process tree depth;
- login followed by lateral communication;
- statistical anomaly or weak detector alert.
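As a minimal sketch of how two of the signals above could be computed without labels, the following assumes a hypothetical event format: a list of dicts with `ts`, `type`, and either `parent`/`child` or `path` keys. The field names, the `max_count` threshold, and both function names are illustrative assumptions, not the repository's IR schema.

```python
from collections import Counter


def rare_parent_child(events, max_count=1):
    """Return (parent, child) pairs observed at most max_count times.

    Uses only event frequencies, never labels.
    """
    counts = Counter(
        (e["parent"], e["child"]) for e in events if e["type"] == "fork"
    )
    return {pair for pair, n in counts.items() if n <= max_count}


def write_then_execute(events):
    """Return file paths that were written and later executed."""
    written = set()
    hits = set()
    # Process events in timestamp order so "then" means "later in time".
    for e in sorted(events, key=lambda e: e["ts"]):
        if e["type"] == "write":
            written.add(e["path"])
        elif e["type"] == "exec" and e["path"] in written:
            hits.add(e["path"])
    return hits
```

Both functions return sets of candidate keys, so their outputs can be unioned into a single candidate universe before prompting. The threshold for "rare" is a tuning choice and should be fixed without reference to labels.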
Required Evaluation
Candidate generation is evaluated separately from final LLM classification:
- candidate generation recall;
- candidate generation precision;
- number of candidates;
- positive coverage, reported separately for process targets and event targets;
- end-to-end recall after LLM classification.
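The metrics above can be sketched as simple set arithmetic. This is an illustrative helper, not the project's evaluation script: `candidate_metrics` and its arguments are hypothetical names, and targets are assumed to be hashable identifiers. Ground-truth positives are used here only for evaluation, never for generating candidates.

```python
def candidate_metrics(candidates, positives, llm_flagged):
    """Compute candidate-stage and end-to-end metrics from target ID sets.

    candidates:  targets emitted by candidate generation
    positives:   ground-truth malicious targets (evaluation only)
    llm_flagged: targets the LLM classified as malicious
    """
    candidates = set(candidates)
    positives = set(positives)
    llm_flagged = set(llm_flagged)

    hit = candidates & positives
    # Only candidates reach the LLM, so an end-to-end hit must also be a candidate.
    end_to_end_hit = hit & llm_flagged

    return {
        "num_candidates": len(candidates),
        "candidate_recall": len(hit) / len(positives) if positives else 0.0,
        "candidate_precision": len(hit) / len(candidates) if candidates else 0.0,
        "end_to_end_recall": len(end_to_end_hit) / len(positives) if positives else 0.0,
    }
```

Calling the helper once on process targets and once on event targets gives the per-target-type coverage. Candidate recall is an upper bound on end-to-end recall, which is why the two are reported separately.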
Checks
- Candidate generation must not use test labels.
- Candidate generation must not use attack report narratives.
- Weak signals are retained for audit but do not replace ER-TP-DGP.
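One way to enforce the first two checks mechanically is a guard that rejects label-derived or report-derived inputs before candidate generation runs. The sketch below is an assumption about how such a guard might look; the field names in `FORBIDDEN_FIELDS` and the function `check_label_free` are hypothetical and would need to match the actual feature schema.

```python
# Hypothetical names for fields derived from labels or attack report narratives.
FORBIDDEN_FIELDS = {"label", "is_attack", "report_text"}


def check_label_free(feature_names):
    """Raise ValueError if any candidate-generation input is label-derived."""
    leaked = sorted(set(feature_names) & FORBIDDEN_FIELDS)
    if leaked:
        raise ValueError(
            f"label-derived fields used in candidate generation: {leaked}"
        )
    return True
```

A name-based guard like this only catches direct leakage. Indirect leakage, such as thresholds tuned on test data, still needs manual review.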