Initial commit: ER-TP-DGP research prototype
Event-Reified Temporal Provenance Dual-Granularity Prompting for LLM-based APT detection on DARPA provenance datasets. Includes phase 0-14 method spec, IR/graph/metapath/trimming/prompt modules, scripts for THEIA candidate universe, landmark CSG construction, hybrid prompting, and LLM inference. Excludes data/, reports/, and local LLM config from version control.
docs/phase5_candidates.md
@@ -0,0 +1,34 @@
# Phase 5 Candidate Target Generation

Candidate generation reduces LLM call volume by narrowing the set of targets that are sent to the LLM. It is not the final detector.

## Allowed Signals

Signals must be label-free:

- rare parent-child process relation;
- rare process path;
- rare file path;
- first-seen external endpoint;
- write-then-execute behavior;
- read-then-send behavior;
- unusual process tree depth;
- login followed by lateral communication;
- statistical anomaly or weak detector alert.

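
Two of the signals above (rare parent-child process relation and write-then-execute behavior) can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Event` record, its field names, and the `max_count` rarity threshold are all assumptions made for the example and do not reflect the repository's IR schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical, simplified event record (not the project's IR schema).
    ts: int          # timestamp, any monotonically increasing unit
    pid: int         # process instance that performed the action
    exe: str         # executable name of that process
    parent_exe: str  # executable name of its parent process
    action: str      # e.g. "read", "write", "execute", "send"
    obj: str         # file path or remote endpoint touched by the action


def rare_parent_child(events, max_count=1):
    """Return pids whose (parent_exe, exe) relation is seen in at most
    max_count distinct process instances."""
    instances = {(e.parent_exe, e.exe, e.pid) for e in events}
    pair_counts = Counter((parent, child) for parent, child, _ in instances)
    return {pid for parent, child, pid in instances
            if pair_counts[(parent, child)] <= max_count}


def write_then_execute(events):
    """Return pids that execute a path which was written earlier."""
    first_write = {}  # path -> earliest write timestamp
    flagged = set()
    for e in sorted(events, key=lambda ev: ev.ts):
        if e.action == "write":
            first_write.setdefault(e.obj, e.ts)
        elif e.action == "execute" and first_write.get(e.obj, e.ts) < e.ts:
            flagged.add(e.pid)
    return flagged


def generate_candidates(events):
    """Union the per-signal hits, recording which signals fired per pid."""
    hits = {}
    for name, signal in (("rare_parent_child", rare_parent_child),
                         ("write_then_execute", write_then_execute)):
        for pid in signal(events):
            hits.setdefault(pid, []).append(name)
    return hits
```

Neither function reads a label field, so both are label-free in the sense required here. Returning the names of the signals that fired, rather than a bare set of pids, is one way to keep weak signals available for later audit.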
## Required Evaluation

Candidate generation is evaluated separately from final LLM classification:

- candidate generation recall;
- candidate generation precision;
- number of candidates;
- positive coverage by process/event target;
- end-to-end recall after LLM classification.

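
The metrics listed above can be computed from three sets of target IDs. The sketch below assumes that only candidates are forwarded to the LLM and that targets are identified by hashable IDs; the function names, argument names, and the `target_type` mapping are hypothetical and chosen for illustration only.

```python
def candidate_metrics(candidates, positives, llm_flagged):
    """Stage-wise metrics for one evaluation split.

    candidates:  set of target IDs emitted by candidate generation
    positives:   set of ground-truth malicious target IDs (evaluation only)
    llm_flagged: set of target IDs the LLM classified as malicious
    """
    covered = candidates & positives
    # Only candidates reach the LLM, so an end-to-end hit must be a
    # positive that was both generated as a candidate and flagged.
    end_to_end_hits = llm_flagged & covered
    return {
        "num_candidates": len(candidates),
        "candidate_recall": len(covered) / len(positives) if positives else 0.0,
        "candidate_precision": len(covered) / len(candidates) if candidates else 0.0,
        "end_to_end_recall": len(end_to_end_hits) / len(positives) if positives else 0.0,
    }


def coverage_by_type(candidates, positives, target_type):
    """Fraction of positives covered by candidates, per target type.

    target_type maps a target ID to a type name such as "process" or "event".
    """
    coverage = {}
    for kind in sorted({target_type[t] for t in positives}):
        positives_of_kind = {t for t in positives if target_type[t] == kind}
        coverage[kind] = len(positives_of_kind & candidates) / len(positives_of_kind)
    return coverage
```

Under the stated assumption, candidate generation recall is an upper bound on end-to-end recall, which is one reason to report the two numbers separately: a low end-to-end recall can then be attributed to the candidate stage or to the LLM stage.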
## Checks

- Candidate generation must not use test labels.
- Candidate generation must not use attack report narratives.
- Weak signals are retained for audit but do not replace ER-TP-DGP.

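
The first two checks could be enforced mechanically with a guard that runs before candidate generation. The sketch below assumes input records are dictionaries; the forbidden field names are hypothetical placeholders and would need to match whatever fields actually carry labels or report text in the dataset.

```python
# Hypothetical field names; replace with the fields that actually carry
# ground-truth labels or attack-report text in the input records.
FORBIDDEN_FIELDS = frozenset({"label", "is_malicious", "attack_report"})


def assert_label_free(records):
    """Raise ValueError if any input record exposes a forbidden field."""
    for index, record in enumerate(records):
        leaked = FORBIDDEN_FIELDS.intersection(record)
        if leaked:
            raise ValueError(
                f"record {index} exposes forbidden fields: {sorted(leaked)}"
            )
```

A guard like this only catches leakage through named fields. It does not detect labels encoded indirectly, for example in file names or split boundaries, so it complements rather than replaces a manual review.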