# Phase 4 Ground Truth Mapping and Labels

Ground truth is used only for label mapping and evaluation. It must not enter
LLM prompts.

## Label Levels

- Event-level: direct matched attack events.
- Process-level: processes involved in malicious event chains.
- Subgraph-level: local evidence subgraphs containing key attack-chain events.
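The three levels can be sketched as one record type plus a helper that expands a matched attack chain into labels at each level. This is a minimal illustration, not the project's actual schema: `LabelRecord`, `expand_chain`, the field names, and the 0.0 to 1.0 confidence scale are all assumptions, and the helper assumes ambiguous targets have already been filtered out.

```python
from dataclasses import dataclass
from enum import Enum


class LabelLevel(Enum):
    EVENT = "event"
    PROCESS = "process"
    SUBGRAPH = "subgraph"


@dataclass(frozen=True)
class LabelRecord:
    level: LabelLevel
    target_id: str     # hypothetical ID of the event, process, or subgraph
    label: str         # "malicious", "benign", "unknown", or "ignore"
    source: str        # where the label came from
    confidence: float  # assumed scale: 0.0 to 1.0


def expand_chain(chain_events, subgraph_id, source, confidence):
    """Expand one matched attack chain into labels at all three levels.

    chain_events: list of (event_id, process_id) pairs already matched
    to ground truth (hypothetical input shape).
    """
    # Event-level: one label per directly matched attack event.
    records = [
        LabelRecord(LabelLevel.EVENT, event_id, "malicious", source, confidence)
        for event_id, _ in chain_events
    ]
    # Process-level: each process in the chain, deduplicated in first-seen order.
    process_ids = dict.fromkeys(pid for _, pid in chain_events)
    records += [
        LabelRecord(LabelLevel.PROCESS, pid, "malicious", source, confidence)
        for pid in process_ids
    ]
    # Subgraph-level: the local evidence subgraph containing the chain.
    records.append(
        LabelRecord(LabelLevel.SUBGRAPH, subgraph_id, "malicious", source, confidence)
    )
    return records
```

Keeping the level on every record lets event, process, and subgraph labels be evaluated separately without mixing granularities.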

## Ambiguous Cases

Ambiguous targets should be assigned `unknown` or `ignore` rather than forced
into `malicious` or `benign`. Cases that count as ambiguous:

- attack window overlap without explicit evidence;
- normal child behavior from a compromised process;
- normal process later abused by an attacker;
- missing fields that prevent reliable mapping.
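The rules above can be sketched as a small decision function. The `Candidate` fields and the choice of which cases map to `ignore` versus `unknown` are assumptions made for illustration; the text only requires that ambiguous targets end up as one of the two.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All field names are hypothetical flags computed during mapping.
    in_attack_window: bool = False
    has_explicit_evidence: bool = False
    parent_compromised: bool = False
    later_abused: bool = False
    missing_fields: bool = False


def assign_label(c: Candidate) -> str:
    # Missing fields prevent reliable mapping, so drop the target entirely
    # (assumption: "ignore" is used for unmappable targets).
    if c.missing_fields:
        return "ignore"
    # A normal process later abused by an attacker has a mixed history.
    if c.later_abused:
        return "unknown"
    # Only explicit evidence justifies a malicious label.
    if c.has_explicit_evidence:
        return "malicious"
    # Window overlap or a compromised parent alone is not evidence.
    if c.in_attack_window or c.parent_compromised:
        return "unknown"
    return "benign"
```

The checks are ordered so that no ambiguous condition can fall through to `malicious` or `benign`.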

## Negative Sampling

Negative sampling must avoid:

- arbitrary benign labels inside attack windows;
- train/test leakage through the same attack entity;
- adjacent attack-chain events split across train and test;
- using attack-report text as prompt content.
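One way to satisfy the window and leakage constraints is to filter negatives by attack window and to split by connected groups rather than by individual sample. The sketch below assumes each sample carries hypothetical `id`, `entity`, and `chain` keys; it merges samples that share either key with a union-find structure, then assigns each merged group to a split by hashing, so the test fraction is only approximate.

```python
import hashlib


def outside_attack_windows(ts, windows):
    """True if timestamp ts falls in none of the (start, end) windows."""
    return all(not (start <= ts <= end) for start, end in windows)


def _find(parent, x):
    # Union-find root lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def group_split(samples, test_fraction=0.2):
    """Split sample IDs so no entity or chain spans train and test."""
    parent = {}
    for s in samples:
        entity_key = ("entity", s["entity"])
        chain_key = ("chain", s["chain"])
        parent.setdefault(entity_key, entity_key)
        parent.setdefault(chain_key, chain_key)
        # Link the sample's entity and chain into one group.
        root_a, root_b = _find(parent, entity_key), _find(parent, chain_key)
        if root_a != root_b:
            parent[root_a] = root_b

    train, test = [], []
    for s in samples:
        root = _find(parent, ("entity", s["entity"]))
        # Deterministic bucket in [0, 1) derived from the group root.
        digest = hashlib.sha256(repr(root).encode()).digest()
        bucket = digest[0] / 256
        (test if bucket < test_fraction else train).append(s["id"])
    return train, test
```

Because the assignment is made per group, adjacent events in one chain and all samples tied to one attack entity always land on the same side of the split.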

## Checks

- Label records are not prompt-allowed.
- Each label has a source and a confidence.
- Trainable labels require high confidence.
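These checks could be automated with a validator along the following lines. The record keys (`prompt_allowed`, `source`, `confidence`, `trainable`) and the numeric `HIGH_CONFIDENCE` threshold are assumptions; the document does not define what counts as high confidence, so the value here is only a placeholder.

```python
HIGH_CONFIDENCE = 0.9  # placeholder threshold, not defined by the spec


def check_label(record):
    """Return a list of violations for one label record (a dict)."""
    errors = []
    # Label records must never be eligible for prompt construction.
    if record.get("prompt_allowed", False):
        errors.append("label record is prompt-allowed")
    # Every label needs provenance and a confidence value.
    if not record.get("source"):
        errors.append("missing source")
    confidence = record.get("confidence")
    if confidence is None:
        errors.append("missing confidence")
    elif record.get("trainable") and confidence < HIGH_CONFIDENCE:
        errors.append("trainable label below high-confidence threshold")
    return errors
```

Returning a list of violations instead of raising on the first one makes it easier to report every problem in a label file in a single pass.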