Unified_CFM
A single multi-scale OT-CFM over one token sequence per flow:
[FLOW_TOKEN, PACKET_1, ..., PACKET_T]
This is not a Flow-CFM + Packet-CFM ensemble. Flow-level and packet-level signals interact inside one Transformer velocity field, and a Phase 2 masked-prediction consistency loss explicitly trains the cross-modal dependency.
This is currently the best-performing model in the repo: within-dataset SOTA on ISCXTor2016, CICIDS2017, and CICDDoS2019, and near-SOTA cross-dataset.
Model
UnifiedTokenCFM uses fixed tokenization to avoid latent-collapse shortcuts:
flow token: [type=-1, normalized 20-d canonical flow features, zero pad]
packet token: [type=+1, normalized 9-d packet features, zero pad]
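The token layout above can be sketched in plain Python. This is a minimal illustration, not the repo's tokenizer: the helper names (`make_token`, `build_sequence`) and the default `token_dim=21` (one type marker plus 20 flow features) are assumptions, and the actual padded width used by UnifiedTokenCFM may differ.

```python
from typing import List, Sequence

FLOW_TYPE = -1.0    # type marker for the flow token (from the layout above)
PACKET_TYPE = 1.0   # type marker for packet tokens (from the layout above)


def make_token(type_id: float, features: Sequence[float], token_dim: int) -> List[float]:
    """Return [type_id, *features, 0, ..., 0], zero-padded to token_dim entries."""
    token = [type_id, *features]
    if len(token) > token_dim:
        raise ValueError("token_dim is too small for the type marker plus features")
    return token + [0.0] * (token_dim - len(token))


def build_sequence(
    flow_features: Sequence[float],
    packets: Sequence[Sequence[float]],
    token_dim: int = 21,  # hypothetical width: 1 type marker + 20 flow features
) -> List[List[float]]:
    """Build [FLOW_TOKEN, PACKET_1, ..., PACKET_T] as rows of equal width."""
    rows = [make_token(FLOW_TYPE, flow_features, token_dim)]
    rows.extend(make_token(PACKET_TYPE, p, token_dim) for p in packets)
    return rows


# One flow with a 20-d feature vector and two 9-d packets.
seq = build_sequence([0.5] * 20, [[0.1] * 9, [0.2] * 9])
```

Because every row has the same width and the type marker sits in a fixed position, the model can tell flow and packet tokens apart without a learned tokenizer.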
Velocity field: 4-layer AdaLN-Zero Transformer (d_model=128, n_heads=4),
sinusoidal time embedding (time_dim=64). Total ≈ 1.23M parameters.
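For reference, a common form of sinusoidal time embedding looks like the sketch below. The frequency schedule and the `max_period` value are assumptions borrowed from the usual Transformer-style formulation; the repo's embedding may order or scale its components differently.

```python
import math
from typing import List


def sinusoidal_time_embedding(
    t: float,
    time_dim: int = 64,           # matches time_dim=64 above
    max_period: float = 10000.0,  # assumed; a conventional default
) -> List[float]:
    """Map a scalar time t to time_dim features: sines first, then cosines."""
    if time_dim % 2 != 0:
        raise ValueError("time_dim must be even")
    half = time_dim // 2
    # Geometrically spaced frequencies from 1 down to roughly 1 / max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


emb = sinusoidal_time_embedding(0.0)
```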
Loss with Phase 2 consistency:
L = L_main + λ_flow · L_mask_flow + λ_packet · L_mask_packet
L_main: standard OT-CFM velocity regression with σ-band noise +
Sinkhorn OT coupling.
L_mask_flow: zero out the flow token's input at x_t; predict v[flow]
from packet context only.
L_mask_packet: zero out a random 50% of real packet tokens at x_t;
predict their velocities from flow + remaining packets.
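The three terms can be combined as in the sketch below. It is an illustration under stated assumptions rather than the training code: the function names are hypothetical, the packet mask picks an exact fraction of real packets (the repo may instead mask each packet independently), and velocities are plain nested lists instead of tensors.

```python
import random
from typing import List, Sequence


def sample_packet_mask(is_real: Sequence[bool], ratio: float, rng: random.Random) -> List[bool]:
    """Pick round(ratio * #real) real packet positions to zero out; padding is never picked."""
    real_idx = [i for i, real in enumerate(is_real) if real]
    k = round(ratio * len(real_idx))
    chosen = set(rng.sample(real_idx, k))
    return [i in chosen for i in range(len(is_real))]


def masked_mse(pred, target, select: Sequence[bool]) -> float:
    """Mean squared velocity error over the token rows where select is True."""
    errs = [
        (p - t) ** 2
        for row_p, row_t, s in zip(pred, target, select) if s
        for p, t in zip(row_p, row_t)
    ]
    return sum(errs) / len(errs) if errs else 0.0


def total_loss(l_main: float, l_mask_flow: float, l_mask_packet: float,
               lam_flow: float = 0.3, lam_packet: float = 0.3) -> float:
    """L = L_main + lam_flow * L_mask_flow + lam_packet * L_mask_packet."""
    return l_main + lam_flow * l_mask_flow + lam_packet * l_mask_packet


rng = random.Random(0)
# Four real packets followed by two padding positions.
mask = sample_packet_mask([True, True, True, True, False, False], 0.5, rng)
```

In this sketch `masked_mse` would be evaluated on the flow row for L_mask_flow and on the masked packet rows for L_mask_packet, each time comparing the velocity predicted from the zeroed input against the regression target.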
Best hyperparameters from the σ × λ sweeps:
lambda_flow = lambda_packet = 0.3
packet_mask_ratio = 0.5
sigma = 0.6 # cross-dataset best; σ=0.1 is marginally better for some within-dataset runs
use_ot = True
Scores
The model exposes three classes of scores at inference:
# primary
terminal_norm
# decomposed (analysis only)
terminal_flow terminal_packet
arc_length kinetic_energy kinetic_flow kinetic_packet
velocity_total velocity_flow velocity_packet
# Phase 1 diagnostics
curvature_total curvature_flow curvature_packet # ∫ ||dv/dt||² dt
kappa2_speed2norm_packet_{mean,median,trimmed10_mean} # packet curvature / speed²
jacobian_total jacobian_flow jacobian_packet # Hutchinson VJP estimate of ||∂v/∂x||_F²
velocity_*_t{01..10} # 18 time-profile scores
# Phase 2 cross-modal consistency
flow_consistency packet_consistency consistency_total
terminal_norm is the paper's primary score. The decomposed and diagnostic
scores serve per-attack-family analysis — they are NOT competing
SOTA claims. Multi-seed std on terminal_norm is ≤ 0.005 across all our
runs.
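Two of the Phase 1 diagnostics listed above have compact definitions that can be sketched without the model. The code assumes velocities sampled on a uniform time grid and a caller-supplied `vjp` function returning εᵀJ; both helper names are hypothetical and the repo's discretization and probe distribution may differ.

```python
import random
from typing import Callable, List, Sequence


def curvature(velocities: Sequence[Sequence[float]], dt: float) -> float:
    """Approximate the integral of ||dv/dt||^2 dt with forward differences."""
    total = 0.0
    for v0, v1 in zip(velocities, velocities[1:]):
        # ||(v1 - v0) / dt||^2 * dt simplifies to ||v1 - v0||^2 / dt.
        total += sum((b - a) ** 2 for a, b in zip(v0, v1)) / dt
    return total


def hutchinson_frobenius_sq(
    vjp: Callable[[List[float]], List[float]],
    dim: int,
    n_probes: int,
    rng: random.Random,
) -> float:
    """Estimate ||J||_F^2 as the mean of ||eps^T J||^2 over Rademacher probes eps."""
    acc = 0.0
    for _ in range(n_probes):
        eps = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        acc += sum(g * g for g in vjp(eps))
    return acc / n_probes


# v(t) = [t, 0] on t = 0, 0.5, 1 has dv/dt = [1, 0], so the integral over [0, 1] is 1.
curv = curvature([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]], dt=0.5)

# Toy Jacobian J = diag(1, 2, 3): eps^T J = [eps_0 * 1, eps_1 * 2, eps_2 * 3].
diag = [1.0, 2.0, 3.0]
est = hutchinson_frobenius_sq(lambda eps: [e * d for e, d in zip(eps, diag)],
                              dim=3, n_probes=4, rng=random.Random(0))
```

For a diagonal Jacobian every Rademacher probe gives the exact value (here 1 + 4 + 9 = 14); for a general Jacobian the estimate is unbiased but noisy, so more probes reduce variance.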
The Phase 2 consistency scores are discriminative only when the model is
trained with the consistency loss. On a baseline model, flow_consistency
is close to chance (0.57 on CICIDS2017); after Phase 2 training it rises
to 0.88. On SSH-Patator, where standard density scores struggle
(terminal_norm 0.64), Phase 2 flow_consistency reaches 0.94.
Train
# baseline (no consistency loss)
uv run python Unified_CFM/train.py --config Unified_CFM/configs/cicids2017_baseline.yaml
# Phase 2 with consistency loss (λ=0.1, σ=0.1)
uv run python Unified_CFM/train.py --config Unified_CFM/configs/cicids2017_consistency.yaml
# σ × λ sweeps and multi-seed orchestrators live in
# artifacts/verify_2026_04_24/run_*.sh
The intended setup is to use the workspace-canonical 20-d packet-derived flow feature file:
flow_features_path: datasets/cicids2017/processed/flow_features.parquet
flow_features_align: auto
flow_features.parquet is row-aligned with the Packet_CFM artifacts via
flow_id. With flow_features_align: auto, the loader uses direct
row/flow_id alignment when possible; scan alignment remains only for
legacy full CSV-derived caches.
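The idea behind direct flow_id alignment can be illustrated as follows. This is only a sketch of the general technique: `align_flow_features` is a hypothetical helper, and the loader's actual `auto` logic (including the legacy scan path) is not reproduced here.

```python
from typing import Hashable, List, Sequence


def align_flow_features(
    packet_flow_ids: Sequence[Hashable],
    feature_flow_ids: Sequence[Hashable],
    feature_rows: Sequence[Sequence[float]],
) -> List[Sequence[float]]:
    """Return feature rows reordered so that row i belongs to packet_flow_ids[i]."""
    if list(packet_flow_ids) == list(feature_flow_ids):
        return list(feature_rows)  # already row-aligned: use the rows as they are
    # Otherwise look each packet-side flow_id up in the feature file.
    index = {fid: i for i, fid in enumerate(feature_flow_ids)}
    missing = [fid for fid in packet_flow_ids if fid not in index]
    if missing:
        raise KeyError(f"{len(missing)} flow_id(s) have no feature row, e.g. {missing[0]!r}")
    return [feature_rows[index[fid]] for fid in packet_flow_ids]


aligned = align_flow_features(["b", "a"], ["a", "b", "c"], [[1.0], [2.0], [3.0]])
```

Failing loudly on a missing flow_id is a deliberate choice in this sketch: silently dropping or misordering rows would pair packet sequences with the wrong flow features.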
For large datasets where a monolithic packets.npz would exceed memory,
the loader supports the sharded backend:
source_store: datasets/cicddos2019/processed/full_store
val_cap: 20000
attack_cap: 20000
If flow_features_path is empty, the loader derives compact 16-d flow-level
statistics from the packet sequence. That fallback is for debugging only;
new runs should use the canonical 20-d file generated by
scripts/generate_flow_features.py.
Evaluation
artifacts/verify_2026_04_24/eval_phase1_unified.py runs the Phase 1 + Phase 2
score battery on a trained checkpoint and reports per-attack-class AUROC.
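Per-attack-class AUROC can be computed as in the sketch below, which is not taken from the evaluation script. It assumes that higher scores mean "more anomalous", that each attack class is compared against the full benign set, and that benign rows carry the label "BENIGN"; the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def auroc(benign_scores: Sequence[float], attack_scores: Sequence[float]) -> float:
    """Probability that a random attack scores above a random benign flow (ties count half)."""
    neg = sorted(benign_scores)
    wins = 0.0
    for s in attack_scores:
        lo, hi = bisect_left(neg, s), bisect_right(neg, s)
        wins += lo + 0.5 * (hi - lo)  # benign strictly below, plus half of the ties
    return wins / (len(neg) * len(attack_scores))


def per_class_auroc(
    scores: Sequence[float],
    labels: Sequence[Hashable],
    benign_label: Hashable = "BENIGN",  # assumed label name
) -> Dict[Hashable, float]:
    """AUROC of each attack class against all benign flows."""
    by_label: Dict[Hashable, List[float]] = defaultdict(list)
    for s, y in zip(scores, labels):
        by_label[y].append(s)
    benign = by_label.pop(benign_label)
    return {name: auroc(benign, vals) for name, vals in by_label.items()}


result = per_class_auroc(
    [0.1, 0.2, 0.3, 0.9, 0.8, 0.15],
    ["BENIGN", "BENIGN", "BENIGN", "DoS", "DoS", "PortScan"],
)
```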
artifacts/verify_2026_04_24/eval_phase2_cross_cicddos2019.py runs
cross-dataset CICIDS2017→CICDDoS2019 evaluation under the standard
10k benign + 10k stratified attack protocol.
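A stratified attack sample can be drawn as sketched below. Proportional allocation per attack class is an assumption here, since the protocol's exact allocation rule is not stated above; `stratified_sample` is a hypothetical helper.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def stratified_sample(labels: Sequence[Hashable], n_total: int, rng: random.Random) -> List[int]:
    """Pick about n_total indices, giving each label a share proportional to its frequency."""
    by_label: Dict[Hashable, List[int]] = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    picked: List[int] = []
    for y in sorted(by_label):  # sorted for reproducibility; labels must be orderable
        idx = by_label[y]
        # At least one sample per class, never more than the class holds.
        k = min(len(idx), max(1, round(n_total * len(idx) / len(labels))))
        picked.extend(rng.sample(idx, k))
    return sorted(picked)


attack_labels = ["DoS"] * 80 + ["PortScan"] * 20
chosen = stratified_sample(attack_labels, n_total=10, rng=random.Random(0))
```

Rounding per class means the total can differ slightly from `n_total` when class shares are not whole numbers; a production version would redistribute the remainder.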