# Unified_CFM

A single multi-scale OT-CFM over one token sequence per flow:

```text
[FLOW_TOKEN, PACKET_1, ..., PACKET_T]
```

This is **not** a Flow-CFM + Packet-CFM ensemble. Flow-level and packet-level signals interact inside one Transformer velocity field, and a Phase 2 masked-prediction consistency loss explicitly trains the cross-modal dependency.

This is the **current SOTA model** in the repo (within-dataset SOTA on ISCXTor2016 / CICIDS2017 / CICDDoS2019; near-SOTA cross-dataset).

## Model

`UnifiedTokenCFM` uses fixed tokenization to avoid latent-collapse shortcuts:

```text
flow token:   [type=-1, normalized 20-d canonical flow features, zero pad]
packet token: [type=+1, normalized 9-d packet features, zero pad]
```

Velocity field: 4-layer AdaLN-Zero Transformer (`d_model=128, n_heads=4`), sinusoidal time embedding (`time_dim=64`). Total ≈ 1.23M parameters.

Loss with Phase 2 consistency:

```
L = L_main + λ_flow · L_mask_flow + λ_packet · L_mask_packet

L_main:        standard OT-CFM velocity regression with σ-band noise + Sinkhorn OT coupling.
L_mask_flow:   zero out the flow token's input at x_t; predict v[flow] from packet context only.
L_mask_packet: zero out a random 50% of real packet tokens at x_t; predict their
               velocities from flow + remaining packets.
```

Best hyperparameters from the σ × λ sweeps:

```
lambda_flow = lambda_packet = 0.3
packet_mask_ratio = 0.5
sigma = 0.6   # cross-dataset best; σ=0.1 marginally better for some within-dataset runs
use_ot = True
```

## Scores

The model exposes four groups of scores at inference:

```text
# primary
terminal_norm

# decomposed (analysis only)
terminal_flow  terminal_packet
arc_length  kinetic_energy  kinetic_flow  kinetic_packet
velocity_total  velocity_flow  velocity_packet

# Phase 1 diagnostics
curvature_total  curvature_flow  curvature_packet        # ∫ ||dv/dt||² dt
kappa2_speed2norm_packet_{mean,median,trimmed10_mean}    # packet curvature / speed²
jacobian_total  jacobian_flow  jacobian_packet           # Hutchinson VJP estimate of ||∂v/∂x||_F²
velocity_*_t{01..10}                                     # 18 time-profile scores

# Phase 2 cross-modal consistency
flow_consistency  packet_consistency  consistency_total
```

`terminal_norm` is the paper's primary score. The decomposed and diagnostic scores serve **per-attack-family analysis**; they are NOT competing SOTA claims. Multi-seed std on `terminal_norm` is ≤ 0.005 across all our runs.

The Phase 2 consistency scores have a notable property: they are **discriminative only when the model is trained with the consistency loss**. On a baseline model, `flow_consistency` is close to chance (0.57 on CICIDS2017); after Phase 2 training it rises to 0.88. On SSH-Patator, where standard density scores struggle (`terminal_norm` 0.64), Phase 2 `flow_consistency` reaches 0.94.
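As an illustration of the Phase 2 masking described under Model, the sketch below builds the two masked inputs and combines the three loss terms with the swept weights. It is a minimal, dependency-free sketch and not the repo's implementation: `build_masked_inputs` and `total_loss` are hypothetical names, tokens are plain Python lists, and zeroing the whole token vector (including the type marker) is an assumption, since the text only says the token's input is zeroed.

```python
import random


def build_masked_inputs(tokens, real_packet, packet_mask_ratio=0.5, rng=None):
    """Build the two masked views used by the consistency terms (hypothetical helper).

    tokens:      list of T+1 token vectors; index 0 is the flow token.
    real_packet: list of T bools, True where the packet is real (not padding).
    Returns (flow_masked, packet_masked, masked_idx).
    """
    rng = rng or random.Random(0)

    def zero(tok):
        # Assumption: the entire token vector is zeroed, type marker included.
        return [0.0] * len(tok)

    # View 1: hide the flow token, keep every packet token.
    flow_masked = [zero(tokens[0])] + [list(t) for t in tokens[1:]]

    # View 2: hide a random subset of the real packet tokens only.
    real_idx = [i + 1 for i, is_real in enumerate(real_packet) if is_real]
    k = int(packet_mask_ratio * len(real_idx))
    masked_idx = sorted(rng.sample(real_idx, k))
    packet_masked = [
        zero(t) if i in masked_idx else list(t) for i, t in enumerate(tokens)
    ]
    return flow_masked, packet_masked, masked_idx


def total_loss(l_main, l_mask_flow, l_mask_packet,
               lambda_flow=0.3, lambda_packet=0.3):
    """L = L_main + λ_flow · L_mask_flow + λ_packet · L_mask_packet."""
    return l_main + lambda_flow * l_mask_flow + lambda_packet * l_mask_packet
```

In a real training step, each masked view would be passed through the velocity field and the regression error would be taken only on the hidden positions (the flow token for the first view, `masked_idx` for the second) before being weighted by `total_loss`.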

## Train

```bash
# baseline (no consistency loss)
uv run python Unified_CFM/train.py --config Unified_CFM/configs/cicids2017_baseline.yaml

# Phase 2 with consistency loss (λ=0.1, σ=0.1)
uv run python Unified_CFM/train.py --config Unified_CFM/configs/cicids2017_consistency.yaml

# σ × λ sweeps and multi-seed orchestrators live in
# artifacts/verify_2026_04_24/run_*.sh
```

The intended setup is to use the workspace-canonical 20-d packet-derived flow feature file:

```yaml
flow_features_path: datasets/cicids2017/processed/flow_features.parquet
flow_features_align: auto
```

`flow_features.parquet` is row-aligned with the Packet_CFM artifacts via `flow_id`. With `flow_features_align: auto`, the loader uses direct row/`flow_id` alignment when possible; scan alignment remains only for legacy full CSV-derived caches.

For large datasets where a monolithic `packets.npz` would exceed memory, the loader supports the sharded backend:

```yaml
source_store: datasets/cicddos2019/processed/full_store
val_cap: 20000
attack_cap: 20000
```

If `flow_features_path` is empty, the loader derives compact 16-d flow-level statistics from the packet sequence. That fallback is for debugging only; new runs should use the canonical 20-d file generated by `scripts/generate_flow_features.py`.

## Evaluation

`artifacts/verify_2026_04_24/eval_phase1_unified.py` runs the Phase 1 + Phase 2 score battery on a trained checkpoint and reports per-attack-class AUROC.

`artifacts/verify_2026_04_24/eval_phase2_cross_cicddos2019.py` runs the cross-dataset CICIDS2017→CICDDoS2019 evaluation under the standard 10k benign + 10k stratified attack protocol.
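
For readers who want to see what per-attack-class AUROC means here, the following is a minimal sketch, not the logic of the evaluation scripts above. It scores each attack class against the shared benign pool using the rank (Mann-Whitney) form of AUROC. The function names and the `"BENIGN"` label string are assumptions made for illustration, and higher scores are assumed to mean "more anomalous".

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def auroc(benign_scores, attack_scores):
    """P(attack score > benign score), counting ties as 0.5."""
    if not benign_scores or not attack_scores:
        raise ValueError("both classes need at least one score")
    benign = sorted(benign_scores)
    total = 0.0
    for a in attack_scores:
        below = bisect_left(benign, a)           # benign scores strictly below a
        ties = bisect_right(benign, a) - below   # benign scores equal to a
        total += below + 0.5 * ties
    return total / (len(benign) * len(attack_scores))


def per_class_auroc(scores, labels, benign_label="BENIGN"):
    """AUROC of every attack class versus the shared benign pool."""
    by_class = defaultdict(list)
    for score, label in zip(scores, labels):
        by_class[label].append(score)
    # Raises KeyError if no benign samples are present.
    benign = by_class.pop(benign_label)
    return {name: auroc(benign, vals) for name, vals in by_class.items()}


if __name__ == "__main__":
    demo_scores = [0.1, 0.2, 0.3, 0.9, 0.8, 0.2, 0.05]
    demo_labels = ["BENIGN"] * 3 + ["DDoS", "DDoS", "SSH-Patator", "SSH-Patator"]
    print(per_class_auroc(demo_scores, demo_labels))
```

Sorting the benign pool once and using binary search keeps the cost at roughly O((n + m) log n), which matters at the 10k benign + 10k attack scale mentioned above.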