Route Comparison Protocol

Goal: compare three route variants, each pairing one flow-matching (FM) mechanism with one traffic property, on a unified training base. Every route starts from the current Unified_CFM SOTA recipe and changes exactly one mechanism axis.

Unified base (LOCKED)

| Item | Value |
|---|---|
| Dataset | CICIoT2023 |
| Source store | `datasets/ciciot2023/processed/full_store/` |
| Flows | `datasets/ciciot2023/processed/full_store/flows.parquet` |
| Flow features | `datasets/ciciot2023/processed/flow_features.parquet` (canonical 20-d) |
| Train: benign | 10,000 (Shafir within-dataset protocol) |
| Sequence length | T = 64 |
| Packet preprocess | `mixed_dequant` (Routes A/B); raw binaries (Route C) |
| Benign split | 80/20, `split_seed=42` |
| Val cap | 10,000 |
| Attack cap | 20,000 (stratified) |
| Multi-seed | {42, 43, 44} |
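
The split and cap settings above can be sketched in a few lines. This is a minimal illustration, not the repository's loader: `split_benign` and `stratified_cap` are hypothetical helper names, and proportional allocation is only one assumed reading of "stratified".

```python
import random


def split_benign(n_benign, train_frac=0.8, split_seed=42, val_cap=10_000):
    """Shuffle benign indices with a fixed seed, then split train/val."""
    idx = list(range(n_benign))
    random.Random(split_seed).shuffle(idx)
    n_train = int(n_benign * train_frac)
    # The validation side is truncated to the cap if it is larger.
    return idx[:n_train], idx[n_train:][:val_cap]


def stratified_cap(labels, cap=20_000, seed=42):
    """Select at most `cap` indices, keeping class proportions (assumed scheme)."""
    total = len(labels)
    if total <= cap:
        return list(range(total))
    by_class = {}
    for i, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(i)
    rng = random.Random(seed)
    picked = []
    for lab in sorted(by_class):
        members = by_class[lab]
        # Floor of the proportional share; every class keeps at least one sample.
        k = max(1, cap * len(members) // total)
        picked.extend(rng.sample(members, k))
    # Flooring can leave the result slightly under the cap; the slice guards
    # against the at-least-one rule pushing it over.
    return sorted(picked[:cap])
```

Fixing `split_seed` separately from the training seeds keeps the data partition identical across the {42, 43, 44} runs, so seed-to-seed variance reflects training only.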

Architecture base (LOCKED)

| Item | Value |
|---|---|
| `d_model` | 128 |
| `n_layers` | 4 |
| `n_heads` | 4 |
| `mlp_ratio` | 4.0 |
| `time_dim` | 64 |
| `sigma` | 0.1 |
| `use_ot` | True |
| `lambda_flow / lambda_packet` | 0.3 / 0.3 |
| `packet_mask_ratio` | 0.5 |
| Optimizer | AdamW, lr=3e-4, wd=0.01, grad_clip=1.0 |
| Schedule | CosineAnnealingLR over total steps |
| Epochs | 50 |
| Batch size | 256 |
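
One way to keep these values locked is a frozen config object. The sketch below is an assumption about how that could look, not the project's actual config class: `TrainConfig` and `cosine_lr` are hypothetical names, and `steps_per_epoch` is a placeholder value. The learning-rate function is the closed form of cosine annealing with a minimum rate of zero, stepped once per optimizer step.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: attempts to mutate a locked value raise an error
class TrainConfig:
    d_model: int = 128
    n_layers: int = 4
    n_heads: int = 4
    mlp_ratio: float = 4.0
    time_dim: int = 64
    sigma: float = 0.1
    use_ot: bool = True
    lambda_flow: float = 0.3
    lambda_packet: float = 0.3
    packet_mask_ratio: float = 0.5
    lr: float = 3e-4
    weight_decay: float = 0.01
    grad_clip: float = 1.0
    epochs: int = 50
    batch_size: int = 256


def cosine_lr(step, total_steps, base_lr, eta_min=0.0):
    """Closed-form cosine annealing from base_lr (step 0) to eta_min (last step)."""
    cos_term = 1.0 + math.cos(math.pi * step / total_steps)
    return eta_min + 0.5 * (base_lr - eta_min) * cos_term


cfg = TrainConfig()
steps_per_epoch = 32  # placeholder; depends on the actual training-set size
total_steps = cfg.epochs * steps_per_epoch
halfway_lr = cosine_lr(total_steps // 2, total_steps, cfg.lr)  # half of cfg.lr
```

Annealing over total steps rather than epochs means the schedule length must be recomputed if the batch size or dataset size ever changes, which is one more reason to keep both locked.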

Routes

| Route | Mechanism axis | Traffic property targeted |
|---|---|---|
| Baseline | Standard UnifiedCFM (current SOTA) | n/a |
| A: Causal | Packet-causal attention mask | Protocol causality (TCP/HTTP handshake) |
| B: Spectral | Append K=8-band DFT of (size, IAT), 32 dims in total, to flow features (`flow_dim` 20→52); model architecture unchanged | Burstiness / LRD / self-similarity |
| C: Mixed FM | Continuous-CFM on (size,IAT,win) + DFM on flags | Discrete-continuous mixed channels |
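
For Route A, a packet-causal mask restricts each packet position to attend only to itself and earlier packets. The following is a minimal pure-Python sketch of that idea, not the model's attention code; `causal_mask` and `attention_weights` are hypothetical names, and here `True` means "may attend".

```python
import math


def causal_mask(seq_len):
    """mask[i][j] is True when packet i may attend to packet j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def attention_weights(scores, mask):
    """Row-wise softmax over scores, with masked positions forced to zero weight."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        masked = [s if m else float("-inf") for s, m in zip(score_row, mask_row)]
        peak = max(masked)  # finite, because the diagonal is never masked
        exps = [math.exp(s - peak) for s in masked]  # exp(-inf) evaluates to 0.0
        norm = sum(exps)
        weights.append([e / norm for e in exps])
    return weights


mask = causal_mask(4)
uniform_scores = [[0.0] * 4 for _ in range(4)]
weights = attention_weights(uniform_scores, mask)
# Row 1 spreads its weight over packets 0 and 1 only: [0.5, 0.5, 0.0, 0.0]
```

Mask polarity differs between attention implementations (some treat `True` as "blocked"), so the convention should be checked against whichever attention layer the model uses.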

Route D (Edit Flows) is deferred until A/B/C show signal.
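
For Route B, the table gives K=8 and 32 added dims but not how the 32 are laid out. The sketch below assumes one plausible layout: the first 8 non-DC DFT coefficients of each of the two channels (size, IAT), stored as real and imaginary parts, which gives 2 × 8 × 2 = 32. That layout, and the names `dft_bins`, `spectral_features`, and `augment_flow_features`, are assumptions for illustration only; pooled frequency bands or magnitude/phase pairs would be equally consistent with the table.

```python
import cmath
import math


def dft_bins(x, k):
    """DFT coefficients at frequencies 1..k (DC skipped) of a real sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        for f in range(1, k + 1)
    ]


def spectral_features(sizes, iats, k=8):
    """Assumed layout: per channel, k coefficients as (real, imag) pairs."""
    feats = []
    for channel in (sizes, iats):
        for coeff in dft_bins(channel, k):
            feats.extend([coeff.real, coeff.imag])
    return feats


def augment_flow_features(flow_feats, sizes, iats, k=8):
    """Append the spectral block to the canonical flow-feature vector."""
    return list(flow_feats) + spectral_features(sizes, iats, k)
```

With a 20-d flow vector and T = 64 packet sequences, `augment_flow_features` returns 52 values, matching the `flow_dim` 20→52 change while leaving the model architecture untouched.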

Reporting

Each route × seed produces:

artifacts/route_comparison/<route>_seed<S>/
├── model.pt
├── config.yaml          # actual config used
├── history.json
├── phase1_summary.json  # 34-score per-attack-class AUROC table
└── train.log
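
Before aggregating, it can help to check that every route × seed directory is complete. A small sketch under assumptions: the route directory names in `ROUTES` are guesses at how `<route>` is spelled on disk, and `run_dir` and `missing_artifacts` are hypothetical helpers.

```python
from itertools import product
from pathlib import Path

ROUTES = ["baseline", "A", "B", "C"]  # assumed spellings of <route>
SEEDS = [42, 43, 44]
EXPECTED = [
    "model.pt",
    "config.yaml",
    "history.json",
    "phase1_summary.json",
    "train.log",
]


def run_dir(root, route, seed):
    """Build <root>/<route>_seed<S> following the layout above."""
    return Path(root) / f"{route}_seed{seed}"


def missing_artifacts(root="artifacts/route_comparison"):
    """Map each incomplete run directory name to the files it lacks."""
    report = {}
    for route, seed in product(ROUTES, SEEDS):
        directory = run_dir(root, route, seed)
        missing = [name for name in EXPECTED if not (directory / name).is_file()]
        if missing:
            report[directory.name] = missing
    return report
```

An empty dictionary from `missing_artifacts()` means all 12 runs (4 routes × 3 seeds) wrote the full set of files.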

Final aggregate at artifacts/route_comparison/RESULTS.md:

| Route | terminal_norm | route-specific score | param count | train wall |
|---|---|---|---|---|
| baseline | 0.962 (existing) | n/a | 1.23M | ~2 min |
| A | ? | causal_surprisal_packet_median | ? | ? |
| B | ? | velocity_freq | ? | ? |
| C | ? | nll_disc + terminal_cont | ? | ? |
Plus per-attack-class breakdown for the top 10 attack labels by support.
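
The aggregation itself is simple once per-seed scores are loaded. This sketch assumes mean and sample standard deviation across the three seeds as the summary (the protocol does not fix the statistic) and uses hypothetical helper names; how scores and per-label support are read out of `phase1_summary.json` is left out because its schema is not specified here.

```python
from statistics import mean, stdev


def summarize(scores):
    """Mean and sample standard deviation of one metric across seeds."""
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread


def top_labels_by_support(support, k=10):
    """Attack labels with the largest sample counts; ties broken by name."""
    ranked = sorted(support.items(), key=lambda item: (-item[1], item[0]))
    return [label for label, _ in ranked[:k]]


def results_row(route, scores, route_metric, param_count, train_wall):
    """Format one RESULTS.md table row from per-seed terminal_norm scores."""
    avg, spread = summarize(scores)
    return (
        f"| {route} | {avg:.3f} ± {spread:.3f} | {route_metric} "
        f"| {param_count} | {train_wall} |"
    )
```

Reporting the spread next to the mean makes it visible when a route's gain over the baseline is smaller than its seed-to-seed variation.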

Baseline reference (single-seed, from existing run)

artifacts/runs/unified_cfm_ciciot2023_2026_04_29/:

50 epochs, σ=0.1, λ=0.3
final auroc_terminal_norm = 0.962
This single-seed number is the comparison target; the baseline will be re-run over seeds {42, 43, 44} so that it is compared with the routes under the same multi-seed protocol.
