Route Comparison Protocol
Goal: compare three FM-mechanism × traffic-property route variants on a unified
training base. All routes start from the current Unified_CFM SOTA recipe and
change one mechanism axis.
Unified base (LOCKED)
| Item | Value |
|---|---|
| Dataset | CICIoT2023 |
| Source store | datasets/ciciot2023/processed/full_store/ |
| Flows | datasets/ciciot2023/processed/full_store/flows.parquet |
| Flow features | datasets/ciciot2023/processed/flow_features.parquet (canonical 20-d) |
| Train: benign | 10,000 (Shafir within-dataset protocol) |
| Sequence length | T = 64 |
| Packet preprocess | mixed_dequant (Routes A/B); raw binaries (Route C) |
| Benign split | 80/20, split_seed=42 |
| Val cap | 10,000 |
| Attack cap | 20,000 (stratified) |
| Multi-seed | {42, 43, 44} |
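A minimal sketch of how the split and cap rows above could be realized, assuming flows are addressed by integer index and attack labels are plain strings. The function names `split_benign` and `stratified_cap` are hypothetical and not taken from the codebase, and the table does not say how the 80/20 split interacts with the 10,000 train count, so the sketch leaves the pool size as a parameter.

```python
import random
from collections import defaultdict

def split_benign(n_benign, train_frac=0.8, split_seed=42):
    """Shuffle benign indices with a fixed seed, then cut train/val at train_frac."""
    idx = list(range(n_benign))
    random.Random(split_seed).shuffle(idx)
    cut = int(n_benign * train_frac)
    return idx[:cut], idx[cut:]

def stratified_cap(labels, cap, seed=42):
    """Subsample to roughly `cap` items while keeping class proportions.

    ASSUMPTION: proportional allocation with rounding; every class keeps at
    least one sample, so the total can deviate from `cap` by a few items.
    """
    if len(labels) <= cap:
        return list(range(len(labels)))
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    rng = random.Random(seed)
    frac = cap / len(labels)
    keep = []
    for lab in sorted(by_class):  # sorted for a deterministic draw order
        members = by_class[lab]
        k = min(len(members), max(1, round(len(members) * frac)))
        keep.extend(rng.sample(members, k))
    return sorted(keep)
```

Fixing `split_seed` separately from the training seeds keeps the data split identical across the three training seeds, so seed-to-seed variance reflects training only.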
Architecture base (LOCKED)
| Item | Value |
|---|---|
| d_model | 128 |
| n_layers | 4 |
| n_heads | 4 |
| mlp_ratio | 4.0 |
| time_dim | 64 |
| sigma | 0.1 |
| use_ot | True |
| lambda_flow / lambda_packet | 0.3 / 0.3 |
| packet_mask_ratio | 0.5 |
| Optimizer | AdamW, lr=3e-4, wd=0.01, grad_clip=1.0 |
| Schedule | CosineAnnealingLR over total steps |
| Epochs | 50 |
| Batch size | 256 |
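To make the schedule row concrete: cosine annealing follows the closed form lr_t = eta_min + 0.5 (lr_0 - eta_min)(1 + cos(pi t / T_max)). The sketch below assumes eta_min = 0, that "total steps" means epochs times optimizer steps per epoch, and that the last partial batch of each epoch is kept. The helper names are hypothetical.

```python
import math

def total_steps(n_train, batch_size=256, epochs=50):
    """Optimizer steps over the whole run (ASSUMPTION: partial last batch is kept)."""
    return math.ceil(n_train / batch_size) * epochs

def cosine_lr(step, t_max, base_lr=3e-4, eta_min=0.0):
    """Closed-form cosine annealing from base_lr at step 0 to eta_min at step t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * step / t_max))
```

If all 10,000 benign training flows are used per epoch, this gives 40 steps per epoch and 2,000 steps in total, with the learning rate passing through 1.5e-4 at step 1,000.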
Routes
| Route | Mechanism axis | Traffic property targeted |
|---|---|---|
| Baseline | Standard UnifiedCFM (current SOTA) | — |
| A: Causal | Packet-causal attention mask | Protocol causality (TCP/HTTP handshake) |
| B: Spectral | Append K=8-band DFT of (size, IAT), 32 dims, to flow features (flow_dim 20→52); model architecture unchanged | Burstiness / LRD / self-similarity |
| C: Mixed FM | Continuous-CFM on (size,IAT,win) + DFM on flags | Discrete-continuous mixed channels |
Route D (Edit Flows) is deferred until A/B/C show signal.
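The Route B row gives the dimensions (K=8 bands, 32 new dims, flow_dim 20→52) but not the per-band statistic. One decomposition consistent with those numbers is 2 channels × 8 bands × 2 statistics; the sketch below uses mean and max log-power per band purely as an assumption, with a plain-Python DFT so it has no dependencies. All function names are hypothetical.

```python
import cmath
import math

def dft_power(x):
    """Power of the one-sided DFT (bins 0..n/2) of a real sequence."""
    n = len(x)
    power = []
    for k in range(n // 2 + 1):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(s) ** 2 / n)
    return power

def band_features(x, k_bands=8):
    """Two statistics per band: [mean log-power, max log-power] * k_bands.

    ASSUMPTION: DC bin dropped, equal-width bands, leftover bins ignored.
    For T = 64 this gives 32 bins in 8 bands of 4 bins each.
    """
    power = dft_power(x)[1:]
    width = len(power) // k_bands
    feats = []
    for b in range(k_bands):
        logp = [math.log1p(p) for p in power[b * width:(b + 1) * width]]
        feats.append(sum(logp) / len(logp))
        feats.append(max(logp))
    return feats

def augment_flow_features(flow_feat, sizes, iats, k_bands=8):
    """Append spectral features of both packet channels to the flow vector."""
    return list(flow_feat) + band_features(sizes, k_bands) + band_features(iats, k_bands)
```

With a 20-d flow vector and T = 64 sequences, `augment_flow_features` returns 52 values, matching the flow_dim change in the table; only the input width of the flow branch changes.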
Reporting
Each route × seed produces:
artifacts/route_comparison/<route>_seed<S>/
├── model.pt
├── config.yaml # actual config used
├── history.json
├── phase1_summary.json # 34-score per-attack-class AUROC table
└── train.log
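A small sketch for enumerating the expected run directories and checking them for completeness. The route spellings used in directory names (`baseline`, `A`, `B`, `C`) are assumptions, as are the helper names; the file list is copied from the tree above.

```python
from pathlib import Path

ROOT = Path("artifacts/route_comparison")
ROUTES = ["baseline", "A", "B", "C"]  # ASSUMPTION: directory-name spellings
SEEDS = [42, 43, 44]
EXPECTED = ["model.pt", "config.yaml", "history.json",
            "phase1_summary.json", "train.log"]

def run_dir(route, seed, root=ROOT):
    """Directory for one route x seed run, e.g. .../A_seed42."""
    return root / f"{route}_seed{seed}"

def missing_artifacts(route, seed, root=ROOT):
    """Names of expected files that are absent from a run directory."""
    d = run_dir(route, seed, root)
    return [name for name in EXPECTED if not (d / name).exists()]

def all_runs(root=ROOT):
    """All route x seed directories: 4 routes x 3 seeds = 12."""
    return [run_dir(r, s, root) for r in ROUTES for s in SEEDS]
```

Running `missing_artifacts` over `all_runs()` before aggregation helps catch a crashed or partial run before it silently drops out of the results table.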
Final aggregate at artifacts/route_comparison/RESULTS.md:
| Route | terminal_norm | route-specific score | param count | train wall time |
|---|---|---|---|---|
| baseline | 0.962 (existing) | — | 1.23M | ~2 min |
| A | ? | causal_surprisal_packet_median | ? | ? |
| B | ? | velocity_freq | ? | ? |
| C | ? | nll_disc + terminal_cont | ? | ? |
Plus per-attack-class breakdown for the top 10 attack labels by support.
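The aggregation could be sketched as follows, assuming each run's score has already been read into a plain dict keyed by seed and that per-class support counts are available as a dict. The schema of `phase1_summary.json` is not specified here, so loading is left out, and the function names are hypothetical.

```python
from statistics import mean, stdev

def summarize_seeds(scores_by_seed):
    """Mean and sample standard deviation of one metric across seeds."""
    vals = [scores_by_seed[s] for s in sorted(scores_by_seed)]
    return mean(vals), (stdev(vals) if len(vals) > 1 else 0.0)

def top_by_support(support, k=10):
    """Attack labels with the largest sample counts; ties broken alphabetically."""
    return sorted(support, key=lambda lab: (-support[lab], lab))[:k]

def format_cell(scores_by_seed):
    """Table cell such as '0.500 ± 0.000 (n=2)' for a multi-seed metric."""
    m, s = summarize_seeds(scores_by_seed)
    return f"{m:.3f} ± {s:.3f} (n={len(scores_by_seed)})"
```

Reporting mean ± std with the seed count makes it visible when a cell comes from fewer than three seeds, as is currently the case for the single-seed baseline.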
Baseline reference (single-seed, from existing run)
artifacts/runs/unified_cfm_ciciot2023_2026_04_29/:
- 50 epochs, σ=0.1, λ=0.3
- final auroc_terminal_norm = 0.962
- This is the number to compare against; we'll re-run it under multi-seed for a fair comparison.