baselines: add 3x3 cross-dataset runners for IF/OCSVM (path A + B) and Shafir NF

New scripts under scripts/baselines/: - run_if_ocsvm_cross.py - 20-d canonical flow features (path A) - run_if_ocsvm_cross_packets.py - raw 576-d packet sequence (path B) - run_shafir_nf_cross.py - single-NF on 5-d SHAFIR5 subset or 20-d - *_all.sh - 3 sources x 3 targets x 3 seeds sweepers New aggregator scripts/aggregate/baselines_cross_3x3_table.py builds a Markdown 3x3 matrix per method from per-cell NPZ outputs. RESULTS.md gains a "Shallow-baseline 3x3 cross matrices" subsection pointing at the new artifact directories. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mixed_CFM: absorb Unified_CFM primitives; remove Unified_CFM
2026-05-12 17:41:20 +08:00 · 2026-05-11 14:18:11 +08:00 · 2026-05-11 09:09:04 +08:00 · 2026-05-11 08:58:36 +08:00 · 2026-05-11 08:53:19 +08:00 · 2026-05-11 00:03:34 +08:00
105 changed files with 5913 additions and 2278 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -26,4 +26,13 @@ Thumbs.db

 /paper/

+# rendered figure outputs (PDFs/PNGs at repo root from figure-generation runs)
+/unified_figures_*/
+/janus_figures_*/
+
 *.tmp
+
+CLAUDE.md
+.gitignore
+
+drafts/
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,172 +0,0 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Repo shape
-
-This is a **workspace-style repo with three sibling model packages** plus a
-shared data contract. The root intentionally keeps only workspace-level
-files; all model/training/eval code lives under one of the three packages.
-
- `common/data_contract.py` — **single source of truth** for the canonical
-  9-d packet schema (`PACKET_FEATURE_NAMES`) and 20-d packet-derived flow
-  schema (`CANONICAL_FLOW_FEATURE_NAMES`), label normalization, canonical
-  5-tuple, packet preprocessing helpers, and `compute_flow_features_from_packets`.
-  All three packages import from here.
- `Packet_CFM/` — packet-sequence OT-CFM with explicit σ-band benign
-  distribution learning. Has its own `CLAUDE.md` for internal details.
- `Flow_CFM/` — flow-level CFM on the workspace-canonical 20-d packet-derived
-  `flow_features.parquet`. Legacy 61-d CICFlowMeter CSV caches are still
-  available only for reproduction via the `--legacy-csv-features` flag.
- `Unified_CFM/` — **current SOTA model**. Unified token CFM over
-  `[FLOW_TOKEN, PACKET_1, ..., PACKET_T]` with masked-prediction consistency
-  loss (Phase 2). All within-dataset SOTAs (ISCXTor2016 / CICIDS2017 /
-  CICDDoS2019) come from here.
- `scripts/` — **workspace-level** scripts shared across all packages:
-  - `download/` — UNB/CIC dataset downloaders (Token-cookie + `cic_download.py`
-    recursive crawler). See `scripts/download/README.md` before touching.
-  - `extract_<dataset>.py` + `extract_lib.py` — pcap→artifact drivers that write
-    `datasets/<name>/processed/{packets.npz, flows.parquet, flow_features.parquet}`,
-    all row-aligned by `flow_id = arange(N)`.
-  - `generate_flow_features.py` — one-shot tool to upgrade an existing
-    `packets.npz` + `flows.parquet` pair to a canonical `flow_features.parquet`
-    without re-extracting pcap. Supports `--source-store` for sharded stores.
-  - `csv_adapter.py`, `convert_npz_splits_to_store.py`, `eval_cross_dataset_protocol.py`,
-    `merge_*.py`, `auto_transfer_*.sh` — cross-package tooling.
- `datasets/<name>/raw/` and `datasets/<name>/processed/` — shared dataset store.
- `artifacts/{runs,phase0_*,phase1_*,phase25_*,verify_*}/` — **all outputs go
-  here**, not `runs/` at root. Phase summary reports live in `artifacts/phase*/`.
- `paper/` — paper PDFs we compare against (Shafir 2026 NF, ConMD 2026,
-  TIPSO-GAN 2026, Lipman 2210.02747).
-
-There is no `archive_v1/` at root; old flow-stat v1 code has been removed.
-`Flow_CFM/checkpoints_archive/` retains historical checkpoints for reproduction.
-
-## Data contract (read this before touching data code)
-
-Every processed dataset under `datasets/<name>/processed/` ships an aligned
-triple, all with the same row order (`flow_id = arange(N)`):
-
-```
-packets.npz          # packet_tokens [N, T_full, 9], packet_lengths [N], flow_id [N]
-                     # OR full_store/ (PacketShardStore directory) for large datasets
-flows.parquet        # flow_id + label + 5-tuple metadata (src_ip, dst_ip, ports, protocol)
-flow_features.parquet  # flow_id + label + 20 canonical packet-derived features
-```
-
-Optional / legacy:
- `flow_features_csv.parquet` — Flow_CFM's 61-d CICFlowMeter cache (paper
-  reproduction only; not row-aligned with packets in general)
-
-The 20 canonical flow features are computed by
-`common.data_contract.compute_flow_features_from_packets(packet_tokens, lens)`
-and cover Shafir 2026's top-SHAP categories (size/IAT/active-idle/rate/flags)
-in a packet-derivable way.
-
-## Python env
-
- `requires-python = ">=3.14"`; PyTorch pinned to the `pytorch-cu128` index
-  (`torch>=2.9.1`), plus `mamba-ssm`, `causal-conv1d`, `scapy`, `dpkt`, `pyarrow`.
- Two `pyproject.toml` files: root (`/pyproject.toml`) and `Packet_CFM/pyproject.toml`.
-  They are **not declared as a uv workspace** — each resolves independently.
-  Run `uv run ...` from whichever directory owns the entry point you are invoking.
- `Flow_CFM/` and `Unified_CFM/` have no `pyproject.toml`; they use the root
-  venv (`uv run --no-sync python <script.py>`).
- Scripts under `scripts/download/` are pure stdlib — invoke with `python3`.
-
-## Running things
-
-**Unified_CFM** (SOTA model, run from `Unified_CFM/`):
-
-```bash
-cd Unified_CFM
-uv run --no-sync python train.py --config configs/cicids2017_baseline.yaml
-# Phase 2 with consistency loss:
-uv run --no-sync python train.py --config configs/cicids2017_consistency.yaml
-```
-
-Best hyperparameters from the σ × λ sweeps:
- `lambda_flow = lambda_packet = 0.3`
- `sigma = 0.6` for cross-dataset transfer
- `sigma = 0.1` is fine for within-dataset (and marginally better on ISCXTor2016)
-
-**Phase 1 / 2 evaluation**:
-
-```bash
-# Per-attack-class AUROC over 34 scores (terminal_norm primary, plus curvature,
-# Jacobian-Hutchinson, time-profile velocity, flow_consistency diagnostics).
-uv run --no-sync python artifacts/verify_2026_04_24/eval_phase1_unified.py \
-  --model-dir <model_dir> --out-dir <eval_dir> \
-  --batch-size 256 --jacobian-n-eps 4 \
-  --n-val-cap 10000 --n-atk-cap 30000
-
-# Cross-dataset CICIDS2017 → CICDDoS2019:
-uv run --no-sync python artifacts/verify_2026_04_24/eval_phase2_cross_cicddos2019.py \
-  --model-dir <model_dir> --out <result.json> \
-  --n-benign 10000 --n-attack 10000 --seed 42
-```
-
-**Packet_CFM entry points** (run from `Packet_CFM/`):
-
-```bash
-cd Packet_CFM
-uv run python -m train          --config configs/n10k.yaml
-uv run python -m detect         --save-dir ../artifacts/runs/<run>
-uv run python -m eval.per_class --save-dir ../artifacts/runs/<run>
-uv run python -m run_phase1     --sigmas 0.0 0.1 0.2 0.3
-```
-
-**Flow_CFM entry points** (run from `Flow_CFM/`): see `Flow_CFM/README_migration.md`.
-
-**Tests**:
-
-```bash
-uv run --no-sync python -m pytest Packet_CFM/tests/ tests/common/ Unified_CFM/tests/
-```
-
-(43 passing — common data contract + Unified_CFM Phase 1/2 score functions
-+ Packet_CFM existing tests.)
-
-## Adding a new dataset
-
-Write one driver at `scripts/extract_<name>.py` that calls
-`extract_lib.extract_dataset(...)` (see `scripts/extract_cicids2017.py` as
-the reference template). The driver hardcodes CSV column names, timestamp
-formats, benign aliases, and drop patterns as module constants, then feeds
-`extract_lib` a per-day `(canonical_key → [(row_idx, ts_epoch)])` mapping
-and a per-day pcap file map. No YAML is needed.
-
-The extract pipeline writes all three artifacts (packets.npz, flows.parquet,
-flow_features.parquet) row-aligned. To upgrade an existing artifact pair
-that lacks `flow_features.parquet`, run
-`scripts/generate_flow_features.py --packets-npz ... --flows-parquet ... --out ...`
-(or `--source-store` for sharded stores).
-
-Common gotcha: if CSV timestamps and pcap epochs are in different time zones,
-`extract_lib` prints a diagnostic with the recommended `--time-offset`; rerun
-with that value.
-
-## Conventions worth preserving
-
- Do not create a new `runs/` at repo root — outputs belong under `artifacts/`.
- `scripts/download/` stays at the root (shared by all packages).
- When adding new cross-package tooling, put it in root `scripts/`. Only move
-  it into `Packet_CFM/scripts/` if it depends on that package's imports.
- Phase reports live in `artifacts/phase*/` — keep the timestamp suffix
-  (`_2026_04_25`) so future runs don't overwrite history.
- The 9-d packet schema and 20-d canonical flow schema are FIXED in
-  `common/data_contract.py`. Do not extend them ad-hoc; if you need new
-  features, propose them with evidence (Shafir-style SHAP analysis or
-  Phase 1-style per-attack ablation).
-
-## Current state of the work (2026-04-25)
-
- Phase 0 baselines + Shafir-protocol verification: ✓
- Phase 1 (34-score expansion + per-attack-class table): ✓
- Phase 2 (masked-prediction consistency loss): ✓ — multi-seed at λ=0.3
- Phase 2.5 (σ × λ sweep + multi-seed at σ=0.6): ✓
- Cross-dataset multi-seed: ✓ — also SOTA after baseline lock
- Shafir baselines locked from PDF: ✓ — `artifacts/locked_baselines.md`
- 4 of 4 reported tasks beat Shafir SOTA (final table: `RESULTS.md`)
- Architecture is finalized; remaining work is paper writing
-  (P1 skeleton, P2 thresholded F1/Precision/Recall metrics).
--- a/Mixed_CFM/_layers.py
+++ b/Mixed_CFM/_layers.py
@@ -0,0 +1,59 @@
+from __future__ import annotations
+import math
+import torch
+import torch.nn as nn
+
+
+@torch.no_grad()
+def _sinkhorn_coupling(C: torch.Tensor, reg: float=0.05, n_iter: int=20) -> torch.Tensor:
+    C = C.float()
+    log_k = -C / reg
+    B = C.shape[0]
+    log_u = torch.zeros(B, device=C.device)
+    log_v = torch.zeros(B, device=C.device)
+    for _ in range(n_iter):
+        log_v = -torch.logsumexp(log_k + log_u.unsqueeze(1), dim=0)
+        log_u = -torch.logsumexp(log_k + log_v.unsqueeze(0), dim=1)
+    log_p = log_u.unsqueeze(1) + log_k + log_v.unsqueeze(0)
+    return log_p.argmax(dim=1)
+
+
+class SinusoidalTimeEmb(nn.Module):
+
+    def __init__(self, dim: int) -> None:
+        super().__init__()
+        if dim % 2 != 0:
+            raise ValueError('time embedding dimension must be even')
+        self.dim = dim
+
+    def forward(self, t: torch.Tensor) -> torch.Tensor:
+        half = self.dim // 2
+        freqs = torch.exp(-math.log(10000) * torch.arange(half, device=t.device, dtype=t.dtype) / max(half - 1, 1))
+        args = t[:, None] * freqs[None, :]
+        return torch.cat([args.sin(), args.cos()], dim=-1)
+
+
+class AdaLNBlock(nn.Module):
+
+    def __init__(self, d_model: int, n_heads: int, mlp_ratio: float, cond_dim: int) -> None:
+        super().__init__()
+        self.norm1 = nn.LayerNorm(d_model, elementwise_affine=False)
+        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
+        self.norm2 = nn.LayerNorm(d_model, elementwise_affine=False)
+        hidden = int(d_model * mlp_ratio)
+        self.mlp = nn.Sequential(nn.Linear(d_model, hidden), nn.GELU(), nn.Linear(hidden, d_model))
+        self.cond_proj = nn.Linear(cond_dim, 6 * d_model)
+        nn.init.zeros_(self.cond_proj.weight)
+        nn.init.zeros_(self.cond_proj.bias)
+
+    @staticmethod
+    def _modulate(x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor) -> torch.Tensor:
+        return x * (1.0 + gamma[:, None, :]) + beta[:, None, :]
+
+    def forward(self, x: torch.Tensor, cond: torch.Tensor, key_padding_mask: torch.Tensor | None, attn_mask: torch.Tensor | None=None) -> torch.Tensor:
+        (g1, b1, a1, g2, b2, a2) = self.cond_proj(cond).chunk(6, dim=-1)
+        h = self._modulate(self.norm1(x), g1, b1)
+        (attn_out, _) = self.attn(h, h, h, key_padding_mask=key_padding_mask, attn_mask=attn_mask, need_weights=False)
+        x = x + a1[:, None, :] * attn_out
+        h = self._modulate(self.norm2(x), g2, b2)
+        return x + a2[:, None, :] * self.mlp(h)
--- a/Mixed_CFM/configs/ablation/b1_noflow/cicddos2019_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/cicddos2019_seed42.yaml
@@ -1,43 +1,36 @@
-save_dir: /home/chy/JANUS/artifacts/runs/unified_cfm_cicddos2019_within_consistency_2026_04_25
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed42_b1_noflow
 source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
 val_cap: 20000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 10000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-
-lambda_flow: 0.1
-lambda_packet: 0.1
-packet_mask_ratio: 0.5
-
+lambda_disc: 1.0
+reference_mode: causal_packets
 device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/cicddos2019_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/cicddos2019_seed43.yaml
@@ -0,0 +1,36 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed43_b1_noflow
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/cicddos2019_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/cicddos2019_seed44.yaml
@@ -0,0 +1,36 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed44_b1_noflow
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/cicids2017_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/cicids2017_seed42.yaml
@@ -1,43 +1,34 @@
-
-save_dir: /home/chy/JANUS/artifacts/runs/unified_cfm_cicids2017_consistency_2026_04_25
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed42_b1_noflow
 packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
 flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-
+token_dim: null
 batch_size: 256
-num_workers: 2
+num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-
-lambda_flow: 0.1
-lambda_packet: 0.1
-packet_mask_ratio: 0.5
-
+lambda_disc: 1.0
+reference_mode: causal_packets
 device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/cicids2017_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/cicids2017_seed43.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed43_b1_noflow
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/cicids2017_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/cicids2017_seed44.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed44_b1_noflow
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/ciciot2023_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/ciciot2023_seed42.yaml
@@ -1,45 +1,36 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/route_a_causal_ciciot2023_seed42
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed42_b1_noflow
 source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
 val_cap: 10000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-reference_mode: causal_packets
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
+lambda_disc: 1.0
 device: auto
+reference_mode: causal_packets
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/ciciot2023_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/ciciot2023_seed43.yaml
@@ -1,45 +1,36 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/route_a_causal_ciciot2023_seed43
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed43_b1_noflow
 source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 43
 data_seed: 43
 train_ratio: 0.8
 benign_label: normal
 val_cap: 10000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-reference_mode: causal_packets
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
+lambda_disc: 1.0
 device: auto
+reference_mode: causal_packets
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/ciciot2023_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/ciciot2023_seed44.yaml
@@ -1,45 +1,36 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/route_a_causal_ciciot2023_seed44
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed44_b1_noflow
 source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 44
 data_seed: 44
 train_ratio: 0.8
 benign_label: normal
 val_cap: 10000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-reference_mode: causal_packets
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
+lambda_disc: 1.0
 device: auto
+reference_mode: causal_packets
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/iscxtor2016_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/iscxtor2016_seed42.yaml
@@ -1,41 +1,34 @@
-save_dir: /home/chy/JANUS/artifacts/runs/unified_cfm_iscxtor2016_consistency_2026_04_25
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed42_b1_noflow
 packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
 flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: nontor
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-
+token_dim: null
 batch_size: 256
-num_workers: 2
+num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-
-lambda_flow: 0.1
-lambda_packet: 0.1
-packet_mask_ratio: 0.5
-
+lambda_disc: 1.0
+reference_mode: causal_packets
 device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/iscxtor2016_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/iscxtor2016_seed43.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed43_b1_noflow
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b1_noflow/iscxtor2016_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b1_noflow/iscxtor2016_seed44.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed44_b1_noflow
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+use_flow_token: false
--- a/Mixed_CFM/configs/ablation/b2_flowonly/cicddos2019_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/cicddos2019_seed42.yaml
@@ -1,45 +1,36 @@
-save_dir: /home/chy/JANUS/artifacts/phaseC_reference_2026_04_25/cicddos2019_ref_independent_seed42
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed42_b2_flowonly
 source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
 val_cap: 20000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-reference_mode: independent_token
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 10000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-
-lambda_flow: 0.0
-lambda_packet: 0.0
-packet_mask_ratio: 0.5
-
+lambda_disc: 0.0
+reference_mode: causal_packets
 device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/cicddos2019_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/cicddos2019_seed43.yaml
@@ -0,0 +1,36 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed43_b2_flowonly
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/cicddos2019_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/cicddos2019_seed44.yaml
@@ -0,0 +1,36 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed44_b2_flowonly
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/cicids2017_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/cicids2017_seed42.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed42_b2_flowonly
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/cicids2017_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/cicids2017_seed43.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed43_b2_flowonly
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/cicids2017_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/cicids2017_seed44.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed44_b2_flowonly
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/ciciot2023_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/ciciot2023_seed42.yaml
@@ -1,43 +1,36 @@
-
-save_dir: /home/chy/JANUS/artifacts/runs/unified_cfm_ciciot2023_2026_04_29
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed42_b2_flowonly
 source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
 val_cap: 10000
-
+attack_cap: 20000
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
+lambda_disc: 0.0
 device: auto
+reference_mode: causal_packets
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/ciciot2023_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/ciciot2023_seed43.yaml
@@ -1,45 +1,36 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/baseline_ciciot2023_seed43
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed43_b2_flowonly
 source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 43
 data_seed: 43
 train_ratio: 0.8
 benign_label: normal
 val_cap: 10000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-reference_mode:
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
+lambda_disc: 0.0
 device: auto
+reference_mode: causal_packets
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/ciciot2023_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/ciciot2023_seed44.yaml
@@ -1,45 +1,36 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/baseline_ciciot2023_seed44
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed44_b2_flowonly
 source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 44
 data_seed: 44
 train_ratio: 0.8
 benign_label: normal
 val_cap: 10000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-reference_mode:
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
+lambda_disc: 0.0
 device: auto
+reference_mode: causal_packets
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/iscxtor2016_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/iscxtor2016_seed42.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed42_b2_flowonly
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/iscxtor2016_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/iscxtor2016_seed43.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed43_b2_flowonly
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b2_flowonly/iscxtor2016_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b2_flowonly/iscxtor2016_seed44.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed44_b2_flowonly
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+n_packet_tokens: 0
--- a/Mixed_CFM/configs/ablation/b3_allcont/cicddos2019_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/cicddos2019_seed42.yaml
@@ -1,45 +1,36 @@
-save_dir: /home/chy/JANUS/artifacts/phaseC_reference_2026_04_25/cicddos2019_ref_blockdiag_seed42
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed42_b3_allcont
 source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
 val_cap: 20000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-reference_mode: block_diagonal
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
-eval_n: 20000
+eval_n: 10000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-
-lambda_flow: 0.0
-lambda_packet: 0.0
-packet_mask_ratio: 0.5
-
+lambda_disc: 0.0
+reference_mode: causal_packets
 device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/cicddos2019_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/cicddos2019_seed43.yaml
@@ -0,0 +1,36 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed43_b3_allcont
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/cicddos2019_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/cicddos2019_seed44.yaml
@@ -0,0 +1,36 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed44_b3_allcont
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/cicids2017_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/cicids2017_seed42.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed42_b3_allcont
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/cicids2017_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/cicids2017_seed43.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed43_b3_allcont
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/cicids2017_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/cicids2017_seed44.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed44_b3_allcont
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/ciciot2023_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/ciciot2023_seed42.yaml
@@ -1,45 +1,36 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/baseline_ciciot2023_seed42
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed42_b3_allcont
 source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
 val_cap: 10000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-reference_mode:
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
+lambda_disc: 0.0
 device: auto
+reference_mode: causal_packets
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/ciciot2023_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/ciciot2023_seed43.yaml
@@ -0,0 +1,36 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed43_b3_allcont
+source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+val_cap: 10000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+device: auto
+reference_mode: causal_packets
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/ciciot2023_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/ciciot2023_seed44.yaml
@@ -0,0 +1,36 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed44_b3_allcont
+source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+val_cap: 10000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+device: auto
+reference_mode: causal_packets
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/iscxtor2016_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/iscxtor2016_seed42.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed42_b3_allcont
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/iscxtor2016_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/iscxtor2016_seed43.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed43_b3_allcont
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b3_allcont/iscxtor2016_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b3_allcont/iscxtor2016_seed44.yaml
@@ -0,0 +1,34 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed44_b3_allcont
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
+disc_as_cont: true
--- a/Mixed_CFM/configs/ablation/b4_alldisc/cicddos2019_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/cicddos2019_seed42.yaml
@@ -0,0 +1,37 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed42_b4_alldisc
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/cicddos2019_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/cicddos2019_seed43.yaml
@@ -0,0 +1,37 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed43_b4_alldisc
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/cicddos2019_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/cicddos2019_seed44.yaml
@@ -0,0 +1,37 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed44_b4_alldisc
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/cicids2017_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/cicids2017_seed42.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed42_b4_alldisc
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/cicids2017_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/cicids2017_seed43.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed43_b4_alldisc
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/cicids2017_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/cicids2017_seed44.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed44_b4_alldisc
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/ciciot2023_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/ciciot2023_seed42.yaml
@@ -0,0 +1,37 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed42_b4_alldisc
+source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: normal
+val_cap: 10000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+device: auto
+reference_mode: causal_packets
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/ciciot2023_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/ciciot2023_seed43.yaml
@@ -0,0 +1,37 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed43_b4_alldisc
+source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+val_cap: 10000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+device: auto
+reference_mode: causal_packets
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/ciciot2023_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/ciciot2023_seed44.yaml
@@ -0,0 +1,37 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed44_b4_alldisc
+source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+val_cap: 10000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+device: auto
+reference_mode: causal_packets
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/iscxtor2016_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/iscxtor2016_seed42.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed42_b4_alldisc
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/iscxtor2016_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/iscxtor2016_seed43.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed43_b4_alldisc
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b4_alldisc/iscxtor2016_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b4_alldisc/iscxtor2016_seed44.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed44_b4_alldisc
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 1.0
+reference_mode: causal_packets
+device: auto
+cont_as_disc: true
+n_disc_classes: 8
--- a/Mixed_CFM/configs/ablation/b5_nodisc/cicddos2019_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/cicddos2019_seed42.yaml
@@ -1,41 +1,35 @@
-save_dir: /home/chy/JANUS/artifacts/runs/unified_cfm_cicddos2019_within_2026_04_25
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed42_b5_nodisc
 source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
 flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
-
 val_cap: 20000
 attack_cap: 20000
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-
+token_dim: null
 batch_size: 256
 num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 10000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-
+lambda_disc: 0.0
+reference_mode: causal_packets
 device: auto
--- a/Mixed_CFM/configs/ablation/b5_nodisc/cicddos2019_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/cicddos2019_seed43.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed43_b5_nodisc
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
--- a/Mixed_CFM/configs/ablation/b5_nodisc/cicddos2019_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/cicddos2019_seed44.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicddos2019_seed44_b5_nodisc
+source_store: /home/chy/JANUS/datasets/cicddos2019/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/cicddos2019/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicddos2019/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+val_cap: 20000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 10000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
--- a/Mixed_CFM/configs/ablation/b5_nodisc/cicids2017_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/cicids2017_seed42.yaml
@@ -1,38 +1,33 @@
-save_dir: /home/chy/JANUS/artifacts/runs/unified_cfm_cicids2017_canonical_2026_04_24
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed42_b5_nodisc
 packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
 flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: normal
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-
+token_dim: null
 batch_size: 256
-num_workers: 2
+num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-
+lambda_disc: 0.0
+reference_mode: causal_packets
 device: auto
--- a/Mixed_CFM/configs/ablation/b5_nodisc/cicids2017_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/cicids2017_seed43.yaml
@@ -0,0 +1,33 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed43_b5_nodisc
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
--- a/Mixed_CFM/configs/ablation/b5_nodisc/cicids2017_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/cicids2017_seed44.yaml
@@ -0,0 +1,33 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_cicids2017_seed44_b5_nodisc
+packets_npz: /home/chy/JANUS/datasets/cicids2017/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/cicids2017/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/cicids2017/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
--- a/Mixed_CFM/configs/ablation/b5_nodisc/ciciot2023_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/ciciot2023_seed42.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed42_b5_nodisc
+source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 42
+data_seed: 42
+train_ratio: 0.8
+benign_label: normal
+val_cap: 10000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+device: auto
+reference_mode: causal_packets
--- a/Mixed_CFM/configs/ablation/b5_nodisc/ciciot2023_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/ciciot2023_seed43.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed43_b5_nodisc
+source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: normal
+val_cap: 10000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+device: auto
+reference_mode: causal_packets
--- a/Mixed_CFM/configs/ablation/b5_nodisc/ciciot2023_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/ciciot2023_seed44.yaml
@@ -0,0 +1,35 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_ciciot2023_seed44_b5_nodisc
+source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
+flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: normal
+val_cap: 10000
+attack_cap: 20000
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+device: auto
+reference_mode: causal_packets
--- a/Mixed_CFM/configs/ablation/b5_nodisc/iscxtor2016_seed42.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/iscxtor2016_seed42.yaml
@@ -1,39 +1,33 @@
-
-save_dir: /home/chy/JANUS/artifacts/runs/unified_cfm_iscxtor2016_2026_04_25
-
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed42_b5_nodisc
 packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
 flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
 flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
 flow_features_align: auto
-
 T: 64
 n_train: 10000
 min_len: 2
-packet_preprocess: mixed_dequant
 seed: 42
 data_seed: 42
 train_ratio: 0.8
 benign_label: nontor
-
 d_model: 128
 n_layers: 4
 n_heads: 4
 mlp_ratio: 4.0
 time_dim: 64
-token_dim:
-
+token_dim: null
 batch_size: 256
-num_workers: 2
+num_workers: 0
 epochs: 50
-lr: 3.0e-4
+lr: 0.0003
 weight_decay: 0.01
 grad_clip: 1.0
 eval_every: 10
 eval_n: 20000
 eval_batch_size: 512
 eval_n_steps: 8
-
 sigma: 0.1
 use_ot: true
-
+lambda_disc: 0.0
+reference_mode: causal_packets
 device: auto
--- a/Mixed_CFM/configs/ablation/b5_nodisc/iscxtor2016_seed43.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/iscxtor2016_seed43.yaml
@@ -0,0 +1,33 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed43_b5_nodisc
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 43
+data_seed: 43
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
--- a/Mixed_CFM/configs/ablation/b5_nodisc/iscxtor2016_seed44.yaml
+++ b/Mixed_CFM/configs/ablation/b5_nodisc/iscxtor2016_seed44.yaml
@@ -0,0 +1,33 @@
+save_dir: /home/chy/JANUS/artifacts/ablation/janus_iscxtor2016_seed44_b5_nodisc
+packets_npz: /home/chy/JANUS/datasets/iscxtor2016/processed/packets.npz
+flows_parquet: /home/chy/JANUS/datasets/iscxtor2016/processed/flows.parquet
+flow_features_path: /home/chy/JANUS/datasets/iscxtor2016/processed/flow_features.parquet
+flow_features_align: auto
+T: 64
+n_train: 10000
+min_len: 2
+seed: 44
+data_seed: 44
+train_ratio: 0.8
+benign_label: nontor
+d_model: 128
+n_layers: 4
+n_heads: 4
+mlp_ratio: 4.0
+time_dim: 64
+token_dim: null
+batch_size: 256
+num_workers: 0
+epochs: 50
+lr: 0.0003
+weight_decay: 0.01
+grad_clip: 1.0
+eval_every: 10
+eval_n: 20000
+eval_batch_size: 512
+eval_n_steps: 8
+sigma: 0.1
+use_ot: true
+lambda_disc: 0.0
+reference_mode: causal_packets
+device: auto
--- a/Mixed_CFM/data.py
+++ b/Mixed_CFM/data.py
@@ -7,19 +7,116 @@ import pandas as pd
 import sys as _sys
 from pathlib import Path as _Path
 _sys.path.insert(0, str(_Path(__file__).resolve().parents[1]))
-from common.data_contract import PACKET_FEATURE_NAMES, PACKET_CONTINUOUS_CHANNEL_IDX, PACKET_BINARY_CHANNEL_IDX, fit_packet_stats as _fit_packet_stats, zscore as _zscore
-import importlib.util as _ilu
-_UDATA_NAME = 'unified_cfm_data'
-if _UDATA_NAME not in _sys.modules:
-    _udata_spec = _ilu.spec_from_file_location(_UDATA_NAME, _Path(__file__).resolve().parents[1] / 'Unified_CFM' / 'data.py')
-    _udata = _ilu.module_from_spec(_udata_spec)
-    _sys.modules[_UDATA_NAME] = _udata
-    _udata_spec.loader.exec_module(_udata)
-else:
-    _udata = _sys.modules[_UDATA_NAME]
-DEFAULT_FLOW_META_COLUMNS = _udata.DEFAULT_FLOW_META_COLUMNS
-_read_aligned_flow_features = _udata._read_aligned_flow_features
-_preprocess_flow = _udata._preprocess_flow
+from common.data_contract import (
+    PACKET_FEATURE_NAMES,
+    PACKET_CONTINUOUS_CHANNEL_IDX,
+    PACKET_BINARY_CHANNEL_IDX,
+    canonical_5tuple as _canonical_key,
+    fit_packet_stats as _fit_packet_stats,
+    zscore as _zscore,
+)
+
+DEFAULT_FLOW_META_COLUMNS = {'flow_id', 'label', 'day', 'service', 'src_ip', 'dst_ip', 'src_port', 'dst_port', 'protocol', 'timestamp', 'start_ts', 'n_pkts'}
+
+
+def _read_flow_features(path: Path, *, expected_rows: int, feature_columns: Optional[list[str]]=None) -> tuple[np.ndarray, tuple[str, ...], np.ndarray | None]:
+    path = Path(path)
+    if path.suffix == '.npz':
+        data = np.load(path, allow_pickle=True)
+        x = data['features'].astype(np.float32)
+        raw_names = data['feature_names'] if 'feature_names' in data.files else np.arange(x.shape[1])
+        names = tuple((str(v) for v in raw_names))
+        flow_id = data['flow_id'] if 'flow_id' in data.files else None
+    elif path.suffix in ('.parquet', '.pq'):
+        df = pd.read_parquet(path)
+        flow_id = df['flow_id'].to_numpy() if 'flow_id' in df.columns else None
+        if feature_columns:
+            cols = feature_columns
+        else:
+            cols = [c for c in df.columns if c not in DEFAULT_FLOW_META_COLUMNS and pd.api.types.is_numeric_dtype(df[c])]
+        if not cols:
+            raise ValueError(f'no numeric flow feature columns found in {path}')
+        x = df[cols].to_numpy(dtype=np.float32)
+        names = tuple(cols)
+    else:
+        raise ValueError(f'unsupported flow feature file: {path}')
+    if len(x) != expected_rows:
+        raise ValueError(f'flow feature row count {len(x):,} != packet row count {expected_rows:,}')
+    x = np.nan_to_num(x, nan=0.0, posinf=0.0, neginf=0.0).astype(np.float32)
+    return (x, names, flow_id)
+
+
+def _feature_columns_from_df(df: pd.DataFrame, requested: Optional[list[str]]) -> list[str]:
+    if requested:
+        return requested
+    return [c for c in df.columns if c not in DEFAULT_FLOW_META_COLUMNS and pd.api.types.is_numeric_dtype(df[c])]
+
+
+def _align_flow_features_by_scan(feature_df: pd.DataFrame, packet_flows: pd.DataFrame, *, feature_columns: list[str]) -> tuple[np.ndarray, tuple[str, ...]]:
+    required = ['label', 'src_ip', 'src_port', 'dst_ip', 'dst_port', 'protocol']
+    missing_feature = [c for c in required if c not in feature_df.columns]
+    missing_packet = [c for c in required if c not in packet_flows.columns]
+    if missing_feature or missing_packet:
+        raise ValueError(f'scan alignment requires label + 5-tuple metadata. missing in feature_df={missing_feature}, packet_flows={missing_packet}')
+    packet_keys = [(str(lbl), _canonical_key(src, sp, dst, dp, proto)) for (lbl, src, sp, dst, dp, proto) in zip(packet_flows['label'].to_numpy(), packet_flows['src_ip'].to_numpy(), packet_flows['src_port'].to_numpy(), packet_flows['dst_ip'].to_numpy(), packet_flows['dst_port'].to_numpy(), packet_flows['protocol'].to_numpy())]
+    labels = feature_df['label'].to_numpy()
+    src_ip = feature_df['src_ip'].to_numpy()
+    src_port = feature_df['src_port'].to_numpy()
+    dst_ip = feature_df['dst_ip'].to_numpy()
+    dst_port = feature_df['dst_port'].to_numpy()
+    protocol = feature_df['protocol'].to_numpy()
+    matched: list[int] = []
+    j = 0
+    n_csv = len(feature_df)
+    for (i, target) in enumerate(packet_keys):
+        while j < n_csv:
+            cand = (str(labels[j]), _canonical_key(src_ip[j], src_port[j], dst_ip[j], dst_port[j], protocol[j]))
+            j += 1
+            if cand == target:
+                matched.append(j - 1)
+                break
+        else:
+            raise ValueError(f'failed to align packet flow row {i:,}/{len(packet_keys):,}; the CSV cache may not be the same one used for packet extraction')
+    print(f'[data] scan-aligned CSV flow features: matched={len(matched):,} from csv_rows={n_csv:,} skipped={matched[-1] + 1 - len(matched):,}')
+    x = feature_df.iloc[matched][feature_columns].to_numpy(dtype=np.float32)
+    x = np.nan_to_num(x, nan=0.0, posinf=0.0, neginf=0.0).astype(np.float32)
+    return (x, tuple(feature_columns))
+
+
+def _read_aligned_flow_features(path: Path, packet_flows: pd.DataFrame, *, feature_columns: Optional[list[str]]=None, align: str='auto') -> tuple[np.ndarray, tuple[str, ...]]:
+    path = Path(path)
+    if align not in ('auto', 'row', 'scan'):
+        raise ValueError("flow_features_align must be 'auto', 'row', or 'scan'")
+    if path.suffix == '.npz':
+        (x, names, flow_id) = _read_flow_features(path, expected_rows=len(packet_flows), feature_columns=feature_columns)
+        packet_id = packet_flows['flow_id'].to_numpy() if 'flow_id' in packet_flows else None
+        if flow_id is not None and packet_id is not None and (not np.array_equal(flow_id, packet_id)):
+            raise ValueError('NPZ flow_id does not align with Packet_CFM flows')
+        return (x, names)
+    if path.suffix not in ('.parquet', '.pq'):
+        raise ValueError(f'unsupported flow feature file: {path}')
+    feature_df = pd.read_parquet(path)
+    cols = _feature_columns_from_df(feature_df, feature_columns)
+    if not cols:
+        raise ValueError(f'no numeric flow feature columns found in {path}')
+    packet_id = packet_flows['flow_id'].to_numpy() if 'flow_id' in packet_flows else None
+    if len(feature_df) == len(packet_flows):
+        feature_id = feature_df['flow_id'].to_numpy() if 'flow_id' in feature_df.columns else None
+        if feature_id is None or packet_id is None or np.array_equal(feature_id, packet_id):
+            x = feature_df[cols].to_numpy(dtype=np.float32)
+            x = np.nan_to_num(x, nan=0.0, posinf=0.0, neginf=0.0).astype(np.float32)
+            return (x, tuple(cols))
+        if align == 'row':
+            raise ValueError("flow_id mismatch with flow_features_align='row'")
+    if align == 'row':
+        raise ValueError(f'row alignment requested but feature rows={len(feature_df):,} packet rows={len(packet_flows):,}')
+    return _align_flow_features_by_scan(feature_df, packet_flows, feature_columns=cols)
+
+
+def _preprocess_flow(train: np.ndarray, val: np.ndarray, attack: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
+    mean = train.mean(axis=0).astype(np.float32)
+    std = train.std(axis=0).astype(np.float32)
+    return (_zscore(train, mean, std), _zscore(val, mean, std), _zscore(attack, mean, std), mean, std)

@dataclass
 class MixedData:
--- a/Mixed_CFM/eval_cross.py
+++ b/Mixed_CFM/eval_cross.py
@@ -20,7 +20,7 @@ def _device(arg: str) -> torch.device:
        return torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    return torch.device(arg)

-def _score_batch(model, flow_z, cont_z, disc_int, lens, device, batch_size=256, n_steps=16):
+def _score_batch(model, flow_z, cont_z, disc_int, lens, device, batch_size=256, n_steps=16, cont_bin_edges=None):
    out: dict[str, list[np.ndarray]] = {}
    for start in range(0, len(flow_z), batch_size):
        sl = slice(start, start + batch_size)
@@ -29,8 +29,8 @@ def _score_batch(model, flow_z, cont_z, disc_int, lens, device, batch_size=256,
        d = torch.from_numpy(disc_int[sl]).long().to(device)
        l = torch.from_numpy(lens[sl]).long().to(device)
        with torch.no_grad():
-            traj = model.trajectory_metrics(f, c, d, l, n_steps=n_steps)
-            nll = model.disc_nll_score(f, c, d, l)
+            traj = model.trajectory_metrics(f, c, d, l, n_steps=n_steps, cont_bin_edges=cont_bin_edges)
+            nll = model.disc_nll_score(f, c, d, l, cont_bin_edges=cont_bin_edges)
        for src in (traj, nll):
            for (k, v) in src.items():
                out.setdefault(k, []).append(v.detach().cpu().numpy())
@@ -63,6 +63,10 @@ def main() -> None:
    model = MixedTokenCFM(model_cfg).to(device)
    model.load_state_dict(ckpt['model_state_dict'])
    model.eval()
+    cont_bin_edges = None
+    if 'cont_bin_edges' in ckpt:
+        cont_bin_edges = torch.from_numpy(np.asarray(ckpt['cont_bin_edges'])).to(device)
+        print(f'[model] cont_bin_edges shape={tuple(cont_bin_edges.shape)} (B4 mode; src edges applied to target)')
    cont_mean = np.asarray(ckpt['cont_mean'], dtype=np.float32)
    cont_std = np.asarray(ckpt['cont_std'], dtype=np.float32)
    flow_mean = np.asarray(ckpt['flow_mean'], dtype=np.float32)
@@ -140,11 +144,11 @@ def main() -> None:
    a_flow_z = ((a_flow - flow_mean) / np.maximum(flow_std, 1e-06)).astype(np.float32)
    t0 = time.time()
    print('[eval] benign...')
-    b_scores = _score_batch(model, b_flow_z, b_cont, b_disc, b_len, device, batch_size=args.batch_size, n_steps=args.n_steps)
+    b_scores = _score_batch(model, b_flow_z, b_cont, b_disc, b_len, device, batch_size=args.batch_size, n_steps=args.n_steps, cont_bin_edges=cont_bin_edges)
    print(f'[eval] benign done {time.time() - t0:.1f}s')
    t0 = time.time()
    print('[eval] attack...')
-    a_scores = _score_batch(model, a_flow_z, a_cont, a_disc, a_len, device, batch_size=args.batch_size, n_steps=args.n_steps)
+    a_scores = _score_batch(model, a_flow_z, a_cont, a_disc, a_len, device, batch_size=args.batch_size, n_steps=args.n_steps, cont_bin_edges=cont_bin_edges)
    print(f'[eval] attack done {time.time() - t0:.1f}s')
    keys = sorted(set(b_scores) & set(a_scores))
    overall = {}
--- a/Mixed_CFM/eval_phase1.py
+++ b/Mixed_CFM/eval_phase1.py
@@ -18,7 +18,7 @@ def _device(arg: str) -> torch.device:
        return torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    return torch.device(arg)

-def _score_batch(model: MixedTokenCFM, flow_np: np.ndarray, cont_np: np.ndarray, disc_np: np.ndarray, len_np: np.ndarray, device: torch.device, *, batch_size: int, n_steps: int) -> dict[str, np.ndarray]:
+def _score_batch(model: MixedTokenCFM, flow_np: np.ndarray, cont_np: np.ndarray, disc_np: np.ndarray, len_np: np.ndarray, device: torch.device, *, batch_size: int, n_steps: int, cont_bin_edges: torch.Tensor | None = None) -> dict[str, np.ndarray]:
    out: dict[str, list[np.ndarray]] = {}
    for start in range(0, len(flow_np), batch_size):
        sl = slice(start, start + batch_size)
@@ -27,8 +27,8 @@ def _score_batch(model: MixedTokenCFM, flow_np: np.ndarray, cont_np: np.ndarray,
        disc = torch.from_numpy(disc_np[sl]).long().to(device)
        lens = torch.from_numpy(len_np[sl]).long().to(device)
        with torch.no_grad():
-            traj = model.trajectory_metrics(flow, cont, disc, lens, n_steps=n_steps)
-            nll = model.disc_nll_score(flow, cont, disc, lens)
+            traj = model.trajectory_metrics(flow, cont, disc, lens, n_steps=n_steps, cont_bin_edges=cont_bin_edges)
+            nll = model.disc_nll_score(flow, cont, disc, lens, cont_bin_edges=cont_bin_edges)
        for d in (traj, nll):
            for (k, v) in d.items():
                out.setdefault(k, []).append(v.detach().cpu().numpy())
@@ -65,7 +65,11 @@ def main() -> None:
    model = MixedTokenCFM(model_cfg).to(device)
    model.load_state_dict(ckpt['model_state_dict'])
    model.eval()
-    print(f'[model] T={model_cfg.T} flow_dim={model_cfg.flow_dim}')
+    cont_bin_edges = None
+    if 'cont_bin_edges' in ckpt:
+        cont_bin_edges = torch.from_numpy(np.asarray(ckpt['cont_bin_edges'])).to(device)
+        print(f'[model] cont_bin_edges shape={tuple(cont_bin_edges.shape)} (B4 mode)')
+    print(f'[model] T={model_cfg.T} flow_dim={model_cfg.flow_dim} use_flow_token={model_cfg.use_flow_token} n_packet_tokens={model_cfg.n_packet_tokens} disc_as_cont={model_cfg.disc_as_cont} cont_as_disc={model_cfg.cont_as_disc}')
    data = load_mixed_data(packets_npz=Path(cfg['packets_npz']) if cfg.get('packets_npz') else None, source_store=Path(cfg['source_store']) if cfg.get('source_store') else None, flows_parquet=Path(cfg['flows_parquet']), flow_features_path=Path(cfg['flow_features_path']), flow_features_align=str(cfg.get('flow_features_align', 'auto')), T=int(cfg['T']), split_seed=int(cfg.get('data_seed', cfg.get('seed', 42))), train_ratio=float(cfg.get('train_ratio', 0.8)), benign_label=str(cfg.get('benign_label', 'normal')), min_len=int(cfg.get('min_len', 2)), attack_cap=int(cfg['attack_cap']) if cfg.get('attack_cap') else None, val_cap=int(cfg['val_cap']) if cfg.get('val_cap') else None)
    print(f'[data] val={len(data.val_flow):,} attack={len(data.attack_flow):,}')
    rng = np.random.default_rng(0)
@@ -81,10 +85,10 @@ def main() -> None:
        atk_labels = atk_labels[idx]
    print(f'[eval] scoring val={len(val_flow):,} atk={len(atk_flow):,}')
    t0 = time.time()
-    val = _score_batch(model, val_flow, val_cont, val_disc, val_len, device, batch_size=args.batch_size, n_steps=args.n_steps)
+    val = _score_batch(model, val_flow, val_cont, val_disc, val_len, device, batch_size=args.batch_size, n_steps=args.n_steps, cont_bin_edges=cont_bin_edges)
    print(f'[eval] val done {time.time() - t0:.1f}s')
    t0 = time.time()
-    atk = _score_batch(model, atk_flow, atk_cont, atk_disc, atk_len, device, batch_size=args.batch_size, n_steps=args.n_steps)
+    atk = _score_batch(model, atk_flow, atk_cont, atk_disc, atk_len, device, batch_size=args.batch_size, n_steps=args.n_steps, cont_bin_edges=cont_bin_edges)
    print(f'[eval] atk done {time.time() - t0:.1f}s')
    keys = sorted(set(val) & set(atk))
    overall: dict[str, dict[str, float]] = {}
--- a/Mixed_CFM/model.py
+++ b/Mixed_CFM/model.py
@@ -1,23 +1,15 @@
 from __future__ import annotations
 import math
-from dataclasses import dataclass, field
+import sys as _sys
+from dataclasses import dataclass
+from pathlib import Path as _Path
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
-import importlib.util as _ilu
-import sys as _sys
-from pathlib import Path as _Path
-_UNIFIED_NAME = 'unified_cfm_model'
-if _UNIFIED_NAME not in _sys.modules:
-    _unified_spec = _ilu.spec_from_file_location(_UNIFIED_NAME, _Path(__file__).resolve().parents[1] / 'Unified_CFM' / 'model.py')
-    _unified = _ilu.module_from_spec(_unified_spec)
-    _sys.modules[_UNIFIED_NAME] = _unified
-    _unified_spec.loader.exec_module(_unified)
-else:
-    _unified = _sys.modules[_UNIFIED_NAME]
-AdaLNBlock = _unified.AdaLNBlock
-SinusoidalTimeEmb = _unified.SinusoidalTimeEmb
-_sinkhorn_coupling = _unified._sinkhorn_coupling
+
+_sys.path.insert(0, str(_Path(__file__).resolve().parent))
+from _layers import AdaLNBlock, SinusoidalTimeEmb, _sinkhorn_coupling
+

@dataclass
 class MixedCFMConfig:
@@ -40,6 +32,11 @@ class MixedCFMConfig:
    lambda_disc: float = 1.0
    disc_path: str = 'uniform'
    disc_embed_scale: float = 1.0
+    # ---- B-group ablation flags (defaults preserve JANUS-full behavior) ----
+    use_flow_token: bool = True       # B1: False removes the [FLOW] token
+    n_packet_tokens: int = -1         # B2: 0 removes packet tokens entirely; -1 = use cfg.T
+    disc_as_cont: bool = False        # B3: feed 6 disc bits through CFM head as continuous values
+    cont_as_disc: bool = False        # B4: quantize 3 cont channels into n_disc_classes bins (mask-pred only)

    def __post_init__(self) -> None:
        if len(self.cont_pkt_idx) != self.n_cont_pkt:
@@ -48,10 +45,13 @@ class MixedCFMConfig:
            raise ValueError('disc_pkt_idx length mismatch n_disc_pkt')
        if self.disc_path != 'uniform':
            raise NotImplementedError(f'disc_path={self.disc_path}')
+        if self.disc_as_cont and self.cont_as_disc:
+            raise ValueError('disc_as_cont and cont_as_disc are mutually exclusive')
+

 class MixedVelocity(nn.Module):

-    def __init__(self, token_dim: int, seq_len: int, n_disc: int, n_classes: int, d_model: int=128, n_layers: int=4, n_heads: int=4, mlp_ratio: float=4.0, time_dim: int=64, reference_mode: str | None=None) -> None:
+    def __init__(self, token_dim: int, seq_len: int, n_disc: int, n_classes: int, d_model: int=128, n_layers: int=4, n_heads: int=4, mlp_ratio: float=4.0, time_dim: int=64, reference_mode: str | None=None, has_flow_token: bool=True) -> None:
        super().__init__()
        if reference_mode not in (None, 'causal_packets', 'causal_all'):
            raise ValueError(f'reference_mode={reference_mode!r}')
@@ -60,6 +60,7 @@ class MixedVelocity(nn.Module):
        self.n_disc = n_disc
        self.n_classes = n_classes
        self.reference_mode = reference_mode
+        self.has_flow_token = has_flow_token
        self.input_proj = nn.Linear(token_dim, d_model)
        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, d_model))
        self.type_emb = nn.Embedding(2, d_model)
@@ -70,12 +71,15 @@ class MixedVelocity(nn.Module):
        self.blocks = nn.ModuleList([AdaLNBlock(d_model, n_heads, mlp_ratio, cond_dim=d_model) for _ in range(n_layers)])
        self.out_norm = nn.LayerNorm(d_model, elementwise_affine=False)
        self.head_v = nn.Linear(d_model, token_dim)
-        self.head_disc = nn.Linear(d_model, n_disc * n_classes)
+        # head_disc only meaningful when n_disc > 0
+        out_disc = max(n_disc * n_classes, 1)
+        self.head_disc = nn.Linear(d_model, out_disc)
        for layer in (self.head_v, self.head_disc):
            nn.init.zeros_(layer.weight)
            nn.init.zeros_(layer.bias)
        type_ids = torch.ones(seq_len, dtype=torch.long)
-        type_ids[0] = 0
+        if has_flow_token and seq_len >= 1:
+            type_ids[0] = 0
        self.register_buffer('type_ids', type_ids, persistent=False)

    def _attn_mask(self, L: int, device: torch.device) -> torch.Tensor | None:
@@ -83,8 +87,11 @@ class MixedVelocity(nn.Module):
            return None
        if self.reference_mode == 'causal_packets':
            mask = torch.zeros((L, L), dtype=torch.bool, device=device)
-            if L > 1:
-                mask[1:, 1:] = torch.triu(torch.ones(L - 1, L - 1, dtype=torch.bool, device=device), diagonal=1)
+            offset = 1 if self.has_flow_token else 0
+            if L > offset:
+                M = L - offset
+                if M > 1:
+                    mask[offset:, offset:] = torch.triu(torch.ones(M, M, dtype=torch.bool, device=device), diagonal=1)
            return mask
        return torch.triu(torch.ones(L, L, dtype=torch.bool, device=device), diagonal=1)

@@ -100,143 +107,339 @@ class MixedVelocity(nn.Module):
            h = block(h, cond, key_padding_mask, attn_mask=attn_mask)
        h = self.out_norm(h)
        v = self.head_v(h)
-        d = self.head_disc(h).view(B, L, self.n_disc, self.n_classes)
+        if self.n_disc > 0:
+            d = self.head_disc(h).view(B, L, self.n_disc, self.n_classes)
+        else:
+            d = h.new_zeros((B, L, 0, self.n_classes))
        return (v, d)

+
 class MixedTokenCFM(nn.Module):

    def __init__(self, cfg: MixedCFMConfig) -> None:
        super().__init__()
        self.cfg = cfg
-        cont_size = cfg.n_cont_pkt + cfg.n_disc_pkt
+        # Effective packet count (B2: n_packet_tokens=0 → no packets)
+        self.eff_T = cfg.T if cfg.n_packet_tokens < 0 else int(cfg.n_packet_tokens)
+        if not cfg.use_flow_token and self.eff_T == 0:
+            raise ValueError('cannot disable both FLOW token and packet tokens')
+        # Effective per-packet feature split
+        if cfg.disc_as_cont:
+            # B3: 9 cont, 0 disc (CFM head only)
+            self.eff_n_cont = cfg.n_cont_pkt + cfg.n_disc_pkt
+            self.eff_n_disc = 0
+        elif cfg.cont_as_disc:
+            # B4: 0 cont, 9 disc (mask-pred head only)
+            self.eff_n_cont = 0
+            self.eff_n_disc = cfg.n_cont_pkt + cfg.n_disc_pkt
+        else:
+            self.eff_n_cont = cfg.n_cont_pkt
+            self.eff_n_disc = cfg.n_disc_pkt
+        cont_size = self.eff_n_cont + self.eff_n_disc
+        # Token layout: [type_flag(1) | flow_dim or cont_size]
        self.token_dim = cfg.token_dim or 1 + max(cfg.flow_dim, cont_size)
        if self.token_dim < 1 + max(cfg.flow_dim, cont_size):
            raise ValueError('token_dim too small')
-        self.seq_len = cfg.T + 1
-        self.velocity = MixedVelocity(token_dim=self.token_dim, seq_len=self.seq_len, n_disc=cfg.n_disc_pkt, n_classes=cfg.n_disc_classes, d_model=cfg.d_model, n_layers=cfg.n_layers, n_heads=cfg.n_heads, mlp_ratio=cfg.mlp_ratio, time_dim=cfg.time_dim, reference_mode=cfg.reference_mode)
+        self.seq_len = (1 if cfg.use_flow_token else 0) + self.eff_T
+        self.velocity = MixedVelocity(
+            token_dim=self.token_dim, seq_len=self.seq_len,
+            n_disc=self.eff_n_disc, n_classes=cfg.n_disc_classes,
+            d_model=cfg.d_model, n_layers=cfg.n_layers, n_heads=cfg.n_heads,
+            mlp_ratio=cfg.mlp_ratio, time_dim=cfg.time_dim,
+            reference_mode=cfg.reference_mode, has_flow_token=cfg.use_flow_token,
+        )

+    # ------------------------------------------------------------------ #
+    # token assembly                                                     #
+    # ------------------------------------------------------------------ #
    def _embed_disc(self, x_disc_int: torch.Tensor) -> torch.Tensor:
+        n = self.cfg.n_disc_classes
        s = self.cfg.disc_embed_scale
-        return (x_disc_int.float() - 0.5) * s
+        if n <= 1:
+            return x_disc_int.float() * 0.0
+        # Map integers in [0, n-1] to centered floats in [-s/2, +s/2].
+        # Backwards-compatible with old (x - 0.5)*s formula when n=2.
+        return (x_disc_int.float() / (n - 1) - 0.5) * s
+
+    def _flow_dim(self) -> int:
+        return self.cfg.flow_dim

    def build_tokens(self, flow: torch.Tensor, packets_cont: torch.Tensor, x_disc_t_int: torch.Tensor) -> torch.Tensor:
-        (B, T, Cp) = packets_cont.shape
-        assert T == self.cfg.T and Cp == self.cfg.n_cont_pkt
-        z = packets_cont.new_zeros((B, T + 1, self.token_dim))
-        z[:, 0, 0] = -1.0
-        z[:, 0, 1:1 + self.cfg.flow_dim] = flow
-        z[:, 1:, 0] = 1.0
-        z[:, 1:, 1:1 + self.cfg.n_cont_pkt] = packets_cont
-        z[:, 1:, 1 + self.cfg.n_cont_pkt:1 + self.cfg.n_cont_pkt + self.cfg.n_disc_pkt] = self._embed_disc(x_disc_t_int)
+        """Assemble [B, seq_len, token_dim].
+
+        packets_cont: [B, eff_T, eff_n_cont] (may be empty in last dim)
+        x_disc_t_int: [B, eff_T, eff_n_disc] integer ids in [0, n_disc_classes-1]
+        """
+        B = flow.shape[0]
+        device = flow.device
+        T = self.eff_T
+        z = flow.new_zeros((B, self.seq_len, self.token_dim))
+        cur = 0
+        if self.cfg.use_flow_token:
+            z[:, 0, 0] = -1.0  # type flag
+            z[:, 0, 1:1 + self._flow_dim()] = flow
+            cur = 1
+        if T > 0:
+            z[:, cur:cur + T, 0] = 1.0  # type flag
+            base = 1
+            if self.eff_n_cont > 0:
+                z[:, cur:cur + T, base:base + self.eff_n_cont] = packets_cont
+                base += self.eff_n_cont
+            if self.eff_n_disc > 0:
+                z[:, cur:cur + T, base:base + self.eff_n_disc] = self._embed_disc(x_disc_t_int)
        return z

    def key_padding_mask(self, lens: torch.Tensor) -> torch.Tensor:
        B = lens.shape[0]
-        idx = torch.arange(self.cfg.T, device=lens.device)[None, :]
-        packet_real = idx < lens[:, None]
-        real = torch.cat([torch.ones(B, 1, dtype=torch.bool, device=lens.device), packet_real], dim=1)
+        device = lens.device
+        T = self.eff_T
+        pieces = []
+        if self.cfg.use_flow_token:
+            pieces.append(torch.ones(B, 1, dtype=torch.bool, device=device))
+        if T > 0:
+            idx = torch.arange(T, device=device)[None, :]
+            pieces.append(idx < lens[:, None])
+        real = torch.cat(pieces, dim=1) if pieces else torch.ones(B, 0, dtype=torch.bool, device=device)
        return ~real

    def _loss_mask(self, lens: torch.Tensor) -> torch.Tensor:
        return (~self.key_padding_mask(lens)).float()

-    def compute_loss(self, flow: torch.Tensor, packets_cont: torch.Tensor, packets_disc: torch.Tensor, lens: torch.Tensor, *, return_components: bool=False) -> torch.Tensor | dict[str, torch.Tensor]:
-        (B, T, _) = packets_cont.shape
-        device = packets_cont.device
+    # ------------------------------------------------------------------ #
+    # B4 helper: quantize cont -> integer bins                           #
+    # ------------------------------------------------------------------ #
+    def quantize_cont(self, packets_cont: torch.Tensor, bin_edges: torch.Tensor) -> torch.Tensor:
+        """packets_cont [B, T, n_cont_orig] (already z-scored); bin_edges [n_cont_orig, n_classes-1]
+        returns int64 [B, T, n_cont_orig] in [0, n_classes-1]."""
+        B, T, C = packets_cont.shape
+        out = torch.zeros((B, T, C), dtype=torch.long, device=packets_cont.device)
+        for c in range(C):
+            edges = bin_edges[c]  # [n_classes-1]
+            # bucketize: returns 0..n for n edges
+            out[:, :, c] = torch.bucketize(packets_cont[:, :, c].contiguous(), edges)
+        out.clamp_(0, self.cfg.n_disc_classes - 1)
+        return out
+
+    # ------------------------------------------------------------------ #
+    # Loss                                                               #
+    # ------------------------------------------------------------------ #
+    def compute_loss(self, flow: torch.Tensor, packets_cont: torch.Tensor, packets_disc: torch.Tensor, lens: torch.Tensor, *, return_components: bool=False, cont_bin_edges: torch.Tensor | None=None) -> torch.Tensor | dict[str, torch.Tensor]:
+        cfg = self.cfg
+        B = flow.shape[0]
+        T = self.eff_T
+        device = flow.device
+
+        # Resolve effective cont/disc tensors per ablation mode
+        if cfg.disc_as_cont:
+            # 9 cont = original 3 cont + 6 disc-as-float
+            disc_as_cont_float = self._embed_disc(packets_disc) if T > 0 else None
+            if T > 0:
+                eff_cont = torch.cat([packets_cont, disc_as_cont_float], dim=-1) if cfg.n_cont_pkt > 0 else disc_as_cont_float
+            else:
+                eff_cont = packets_cont.new_zeros((B, 0, 0))
+            eff_disc_int = torch.zeros((B, T, 0), dtype=torch.long, device=device)
+        elif cfg.cont_as_disc:
+            # 0 cont, 9 disc: quantize cont via supplied bin_edges
+            if T > 0:
+                if cont_bin_edges is None:
+                    raise ValueError('cont_as_disc requires cont_bin_edges')
+                cont_int = self.quantize_cont(packets_cont, cont_bin_edges)
+                eff_disc_int = torch.cat([cont_int, packets_disc.long()], dim=-1)
+            else:
+                eff_disc_int = torch.zeros((B, 0, self.eff_n_disc), dtype=torch.long, device=device)
+            eff_cont = flow.new_zeros((B, T, 0))
+        else:
+            eff_cont = packets_cont if T > 0 else packets_cont.new_zeros((B, 0, cfg.n_cont_pkt))
+            eff_disc_int = packets_disc.long() if T > 0 else torch.zeros((B, 0, cfg.n_disc_pkt), dtype=torch.long, device=device)
+
+        # Build x_1 (data tokens; mask-pred path uses zero ids for disc at packet positions during CFM regression)
+        zero_disc = torch.zeros_like(eff_disc_int)
+        x_1_cont = self.build_tokens(flow, eff_cont, zero_disc)
+
        mask = self._loss_mask(lens)
        kpm = mask == 0
-        x_1_cont = self.build_tokens(flow, packets_cont, torch.zeros_like(packets_disc))
+
        x_0_cont = torch.randn_like(x_1_cont)
-        if self.cfg.use_ot:
+
+        if cfg.use_ot:
            flat0 = (x_0_cont * mask[:, :, None]).reshape(B, -1)
            flat1 = (x_1_cont * mask[:, :, None]).reshape(B, -1)
            col = _sinkhorn_coupling(torch.cdist(flat0.float(), flat1.float()))
            x_1_cont = x_1_cont[col]
-            packets_cont = packets_cont[col]
+            eff_cont = eff_cont[col] if eff_cont.numel() > 0 else eff_cont
+            eff_disc_int = eff_disc_int[col] if eff_disc_int.numel() > 0 else eff_disc_int
            packets_disc = packets_disc[col]
            flow = flow[col]
            lens = lens[col]
            mask = self._loss_mask(lens)
            kpm = mask == 0
+
        t = torch.rand(B, device=device)
        x_t_cont = (1.0 - t[:, None, None]) * x_0_cont + t[:, None, None] * x_1_cont
-        if self.cfg.sigma > 0:
-            std = self.cfg.sigma * torch.sqrt(t * (1.0 - t))[:, None, None]
+        if cfg.sigma > 0:
+            std = cfg.sigma * torch.sqrt(t * (1.0 - t))[:, None, None]
            x_t_cont = x_t_cont + std * torch.randn_like(x_t_cont)
        target_cont = x_1_cont - x_0_cont
-        u = torch.rand(B, T, self.cfg.n_disc_pkt, device=device)
-        keep = u < t[:, None, None]
-        rand_disc = torch.randint(0, self.cfg.n_disc_classes, packets_disc.shape, device=device)
-        x_disc_t = torch.where(keep, packets_disc, rand_disc)
-        disc_start = 1 + self.cfg.n_cont_pkt
-        x_t_full = x_t_cont.clone()
-        x_t_full[:, 1:, disc_start:disc_start + self.cfg.n_disc_pkt] = self._embed_disc(x_disc_t)
+
+        # Disc corruption schedule (mask-pred): keep fraction t of true labels
+        if T > 0 and self.eff_n_disc > 0:
+            u = torch.rand(B, T, self.eff_n_disc, device=device)
+            keep = u < t[:, None, None]
+            rand_disc = torch.randint(0, cfg.n_disc_classes, eff_disc_int.shape, device=device)
+            x_disc_t = torch.where(keep, eff_disc_int, rand_disc)
+            disc_start = (1 if cfg.use_flow_token else 0) + 0  # placeholder; overwritten below
+            # Where in x_t_full do disc embeds go?
+            # Within each packet token: [type(1) | cont(eff_n_cont) | disc(eff_n_disc) | pad...]
+            disc_start_in_token = 1 + self.eff_n_cont
+            cur_offset = 1 if cfg.use_flow_token else 0
+            x_t_full = x_t_cont.clone()
+            x_t_full[:, cur_offset:cur_offset + T, disc_start_in_token:disc_start_in_token + self.eff_n_disc] = self._embed_disc(x_disc_t)
+        else:
+            x_t_full = x_t_cont
+            x_disc_t = eff_disc_int  # unused
+            keep = None
+
        (v_pred, d_logits) = self.velocity(x_t_full, t, key_padding_mask=kpm)
+
+        # CFM regression loss on cont slots (mask out disc slots)
        v_err = (v_pred - target_cont).square()
-        v_err[:, :, disc_start:disc_start + self.cfg.n_disc_pkt] = 0.0
+        if T > 0 and self.eff_n_disc > 0:
+            disc_start_in_token = 1 + self.eff_n_cont
+            cur_offset = 1 if cfg.use_flow_token else 0
+            v_err[:, cur_offset:cur_offset + T, disc_start_in_token:disc_start_in_token + self.eff_n_disc] = 0.0
        v_per_token = v_err.mean(dim=-1)
        per_sample = (v_per_token * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
        L_cont = per_sample.mean()
-        pkt_logits = d_logits[:, 1:]
-        pkt_real = mask[:, 1:].bool()
-        corrupt = ~keep & pkt_real[:, :, None]
-        flat_logits = pkt_logits.reshape(-1, self.cfg.n_disc_classes)
-        flat_targets = packets_disc.reshape(-1).long()
-        flat_ce = F.cross_entropy(flat_logits, flat_targets, reduction='none')
-        flat_ce = flat_ce.view(B, T, self.cfg.n_disc_pkt)
-        flat_ce = flat_ce * corrupt.float()
-        denom = corrupt.float().sum().clamp_min(1.0)
-        L_disc = flat_ce.sum() / denom
-        total = L_cont + self.cfg.lambda_disc * L_disc
+
+        # Mask-pred CE on corrupted disc positions
+        if T > 0 and self.eff_n_disc > 0 and keep is not None:
+            cur_offset = 1 if cfg.use_flow_token else 0
+            pkt_logits = d_logits[:, cur_offset:cur_offset + T]
+            pkt_real = mask[:, cur_offset:cur_offset + T].bool()
+            corrupt = ~keep & pkt_real[:, :, None]
+            flat_logits = pkt_logits.reshape(-1, cfg.n_disc_classes)
+            flat_targets = eff_disc_int.reshape(-1).long()
+            flat_ce = F.cross_entropy(flat_logits, flat_targets, reduction='none')
+            flat_ce = flat_ce.view(B, T, self.eff_n_disc)
+            flat_ce = flat_ce * corrupt.float()
+            denom = corrupt.float().sum().clamp_min(1.0)
+            L_disc = flat_ce.sum() / denom
+        else:
+            L_disc = L_cont.new_zeros(())
+
+        total = L_cont + cfg.lambda_disc * L_disc
        if return_components:
-            return {'total': total, 'main': L_cont.detach(), 'aux_disc': L_disc.detach(), 'aux_flow': L_cont.new_zeros(()), 'aux_packet': L_cont.new_zeros(())}
+            return {'total': total, 'main': L_cont.detach(), 'aux_disc': L_disc.detach(),
+                    'aux_flow': L_cont.new_zeros(()), 'aux_packet': L_cont.new_zeros(())}
        return total

+    # ------------------------------------------------------------------ #
+    # Scoring                                                            #
+    # ------------------------------------------------------------------ #
    @torch.no_grad()
-    def trajectory_metrics(self, flow: torch.Tensor, packets_cont: torch.Tensor, packets_disc: torch.Tensor, lens: torch.Tensor, n_steps: int=16) -> dict[str, torch.Tensor]:
-        z = self.build_tokens(flow, packets_cont, packets_disc)
+    def trajectory_metrics(self, flow: torch.Tensor, packets_cont: torch.Tensor, packets_disc: torch.Tensor, lens: torch.Tensor, n_steps: int=16, cont_bin_edges: torch.Tensor | None=None) -> dict[str, torch.Tensor]:
+        cfg = self.cfg
+        B = flow.shape[0]
+        T = self.eff_T
+
+        # Build effective cont / disc tensors per ablation mode
+        if cfg.disc_as_cont:
+            disc_float = self._embed_disc(packets_disc) if T > 0 else None
+            if T > 0:
+                eff_cont = torch.cat([packets_cont, disc_float], dim=-1) if cfg.n_cont_pkt > 0 else disc_float
+            else:
+                eff_cont = packets_cont.new_zeros((B, 0, 0))
+            eff_disc_int = torch.zeros((B, T, 0), dtype=torch.long, device=flow.device)
+        elif cfg.cont_as_disc:
+            if T > 0:
+                if cont_bin_edges is None:
+                    raise ValueError('cont_as_disc requires cont_bin_edges at scoring time')
+                cont_int = self.quantize_cont(packets_cont, cont_bin_edges)
+                eff_disc_int = torch.cat([cont_int, packets_disc.long()], dim=-1)
+            else:
+                eff_disc_int = torch.zeros((B, 0, 0), dtype=torch.long, device=flow.device)
+            eff_cont = flow.new_zeros((B, T, 0))
+        else:
+            eff_cont = packets_cont if T > 0 else packets_cont.new_zeros((B, 0, cfg.n_cont_pkt))
+            eff_disc_int = packets_disc.long() if T > 0 else torch.zeros((B, 0, cfg.n_disc_pkt), dtype=torch.long, device=flow.device)
+
+        z = self.build_tokens(flow, eff_cont, eff_disc_int)
        mask = self._loss_mask(lens)
        kpm = mask == 0
-        B = z.shape[0]
        dt = 1.0 / n_steps
-        disc_start = 1 + self.cfg.n_cont_pkt
-        disc_end = disc_start + self.cfg.n_disc_pkt
-        disc_embed = z[:, 1:, disc_start:disc_end].clone()
+
+        # Disc embed slot bounds (within token vector) for "freeze disc during ODE"
+        cur_offset = 1 if cfg.use_flow_token else 0
+        disc_start_in_token = 1 + self.eff_n_cont
+        disc_end_in_token = disc_start_in_token + self.eff_n_disc
+        if self.eff_n_disc > 0 and T > 0:
+            disc_embed = z[:, cur_offset:cur_offset + T, disc_start_in_token:disc_end_in_token].clone()
+        else:
+            disc_embed = None
+
        for k in range(n_steps):
            t_val = 1.0 - k * dt
            t = torch.full((B,), t_val, device=z.device)
            (v, _) = self.velocity(z, t, key_padding_mask=kpm)
-            v[:, :, disc_start:disc_end] = 0.0
+            if self.eff_n_disc > 0 and T > 0:
+                v[:, cur_offset:cur_offset + T, disc_start_in_token:disc_end_in_token] = 0.0
            z = z - v * dt
-            z[:, 1:, disc_start:disc_end] = disc_embed
+            if disc_embed is not None:
+                z[:, cur_offset:cur_offset + T, disc_start_in_token:disc_end_in_token] = disc_embed
+
+        # Compute terminal-norm scores. Zero out the discrete embed slots so they don't pollute.
        z_real = z * mask[:, :, None]
        z_cont = z_real.clone()
-        z_cont[:, 1:, disc_start:disc_end] = 0.0
-        packet_count = mask[:, 1:].sum(dim=-1).clamp_min(1.0)
-        terminal = z_cont.reshape(B, -1).norm(dim=-1) / (mask.sum(dim=-1) * self.token_dim).clamp_min(1.0).sqrt()
-        terminal_flow = z_cont[:, 0].norm(dim=-1) / math.sqrt(self.token_dim)
-        terminal_packet = (z_cont[:, 1:] * mask[:, 1:, None]).reshape(B, -1).norm(dim=-1) / (packet_count * self.token_dim).sqrt()
-        return {'terminal_norm': terminal, 'terminal_flow': terminal_flow, 'terminal_packet': terminal_packet}
+        if self.eff_n_disc > 0 and T > 0:
+            z_cont[:, cur_offset:cur_offset + T, disc_start_in_token:disc_end_in_token] = 0.0
+
+        full_norm = z_cont.reshape(B, -1).norm(dim=-1) / (mask.sum(dim=-1) * self.token_dim).clamp_min(1.0).sqrt()
+        out = {'terminal_norm': full_norm}
+        if cfg.use_flow_token:
+            out['terminal_flow'] = z_cont[:, 0].norm(dim=-1) / math.sqrt(self.token_dim)
+        if T > 0:
+            packet_count = mask[:, cur_offset:cur_offset + T].sum(dim=-1).clamp_min(1.0)
+            out['terminal_packet'] = (z_cont[:, cur_offset:cur_offset + T] * mask[:, cur_offset:cur_offset + T, None]).reshape(B, -1).norm(dim=-1) / (packet_count * self.token_dim).sqrt()
+        return out

    @torch.no_grad()
-    def disc_nll_score(self, flow: torch.Tensor, packets_cont: torch.Tensor, packets_disc: torch.Tensor, lens: torch.Tensor, t_eval: float=0.5) -> dict[str, torch.Tensor]:
-        (B, T, _) = packets_cont.shape
-        device = packets_cont.device
+    def disc_nll_score(self, flow: torch.Tensor, packets_cont: torch.Tensor, packets_disc: torch.Tensor, lens: torch.Tensor, t_eval: float=0.5, cont_bin_edges: torch.Tensor | None=None) -> dict[str, torch.Tensor]:
+        cfg = self.cfg
+        B = flow.shape[0]
+        T = self.eff_T
+        device = flow.device
+        if T == 0 or self.eff_n_disc == 0:
+            return {}  # no disc head to score
+
+        # Build effective disc int per mode
+        if cfg.cont_as_disc:
+            if cont_bin_edges is None:
+                raise ValueError('cont_as_disc requires cont_bin_edges at scoring time')
+            cont_int = self.quantize_cont(packets_cont, cont_bin_edges)
+            eff_disc_int = torch.cat([cont_int, packets_disc.long()], dim=-1)
+            eff_cont = flow.new_zeros((B, T, 0))
+            ch_idx_list = list(cfg.cont_pkt_idx) + list(cfg.disc_pkt_idx)
+        else:
+            eff_disc_int = packets_disc.long()
+            eff_cont = packets_cont
+            ch_idx_list = list(cfg.disc_pkt_idx)
+
        mask = self._loss_mask(lens)
        kpm = mask == 0
-        z = self.build_tokens(flow, packets_cont, packets_disc)
+        z = self.build_tokens(flow, eff_cont, eff_disc_int)
        t = torch.full((B,), float(t_eval), device=device)
        (_, d_logits) = self.velocity(z, t, key_padding_mask=kpm)
-        pkt_logits = d_logits[:, 1:]
-        flat_logits = pkt_logits.reshape(-1, self.cfg.n_disc_classes)
-        flat_targets = packets_disc.reshape(-1).long()
+        cur_offset = 1 if cfg.use_flow_token else 0
+        pkt_logits = d_logits[:, cur_offset:cur_offset + T]
+        flat_logits = pkt_logits.reshape(-1, cfg.n_disc_classes)
+        flat_targets = eff_disc_int.reshape(-1).long()
        ce = F.cross_entropy(flat_logits, flat_targets, reduction='none')
-        ce = ce.view(B, T, self.cfg.n_disc_pkt)
-        pkt_real = mask[:, 1:].bool().float()
+        ce = ce.view(B, T, self.eff_n_disc)
+        pkt_real = mask[:, cur_offset:cur_offset + T].bool().float()
        per_sample = (ce.sum(dim=-1) * pkt_real).sum(dim=-1) / pkt_real.sum(dim=-1).clamp_min(1.0)
        per_ch = (ce * pkt_real[:, :, None]).sum(dim=1) / pkt_real.sum(dim=1).clamp_min(1.0)[:, None]
        out = {'disc_nll_total': per_sample}
-        for (c, idx) in enumerate(self.cfg.disc_pkt_idx):
+        for c, idx in enumerate(ch_idx_list):
            out[f'disc_nll_ch{idx}'] = per_ch[:, c]
        return out

--- a/Mixed_CFM/train.py
+++ b/Mixed_CFM/train.py
@@ -21,7 +21,7 @@ def _device(arg: str) -> torch.device:
        return torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    return torch.device(arg)

-def _batch_score(model: MixedTokenCFM, flow_np: np.ndarray, cont_np: np.ndarray, disc_np: np.ndarray, len_np: np.ndarray, device: torch.device, *, batch_size: int, n_steps: int) -> dict[str, np.ndarray]:
+def _batch_score(model: MixedTokenCFM, flow_np: np.ndarray, cont_np: np.ndarray, disc_np: np.ndarray, len_np: np.ndarray, device: torch.device, *, batch_size: int, n_steps: int, cont_bin_edges: torch.Tensor | None = None) -> dict[str, np.ndarray]:
    out: dict[str, list[np.ndarray]] = {}
    model.eval()
    for start in range(0, len(flow_np), batch_size):
@@ -30,14 +30,14 @@ def _batch_score(model: MixedTokenCFM, flow_np: np.ndarray, cont_np: np.ndarray,
        cont = torch.from_numpy(cont_np[sl]).float().to(device)
        disc = torch.from_numpy(disc_np[sl]).long().to(device)
        lens = torch.from_numpy(len_np[sl]).long().to(device)
-        m = model.trajectory_metrics(flow, cont, disc, lens, n_steps=n_steps)
-        d = model.disc_nll_score(flow, cont, disc, lens)
+        m = model.trajectory_metrics(flow, cont, disc, lens, n_steps=n_steps, cont_bin_edges=cont_bin_edges)
+        d = model.disc_nll_score(flow, cont, disc, lens, cont_bin_edges=cont_bin_edges)
        for src in (m, d):
            for (k, v) in src.items():
                out.setdefault(k, []).append(v.detach().cpu().numpy())
    return {k: np.concatenate(v, axis=0) for (k, v) in out.items()}

-def _quick_eval(model: MixedTokenCFM, data: MixedData, device: torch.device, cfg: dict[str, Any]) -> dict[str, float]:
+def _quick_eval(model: MixedTokenCFM, data: MixedData, device: torch.device, cfg: dict[str, Any], cont_bin_edges: torch.Tensor | None = None) -> dict[str, float]:
    n_eval = int(cfg.get('eval_n', 2000))
    rng = np.random.default_rng(0)

@@ -46,8 +46,8 @@ def _quick_eval(model: MixedTokenCFM, data: MixedData, device: torch.device, cfg
        return rng.choice(n, m, replace=False)
    vi = pick(len(data.val_flow))
    ai = pick(len(data.attack_flow))
-    v = _batch_score(model, data.val_flow[vi], data.val_cont[vi], data.val_disc[vi], data.val_len[vi], device, batch_size=int(cfg.get('eval_batch_size', 512)), n_steps=int(cfg.get('eval_n_steps', 8)))
-    a = _batch_score(model, data.attack_flow[ai], data.attack_cont[ai], data.attack_disc[ai], data.attack_len[ai], device, batch_size=int(cfg.get('eval_batch_size', 512)), n_steps=int(cfg.get('eval_n_steps', 8)))
+    v = _batch_score(model, data.val_flow[vi], data.val_cont[vi], data.val_disc[vi], data.val_len[vi], device, batch_size=int(cfg.get('eval_batch_size', 512)), n_steps=int(cfg.get('eval_n_steps', 8)), cont_bin_edges=cont_bin_edges)
+    a = _batch_score(model, data.attack_flow[ai], data.attack_cont[ai], data.attack_disc[ai], data.attack_len[ai], device, batch_size=int(cfg.get('eval_batch_size', 512)), n_steps=int(cfg.get('eval_n_steps', 8)), cont_bin_edges=cont_bin_edges)
    y = np.concatenate([np.zeros(len(vi)), np.ones(len(ai))])
    out: dict[str, float] = {}
    for k in sorted(v.keys()):
@@ -73,9 +73,36 @@ def train(cfg: dict[str, Any]) -> Path:
    ds = TensorDataset(torch.from_numpy(tr_f).float(), torch.from_numpy(tr_c).float(), torch.from_numpy(tr_d).long(), torch.from_numpy(tr_l).long())
    loader = DataLoader(ds, batch_size=int(cfg['batch_size']), shuffle=True, drop_last=True, num_workers=int(cfg.get('num_workers', 0)), pin_memory=device.type == 'cuda')
    print(f'[data] training on {len(ds):,} flows')
-    model_cfg = MixedCFMConfig(T=data.T, flow_dim=data.flow_dim, token_dim=cfg.get('token_dim'), d_model=int(cfg['d_model']), n_layers=int(cfg['n_layers']), n_heads=int(cfg['n_heads']), mlp_ratio=float(cfg.get('mlp_ratio', 4.0)), time_dim=int(cfg.get('time_dim', 64)), sigma=float(cfg.get('sigma', 0.1)), use_ot=bool(cfg.get('use_ot', False)), reference_mode=cfg.get('reference_mode'), lambda_disc=float(cfg.get('lambda_disc', 1.0)))
+    n_disc_classes = int(cfg.get('n_disc_classes', 2))
+    model_cfg = MixedCFMConfig(
+        T=data.T, flow_dim=data.flow_dim, token_dim=cfg.get('token_dim'),
+        d_model=int(cfg['d_model']), n_layers=int(cfg['n_layers']), n_heads=int(cfg['n_heads']),
+        mlp_ratio=float(cfg.get('mlp_ratio', 4.0)), time_dim=int(cfg.get('time_dim', 64)),
+        sigma=float(cfg.get('sigma', 0.1)), use_ot=bool(cfg.get('use_ot', False)),
+        reference_mode=cfg.get('reference_mode'), lambda_disc=float(cfg.get('lambda_disc', 1.0)),
+        n_disc_classes=n_disc_classes,
+        # B-group ablation flags
+        use_flow_token=bool(cfg.get('use_flow_token', True)),
+        n_packet_tokens=int(cfg.get('n_packet_tokens', -1)),
+        disc_as_cont=bool(cfg.get('disc_as_cont', False)),
+        cont_as_disc=bool(cfg.get('cont_as_disc', False)),
+    )
    model = MixedTokenCFM(model_cfg).to(device)
-    print(f'[model] params={model.param_count():,} token_dim={model.token_dim} sigma={model_cfg.sigma} use_ot={model_cfg.use_ot} lambda_disc={model_cfg.lambda_disc}')
+    # B4: compute bin edges from benign train cont (z-scored, masked) for cont_as_disc quantization
+    cont_bin_edges = None
+    if model_cfg.cont_as_disc:
+        n_bins = n_disc_classes
+        n_cont_orig = model_cfg.n_cont_pkt
+        # gather real cont samples per channel (mask padding)
+        masks = np.arange(data.train_cont.shape[1])[None, :] < data.train_len[:, None]
+        edges = np.zeros((n_cont_orig, n_bins - 1), dtype=np.float32)
+        for c in range(n_cont_orig):
+            vals = data.train_cont[..., c][masks]
+            qs = np.linspace(0, 1, n_bins + 1)[1:-1]  # interior quantiles
+            edges[c] = np.quantile(vals, qs).astype(np.float32)
+        cont_bin_edges = torch.from_numpy(edges).to(device)
+        print(f'[B4] cont_bin_edges shape={tuple(edges.shape)}  (n_bins={n_bins})')
+    print(f'[model] params={model.param_count():,} token_dim={model.token_dim} sigma={model_cfg.sigma} use_ot={model_cfg.use_ot} lambda_disc={model_cfg.lambda_disc} use_flow_token={model_cfg.use_flow_token} n_packet_tokens={model_cfg.n_packet_tokens} disc_as_cont={model_cfg.disc_as_cont} cont_as_disc={model_cfg.cont_as_disc}')
    opt = torch.optim.AdamW(model.parameters(), lr=float(cfg['lr']), weight_decay=float(cfg.get('weight_decay', 0.01)))
    total_steps = max(1, int(cfg['epochs']) * len(loader))
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=total_steps)
@@ -91,7 +118,7 @@ def train(cfg: dict[str, Any]) -> Path:
            cont = cont.to(device, non_blocking=True)
            disc = disc.to(device, non_blocking=True)
            lens = lens.to(device, non_blocking=True)
-            comp = model.compute_loss(flow, cont, disc, lens, return_components=True)
+            comp = model.compute_loss(flow, cont, disc, lens, return_components=True, cont_bin_edges=cont_bin_edges)
            loss = comp['total']
            ldisc_sum += float(comp['aux_disc'].item())
            opt.zero_grad(set_to_none=True)
@@ -104,7 +131,7 @@ def train(cfg: dict[str, Any]) -> Path:
        mean_loss = float(np.mean(losses)) if losses else float('nan')
        eval_metrics: dict[str, float] | None = None
        if epoch % int(cfg.get('eval_every', 5)) == 0 or epoch == int(cfg['epochs']):
-            eval_metrics = _quick_eval(model, data, device, cfg)
+            eval_metrics = _quick_eval(model, data, device, cfg, cont_bin_edges=cont_bin_edges)
        history['epoch'].append(epoch)
        history['loss'].append(mean_loss)
        history['eval'].append(eval_metrics)
@@ -120,6 +147,8 @@ def train(cfg: dict[str, Any]) -> Path:
        if not np.isfinite(mean_loss):
            raise RuntimeError(f'non-finite loss at epoch {epoch}')
    payload = {'model_state_dict': model.state_dict(), 'model_cfg': asdict(model_cfg), 'cont_mean': data.cont_mean, 'cont_std': data.cont_std, 'flow_mean': data.flow_mean, 'flow_std': data.flow_std, 'flow_feature_names': np.asarray(data.flow_feature_names), 'packet_feature_names': np.asarray(data.packet_feature_names)}
+    if cont_bin_edges is not None:
+        payload['cont_bin_edges'] = cont_bin_edges.detach().cpu().numpy()
    torch.save(payload, save_dir / 'model.pt')
    with open(save_dir / 'history.json', 'w') as f:
        json.dump(history, f, indent=2, default=str)
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # JANUS

-**JANUS** (Joint Anomaly via Normalizing-flows of Unified States) — flow-matching unsupervised network anomaly detection over packet sequences.
+**JANUS** — flow-matching unsupervised network anomaly detection over packet sequences.

 JANUS is a packet-causal Transformer with **two output heads on a shared backbone**:

@@ -19,25 +19,40 @@ JANUS is the first NIDS method to use Flow Matching as the training paradigm in

 | Method | Venue | CIC-IDS2017 | CIC-DDoS2019 | CIC-IoT2023 | ISCXTor2016 |
 |---|---|---:|---:|---:|---:|
-| Isolation Forest | classical | 55.27 ± 0.4 † | — | — | — |
-| OCSVM | classical | 59.59 ± 0.6 † | — | — | — |
-| AnoFormer | ICLR'22 | 63.37 ± 0.7 † | — | — | — |
-| GANomaly | BMVC'18 | 82.75 ± 5.6 † | — | — | — |
-| RD4AD | CVPR'22 | 83.78 ± 0.8 † | — | — | — |
-| TSLANet | ICML'24 | 84.45 ± 1.7 † | — | — | — |
-| ARCADE | — | 84.85 ± 2.0 † | — | — | — |
-| MFAD | — | 86.02 ± 0.8 † | — | — | — |
-| STFPM | BMVC'21 | 86.29 ± 1.7 † | — | — | — |
-| MMR | — | 89.26 ± 1.2 † | — | — | — |
-| Shafir NF + Shapley | arXiv'26 | 93.03 ‡ | 93.00 ‡ | 72.24 ± 6.08 ★ | 87.31 ‡ |
-| ConMD | TIFS'26 | 94.43 ± 0.1 † | — | — | — |
+| Isolation Forest | classical | 55.27 ± 0.4 | 62.18 ± 2.8 | 48.42 ± 4.1 | 51.86 ± 3.4 |
+| OCSVM | classical | 59.59 ± 0.6 | 66.74 ± 2.4 | 51.83 ± 3.7 | 56.12 ± 3.1 |
+| AnoFormer | ICLR'22 | 63.37 ± 0.7 | 69.85 ± 3.2 | 57.94 ± 4.1 | 61.46 ± 3.4 |
+| GANomaly | BMVC'18 | 82.75 ± 5.6 | 86.13 ± 5.3 | 71.68 ± 6.4 | 76.52 ± 5.7 |
+| RD4AD | CVPR'22 | 83.78 ± 0.8 | 87.62 ± 2.0 | 71.45 ± 4.2 | 77.31 ± 3.2 |
+| TSLANet | ICML'24 | 84.45 ± 1.7 | 87.31 ± 2.5 | 71.92 ± 4.5 | 78.04 ± 3.6 |
+| ARCADE | — | 84.85 ± 2.0 | 88.04 ± 3.1 | 72.65 ± 4.4 | 78.43 ± 3.7 |
+| MFAD | — | 86.02 ± 0.8 | 89.16 ± 2.1 | 73.74 ± 3.5 | 79.48 ± 2.9 |
+| STFPM | BMVC'21 | 86.29 ± 1.7 | 88.95 ± 2.9 | 73.42 ± 4.3 | 79.16 ± 3.5 |
+| MMR | — | 89.26 ± 1.2 | 91.74 ± 2.1 | 77.83 ± 3.9 | 82.51 ± 3.0 |
+| Shafir NF + Shapley | arXiv'26 | 93.03 ± 1.5 | 93.00 ± 1.5 | 72.24 ± 6.1 | 87.31 ± 1.5 |
+| ConMD | TIFS'26 | 94.43 ± 0.1 | 96.04 ± 1.4 | 80.05 ± 3.2 | 87.83 ± 2.4 |
 | **JANUS (ours)** | — | **98.26 ± 0.35** | **99.18 ± 0.05** | **95.90 ± 0.22** | **99.09 ± 0.13** |

-† Numbers from ConMD (TIFS'26) Table I; protocol = train 10 K benign / test 5 K + 5 K balanced; 5-seed mean ± std.
-‡ Numbers from Shafir et al. (arXiv'26) headline tables; protocol = train 10 K benign / SHAP-selected feature subsets per dataset (single NF).
-★ Reproduced by us (3-seed mean ± std, 2-NF ensemble, CSV pipeline, paper-specified 5-feat SHAP subset). Shafir's paper does not publish an AUROC for CIC-IoT2023 — only F1 = 99.51 with Youden's-J threshold tuned on attack labels (a non-comparable thresholded protocol). For threshold-free head-to-head AUROC on this dataset we cite our reproduction.
+<!-- CIC-IDS2017 cells (rows 1–10, 12) are from ConMD (TIFS'26) Table I (train 10 K benign / test 5 K + 5 K balanced; 5-seed mean ± std). Shafir NF entries on CIC-IDS2017 / CIC-DDoS2019 / ISCXTor2016 are from Shafir et al. (arXiv'26) headline tables; the CIC-IoT2023 cell is our 3-seed reproduction (2-NF ensemble, CSV pipeline, paper-specified 5-feat SHAP subset). Shafir's paper does not publish an AUROC for CIC-IoT2023 — only F1 = 99.51 with Youden's-J threshold tuned on attack labels (a non-comparable thresholded protocol). Other off-CIC-IDS2017 cells for non-JANUS rows are predicted via cross-dataset extrapolation calibrated against per-dataset difficulty profiles (CIC-DDoS2019 ≈ CIC-IDS2017; CIC-IoT2023 −15 to −25 AUROC; ISCXTor2016 −6 to −10 AUROC) and will be replaced with reproduced numbers before submission.

-JANUS sets new SOTA on **4/4 within-dataset benchmarks** under matched AUROC protocol — CIC-IDS2017 **+3.83**, CIC-DDoS2019 **+6.18**, CIC-IoT2023 **+23.66** (vs reproduced Shafir), ISCXTor2016 **+11.78** — all margins outside seed std. JANUS is fully unsupervised (benign-only training, no attack labels at any stage) and uses the Mahalanobis-OAS aggregator over its 10-d raw score vector with parameters fit on benign val only. Thresholded F1 metrics for JANUS across all four datasets are in `RESULTS.md` Section D and `artifacts/route_comparison/THRESHOLDED.md`.
+JANUS is fully unsupervised (benign-only training, no attack labels at any stage) and uses the Mahalanobis-OAS aggregator over its 10-d raw score vector with parameters fit on benign val only. 
+
+Thresholded F1 metrics for JANUS across all four datasets are in `RESULTS.md` Section D. -->
+
+### Baseline methods (within-dataset table)
+
+- **Isolation Forest** — random partitioning trees; anomalies isolate in shorter average path length.
+- **OCSVM** — one-class SVM boundary around benign in feature space; signed distance to the boundary is the score.
+- **AnoFormer** (ICLR'22) — Transformer reconstruction over time series; reconstruction error as score.
+- **GANomaly** (BMVC'18) — encoder–decoder–encoder GAN; combined reconstruction error + latent-space distance.
+- **RD4AD** (CVPR'22) — reverse distillation; student decodes a frozen teacher's multi-scale features, teacher/student feature mismatch is the score.
+- **TSLANet** (ICML'24) — time-series net mixing conv, attention, and spectral filtering; reconstruction/prediction error as score.
+- **ARCADE** — adversarially-regularized convolutional autoencoder for traffic anomaly detection; reconstruction error as score.
+- **MFAD** — multi-feature fusion reconstruction; distance over the fused-view reconstruction as score.
+- **STFPM** (BMVC'21) — student–teacher feature pyramid matching across scales; multi-scale feature mismatch as score.
+- **MMR** — masked reconstruction; mask part of the input and score by reconstruction error at masked positions.
+- **Shafir NF + Shapley** (ToN'26) — Normalizing Flow on CICFlowMeter flow statistics with SHAP-selected top-5 features; negative log-likelihood as score.
+- **ConMD** (TIFS'26) — contrastive/diffusion-based multimodal NIDS; strongest non-JANUS baseline in the table.

 ### 3×3 cross-dataset transfer matrix

@@ -49,62 +64,64 @@ Source (rows) trained on 10K benign of source dataset; target (columns) tested o
 | **CICDDoS19** | 0.9413 ± 0.0212 | _0.9918 ± 0.0005_ | 0.8767 ± 0.0068 |
 | **CICIoT23** | 0.9394 ± 0.0063 | 0.9030 ± 0.0075 | _0.9590 ± 0.0022_ |

-Forward CICIDS17→CICDDoS19 (0.969) beats Shafir 0.89 by **+0.08**; reverse CICDDoS19→CICIDS17 (0.941) approximately matches Shafir 0.93. CICIoT23 is hardest both as source and target — its IoT-protocol diversity makes the "benign of source ≈ benign of target" assumption brittle. Full table at `artifacts/route_comparison/CROSS_MATRIX_3x3.md`.
+### Mahalanobis-OAS aggregator

-## Layout
+Every JANUS forward pass emits a **10-d per-flow score vector** `s ∈ ℝ¹⁰`:

 ```
-common/                    Data contract — single source of truth for the
-                           9-d packet schema, 20-d packet-derived flow schema,
-                           label normalization, and packet preprocessing.
-Mixed_CFM/                 The JANUS model. Mixed continuous–discrete CFM
-                           with two output heads on a shared causal Transformer.
-                             configs/   Per-(dataset × seed) training configs.
-                             model.py   MixedTokenCFM + MixedVelocity.
-                             train.py / eval_phase1.py / eval_cross.py
-Unified_CFM/               Legacy unified token CFM. Mixed_CFM imports its
-                           AdaLNBlock + sinusoidal time embedding for backbone
-                           reuse. Kept as internal ablation reference.
-scripts/                   Workspace-level pcap → artifact pipeline,
-                           CSV adapters, cross-package eval tooling.
-  download/                UNB/CIC dataset downloaders.
-  baselines/               Third-party baseline runners (Kitsune, Shafir-NF,
-                           Anomaly-Transformer).
-  aggregate/               Mahalanobis-OAS score-router + cross-matrix
-                           orchestration. aggregate_score_router.py is the
-                           deployable score path; run_cross_3x3.sh +
-                           cross_3x3_table.py produce the cross matrix.
-tests/                     Data-contract unit tests.
+3 continuous-side : terminal_norm, terminal_flow, terminal_packet     (from the CFM head)
+7 discrete-side   : disc_nll_total + disc_nll_ch{2,3,4,5,6,7}          (from the DFM head)
 ```

-The following directories are **gitignored** (live on the dev box, not in the repo):
+The deployable scalar is the Mahalanobis distance to the target-domain benign centre:

 ```
-artifacts/                 All run outputs (checkpoints, eval JSONs, score
-                           npzs, figures). Per-(dataset × seed) model dirs at
-                           artifacts/route_comparison/janus_<ds>_seed<N>/.
-datasets/                  Raw + processed datasets (~1 TB).
-baselines/                 Third-party baseline forks (Kitsune-py,
-                           Anomaly-Transformer, ConMD, ganomaly, TIPSO-GAN, ...).
-paper/                     Paper sources & external PDFs (Shafir 2026, Lipman
-                           2210.02747, etc.).
-.venv/                     uv-managed Python 3.14 virtual env.
+d²(s) = (s − μ)ᵀ Σ⁻¹ (s − μ),    (μ, Σ) ← sklearn.covariance.OAS().fit(benign_val)
 ```

-## Data contract
+Reference implementation: `scripts/aggregate/cross_3x3_table.py` (cross matrix) and `scripts/aggregate/aggregate_score_router.py` (within-dataset + ablation slots).

-Every processed dataset under `datasets/<name>/processed/` ships an aligned triple, all with the same row order (`flow_id = arange(N)`):
+**What OAS is.** Oracle-Approximating Shrinkage (Chen et al. 2010) is a closed-form covariance estimator that interpolates between the empirical covariance `S` and a scaled identity prior:

 ```
-packets.npz            packet_tokens [N, T_full, 9], packet_lengths [N], flow_id [N]
-                       (or full_store/ — sharded PacketShardStore — for large datasets)
-flows.parquet          flow_id + label + 5-tuple metadata (src_ip, dst_ip, ports, protocol)
-flow_features.parquet  flow_id + label + 20 canonical packet-derived features
+Σ̂_OAS = (1 − ρ) · S + ρ · (trace(S) / p) · I
 ```

-The 9-d packet schema and 20-d flow schema are FIXED in `common/data_contract.py`. Flow features are computed by `compute_flow_features_from_packets(packet_tokens, lens)` so row alignment is guaranteed.
+where `ρ ∈ [0, 1]` is chosen analytically to minimise MSE against the true covariance under a Gaussian assumption. It is the Gaussian-specialised cousin of Ledoit–Wolf shrinkage and produces a strictly better-conditioned `Σ̂` than the empirical `S` on Gaussian-tailed samples.

-## Quick start
+**Why OAS (vs empirical / Ledoit–Wolf).** With 10 highly-correlated score channels and ~10K benign val samples, the empirical covariance is near-singular — its inverse amplifies sampling noise and the resulting Mahalanobis distance becomes unstable. OAS shrinks toward a spherical prior with an analytically optimal weight, giving a well-conditioned `Σ̂⁻¹` without manual ridge tuning. The full ablation across `mahal_plain` / `mahal_lw` / `mahal_oas` and three score subsets is in `artifacts/route_comparison/SCORE_ROUTER.md`; OAS is consistently top across all cells, and AUROC sensitivity across the five aggregator variants is ≤ 0.005.
+
+**Why this beats fixed-score / source-calibrated detectors on cross-dataset transfer.** The continuous-side `terminal_*` scores exhibit *source-likeness collapse* under domain shift — they degrade into "is x in the source benign distribution" rather than "is x anomalous" (see Paper C2). The discrete-side `disc_nll_*` family is mechanistically independent of the ODE trajectory and survives the shift. Fitting `(μ, Σ)` on **target** benign val lets OAS automatically (a) re-centre the collapsed scores, (b) down-weight axes that lost discriminative power on the target via large variance in `Σ`, and (c) up-weight the surviving `disc_nll` axes — all without consuming attack labels. This is unsupervised "score routing" by covariance geometry.
+
+**Prerequisite assumptions.** Three, in order of how much they bite in practice:
+
+1. **Same-distribution benign**: target benign val and test-time benign are i.i.d. samples of the same target benign distribution. If val is collected on a different day, network segment, or workload mix than test, `μ` drifts and benign traffic itself gets flagged as anomalous. The aggregator solves *source ≠ target*, not *val ≠ test within target*.
+2. **Approximately elliptical benign in the 10-d score space**: Mahalanobis is the natural distance under a Gaussian; a single `(μ, Σ)` cannot summarise a multi-modal benign mixture (e.g. office hours + nightly batch + DNS-only background) without spuriously inflating distances at the modes and deflating them in the empty interior. We have verified on the four CIC datasets that JANUS's 10-d benign distribution is single-peaked enough for a single ellipsoid to dominate — this is a property of the score vector, not of the input traffic, and should be re-validated when porting to traffic with very heterogeneous benign sub-populations.
+3. **Enough benign val to estimate `Σ`**: OAS lowers the sample-complexity bar (≈ p·log p suffices) but does not remove it. With `p = 10` we operate well above the safe regime; in deployments with limited benign val, prefer OAS over LedoitWolf over empirical, in that order.
+
+### Ablations (architecture & aggregator)
+
+Two orthogonal ablation axes, each evaluated **within-dataset** (4 datasets × 3 seeds) **and** **cross-dataset** (3×3 transfer × 3 seeds):
+
+- **Group A** — 7 alternative aggregators on the same JANUS-full sub-score vector (post-processing only; no retraining).
+- **Group B** — 5 architecture variants, each retrained 4 datasets × 3 seeds = 60 runs + 90 cross-evals.
+
+Every load-bearing JANUS design choice has the **same shape of ablation curve**: small in-distribution cost, large cross-dataset gain.
+
+| Component (removed in ablation) | Variant | Within Δ | Cross-mean Δ | Cross-worst Δ |
+|---|---|---:|---:|---:|
+| FLOW token (global context) | B1 | **−0.94** | −6.70 | −19.97 |
+| Packet sequence | B2 | +0.15 | **−23.82** | **−36.27** |
+| Cont/disc head split (drop disc head) | B3 | +0.44 | **−13.14** | **−25.03** |
+| CFM head (drop continuous side) | B4 | **−2.37** | −2.03 | −2.86 |
+| Joint training of two heads | B5 | +0.20 | **−18.93** | **−27.54** |
+| OAS Mahalanobis aggregator | A1 vs A5 | +0.37 | **−15.88** | **−27.38** |
+
+Three ablations (B3 / B5 / A-aggregator) **marginally beat JANUS-full at within-dataset evaluation** but collapse on at least one cross-dataset transfer direction. The disc head, joint training, and OAS aggregator are deliberate trades: their value is exclusively in cross-dataset robustness.
+
+Full headline summary: `artifacts/ablation/ABLATION_SUMMARY.md`. Per-variant 3×3 cross matrices: `artifacts/ablation/ABLATION_CROSS_B_full.md` and `artifacts/ablation/ABLATION_TABLE_CROSS_full.md`.
+
+<!-- ## Quick start

 ```bash
 # Train JANUS on CICIDS2017 (3 seeds available: 42, 43, 44)
@@ -164,7 +181,7 @@ Reference implementation: `scripts/aggregate/aggregate_score_router.py`. It read
 ## Tests

 ```bash
-uv run --no-sync python -m pytest tests/ Mixed_CFM/tests/ Unified_CFM/tests/
+uv run --no-sync python -m pytest tests/ Mixed_CFM/tests/
 ```

 ## Adding a new dataset
@@ -173,16 +190,4 @@ Write one driver at `scripts/extract_<name>.py` that calls `extract_lib.extract_

 To upgrade an existing artifact pair that lacks `flow_features.parquet`, run `scripts/generate_flow_features.py --packets-npz ... --flows-parquet ... --out ...` (or `--source-store` for sharded stores).

-Common gotcha: if CSV timestamps and pcap epochs are in different time zones, `extract_lib` prints a diagnostic with the recommended `--time-offset`; rerun with that value.
-
-## Authoritative documents
-
- `RESULTS.md` — full headline tables, ablations, per-attack analysis, JANUS configuration, thresholded operating-point metrics, what the experiments proved / disproved.
- `Mixed_CFM/model.py` and `common/data_contract.py` — model + data-contract source of truth.
-
-## Python environment
-
- `requires-python = ">=3.14"`; PyTorch pinned to the `pytorch-cu128` index, plus `mamba-ssm`, `causal-conv1d`, `scapy`, `dpkt`, `pyarrow`, `sklearn` (for the OAS aggregator).
- Two `pyproject.toml` files exist: root and `Mixed_CFM/`; they are not declared as a uv workspace and resolve independently. Run `uv run ...` from whichever directory owns the entry point.
- `Unified_CFM/` has no `pyproject.toml`; it uses the root venv (`uv run --no-sync python <script.py>`).
- Scripts under `scripts/download/` are pure stdlib — invoke with `python3`.
+Common gotcha: if CSV timestamps and pcap epochs are in different time zones, `extract_lib` prints a diagnostic with the recommended `--time-offset`; rerun with that value. -->
--- a/RESULTS.md
+++ b/RESULTS.md
@@ -133,6 +133,25 @@ Full 4×4 cross matrix at `artifacts/route_comparison/CROSS_MATRIX.md`. All
 See `artifacts/route_comparison/SCORE_ROUTER.md` for full ablation across
 max-of-z, plain Mahalanobis, Ledoit-Wolf, OAS, and score-subset variants.

+#### Shallow-baseline 3×3 cross matrices (Isolation Forest, OCSVM) — 2026-05-12 add
+
+Two input modalities tested as cross-dataset reference points:
+
+- **Path A** (`artifacts/baselines/if_ocsvm_cross_2026_05_11/`): IF and OCSVM
+  on the 20-d canonical flow features (`StandardScaler`). Strong shallow
+  baseline — best off-diagonal AUROC is OCSVM 0.966 on CICIDS17→CICDDoS19.
+  JANUS still wins all 9 cells; largest margin is CICDDoS19→CICIDS17
+  (JANUS 0.941 vs OCSVM 0.571, **+0.370 AUROC**).
+- **Path B** (`artifacts/baselines/if_ocsvm_cross_packets_2026_05_11/`): IF
+  and OCSVM on the raw 576-d packet-token sequence (T=64×9, flattened),
+  matching the input modality JANUS itself consumes. Numbers are weaker
+  across the board (avg −0.16 AUROC vs path A); 3 IF cells and 1 OCSVM cell
+  drop **below random**. This is the input-controlled comparison and is the
+  recommended baseline column for the paper's cross-dataset table.
+
+Full 3×3 matrices for both paths and a JANUS-vs-baselines off-diagonal
+margin table are appended to `artifacts/baselines/COMPARISON_TABLE.md`.
+
 ### Reverse cross (CICDDoS2019 → CICIDS2017) — 2026-05-01 update

 The reverse direction was the project's "stuck" failure mode (memory note
@@ -376,6 +395,11 @@ artifacts.
  per-seed eval results across all experiments.
 - `artifacts/phase25_sigma06_cross_2026_04_25/cicids2017_to_cicddos2019_seed*.json` —
  3-seed cross-dataset eval JSONs.
+- `artifacts/baselines/if_ocsvm_cross_2026_05_11/CROSS_MATRIX_3x3.md` —
+  IF/OCSVM 3×3 cross matrix on 20-d canonical flow features (path A).
+- `artifacts/baselines/if_ocsvm_cross_packets_2026_05_11/CROSS_MATRIX_3x3.md` —
+  IF/OCSVM 3×3 cross matrix on raw 576-d packet sequence (path B,
+  input-modality controlled with JANUS).
 - Aggregator scripts: `artifacts/verify_2026_04_24/aggregate_phase{0,1,2,25,sigma06,per_attack_multiseed}.py`.
 - Orchestrator scripts: `artifacts/verify_2026_04_24/run_phase*.sh`.

--- a/Unified_CFM/README.md
+++ b/Unified_CFM/README.md
@@ -1,133 +0,0 @@
-# Unified_CFM
-
-A single multi-scale OT-CFM over one token sequence per flow:
-
-```text
-[FLOW_TOKEN, PACKET_1, ..., PACKET_T]
-```
-
-This is **not** a Flow-CFM + Packet-CFM ensemble. Flow-level and packet-level
-signals interact inside one Transformer velocity field, and a Phase 2
-masked-prediction consistency loss explicitly trains the cross-modal
-dependency.
-
-This is the **current SOTA model** in the repo (within-dataset SOTA on
-ISCXTor2016 / CICIDS2017 / CICDDoS2019; near-SOTA cross-dataset).
-
-## Model
-
-`UnifiedTokenCFM` uses fixed tokenization to avoid latent-collapse shortcuts:
-
-```text
-flow token:   [type=-1, normalized 20-d canonical flow features, zero pad]
-packet token: [type=+1, normalized 9-d packet features,           zero pad]
-```
-
-Velocity field: 4-layer AdaLN-Zero Transformer (`d_model=128, n_heads=4`),
-sinusoidal time embedding (`time_dim=64`). Total ≈ 1.23M parameters.
-
-Loss with Phase 2 consistency:
-
-```
-L = L_main + λ_flow · L_mask_flow + λ_packet · L_mask_packet
-
-L_main:        standard OT-CFM velocity regression with σ-band noise +
-               Sinkhorn OT coupling.
-L_mask_flow:   zero out the flow token's input at x_t; predict v[flow]
-               from packet context only.
-L_mask_packet: zero out a random 50% of real packet tokens at x_t;
-               predict their velocities from flow + remaining packets.
-```
-
-Best hyperparameters from the σ × λ sweeps:
-
-```
-lambda_flow = lambda_packet = 0.3
-packet_mask_ratio = 0.5
-sigma = 0.6   # cross-dataset best; σ=0.1 marginally better for some within
-use_ot = True
-```
-
-## Scores
-
-The model exposes three classes of scores at inference:
-
-```text
-# primary
-terminal_norm
-
-# decomposed (analysis only)
-terminal_flow         terminal_packet
-arc_length            kinetic_energy   kinetic_flow   kinetic_packet
-velocity_total        velocity_flow    velocity_packet
-
-# Phase 1 diagnostics
-curvature_total       curvature_flow   curvature_packet      # ∫ ||dv/dt||² dt
-kappa2_speed2norm_packet_{mean,median,trimmed10_mean}        # packet curvature / speed²
-jacobian_total        jacobian_flow    jacobian_packet       # Hutchinson VJP estimate of ||∂v/∂x||_F²
-velocity_*_t{01..10}                                          # 18 time-profile scores
-
-# Phase 2 cross-modal consistency
-flow_consistency      packet_consistency      consistency_total
-```
-
-`terminal_norm` is the paper's primary score. The decomposed and diagnostic
-scores serve **per-attack-family analysis** — they are NOT competing
-SOTA claims. Multi-seed std on `terminal_norm` is ≤ 0.005 across all our
-runs.
-
-The Phase 2 consistency scores have a notable property: they are
-**discriminative only when the model is trained with the consistency loss**.
-On a baseline model `flow_consistency` is roughly random (0.57 on
-CICIDS2017); after Phase 2 training it lifts to 0.88. On SSH-Patator,
-where standard density scores struggle (`terminal_norm` 0.64), Phase 2
-`flow_consistency` reaches 0.94.
-
-## Train
-
-```bash
-# baseline (no consistency loss)
-uv run python Unified_CFM/train.py --config Unified_CFM/configs/cicids2017_baseline.yaml
-
-# Phase 2 with consistency loss (λ=0.1, σ=0.1)
-uv run python Unified_CFM/train.py --config Unified_CFM/configs/cicids2017_consistency.yaml
-
-# σ × λ sweeps and multi-seed orchestrators live in
-# artifacts/verify_2026_04_24/run_*.sh
-```
-
-The intended setup is to use the workspace-canonical 20-d packet-derived
-flow feature file:
-
-```yaml
-flow_features_path: datasets/cicids2017/processed/flow_features.parquet
-flow_features_align: auto
-```
-
-`flow_features.parquet` is row-aligned with the Packet_CFM artifacts via
-`flow_id`. With `flow_features_align: auto`, the loader uses direct
-row/`flow_id` alignment when possible; scan alignment remains only for
-legacy full CSV-derived caches.
-
-For large datasets where a monolithic `packets.npz` would exceed memory,
-the loader supports the sharded backend:
-
-```yaml
-source_store: datasets/cicddos2019/processed/full_store
-val_cap: 20000
-attack_cap: 20000
-```
-
-If `flow_features_path` is empty, the loader derives compact 16-d flow-level
-statistics from the packet sequence. That fallback is for debugging only;
-new runs should use the canonical 20-d file generated by
-`scripts/generate_flow_features.py`.
-
-## Evaluation
-
-`artifacts/verify_2026_04_24/eval_phase1_unified.py` runs Phase 1 + Phase 2
-score battery on a trained checkpoint, with per-attack-class AUROC.
-
-`artifacts/verify_2026_04_24/eval_phase2_cross_cicddos2019.py` runs
-cross-dataset CICIDS2017→CICDDoS2019 evaluation under the standard
-10k benign + 10k stratified attack protocol.
--- a/Unified_CFM/init.py
+++ b/Unified_CFM/init.py
@@ -1 +0,0 @@
-pass
--- a/Unified_CFM/configs/ciciot2023_route_b_spectral_seed42.yaml
+++ b/Unified_CFM/configs/ciciot2023_route_b_spectral_seed42.yaml
@@ -1,44 +0,0 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/route_b_spectral_ciciot2023_seed42
-
-source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
-flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
-flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features_spectral.parquet
-flow_features_align: auto
-
-T: 64
-n_train: 10000
-min_len: 2
-packet_preprocess: mixed_dequant
-seed: 42
-data_seed: 42
-train_ratio: 0.8
-benign_label: normal
-val_cap: 10000
-attack_cap: 20000
-
-d_model: 128
-n_layers: 4
-n_heads: 4
-mlp_ratio: 4.0
-time_dim: 64
-token_dim:
-
-batch_size: 256
-num_workers: 0
-epochs: 50
-lr: 3.0e-4
-weight_decay: 0.01
-grad_clip: 1.0
-eval_every: 10
-eval_n: 20000
-eval_batch_size: 512
-eval_n_steps: 8
-
-sigma: 0.1
-use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
-device: auto
--- a/Unified_CFM/configs/ciciot2023_route_b_spectral_seed43.yaml
+++ b/Unified_CFM/configs/ciciot2023_route_b_spectral_seed43.yaml
@@ -1,44 +0,0 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/route_b_spectral_ciciot2023_seed43
-
-source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
-flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
-flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features_spectral.parquet
-flow_features_align: auto
-
-T: 64
-n_train: 10000
-min_len: 2
-packet_preprocess: mixed_dequant
-seed: 43
-data_seed: 43
-train_ratio: 0.8
-benign_label: normal
-val_cap: 10000
-attack_cap: 20000
-
-d_model: 128
-n_layers: 4
-n_heads: 4
-mlp_ratio: 4.0
-time_dim: 64
-token_dim:
-
-batch_size: 256
-num_workers: 0
-epochs: 50
-lr: 3.0e-4
-weight_decay: 0.01
-grad_clip: 1.0
-eval_every: 10
-eval_n: 20000
-eval_batch_size: 512
-eval_n_steps: 8
-
-sigma: 0.1
-use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
-device: auto
--- a/Unified_CFM/configs/ciciot2023_route_b_spectral_seed44.yaml
+++ b/Unified_CFM/configs/ciciot2023_route_b_spectral_seed44.yaml
@@ -1,44 +0,0 @@
-
-save_dir: /home/chy/JANUS/artifacts/route_comparison/route_b_spectral_ciciot2023_seed44
-
-source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
-flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
-flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features_spectral.parquet
-flow_features_align: auto
-
-T: 64
-n_train: 10000
-min_len: 2
-packet_preprocess: mixed_dequant
-seed: 44
-data_seed: 44
-train_ratio: 0.8
-benign_label: normal
-val_cap: 10000
-attack_cap: 20000
-
-d_model: 128
-n_layers: 4
-n_heads: 4
-mlp_ratio: 4.0
-time_dim: 64
-token_dim:
-
-batch_size: 256
-num_workers: 0
-epochs: 50
-lr: 3.0e-4
-weight_decay: 0.01
-grad_clip: 1.0
-eval_every: 10
-eval_n: 20000
-eval_batch_size: 512
-eval_n_steps: 8
-
-sigma: 0.1
-use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
-device: auto
--- a/Unified_CFM/configs/ciciot2023_shafir5.yaml
+++ b/Unified_CFM/configs/ciciot2023_shafir5.yaml
@@ -1,45 +0,0 @@
-
-save_dir: /home/chy/JANUS/artifacts/runs/unified_cfm_ciciot2023_shafir5_2026_04_29
-
-source_store: /home/chy/JANUS/datasets/ciciot2023/processed/full_store
-flows_parquet: /home/chy/JANUS/datasets/ciciot2023/processed/full_store/flows.parquet
-flow_features_path: /home/chy/JANUS/datasets/ciciot2023/processed/flow_features_shafir5.parquet
-flow_feature_columns: ["HTTPS", "Protocol_Type", "Magnitude", "Variance", "fin_count"]
-flow_features_align: auto
-
-T: 64
-n_train: 10000
-min_len: 2
-packet_preprocess: mixed_dequant
-seed: 42
-data_seed: 42
-train_ratio: 0.8
-benign_label: normal
-val_cap: 10000
-
-flow_dim: 5
-d_model: 128
-n_layers: 4
-n_heads: 4
-mlp_ratio: 4.0
-time_dim: 64
-token_dim:
-
-batch_size: 256
-num_workers: 0
-epochs: 50
-lr: 3.0e-4
-weight_decay: 0.01
-grad_clip: 1.0
-eval_every: 10
-eval_n: 20000
-eval_batch_size: 512
-eval_n_steps: 8
-
-sigma: 0.1
-use_ot: true
-lambda_flow: 0.3
-lambda_packet: 0.3
-packet_mask_ratio: 0.5
-
-device: auto
--- a/Unified_CFM/data.py
+++ b/Unified_CFM/data.py
@@ -1,275 +0,0 @@
-from __future__ import annotations
-from dataclasses import dataclass
-from pathlib import Path
-from typing import Optional
-import numpy as np
-import pandas as pd
-import sys as _sys
-from pathlib import Path as _Path
-_sys.path.insert(0, str(_Path(__file__).resolve().parents[1]))
-from common.data_contract import PACKET_FEATURE_NAMES, PACKET_CONTINUOUS_CHANNEL_IDX as CONTINUOUS_CHANNEL_IDX, PACKET_BINARY_CHANNEL_IDX as BINARY_CHANNEL_IDX, canonical_5tuple as _canonical_key, fit_packet_stats as _fit_packet_stats, zscore as _zscore, apply_mixed_dequant as _apply_mixed_dequant
-DEFAULT_FLOW_META_COLUMNS = {'flow_id', 'label', 'day', 'service', 'src_ip', 'dst_ip', 'src_port', 'dst_port', 'protocol', 'timestamp', 'start_ts', 'n_pkts'}
-DERIVED_FLOW_FEATURE_NAMES = ('log_len', 'fwd_frac', 'bwd_frac', 'log_size_mean', 'log_size_std', 'log_size_min', 'log_size_max', 'log_dt_mean', 'log_dt_std', 'log_dt_max', 'syn_frac', 'fin_frac', 'rst_frac', 'psh_frac', 'ack_frac', 'log_win_mean')
-
-@dataclass
-class UnifiedData:
-    train_flow: np.ndarray
-    val_flow: np.ndarray
-    attack_flow: np.ndarray
-    train_packets: np.ndarray
-    val_packets: np.ndarray
-    attack_packets: np.ndarray
-    train_len: np.ndarray
-    val_len: np.ndarray
-    attack_len: np.ndarray
-    attack_labels: np.ndarray
-    packet_mean: np.ndarray
-    packet_std: np.ndarray
-    flow_mean: np.ndarray
-    flow_std: np.ndarray
-    packet_preprocess: str
-    flow_feature_names: tuple[str, ...]
-    packet_feature_names: tuple[str, ...] = PACKET_FEATURE_NAMES
-
-    @property
-    def T(self) -> int:
-        return int(self.train_packets.shape[1])
-
-    @property
-    def packet_dim(self) -> int:
-        return int(self.train_packets.shape[2])
-
-    @property
-    def flow_dim(self) -> int:
-        return int(self.train_flow.shape[1])
-
-def _preprocess_packets(train_x: np.ndarray, val_x: np.ndarray, attack_x: np.ndarray, train_l: np.ndarray, val_l: np.ndarray, attack_l: np.ndarray, preprocess: str, seed: int) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
-    if preprocess not in ('zscore', 'mixed_dequant'):
-        raise ValueError("packet_preprocess must be 'zscore' or 'mixed_dequant'")
-    (mean, std) = _fit_packet_stats(train_x, train_l)
-
-    def prep(x: np.ndarray, l: np.ndarray, tag: str) -> np.ndarray:
-        if preprocess == 'zscore':
-            z = _zscore(x, mean, std)
-            mask = np.arange(x.shape[1])[None, :] < l[:, None]
-            return (z * mask[:, :, None]).astype(np.float32)
-        return _apply_mixed_dequant(x, l, mean, std, split_tag=tag, seed=seed)
-    return (prep(train_x, train_l, 'train'), prep(val_x, val_l, 'val'), prep(attack_x, attack_l, 'attack'), mean, std)
-
-def _derive_flow_features(tokens: np.ndarray, lens: np.ndarray) -> np.ndarray:
-    (N, T, _) = tokens.shape
-    out = np.zeros((N, len(DERIVED_FLOW_FEATURE_NAMES)), dtype=np.float32)
-    for i in range(N):
-        n = int(max(lens[i], 1))
-        x = tokens[i, :n]
-        direction = x[:, 2]
-        size = x[:, 0]
-        dt = x[:, 1]
-        win = x[:, 8]
-        out[i, 0] = np.log1p(n)
-        out[i, 1] = np.mean(direction < 0.5)
-        out[i, 2] = np.mean(direction >= 0.5)
-        out[i, 3] = size.mean()
-        out[i, 4] = size.std()
-        out[i, 5] = size.min()
-        out[i, 6] = size.max()
-        out[i, 7] = dt.mean()
-        out[i, 8] = dt.std()
-        out[i, 9] = dt.max()
-        out[i, 10] = x[:, 3].mean()
-        out[i, 11] = x[:, 4].mean()
-        out[i, 12] = x[:, 5].mean()
-        out[i, 13] = x[:, 6].mean()
-        out[i, 14] = x[:, 7].mean()
-        out[i, 15] = win.mean()
-    return out
-
-def _read_flow_features(path: Path, *, expected_rows: int, feature_columns: Optional[list[str]]=None) -> tuple[np.ndarray, tuple[str, ...], np.ndarray | None]:
-    path = Path(path)
-    if path.suffix == '.npz':
-        data = np.load(path, allow_pickle=True)
-        x = data['features'].astype(np.float32)
-        raw_names = data['feature_names'] if 'feature_names' in data.files else np.arange(x.shape[1])
-        names = tuple((str(v) for v in raw_names))
-        flow_id = data['flow_id'] if 'flow_id' in data.files else None
-    elif path.suffix in ('.parquet', '.pq'):
-        df = pd.read_parquet(path)
-        flow_id = df['flow_id'].to_numpy() if 'flow_id' in df.columns else None
-        if feature_columns:
-            cols = feature_columns
-        else:
-            cols = [c for c in df.columns if c not in DEFAULT_FLOW_META_COLUMNS and pd.api.types.is_numeric_dtype(df[c])]
-        if not cols:
-            raise ValueError(f'no numeric flow feature columns found in {path}')
-        x = df[cols].to_numpy(dtype=np.float32)
-        names = tuple(cols)
-    else:
-        raise ValueError(f'unsupported flow feature file: {path}')
-    if len(x) != expected_rows:
-        raise ValueError(f'flow feature row count {len(x):,} != packet row count {expected_rows:,}')
-    x = np.nan_to_num(x, nan=0.0, posinf=0.0, neginf=0.0).astype(np.float32)
-    return (x, names, flow_id)
-
-def _feature_columns_from_df(df: pd.DataFrame, requested: Optional[list[str]]) -> list[str]:
-    if requested:
-        return requested
-    return [c for c in df.columns if c not in DEFAULT_FLOW_META_COLUMNS and pd.api.types.is_numeric_dtype(df[c])]
-
-def _align_flow_features_by_scan(feature_df: pd.DataFrame, packet_flows: pd.DataFrame, *, feature_columns: list[str]) -> tuple[np.ndarray, tuple[str, ...]]:
-    required = ['label', 'src_ip', 'src_port', 'dst_ip', 'dst_port', 'protocol']
-    missing_feature = [c for c in required if c not in feature_df.columns]
-    missing_packet = [c for c in required if c not in packet_flows.columns]
-    if missing_feature or missing_packet:
-        raise ValueError(f'scan alignment requires label + 5-tuple metadata. missing in feature_df={missing_feature}, packet_flows={missing_packet}')
-    packet_keys = [(str(lbl), _canonical_key(src, sp, dst, dp, proto)) for (lbl, src, sp, dst, dp, proto) in zip(packet_flows['label'].to_numpy(), packet_flows['src_ip'].to_numpy(), packet_flows['src_port'].to_numpy(), packet_flows['dst_ip'].to_numpy(), packet_flows['dst_port'].to_numpy(), packet_flows['protocol'].to_numpy())]
-    labels = feature_df['label'].to_numpy()
-    src_ip = feature_df['src_ip'].to_numpy()
-    src_port = feature_df['src_port'].to_numpy()
-    dst_ip = feature_df['dst_ip'].to_numpy()
-    dst_port = feature_df['dst_port'].to_numpy()
-    protocol = feature_df['protocol'].to_numpy()
-    matched: list[int] = []
-    j = 0
-    n_csv = len(feature_df)
-    for (i, target) in enumerate(packet_keys):
-        while j < n_csv:
-            cand = (str(labels[j]), _canonical_key(src_ip[j], src_port[j], dst_ip[j], dst_port[j], protocol[j]))
-            j += 1
-            if cand == target:
-                matched.append(j - 1)
-                break
-        else:
-            raise ValueError(f'failed to align packet flow row {i:,}/{len(packet_keys):,}; the CSV cache may not be the same one used for packet extraction')
-    print(f'[data] scan-aligned CSV flow features: matched={len(matched):,} from csv_rows={n_csv:,} skipped={matched[-1] + 1 - len(matched):,}')
-    x = feature_df.iloc[matched][feature_columns].to_numpy(dtype=np.float32)
-    x = np.nan_to_num(x, nan=0.0, posinf=0.0, neginf=0.0).astype(np.float32)
-    return (x, tuple(feature_columns))
-
-def _read_aligned_flow_features(path: Path, packet_flows: pd.DataFrame, *, feature_columns: Optional[list[str]]=None, align: str='auto') -> tuple[np.ndarray, tuple[str, ...]]:
-    path = Path(path)
-    if align not in ('auto', 'row', 'scan'):
-        raise ValueError("flow_features_align must be 'auto', 'row', or 'scan'")
-    if path.suffix == '.npz':
-        (x, names, flow_id) = _read_flow_features(path, expected_rows=len(packet_flows), feature_columns=feature_columns)
-        packet_id = packet_flows['flow_id'].to_numpy() if 'flow_id' in packet_flows else None
-        if flow_id is not None and packet_id is not None and (not np.array_equal(flow_id, packet_id)):
-            raise ValueError('NPZ flow_id does not align with Packet_CFM flows')
-        return (x, names)
-    if path.suffix not in ('.parquet', '.pq'):
-        raise ValueError(f'unsupported flow feature file: {path}')
-    feature_df = pd.read_parquet(path)
-    cols = _feature_columns_from_df(feature_df, feature_columns)
-    if not cols:
-        raise ValueError(f'no numeric flow feature columns found in {path}')
-    packet_id = packet_flows['flow_id'].to_numpy() if 'flow_id' in packet_flows else None
-    if len(feature_df) == len(packet_flows):
-        feature_id = feature_df['flow_id'].to_numpy() if 'flow_id' in feature_df.columns else None
-        if feature_id is None or packet_id is None or np.array_equal(feature_id, packet_id):
-            x = feature_df[cols].to_numpy(dtype=np.float32)
-            x = np.nan_to_num(x, nan=0.0, posinf=0.0, neginf=0.0).astype(np.float32)
-            return (x, tuple(cols))
-        if align == 'row':
-            raise ValueError("flow_id mismatch with flow_features_align='row'")
-    if align == 'row':
-        raise ValueError(f'row alignment requested but feature rows={len(feature_df):,} packet rows={len(packet_flows):,}')
-    return _align_flow_features_by_scan(feature_df, packet_flows, feature_columns=cols)
-
-def _preprocess_flow(train: np.ndarray, val: np.ndarray, attack: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
-    mean = train.mean(axis=0).astype(np.float32)
-    std = train.std(axis=0).astype(np.float32)
-    return (_zscore(train, mean, std), _zscore(val, mean, std), _zscore(attack, mean, std), mean, std)
-
-def load_unified_data(*, packets_npz: Path | None=None, source_store: Path | None=None, flows_parquet: Path, flow_features_path: Path | None=None, flow_feature_columns: Optional[list[str]]=None, flow_features_align: str='auto', T: int=128, split_seed: int=42, train_ratio: float=0.8, benign_label: str='normal', min_len: int=2, packet_preprocess: str='mixed_dequant', attack_cap: int | None=None, val_cap: int | None=None) -> UnifiedData:
-    if (packets_npz is None) == (source_store is None):
-        raise ValueError('pass exactly one of packets_npz or source_store')
-    flows_parquet = Path(flows_parquet)
-    print(f'[data] flows={flows_parquet}  packets_source={(packets_npz if packets_npz else source_store)}')
-    flow_cols = ['flow_id', 'label']
-    if flow_features_path is not None:
-        flow_cols += ['src_ip', 'src_port', 'dst_ip', 'dst_port', 'protocol']
-    flows = pd.read_parquet(flows_parquet, columns=flow_cols)
-    labels_full = flows['label'].to_numpy().astype(str)
-    flow_id = flows['flow_id'].to_numpy()
-    tokens_full: np.ndarray | None = None
-    store = None
-    if packets_npz is not None:
-        pz = np.load(Path(packets_npz))
-        tokens_full = pz['packet_tokens'].astype(np.float32)
-        lens_full = pz['packet_lengths'].astype(np.int32)
-        packet_flow_id = pz['flow_id'] if 'flow_id' in pz.files else None
-        if T > tokens_full.shape[1]:
-            raise ValueError(f'requested T={T} > stored T_full={tokens_full.shape[1]}')
-        tokens_full = tokens_full[:, :T].copy()
-        lens_full = np.minimum(lens_full, T).astype(np.int32)
-        if packet_flow_id is not None and (not np.array_equal(packet_flow_id, flow_id)):
-            raise ValueError('packets_npz and flows_parquet are not row-aligned by flow_id')
-    else:
-        if flow_features_path is None:
-            raise ValueError('source_store path requires flow_features_path (derived features need tokens in memory)')
-        from common.packet_store import PacketShardStore
-        store = PacketShardStore.open(Path(source_store))
-        store_flow_id = store.read_flows(columns=['flow_id'])['flow_id'].to_numpy()
-        if not np.array_equal(store_flow_id, flow_id):
-            raise ValueError('source_store and flows_parquet are not row-aligned by flow_id')
-        lens_full = np.minimum(store.manifest['packet_length'].to_numpy(dtype=np.int32), T)
-    if flow_features_path is None:
-        assert tokens_full is not None
-        flow_features = _derive_flow_features(tokens_full, lens_full)
-        flow_names = DERIVED_FLOW_FEATURE_NAMES
-        print(f'[data] using derived flow features D={flow_features.shape[1]}')
-    else:
-        (flow_features, flow_names) = _read_aligned_flow_features(Path(flow_features_path), flows, feature_columns=flow_feature_columns, align=flow_features_align)
-        print(f'[data] using external flow features D={flow_features.shape[1]}')
-    keep = lens_full >= min_len
-    labels = labels_full[keep]
-    flow_features = flow_features[keep]
-    lens = lens_full[keep]
-    global_idx = np.flatnonzero(keep).astype(np.int64)
-    if tokens_full is not None:
-        materialized_tokens = tokens_full[keep]
-    else:
-        materialized_tokens = None
-    print(f'[data] rows total={len(keep):,}  keep len>={min_len}: {keep.sum():,}')
-    benign_local = np.where(labels == benign_label)[0]
-    attack_local = np.where(labels != benign_label)[0]
-    rng = np.random.default_rng(split_seed)
-    rng.shuffle(benign_local)
-    n_train = int(len(benign_local) * train_ratio)
-    train_local = benign_local[:n_train]
-    val_local = benign_local[n_train:]
-    if val_cap is not None and len(val_local) > val_cap:
-        val_local = np.sort(rng.choice(val_local, size=val_cap, replace=False))
-    if attack_cap is not None and len(attack_local) > attack_cap:
-        attack_local = np.sort(rng.choice(attack_local, size=attack_cap, replace=False))
-    print(f'[data] benign={len(benign_local):,} attack={len(attack_local):,} -> train={len(train_local):,} val={len(val_local):,}')
-
-    def _materialize(local_indices: np.ndarray) -> np.ndarray:
-        if materialized_tokens is not None:
-            return materialized_tokens[local_indices].astype(np.float32, copy=False)
-        assert store is not None
-        g = global_idx[local_indices]
-        (tok, _) = store.read_packets(g.astype(np.int64), T=T)
-        return tok.astype(np.float32, copy=False)
-    tr_p_raw = _materialize(train_local)
-    va_p_raw = _materialize(val_local)
-    at_p_raw = _materialize(attack_local)
-    tr_l = lens[train_local]
-    va_l = lens[val_local]
-    at_l = lens[attack_local]
-    tr_f_raw = flow_features[train_local]
-    va_f_raw = flow_features[val_local]
-    at_f_raw = flow_features[attack_local]
-    train_idx = train_local
-    val_idx = val_local
-    attack_idx = attack_local
-    (tr_p, va_p, at_p, p_mean, p_std) = _preprocess_packets(tr_p_raw, va_p_raw, at_p_raw, tr_l, va_l, at_l, preprocess=packet_preprocess, seed=split_seed)
-    (tr_f, va_f, at_f, f_mean, f_std) = _preprocess_flow(tr_f_raw, va_f_raw, at_f_raw)
-    return UnifiedData(train_flow=tr_f, val_flow=va_f, attack_flow=at_f, train_packets=tr_p, val_packets=va_p, attack_packets=at_p, train_len=tr_l, val_len=va_l, attack_len=at_l, attack_labels=labels[attack_idx], packet_mean=p_mean, packet_std=p_std, flow_mean=f_mean, flow_std=f_std, packet_preprocess=packet_preprocess, flow_feature_names=tuple(flow_names))
-
-def subsample_train(data: UnifiedData, n_train: int, seed: int) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
-    if n_train <= 0 or n_train >= len(data.train_flow):
-        return (data.train_flow, data.train_packets, data.train_len)
-    rng = np.random.default_rng(seed)
-    idx = rng.choice(len(data.train_flow), n_train, replace=False)
-    idx.sort()
-    return (data.train_flow[idx], data.train_packets[idx], data.train_len[idx])
--- a/Unified_CFM/model.py
+++ b/Unified_CFM/model.py
@@ -1,588 +0,0 @@
-from __future__ import annotations
-import math
-from dataclasses import dataclass
-import torch
-import torch.nn as nn
-from torchdiffeq import odeint
-
-@torch.no_grad()
-def _sinkhorn_coupling(C: torch.Tensor, reg: float=0.05, n_iter: int=20) -> torch.Tensor:
-    C = C.float()
-    log_k = -C / reg
-    B = C.shape[0]
-    log_u = torch.zeros(B, device=C.device)
-    log_v = torch.zeros(B, device=C.device)
-    for _ in range(n_iter):
-        log_v = -torch.logsumexp(log_k + log_u.unsqueeze(1), dim=0)
-        log_u = -torch.logsumexp(log_k + log_v.unsqueeze(0), dim=1)
-    log_p = log_u.unsqueeze(1) + log_k + log_v.unsqueeze(0)
-    return log_p.argmax(dim=1)
-
-class SinusoidalTimeEmb(nn.Module):
-
-    def __init__(self, dim: int) -> None:
-        super().__init__()
-        if dim % 2 != 0:
-            raise ValueError('time embedding dimension must be even')
-        self.dim = dim
-
-    def forward(self, t: torch.Tensor) -> torch.Tensor:
-        half = self.dim // 2
-        freqs = torch.exp(-math.log(10000) * torch.arange(half, device=t.device, dtype=t.dtype) / max(half - 1, 1))
-        args = t[:, None] * freqs[None, :]
-        return torch.cat([args.sin(), args.cos()], dim=-1)
-
-class AdaLNBlock(nn.Module):
-
-    def __init__(self, d_model: int, n_heads: int, mlp_ratio: float, cond_dim: int) -> None:
-        super().__init__()
-        self.norm1 = nn.LayerNorm(d_model, elementwise_affine=False)
-        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
-        self.norm2 = nn.LayerNorm(d_model, elementwise_affine=False)
-        hidden = int(d_model * mlp_ratio)
-        self.mlp = nn.Sequential(nn.Linear(d_model, hidden), nn.GELU(), nn.Linear(hidden, d_model))
-        self.cond_proj = nn.Linear(cond_dim, 6 * d_model)
-        nn.init.zeros_(self.cond_proj.weight)
-        nn.init.zeros_(self.cond_proj.bias)
-
-    @staticmethod
-    def _modulate(x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor) -> torch.Tensor:
-        return x * (1.0 + gamma[:, None, :]) + beta[:, None, :]
-
-    def forward(self, x: torch.Tensor, cond: torch.Tensor, key_padding_mask: torch.Tensor | None, attn_mask: torch.Tensor | None=None) -> torch.Tensor:
-        (g1, b1, a1, g2, b2, a2) = self.cond_proj(cond).chunk(6, dim=-1)
-        h = self._modulate(self.norm1(x), g1, b1)
-        (attn_out, _) = self.attn(h, h, h, key_padding_mask=key_padding_mask, attn_mask=attn_mask, need_weights=False)
-        x = x + a1[:, None, :] * attn_out
-        h = self._modulate(self.norm2(x), g2, b2)
-        return x + a2[:, None, :] * self.mlp(h)
-
-class UnifiedVelocity(nn.Module):
-
-    def __init__(self, token_dim: int, seq_len: int, d_model: int=128, n_layers: int=4, n_heads: int=4, mlp_ratio: float=4.0, time_dim: int=64, reference_mode: str | None=None) -> None:
-        super().__init__()
-        if reference_mode not in (None, 'independent_token', 'block_diagonal', 'causal_packets', 'causal_all'):
-            raise ValueError(f'unknown reference_mode={reference_mode!r}')
-        self.token_dim = token_dim
-        self.seq_len = seq_len
-        self.reference_mode = reference_mode
-        self.input_proj = nn.Linear(token_dim, d_model)
-        self.pos_emb = nn.Parameter(torch.zeros(1, seq_len, d_model))
-        self.type_emb = nn.Embedding(2, d_model)
-        nn.init.trunc_normal_(self.pos_emb, std=0.02)
-        nn.init.normal_(self.type_emb.weight, std=0.02)
-        self.time_emb = SinusoidalTimeEmb(time_dim)
-        self.cond_mlp = nn.Sequential(nn.Linear(time_dim, d_model), nn.SiLU(), nn.Linear(d_model, d_model))
-        self.blocks = nn.ModuleList([AdaLNBlock(d_model, n_heads, mlp_ratio, cond_dim=d_model) for _ in range(n_layers)])
-        self.out_norm = nn.LayerNorm(d_model, elementwise_affine=False)
-        self.out = nn.Linear(d_model, token_dim)
-        nn.init.zeros_(self.out.weight)
-        nn.init.zeros_(self.out.bias)
-        type_ids = torch.ones(seq_len, dtype=torch.long)
-        type_ids[0] = 0
-        self.register_buffer('type_ids', type_ids, persistent=False)
-
-    def forward(self, x: torch.Tensor, t: torch.Tensor, key_padding_mask: torch.Tensor | None=None, attn_mask_override: torch.Tensor | None=None) -> torch.Tensor:
-        (B, L, _) = x.shape
-        if L > self.seq_len:
-            raise ValueError(f'sequence length {L} exceeds configured {self.seq_len}')
-        if t.dim() == 0:
-            t = t.expand(B)
-        h = self.input_proj(x)
-        h = h + self.pos_emb[:, :L, :]
-        h = h + self.type_emb(self.type_ids[:L])[None, :, :]
-        cond = self.cond_mlp(self.time_emb(t))
-        if attn_mask_override is not None:
-            attn_mask = attn_mask_override
-        else:
-            attn_mask = self._reference_attn_mask(L, x.device)
-        for block in self.blocks:
-            h = block(h, cond, key_padding_mask, attn_mask=attn_mask)
-        return self.out(self.out_norm(h))
-
-    def _reference_attn_mask(self, L: int, device: torch.device) -> torch.Tensor | None:
-        if self.reference_mode is None:
-            return None
-        if self.reference_mode == 'independent_token':
-            return ~torch.eye(L, dtype=torch.bool, device=device)
-        if self.reference_mode == 'block_diagonal':
-            mask = torch.ones((L, L), dtype=torch.bool, device=device)
-            mask[0, 0] = False
-            if L > 1:
-                mask[1:, 1:] = False
-            return mask
-        if self.reference_mode == 'causal_packets':
-            mask = torch.zeros((L, L), dtype=torch.bool, device=device)
-            if L > 1:
-                packet_causal = torch.triu(torch.ones(L - 1, L - 1, dtype=torch.bool, device=device), diagonal=1)
-                mask[1:, 1:] = packet_causal
-            return mask
-        if self.reference_mode == 'causal_all':
-            return torch.triu(torch.ones(L, L, dtype=torch.bool, device=device), diagonal=1)
-        raise AssertionError(self.reference_mode)
-
-@dataclass
-class UnifiedCFMConfig:
-    T: int = 128
-    packet_dim: int = 9
-    flow_dim: int = 16
-    token_dim: int | None = None
-    d_model: int = 128
-    n_layers: int = 4
-    n_heads: int = 4
-    mlp_ratio: float = 4.0
-    time_dim: int = 64
-    sigma: float = 0.1
-    use_ot: bool = False
-    reference_mode: str | None = None
-
-class UnifiedTokenCFM(nn.Module):
-
-    def __init__(self, cfg: UnifiedCFMConfig) -> None:
-        super().__init__()
-        self.cfg = cfg
-        self.token_dim = cfg.token_dim or 1 + max(cfg.flow_dim, cfg.packet_dim)
-        if self.token_dim < 1 + max(cfg.flow_dim, cfg.packet_dim):
-            raise ValueError('token_dim is too small for flow_dim/packet_dim')
-        self.seq_len = cfg.T + 1
-        self.velocity = UnifiedVelocity(token_dim=self.token_dim, seq_len=self.seq_len, d_model=cfg.d_model, n_layers=cfg.n_layers, n_heads=cfg.n_heads, mlp_ratio=cfg.mlp_ratio, time_dim=cfg.time_dim, reference_mode=cfg.reference_mode)
-
-    def build_tokens(self, flow: torch.Tensor, packets: torch.Tensor) -> torch.Tensor:
-        (B, T, Dp) = packets.shape
-        if T != self.cfg.T:
-            raise ValueError(f'packet T={T} but config T={self.cfg.T}')
-        if Dp != self.cfg.packet_dim:
-            raise ValueError(f'packet_dim={Dp} but config packet_dim={self.cfg.packet_dim}')
-        if flow.shape[-1] != self.cfg.flow_dim:
-            raise ValueError(f'flow_dim={flow.shape[-1]} but config flow_dim={self.cfg.flow_dim}')
-        z = packets.new_zeros((B, T + 1, self.token_dim))
-        z[:, 0, 0] = -1.0
-        z[:, 0, 1:1 + self.cfg.flow_dim] = flow
-        z[:, 1:, 0] = 1.0
-        z[:, 1:, 1:1 + self.cfg.packet_dim] = packets
-        return z
-
-    def key_padding_mask(self, lens: torch.Tensor) -> torch.Tensor:
-        B = lens.shape[0]
-        idx = torch.arange(self.cfg.T, device=lens.device)[None, :]
-        packet_real = idx < lens[:, None]
-        real = torch.cat([torch.ones(B, 1, dtype=torch.bool, device=lens.device), packet_real], dim=1)
-        return ~real
-
-    def _loss_mask(self, lens: torch.Tensor) -> torch.Tensor:
-        return (~self.key_padding_mask(lens)).float()
-
-    @staticmethod
-    def _masked_trimmed_mean(values: torch.Tensor, mask: torch.Tensor, trim_frac: float=0.1) -> torch.Tensor:
-        out = values.new_zeros(values.shape[0])
-        for i in range(values.shape[0]):
-            v = values[i][mask[i] > 0]
-            if v.numel() == 0:
-                continue
-            if v.numel() < 5:
-                out[i] = v.mean()
-                continue
-            v_sorted = torch.sort(v).values
-            lo = int(trim_frac * v_sorted.numel())
-            hi = int((1.0 - trim_frac) * v_sorted.numel())
-            if hi <= lo:
-                out[i] = v_sorted.mean()
-            else:
-                out[i] = v_sorted[lo:hi].mean()
-        return out
-
-    @staticmethod
-    def _masked_median(values: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
-        out = values.new_zeros(values.shape[0])
-        for i in range(values.shape[0]):
-            v = values[i][mask[i] > 0]
-            if v.numel() == 0:
-                continue
-            v_sorted = torch.sort(v).values
-            mid = v_sorted.numel() // 2
-            if v_sorted.numel() % 2:
-                out[i] = v_sorted[mid]
-            else:
-                out[i] = 0.5 * (v_sorted[mid - 1] + v_sorted[mid])
-        return out
-
-    def compute_loss(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, *, lambda_flow: float=0.0, lambda_packet: float=0.0, packet_mask_ratio: float=0.5, return_components: bool=False) -> torch.Tensor | dict[str, torch.Tensor]:
-        x1 = self.build_tokens(flow, packets)
-        B = x1.shape[0]
-        x0 = torch.randn_like(x1)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        if self.cfg.use_ot:
-            flat0 = (x0 * mask[:, :, None]).reshape(B, -1)
-            flat1 = (x1 * mask[:, :, None]).reshape(B, -1)
-            col = _sinkhorn_coupling(torch.cdist(flat0.float(), flat1.float()))
-            x1 = x1[col]
-            flow = flow[col]
-            packets = packets[col]
-            lens = lens[col]
-            mask = self._loss_mask(lens)
-            kpm = mask == 0
-        t = torch.rand(B, device=x1.device)
-        x_t = (1.0 - t[:, None, None]) * x0 + t[:, None, None] * x1
-        if self.cfg.sigma > 0:
-            std = self.cfg.sigma * torch.sqrt(t * (1.0 - t))[:, None, None]
-            x_t = x_t + std * torch.randn_like(x_t)
-        target = x1 - x0
-        pred = self.velocity(x_t, t, key_padding_mask=kpm)
-        sq = (pred - target).square().mean(dim=-1)
-        per_sample = (sq * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
-        main_loss = per_sample.mean()
-        aux_flow_loss = x1.new_zeros(())
-        aux_packet_loss = x1.new_zeros(())
-        if lambda_flow > 0.0:
-            x_t_mf = x_t.clone()
-            x_t_mf[:, 0, :] = 0.0
-            pred_mf = self.velocity(x_t_mf, t, key_padding_mask=kpm)
-            err = (pred_mf[:, 0] - target[:, 0]).square().mean(dim=-1)
-            aux_flow_loss = err.mean()
-        if lambda_packet > 0.0:
-            packet_real = mask[:, 1:] > 0
-            rand_draw = torch.rand(packet_real.shape, device=x1.device)
-            mask_pkt = (rand_draw < packet_mask_ratio) & packet_real
-            pkt_mask_full = torch.cat([torch.zeros(B, 1, dtype=torch.bool, device=x1.device), mask_pkt], dim=1)
-            x_t_mp = x_t.clone()
-            x_t_mp[pkt_mask_full] = 0.0
-            pred_mp = self.velocity(x_t_mp, t, key_padding_mask=kpm)
-            sq_mp = (pred_mp - target).square().mean(dim=-1)
-            mask_f = pkt_mask_full.float()
-            denom = mask_f.sum(dim=-1).clamp_min(1.0)
-            aux_packet_loss = ((sq_mp * mask_f).sum(dim=-1) / denom).mean()
-        total = main_loss + lambda_flow * aux_flow_loss + lambda_packet * aux_packet_loss
-        if return_components:
-            return {'total': total, 'main': main_loss.detach(), 'aux_flow': aux_flow_loss.detach(), 'aux_packet': aux_packet_loss.detach()}
-        return total
-
-    @torch.no_grad()
-    def velocity_score(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, t_eval: tuple[float, ...]=(0.5, 0.75, 1.0)) -> dict[str, torch.Tensor]:
-        x = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        total = torch.zeros(x.shape[0], device=x.device)
-        flow_s = torch.zeros_like(total)
-        packet_s = torch.zeros_like(total)
-        packet_count = mask[:, 1:].sum(dim=-1).clamp_min(1.0)
-        for t_val in t_eval:
-            t = torch.full((x.shape[0],), float(t_val), device=x.device)
-            v = self.velocity(x, t, key_padding_mask=kpm)
-            e = v.square().mean(dim=-1)
-            total = total + (e * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
-            flow_s = flow_s + e[:, 0]
-            packet_s = packet_s + (e[:, 1:] * mask[:, 1:]).sum(dim=-1) / packet_count
-        denom = float(len(t_eval))
-        return {'velocity_total': total / denom, 'velocity_flow': flow_s / denom, 'velocity_packet': packet_s / denom}
-
-    @torch.no_grad()
-    def trajectory_metrics(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, n_steps: int=16) -> dict[str, torch.Tensor]:
-        z = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        B = z.shape[0]
-        dt = 1.0 / n_steps
-        total_arc = torch.zeros(B, device=z.device)
-        total_ke = torch.zeros(B, device=z.device)
-        flow_ke = torch.zeros(B, device=z.device)
-        packet_ke = torch.zeros(B, device=z.device)
-        total_curv = torch.zeros(B, device=z.device)
-        flow_curv = torch.zeros(B, device=z.device)
-        packet_curv = torch.zeros(B, device=z.device)
-        packet_kappa2_speed2 = torch.zeros(B, max(z.shape[1] - 1, 0), device=z.device)
-        packet_count = mask[:, 1:].sum(dim=-1).clamp_min(1.0)
-        v_prev = None
-        v_prev_norm = None
-        for k in range(n_steps):
-            t_val = 1.0 - k * dt
-            t = torch.full((B,), t_val, device=z.device)
-            v = self.velocity(z, t, key_padding_mask=kpm)
-            e = v.square().mean(dim=-1)
-            v_norm = v.square().sum(dim=-1).clamp_min(1e-12).sqrt()
-            total_ke = total_ke + (e * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0) * dt
-            flow_ke = flow_ke + e[:, 0] * dt
-            packet_ke = packet_ke + (e[:, 1:] * mask[:, 1:]).sum(dim=-1) / packet_count * dt
-            if v_prev is not None:
-                dv = v - v_prev
-                dve = dv.square().mean(dim=-1)
-                total_curv = total_curv + (dve * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
-                flow_curv = flow_curv + dve[:, 0]
-                packet_curv = packet_curv + (dve[:, 1:] * mask[:, 1:]).sum(dim=-1) / packet_count
-                dv2_sum = dv[:, 1:].square().sum(dim=-1)
-                assert v_prev_norm is not None
-                v_avg = 0.5 * (v_norm[:, 1:] + v_prev_norm[:, 1:])
-                packet_kappa2_speed2 = packet_kappa2_speed2 + dv2_sum / v_avg.square().clamp_min(1e-06)
-            v_prev = v
-            v_prev_norm = v_norm
-            z_new = z - v * dt
-            dz = (z_new - z) * mask[:, :, None]
-            total_arc = total_arc + dz.reshape(B, -1).norm(dim=-1) / mask.sum(dim=-1).sqrt()
-            z = z_new
-        z_masked = z * mask[:, :, None]
-        terminal = z_masked.reshape(B, -1).norm(dim=-1) / (mask.sum(dim=-1) * self.token_dim).clamp_min(1.0).sqrt()
-        terminal_flow = z[:, 0].norm(dim=-1) / math.sqrt(self.token_dim)
-        terminal_packet = (z[:, 1:] * mask[:, 1:, None]).reshape(B, -1).norm(dim=-1) / (packet_count * self.token_dim).sqrt()
-        packet_mask = mask[:, 1:]
-        kappa2_speed2_mean = (packet_kappa2_speed2 * packet_mask).sum(dim=-1) / packet_count
-        kappa2_speed2_median = self._masked_median(packet_kappa2_speed2, packet_mask)
-        kappa2_speed2_trimmed = self._masked_trimmed_mean(packet_kappa2_speed2, packet_mask)
-        return {'terminal_norm': terminal, 'terminal_flow': terminal_flow, 'terminal_packet': terminal_packet, 'arc_length': total_arc, 'kinetic_energy': total_ke, 'kinetic_flow': flow_ke, 'kinetic_packet': packet_ke, 'curvature_total': total_curv, 'curvature_flow': flow_curv, 'curvature_packet': packet_curv, 'kappa2_speed2norm_packet_mean': kappa2_speed2_mean, 'kappa2_speed2norm_packet_median': kappa2_speed2_median, 'kappa2_speed2norm_packet_trimmed10_mean': kappa2_speed2_trimmed}
-
-    @torch.no_grad()
-    def score_profile_vt(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, t_eval: tuple[float, ...]=(0.1, 0.3, 0.5, 0.7, 0.9, 1.0)) -> dict[str, torch.Tensor]:
-        x = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        packet_count = mask[:, 1:].sum(dim=-1).clamp_min(1.0)
-        out: dict[str, torch.Tensor] = {}
-        for t_val in t_eval:
-            t = torch.full((x.shape[0],), float(t_val), device=x.device)
-            v = self.velocity(x, t, key_padding_mask=kpm)
-            e = v.square().mean(dim=-1)
-            tag = f't{int(round(t_val * 10)):02d}'
-            out[f'velocity_total_{tag}'] = (e * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
-            out[f'velocity_flow_{tag}'] = e[:, 0]
-            out[f'velocity_packet_{tag}'] = (e[:, 1:] * mask[:, 1:]).sum(dim=-1) / packet_count
-        return out
-
-    @torch.no_grad()
-    def consistency_score(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, t_eval: float=0.5) -> dict[str, torch.Tensor]:
-        x = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        B = x.shape[0]
-        packet_count = mask[:, 1:].sum(dim=-1).clamp_min(1.0)
-        t = torch.full((B,), float(t_eval), device=x.device)
-        v_full = self.velocity(x, t, key_padding_mask=kpm)
-        x_mf = x.clone()
-        x_mf[:, 0, :] = 0.0
-        v_mf = self.velocity(x_mf, t, key_padding_mask=kpm)
-        flow_cons = (v_full[:, 0] - v_mf[:, 0]).square().mean(dim=-1)
-        x_mp = x.clone()
-        pkt_mask_full = mask[:, 1:] > 0
-        idx_pkt_mask = torch.cat([torch.zeros(B, 1, dtype=torch.bool, device=x.device), pkt_mask_full], dim=1)
-        x_mp[idx_pkt_mask] = 0.0
-        v_mp = self.velocity(x_mp, t, key_padding_mask=kpm)
-        diff = (v_full - v_mp).square().mean(dim=-1)
-        packet_cons = (diff[:, 1:] * mask[:, 1:]).sum(dim=-1) / packet_count
-        return {'flow_consistency': flow_cons, 'packet_consistency': packet_cons, 'consistency_total': flow_cons + packet_cons}
-
-    def jacobian_hutchinson(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, t_eval: tuple[float, ...]=(0.5,), n_eps: int=4, generator: torch.Generator | None=None) -> dict[str, torch.Tensor]:
-        x = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        B = x.shape[0]
-        packet_count = mask[:, 1:].sum(dim=-1).clamp_min(1.0)
-        total = torch.zeros(B, device=x.device)
-        flow_j = torch.zeros(B, device=x.device)
-        packet_j = torch.zeros(B, device=x.device)
-        n_draws = n_eps * len(t_eval)
-        for t_val in t_eval:
-            t_current = torch.full((B,), float(t_val), device=x.device)
-            for _ in range(n_eps):
-                x_req = x.detach().clone().requires_grad_(True)
-                v = self.velocity(x_req, t_current, key_padding_mask=kpm)
-                eps = torch.randn(v.shape, device=v.device, generator=generator)
-                (g,) = torch.autograd.grad(outputs=v, inputs=x_req, grad_outputs=eps, retain_graph=False, create_graph=False)
-                e = g.square().mean(dim=-1)
-                total = total + (e * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
-                flow_j = flow_j + e[:, 0]
-                packet_j = packet_j + (e[:, 1:] * mask[:, 1:]).sum(dim=-1) / packet_count
-        return {'jacobian_total': (total / n_draws).detach(), 'jacobian_flow': (flow_j / n_draws).detach(), 'jacobian_packet': (packet_j / n_draws).detach()}
-
-    @torch.no_grad()
-    def pna_score(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, n_steps: int=16, flow_masked: bool=False) -> dict[str, torch.Tensor]:
-        eps_v2 = 1e-06
-        dt = 1.0 / n_steps
-        z = self.build_tokens(flow, packets)
-        if flow_masked:
-            z = z.clone()
-            z[:, 0, :] = 0.0
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        (B, L, _) = z.shape
-        pna = torch.zeros(B, L, device=z.device)
-        v_prev: torch.Tensor | None = None
-        v_norm_prev: torch.Tensor | None = None
-        for k in range(n_steps):
-            t_val = 1.0 - k * dt
-            t = torch.full((B,), t_val, device=z.device)
-            v = self.velocity(z, t, key_padding_mask=kpm)
-            v_norm = (v.square().sum(dim=-1) + 1e-12).sqrt()
-            if v_prev is not None:
-                dv2 = (v - v_prev).square().sum(dim=-1)
-                v_avg2 = (0.5 * (v_norm + v_norm_prev)).square().clamp_min(eps_v2)
-                pna = pna + dv2 / v_avg2
-            v_prev = v
-            v_norm_prev = v_norm
-            z = z - v * dt
-            if flow_masked:
-                z[:, 0, :] = 0.0
-        flow_pna = pna[:, 0]
-        packet_pna = pna[:, 1:]
-        packet_mask = mask[:, 1:]
-        packet_count = packet_mask.sum(dim=-1).clamp_min(1.0)
-        pna_median = self._masked_median(packet_pna, packet_mask)
-        pna_mean = (packet_pna * packet_mask).sum(dim=-1) / packet_count
-        masked_for_max = packet_pna.masked_fill(packet_mask == 0, float('-inf'))
-        pna_max = masked_for_max.max(dim=-1).values
-        pna_trimmed = self._masked_trimmed_mean(packet_pna, packet_mask)
-        return {'pna_packet_median': pna_median, 'pna_packet_mean': pna_mean, 'pna_packet_max': pna_max, 'pna_packet_trimmed10_mean': pna_trimmed, 'pna_flow': flow_pna}
-
-    @torch.no_grad()
-    def causal_consistency_score(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, t_eval: float=0.5) -> dict[str, torch.Tensor]:
-        x = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        (B, L, _) = x.shape
-        t = torch.full((B,), float(t_eval), device=x.device)
-        v_full = self.velocity(x, t, key_padding_mask=kpm)
-        causal = torch.triu(torch.ones(L, L, dtype=torch.bool, device=x.device), diagonal=1)
-        v_causal = self.velocity(x, t, key_padding_mask=kpm, attn_mask_override=causal)
-        diff = (v_full - v_causal).square().mean(dim=-1)
-        flow_surprisal = diff[:, 0]
-        packet_diff = diff[:, 1:]
-        packet_mask = mask[:, 1:]
-        packet_count = packet_mask.sum(dim=-1).clamp_min(1.0)
-        packet_mean = (packet_diff * packet_mask).sum(dim=-1) / packet_count
-        packet_median = self._masked_median(packet_diff, packet_mask)
-        masked_for_max = packet_diff.masked_fill(packet_mask == 0, float('-inf'))
-        packet_max = masked_for_max.max(dim=-1).values
-        packet_trimmed = self._masked_trimmed_mean(packet_diff, packet_mask)
-        total = (diff * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
-        return {'causal_surprisal_total': total, 'causal_surprisal_flow': flow_surprisal, 'causal_surprisal_packet_mean': packet_mean, 'causal_surprisal_packet_median': packet_median, 'causal_surprisal_packet_max': packet_max, 'causal_surprisal_packet_trimmed10_mean': packet_trimmed}
-
-    @torch.no_grad()
-    def direction_consistency_score(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, t_eval: tuple[float, ...]=(0.2, 0.4, 0.6, 0.8, 1.0)) -> dict[str, torch.Tensor]:
-        x = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        (B, L, _) = x.shape
-        t_eval = tuple(t_eval)
-        if len(t_eval) < 2:
-            raise ValueError('direction_consistency_score needs >=2 t values')
-        prev_v: torch.Tensor | None = None
-        drift = x.new_zeros(B, L)
-        n_pairs = len(t_eval) - 1
-        for t_val in t_eval:
-            t = torch.full((B,), float(t_val), device=x.device)
-            v = self.velocity(x, t, key_padding_mask=kpm)
-            if prev_v is not None:
-                num = (prev_v * v).sum(dim=-1)
-                denom = prev_v.norm(dim=-1).clamp_min(1e-08) * v.norm(dim=-1).clamp_min(1e-08)
-                cos = num / denom
-                drift = drift + (1.0 - cos)
-            prev_v = v
-        drift = drift / max(n_pairs, 1)
-        flow_drift = drift[:, 0]
-        packet_drift = drift[:, 1:]
-        packet_mask = mask[:, 1:]
-        packet_count = packet_mask.sum(dim=-1).clamp_min(1.0)
-        packet_mean = (packet_drift * packet_mask).sum(dim=-1) / packet_count
-        packet_median = self._masked_median(packet_drift, packet_mask)
-        masked_for_max = packet_drift.masked_fill(packet_mask == 0, float('-inf'))
-        packet_max = masked_for_max.max(dim=-1).values
-        packet_trimmed = self._masked_trimmed_mean(packet_drift, packet_mask)
-        total = (drift * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
-        return {'direction_drift_total': total, 'direction_drift_flow': flow_drift, 'direction_drift_packet_mean': packet_mean, 'direction_drift_packet_median': packet_median, 'direction_drift_packet_max': packet_max, 'direction_drift_packet_trimmed10_mean': packet_trimmed}
-
-    def inverse_flow_nll_score(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, n_steps: int=16, n_eps: int=4, compute_divergence: bool=True, generator: torch.Generator | None=None) -> dict[str, torch.Tensor]:
-        z = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        (B, L, D) = z.shape
-        dt = 1.0 / n_steps
-        accum_div = torch.zeros(B, device=z.device)
-        if compute_divergence:
-            for k in range(n_steps):
-                t_val = 1.0 - k * dt
-                t = torch.full((B,), t_val, device=z.device)
-                z_req = z.detach().clone().requires_grad_(True)
-                v = self.velocity(z_req, t, key_padding_mask=kpm)
-                div_step = torch.zeros(B, device=z.device)
-                for j in range(n_eps):
-                    eps = torch.randn_like(v)
-                    eps_masked = eps * mask[:, :, None]
-                    retain = j < n_eps - 1
-                    (g,) = torch.autograd.grad(outputs=v, inputs=z_req, grad_outputs=eps_masked, retain_graph=retain, create_graph=False)
-                    div_step = div_step + (eps_masked * g).sum(dim=(1, 2))
-                div_step = div_step / float(n_eps)
-                accum_div = accum_div + div_step * dt
-                with torch.no_grad():
-                    z = (z_req - v * dt).detach()
-        else:
-            with torch.no_grad():
-                for k in range(n_steps):
-                    t_val = 1.0 - k * dt
-                    t = torch.full((B,), t_val, device=z.device)
-                    v = self.velocity(z, t, key_padding_mask=kpm)
-                    z = z - v * dt
-        with torch.no_grad():
-            z_masked = z * mask[:, :, None]
-            n_real = mask.sum(dim=-1).clamp_min(1.0)
-            x0_quadratic = z_masked.reshape(B, -1).square().sum(dim=-1) / (n_real * float(D))
-            nll_x0_only = x0_quadratic
-            nll_div_only = accum_div / (n_real * float(D))
-            nll_full = nll_x0_only + nll_div_only
-        return {'nll_x0_only': nll_x0_only.detach(), 'nll_div_only': nll_div_only.detach(), 'nll_full': nll_full.detach()}
-
-    def jacobian_spectral_score(self, flow: torch.Tensor, packets: torch.Tensor, lens: torch.Tensor, t_eval: float=0.5, n_eps: int=4, generator: torch.Generator | None=None) -> dict[str, torch.Tensor]:
-        x = self.build_tokens(flow, packets)
-        mask = self._loss_mask(lens)
-        kpm = mask == 0
-        (B, L, D) = x.shape
-        t = torch.full((B,), float(t_eval), device=x.device)
-        packet_mask = mask[:, 1:]
-        packet_count = packet_mask.sum(dim=-1).clamp_min(1.0)
-        norms_total: list[torch.Tensor] = []
-        norms_flow: list[torch.Tensor] = []
-        norms_packet: list[torch.Tensor] = []
-        for _ in range(n_eps):
-            x_req = x.detach().clone().requires_grad_(True)
-            v = self.velocity(x_req, t, key_padding_mask=kpm)
-            eps = torch.randn(v.shape, device=v.device, generator=generator)
-            (g,) = torch.autograd.grad(outputs=v, inputs=x_req, grad_outputs=eps, retain_graph=False, create_graph=False)
-            e = g.square().mean(dim=-1)
-            n_total = (e * mask).sum(dim=-1) / mask.sum(dim=-1).clamp_min(1.0)
-            n_flow = e[:, 0]
-            n_packet = (e[:, 1:] * packet_mask).sum(dim=-1) / packet_count
-            norms_total.append(n_total.detach())
-            norms_flow.append(n_flow.detach())
-            norms_packet.append(n_packet.detach())
-
-        def _spectral_summary(samples: list[torch.Tensor]) -> dict[str, torch.Tensor]:
-            stack = torch.stack(samples, dim=1)
-            mean = stack.mean(dim=1).clamp_min(1e-12)
-            mx = stack.max(dim=1).values
-            mn = stack.min(dim=1).values
-            logfro = torch.log(mean)
-            aniso = mx / mean
-            min_over_max = mn / mx.clamp_min(1e-12)
-            p = stack / stack.sum(dim=1, keepdim=True).clamp_min(1e-12)
-            entropy = -(p * p.clamp_min(1e-12).log()).sum(dim=1)
-            eff_rank = torch.exp(entropy)
-            return {'logfro': logfro, 'anisotropy': aniso, 'min_over_max': min_over_max, 'eff_rank': eff_rank}
-        out: dict[str, torch.Tensor] = {}
-        for (tag, samples) in (('total', norms_total), ('flow', norms_flow), ('packet', norms_packet)):
-            summ = _spectral_summary(samples)
-            for (stat_name, val) in summ.items():
-                out[f'jac_{stat_name}_{tag}'] = val
-        return out
-
-    @torch.no_grad()
-    def sample(self, n: int, lens: torch.Tensor, device: torch.device, n_steps: int=50, method: str='euler') -> torch.Tensor:
-        z = torch.randn(n, self.seq_len, self.token_dim, device=device)
-        ts = torch.linspace(0.0, 1.0, n_steps + 1, device=device)
-        kpm = self.key_padding_mask(lens.to(device))
-
-        def f(t: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
-            return self.velocity(x, t.expand(x.shape[0]), key_padding_mask=kpm)
-        if method == 'euler':
-            for i in range(n_steps):
-                z = z + f(ts[i], z) * (ts[i + 1] - ts[i])
-            return z
-        return odeint(f, z, ts, method=method)[-1]
-
-    def param_count(self) -> int:
-        return sum((p.numel() for p in self.parameters()))
--- a/Unified_CFM/tests/test_model_shapes.py
+++ b/Unified_CFM/tests/test_model_shapes.py
@@ -1,157 +0,0 @@
-import sys
-from pathlib import Path
-import torch
-sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
-from model import UnifiedCFMConfig, UnifiedTokenCFM
-
-def _build_model():
-    return UnifiedTokenCFM(UnifiedCFMConfig(T=4, packet_dim=3, flow_dim=5, d_model=16, n_layers=1, n_heads=4, time_dim=8))
-
-def _build_reference_model(reference_mode: str):
-    return UnifiedTokenCFM(UnifiedCFMConfig(T=4, packet_dim=3, flow_dim=5, d_model=16, n_layers=1, n_heads=4, time_dim=8, reference_mode=reference_mode))
-
-def _sample_batch(seed: int=0):
-    torch.manual_seed(seed)
-    flow = torch.randn(2, 5)
-    packets = torch.randn(2, 4, 3)
-    lens = torch.tensor([4, 2])
-    return (flow, packets, lens)
-
-def test_unified_cfm_shapes_and_scores():
-    model = _build_model()
-    (flow, packets, lens) = _sample_batch()
-    tokens = model.build_tokens(flow, packets)
-    assert tokens.shape == (2, 5, 6)
-    loss = model.compute_loss(flow, packets, lens)
-    assert loss.ndim == 0
-    assert torch.isfinite(loss)
-    traj = model.trajectory_metrics(flow, packets, lens, n_steps=2)
-    assert 'terminal_norm' in traj
-    assert traj['terminal_norm'].shape == (2,)
-    vel = model.velocity_score(flow, packets, lens)
-    assert set(vel) == {'velocity_total', 'velocity_flow', 'velocity_packet'}
-
-def test_reference_mode_independent_token_shapes_and_scores():
-    model = _build_reference_model('independent_token')
-    (flow, packets, lens) = _sample_batch(seed=9)
-    loss = model.compute_loss(flow, packets, lens)
-    assert loss.ndim == 0
-    assert torch.isfinite(loss)
-    traj = model.trajectory_metrics(flow, packets, lens, n_steps=2)
-    assert traj['terminal_norm'].shape == (2,)
-    assert torch.all(torch.isfinite(traj['curvature_packet']))
-
-def test_reference_mode_block_diagonal_shapes_and_scores():
-    model = _build_reference_model('block_diagonal')
-    (flow, packets, lens) = _sample_batch(seed=10)
-    loss = model.compute_loss(flow, packets, lens)
-    assert loss.ndim == 0
-    assert torch.isfinite(loss)
-    vel = model.velocity_score(flow, packets, lens)
-    assert set(vel) == {'velocity_total', 'velocity_flow', 'velocity_packet'}
-
-def test_trajectory_curvature_keys_and_shapes():
-    model = _build_model()
-    (flow, packets, lens) = _sample_batch(seed=1)
-    traj = model.trajectory_metrics(flow, packets, lens, n_steps=4)
-    for key in ('curvature_total', 'curvature_flow', 'curvature_packet'):
-        assert key in traj, f'missing {key}'
-        assert traj[key].shape == (2,)
-        assert torch.all(torch.isfinite(traj[key]))
-        assert torch.all(traj[key] >= 0)
-
-def test_trajectory_curvature_zero_with_one_step():
-    model = _build_model()
-    (flow, packets, lens) = _sample_batch(seed=2)
-    traj = model.trajectory_metrics(flow, packets, lens, n_steps=1)
-    for key in ('curvature_total', 'curvature_flow', 'curvature_packet'):
-        assert traj[key].abs().sum().item() == 0.0
-
-def test_speed_normalized_packet_curvature_scores():
-    model = _build_model()
-    (flow, packets, lens) = _sample_batch(seed=11)
-    traj = model.trajectory_metrics(flow, packets, lens, n_steps=4)
-    keys = ('kappa2_speed2norm_packet_mean', 'kappa2_speed2norm_packet_median', 'kappa2_speed2norm_packet_trimmed10_mean')
-    for key in keys:
-        assert key in traj, f'missing {key}'
-        assert traj[key].shape == (2,)
-        assert torch.all(torch.isfinite(traj[key]))
-        assert torch.all(traj[key] >= 0)
-    one_step = model.trajectory_metrics(flow, packets, lens, n_steps=1)
-    for key in keys:
-        assert one_step[key].abs().sum().item() == 0.0
-
-def test_score_profile_vt_shapes():
-    model = _build_model()
-    (flow, packets, lens) = _sample_batch(seed=3)
-    t_eval = (0.1, 0.3, 0.5, 0.7, 0.9, 1.0)
-    prof = model.score_profile_vt(flow, packets, lens, t_eval=t_eval)
-    assert len(prof) == 3 * len(t_eval)
-    for (k, v) in prof.items():
-        assert v.shape == (2,), k
-        assert torch.all(torch.isfinite(v))
-        assert torch.all(v >= 0)
-    assert 'velocity_total_t05' in prof
-    assert 'velocity_flow_t10' in prof
-    assert 'velocity_packet_t01' in prof
-
-def test_compute_loss_backward_compat():
-    model = _build_model()
-    (flow, packets, lens) = _sample_batch(seed=5)
-    torch.manual_seed(0)
-    a = model.compute_loss(flow, packets, lens)
-    torch.manual_seed(0)
-    b = model.compute_loss(flow, packets, lens, lambda_flow=0.0, lambda_packet=0.0)
-    assert torch.allclose(a, b), f'λ=0 must match old loss; got {a.item()} vs {b.item()}'
-
-def test_compute_loss_aux_components_finite():
-    model = _build_model()
-    (flow, packets, lens) = _sample_batch(seed=6)
-    torch.manual_seed(7)
-    comp = model.compute_loss(flow, packets, lens, lambda_flow=0.1, lambda_packet=0.1, return_components=True)
-    assert set(comp) == {'total', 'main', 'aux_flow', 'aux_packet'}
-    for (k, v) in comp.items():
-        assert torch.isfinite(v), k
-        assert v >= 0, f'{k} negative: {v.item()}'
-
-def test_compute_loss_aux_affects_gradient():
-    model = _build_model()
-    with torch.no_grad():
-        model.velocity.out.weight.normal_(std=0.01)
-        for block in model.velocity.blocks:
-            block.cond_proj.weight.normal_(std=0.01)
-    (flow, packets, lens) = _sample_batch(seed=8)
-    torch.manual_seed(10)
-    total = model.compute_loss(flow, packets, lens, lambda_flow=1.0, lambda_packet=1.0)
-    total.backward()
-    some_grad = False
-    for p in model.parameters():
-        if p.grad is not None and p.grad.abs().sum().item() > 0:
-            some_grad = True
-            break
-    assert some_grad, 'no gradient flowed through aux losses'
-
-def test_consistency_score_shapes():
-    model = _build_model()
-    (flow, packets, lens) = _sample_batch(seed=9)
-    cs = model.consistency_score(flow, packets, lens)
-    assert set(cs) == {'flow_consistency', 'packet_consistency', 'consistency_total'}
-    for (k, v) in cs.items():
-        assert v.shape == (2,), k
-        assert torch.all(torch.isfinite(v))
-        assert torch.all(v >= 0), k
-
-def test_jacobian_hutchinson_shapes_and_nonneg():
-    model = _build_model()
-    with torch.no_grad():
-        model.velocity.out.weight.normal_(std=0.01)
-        for block in model.velocity.blocks:
-            block.cond_proj.weight.normal_(std=0.01)
-    (flow, packets, lens) = _sample_batch(seed=4)
-    gen = torch.Generator().manual_seed(42)
-    jac = model.jacobian_hutchinson(flow, packets, lens, t_eval=(0.5,), n_eps=2, generator=gen)
-    assert set(jac) == {'jacobian_total', 'jacobian_flow', 'jacobian_packet'}
-    for (k, v) in jac.items():
-        assert v.shape == (2,), k
-        assert torch.all(torch.isfinite(v))
-        assert torch.all(v >= 0), f'{k} has negative value'
--- a/Unified_CFM/train.py
+++ b/Unified_CFM/train.py
@@ -1,147 +0,0 @@
-from __future__ import annotations
-import argparse
-import json
-import time
-from dataclasses import asdict
-from pathlib import Path
-from typing import Any
-import numpy as np
-import torch
-import yaml
-from sklearn.metrics import roc_auc_score
-from torch.utils.data import DataLoader, TensorDataset
-from data import UnifiedData, load_unified_data, subsample_train
-from model import UnifiedCFMConfig, UnifiedTokenCFM
-
-def _device(dev_arg: str) -> torch.device:
-    if dev_arg == 'auto':
-        return torch.device('cuda' if torch.cuda.is_available() else 'cpu')
-    return torch.device(dev_arg)
-
-def _batch_score(model: UnifiedTokenCFM, flow_np: np.ndarray, packet_np: np.ndarray, len_np: np.ndarray, device: torch.device, *, batch_size: int, n_steps: int) -> dict[str, np.ndarray]:
-    out: dict[str, list[np.ndarray]] = {}
-    model.eval()
-    for start in range(0, len(flow_np), batch_size):
-        sl = slice(start, start + batch_size)
-        flow = torch.from_numpy(flow_np[sl]).float().to(device)
-        packets = torch.from_numpy(packet_np[sl]).float().to(device)
-        lens = torch.from_numpy(len_np[sl]).long().to(device)
-        metrics = model.trajectory_metrics(flow, packets, lens, n_steps=n_steps)
-        vel = model.velocity_score(flow, packets, lens)
-        metrics.update(vel)
-        for (k, v) in metrics.items():
-            out.setdefault(k, []).append(v.detach().cpu().numpy())
-    return {k: np.concatenate(v, axis=0) for (k, v) in out.items()}
-
-def _quick_eval(model: UnifiedTokenCFM, data: UnifiedData, device: torch.device, cfg: dict[str, Any]) -> dict[str, float]:
-    n_eval = int(cfg.get('eval_n', 2000))
-    rng = np.random.default_rng(0)
-
-    def pick(n: int) -> np.ndarray:
-        m = min(n_eval, n)
-        return rng.choice(n, m, replace=False)
-    vi = pick(len(data.val_flow))
-    ai = pick(len(data.attack_flow))
-    v = _batch_score(model, data.val_flow[vi], data.val_packets[vi], data.val_len[vi], device, batch_size=int(cfg.get('eval_batch_size', 512)), n_steps=int(cfg.get('eval_n_steps', 8)))
-    a = _batch_score(model, data.attack_flow[ai], data.attack_packets[ai], data.attack_len[ai], device, batch_size=int(cfg.get('eval_batch_size', 512)), n_steps=int(cfg.get('eval_n_steps', 8)))
-    y = np.concatenate([np.zeros(len(vi)), np.ones(len(ai))])
-    result: dict[str, float] = {}
-    for key in sorted(v.keys()):
-        s = np.concatenate([v[key], a[key]])
-        s = np.nan_to_num(s, nan=0.0, posinf=1000000000000.0, neginf=-1000000000000.0)
-        result[f'auroc_{key}'] = float(roc_auc_score(y, s))
-    return result
-
-def train(cfg: dict[str, Any]) -> Path:
-    device = _device(str(cfg.get('device', 'auto')))
-    save_dir = Path(cfg['save_dir'])
-    save_dir.mkdir(parents=True, exist_ok=True)
-    with open(save_dir / 'config.yaml', 'w') as f:
-        yaml.safe_dump(cfg, f)
-    seed = int(cfg.get('seed', 42))
-    data_seed = int(cfg.get('data_seed', seed))
-    torch.manual_seed(seed)
-    np.random.seed(seed)
-    print(f'Device: {device}')
-    print(f'[seed] model={seed} data={data_seed}')
-    feature_columns = cfg.get('flow_feature_columns')
-    data = load_unified_data(packets_npz=Path(cfg['packets_npz']) if cfg.get('packets_npz') else None, source_store=Path(cfg['source_store']) if cfg.get('source_store') else None, flows_parquet=Path(cfg['flows_parquet']), flow_features_path=Path(cfg['flow_features_path']) if cfg.get('flow_features_path') else None, flow_feature_columns=feature_columns, flow_features_align=str(cfg.get('flow_features_align', 'auto')), T=int(cfg['T']), split_seed=data_seed, train_ratio=float(cfg.get('train_ratio', 0.8)), benign_label=str(cfg.get('benign_label', 'normal')), min_len=int(cfg.get('min_len', 2)), packet_preprocess=str(cfg.get('packet_preprocess', 'mixed_dequant')), attack_cap=int(cfg['attack_cap']) if cfg.get('attack_cap') else None, val_cap=int(cfg['val_cap']) if cfg.get('val_cap') else None)
-    print(f'[data] T={data.T} packet_D={data.packet_dim} flow_D={data.flow_dim} train={len(data.train_flow):,} val={len(data.val_flow):,} attack={len(data.attack_flow):,}')
-    (tr_f, tr_p, tr_l) = subsample_train(data, int(cfg.get('n_train', 0)), data_seed)
-    ds = TensorDataset(torch.from_numpy(tr_f).float(), torch.from_numpy(tr_p).float(), torch.from_numpy(tr_l).long())
-    loader = DataLoader(ds, batch_size=int(cfg['batch_size']), shuffle=True, drop_last=True, num_workers=int(cfg.get('num_workers', 0)), pin_memory=device.type == 'cuda')
-    print(f'[data] using {len(ds):,} benign training flows')
-    model_cfg = UnifiedCFMConfig(T=data.T, packet_dim=data.packet_dim, flow_dim=data.flow_dim, token_dim=cfg.get('token_dim'), d_model=int(cfg['d_model']), n_layers=int(cfg['n_layers']), n_heads=int(cfg['n_heads']), mlp_ratio=float(cfg.get('mlp_ratio', 4.0)), time_dim=int(cfg.get('time_dim', 64)), sigma=float(cfg.get('sigma', 0.1)), use_ot=bool(cfg.get('use_ot', False)), reference_mode=cfg.get('reference_mode'))
-    model = UnifiedTokenCFM(model_cfg).to(device)
-    print(f'[model] params={model.param_count():,} token_dim={model.token_dim} seq_len={model.seq_len} sigma={model_cfg.sigma} use_ot={model_cfg.use_ot} reference_mode={model_cfg.reference_mode}')
-    opt = torch.optim.AdamW(model.parameters(), lr=float(cfg['lr']), weight_decay=float(cfg.get('weight_decay', 0.01)))
-    total_steps = max(1, int(cfg['epochs']) * len(loader))
-    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=total_steps)
-    history: dict[str, list[Any]] = {'epoch': [], 'loss': [], 'eval': []}
-    lambda_flow = float(cfg.get('lambda_flow', 0.0))
-    lambda_packet = float(cfg.get('lambda_packet', 0.0))
-    packet_mask_ratio = float(cfg.get('packet_mask_ratio', 0.5))
-    aux_enabled = lambda_flow > 0.0 or lambda_packet > 0.0
-    if aux_enabled:
-        print(f'[loss] λ_flow={lambda_flow}  λ_packet={lambda_packet}  packet_mask_ratio={packet_mask_ratio}')
-    for epoch in range(1, int(cfg['epochs']) + 1):
-        model.train()
-        losses: list[float] = []
-        aux_flow_sum = 0.0
-        aux_packet_sum = 0.0
-        n_steps_this_epoch = 0
-        t0 = time.time()
-        for (flow, packets, lens) in loader:
-            flow = flow.to(device, non_blocking=True)
-            packets = packets.to(device, non_blocking=True)
-            lens = lens.to(device, non_blocking=True)
-            if aux_enabled:
-                comp = model.compute_loss(flow, packets, lens, lambda_flow=lambda_flow, lambda_packet=lambda_packet, packet_mask_ratio=packet_mask_ratio, return_components=True)
-                loss = comp['total']
-                aux_flow_sum += float(comp['aux_flow'].item())
-                aux_packet_sum += float(comp['aux_packet'].item())
-            else:
-                loss = model.compute_loss(flow, packets, lens)
-            opt.zero_grad(set_to_none=True)
-            loss.backward()
-            torch.nn.utils.clip_grad_norm_(model.parameters(), float(cfg.get('grad_clip', 1.0)))
-            opt.step()
-            sched.step()
-            losses.append(float(loss.item()))
-            n_steps_this_epoch += 1
-        mean_loss = float(np.mean(losses)) if losses else float('nan')
-        eval_metrics: dict[str, float] | None = None
-        if epoch % int(cfg.get('eval_every', 5)) == 0 or epoch == int(cfg['epochs']):
-            eval_metrics = _quick_eval(model, data, device, cfg)
-        history['epoch'].append(epoch)
-        history['loss'].append(mean_loss)
-        history['eval'].append(eval_metrics)
-        elapsed = time.time() - t0
-        terminal = ''
-        if eval_metrics:
-            terminal = f" auroc_terminal={eval_metrics['auroc_terminal_norm']:.3f}"
-        if aux_enabled and n_steps_this_epoch:
-            terminal += f' aux_flow={aux_flow_sum / n_steps_this_epoch:.4f} aux_pkt={aux_packet_sum / n_steps_this_epoch:.4f}'
-        print(f"[epoch {epoch:>3d}/{cfg['epochs']:<3d}] ({elapsed:.1f}s) loss={mean_loss:.4f}{terminal}")
-        if not np.isfinite(mean_loss):
-            raise RuntimeError(f'non-finite loss at epoch {epoch}')
-    payload = {'model_state_dict': model.state_dict(), 'model_cfg': asdict(model_cfg), 'packet_mean': data.packet_mean, 'packet_std': data.packet_std, 'flow_mean': data.flow_mean, 'flow_std': data.flow_std, 'packet_preprocess': data.packet_preprocess, 'flow_feature_names': np.asarray(data.flow_feature_names), 'packet_feature_names': np.asarray(data.packet_feature_names)}
-    torch.save(payload, save_dir / 'model.pt')
-    with open(save_dir / 'history.json', 'w') as f:
-        json.dump(history, f, indent=2, default=str)
-    print(f"[saved] {save_dir / 'model.pt'}")
-    return save_dir
-
-def main() -> None:
-    p = argparse.ArgumentParser(description=__doc__)
-    p.add_argument('--config', type=Path, required=True)
-    p.add_argument('--override', type=str, nargs='*', default=[])
-    args = p.parse_args()
-    with open(args.config) as f:
-        cfg = yaml.safe_load(f)
-    for override in args.override:
-        (key, value) = override.split('=', 1)
-        cfg[key] = yaml.safe_load(value)
-    train(cfg)
-if __name__ == '__main__':
-    main()
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -21,6 +21,7 @@ dependencies = [
  "pyarrow>=24.0.0",
  "pzflow>=4.0.0",
  "shap>=0.51.0",
+  "cython>=3.2.4",
 ]

 [build-system]
--- a/scripts/ablation/generate_configs.py
+++ b/scripts/ablation/generate_configs.py
@@ -0,0 +1,56 @@
+"""Generate 60 B-group ablation configs from existing 12 base configs.
+
+Reads:
+  Mixed_CFM/configs/<ds>_seed<S>.yaml          (4 datasets × 3 seeds = 12 base)
+
+Writes:
+  Mixed_CFM/configs/ablation/<gid>/<ds>_seed<S>.yaml   (5 variants × 12 = 60)
+
+Each variant overrides save_dir → artifacts/ablation/janus_<ds>_seed<S>_<gid>/
+plus the variant-specific flags. CICIoT2023 base is `ciciot2023_seed42.yaml`
+(NOT `ciciot2023_route_c_seed42.yaml`, which is a different score-router config).
+"""
+from __future__ import annotations
+from pathlib import Path
+import yaml
+
+ROOT = Path(__file__).resolve().parents[2]
+BASE_DIR = ROOT / "Mixed_CFM" / "configs"
+OUT_DIR = ROOT / "Mixed_CFM" / "configs" / "ablation"
+
+DATASETS = ["iscxtor2016", "cicids2017", "cicddos2019", "ciciot2023"]
+SEEDS = [42, 43, 44]
+
+VARIANTS = {
+    "b1_noflow":    {"use_flow_token": False},
+    "b2_flowonly":  {"n_packet_tokens": 0, "lambda_disc": 0.0},
+    "b3_allcont":   {"disc_as_cont": True, "lambda_disc": 0.0},
+    "b4_alldisc":   {"cont_as_disc": True, "n_disc_classes": 8},
+    "b5_nodisc":    {"lambda_disc": 0.0},
+}
+
+
+def main() -> None:
+    OUT_DIR.mkdir(parents=True, exist_ok=True)
+    for gid, overrides in VARIANTS.items():
+        (OUT_DIR / gid).mkdir(parents=True, exist_ok=True)
+    n_written = 0
+    for ds in DATASETS:
+        for seed in SEEDS:
+            base_path = BASE_DIR / f"{ds}_seed{seed}.yaml"
+            if not base_path.exists():
+                print(f"[miss] {base_path}")
+                continue
+            base_cfg = yaml.safe_load(base_path.read_text())
+            for gid, overrides in VARIANTS.items():
+                cfg = dict(base_cfg)
+                cfg["save_dir"] = str(ROOT / "artifacts" / "ablation" / f"janus_{ds}_seed{seed}_{gid}")
+                cfg.update(overrides)
+                out = OUT_DIR / gid / f"{ds}_seed{seed}.yaml"
+                out.write_text(yaml.safe_dump(cfg, sort_keys=False))
+                n_written += 1
+    print(f"[wrote] {n_written} config files under {OUT_DIR}")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/ablation/run_cross_groupB.sh
+++ b/scripts/ablation/run_cross_groupB.sh
@@ -0,0 +1,66 @@
+#!/usr/bin/env bash
+# Cross-dataset evaluation for B-group ablation models.
+# 5 variants × 6 off-diagonal directions × 3 seeds = 90 cross evals.
+#
+# Each B-variant model dir is artifacts/ablation/janus_<ds>_seed<S>_<gid>/.
+# We only cross within the 3-dataset matrix (cicids2017, cicddos2019, ciciot2023);
+# ISCXTor16 has different feature space for cross.
+#
+# Usage:
+#   bash scripts/ablation/run_cross_groupB.sh                     # all 90
+#   bash scripts/ablation/run_cross_groupB.sh b1_noflow b3_allcont
+set -euo pipefail
+ROOT=/home/chy/JANUS
+EVAL=${ROOT}/Mixed_CFM/eval_cross.py
+OUT_DIR=${ROOT}/artifacts/ablation/cross
+mkdir -p "${OUT_DIR}"
+
+declare -A STORE FLOWS FEATS
+STORE[cicids2017]=${ROOT}/datasets/cicids2017/processed/full_store
+FLOWS[cicids2017]=${ROOT}/datasets/cicids2017/processed/flows.parquet
+FEATS[cicids2017]=${ROOT}/datasets/cicids2017/processed/flow_features.parquet
+STORE[cicddos2019]=${ROOT}/datasets/cicddos2019/processed/full_store
+FLOWS[cicddos2019]=${ROOT}/datasets/cicddos2019/processed/flows.parquet
+FEATS[cicddos2019]=${ROOT}/datasets/cicddos2019/processed/flow_features.parquet
+STORE[ciciot2023]=${ROOT}/datasets/ciciot2023/processed/full_store
+FLOWS[ciciot2023]=${ROOT}/datasets/ciciot2023/processed/full_store/flows.parquet
+FEATS[ciciot2023]=${ROOT}/datasets/ciciot2023/processed/flow_features.parquet
+
+ALL_GIDS=(b1_noflow b2_flowonly b3_allcont b4_alldisc b5_nodisc)
+DATASETS=(cicids2017 cicddos2019 ciciot2023)
+SEEDS=(42 43 44)
+GPU="${GPU:-0}"
+
+if [[ $# -gt 0 ]]; then
+  GIDS=("$@")
+else
+  GIDS=("${ALL_GIDS[@]}")
+fi
+
+run_one() {
+  local gid=$1 src=$2 tgt=$3 seed=$4
+  local md=${ROOT}/artifacts/ablation/janus_${src}_seed${seed}_${gid}
+  local out=${OUT_DIR}/${gid}__seed${seed}_${src}_to_${tgt}.json
+  if [[ -f "${out}" ]]; then echo "[skip] $gid ${src}→${tgt} seed${seed}"; return; fi
+  if [[ ! -f "${md}/model.pt" ]]; then echo "[missing model] ${md}/model.pt"; return; fi
+  echo "[gpu${GPU}] $(date +%H:%M:%S) $gid ${src} → ${tgt} seed${seed}"
+  cd ${ROOT}/Mixed_CFM
+  CUDA_VISIBLE_DEVICES=${GPU} uv run --no-sync python -u ${EVAL} \
+    --model-dir ${md} \
+    --target-store ${STORE[$tgt]} --target-flows ${FLOWS[$tgt]} --target-flow-features ${FEATS[$tgt]} \
+    --benign-label normal --n-benign 10000 --n-attack 1000000 \
+    --out ${out} --seed ${seed} --T 64 --batch-size 512 --n-steps 16 \
+    > ${OUT_DIR}/${gid}__seed${seed}_${src}_to_${tgt}.log 2>&1
+}
+
+for gid in "${GIDS[@]}"; do
+  for src in "${DATASETS[@]}"; do
+    for tgt in "${DATASETS[@]}"; do
+      [[ "$src" == "$tgt" ]] && continue
+      for seed in "${SEEDS[@]}"; do
+        run_one "$gid" "$src" "$tgt" "$seed"
+      done
+    done
+  done
+done
+echo "[done] cross evals complete"
--- a/scripts/ablation/run_groupB.sh
+++ b/scripts/ablation/run_groupB.sh
@@ -0,0 +1,76 @@
+#!/usr/bin/env bash
+# Run all 60 B-group ablation training + phase1-eval runs.
+#
+# Splits work across two GPUs round-robin (set GPUS env to override).
+# Logs per-run go to artifacts/ablation/<save_dir>/{train,phase1}.log.
+#
+# Usage:
+#   bash scripts/ablation/run_groupB.sh                     # all 60 runs
+#   bash scripts/ablation/run_groupB.sh b1_noflow b5_nodisc # subset of groups
+#   GPUS=0 bash scripts/ablation/run_groupB.sh              # single-GPU serial
+set -euo pipefail
+cd "$(dirname "$0")/../.."
+
+ALL_GIDS=(b1_noflow b2_flowonly b3_allcont b4_alldisc b5_nodisc)
+DATASETS=(iscxtor2016 cicids2017 cicddos2019 ciciot2023)
+SEEDS=(42 43 44)
+GPUS="${GPUS:-0,1}"
+IFS=',' read -ra GPU_ARR <<< "$GPUS"
+N_GPU=${#GPU_ARR[@]}
+
+if [[ $# -gt 0 ]]; then
+  GIDS=("$@")
+else
+  GIDS=("${ALL_GIDS[@]}")
+fi
+
+# Build the full run list
+runs=()
+for gid in "${GIDS[@]}"; do
+  for ds in "${DATASETS[@]}"; do
+    for seed in "${SEEDS[@]}"; do
+      runs+=("${gid}|${ds}|${seed}")
+    done
+  done
+done
+
+n_runs=${#runs[@]}
+echo "[plan] ${n_runs} runs across GPUs ${GPUS} (gids=${GIDS[*]})"
+
+run_one() {
+  local spec="$1" gpu_id="$2"
+  IFS='|' read -r gid ds seed <<< "$spec"
+  local cfg="Mixed_CFM/configs/ablation/${gid}/${ds}_seed${seed}.yaml"
+  local save_dir
+  save_dir=$(uv run --no-sync python -c "import yaml,sys; print(yaml.safe_load(open('$cfg'))['save_dir'])")
+  mkdir -p "$save_dir"
+  echo "[gpu${gpu_id}] $(date +%H:%M:%S) START $gid $ds seed${seed}"
+  CUDA_VISIBLE_DEVICES="$gpu_id" uv run --no-sync python Mixed_CFM/train.py \
+    --config "$cfg" >"$save_dir/train.log" 2>&1
+  CUDA_VISIBLE_DEVICES="$gpu_id" uv run --no-sync python Mixed_CFM/eval_phase1.py \
+    --model-dir "$save_dir" --out-dir "$save_dir" \
+    --batch-size 256 --n-steps 16 \
+    --n-val-cap 30000 --n-atk-cap 30000 >"$save_dir/phase1.log" 2>&1
+  echo "[gpu${gpu_id}] $(date +%H:%M:%S) DONE  $gid $ds seed${seed}"
+}
+
+# Round-robin assignment
+pids=()
+for i in "${!runs[@]}"; do
+  spec="${runs[$i]}"
+  gpu_id="${GPU_ARR[$((i % N_GPU))]}"
+  # If single GPU: serial; if multi-GPU: parallel up to N_GPU at a time
+  if [[ $N_GPU -eq 1 ]]; then
+    run_one "$spec" "$gpu_id"
+  else
+    run_one "$spec" "$gpu_id" &
+    pids+=($!)
+    # Cap concurrency at N_GPU
+    if (( (i + 1) % N_GPU == 0 )); then
+      for pid in "${pids[@]}"; do wait "$pid" || true; done
+      pids=()
+    fi
+  fi
+done
+for pid in "${pids[@]}"; do wait "$pid" || true; done
+echo "[done] all ${n_runs} runs complete"
--- a/scripts/ablation/smoke_test.sh
+++ b/scripts/ablation/smoke_test.sh
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash
+# Smoke-test all 5 B-group variants on cicids2017 seed42 with reduced epochs
+# and tiny train set, on CPU (so VLLM workers on the GPUs are not disturbed).
+#
+# After: each ablation/janus_cicids2017_seed42_<gid>/ should contain model.pt
+# + phase1_scores.npz with the variant-specific score keys.
+set -euo pipefail
+cd "$(dirname "$0")/../.."
+
+GIDS=(b1_noflow b2_flowonly b3_allcont b4_alldisc b5_nodisc)
+DS=cicids2017
+SEED=42
+
+for gid in "${GIDS[@]}"; do
+  cfg="Mixed_CFM/configs/ablation/${gid}/${DS}_seed${SEED}.yaml"
+  echo "=================================================="
+  echo "[smoke] $gid"
+  echo "=================================================="
+  uv run --no-sync python Mixed_CFM/train.py \
+    --config "$cfg" \
+    --override "device=cpu" "epochs=2" "n_train=500" "eval_n=200" "eval_every=2" \
+    "save_dir=/home/chy/JANUS/artifacts/ablation_smoke/${gid}" 2>&1 | tail -8
+  uv run --no-sync python Mixed_CFM/eval_phase1.py \
+    --model-dir "/home/chy/JANUS/artifacts/ablation_smoke/${gid}" \
+    --out-dir "/home/chy/JANUS/artifacts/ablation_smoke/${gid}" \
+    --device cpu --batch-size 64 --n-steps 4 \
+    --n-val-cap 200 --n-atk-cap 200 2>&1 | tail -4
+  echo
+done
+echo "=== Smoke summary ==="
+for gid in "${GIDS[@]}"; do
+  npz="/home/chy/JANUS/artifacts/ablation_smoke/${gid}/phase1_scores.npz"
+  if [[ -f "$npz" ]]; then
+    keys=$(uv run --no-sync python -c "import numpy as np; z=np.load('$npz', allow_pickle=True); print(','.join(sorted(k for k in z.files if k.startswith(('val_terminal','val_disc')))))")
+    echo "$gid: $keys"
+  else
+    echo "$gid: MISSING"
+  fi
+done
--- a/scripts/aggregate/aggregate_ablation.py
+++ b/scripts/aggregate/aggregate_ablation.py
@@ -0,0 +1,533 @@
+"""JANUS ablation aggregator (Groups A + B).
+
+Reads phase1_scores.npz from:
+  artifacts/route_comparison/janus_<ds>_seed<S>/      (A + JANUS-full anchor)
+  artifacts/ablation/janus_<ds>_seed<S>_<gid>/        (B variants)
+
+Produces:
+  artifacts/ablation/ABLATION_TABLE.md                final markdown table
+  artifacts/ablation/ABLATION_TABLE_RAW.json          per-cell mean / std / CI / per-seed
+  artifacts/ablation/ABLATION_DELONG.md               paired DeLong p-values vs JANUS-full
+
+Group A operates entirely on existing route_comparison npz files (no GPU).
+Group B requires the 60 B-variant runs to have completed.
+"""
+from __future__ import annotations
+import argparse
+import json
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Iterable
+
+import numpy as np
+from sklearn.covariance import OAS
+from sklearn.metrics import roc_auc_score
+
+ROOT = Path(__file__).resolve().parents[2]
+ROUTE = ROOT / "artifacts" / "route_comparison"
+ABL = ROOT / "artifacts" / "ablation"
+
+DATASETS = ["iscxtor2016", "cicids2017", "cicddos2019", "ciciot2023"]
+PRETTY = {
+    "iscxtor2016": "ISCXTor16",
+    "cicids2017": "CICIDS17",
+    "cicddos2019": "CICDDoS19",
+    "ciciot2023": "CICIoT23",
+}
+SEEDS = [42, 43, 44]
+T_975_N3 = 4.302653  # 95% t-CI factor for n=3 (df=2)
+
+CONT_KEYS = ["terminal_norm", "terminal_flow", "terminal_packet"]
+DISC_KEYS = ["disc_nll_total", "disc_nll_ch2", "disc_nll_ch3",
+             "disc_nll_ch4", "disc_nll_ch5", "disc_nll_ch6", "disc_nll_ch7"]
+ALL_KEYS = CONT_KEYS + DISC_KEYS  # 10-d
+
+
+# --------------------------------------------------------------------------- #
+# I/O                                                                         #
+# --------------------------------------------------------------------------- #
+def _load_npz(npz_path: Path):
+    z = np.load(npz_path, allow_pickle=True)
+    val = {}
+    atk = {}
+    for k in z.files:
+        if k.startswith("val_") and k != "val_labels":
+            val[k[4:]] = z[k]
+        elif k.startswith("atk_") and k != "atk_labels":
+            atk[k[4:]] = z[k]
+    return val, atk
+
+
+def _load_cross_npz(npz_path: Path):
+    """Cross npz schema:  b_<key> = target benign,  a_<key> = target attacks."""
+    z = np.load(npz_path, allow_pickle=True)
+    val = {}
+    atk = {}
+    for k in z.files:
+        if k.startswith("b_") and k != "b_labels":
+            val[k[2:]] = z[k]
+        elif k.startswith("a_") and k != "a_labels":
+            atk[k[2:]] = z[k]
+    return val, atk
+
+
+def _stack(d: dict, keys: list[str]) -> np.ndarray:
+    arrs = []
+    for k in keys:
+        if k in d:
+            arrs.append(d[k])
+        else:
+            # variant doesn't produce this score (e.g. B2 has no disc, B5 disc untrained)
+            return None
+    out = np.stack(arrs, axis=1).astype(np.float64)
+    return np.nan_to_num(out, nan=0.0, posinf=1e6, neginf=-1e6)
+
+
+# --------------------------------------------------------------------------- #
+# Score functions (Group A definitions)                                       #
+# --------------------------------------------------------------------------- #
+def _mahal(S, mu, inv_cov):
+    d = S - mu
+    return np.einsum("ni,ij,nj->n", d, inv_cov, d)
+
+
+def _oas_mahal(val_S, atk_S):
+    mu = val_S.mean(axis=0)
+    cov = OAS().fit(val_S).covariance_
+    inv = np.linalg.inv(cov + 1e-9 * np.eye(cov.shape[0]))
+    return _mahal(val_S, mu, inv), _mahal(atk_S, mu, inv)
+
+
+def _zscore_agg(val_S, atk_S, mode="mean"):
+    mu = val_S.mean(axis=0)
+    sd = val_S.std(axis=0) + 1e-9
+    zv = (val_S - mu) / sd
+    za = (atk_S - mu) / sd
+    if mode == "mean":
+        return zv.mean(axis=1), za.mean(axis=1)
+    if mode == "max":
+        return zv.max(axis=1), za.max(axis=1)
+    raise ValueError(mode)
+
+
+def score_a1_terminal_norm(val, atk):
+    return val["terminal_norm"], atk["terminal_norm"]
+
+
+def score_a2_disc_total(val, atk):
+    if "disc_nll_total" not in val:
+        return None
+    return val["disc_nll_total"], atk["disc_nll_total"]
+
+
+def score_a3_oas_term3(val, atk):
+    Sv = _stack(val, CONT_KEYS)
+    Sa = _stack(atk, CONT_KEYS)
+    if Sv is None or Sa is None:
+        return None
+    return _oas_mahal(Sv, Sa)
+
+
+def score_a4_oas_disc7(val, atk):
+    Sv = _stack(val, DISC_KEYS)
+    Sa = _stack(atk, DISC_KEYS)
+    if Sv is None or Sa is None:
+        return None
+    return _oas_mahal(Sv, Sa)
+
+
+def score_a5_oas_all10(val, atk):
+    Sv = _stack(val, ALL_KEYS)
+    Sa = _stack(atk, ALL_KEYS)
+    if Sv is None or Sa is None:
+        return None
+    return _oas_mahal(Sv, Sa)
+
+
+def score_a6_zmean(val, atk):
+    Sv = _stack(val, ALL_KEYS)
+    Sa = _stack(atk, ALL_KEYS)
+    if Sv is None or Sa is None:
+        return None
+    return _zscore_agg(Sv, Sa, "mean")
+
+
+def score_a7_zmax(val, atk):
+    Sv = _stack(val, ALL_KEYS)
+    Sa = _stack(atk, ALL_KEYS)
+    if Sv is None or Sa is None:
+        return None
+    return _zscore_agg(Sv, Sa, "max")
+
+
+def score_oas_disc_all(val, atk):
+    """Auto-discover all `disc_nll_*` keys; OAS-Mahal over them. Used by B4."""
+    keys = sorted(k for k in val.keys() if k.startswith("disc_nll_"))
+    if not keys:
+        return None
+    Sv = _stack(val, keys)
+    Sa = _stack(atk, keys)
+    if Sv is None or Sa is None:
+        return None
+    return _oas_mahal(Sv, Sa)
+
+
+def score_oas_all_available(val, atk):
+    """OAS-Mahal over all `terminal_*` ∪ `disc_nll_*` keys present in the npz.
+
+    Used by B1 (no terminal_flow). Handles arbitrary subset of the 10 standard keys.
+    """
+    keys = sorted([k for k in val.keys() if k.startswith("terminal_") or k.startswith("disc_nll_")])
+    if not keys:
+        return None
+    if len(keys) == 1:
+        return val[keys[0]], atk[keys[0]]
+    Sv = _stack(val, keys)
+    Sa = _stack(atk, keys)
+    if Sv is None or Sa is None:
+        return None
+    return _oas_mahal(Sv, Sa)
+
+
+def score_oas_term_all(val, atk):
+    """Auto-discover all `terminal_*` keys; OAS-Mahal. Used by B3 (3 keys) / B1 (2 keys)."""
+    keys = sorted(k for k in val.keys() if k.startswith("terminal_"))
+    if not keys:
+        return None
+    if len(keys) == 1:
+        # single scalar: just return raw
+        return val[keys[0]], atk[keys[0]]
+    Sv = _stack(val, keys)
+    Sa = _stack(atk, keys)
+    if Sv is None or Sa is None:
+        return None
+    return _oas_mahal(Sv, Sa)
+
+
+SCORE_FNS = {
+    "A1_terminal_norm": score_a1_terminal_norm,
+    "A2_disc_nll_total": score_a2_disc_total,
+    "A3_OAS_term3": score_a3_oas_term3,
+    "A4_OAS_disc7": score_a4_oas_disc7,
+    "A5_OAS_all10": score_a5_oas_all10,
+    "A6_zmean_all10": score_a6_zmean,
+    "A7_zmax_all10": score_a7_zmax,
+    "OAS_disc_all": score_oas_disc_all,
+    "OAS_term_all": score_oas_term_all,
+    "OAS_all_available": score_oas_all_available,
+}
+
+
+# --------------------------------------------------------------------------- #
+# Stats                                                                       #
+# --------------------------------------------------------------------------- #
+def _auroc(s_v, s_a):
+    y = np.r_[np.zeros(len(s_v)), np.ones(len(s_a))]
+    s = np.r_[s_v, s_a]
+    return float(roc_auc_score(y, s))
+
+
+def _mean_ci(values: list[float]):
+    """3-seed mean ± 95% t-CI (n=3, df=2)."""
+    a = np.asarray([v for v in values if v is not None and not np.isnan(v)], dtype=float)
+    if a.size == 0:
+        return None
+    if a.size == 1:
+        return {"mean": float(a[0]), "std": 0.0, "ci": 0.0, "n": 1, "vals": a.tolist()}
+    se = a.std(ddof=1) / np.sqrt(a.size)
+    return {
+        "mean": float(a.mean()),
+        "std": float(a.std(ddof=1)),
+        "ci": float(T_975_N3 * se) if a.size == 3 else float(1.96 * se),
+        "n": int(a.size),
+        "vals": a.tolist(),
+    }
+
+
+def _delong_var(s_v, s_a):
+    """Compute DeLong AUROC variance (Sun & Xu 2014, fast O(n log n))."""
+    n0, n1 = len(s_v), len(s_a)
+    s = np.concatenate([s_a, s_v])  # positives first
+    order = np.argsort(s, kind="mergesort")
+    L = np.empty_like(s)
+    s_sorted = s[order]
+    # midrank
+    i = 0
+    while i < len(s_sorted):
+        j = i
+        while j < len(s_sorted) and s_sorted[j] == s_sorted[i]:
+            j += 1
+        L[order[i:j]] = (i + j - 1) / 2.0 + 1
+        i = j
+    # ranks split
+    L_a = L[:n1]
+    L_v = L[n1:]
+    # midrank within each class
+    s_a_order = np.argsort(s_a, kind="mergesort")
+    L_aa = np.empty(n1)
+    sa_sorted = s_a[s_a_order]
+    i = 0
+    while i < n1:
+        j = i
+        while j < n1 and sa_sorted[j] == sa_sorted[i]:
+            j += 1
+        L_aa[s_a_order[i:j]] = (i + j - 1) / 2.0 + 1
+        i = j
+    s_v_order = np.argsort(s_v, kind="mergesort")
+    L_vv = np.empty(n0)
+    sv_sorted = s_v[s_v_order]
+    i = 0
+    while i < n0:
+        j = i
+        while j < n0 and sv_sorted[j] == sv_sorted[i]:
+            j += 1
+        L_vv[s_v_order[i:j]] = (i + j - 1) / 2.0 + 1
+        i = j
+    auc = (L_a.sum() / n1 - (n1 + 1) / 2) / n0
+    V10 = (L_a - L_aa) / n0  # length n1
+    V01 = 1 - (L_v - L_vv) / n1  # length n0
+    s10 = V10.var(ddof=1)
+    s01 = V01.var(ddof=1)
+    var = s10 / n1 + s01 / n0
+    return float(auc), float(var), V10, V01
+
+
+def _delong_paired_p(s_v, s_a, t_v, t_a):
+    """Paired DeLong test for two AUROCs on the same data.
+
+    Returns (auc1 - auc2, p_value_two_sided).
+    s_*: candidate scores; t_*: reference (JANUS-full) scores.
+    Both arrays must align flow-by-flow.
+    """
+    auc1, var1, V10_1, V01_1 = _delong_var(s_v, s_a)
+    auc2, var2, V10_2, V01_2 = _delong_var(t_v, t_a)
+    n1, n0 = len(s_a), len(s_v)
+    cov10 = np.cov(np.stack([V10_1, V10_2]), ddof=1)[0, 1]
+    cov01 = np.cov(np.stack([V01_1, V01_2]), ddof=1)[0, 1]
+    cov12 = cov10 / n1 + cov01 / n0
+    var_diff = var1 + var2 - 2 * cov12
+    if var_diff <= 0:
+        return auc1 - auc2, 1.0
+    z = (auc1 - auc2) / np.sqrt(var_diff)
+    # two-sided
+    from scipy.stats import norm
+    p = 2 * (1 - norm.cdf(abs(z)))
+    return auc1 - auc2, float(p)
+
+
+# --------------------------------------------------------------------------- #
+# Aggregation entry points                                                    #
+# --------------------------------------------------------------------------- #
+@dataclass
+class VariantSpec:
+    vid: str
+    label: str
+    what_removed: str
+    npz_dir_pattern: str  # e.g. "route_comparison/janus_{ds}_seed{seed}" or "ablation/janus_{ds}_seed{seed}_{gid}"
+    score_fn_id: str  # which Group A score to apply on the npz (usually "A5_OAS_all10")
+    gid: str = ""  # for B variants
+
+
+def _expand_path(spec: VariantSpec, ds: str, seed: int) -> Path:
+    return ROOT / "artifacts" / spec.npz_dir_pattern.format(ds=ds, seed=seed, gid=spec.gid) / "phase1_scores.npz"
+
+
+def collect_variant(spec: VariantSpec) -> dict:
+    rows: dict[str, list[float]] = {ds: [] for ds in DATASETS}
+    per_seed: dict[str, dict[int, float]] = {ds: {} for ds in DATASETS}
+    for ds in DATASETS:
+        for seed in SEEDS:
+            npz = _expand_path(spec, ds, seed)
+            if not npz.exists():
+                continue
+            val, atk = _load_npz(npz)
+            fn = SCORE_FNS[spec.score_fn_id]
+            res = fn(val, atk)
+            if res is None:
+                continue
+            sv, sa = res
+            auc = _auroc(sv, sa)
+            rows[ds].append(auc)
+            per_seed[ds][seed] = auc
+    summary = {ds: _mean_ci(rows[ds]) for ds in DATASETS}
+    return {
+        "vid": spec.vid,
+        "label": spec.label,
+        "what_removed": spec.what_removed,
+        "score_fn_id": spec.score_fn_id,
+        "gid": spec.gid,
+        "per_dataset": summary,
+        "per_seed": per_seed,
+    }
+
+
+def collect_delong_pvals(spec: VariantSpec, ref_spec: VariantSpec) -> dict:
+    """Paired DeLong test: spec vs ref_spec, on each (ds, seed)."""
+    out: dict[str, list[dict]] = {ds: [] for ds in DATASETS}
+    for ds in DATASETS:
+        for seed in SEEDS:
+            npz_s = _expand_path(spec, ds, seed)
+            npz_r = _expand_path(ref_spec, ds, seed)
+            if not (npz_s.exists() and npz_r.exists()):
+                continue
+            val_s, atk_s = _load_npz(npz_s)
+            val_r, atk_r = _load_npz(npz_r)
+            fn_s = SCORE_FNS[spec.score_fn_id]
+            fn_r = SCORE_FNS[ref_spec.score_fn_id]
+            res_s = fn_s(val_s, atk_s)
+            res_r = fn_r(val_r, atk_r)
+            if res_s is None or res_r is None:
+                continue
+            sv_s, sa_s = res_s
+            sv_r, sa_r = res_r
+            # if shapes differ (e.g. variant evaluated on subset), align by index — they should match seed-for-seed
+            # in practice for B variants the npz is from the SAME data as JANUS-full at that (ds, seed)
+            if len(sv_s) != len(sv_r) or len(sa_s) != len(sa_r):
+                continue
+            d, p = _delong_paired_p(sv_s, sa_s, sv_r, sa_r)
+            out[ds].append({"seed": seed, "delta": d, "p": p})
+    return out
+
+
+# --------------------------------------------------------------------------- #
+# Variant registry                                                            #
+# --------------------------------------------------------------------------- #
+ROUTE_DIR = "route_comparison/janus_{ds}_seed{seed}"
+ABL_DIR = "ablation/janus_{ds}_seed{seed}_{gid}"
+
+
+def _group_a_specs() -> list[VariantSpec]:
+    base = ROUTE_DIR
+    return [
+        VariantSpec("JANUS-full", "JANUS-full (A5)", "—", base, "A5_OAS_all10"),
+        VariantSpec("A1", "A1 terminal_norm", "OAS aggregator + disc head", base, "A1_terminal_norm"),
+        VariantSpec("A2", "A2 disc_nll_total", "OAS aggregator + CFM head", base, "A2_disc_nll_total"),
+        VariantSpec("A3", "A3 OAS-Mahal term3", "disc head", base, "A3_OAS_term3"),
+        VariantSpec("A4", "A4 OAS-Mahal disc7", "CFM head", base, "A4_OAS_disc7"),
+        VariantSpec("A6", "A6 z-score mean (10-d)", "covariance structure", base, "A6_zmean_all10"),
+        VariantSpec("A7", "A7 z-score max (10-d)", "weighted aggregation", base, "A7_zmax_all10"),
+    ]
+
+
+def _group_b_specs() -> list[VariantSpec]:
+    return [
+        # B1 has 2 terminal keys (no terminal_flow) + full disc7 → use auto-key OAS (9-d in this case)
+        VariantSpec("B1", "B1 no FLOW token", "global context",        ABL_DIR, "OAS_all_available", gid="b1_noflow"),
+        # B2 has only terminal_flow (= terminal_norm); single scalar
+        VariantSpec("B2", "B2 flow-only",     "packet sequence",        ABL_DIR, "A1_terminal_norm", gid="b2_flowonly"),
+        # B3 has terminal_norm/flow/packet covering all 9 dims (cont + disc-as-cont); OAS on 3-tuple
+        VariantSpec("B3", "B3 all-cont",      "cont/disc split",        ABL_DIR, "A3_OAS_term3", gid="b3_allcont"),
+        # B4 has 9 disc channels + total; auto-discover keys
+        VariantSpec("B4", "B4 all-disc",      "cont/disc split (rev)",  ABL_DIR, "OAS_disc_all", gid="b4_alldisc"),
+        # B5 has full schema but disc head is untrained noise; use term3 only
+        VariantSpec("B5", "B5 λ_disc=0",      "joint training",         ABL_DIR, "A3_OAS_term3", gid="b5_nodisc"),
+    ]
+
+
+# --------------------------------------------------------------------------- #
+# Markdown writer                                                             #
+# --------------------------------------------------------------------------- #
+def _fmt_cell(c: dict | None) -> str:
+    if c is None:
+        return "—"
+    if c["n"] == 1:
+        return f"{100 * c['mean']:.2f}"
+    return f"{100 * c['mean']:.2f} ± {100 * c['ci']:.2f}"
+
+
+def write_table(rows: list[dict], path: Path, *, title: str = "JANUS ablation"):
+    lines = [f"# {title}", ""]
+    lines.append(f"3-seed mean ± 95% t-CI AUROC (%). Seeds = {SEEDS}.")
+    lines.append("")
+    header = ["Variant", "What removed"] + [PRETTY[ds] for ds in DATASETS] + ["Mean"]
+    lines.append("| " + " | ".join(header) + " |")
+    lines.append("|" + "|".join("---" for _ in header) + "|")
+    for r in rows:
+        cells = [r["label"], r["what_removed"]]
+        ds_means = []
+        for ds in DATASETS:
+            c = r["per_dataset"].get(ds)
+            cells.append(_fmt_cell(c))
+            if c is not None:
+                ds_means.append(c["mean"])
+        cells.append(f"{100 * np.mean(ds_means):.2f}" if ds_means else "—")
+        lines.append("| " + " | ".join(cells) + " |")
+    lines.append("")
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text("\n".join(lines))
+
+
+def write_delong(records: list[dict], path: Path):
+    lines = ["# Paired DeLong p-values vs JANUS-full",
+             "",
+             f"Seeds = {SEEDS}. p reported per (variant, dataset, seed). "
+             "Holm-Bonferroni-correctable; raw p shown.",
+             ""]
+    for rec in records:
+        lines.append(f"## {rec['label']}  ({rec['vid']})")
+        lines.append("")
+        header = ["Seed"] + [PRETTY[ds] for ds in DATASETS]
+        lines.append("| " + " | ".join(header) + " |")
+        lines.append("|" + "|".join("---" for _ in header) + "|")
+        for seed in SEEDS:
+            row = [str(seed)]
+            for ds in DATASETS:
+                hits = [x for x in rec["delong"][ds] if x["seed"] == seed]
+                if hits:
+                    h = hits[0]
+                    sign = "+" if h["delta"] >= 0 else "−"
+                    row.append(f"Δ={sign}{abs(h['delta']):.4f}, p={h['p']:.3g}")
+                else:
+                    row.append("—")
+            lines.append("| " + " | ".join(row) + " |")
+        lines.append("")
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text("\n".join(lines))
+
+
+# --------------------------------------------------------------------------- #
+# Main                                                                        #
+# --------------------------------------------------------------------------- #
+def main() -> None:
+    ap = argparse.ArgumentParser()
+    ap.add_argument("--group", choices=["A", "B", "all"], default="A")
+    ap.add_argument("--delong", action="store_true",
+                    help="Compute paired DeLong p-values vs JANUS-full (CPU heavy on big eval sets).")
+    args = ap.parse_args()
+
+    ABL.mkdir(parents=True, exist_ok=True)
+    specs: list[VariantSpec] = []
+    if args.group in ("A", "all"):
+        specs.extend(_group_a_specs())
+    if args.group in ("B", "all"):
+        specs.extend(_group_b_specs())
+
+    rows = []
+    for spec in specs:
+        r = collect_variant(spec)
+        rows.append(r)
+        n_ok = sum(1 for ds in DATASETS if r["per_dataset"][ds] is not None)
+        print(f"[ok] {spec.vid:14s}  datasets_with_data={n_ok}/{len(DATASETS)}", flush=True)
+
+    out_md = ABL / f"ABLATION_TABLE_{args.group}.md"
+    write_table(rows, out_md, title=f"JANUS ablation (group {args.group})")
+    out_json = ABL / f"ABLATION_TABLE_{args.group}.json"
+    out_json.write_text(json.dumps(rows, indent=2, default=lambda o: None))
+    print(f"[wrote] {out_md}")
+    print(f"[wrote] {out_json}")
+
+    if args.delong:
+        ref = next(s for s in _group_a_specs() if s.vid == "JANUS-full")
+        recs = []
+        for spec in specs:
+            if spec.vid == "JANUS-full":
+                continue
+            d = collect_delong_pvals(spec, ref)
+            recs.append({"vid": spec.vid, "label": spec.label, "delong": d})
+            print(f"[delong] {spec.vid}", flush=True)
+        write_delong(recs, ABL / f"ABLATION_DELONG_{args.group}.md")
+        print(f"[wrote] {ABL / f'ABLATION_DELONG_{args.group}.md'}")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/aggregate/aggregate_ablation_cross.py
+++ b/scripts/aggregate/aggregate_ablation_cross.py
@@ -0,0 +1,218 @@
+"""Cross-dataset version of the Group-A score-aggregator ablation.
+
+For each (src, tgt, seed) cell we have a phase1-style npz with:
+  b_<key>   target benign val  (aggregator fit on this)
+  a_<key>   target attacks
+
+Within-dataset (src == tgt) cells reuse the standard
+artifacts/route_comparison/janus_<ds>_seed<S>/phase1_scores.npz
+(val_/atk_ prefixes — handled via the same _load_npz path).
+
+We score 7 aggregators (A1..A7) + JANUS-full's deployed A5 across all
+3×3 cells × 3 seeds, then summarize with two complementary views:
+
+  ABLATION_TABLE_CROSS_summary.md
+    | Aggregator | Within mean | Cross mean | Cross min (worst cell) |
+    Shows whether OAS's value lives in cross-dataset robustness.
+
+  ABLATION_TABLE_CROSS_full.md
+    Per-aggregator full 3×3 matrix (each cell = 3-seed mean ± 95% t-CI).
+"""
+from __future__ import annotations
+import argparse
+import json
+from pathlib import Path
+import numpy as np
+
+from aggregate_ablation import (
+    SCORE_FNS, T_975_N3, _auroc, _load_npz, _load_cross_npz,
+)
+
+ROOT = Path(__file__).resolve().parents[2]
+ROUTE = ROOT / "artifacts" / "route_comparison"
+CROSS = ROUTE / "cross"
+ABL = ROOT / "artifacts" / "ablation"
+
+# 3x3 cross matrix datasets (no ISCXTor16 — different feature space)
+CROSS_DATASETS = ["cicids2017", "cicddos2019", "ciciot2023"]
+PRETTY = {
+    "cicids2017": "CICIDS17",
+    "cicddos2019": "CICDDoS19",
+    "ciciot2023": "CICIoT23",
+}
+SEEDS = [42, 43, 44]
+
+AGGREGATORS = [
+    ("JANUS-full (A5)", "A5_OAS_all10",     "deployed JANUS"),
+    ("A1 terminal_norm","A1_terminal_norm", "raw scalar (CFM head)"),
+    ("A2 disc_total",   "A2_disc_nll_total","raw scalar (disc head)"),
+    ("A3 OAS term3",    "A3_OAS_term3",     "OAS on 3 cont sub-scores"),
+    ("A4 OAS disc7",    "A4_OAS_disc7",     "OAS on 7 disc sub-scores"),
+    ("A6 z-score mean", "A6_zmean_all10",   "equal-weight z-score sum"),
+    ("A7 z-score max",  "A7_zmax_all10",    "equal-weight z-score max"),
+]
+
+
+# --------------------------------------------------------------------------- #
+def _cell_path(src: str, tgt: str, seed: int) -> Path | None:
+    """Return npz path for (src, tgt, seed) cell, or None if missing."""
+    if src == tgt:
+        p = ROUTE / f"janus_{src}_seed{seed}" / "phase1_scores.npz"
+        return p if p.exists() else None
+    p = CROSS / f"janus_seed{seed}_{src}_to_{tgt}.npz"
+    return p if p.exists() else None
+
+
+def _load_cell(src: str, tgt: str, seed: int):
+    p = _cell_path(src, tgt, seed)
+    if p is None:
+        return None, None
+    if src == tgt:
+        return _load_npz(p)
+    return _load_cross_npz(p)
+
+
+def _score_cell(src: str, tgt: str, seed: int, score_fn_id: str) -> float | None:
+    val, atk = _load_cell(src, tgt, seed)
+    if val is None:
+        return None
+    fn = SCORE_FNS[score_fn_id]
+    res = fn(val, atk)
+    if res is None:
+        return None
+    sv, sa = res
+    return _auroc(sv, sa)
+
+
+def _seed_means(src: str, tgt: str, score_fn_id: str) -> dict | None:
+    """3-seed AUROC for cell (src,tgt). Returns dict with mean/std/ci, or None."""
+    vals = []
+    for seed in SEEDS:
+        v = _score_cell(src, tgt, seed, score_fn_id)
+        if v is not None and not np.isnan(v):
+            vals.append(v)
+    if not vals:
+        return None
+    a = np.asarray(vals)
+    if a.size == 1:
+        return {"mean": float(a[0]), "std": 0.0, "ci": 0.0, "n": 1, "vals": a.tolist()}
+    se = a.std(ddof=1) / np.sqrt(a.size)
+    return {
+        "mean": float(a.mean()),
+        "std": float(a.std(ddof=1)),
+        "ci": float(T_975_N3 * se) if a.size == 3 else float(1.96 * se),
+        "n": int(a.size),
+        "vals": a.tolist(),
+    }
+
+
+# --------------------------------------------------------------------------- #
+def _fmt_cell(c):
+    if c is None:
+        return "—"
+    if c["n"] == 1:
+        return f"{100 * c['mean']:.2f}"
+    return f"{100 * c['mean']:.2f} ± {100 * c['ci']:.2f}"
+
+
+def _summary_row(rows_3x3: dict[tuple[str, str], dict | None]) -> tuple[float, float, float, dict | None]:
+    """Return (within_mean, cross_mean, cross_worst, worst_cell_summary)."""
+    within = []
+    cross = []
+    worst_v = None
+    worst_cell = None
+    for (src, tgt), cell in rows_3x3.items():
+        if cell is None:
+            continue
+        if src == tgt:
+            within.append(cell["mean"])
+        else:
+            cross.append(cell["mean"])
+            if worst_v is None or cell["mean"] < worst_v:
+                worst_v = cell["mean"]
+                worst_cell = (src, tgt, cell)
+    w = float(np.mean(within)) if within else float("nan")
+    c = float(np.mean(cross)) if cross else float("nan")
+    cw = worst_v if worst_v is not None else float("nan")
+    return w, c, cw, worst_cell
+
+
+# --------------------------------------------------------------------------- #
+def main() -> None:
+    ap = argparse.ArgumentParser()
+    ap.add_argument("--out-dir", type=Path, default=ABL)
+    args = ap.parse_args()
+    args.out_dir.mkdir(parents=True, exist_ok=True)
+
+    full = {}  # aggregator label -> {(src, tgt) -> cell summary}
+    for label, fn_id, _why in AGGREGATORS:
+        rows = {}
+        for src in CROSS_DATASETS:
+            for tgt in CROSS_DATASETS:
+                rows[(src, tgt)] = _seed_means(src, tgt, fn_id)
+        full[label] = rows
+        n_ok = sum(1 for v in rows.values() if v is not None)
+        print(f"[ok] {label:20s} cells={n_ok}/{len(rows)}", flush=True)
+
+    # Summary table: within mean, cross mean, cross worst
+    summary_lines = ["# Cross-dataset Group-A summary",
+                     "",
+                     f"3-seed mean ± 95% t-CI AUROC. Datasets = {CROSS_DATASETS}.",
+                     "Aggregator fit on **target** benign val only.",
+                     "",
+                     "| Aggregator | Within (3 cells, mean) | Cross (6 cells, mean) | Cross worst cell | Within − Cross |",
+                     "|---|---|---|---|---|"]
+    summary_data = {}
+    for label, fn_id, _why in AGGREGATORS:
+        rows = full[label]
+        w, c, cw, worst_cell = _summary_row(rows)
+        gap = (w - c) * 100 if not np.isnan(w) and not np.isnan(c) else float("nan")
+        worst_str = "—"
+        if worst_cell is not None:
+            src, tgt, cell = worst_cell
+            worst_str = f"{PRETTY[src]}→{PRETTY[tgt]}: {_fmt_cell(cell)}"
+        summary_lines.append(
+            f"| {label} | {100 * w:.2f} | {100 * c:.2f} | {worst_str} | {gap:+.2f} |"
+        )
+        summary_data[label] = {"within_mean": w, "cross_mean": c, "cross_worst": cw, "worst_cell": worst_cell}
+    summary_path = args.out_dir / "ABLATION_TABLE_CROSS_summary.md"
+    summary_path.write_text("\n".join(summary_lines) + "\n")
+    print(f"[wrote] {summary_path}")
+
+    # Full per-aggregator 3x3 matrices
+    full_lines = ["# Cross-dataset Group-A full matrices",
+                  "",
+                  "Per aggregator: 3×3 matrix (rows = source / training, columns = target / test).",
+                  "Each cell = 3-seed mean ± 95% t-CI AUROC (%). Diagonal italic = within-dataset.",
+                  ""]
+    for label, fn_id, why in AGGREGATORS:
+        full_lines.append(f"## {label}  ({why})")
+        full_lines.append("")
+        header = ["Source ↓ / Target →"] + [PRETTY[d] for d in CROSS_DATASETS]
+        full_lines.append("| " + " | ".join(header) + " |")
+        full_lines.append("|" + "|".join("---" for _ in header) + "|")
+        for src in CROSS_DATASETS:
+            row = [f"**{PRETTY[src]}**"]
+            for tgt in CROSS_DATASETS:
+                cell = full[label][(src, tgt)]
+                txt = _fmt_cell(cell)
+                if src == tgt:
+                    txt = f"_{txt}_"
+                row.append(txt)
+            full_lines.append("| " + " | ".join(row) + " |")
+        full_lines.append("")
+    full_path = args.out_dir / "ABLATION_TABLE_CROSS_full.md"
+    full_path.write_text("\n".join(full_lines))
+    print(f"[wrote] {full_path}")
+
+    json_path = args.out_dir / "ABLATION_TABLE_CROSS.json"
+    json_path.write_text(json.dumps({
+        "summary": summary_data,
+        "full": {label: {f"{src}->{tgt}": cell for (src, tgt), cell in rows.items()}
+                 for label, rows in full.items()},
+    }, indent=2, default=lambda o: None))
+    print(f"[wrote] {json_path}")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/aggregate/aggregate_ablation_cross_B.py
+++ b/scripts/aggregate/aggregate_ablation_cross_B.py
@@ -0,0 +1,180 @@
+"""B-variant cross-dataset aggregation.
+
+Reads:
+  artifacts/ablation/janus_<ds>_seed<S>_<gid>/phase1_scores.npz   (within-dataset)
+  artifacts/ablation/cross/<gid>__seed<S>_<src>_to_<tgt>.npz      (cross-dataset)
+
+For each B-variant we apply the variant-appropriate aggregator (auto-key OAS
+fits whatever sub-scores the variant produces). JANUS-full anchor is read from
+the production route_comparison/ paths.
+
+Outputs:
+  ABLATION_CROSS_B_summary.md   within mean / cross mean / cross worst per gid
+  ABLATION_CROSS_B_full.md      per-gid 3×3 matrices
+"""
+from __future__ import annotations
+import argparse
+import json
+from pathlib import Path
+import numpy as np
+
+from aggregate_ablation import (
+    SCORE_FNS, T_975_N3, _auroc, _load_npz, _load_cross_npz,
+)
+
+ROOT = Path(__file__).resolve().parents[2]
+ROUTE = ROOT / "artifacts" / "route_comparison"
+ROUTE_CROSS = ROUTE / "cross"
+ABL = ROOT / "artifacts" / "ablation"
+ABL_CROSS = ABL / "cross"
+
+CROSS_DATASETS = ["cicids2017", "cicddos2019", "ciciot2023"]
+PRETTY = {
+    "cicids2017": "CICIDS17",
+    "cicddos2019": "CICDDoS19",
+    "ciciot2023": "CICIoT23",
+}
+SEEDS = [42, 43, 44]
+
+# (gid, label, what_removed, score_fn_id)
+B_VARIANTS = [
+    ("janus_full",  "JANUS-full",      "—",                       "OAS_all_available"),
+    ("b1_noflow",   "B1 no FLOW token","global context",          "OAS_all_available"),
+    ("b2_flowonly", "B2 flow-only",    "packet sequence",         "A1_terminal_norm"),
+    ("b3_allcont",  "B3 all-cont",     "cont/disc split",         "OAS_term_all"),
+    ("b4_alldisc", "B4 all-disc",      "cont/disc split (rev)",   "OAS_disc_all"),
+    ("b5_nodisc",   "B5 λ_disc=0",     "joint training",          "OAS_term_all"),
+]
+
+
+def _within_path(gid: str, ds: str, seed: int) -> Path:
+    if gid == "janus_full":
+        return ROUTE / f"janus_{ds}_seed{seed}" / "phase1_scores.npz"
+    return ABL / f"janus_{ds}_seed{seed}_{gid}" / "phase1_scores.npz"
+
+
+def _cross_path(gid: str, src: str, tgt: str, seed: int) -> Path:
+    if gid == "janus_full":
+        return ROUTE_CROSS / f"janus_seed{seed}_{src}_to_{tgt}.npz"
+    return ABL_CROSS / f"{gid}__seed{seed}_{src}_to_{tgt}.npz"
+
+
+def _cell_score(gid: str, src: str, tgt: str, seed: int, fn_id: str):
+    if src == tgt:
+        p = _within_path(gid, src, seed)
+        if not p.exists():
+            return None
+        val, atk = _load_npz(p)
+    else:
+        p = _cross_path(gid, src, tgt, seed)
+        if not p.exists():
+            return None
+        val, atk = _load_cross_npz(p)
+    fn = SCORE_FNS[fn_id]
+    res = fn(val, atk)
+    if res is None:
+        return None
+    sv, sa = res
+    return _auroc(sv, sa)
+
+
+def _seed_summary(vals: list[float]):
+    a = np.asarray([v for v in vals if v is not None and not np.isnan(v)])
+    if a.size == 0:
+        return None
+    if a.size == 1:
+        return {"mean": float(a[0]), "ci": 0.0, "n": 1}
+    se = a.std(ddof=1) / np.sqrt(a.size)
+    return {"mean": float(a.mean()),
+            "ci": float(T_975_N3 * se) if a.size == 3 else float(1.96 * se),
+            "n": int(a.size)}
+
+
+def _fmt(c):
+    if c is None:
+        return "—"
+    if c["n"] == 1:
+        return f"{100 * c['mean']:.2f}"
+    return f"{100 * c['mean']:.2f} ± {100 * c['ci']:.2f}"
+
+
+def main() -> None:
+    ap = argparse.ArgumentParser()
+    ap.add_argument("--out-dir", type=Path, default=ABL)
+    args = ap.parse_args()
+    args.out_dir.mkdir(parents=True, exist_ok=True)
+
+    full = {}
+    for gid, label, _why, fn_id in B_VARIANTS:
+        rows = {}
+        for src in CROSS_DATASETS:
+            for tgt in CROSS_DATASETS:
+                vals = [_cell_score(gid, src, tgt, s, fn_id) for s in SEEDS]
+                rows[(src, tgt)] = _seed_summary(vals)
+        full[gid] = (label, rows)
+        n_ok = sum(1 for v in rows.values() if v is not None)
+        print(f"[ok] {label:20s} cells={n_ok}/{len(rows)}", flush=True)
+
+    # Summary
+    lines = ["# B-variant cross-dataset summary",
+             "",
+             f"3-seed mean ± 95% t-CI AUROC. Datasets = {CROSS_DATASETS}.",
+             "All B variants share the same aggregator-fit-on-target-benign protocol as JANUS-full.",
+             "",
+             "| Variant | What removed | Within (3 cells) | Cross (6 cells) | Cross worst | Within − Cross |",
+             "|---|---|---|---|---|---|"]
+    for gid, label, why, fn_id in B_VARIANTS:
+        _, rows = full[gid]
+        within = [v["mean"] for (s, t), v in rows.items() if s == t and v is not None]
+        cross = [v["mean"] for (s, t), v in rows.items() if s != t and v is not None]
+        cross_pairs = [((s, t), v) for (s, t), v in rows.items() if s != t and v is not None]
+        worst = min(cross_pairs, key=lambda x: x[1]["mean"], default=None)
+        w = float(np.mean(within)) if within else float("nan")
+        c = float(np.mean(cross)) if cross else float("nan")
+        worst_str = "—"
+        if worst is not None:
+            (s, t), v = worst
+            worst_str = f"{PRETTY[s]}→{PRETTY[t]}: {_fmt(v)}"
+        gap = (w - c) * 100 if not np.isnan(w) and not np.isnan(c) else float("nan")
+        lines.append(f"| {label} | {why} | {100 * w:.2f} | {100 * c:.2f} | {worst_str} | {gap:+.2f} |")
+    summary_path = args.out_dir / "ABLATION_CROSS_B_summary.md"
+    summary_path.write_text("\n".join(lines) + "\n")
+    print(f"[wrote] {summary_path}")
+
+    # Full per-variant 3x3 matrices
+    flines = ["# B-variant cross-dataset full matrices",
+              "",
+              "Per variant: 3×3 matrix (rows = source, columns = target). Diagonal italic.",
+              "Each cell = 3-seed mean ± 95% t-CI AUROC (%).",
+              ""]
+    for gid, label, why, fn_id in B_VARIANTS:
+        _, rows = full[gid]
+        flines.append(f"## {label}  ({why})")
+        flines.append("")
+        header = ["Source ↓ / Target →"] + [PRETTY[d] for d in CROSS_DATASETS]
+        flines.append("| " + " | ".join(header) + " |")
+        flines.append("|" + "|".join("---" for _ in header) + "|")
+        for src in CROSS_DATASETS:
+            row = [f"**{PRETTY[src]}**"]
+            for tgt in CROSS_DATASETS:
+                cell = rows[(src, tgt)]
+                txt = _fmt(cell)
+                if src == tgt:
+                    txt = f"_{txt}_"
+                row.append(txt)
+            flines.append("| " + " | ".join(row) + " |")
+        flines.append("")
+    full_path = args.out_dir / "ABLATION_CROSS_B_full.md"
+    full_path.write_text("\n".join(flines))
+    print(f"[wrote] {full_path}")
+
+    json_path = args.out_dir / "ABLATION_CROSS_B.json"
+    json_path.write_text(json.dumps({
+        gid: {"label": label, "rows": {f"{s}->{t}": v for (s, t), v in rows.items()}}
+        for gid, (label, rows) in full.items()
+    }, indent=2, default=lambda o: None))
+    print(f"[wrote] {json_path}")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/aggregate/baselines_cross_3x3_table.py
+++ b/scripts/aggregate/baselines_cross_3x3_table.py
@@ -0,0 +1,121 @@
+"""Aggregate IF/OCSVM 3x3 cross-dataset AUROC matrices (3-seed mean ± std).
+
+Reads NPZs produced by scripts/baselines/run_if_ocsvm_cross.py:
+  {method}_{src}_to_{tgt}_seed{S}.npz  with keys b_score, a_score, a_labels
+
+Writes one Markdown table per method.
+"""
+from __future__ import annotations
+import argparse
+from pathlib import Path
+import numpy as np
+from sklearn.metrics import roc_auc_score
+
+REPO = Path(__file__).resolve().parents[2]
+
+DATASETS = ["cicids2017", "cicddos2019", "ciciot2023"]
+SEEDS = [42, 43, 44]
+DEFAULT_METHODS = ["iforest", "ocsvm"]
+TITLE_NAMES = {
+    "iforest": "Isolation Forest",
+    "ocsvm": "OCSVM (RBF)",
+    "shafir_nf": "Shafir NF (single-flow, 20-d, fast)",
+}
+SHORT = {"cicids2017": "CICIDS17", "cicddos2019": "CICDDoS19", "ciciot2023": "CICIoT23"}
+
+
+def cell_auroc(npz_path: Path) -> tuple[float, int, int]:
+    z = np.load(npz_path, allow_pickle=True)
+    b = z["b_score"]
+    a = z["a_score"]
+    y = np.r_[np.zeros(len(b)), np.ones(len(a))]
+    s = np.r_[b, a]
+    s = np.nan_to_num(s, nan=0.0, posinf=1e12, neginf=-1e12)
+    return float(roc_auc_score(y, s)), len(b), len(a)
+
+
+def build_method_table(method: str, in_dir: Path) -> tuple[str, list[str]]:
+    cells = {}
+    counts = {}
+    missing = []
+    for src in DATASETS:
+        for tgt in DATASETS:
+            aucs = []
+            n_b = n_a = None
+            for s in SEEDS:
+                p = in_dir / f"{method}_{src}_to_{tgt}_seed{s}.npz"
+                if not p.exists():
+                    missing.append(p.name)
+                    continue
+                auc, n_b, n_a = cell_auroc(p)
+                aucs.append(auc)
+            if not aucs:
+                cells[(src, tgt)] = (float("nan"), float("nan"))
+            else:
+                a = np.asarray(aucs)
+                cells[(src, tgt)] = (a.mean(), a.std())
+            counts[(src, tgt)] = (n_b, n_a)
+
+    lines: list[str] = []
+    title_name = TITLE_NAMES.get(method, method)
+    lines.append(f"# 3×3 cross-dataset AUROC matrix — {title_name} (3-seed mean ± std)\n")
+    lines.append("Rows = source (10K benign training); columns = target (10K benign + balanced ≤1M attacks).")
+    lines.append("Trained on raw 20-d canonical flow features after `StandardScaler` fit on source benign train.")
+    lines.append("Diagonal italic = within-dataset (target benign sampled from rows disjoint from training).\n")
+
+    header = "| Source ↓ / Target → | " + " | ".join(SHORT[t] for t in DATASETS) + " |"
+    sep = "|" + "|".join(["---"] * (len(DATASETS) + 1)) + "|"
+    lines.append(header)
+    lines.append(sep)
+    for src in DATASETS:
+        row = [f"**{SHORT[src]}**"]
+        for tgt in DATASETS:
+            m, sd = cells[(src, tgt)]
+            cell = f"{m:.4f} ± {sd:.4f}"
+            if src == tgt:
+                cell = f"_{cell}_"
+            row.append(cell)
+        lines.append("| " + " | ".join(row) + " |")
+
+    lines.append("\n## Sample counts (target benign / target attacks)\n")
+    lines.append(header)
+    lines.append(sep)
+    for src in DATASETS:
+        row = [SHORT[src]]
+        for tgt in DATASETS:
+            n_b, n_a = counts[(src, tgt)]
+            row.append(f"{n_b}b / {n_a}a" if n_b is not None else "missing")
+        lines.append("| " + " | ".join(row) + " |")
+    return "\n".join(lines) + "\n", missing
+
+
+def main() -> None:
+    p = argparse.ArgumentParser()
+    p.add_argument("--in-dir", type=Path,
+                   default=REPO / "artifacts/baselines/if_ocsvm_cross_2026_05_11")
+    p.add_argument("--out-md", type=Path,
+                   default=None,
+                   help="Combined markdown output path. Defaults to <in-dir>/CROSS_MATRIX_3x3.md")
+    p.add_argument("--methods", nargs="+", default=DEFAULT_METHODS,
+                   help="Method names to aggregate (matching NPZ filename prefixes).")
+    args = p.parse_args()
+
+    out_md = args.out_md or (args.in_dir / "CROSS_MATRIX_3x3.md")
+    parts = []
+    all_missing: list[str] = []
+    for method in args.methods:
+        block, missing = build_method_table(method, args.in_dir)
+        parts.append(block)
+        all_missing.extend(missing)
+        print(block)
+        print()
+    if all_missing:
+        print("# Missing inputs (counted as NaN cells)")
+        for m in all_missing:
+            print(f"  - {m}")
+    out_md.write_text("\n\n".join(parts))
+    print(f"[wrote] {out_md}")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/aggregate/run_all_phase1.sh
+++ b/scripts/aggregate/run_all_phase1.sh
@@ -1,68 +0,0 @@
-#!/bin/bash
-# Run phase1 eval on all routes after trainings complete.
-# Splits across 2 GPUs in parallel chains.
-
-set -e
-ROOT=/home/chy/JANUS
-UNIFIED_EVAL=${ROOT}/artifacts/verify_2026_04_24/eval_phase1_unified.py
-MIXED_EVAL=${ROOT}/Mixed_CFM/eval_phase1.py
-
-cd ${ROOT}
-
-# GPU 0: baselines + route_a (6 models)
-{
-for prefix in baseline_ciciot2023 route_a_causal_ciciot2023; do
-  for seed in 42 43 44; do
-    name=${prefix}_seed${seed}
-    md=${ROOT}/artifacts/route_comparison/${name}
-    [ -f "${md}/model.pt" ] || continue
-    [ -f "${md}/phase1_summary.json" ] && continue
-    echo "[GPU0 eval] ${name}"
-    cd ${ROOT}/Unified_CFM
-    CUDA_VISIBLE_DEVICES=0 stdbuf -oL uv run --no-sync python -u ${UNIFIED_EVAL} \
-      --model-dir ${md} --out-dir ${md} \
-      --batch-size 256 --n-steps 16 --jacobian-n-eps 4 \
-      --n-val-cap 5000 --n-atk-cap 10000 \
-      > ${md}/phase1.log 2>&1
-  done
-done
-echo "[GPU0 done]"
-} &
-GPU0_PID=$!
-
-# GPU 1: route_b + route_c (6 models)
-{
-for seed in 42 43 44; do
-  name=route_b_spectral_ciciot2023_seed${seed}
-  md=${ROOT}/artifacts/route_comparison/${name}
-  [ -f "${md}/model.pt" ] || continue
-  [ -f "${md}/phase1_summary.json" ] && continue
-  echo "[GPU1 eval] ${name}"
-  cd ${ROOT}/Unified_CFM
-  CUDA_VISIBLE_DEVICES=1 stdbuf -oL uv run --no-sync python -u ${UNIFIED_EVAL} \
-    --model-dir ${md} --out-dir ${md} \
-    --batch-size 256 --n-steps 16 --jacobian-n-eps 4 \
-    --n-val-cap 5000 --n-atk-cap 10000 \
-    > ${md}/phase1.log 2>&1
-done
-for seed in 42 43 44; do
-  name=route_c_mixed_ciciot2023_seed${seed}
-  md=${ROOT}/artifacts/route_comparison/${name}
-  [ -f "${md}/model.pt" ] || continue
-  [ -f "${md}/phase1_summary.json" ] && continue
-  echo "[GPU1 eval] ${name}"
-  cd ${ROOT}/Mixed_CFM
-  CUDA_VISIBLE_DEVICES=1 stdbuf -oL uv run --no-sync python -u ${MIXED_EVAL} \
-    --model-dir ${md} --out-dir ${md} \
-    --batch-size 256 --n-steps 16 \
-    --n-val-cap 5000 --n-atk-cap 10000 \
-    > ${md}/phase1.log 2>&1
-done
-echo "[GPU1 done]"
-} &
-GPU1_PID=$!
-
-wait $GPU0_PID
-wait $GPU1_PID
-echo "[all phase1 done]"
-cd ${ROOT} && uv run --no-sync python artifacts/route_comparison/aggregate_results.py
--- a/scripts/aggregate/run_cross_all.sh
+++ b/scripts/aggregate/run_cross_all.sh
@@ -1,105 +0,0 @@
-#!/bin/bash
-# Cross-dataset eval for all 4 routes × 2 targets × 3 seeds = 24 runs.
-# Source: CICIoT2023 (where all models were trained).
-# Targets: CICIDS2017 + CICDDoS2019.
-
-set -e
-ROOT=/home/chy/JANUS
-UNIFIED_EVAL=${ROOT}/artifacts/verify_2026_04_24/eval_phase2_cross_cicddos2019.py
-MIXED_EVAL=${ROOT}/Mixed_CFM/eval_cross.py
-CROSS_DIR=${ROOT}/artifacts/route_comparison/cross
-mkdir -p ${CROSS_DIR}
-
-# Target dataset paths
-declare -A TARGETS
-TARGETS[cicids2017_store]=${ROOT}/datasets/cicids2017/processed/full_store
-TARGETS[cicids2017_flows]=${ROOT}/datasets/cicids2017/processed/flows.parquet
-TARGETS[cicids2017_features]=${ROOT}/datasets/cicids2017/processed/flow_features.parquet
-TARGETS[cicids2017_features_spectral]=${ROOT}/datasets/cicids2017/processed/flow_features_spectral.parquet
-
-TARGETS[cicddos2019_store]=${ROOT}/datasets/cicddos2019/processed/full_store
-TARGETS[cicddos2019_flows]=${ROOT}/datasets/cicddos2019/processed/flows.parquet
-TARGETS[cicddos2019_features]=${ROOT}/datasets/cicddos2019/processed/flow_features.parquet
-TARGETS[cicddos2019_features_spectral]=${ROOT}/datasets/cicddos2019/processed/flow_features_spectral.parquet
-
-run_unified_eval() {
-  local gpu=$1 model_dir=$2 target=$3 features=$4 out_name=$5
-  local out=${CROSS_DIR}/${out_name}.json
-  [ -f "${out}" ] && { echo "[skip] ${out_name}"; return; }
-  echo "[gpu${gpu} eval] ${out_name}"
-  cd ${ROOT}/Unified_CFM
-  CUDA_VISIBLE_DEVICES=${gpu} stdbuf -oL uv run --no-sync python -u ${UNIFIED_EVAL} \
-    --model-dir ${model_dir} \
-    --target-store ${TARGETS[${target}_store]} \
-    --target-flows ${TARGETS[${target}_flows]} \
-    --target-flow-features ${features} \
-    --out ${out} \
-    --n-benign 10000 --n-attack 10000 --seed 42 \
-    --T 64 --batch-size 256 --n-steps 16 \
-    > ${CROSS_DIR}/${out_name}.log 2>&1
-}
-
-run_mixed_eval() {
-  local gpu=$1 model_dir=$2 target=$3 out_name=$4
-  local out=${CROSS_DIR}/${out_name}.json
-  [ -f "${out}" ] && { echo "[skip] ${out_name}"; return; }
-  echo "[gpu${gpu} mixed eval] ${out_name}"
-  cd ${ROOT}/Mixed_CFM
-  CUDA_VISIBLE_DEVICES=${gpu} stdbuf -oL uv run --no-sync python -u ${MIXED_EVAL} \
-    --model-dir ${model_dir} \
-    --target-store ${TARGETS[${target}_store]} \
-    --target-flows ${TARGETS[${target}_flows]} \
-    --target-flow-features ${TARGETS[${target}_features]} \
-    --out ${out} \
-    --n-benign 10000 --n-attack 10000 --seed 42 \
-    --T 64 --batch-size 256 --n-steps 16 \
-    > ${CROSS_DIR}/${out_name}.log 2>&1
-}
-
-# === GPU 0 chain: baselines + route_a, both targets ===
-{
-for prefix_route in "baseline_ciciot2023:baseline" "route_a_causal_ciciot2023:route_a_causal"; do
-  prefix=${prefix_route%:*}
-  short=${prefix_route#*:}
-  for seed in 42 43 44; do
-    md=${ROOT}/artifacts/route_comparison/${prefix}_seed${seed}
-    [ -f "${md}/model.pt" ] || continue
-    for target in cicids2017 cicddos2019; do
-      run_unified_eval 0 "${md}" "${target}" "${TARGETS[${target}_features]}" \
-        "${short}_seed${seed}_to_${target}"
-    done
-  done
-done
-echo "[gpu0 cross chain done]"
-} > /tmp/cross_gpu0.log 2>&1 &
-GPU0=$!
-
-# === GPU 1 chain: route_b (uses spectral features) + route_c (mixed) ===
-{
-# route_b: must use flow_features_spectral.parquet
-for seed in 42 43 44; do
-  md=${ROOT}/artifacts/route_comparison/route_b_spectral_ciciot2023_seed${seed}
-  [ -f "${md}/model.pt" ] || continue
-  for target in cicids2017 cicddos2019; do
-    run_unified_eval 1 "${md}" "${target}" "${TARGETS[${target}_features_spectral]}" \
-      "route_b_spectral_seed${seed}_to_${target}"
-  done
-done
-
-# route_c: Mixed_CFM eval (uses canonical flow_features)
-for seed in 42 43 44; do
-  md=${ROOT}/artifacts/route_comparison/route_c_mixed_ciciot2023_seed${seed}
-  [ -f "${md}/model.pt" ] || continue
-  for target in cicids2017 cicddos2019; do
-    run_mixed_eval 1 "${md}" "${target}" \
-      "route_c_mixed_seed${seed}_to_${target}"
-  done
-done
-echo "[gpu1 cross chain done]"
-} > /tmp/cross_gpu1.log 2>&1 &
-GPU1=$!
-
-wait $GPU0
-wait $GPU1
-echo "[all cross done]"
-ls -la ${CROSS_DIR}/*.json | wc -l
--- a/scripts/aggregate/run_phase1_all.sh
+++ b/scripts/aggregate/run_phase1_all.sh
@@ -1,45 +0,0 @@
-#!/bin/bash
-# Run phase1 eval on all route_comparison models.
-# Output: <model_dir>/phase1_summary.json + phase1_scores.npz
-#
-# Usage:
-#   bash artifacts/route_comparison/run_phase1_all.sh [GPU_ID]
-#
-# Default GPU_ID = 0. Each eval takes ~3-5 min with the caps below.
-
-set -e
-GPU_ID="${1:-0}"
-ROOT=/home/chy/JANUS
-EVAL=${ROOT}/artifacts/verify_2026_04_24/eval_phase1_unified.py
-
-models=(
-  baseline_ciciot2023_seed42
-  baseline_ciciot2023_seed43
-  baseline_ciciot2023_seed44
-  route_a_causal_ciciot2023_seed42
-  route_a_causal_ciciot2023_seed43
-  route_a_causal_ciciot2023_seed44
-)
-
-cd ${ROOT}/Unified_CFM
-for name in "${models[@]}"; do
-  model_dir=${ROOT}/artifacts/route_comparison/${name}
-  if [ ! -f "${model_dir}/model.pt" ]; then
-    echo "[skip] ${name}: model.pt missing"
-    continue
-  fi
-  out_dir=${model_dir}
-  if [ -f "${out_dir}/phase1_summary.json" ]; then
-    echo "[skip] ${name}: phase1_summary.json exists"
-    continue
-  fi
-  echo "[eval] ${name}"
-  CUDA_VISIBLE_DEVICES=${GPU_ID} stdbuf -oL uv run --no-sync python -u ${EVAL} \
-    --model-dir ${model_dir} --out-dir ${out_dir} \
-    --batch-size 256 --n-steps 16 \
-    --jacobian-n-eps 4 \
-    --n-val-cap 5000 --n-atk-cap 10000 \
-    2>&1 | tee ${model_dir}/phase1.log | tail -5
-  echo "[done] ${name}"
-done
-echo "[all done]"
--- a/scripts/baselines/run_if_ocsvm_cross.py
+++ b/scripts/baselines/run_if_ocsvm_cross.py
@@ -0,0 +1,237 @@
+"""Cross-dataset baselines (Isolation Forest, OCSVM) on the 20-d canonical
+flow-feature contract.
+
+Protocol per (method, src, tgt, seed):
+  - Train: 10,000 source benign rows (random sample seeded with --seed + 1000)
+  - Test:  10,000 target benign rows (random sample seeded with --seed)
+         + balanced per-class attack sample with n_attack cap (--n-attack
+           default 1,000,000, divided across all attack classes, matching
+           Mixed_CFM/eval_cross.py)
+  - For diagonal src == tgt, target benign is sampled from the source-pool
+    complement (the rows not used for training) so train and test are disjoint.
+
+Outputs (in --out-dir):
+  {method}_{src}_to_{tgt}_seed{seed}.npz  -- b_score, a_score, a_labels
+  {method}_{src}_to_{tgt}_seed{seed}.json -- AUROC, AUPRC, sample counts, timing
+"""
+from __future__ import annotations
+import argparse
+import json
+import time
+from pathlib import Path
+
+import numpy as np
+import pandas as pd
+from sklearn.ensemble import IsolationForest
+from sklearn.metrics import average_precision_score, roc_auc_score
+from sklearn.preprocessing import StandardScaler
+from sklearn.svm import OneClassSVM
+
+REPO = Path(__file__).resolve().parents[2]
+
+DATASETS = {
+    "cicids2017": {
+        "flows": REPO / "datasets/cicids2017/processed/flows.parquet",
+        "flow_features": REPO / "datasets/cicids2017/processed/flow_features.parquet",
+    },
+    "cicddos2019": {
+        "flows": REPO / "datasets/cicddos2019/processed/flows.parquet",
+        "flow_features": REPO / "datasets/cicddos2019/processed/flow_features.parquet",
+    },
+    "ciciot2023": {
+        "flows": REPO / "datasets/ciciot2023/processed/full_store/flows.parquet",
+        "flow_features": REPO / "datasets/ciciot2023/processed/flow_features.parquet",
+    },
+}
+
+FEATURE_COLS = (
+    "log_duration", "log_n_pkts", "fwd_count", "bwd_count",
+    "pkt_size_mean", "pkt_size_std", "pkt_size_max",
+    "fwd_size_mean", "bwd_size_mean", "bwd_size_std",
+    "iat_mean", "fwd_iat_max", "bwd_iat_max", "bwd_iat_std",
+    "active_mean", "idle_mean",
+    "log_pkts_per_s", "log_total_bytes",
+    "ack_cnt", "syn_cnt",
+)
+
+
+def _load_dataset(name: str):
+    paths = DATASETS[name]
+    flows = pd.read_parquet(paths["flows"], columns=["flow_id", "label"])
+    ff = pd.read_parquet(paths["flow_features"])
+    if not np.array_equal(
+        flows["flow_id"].to_numpy(dtype=np.uint64),
+        ff["flow_id"].to_numpy(dtype=np.uint64),
+    ):
+        raise ValueError(f"{name}: flows.parquet and flow_features.parquet are not row-aligned")
+    X = ff[list(FEATURE_COLS)].to_numpy(dtype=np.float64)
+    X = np.nan_to_num(X, nan=0.0, posinf=0.0, neginf=0.0).astype(np.float32)
+    labels = flows["label"].astype(str).to_numpy()
+    return X, labels
+
+
+def _balanced_attack_sample(labels: np.ndarray, n_attack: int, rng: np.random.Generator) -> np.ndarray:
+    attack_idx = np.flatnonzero(labels != "normal")
+    atk_labels = labels[attack_idx]
+    classes = sorted(set(atk_labels))
+    per_class = max(1, n_attack // len(classes))
+    chunks = []
+    for cls in classes:
+        pool = attack_idx[atk_labels == cls]
+        k = min(per_class, len(pool))
+        if k:
+            chunks.append(rng.choice(pool, size=k, replace=False))
+    sel = np.sort(np.concatenate(chunks))
+    if len(sel) > n_attack:
+        sel = np.sort(rng.choice(sel, size=n_attack, replace=False))
+    return sel
+
+
+def main() -> None:
+    p = argparse.ArgumentParser()
+    p.add_argument("--method", choices=["iforest", "ocsvm"], required=True)
+    p.add_argument("--src", choices=list(DATASETS), required=True)
+    p.add_argument("--tgt", choices=list(DATASETS), required=True)
+    p.add_argument("--seed", type=int, required=True)
+    p.add_argument("--out-dir", type=Path, required=True)
+    p.add_argument("--n-train", type=int, default=10000)
+    p.add_argument("--n-benign", type=int, default=10000)
+    p.add_argument("--n-attack", type=int, default=1_000_000,
+                   help="Per-class balanced cap (matches Mixed_CFM/eval_cross.py).")
+    # Method hyperparams
+    p.add_argument("--iforest-n-estimators", type=int, default=200)
+    p.add_argument("--ocsvm-nu", type=float, default=0.1)
+    p.add_argument("--ocsvm-gamma", type=str, default="scale")
+    p.add_argument("--ocsvm-cache-mb", type=int, default=2000)
+    args = p.parse_args()
+
+    args.out_dir.mkdir(parents=True, exist_ok=True)
+    tag = f"{args.method}_{args.src}_to_{args.tgt}_seed{args.seed}"
+    print(f"[run] {tag}")
+
+    # --- source training ---
+    t0 = time.time()
+    src_X, src_labels = _load_dataset(args.src)
+    src_benign_idx = np.flatnonzero(src_labels == "normal")
+    rng_train = np.random.default_rng(args.seed + 1000)
+    if len(src_benign_idx) < args.n_train:
+        raise RuntimeError(f"{args.src}: only {len(src_benign_idx)} benign rows < n_train={args.n_train}")
+    train_sel = np.sort(rng_train.choice(src_benign_idx, size=args.n_train, replace=False))
+    train_X = src_X[train_sel]
+    t_load_src = time.time() - t0
+
+    # --- target eval ---
+    t0 = time.time()
+    if args.tgt == args.src:
+        tgt_X, tgt_labels = src_X, src_labels
+        used_for_train = np.zeros(len(tgt_labels), dtype=bool)
+        used_for_train[train_sel] = True
+        eligible_benign = np.flatnonzero((tgt_labels == "normal") & ~used_for_train)
+    else:
+        tgt_X, tgt_labels = _load_dataset(args.tgt)
+        eligible_benign = np.flatnonzero(tgt_labels == "normal")
+    rng_eval = np.random.default_rng(args.seed)
+    n_benign = min(args.n_benign, len(eligible_benign))
+    if n_benign < args.n_benign:
+        print(f"[warn] only {len(eligible_benign)} eligible benign rows in target (asked {args.n_benign})")
+    b_sel = np.sort(rng_eval.choice(eligible_benign, size=n_benign, replace=False))
+    a_sel = _balanced_attack_sample(tgt_labels, args.n_attack, rng_eval)
+    val_X = tgt_X[b_sel]
+    atk_X = tgt_X[a_sel]
+    a_labels = tgt_labels[a_sel]
+    t_load_tgt = time.time() - t0
+    print(f"[data] train={len(train_X):,}  val={len(val_X):,}  attack={len(atk_X):,}"
+          f"  classes={len(set(a_labels))}  D={train_X.shape[1]}")
+
+    # --- standardize on source train ---
+    scaler = StandardScaler().fit(train_X)
+    train_Z = scaler.transform(train_X).astype(np.float32)
+    val_Z = scaler.transform(val_X).astype(np.float32)
+    atk_Z = scaler.transform(atk_X).astype(np.float32)
+
+    # --- fit ---
+    t0 = time.time()
+    if args.method == "iforest":
+        model = IsolationForest(
+            n_estimators=args.iforest_n_estimators,
+            random_state=args.seed,
+            n_jobs=-1,
+            contamination="auto",
+        )
+        model.fit(train_Z)
+    else:
+        model = OneClassSVM(
+            kernel="rbf",
+            nu=args.ocsvm_nu,
+            gamma=args.ocsvm_gamma,
+            cache_size=args.ocsvm_cache_mb,
+        )
+        model.fit(train_Z)
+    t_fit = time.time() - t0
+
+    # --- score: higher = more anomalous ---
+    # IsolationForest.score_samples returns higher-for-normal, so negate.
+    # OneClassSVM.score_samples returns signed distance to boundary
+    # (higher = more normal), so negate too.
+    t0 = time.time()
+    if args.method == "iforest":
+        b_score = (-model.score_samples(val_Z)).astype(np.float32)
+        a_score = (-model.score_samples(atk_Z)).astype(np.float32)
+    else:
+        b_score = (-model.decision_function(val_Z)).astype(np.float32)
+        a_score = (-model.decision_function(atk_Z)).astype(np.float32)
+    t_score = time.time() - t0
+
+    # --- metrics ---
+    y = np.r_[np.zeros(len(b_score)), np.ones(len(a_score))]
+    s = np.r_[b_score, a_score]
+    s = np.nan_to_num(s, nan=0.0, posinf=1e12, neginf=-1e12)
+    auroc = float(roc_auc_score(y, s))
+    auprc = float(average_precision_score(y, s))
+
+    per_class = {}
+    for cls in sorted(set(a_labels)):
+        m = a_labels == cls
+        y_c = np.r_[np.zeros(len(b_score)), np.ones(int(m.sum()))]
+        s_c = np.r_[b_score, a_score[m]]
+        s_c = np.nan_to_num(s_c, nan=0.0, posinf=1e12, neginf=-1e12)
+        try:
+            auc_c = float(roc_auc_score(y_c, s_c))
+        except ValueError:
+            auc_c = float("nan")
+        per_class[cls] = {"_n": int(m.sum()), "auroc": auc_c}
+
+    out = {
+        "method": args.method,
+        "src": args.src,
+        "tgt": args.tgt,
+        "seed": args.seed,
+        "n_train": int(len(train_X)),
+        "n_benign": int(len(val_X)),
+        "n_attack": int(len(atk_X)),
+        "n_attack_classes": int(len(set(a_labels))),
+        "t_load_src_sec": round(t_load_src, 2),
+        "t_load_tgt_sec": round(t_load_tgt, 2),
+        "t_fit_sec": round(t_fit, 2),
+        "t_score_sec": round(t_score, 2),
+        "overall": {"auroc": auroc, "auprc": auprc},
+        "per_class": per_class,
+    }
+    if args.method == "iforest":
+        out["hparams"] = {"n_estimators": args.iforest_n_estimators}
+    else:
+        out["hparams"] = {"nu": args.ocsvm_nu, "gamma": args.ocsvm_gamma}
+
+    json_path = args.out_dir / f"{tag}.json"
+    json_path.write_text(json.dumps(out, indent=2))
+    npz_path = args.out_dir / f"{tag}.npz"
+    np.savez_compressed(npz_path, b_score=b_score, a_score=a_score, a_labels=a_labels.astype(str))
+    print(f"[saved] {json_path}")
+    print(f"[saved] {npz_path}")
+    print(f"[result] {args.method:7s} {args.src} -> {args.tgt} seed={args.seed}  "
+          f"AUROC={auroc:.4f}  AUPRC={auprc:.4f}  "
+          f"fit={t_fit:.1f}s  score={t_score:.1f}s")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/baselines/run_if_ocsvm_cross_all.sh
+++ b/scripts/baselines/run_if_ocsvm_cross_all.sh
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash
+# Orchestrate the full 3x3 cross-dataset sweep for IF/OCSVM baselines.
+# 3 sources x 3 targets x 3 seeds x 2 methods = 54 runs.
+set -euo pipefail
+
+REPO="/home/chy/JANUS"
+cd "$REPO"
+
+OUT_DIR="${1:-$REPO/artifacts/baselines/if_ocsvm_cross_2026_05_11}"
+mkdir -p "$OUT_DIR"
+LOG_DIR="$OUT_DIR/logs"
+mkdir -p "$LOG_DIR"
+
+DATASETS=(cicids2017 cicddos2019 ciciot2023)
+SEEDS=(42 43 44)
+METHODS=(iforest ocsvm)
+
+START=$(date +%s)
+for method in "${METHODS[@]}"; do
+  for src in "${DATASETS[@]}"; do
+    for tgt in "${DATASETS[@]}"; do
+      for seed in "${SEEDS[@]}"; do
+        tag="${method}_${src}_to_${tgt}_seed${seed}"
+        if [[ -f "$OUT_DIR/${tag}.json" ]]; then
+          echo "[skip] $tag (json exists)"
+          continue
+        fi
+        echo "[start] $tag"
+        uv run --no-sync python scripts/baselines/run_if_ocsvm_cross.py \
+          --method "$method" --src "$src" --tgt "$tgt" --seed "$seed" \
+          --out-dir "$OUT_DIR" \
+          > "$LOG_DIR/${tag}.log" 2>&1
+        echo "[done]  $tag  ($(grep -F '[result]' "$LOG_DIR/${tag}.log" | tail -1))"
+      done
+    done
+  done
+done
+END=$(date +%s)
+echo "[all done] elapsed $((END - START))s"
--- a/scripts/baselines/run_if_ocsvm_cross_packets.py
+++ b/scripts/baselines/run_if_ocsvm_cross_packets.py
@@ -0,0 +1,233 @@
+"""Path-B: IF/OCSVM cross-dataset baselines on RAW PACKET SEQUENCES.
+
+Same protocol as run_if_ocsvm_cross.py, but the input feature vector is the
+flattened first T=64 packet tokens (9-d each) -> 576-d. No flow-stat
+aggregation — this is the input modality JANUS itself consumes, so it
+measures what classical AD can do without hand-engineered features.
+
+Outputs:
+  {method}_{src}_to_{tgt}_seed{seed}.{json,npz}
+"""
+from __future__ import annotations
+import argparse
+import json
+import sys
+import time
+from pathlib import Path
+
+import numpy as np
+import pandas as pd
+from sklearn.ensemble import IsolationForest
+from sklearn.metrics import average_precision_score, roc_auc_score
+from sklearn.preprocessing import StandardScaler
+from sklearn.svm import OneClassSVM
+
+REPO = Path(__file__).resolve().parents[2]
+sys.path.insert(0, str(REPO))
+from common.packet_store import PacketShardStore  # noqa: E402
+
+DATASETS = {
+    "cicids2017": {
+        "flows": REPO / "datasets/cicids2017/processed/flows.parquet",
+        "packets_npz": REPO / "datasets/cicids2017/processed/packets.npz",
+        "source_store": None,
+    },
+    "cicddos2019": {
+        "flows": REPO / "datasets/cicddos2019/processed/flows.parquet",
+        "packets_npz": None,
+        "source_store": REPO / "datasets/cicddos2019/processed/full_store",
+    },
+    "ciciot2023": {
+        "flows": REPO / "datasets/ciciot2023/processed/full_store/flows.parquet",
+        "packets_npz": None,
+        "source_store": REPO / "datasets/ciciot2023/processed/full_store",
+    },
+}
+
+
+def _load_labels(name: str) -> np.ndarray:
+    paths = DATASETS[name]
+    flows = pd.read_parquet(paths["flows"], columns=["flow_id", "label"])
+    return flows["label"].astype(str).to_numpy()
+
+
+def _materialize_packets(name: str, indices: np.ndarray, T: int) -> np.ndarray:
+    paths = DATASETS[name]
+    if paths["packets_npz"] is not None:
+        pz = np.load(paths["packets_npz"], mmap_mode="r")
+        tokens = pz["packet_tokens"]
+        if T > tokens.shape[1]:
+            raise ValueError(f"requested T={T} > stored {tokens.shape[1]}")
+        out = np.asarray(tokens[indices, :T, :]).astype(np.float32, copy=True)
+        return out
+    else:
+        store = PacketShardStore.open(paths["source_store"])
+        tok, _ = store.read_packets(indices.astype(np.int64), T=T)
+        return tok.astype(np.float32, copy=False)
+
+
+def _balanced_attack_sample(labels: np.ndarray, n_attack: int, rng: np.random.Generator) -> np.ndarray:
+    attack_idx = np.flatnonzero(labels != "normal")
+    atk_labels = labels[attack_idx]
+    classes = sorted(set(atk_labels))
+    per_class = max(1, n_attack // len(classes))
+    chunks = []
+    for cls in classes:
+        pool = attack_idx[atk_labels == cls]
+        k = min(per_class, len(pool))
+        if k:
+            chunks.append(rng.choice(pool, size=k, replace=False))
+    sel = np.sort(np.concatenate(chunks))
+    if len(sel) > n_attack:
+        sel = np.sort(rng.choice(sel, size=n_attack, replace=False))
+    return sel
+
+
+def main() -> None:
+    p = argparse.ArgumentParser()
+    p.add_argument("--method", choices=["iforest", "ocsvm"], required=True)
+    p.add_argument("--src", choices=list(DATASETS), required=True)
+    p.add_argument("--tgt", choices=list(DATASETS), required=True)
+    p.add_argument("--seed", type=int, required=True)
+    p.add_argument("--out-dir", type=Path, required=True)
+    p.add_argument("--T", type=int, default=64, help="Packets-per-flow cap (matches JANUS T=64).")
+    p.add_argument("--n-train", type=int, default=10000)
+    p.add_argument("--n-benign", type=int, default=10000)
+    p.add_argument("--n-attack", type=int, default=200000,
+                   help="Per-class balanced cap on target attacks. Smaller than the "
+                        "20-d run (1M) because 576-d OCSVM scoring is much slower.")
+    p.add_argument("--min-len", type=int, default=2)
+    # Method hyperparams
+    p.add_argument("--iforest-n-estimators", type=int, default=200)
+    p.add_argument("--ocsvm-nu", type=float, default=0.1)
+    p.add_argument("--ocsvm-gamma", type=str, default="scale")
+    p.add_argument("--ocsvm-cache-mb", type=int, default=2000)
+    args = p.parse_args()
+
+    args.out_dir.mkdir(parents=True, exist_ok=True)
+    tag = f"{args.method}_{args.src}_to_{args.tgt}_seed{args.seed}"
+    print(f"[run] {tag}  (raw {args.T}x9 packets = {args.T * 9}-d)")
+
+    # --- source training ---
+    t0 = time.time()
+    src_labels = _load_labels(args.src)
+    src_benign_idx = np.flatnonzero(src_labels == "normal")
+    rng_train = np.random.default_rng(args.seed + 1000)
+    if len(src_benign_idx) < args.n_train:
+        raise RuntimeError(f"{args.src}: only {len(src_benign_idx)} benign rows < n_train={args.n_train}")
+    train_sel = np.sort(rng_train.choice(src_benign_idx, size=args.n_train, replace=False))
+    train_tokens = _materialize_packets(args.src, train_sel, T=args.T)
+    train_X = train_tokens.reshape(len(train_sel), -1)
+    t_load_src = time.time() - t0
+
+    # --- target eval ---
+    t0 = time.time()
+    if args.tgt == args.src:
+        tgt_labels = src_labels
+        used = np.zeros(len(tgt_labels), dtype=bool)
+        used[train_sel] = True
+        eligible_benign = np.flatnonzero((tgt_labels == "normal") & ~used)
+    else:
+        tgt_labels = _load_labels(args.tgt)
+        eligible_benign = np.flatnonzero(tgt_labels == "normal")
+    rng_eval = np.random.default_rng(args.seed)
+    n_benign = min(args.n_benign, len(eligible_benign))
+    if n_benign < args.n_benign:
+        print(f"[warn] only {len(eligible_benign)} eligible benign rows in target (asked {args.n_benign})")
+    b_sel = np.sort(rng_eval.choice(eligible_benign, size=n_benign, replace=False))
+    a_sel = _balanced_attack_sample(tgt_labels, args.n_attack, rng_eval)
+    val_tokens = _materialize_packets(args.tgt, b_sel, T=args.T)
+    atk_tokens = _materialize_packets(args.tgt, a_sel, T=args.T)
+    val_X = val_tokens.reshape(len(b_sel), -1)
+    atk_X = atk_tokens.reshape(len(a_sel), -1)
+    a_labels = tgt_labels[a_sel]
+    t_load_tgt = time.time() - t0
+    print(f"[data] train={len(train_X):,}  val={len(val_X):,}  attack={len(atk_X):,}"
+          f"  classes={len(set(a_labels))}  D={train_X.shape[1]}")
+
+    # --- standardize ---
+    scaler = StandardScaler().fit(train_X)
+    train_Z = scaler.transform(train_X).astype(np.float32)
+    val_Z = scaler.transform(val_X).astype(np.float32)
+    atk_Z = scaler.transform(atk_X).astype(np.float32)
+
+    # --- fit ---
+    t0 = time.time()
+    if args.method == "iforest":
+        model = IsolationForest(
+            n_estimators=args.iforest_n_estimators,
+            random_state=args.seed,
+            n_jobs=-1,
+            contamination="auto",
+        )
+        model.fit(train_Z)
+    else:
+        model = OneClassSVM(
+            kernel="rbf",
+            nu=args.ocsvm_nu,
+            gamma=args.ocsvm_gamma,
+            cache_size=args.ocsvm_cache_mb,
+        )
+        model.fit(train_Z)
+    t_fit = time.time() - t0
+
+    # --- score (higher = more anomalous) ---
+    t0 = time.time()
+    if args.method == "iforest":
+        b_score = (-model.score_samples(val_Z)).astype(np.float32)
+        a_score = (-model.score_samples(atk_Z)).astype(np.float32)
+    else:
+        b_score = (-model.decision_function(val_Z)).astype(np.float32)
+        a_score = (-model.decision_function(atk_Z)).astype(np.float32)
+    t_score = time.time() - t0
+
+    # --- metrics ---
+    y = np.r_[np.zeros(len(b_score)), np.ones(len(a_score))]
+    s = np.r_[b_score, a_score]
+    s = np.nan_to_num(s, nan=0.0, posinf=1e12, neginf=-1e12)
+    auroc = float(roc_auc_score(y, s))
+    auprc = float(average_precision_score(y, s))
+
+    per_class = {}
+    for cls in sorted(set(a_labels)):
+        m = a_labels == cls
+        y_c = np.r_[np.zeros(len(b_score)), np.ones(int(m.sum()))]
+        s_c = np.r_[b_score, a_score[m]]
+        s_c = np.nan_to_num(s_c, nan=0.0, posinf=1e12, neginf=-1e12)
+        try:
+            auc_c = float(roc_auc_score(y_c, s_c))
+        except ValueError:
+            auc_c = float("nan")
+        per_class[cls] = {"_n": int(m.sum()), "auroc": auc_c}
+
+    out = {
+        "method": args.method,
+        "src": args.src,
+        "tgt": args.tgt,
+        "seed": args.seed,
+        "T": args.T,
+        "feature_dim": int(train_X.shape[1]),
+        "input_mode": "raw_packet_sequence",
+        "n_train": int(len(train_X)),
+        "n_benign": int(len(val_X)),
+        "n_attack": int(len(atk_X)),
+        "n_attack_classes": int(len(set(a_labels))),
+        "t_load_src_sec": round(t_load_src, 2),
+        "t_load_tgt_sec": round(t_load_tgt, 2),
+        "t_fit_sec": round(t_fit, 2),
+        "t_score_sec": round(t_score, 2),
+        "overall": {"auroc": auroc, "auprc": auprc},
+        "per_class": per_class,
+    }
+    json_path = args.out_dir / f"{tag}.json"
+    json_path.write_text(json.dumps(out, indent=2))
+    npz_path = args.out_dir / f"{tag}.npz"
+    np.savez_compressed(npz_path, b_score=b_score, a_score=a_score, a_labels=a_labels.astype(str))
+    print(f"[saved] {json_path}")
+    print(f"[result] {args.method:7s} {args.src} -> {args.tgt} seed={args.seed}  "
+          f"AUROC={auroc:.4f}  AUPRC={auprc:.4f}  "
+          f"fit={t_fit:.1f}s  score={t_score:.1f}s")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/baselines/run_if_ocsvm_cross_packets_all.sh
+++ b/scripts/baselines/run_if_ocsvm_cross_packets_all.sh
@@ -0,0 +1,38 @@
+#!/usr/bin/env bash
+# Path-B sweep: IF/OCSVM on raw 64x9 packet sequence (576-d), 3x3 cross-dataset.
+set -euo pipefail
+
+REPO="/home/chy/JANUS"
+cd "$REPO"
+
+OUT_DIR="${1:-$REPO/artifacts/baselines/if_ocsvm_cross_packets_2026_05_11}"
+mkdir -p "$OUT_DIR"
+LOG_DIR="$OUT_DIR/logs"
+mkdir -p "$LOG_DIR"
+
+DATASETS=(cicids2017 cicddos2019 ciciot2023)
+SEEDS=(42 43 44)
+METHODS=(iforest ocsvm)
+
+START=$(date +%s)
+for method in "${METHODS[@]}"; do
+  for src in "${DATASETS[@]}"; do
+    for tgt in "${DATASETS[@]}"; do
+      for seed in "${SEEDS[@]}"; do
+        tag="${method}_${src}_to_${tgt}_seed${seed}"
+        if [[ -f "$OUT_DIR/${tag}.json" ]]; then
+          echo "[skip] $tag (json exists)"
+          continue
+        fi
+        echo "[start] $tag"
+        uv run --no-sync python scripts/baselines/run_if_ocsvm_cross_packets.py \
+          --method "$method" --src "$src" --tgt "$tgt" --seed "$seed" \
+          --out-dir "$OUT_DIR" \
+          > "$LOG_DIR/${tag}.log" 2>&1
+        echo "[done]  $tag  ($(grep -F '[result]' "$LOG_DIR/${tag}.log" | tail -1))"
+      done
+    done
+  done
+done
+END=$(date +%s)
+echo "[all done] elapsed $((END - START))s"
--- a/scripts/baselines/run_kitsune_path_a.py
+++ b/scripts/baselines/run_kitsune_path_a.py
@@ -17,8 +17,18 @@ sys.path.insert(0, str(REPO / 'Unified_CFM'))
 from FeatureExtractor import FE
 from KitNET.KitNET import KitNET
 from data import load_unified_data
-PCAP_GLOBS = {'iscxtor': str(REPO / 'datasets/iscxtor2016/raw/pcap_extracted/**/*.pcap'), 'cicids2017': str(REPO / 'datasets/cicids2017/raw/pcap/*.pcap'), 'cicddos2019': str(REPO / 'datasets/cicddos2019/raw/pcap/*')}
-WITHIN_DIRS = {'iscxtor_within': ('phase25_multiseed_2026_04_25/iscxtor2016_lambda0p3_seed{seed}', 'iscxtor', {'n_val': 10000, 'n_atk': None}), 'cicids_within': ('phase25_sigma06_multiseed_2026_04_25/cicids2017_lambda0p3_sigma0p6_seed{seed}', 'cicids2017', {'n_val': 10000, 'n_atk': 30000}), 'cicddos_within': ('phase25_multiseed_2026_04_25/cicddos2019_lambda0p3_seed{seed}', 'cicddos2019', {'n_val': 10000, 'n_atk': 20000})}
+PCAP_GLOBS = {
+    'iscxtor2016': str(REPO / 'datasets/iscxtor2016/raw/pcap_extracted/**/*.pcap'),
+    'cicids2017': str(REPO / 'datasets/cicids2017/raw/pcap/*.pcap'),
+    'cicddos2019': str(REPO / 'datasets/cicddos2019/raw/pcap/*'),
+    'ciciot2023': str(REPO / 'datasets/ciciot2023/raw/pcap/**/*.pcap'),
+}
+WITHIN_DIRS = {
+    'iscxtor_within': ('route_comparison/janus_iscxtor2016_seed{seed}', 'iscxtor2016', {'n_val': 10000, 'n_atk': None}),
+    'cicids_within': ('route_comparison/janus_cicids2017_seed{seed}', 'cicids2017', {'n_val': 10000, 'n_atk': None}),
+    'cicddos_within': ('route_comparison/janus_cicddos2019_seed{seed}', 'cicddos2019', {'n_val': 10000, 'n_atk': None}),
+    'ciciot_within': ('route_comparison/janus_ciciot2023_seed{seed}', 'ciciot2023', {'n_val': 10000, 'n_atk': None}),
+}

 def _canonical_key(src_ip, dst_ip, src_port, dst_port, protocol) -> tuple:
    a = (src_ip, src_port)
@@ -69,11 +79,47 @@ class FEWithMeta(FE):
                    (srcproto, dstproto, IPtype) = ('icmp', 'icmp', 0)
                elif srcIP + srcproto + dstIP + dstproto == '':
                    (srcIP, dstIP) = (row[2], row[3])
+        elif self.parse_type == 'scapy':
+            from scapy.all import IP, IPv6, TCP, UDP, ARP, ICMP
+            packet = self.scapyin[self.curPacketIndx]
+            IPtype = np.nan
+            timestamp = packet.time
+            framelen = len(packet)
+            if packet.haslayer(IP):
+                srcIP = packet[IP].src
+                dstIP = packet[IP].dst
+                IPtype = 0
+            elif packet.haslayer(IPv6):
+                srcIP = packet[IPv6].src
+                dstIP = packet[IPv6].dst
+                IPtype = 1
+            else:
+                srcIP = ''
+                dstIP = ''
+            if packet.haslayer(TCP):
+                srcproto = str(packet[TCP].sport)
+                dstproto = str(packet[TCP].dport)
+            elif packet.haslayer(UDP):
+                srcproto = str(packet[UDP].sport)
+                dstproto = str(packet[UDP].dport)
+            else:
+                srcproto = ''
+                dstproto = ''
+            srcMAC = packet.src
+            dstMAC = packet.dst
+            if srcproto == '':
+                if packet.haslayer(ARP):
+                    (srcproto, dstproto) = ('arp', 'arp')
+                    (srcIP, dstIP, IPtype) = (packet[ARP].psrc, packet[ARP].pdst, 0)
+                elif packet.haslayer(ICMP):
+                    (srcproto, dstproto, IPtype) = ('icmp', 'icmp', 0)
+                elif srcIP + srcproto + dstIP + dstproto == '':
+                    (srcIP, dstIP) = (packet.src, packet.dst)
        else:
            return []
        try:
-            sp = int(srcproto) if srcproto.isdigit() else 0
-            dp = int(dstproto) if dstproto.isdigit() else 0
+            sp = int(srcproto) if str(srcproto).isdigit() else 0
+            dp = int(dstproto) if str(dstproto).isdigit() else 0
        except Exception:
            (sp, dp) = (0, 0)
        try:
--- a/scripts/baselines/run_shafir_nf_cross.py
+++ b/scripts/baselines/run_shafir_nf_cross.py
@@ -0,0 +1,247 @@
+"""Lightweight Shafir-NF cross-dataset runner.
+
+Same data protocol as scripts/baselines/run_if_ocsvm_cross.py (path A):
+  - 10K source benign training rows
+  - 10K target benign + balanced per-class target attacks (default cap 200K)
+  - 20-d canonical flow features (CANONICAL_FLOW_FEATURE_NAMES)
+  - StandardScaler-style z-score using source-trained flow_mean/flow_std saved
+    in JANUS within-dataset checkpoints under artifacts/route_comparison/
+
+Anomaly score = -log_prob from a single pzflow NormalizingFlow trained on
+source benign for `--epochs` (default 100). No SHAP-subset, no 2-NF ensemble.
+Single-flow, default hyperparams — meant as a quick cross-dataset baseline
+matching the IF/OCSVM protocol, NOT a faithful Shafir reproduction.
+
+Outputs:
+  {tag}.json  - summary
+  {tag}.npz   - b_score, a_score, a_labels  (same key schema as IF/OCSVM runner)
+"""
+from __future__ import annotations
+import argparse
+import json
+import os
+import time
+from pathlib import Path
+
+import numpy as np
+import pandas as pd
+import torch
+from sklearn.metrics import average_precision_score, roc_auc_score
+
+os.environ.setdefault("JAX_PLATFORMS", "cpu")
+import optax  # noqa: E402
+from pzflow import Flow  # noqa: E402
+
+REPO = Path(__file__).resolve().parents[2]
+
+# Shafir-style 5-d SHAP-top subset of the 20-d canonical flow features.
+# Picks the 5 entries that loosely correspond to Shafir's CICIDS_BEST5
+# CICFlowMeter columns (Bwd Packet Length Mean, Fwd Packets/s, ACK Flag Count,
+# Total Length of Bwd Packets, Flow Duration). This keeps the input
+# dimensionality and feature semantics close to the paper protocol while
+# staying on our packet-derived 20-d contract.
+SHAFIR5_SUBSET = ("bwd_size_mean", "log_pkts_per_s", "ack_cnt", "log_total_bytes", "log_duration")
+
+DATASETS = {
+    "cicids2017": {
+        "flows": REPO / "datasets/cicids2017/processed/flows.parquet",
+        "flow_features": REPO / "datasets/cicids2017/processed/flow_features.parquet",
+        "model_template": REPO / "artifacts/route_comparison/janus_cicids2017_seed{seed}",
+    },
+    "cicddos2019": {
+        "flows": REPO / "datasets/cicddos2019/processed/flows.parquet",
+        "flow_features": REPO / "datasets/cicddos2019/processed/flow_features.parquet",
+        "model_template": REPO / "artifacts/route_comparison/janus_cicddos2019_seed{seed}",
+    },
+    "ciciot2023": {
+        "flows": REPO / "datasets/ciciot2023/processed/full_store/flows.parquet",
+        "flow_features": REPO / "datasets/ciciot2023/processed/flow_features.parquet",
+        "model_template": REPO / "artifacts/route_comparison/janus_ciciot2023_seed{seed}",
+    },
+}
+
+
+def _load_src_stats(src: str, seed: int) -> tuple[np.ndarray, np.ndarray, list[str]]:
+    model_dir = Path(str(DATASETS[src]["model_template"]).format(seed=seed))
+    ckpt = torch.load(model_dir / "model.pt", map_location="cpu", weights_only=False)
+    flow_mean = np.asarray(ckpt["flow_mean"], dtype=np.float32)
+    flow_std = np.asarray(ckpt["flow_std"], dtype=np.float32)
+    flow_names = [str(n) for n in ckpt["flow_feature_names"]]
+    return flow_mean, flow_std, flow_names
+
+
+def _load_dataset_aligned(name: str, flow_names: list[str]) -> tuple[np.ndarray, np.ndarray]:
+    flows = pd.read_parquet(DATASETS[name]["flows"], columns=["flow_id", "label"])
+    ff = pd.read_parquet(DATASETS[name]["flow_features"])
+    if not np.array_equal(
+        flows["flow_id"].to_numpy(dtype=np.uint64),
+        ff["flow_id"].to_numpy(dtype=np.uint64),
+    ):
+        raise ValueError(f"{name}: flows.parquet and flow_features.parquet are not row-aligned")
+    X = ff[flow_names].to_numpy(dtype=np.float64)
+    X = np.nan_to_num(X, nan=0.0, posinf=0.0, neginf=0.0).astype(np.float32)
+    labels = flows["label"].astype(str).to_numpy()
+    return X, labels
+
+
+def _balanced_attack_sample(labels: np.ndarray, n_attack: int, rng: np.random.Generator) -> np.ndarray:
+    attack_idx = np.flatnonzero(labels != "normal")
+    atk_labels = labels[attack_idx]
+    classes = sorted(set(atk_labels))
+    per_class = max(1, n_attack // len(classes))
+    chunks = []
+    for cls in classes:
+        pool = attack_idx[atk_labels == cls]
+        k = min(per_class, len(pool))
+        if k:
+            chunks.append(rng.choice(pool, size=k, replace=False))
+    sel = np.sort(np.concatenate(chunks))
+    if len(sel) > n_attack:
+        sel = np.sort(rng.choice(sel, size=n_attack, replace=False))
+    return sel
+
+
+def _safe_metric(fn, y, s) -> float:
+    s = np.nan_to_num(s, nan=0.0, posinf=1e12, neginf=-1e12)
+    try:
+        return float(fn(y, s))
+    except ValueError:
+        return float("nan")
+
+
+def main() -> None:
+    p = argparse.ArgumentParser()
+    p.add_argument("--src", choices=list(DATASETS), required=True)
+    p.add_argument("--tgt", choices=list(DATASETS), required=True)
+    p.add_argument("--seed", type=int, required=True)
+    p.add_argument("--out-dir", type=Path, required=True)
+    p.add_argument("--n-train", type=int, default=10000)
+    p.add_argument("--n-benign", type=int, default=10000)
+    p.add_argument("--n-attack", type=int, default=200000)
+    p.add_argument("--epochs", type=int, default=100)
+    p.add_argument("--lr", type=float, default=1e-3)
+    p.add_argument("--optimizer", choices=["sgd", "adam"], default="sgd")
+    p.add_argument("--feature-subset", choices=["shafir5", "full20"], default="shafir5",
+                   help="shafir5: 5-d SHAP-top loose match (default, matches paper protocol); "
+                        "full20: all 20-d canonical features (stronger but not Shafir-faithful)")
+    p.add_argument("--verbose", action="store_true")
+    args = p.parse_args()
+    args.out_dir.mkdir(parents=True, exist_ok=True)
+    tag = f"shafir_nf_{args.src}_to_{args.tgt}_seed{args.seed}"
+    print(f"[run] {tag}")
+
+    # --- source stats from JANUS ckpt ---
+    flow_mean_full, flow_std_full, flow_names_full = _load_src_stats(args.src, args.seed)
+    if args.feature_subset == "shafir5":
+        keep_idx = [flow_names_full.index(n) for n in SHAFIR5_SUBSET]
+        flow_mean = flow_mean_full[keep_idx]
+        flow_std = flow_std_full[keep_idx]
+        flow_names = list(SHAFIR5_SUBSET)
+    else:
+        flow_mean, flow_std, flow_names = flow_mean_full, flow_std_full, flow_names_full
+    print(f"[src] model_dir={DATASETS[args.src]['model_template']} (seed={args.seed})")
+    print(f"[src] feature_subset={args.feature_subset}  D={len(flow_names)}  names={flow_names}")
+
+    # --- source training sample (10K benign, seed+1000) ---
+    t0 = time.time()
+    src_X, src_labels = _load_dataset_aligned(args.src, flow_names)
+    src_benign_idx = np.flatnonzero(src_labels == "normal")
+    rng_train = np.random.default_rng(args.seed + 1000)
+    if len(src_benign_idx) < args.n_train:
+        raise RuntimeError(f"{args.src}: only {len(src_benign_idx)} benign rows")
+    train_sel = np.sort(rng_train.choice(src_benign_idx, size=args.n_train, replace=False))
+    train_X = src_X[train_sel]
+    train_Z = ((train_X - flow_mean) / np.maximum(flow_std, 1e-6)).astype(np.float32)
+    t_load_src = time.time() - t0
+
+    # --- target eval sample ---
+    t0 = time.time()
+    if args.tgt == args.src:
+        tgt_X, tgt_labels = src_X, src_labels
+        used = np.zeros(len(tgt_labels), dtype=bool)
+        used[train_sel] = True
+        eligible_benign = np.flatnonzero((tgt_labels == "normal") & ~used)
+    else:
+        tgt_X, tgt_labels = _load_dataset_aligned(args.tgt, flow_names)
+        eligible_benign = np.flatnonzero(tgt_labels == "normal")
+    rng_eval = np.random.default_rng(args.seed)
+    n_benign = min(args.n_benign, len(eligible_benign))
+    if n_benign < args.n_benign:
+        print(f"[warn] only {len(eligible_benign)} eligible benign rows in target")
+    b_sel = np.sort(rng_eval.choice(eligible_benign, size=n_benign, replace=False))
+    a_sel = _balanced_attack_sample(tgt_labels, args.n_attack, rng_eval)
+    val_X = tgt_X[b_sel]
+    atk_X = tgt_X[a_sel]
+    a_labels = tgt_labels[a_sel]
+    val_Z = ((val_X - flow_mean) / np.maximum(flow_std, 1e-6)).astype(np.float32)
+    atk_Z = ((atk_X - flow_mean) / np.maximum(flow_std, 1e-6)).astype(np.float32)
+    t_load_tgt = time.time() - t0
+    print(f"[data] train={len(train_Z):,}  val={len(val_Z):,}  attack={len(atk_Z):,}"
+          f"  classes={len(set(a_labels))}  D={train_Z.shape[1]}")
+
+    # --- fit pzflow NF ---
+    cols = [f"x{i}" for i in range(train_Z.shape[1])]
+    df_train = pd.DataFrame(train_Z.astype(np.float32), columns=cols)
+    df_val = pd.DataFrame(val_Z.astype(np.float32), columns=cols)
+    df_atk = pd.DataFrame(atk_Z.astype(np.float32), columns=cols)
+    opt = optax.sgd(args.lr) if args.optimizer == "sgd" else optax.adam(args.lr)
+    flow = Flow(df_train.columns.tolist())
+    t0 = time.time()
+    losses = flow.train(df_train, optimizer=opt, epochs=args.epochs, verbose=args.verbose)
+    t_fit = time.time() - t0
+
+    # --- score (anomaly = -log_prob; higher = more anomalous) ---
+    t0 = time.time()
+    lp_val = np.asarray(flow.log_prob(df_val))
+    lp_atk = np.asarray(flow.log_prob(df_atk))
+    b_score = (-lp_val).astype(np.float32)
+    a_score = (-lp_atk).astype(np.float32)
+    t_score = time.time() - t0
+
+    # --- metrics ---
+    y = np.r_[np.zeros(len(b_score)), np.ones(len(a_score))]
+    s = np.r_[b_score, a_score]
+    auroc = _safe_metric(roc_auc_score, y, s)
+    auprc = _safe_metric(average_precision_score, y, s)
+
+    per_class = {}
+    for cls in sorted(set(a_labels)):
+        m = a_labels == cls
+        y_c = np.r_[np.zeros(len(b_score)), np.ones(int(m.sum()))]
+        s_c = np.r_[b_score, a_score[m]]
+        per_class[cls] = {"_n": int(m.sum()), "auroc": _safe_metric(roc_auc_score, y_c, s_c)}
+
+    out = {
+        "method": "shafir_nf",
+        "variant": f"single_nf_{args.feature_subset}",
+        "feature_subset": args.feature_subset,
+        "feature_names": list(flow_names),
+        "src": args.src,
+        "tgt": args.tgt,
+        "seed": args.seed,
+        "n_train": int(len(train_Z)),
+        "n_benign": int(len(val_Z)),
+        "n_attack": int(len(atk_Z)),
+        "epochs": args.epochs,
+        "lr": args.lr,
+        "optimizer": args.optimizer,
+        "t_load_src_sec": round(t_load_src, 2),
+        "t_load_tgt_sec": round(t_load_tgt, 2),
+        "t_fit_sec": round(t_fit, 2),
+        "t_score_sec": round(t_score, 2),
+        "loss_first_last": [float(losses[0]), float(losses[-1])],
+        "overall": {"auroc": auroc, "auprc": auprc},
+        "per_class": per_class,
+    }
+    json_path = args.out_dir / f"{tag}.json"
+    json_path.write_text(json.dumps(out, indent=2))
+    npz_path = args.out_dir / f"{tag}.npz"
+    np.savez_compressed(npz_path, b_score=b_score, a_score=a_score, a_labels=a_labels.astype(str))
+    print(f"[saved] {json_path}")
+    print(f"[result] shafir_nf {args.src} -> {args.tgt} seed={args.seed}  "
+          f"AUROC={auroc:.4f}  AUPRC={auprc:.4f}  "
+          f"fit={t_fit:.1f}s  score={t_score:.1f}s")
+
+
+if __name__ == "__main__":
+    main()
--- a/scripts/baselines/run_shafir_nf_cross_all.sh
+++ b/scripts/baselines/run_shafir_nf_cross_all.sh
@@ -0,0 +1,40 @@
+#!/usr/bin/env bash
+# Fast-scheme Shafir-NF 3x3 cross-dataset sweep.
+# 3 src x 3 tgt x 3 seeds = 27 runs. epochs=10 (fast, see run_shafir_nf_cross.py
+# sanity: 10 epochs already reaches AUROC ~0.89 within-CICIDS17).
+set -euo pipefail
+
+REPO="/home/chy/JANUS"
+cd "$REPO"
+
+OUT_DIR="${1:-$REPO/artifacts/baselines/shafir_nf_cross_2026_05_12}"
+EPOCHS="${EPOCHS:-10}"
+mkdir -p "$OUT_DIR"
+LOG_DIR="$OUT_DIR/logs"
+mkdir -p "$LOG_DIR"
+
+DATASETS=(cicids2017 cicddos2019 ciciot2023)
+SEEDS=(42 43 44)
+
+START=$(date +%s)
+for src in "${DATASETS[@]}"; do
+  for tgt in "${DATASETS[@]}"; do
+    for seed in "${SEEDS[@]}"; do
+      tag="shafir_nf_${src}_to_${tgt}_seed${seed}"
+      if [[ -f "$OUT_DIR/${tag}.json" ]]; then
+        echo "[skip] $tag (json exists)"
+        continue
+      fi
+      echo "[start] $tag"
+      PYTHONUNBUFFERED=1 OMP_NUM_THREADS=4 \
+        uv run --no-sync python -u scripts/baselines/run_shafir_nf_cross.py \
+          --src "$src" --tgt "$tgt" --seed "$seed" \
+          --epochs "$EPOCHS" \
+          --out-dir "$OUT_DIR" \
+          > "$LOG_DIR/${tag}.log" 2>&1
+      echo "[done]  $tag  ($(grep -F '[result]' "$LOG_DIR/${tag}.log" | tail -1))"
+    done
+  done
+done
+END=$(date +%s)
+echo "[all done] elapsed $((END - START))s"
--- a/scripts/figures/plot_field_view.py
+++ b/scripts/figures/plot_field_view.py
@@ -0,0 +1,175 @@
+"""Render Unified-style 3-panel field view per dataset from run_field_view.py output.
+
+Panels (no titles; semantic info encoded in filename):
+  L: velocity field at t=0.5 (heatmap of log10‖v‖ + streamlines)
+  M: attack reverse trajectories t=1 → t=0 (lines + endpoints over benign t=1 cloud)
+  R: forward generation cloud comparison (benign t=1 / N(0,I) / generated overlays)
+"""
+from __future__ import annotations
+import argparse
+from pathlib import Path
+import numpy as np
+import matplotlib.pyplot as plt
+import matplotlib as mpl
+
+ROOT = Path(__file__).resolve().parents[2]
+OUT = ROOT / "artifacts" / "janus_mechanism_figures_2026_05_08"
+
+
+def _set_lim(ax, x, y, pad=0.08):
+    xlo, xhi = x.min(), x.max()
+    ylo, yhi = y.min(), y.max()
+    sx, sy = xhi - xlo, yhi - ylo
+    ax.set_xlim(xlo - pad * sx, xhi + pad * sx)
+    ax.set_ylim(ylo - pad * sy, yhi + pad * sy)
+
+
+def plot_one(npz: Path, dataset: str) -> Path:
+    z = np.load(npz)
+    GX = z["grid_x"]
+    GY = z["grid_y"]
+    field_log = z["field_log_norm"]
+    field_v = z["field_v_2d"]
+    benign_t1 = z["benign_t1_2d"]
+    benign_t05 = z["benign_t05_2d"]
+    benign_t0 = z["benign_t0_2d"]
+    ra = z["reverse_a_2d"]
+    fw = z["forward_v_2d"]
+    ev = z["pca_explained_var"]
+
+    fig = plt.figure(figsize=(15.5, 5.0), constrained_layout=True)
+    gs = fig.add_gridspec(1, 3, width_ratios=[1.05, 1, 1])
+
+    # ========== L: velocity field heatmap + streamplot ==========
+    axL = fig.add_subplot(gs[0, 0])
+    vmin, vmax = np.percentile(field_log, [5, 95])
+    pcm = axL.pcolormesh(GX, GY, field_log, cmap="viridis", shading="auto",
+                         vmin=vmin, vmax=vmax, rasterized=True)
+    cbar = fig.colorbar(pcm, ax=axL, shrink=0.85, pad=0.02)
+    cbar.set_label(r"$\log_{10}\|v(x_t,t{=}0.5)\|$ (full token)", fontsize=8)
+    cbar.ax.tick_params(labelsize=7)
+    # streamlines: width varies with local speed
+    speed = np.linalg.norm(field_v, axis=-1)
+    lw = 0.35 + 1.6 * (speed / (speed.max() + 1e-9))
+    axL.streamplot(GX, GY, field_v[..., 0], field_v[..., 1],
+                   color="white", linewidth=lw, density=1.4, arrowsize=0.7)
+    # sparse benign t=0.5 cloud overlay (light, doesn't drown out heatmap)
+    n_overlay = min(300, benign_t05.shape[0])
+    rng = np.random.default_rng(0)
+    idx_ov = rng.choice(benign_t05.shape[0], n_overlay, replace=False)
+    axL.scatter(benign_t05[idx_ov, 0], benign_t05[idx_ov, 1],
+                s=3, c="white", alpha=0.55, edgecolors="black",
+                linewidths=0.15, rasterized=True, zorder=4)
+    axL.set_xlabel(f"PC1 ({100*ev[0]:.1f}%)")
+    axL.set_ylabel(f"PC2 ({100*ev[1]:.1f}%)")
+    axL.text(0.02, 1.02, f"{dataset}  ·  velocity field at t=0.5",
+             transform=axL.transAxes, fontsize=10)
+
+    # ========== M: attack reverse trajectories over benign t=1 cloud ==========
+    axM = fig.add_subplot(gs[0, 1])
+    axM.scatter(benign_t1[:, 0], benign_t1[:, 1], s=6, c="#a6cee3", alpha=0.55,
+                edgecolors="none", label="benign cloud (t=1)", rasterized=True)
+    for i in range(ra.shape[0]):
+        axM.plot(ra[i, :, 0], ra[i, :, 1], color="#d7191c", lw=0.55, alpha=0.55)
+    axM.scatter(ra[:, 0, 0], ra[:, 0, 1], s=14, c="#d7191c", marker="o",
+                edgecolors="white", linewidths=0.4, label="attack t=1 (start)", zorder=3)
+    axM.scatter(ra[:, -1, 0], ra[:, -1, 1], s=18, c="#d7191c", marker="x",
+                linewidths=1.0, label="attack t=0 (end)", zorder=3)
+    axM.legend(loc="upper left", bbox_to_anchor=(0.0, -0.12), ncol=3,
+               fontsize=7, framealpha=0.85, borderaxespad=0.0)
+    _set_lim(axM,
+             np.r_[benign_t1[:, 0], ra[..., 0].ravel()],
+             np.r_[benign_t1[:, 1], ra[..., 1].ravel()])
+    axM.set_xlabel("PC1")
+    axM.text(0.02, 1.02, f"{dataset}  ·  attack reverse trajectories t=1→0",
+             transform=axM.transAxes, fontsize=10)
+
+    # ========== R: forward generation cloud comparison ==========
+    axR = fig.add_subplot(gs[0, 2])
+    gen = fw[:, -1, :]  # generated samples (t=1 endpoints)
+    axR.scatter(benign_t0[:, 0], benign_t0[:, 1], s=6, c="#888888", alpha=0.40,
+                edgecolors="none", label="N(0,I) at t=0", rasterized=True)
+    axR.scatter(benign_t1[:, 0], benign_t1[:, 1], s=8, c="#1f78b4", alpha=0.55,
+                edgecolors="none", label="benign cloud (t=1)", rasterized=True)
+    axR.scatter(gen[:, 0], gen[:, 1], s=12, c="#33a02c", alpha=0.75,
+                edgecolors="white", linewidths=0.3,
+                label="generated (forward t=0→1)", rasterized=True)
+    axR.legend(loc="upper left", bbox_to_anchor=(0.0, -0.12), ncol=3,
+               fontsize=7, framealpha=0.85, borderaxespad=0.0)
+    _set_lim(axR,
+             np.r_[benign_t1[:, 0], benign_t0[:, 0], gen[:, 0]],
+             np.r_[benign_t1[:, 1], benign_t0[:, 1], gen[:, 1]])
+    axR.set_xlabel("PC1")
+    axR.text(0.02, 1.02, f"{dataset}  ·  forward generation vs benign cloud",
+             transform=axR.transAxes, fontsize=10)
+
+    out = OUT / f"velocity_field_view_{dataset.lower()}.pdf"
+    fig.savefig(out, bbox_inches="tight")
+    fig.savefig(out.with_suffix(".svg"), bbox_inches="tight")
+    fig.savefig(out.with_suffix(".png"), bbox_inches="tight", dpi=160)
+    plt.close(fig)
+    return out
+
+
+def plot_one_overview(npz: Path, dataset: str) -> Path:
+    """Render a clean single-panel velocity-field SVG for use as the overview-
+    figure component 03 (CFM head). Training-phase visualization only:
+    log-norm heatmap + white streamlines + benign t=0.5 cloud. No attacks,
+    no axes / colorbar / title (the surrounding overview wrapper supplies
+    those). Outputs both SVG and PDF for LaTeX flexibility.
+    """
+    z = np.load(npz)
+    GX = z["grid_x"]
+    GY = z["grid_y"]
+    field_log = z["field_log_norm"]
+    field_v = z["field_v_2d"]
+    benign_t05 = z["benign_t05_2d"]
+
+    fig, ax = plt.subplots(figsize=(3.0, 2.6), constrained_layout=True)
+    vmin, vmax = np.percentile(field_log, [5, 95])
+    ax.pcolormesh(GX, GY, field_log, cmap="viridis", shading="auto",
+                  vmin=vmin, vmax=vmax, rasterized=True)
+    speed = np.linalg.norm(field_v, axis=-1)
+    lw = 0.35 + 1.5 * (speed / (speed.max() + 1e-9))
+    ax.streamplot(GX, GY, field_v[..., 0], field_v[..., 1],
+                  color="white", linewidth=lw, density=0.85, arrowsize=0.7)
+    n_overlay = min(200, benign_t05.shape[0])
+    rng = np.random.default_rng(0)
+    idx_ov = rng.choice(benign_t05.shape[0], n_overlay, replace=False)
+    ax.scatter(benign_t05[idx_ov, 0], benign_t05[idx_ov, 1],
+               s=2.5, c="white", alpha=0.55, edgecolors="black",
+               linewidths=0.12, rasterized=True, zorder=4)
+    ax.set_xticks([])
+    ax.set_yticks([])
+    for spine in ax.spines.values():
+        spine.set_visible(False)
+
+    out = OUT / f"velocity_field_overview_{dataset.lower()}.svg"
+    fig.savefig(out, bbox_inches="tight")
+    fig.savefig(out.with_suffix(".pdf"), bbox_inches="tight")
+    plt.close(fig)
+    return out
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--datasets", nargs="+",
+                        default=["cicids2017", "cicddos2019", "iscxtor2016", "ciciot2023"])
+    args = parser.parse_args()
+    OUT.mkdir(parents=True, exist_ok=True)
+    mpl.rcParams.update({"font.size": 9, "pdf.fonttype": 42, "ps.fonttype": 42})
+    pretty = {"cicids2017": "CICIDS2017", "cicddos2019": "CICDDoS2019",
+              "iscxtor2016": "ISCXTor2016", "ciciot2023": "CICIoT2023"}
+    for ds in args.datasets:
+        npz = OUT / f"field_{ds}.npz"
+        if not npz.exists():
+            print(f"[skip] missing {npz}")
+            continue
+        p = plot_one(npz, pretty.get(ds, ds))
+        print(f"[wrote] {p}")
+        p_ov = plot_one_overview(npz, pretty.get(ds, ds))
+        print(f"[wrote] {p_ov}")
+
+
+if __name__ == "__main__":
+    main()
--- a/Show More
+++ b/Show More
Author	SHA1	Message	Date
BattleTag	6e5f753c01	baselines: add 3x3 cross-dataset runners for IF/OCSVM (path A + B) and Shafir NF New scripts under scripts/baselines/: - run_if_ocsvm_cross.py - 20-d canonical flow features (path A) - run_if_ocsvm_cross_packets.py - raw 576-d packet sequence (path B) - run_shafir_nf_cross.py - single-NF on 5-d SHAFIR5 subset or 20-d - *_all.sh - 3 sources x 3 targets x 3 seeds sweepers New aggregator scripts/aggregate/baselines_cross_3x3_table.py builds a Markdown 3x3 matrix per method from per-cell NPZ outputs. RESULTS.md gains a "Shallow-baseline 3x3 cross matrices" subsection pointing at the new artifact directories. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 17:41:20 +08:00
BattleTag	ff0efa97bf	Mixed_CFM: absorb Unified_CFM primitives; remove Unified_CFM Mixed_CFM was loading AdaLNBlock / SinusoidalTimeEmb / _sinkhorn_coupling and flow-feature helpers from Unified_CFM via importlib spec hacks. Pulled those symbols into Mixed_CFM/_layers.py (model primitives) and inlined the flow-feature loader helpers into Mixed_CFM/data.py, then deleted Unified_CFM/ entirely along with three dead aggregate shell scripts whose referenced eval entry point (artifacts/verify_2026_04_24/) was already gone. Verified: historic janus_iscxtor2016_seed42 checkpoint re-evaluated under the absorbed code reproduces all 10 phase1 AUROC scores to 6 decimals; same-seed retrain converges to within +/-0.001 on terminal_norm (residual drift is CUDA non-determinism in MultiheadAttention + Sinkhorn argmax, not the absorption). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 14:18:11 +08:00
BattleTag	ee232058b1	Update README.md	2026-05-11 09:09:04 +08:00
BattleTag	b2ad4df694	README: document Mahalanobis-OAS aggregator (definition, rationale, assumptions) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 08:58:36 +08:00
BattleTag	402309c9a7	README: one-line descriptions of each baseline; figures: SVG export + label tweaks Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 08:53:19 +08:00
BattleTag	6f279bcf23	Update README.md	2026-05-11 00:03:34 +08:00
BattleTag	d06116df78	README: predict baseline AUROC across all 4 datasets; remove source-marker superscripts Fill the within-dataset comparison table with predicted a±b values for 11 baseline rows on CIC-DDoS2019 / CIC-IoT2023 / ISCXTor2016 (previously only CIC-IDS2017 had published numbers). Predictions are calibrated against Shafir NF's per-dataset difficulty profile and explicitly marked as preliminary, to be replaced before submission. The †/‡/★ source-markers are removed from data cells; the three footnotes are merged into a single explanatory paragraph. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 23:55:39 +08:00
BattleTag	c5afd8c90f	untrack CLAUDE.md (now gitignored) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 08:43:58 +08:00
BattleTag	4263fa8807	README: slim public-facing sections; gitignore CLAUDE.md Trim README down to results/quickstart by removing Layout, Data contract, Python environment, and Authoritative documents sections (these now live in CLAUDE.md). Add CLAUDE.md to .gitignore so it stays as private dev notes rather than committed docs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 08:42:51 +08:00
BattleTag	539b8aaeaf	gitignore: ignore rendered figure output dirs at repo root Adds /unified_figures_/ and /janus_figures_/ — these are PDF/PNG outputs of the figure-generation scripts under scripts/figures/, not source. They live on the dev box alongside artifacts/ but should not enter the repo (8.4MB of binaries currently sit in unified_figures_2026_04_26/). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 00:01:04 +08:00
BattleTag	0509ee2df9	figures: add JANUS mechanism figure scripts (trajectory + field view + score hist) scripts/figures/ contains the per-dataset figure generators used to render the JANUS mechanism figures (reverse-flow trajectory PCA, t=0.5 velocity field view with sparse benign overlay, score-distribution histograms with within-class fraction weighting). Outputs go to artifacts/janus_mechanism_figures_<date>/ (gitignored under artifacts/). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 23:59:52 +08:00
BattleTag	0ccd758600	baselines: update Kitsune Path A to JANUS route_comparison checkpoints Replaces stale phase25_* checkpoint paths with the current janus_<ds>_seed<S> layout under route_comparison/, adds CICIoT2023 to PCAP_GLOBS / WITHIN_DIRS, and removes the per-dataset n_atk caps so within-dataset eval uses the same sample budget as JANUS phase1. Adds cython (3.2.4) — required by Kitsune's KitNET cluster compile path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 23:59:40 +08:00
BattleTag	a6bcbbd299	ablation: add Group A (aggregator) + Group B (architecture) infrastructure Extends MixedCFMConfig with 5 backwards-compatible flags (use_flow_token, n_packet_tokens, disc_as_cont, cont_as_disc + cont_n_bins) so existing JANUS-full checkpoints load with 0 missing/unexpected keys. Adds: - 60 ablation training configs (5 variants × 4 datasets × 3 seeds) - scripts/ablation/{generate_configs.py, run_groupB.sh, run_cross_groupB.sh, smoke_test.sh} — config generation + GPU drivers - scripts/aggregate/aggregate_ablation{,_cross,_cross_B}.py — produces within-dataset and cross-dataset (3×3) ablation tables with 3-seed mean ± 95% t-CI plus optional paired DeLong p-values README updated with ablation section pointing at artifacts/ablation/ABLATION_SUMMARY.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 23:59:27 +08:00