From a1e81f16b536eb0b7bd04a505a74dc4b4869ecf0 Mon Sep 17 00:00:00 2001
From: BattleTag
Date: Fri, 8 May 2026 11:51:47 +0800
Subject: [PATCH] README: academic-style within-dataset comparison table with 12 baselines + JANUS

---
 README.md | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 20e3515..d98c2f2 100644
--- a/README.md
+++ b/README.md
@@ -15,16 +15,29 @@ JANUS is the first NIDS method to use Flow Matching as the training paradigm in

 3-seed mean ± std AUROC. Selection-bias-free Mahalanobis-OAS aggregator on the 10-d JANUS score vector, fit on benign val only.

-### Within-dataset
+### Within-dataset comparison (AUROC %, mean ± std)

-| Task | Shafir 2026 SOTA | **JANUS** | Δ |
-|---|---|---|---|
-| ISCXTor2016 (NonTor → Tor) | 0.8731 | **0.9909 ± 0.0013** | **+0.118** |
-| CICIDS2017 within | 0.9303 | **0.9826 ± 0.0035** | **+0.052** |
-| CICDDoS2019 within | 0.93 | **0.9918 ± 0.0005** | **+0.062** |
-| CICIoT2023 within | F1=0.9951 (different metric) | 0.9590 ± 0.0022 (AUROC) | N/A — metric mismatch |
+| Method | Venue | CIC-IDS2017 | CIC-DDoS2019 | CIC-IoT2023 | ISCXTor2016 |
+|---|---|---:|---:|---:|---:|
+| Isolation Forest | classical | 55.27 ± 0.4 † | — | — | — |
+| OCSVM | classical | 59.59 ± 0.6 † | — | — | — |
+| AnoFormer | ICLR'22 | 63.37 ± 0.7 † | — | — | — |
+| GANomaly | BMVC'18 | 82.75 ± 5.6 † | — | — | — |
+| RD4AD | CVPR'22 | 83.78 ± 0.8 † | — | — | — |
+| TSLANet | ICML'24 | 84.45 ± 1.7 † | — | — | — |
+| ARCADE | — | 84.85 ± 2.0 † | — | — | — |
+| MFAD | — | 86.02 ± 0.8 † | — | — | — |
+| STFPM | BMVC'21 | 86.29 ± 1.7 † | — | — | — |
+| MMR | — | 89.26 ± 1.2 † | — | — | — |
+| Shafir NF + Shapley | arXiv'26 | 93.03 ‡ | 93.00 ‡ | 99.51 §‡ | 87.31 ‡ |
+| ConMD | TIFS'26 | 94.43 ± 0.1 † | — | — | — |
+| **JANUS (ours)** | — | **98.26 ± 0.35** | **99.18 ± 0.05** | **95.90 ± 0.22** | **99.09 ± 0.13** |

-3/3 directly comparable within-dataset benchmarks beat external Shafir 2026 SOTA. CICIoT2023 is reported as additional benchmark only (Shafir reports F1, we report AUROC; not a +SOTA claim). See `RESULTS.md` for caveats and the full headline table.
+† Numbers from ConMD (TIFS'26) Table I; protocol = train 10 K benign / test 5 K + 5 K balanced; 5-seed mean ± std.
+‡ Numbers from Shafir et al. (arXiv'26); protocol = train 10 K benign / SHAP-selected feature subsets per dataset.
+§ Metric mismatch on CIC-IoT2023: Shafir reports F1 = 99.51 (Youden's-J threshold tuned with attack labels), we report AUROC = 95.90 (threshold-free); not directly comparable. Thresholded F1 for JANUS is reported in `RESULTS.md` Section D and `artifacts/route_comparison/THRESHOLDED.md`.
+
+JANUS sets new SOTA on **3/3 directly comparable benchmarks** (CIC-IDS2017 +3.83, CIC-DDoS2019 +6.18, ISCXTor2016 +11.78) — all margins outside seed std. JANUS is fully unsupervised (benign-only training, no attack labels at any stage), and uses the Mahalanobis-OAS aggregator over its 10-d raw score vector with parameters fit on benign val only.

 ### 3×3 cross-dataset transfer matrix

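
Review note: the three margins quoted in the new closing paragraph can be re-derived from the table. The sketch below is only a sanity check, not part of the patch. The numbers are copied from the table rows; the dictionary names, the `margins` helper, and the "margin exceeds the sum of both reported stds" criterion are assumptions made here for illustration, since the patch does not define how "outside seed std" is measured. Where the cited source reports no std (the Shafir rows), the baseline std is treated as 0.

```python
# AUROC % (mean, std) for JANUS on the three directly comparable datasets,
# copied from the table in the patch.
JANUS = {
    "CIC-IDS2017": (98.26, 0.35),
    "CIC-DDoS2019": (99.18, 0.05),
    "ISCXTor2016": (99.09, 0.13),
}

# Strongest baseline per dataset as listed in the table: (method, mean, std).
# std is None where the table reports no std for that number.
BEST_BASELINE = {
    "CIC-IDS2017": ("ConMD", 94.43, 0.1),
    "CIC-DDoS2019": ("Shafir NF + Shapley", 93.00, None),
    "ISCXTor2016": ("Shafir NF + Shapley", 87.31, None),
}


def margins():
    """Return {dataset: (margin, clears_std)} for each comparable dataset."""
    out = {}
    for name, (mean, std) in JANUS.items():
        _, base_mean, base_std = BEST_BASELINE[name]
        # Round to two decimals to match the precision used in the README.
        margin = round(mean - base_mean, 2)
        # Assumed criterion (not stated in the patch): the margin must exceed
        # the JANUS std plus the baseline std, with a missing std counted as 0.
        spread = std + (base_std or 0.0)
        out[name] = (margin, margin > spread)
    return out


if __name__ == "__main__":
    for name, (margin, ok) in margins().items():
        print(f"{name}: +{margin:.2f} (outside combined std: {ok})")
```

Under these assumptions the computed margins match the +3.83, +6.18, and +11.78 stated in the README text. The JANUS figures are 3-seed statistics while the † baselines are 5-seed, so the std comparison is indicative rather than a formal significance test.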