baselines: add 3x3 cross-dataset runners for IF/OCSVM (path A + B) and Shafir NF

New scripts under scripts/baselines/:
- run_if_ocsvm_cross.py            - 20-d canonical flow features (path A)
- run_if_ocsvm_cross_packets.py    - raw 576-d packet sequence (path B)
- run_shafir_nf_cross.py           - single-NF on 5-d SHAFIR5 subset or 20-d
- *_all.sh                         - 3 sources x 3 targets x 3 seeds sweepers

New aggregator scripts/aggregate/baselines_cross_3x3_table.py builds a
Markdown 3x3 matrix per method from per-cell NPZ outputs.

RESULTS.md gains a "Shallow-baseline 3x3 cross matrices" subsection
pointing at the new artifact directories.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:41:20 +08:00
parent ff0efa97bf
commit 6e5f753c01
8 changed files with 979 additions and 0 deletions


@@ -133,6 +133,25 @@ Full 4×4 cross matrix at `artifacts/route_comparison/CROSS_MATRIX.md`. All
See `artifacts/route_comparison/SCORE_ROUTER.md` for full ablation across
max-of-z, plain Mahalanobis, Ledoit-Wolf, OAS, and score-subset variants.
#### Shallow-baseline 3×3 cross matrices (Isolation Forest, OCSVM) — 2026-05-12 add
Two input modalities were tested as cross-dataset reference points:
- **Path A** (`artifacts/baselines/if_ocsvm_cross_2026_05_11/`): IF and OCSVM
on the 20-d canonical flow features (`StandardScaler`). Strong shallow
baseline — best off-diagonal AUROC is OCSVM 0.966 on CICIDS17→CICDDoS19.
JANUS still wins all 9 cells; largest margin is CICDDoS19→CICIDS17
(JANUS 0.941 vs OCSVM 0.571, **+0.370 AUROC**).
- **Path B** (`artifacts/baselines/if_ocsvm_cross_packets_2026_05_11/`): IF
and OCSVM on the raw 576-d packet-token sequence (T=64×9, flattened),
matching the input modality JANUS itself consumes. AUROC is lower
across the board (on average 0.16 below path A); 3 IF cells and 1 OCSVM
cell drop **below random** (AUROC < 0.5). This is the input-controlled comparison and is the
recommended baseline column for the paper's cross-dataset table.
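
To make the "below random" reading concrete, here is a minimal sketch of AUROC computed as a pairwise ranking probability. The function name `auroc` and the toy scores are illustrative assumptions, not taken from the baseline scripts or their outputs. An AUROC under 0.5 means the detector ranks benign flows above attacks more often than not, so negating the scores gives exactly 1 minus the original value.

```python
def auroc(scores, labels):
    """AUROC as P(score_pos > score_neg), counting ties as 1/2.

    Quadratic in the number of samples; fine for illustration only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up): the anomaly scores mostly rank benign above attack.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

below_chance = auroc(scores, labels)            # under 0.5
flipped = auroc([-s for s in scores], labels)   # equals 1 - below_chance
print(f"{below_chance:.3f} {flipped:.3f}")
```

A below-0.5 cell therefore signals a systematically inverted ranking on the target dataset, not just a lack of signal.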
Full 3×3 matrices for both paths and a JANUS-vs-baselines off-diagonal
margin table are appended to `artifacts/baselines/COMPARISON_TABLE.md`.
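
As a rough sketch of how such an off-diagonal margin table could be derived, the snippet below subtracts a baseline matrix from a JANUS matrix cell by cell and renders the result as Markdown. The helper names `offdiag_margins` and `to_markdown` and the dict-of-pairs layout are assumptions for illustration, not the actual aggregator code. Only the single cell reported above is filled in.

```python
def offdiag_margins(ours, baseline):
    """Per-cell AUROC margin (ours - baseline) for source != target cells."""
    margins = {}
    for (src, tgt), score in ours.items():
        if src == tgt or (src, tgt) not in baseline:
            continue  # skip in-distribution cells and cells the baseline lacks
        margins[(src, tgt)] = round(score - baseline[(src, tgt)], 3)
    return margins


def to_markdown(margins):
    """Render the margins as a small Markdown table, one row per cell."""
    lines = ["| source | target | margin (AUROC) |", "|---|---|---|"]
    for (src, tgt), margin in sorted(margins.items()):
        lines.append(f"| {src} | {tgt} | {margin:+.3f} |")
    return "\n".join(lines)


# Only the cell quoted in the text; the other cells are not reproduced here.
janus = {("CICDDoS19", "CICIDS17"): 0.941}
ocsvm = {("CICDDoS19", "CICIDS17"): 0.571}

margins = offdiag_margins(janus, ocsvm)
table = to_markdown(margins)
print(table)
```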
### Reverse cross (CICDDoS2019 → CICIDS2017) — 2026-05-01 update
The reverse direction was the project's "stuck" failure mode (memory note
@@ -376,6 +395,11 @@ artifacts.
per-seed eval results across all experiments.
- `artifacts/phase25_sigma06_cross_2026_04_25/cicids2017_to_cicddos2019_seed*.json`
3-seed cross-dataset eval JSONs.
- `artifacts/baselines/if_ocsvm_cross_2026_05_11/CROSS_MATRIX_3x3.md`
IF/OCSVM 3×3 cross matrix on 20-d canonical flow features (path A).
- `artifacts/baselines/if_ocsvm_cross_packets_2026_05_11/CROSS_MATRIX_3x3.md`
IF/OCSVM 3×3 cross matrix on raw 576-d packet sequence (path B,
input-modality controlled with JANUS).
- Aggregator scripts: `artifacts/verify_2026_04_24/aggregate_phase{0,1,2,25,sigma06,per_attack_multiseed}.py`.
- Orchestrator scripts: `artifacts/verify_2026_04_24/run_phase*.sh`.