README: slim public-facing sections; gitignore CLAUDE.md

Trim README down to results/quickstart by removing Layout, Data contract,
Python environment, and Authoritative documents sections (these now live
in CLAUDE.md). Add CLAUDE.md to .gitignore so it stays as private dev
notes rather than committed docs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-09 08:42:51 +08:00
parent 539b8aaeaf
commit 4263fa8807
2 changed files with 6 additions and 76 deletions

2
.gitignore vendored
View File

@@ -31,3 +31,5 @@ Thumbs.db
/janus_figures_*/
*.tmp
CLAUDE.md

View File

@@ -1,6 +1,6 @@
# JANUS
**JANUS** (Joint Anomaly via Normalizing-flows of Unified States) — flow-matching unsupervised network anomaly detection over packet sequences.
**JANUS** — flow-matching unsupervised network anomaly detection over packet sequences.
JANUS is a packet-causal Transformer with **two output heads on a shared backbone**:
@@ -37,7 +37,9 @@ JANUS is the first NIDS method to use Flow Matching as the training paradigm in
‡ Numbers from Shafir et al. (arXiv'26) headline tables; protocol = train 10 K benign / SHAP-selected feature subsets per dataset (single NF).
★ Reproduced by us (3-seed mean ± std, 2-NF ensemble, CSV pipeline, paper-specified 5-feat SHAP subset). Shafir's paper does not publish an AUROC for CIC-IoT2023 — only F1 = 99.51 with Youden's-J threshold tuned on attack labels (a non-comparable thresholded protocol). For threshold-free head-to-head AUROC on this dataset we cite our reproduction.
JANUS sets new SOTA on **4/4 within-dataset benchmarks** under matched AUROC protocol — CIC-IDS2017 **+3.83**, CIC-DDoS2019 **+6.18**, CIC-IoT2023 **+23.66** (vs reproduced Shafir), ISCXTor2016 **+11.78** — all margins outside seed std. JANUS is fully unsupervised (benign-only training, no attack labels at any stage) and uses the Mahalanobis-OAS aggregator over its 10-d raw score vector with parameters fit on benign val only. Thresholded F1 metrics for JANUS across all four datasets are in `RESULTS.md` Section D and `artifacts/route_comparison/THRESHOLDED.md`.
JANUS is fully unsupervised (benign-only training, no attack labels at any stage) and uses the Mahalanobis-OAS aggregator over its 10-d raw score vector with parameters fit on benign val only.
Thresholded F1 metrics for JANUS across all four datasets are in `RESULTS.md` Section D.
### 3×3 cross-dataset transfer matrix
@@ -49,8 +51,6 @@ Source (rows) trained on 10K benign of source dataset; target (columns) tested o
| **CICDDoS19** | 0.9413 ± 0.0212 | _0.9918 ± 0.0005_ | 0.8767 ± 0.0068 |
| **CICIoT23** | 0.9394 ± 0.0063 | 0.9030 ± 0.0075 | _0.9590 ± 0.0022_ |
Forward CICIDS17→CICDDoS19 (0.969) beats Shafir 0.89 by **+0.08**; reverse CICDDoS19→CICIDS17 (0.941) approximately matches Shafir 0.93. CICIoT23 is hardest both as source and target — its IoT-protocol diversity makes the "benign of source ≈ benign of target" assumption brittle. Full table at `artifacts/route_comparison/CROSS_MATRIX_3x3.md`.
### Ablations (architecture & aggregator)
Two orthogonal ablation axes, each evaluated **within-dataset** (4 datasets × 3 seeds) **and** **cross-dataset** (3×3 transfer × 3 seeds):
@@ -73,65 +73,6 @@ Three ablations (B3 / B5 / A-aggregator) **marginally beat JANUS-full at within-
Full headline summary: `artifacts/ablation/ABLATION_SUMMARY.md`. Per-variant 3×3 cross matrices: `artifacts/ablation/ABLATION_CROSS_B_full.md` and `artifacts/ablation/ABLATION_TABLE_CROSS_full.md`.
## Layout
```
common/ Data contract — single source of truth for the
9-d packet schema, 20-d packet-derived flow schema,
label normalization, and packet preprocessing.
Mixed_CFM/ The JANUS model. Mixed continuousdiscrete CFM
with two output heads on a shared causal Transformer.
configs/ Per-(dataset × seed) training configs.
model.py MixedTokenCFM + MixedVelocity.
train.py / eval_phase1.py / eval_cross.py
Unified_CFM/ Legacy unified token CFM. Mixed_CFM imports its
AdaLNBlock + sinusoidal time embedding for backbone
reuse. Kept as internal ablation reference.
scripts/ Workspace-level pcap → artifact pipeline,
CSV adapters, cross-package eval tooling.
download/ UNB/CIC dataset downloaders.
baselines/ Third-party baseline runners (Kitsune, Shafir-NF,
Anomaly-Transformer).
aggregate/ Mahalanobis-OAS score-router + cross-matrix
orchestration. aggregate_score_router.py is the
deployable score path; run_cross_3x3.sh +
cross_3x3_table.py produce the cross matrix.
aggregate_ablation.py / aggregate_ablation_cross.py /
aggregate_ablation_cross_B.py produce the ablation
tables in artifacts/ablation/.
ablation/ B-group ablation training/eval drivers
(generate_configs.py, run_groupB.sh,
run_cross_groupB.sh).
tests/ Data-contract unit tests.
```
The following directories are **gitignored** (live on the dev box, not in the repo):
```
artifacts/ All run outputs (checkpoints, eval JSONs, score
npzs, figures). Per-(dataset × seed) model dirs at
artifacts/route_comparison/janus_<ds>_seed<N>/.
datasets/ Raw + processed datasets (~1 TB).
baselines/ Third-party baseline forks (Kitsune-py,
Anomaly-Transformer, ConMD, ganomaly, TIPSO-GAN, ...).
paper/ Paper sources & external PDFs (Shafir 2026, Lipman
2210.02747, etc.).
.venv/ uv-managed Python 3.14 virtual env.
```
## Data contract
Every processed dataset under `datasets/<name>/processed/` ships an aligned triple, all with the same row order (`flow_id = arange(N)`):
```
packets.npz packet_tokens [N, T_full, 9], packet_lengths [N], flow_id [N]
(or full_store/ — sharded PacketShardStore — for large datasets)
flows.parquet flow_id + label + 5-tuple metadata (src_ip, dst_ip, ports, protocol)
flow_features.parquet flow_id + label + 20 canonical packet-derived features
```
The 9-d packet schema and 20-d flow schema are FIXED in `common/data_contract.py`. Flow features are computed by `compute_flow_features_from_packets(packet_tokens, lens)` so row alignment is guaranteed.
## Quick start
```bash
@@ -202,16 +143,3 @@ Write one driver at `scripts/extract_<name>.py` that calls `extract_lib.extract_
To upgrade an existing artifact pair that lacks `flow_features.parquet`, run `scripts/generate_flow_features.py --packets-npz ... --flows-parquet ... --out ...` (or `--source-store` for sharded stores).
Common gotcha: if CSV timestamps and pcap epochs are in different time zones, `extract_lib` prints a diagnostic with the recommended `--time-offset`; rerun with that value.
## Authoritative documents
- `RESULTS.md` — full headline tables, per-attack analysis, JANUS configuration, thresholded operating-point metrics, what the experiments proved / disproved.
- `artifacts/ablation/ABLATION_SUMMARY.md` — paper-facing ablation summary (Group A aggregator + Group B architecture, both within and cross views).
- `Mixed_CFM/model.py` and `common/data_contract.py` — model + data-contract source of truth.
## Python environment
- `requires-python = ">=3.14"`; PyTorch pinned to the `pytorch-cu128` index, plus `mamba-ssm`, `causal-conv1d`, `scapy`, `dpkt`, `pyarrow`, `sklearn` (for the OAS aggregator).
- Two `pyproject.toml` files exist: root and `Mixed_CFM/`; they are not declared as a uv workspace and resolve independently. Run `uv run ...` from whichever directory owns the entry point.
- `Unified_CFM/` has no `pyproject.toml`; it uses the root venv (`uv run --no-sync python <script.py>`).
- Scripts under `scripts/download/` are pure stdlib — invoke with `python3`.