Commit Graph

3 Commits

Author SHA1 Message Date
ff0efa97bf Mixed_CFM: absorb Unified_CFM primitives; remove Unified_CFM
Mixed_CFM was loading AdaLNBlock / SinusoidalTimeEmb / _sinkhorn_coupling
and flow-feature helpers from Unified_CFM via importlib spec hacks. Pulled
those symbols into Mixed_CFM/_layers.py (model primitives) and inlined
the flow-feature loader helpers into Mixed_CFM/data.py, then deleted
Unified_CFM/ entirely along with three dead aggregate shell scripts whose
referenced eval entry point (artifacts/verify_2026_04_24/) was already gone.

Verified: historic janus_iscxtor2016_seed42 checkpoint re-evaluated under
the absorbed code reproduces all 10 phase1 AUROC scores to 6 decimals;
same-seed retrain converges to within +/-0.001 on terminal_norm (residual
drift is CUDA non-determinism in MultiheadAttention + Sinkhorn argmax,
not the absorption).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 14:18:11 +08:00
a6bcbbd299 ablation: add Group A (aggregator) + Group B (architecture) infrastructure
Extends MixedCFMConfig with 5 backwards-compatible flags (use_flow_token,
n_packet_tokens, disc_as_cont, cont_as_disc + cont_n_bins) so existing
JANUS-full checkpoints load with 0 missing/unexpected keys.

Adds:
- 60 ablation training configs (5 variants × 4 datasets × 3 seeds)
- scripts/ablation/{generate_configs.py, run_groupB.sh, run_cross_groupB.sh,
  smoke_test.sh} — config generation + GPU drivers
- scripts/aggregate/aggregate_ablation{,_cross,_cross_B}.py — produces
  within-dataset and cross-dataset (3×3) ablation tables with 3-seed mean
  ± 95% t-CI plus optional paired DeLong p-values

README updated with ablation section pointing at
artifacts/ablation/ABLATION_SUMMARY.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 23:59:27 +08:00
fae2db8cff Initial commit: code, paper, small artifacts 2026-05-07 20:47:30 +08:00