Commit f612dc7

test(clam): address Codex P2 — drop the auc<0.85 upper-bound tripwire
Codex correctly flagged that asserting auc < 0.85 in a library unit test turns a future quality improvement into a failing test: once a multi-method CHAODA ensemble lifts the signal past the 0.85 probe bar, cargo test -p ndarray would fail until an external lance-graph doc is updated. A library test must never fail because the code got better, and ndarray CI should not be coupled to a lance-graph note.

Fix: remove the upper-bound assertion. The test now asserts only lower-bound, forward-compatible invariants: valid range, bit-exact determinism, correct polarity (outliers >= cluster mean), and better-than-chance (auc > 0.5). The measured AUC (~ 0.62 today) is surfaced via the existing eprintln diagnostic, not enforced. Refreshing the PROBE-CHAODA-1000G FINDING in lance-graph when the ensemble lands is a documentation step, not a gate enforced from this library's test suite. Doc comment updated to match.

Re-run: test green, ROC_AUC=0.6240 still printed.

https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
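The lower-bound-only policy described in this message can be sketched as follows. This is a minimal illustration, not the code in src/hpc/clam.rs: the helper names (roc_auc, mean_score, check_lower_bound_invariants), the pairwise AUC formulation, and the assumption that scores are normalized to [0, 1] are all hypothetical.

```rust
// Hypothetical helpers for illustration; none of these names come from clam.rs.

/// Rank-based ROC AUC: the fraction of (outlier, inlier) pairs in which the
/// outlier has the higher score, with ties counted as half a win.
fn roc_auc(scores: &[f64], is_outlier: &[bool]) -> f64 {
    assert_eq!(scores.len(), is_outlier.len(), "one label per score");
    let (mut wins, mut pairs) = (0.0_f64, 0.0_f64);
    for (&s_out, _) in scores.iter().zip(is_outlier).filter(|(_, o)| **o) {
        for (&s_in, _) in scores.iter().zip(is_outlier).filter(|(_, o)| !**o) {
            pairs += 1.0;
            if s_out > s_in {
                wins += 1.0;
            } else if s_out == s_in {
                wins += 0.5;
            }
        }
    }
    assert!(pairs > 0.0, "need at least one outlier and one inlier");
    wins / pairs
}

/// Mean score over the points whose label equals `want`.
fn mean_score(scores: &[f64], is_outlier: &[bool], want: bool) -> f64 {
    let mut sum = 0.0_f64;
    let mut n = 0usize;
    for (s, &o) in scores.iter().zip(is_outlier) {
        if o == want {
            sum += *s;
            n += 1;
        }
    }
    sum / n as f64
}

/// Asserts lower bounds only, so a better detector can never fail the check.
/// Returns the measured AUC so the caller can log it without enforcing it.
fn check_lower_bound_invariants(scores: &[f64], is_outlier: &[bool]) -> f64 {
    // Assumption: scores are normalized to [0, 1]; NaN also fails this check.
    assert!(
        scores.iter().all(|s| (0.0..=1.0).contains(s)),
        "score outside [0, 1]"
    );
    let mean_out = mean_score(scores, is_outlier, true);
    let mean_clu = mean_score(scores, is_outlier, false);
    assert!(
        mean_out >= mean_clu,
        "polarity wrong: outliers ({mean_out:.4}) below cluster mean ({mean_clu:.4})"
    );
    let auc = roc_auc(scores, is_outlier);
    // Diagnostic only: the measured value is printed, never capped.
    eprintln!("ROC_AUC={auc:.4}");
    assert!(auc > 0.5, "anomaly signal is not better than chance (AUC={auc:.4})");
    auc
}
```

Because every assertion is a floor, a detector that separates the classes perfectly passes just as a weak one does; only a regression to chance level or below, or a polarity flip, trips the check.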
1 parent f5f7e76 commit f612dc7

1 file changed

Lines changed: 16 additions & 20 deletions

src/hpc/clam.rs

@@ -2590,15 +2590,15 @@ mod tests {
     /// here. PROBE-CHAODA-1000G therefore needs the multi-method ensemble (or an
     /// augmented signal) before it can pass — not just genomic fixtures.
     ///
-    /// This test locks the *robust* invariants — valid range, bit-exact
-    /// determinism, correct polarity (outliers ≥ cluster mean), better-than-
-    /// chance lower bound — with wide tolerance: the AUC may drift anywhere in
-    /// [0.5, 0.85) without breaking. The one tripwire is `auc < 0.85`: it does
-    /// not assert the measured 0.62, but it does fail by design if a future
-    /// change (e.g. a multi-method CHAODA ensemble) lifts the single-method
-    /// signal to the probe bar — forcing whoever does that to update the
-    /// PROBE-CHAODA-1000G FINDING in lance-graph rather than letting the
-    /// cross-repo claim silently rot.
+    /// This test asserts only *lower-bound*, forward-compatible invariants —
+    /// valid range, bit-exact determinism, correct polarity (outliers ≥ cluster
+    /// mean), and a better-than-chance signal (`auc > 0.5`). It deliberately does
+    /// NOT cap the AUC: a future multi-method CHAODA ensemble that lifts the
+    /// signal past the 0.85 probe bar must keep `cargo test -p ndarray` green,
+    /// never fail it. The measured AUC (≈ 0.62 today) is surfaced as an
+    /// `eprintln!` diagnostic, not enforced. When the ensemble lands and raises
+    /// it, refresh the `PROBE-CHAODA-1000G` FINDING in lance-graph — but that is
+    /// a documentation step, not a gate enforced from this library's test suite.
     #[test]
     fn test_chaoda_flags_novel_outliers_in_genetics_like_mixture() {
         let (data, outliers) = make_genetics_like_mixture();
@@ -2663,23 +2663,19 @@ mod tests {
             assert_eq!(a.score.to_bits(), b.score.to_bits(), "non-deterministic score");
         }

-        // Robust, forward-compatible invariants (see the doc comment for the
-        // measured AUC ≈ 0.62 finding; we deliberately do NOT assert the ceiling).
+        // Robust, forward-compatible invariants. These are LOWER bounds only:
+        // they stay green whether the signal is the current weak leaf-LFD
+        // (AUC ~ 0.62) or a future multi-method ensemble that lifts it past the
+        // 0.85 probe bar. The measured AUC is surfaced via the eprintln above as
+        // a diagnostic; we deliberately do NOT cap it (a quality improvement must
+        // never fail `cargo test -p ndarray`).
         assert!(
             mean_out >= mean_clu,
             "polarity wrong: outliers ({mean_out:.4}) below cluster mean ({mean_clu:.4})"
         );
         assert!(
             auc > 0.5,
-            "leaf-LFD anomaly signal is not better than chance (AUC={auc:.4})"
-        );
-        // Documents the gap to the PROBE-CHAODA-1000G bar without making the
-        // test brittle to a future multi-method CHAODA port that raises the AUC.
-        assert!(
-            auc < 0.85,
-            "single-method leaf-LFD unexpectedly met the >= 0.85 probe bar \
-            (AUC={auc:.4}); if a multi-method CHAODA ensemble was added, update \
-            this assertion AND the PROBE-CHAODA-1000G FINDING in lance-graph"
+            "anomaly signal is not better than chance (AUC={auc:.4})"
         );
     }
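The bit-exact determinism invariant kept by this change compares scores through to_bits rather than ==. A minimal sketch of why that is the stricter check, using hypothetical helpers (bit_identical and run_twice_is_bit_identical are not names from clam.rs):

```rust
// Hypothetical helpers for illustration; not taken from clam.rs.

/// True when both slices hold exactly the same IEEE-754 bit patterns.
/// Unlike `==`, this distinguishes 0.0 from -0.0 and treats a NaN as equal
/// to another NaN that has the same bit pattern.
fn bit_identical(a: &[f64], b: &[f64]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}

/// Runs a scoring closure twice and reports whether the two runs agree
/// bit for bit, which is the determinism property the test asserts.
fn run_twice_is_bit_identical<F: Fn() -> Vec<f64>>(score: F) -> bool {
    bit_identical(&score(), &score())
}
```

With floating-point equality, 0.0 == -0.0 holds and NaN == NaN does not, so a plain == comparison can both hide a sign difference and reject two identical NaN results. Comparing bit patterns avoids both cases.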
