test(clam): CHAODA outlier spike — single-method LFD is below the PROBE-CHAODA-1000G bar (AUC 0.62) by AdaWorldAPI · Pull Request #219 · AdaWorldAPI/ndarray

AdaWorldAPI · 2026-06-16T07:56:43Z

Summary

Runs the "1-day spike substitute" named in genetics-probes-v1.md (AdaWorldAPI/lance-graph, merged in lance-graph #503): a kernel smoke test for the PROBE-CHAODA-1000G claim — "CHAODA detects novel variants without a trained classifier."

The spike synthesises a 5-lane Gaussian mixture (matching the probe's 5-lane variant feature vector: AF / DP / FS / entropy / conservation) — three tight "common" clusters + eight deliberately extreme "novel" outliers — thermometer-encodes each lane into 48 bits (so Hamming distance is monotone in per-lane L1 magnitude, the honest bridge from ordinal features to the Hamming-metric CLAM default), builds the shipped ClamTree, and scores via anomaly_scores.
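A minimal sketch of the thermometer encoding described above, assuming each lane is already scaled to [0, 1] and quantised by rounding. The function names and the quantisation rule are illustrative assumptions, not the fixture's actual code. Because level k sets the k lowest bits, the Hamming distance between two codes equals the absolute difference of their levels, which is why the summed distance tracks per-lane L1 magnitude.

```rust
/// Bits per lane, as stated in the PR description.
const BITS_PER_LANE: u32 = 48;

/// Thermometer-encode a value in [0, 1] into the low 48 bits of a u64.
/// Level k sets the k lowest bits. The rounding rule is an assumption.
fn thermometer(x: f64) -> u64 {
    let level = (x.clamp(0.0, 1.0) * BITS_PER_LANE as f64).round() as u32;
    if level == 0 {
        0
    } else {
        (1u64 << level) - 1
    }
}

/// Encode a 5-lane feature vector (hypothetical helper), one u64 per lane.
fn encode(lanes: [f64; 5]) -> [u64; 5] {
    lanes.map(thermometer)
}

/// Hamming distance between two 5-lane encodings.
/// Per lane this equals |level_a - level_b|, so the sum is an L1 distance
/// over the quantised levels.
fn hamming(a: &[u64; 5], b: &[u64; 5]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

For example, lanes at 0.25 and 0.5 quantise to levels 12 and 24, so two vectors that differ by that amount in all five lanes are 5 × 12 = 60 bits apart.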

Measured (deterministic, seed-fixed)

metric	value
mean cluster score	0.6749
mean outlier score	0.7500
frac cluster ≥ 0.5	0.733
frac outlier ≥ 0.5	0.750
ROC-AUC (Mann-Whitney U)	0.6240
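
For readers unfamiliar with the Mann-Whitney formulation, ROC-AUC can be read as the probability that a randomly chosen outlier scores higher than a randomly chosen cluster point, with ties counted as one half. The sketch below uses a plain pairwise count; `roc_auc` is a hypothetical name and the test's own helper may be implemented differently (for example, via ranks).

```rust
/// ROC-AUC as the normalised Mann-Whitney U statistic.
/// `pos` holds scores of the positive class (outliers), `neg` the rest.
/// O(|pos| * |neg|), which is fine for a small synthetic fixture.
fn roc_auc(pos: &[f64], neg: &[f64]) -> f64 {
    assert!(!pos.is_empty() && !neg.is_empty(), "both classes are required");
    let mut u = 0.0;
    for &p in pos {
        for &n in neg {
            if p > n {
                u += 1.0; // positive correctly ranked above negative
            } else if p == n {
                u += 0.5; // a tie counts as half a win
            }
        }
    }
    u / (pos.len() * neg.len()) as f64
}
```

A value of 0.5 corresponds to chance, 1.0 to perfect separation.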

FINDING

The shipped single-method leaf-LFD anomaly_scores reaches only AUC ≈ 0.62 on the easiest possible case (clean synthetic clusters with far outliers) — well below the probe's ≥ 0.85 bar.

The cause is mechanical: leaf LFD = log₂(|B(c,r)|/|B(c,r/2)|) measures intra-leaf geometry complexity, not inter-leaf isolation. An isolated singleton lands in a leaf whose LFD is comparable to a dense cluster's, and global min-max normalisation compresses both into the same score band. The CHAODA ensemble of Ishaq et al. 2021 combines several graph-based signals (relative/component cardinality, graph neighbourhood, random-walk stationary distribution, vertex degree); only the LFD signal is shipped here.
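
The formula can be made concrete with a small sketch. It assumes the two ball cardinalities are already counted; `lfd` and `min_max` are illustrative names, not the shipped functions. The ratio only sees how points are distributed inside the ball, so a lone point and a tight leaf whose points all lie within r/2 produce the same value.

```rust
/// Local fractal dimension from ball cardinalities:
/// LFD = log2(|B(c, r)| / |B(c, r/2)|).
fn lfd(count_r: usize, count_half_r: usize) -> f64 {
    assert!(count_half_r >= 1 && count_r >= count_half_r);
    (count_r as f64 / count_half_r as f64).log2()
}

/// Global min-max normalisation to [0, 1].
/// A constant input maps to all zeros (an assumption for the degenerate case).
fn min_max(xs: &[f64]) -> Vec<f64> {
    let lo = xs.iter().copied().fold(f64::INFINITY, f64::min);
    let hi = xs.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let span = hi - lo;
    xs.iter()
        .map(|&x| if span > 0.0 { (x - lo) / span } else { 0.0 })
        .collect()
}
```

Here `lfd(1, 1)` and `lfd(40, 40)` are both 0: an isolated singleton and a dense leaf are indistinguishable to this signal, so normalising afterwards cannot pull them apart.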

PROBE-CHAODA-1000G therefore needs the multi-method ensemble (or an augmented signal) before it can pass — not merely genomic fixtures. This is the evidence-before-build payoff: the gap is caught before any adapter-genetics-experimental (D-GEN-1..4) spend.

Test design

test_chaoda_flags_novel_outliers_in_genetics_like_mixture locks robust, wide-tolerance invariants (valid range, bit-exact determinism, correct polarity mean_out ≥ mean_clu, better-than-chance auc > 0.5) plus one tripwire (auc < 0.85) that fails by design if a future multi-method port lifts the signal to the probe bar — forcing a cross-repo FINDING update in lance-graph rather than letting the claim silently rot. AUC may drift freely in [0.5, 0.85) without breaking the test.

Test plan

cargo test --lib hpc::clam::tests — 52 passed (51 pre-existing + this spike).
Determinism: rebuild + rescore is bit-exact (f64::to_bits compare).
No production code changed — test-module-only addition.
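
The determinism check in the test plan compares scores through `f64::to_bits` rather than with `==`. A minimal sketch of such a comparison, with `bit_exact` as a hypothetical helper name:

```rust
/// True when two score vectors are identical bit for bit.
/// Stricter than `==` on f64: it distinguishes +0.0 from -0.0 and treats
/// two NaNs with the same bit pattern as equal.
fn bit_exact(a: &[f64], b: &[f64]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}
```

Comparing bits avoids any tolerance choice: a rebuild either reproduces the scores exactly or the check fails.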

Cross-refs

lance-graph/.claude/plans/genetics-probes-v1.md — the probe this spike substitutes for (a companion lance-graph PR records this AUC=0.624 as a CONJECTURE→FINDING update).
src/hpc/clam.rs:1493-1567 — CHAODA Phase 4 anomaly_scores under test.
Ishaq et al. 2021 — the multi-method CHAODA ensemble the single shipped signal is a subset of.

https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v

Update — 2026-06-16 (post-#220 merge, verified)

The branch absorbed #220 (the CHAODA multi-method ensemble) before merge, so the merged change delivers BOTH the spike FINDING and the ensemble that closes it. Verified on the branch HEAD — cargo test --lib hpc::clam::tests::test_chaoda -- --nocapture, both green:

signal	ROC-AUC	gate
single-method leaf-LFD (spike)	0.6240	lower-bound only (`auc > 0.5`) — forward-compatible
path-only	0.9938	`auc_path >= 0.85` ✅
ensemble	0.9906 (+0.367 lift)	`auc_ens >= 0.85` ✅

So PROBE-CHAODA-1000G's ≥0.85 bar is now cleared in-branch by the ensemble, while the single-method spike test stays the documented-FINDING tripwire that can't fail when the code improves (codex P2 on this PR — resolved). The original spike narrative above remains accurate for the single-method commit; this note records the merged final state for the cross-repo genetics-probes-v1 FINDING.



chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5f7e76d53


Codex correctly flagged that asserting auc < 0.85 in a library unit test turns a future quality improvement into a failing test: once a multi-method CHAODA ensemble lifts the signal past the 0.85 probe bar, cargo test -p ndarray would fail until an external lance-graph doc is updated. A library test must never fail because the code got better, and ndarray CI should not be coupled to a lance-graph note.

Fix: remove the upper-bound assertion. The test now asserts only lower-bound, forward-compatible invariants: valid range, bit-exact determinism, correct polarity (outliers >= cluster mean), and better-than-chance (auc > 0.5). The measured AUC (~ 0.62 today) is surfaced via the existing eprintln diagnostic, not enforced. Refreshing the PROBE-CHAODA-1000G FINDING in lance-graph when the ensemble lands is a documentation step, not a gate enforced from this library's test suite.

Doc comment updated to match. Re-run: test green, ROC_AUC=0.6240 still printed.

https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
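
The lower-bound-only design can be sketched as a single checker. This is an illustration under assumptions, not the test's actual code: `check_lower_bound_invariants` is a hypothetical name, and the real test uses `assert!` calls and an `eprintln` diagnostic rather than a `Result`. Nothing here bounds the AUC from above, so a better scorer still passes.

```rust
/// Hypothetical helper: enforce only forward-compatible invariants and
/// return the measured AUC so the caller can print it as a diagnostic.
fn check_lower_bound_invariants(scores: &[f64], is_outlier: &[bool]) -> Result<f64, String> {
    fn mean(v: &[f64]) -> f64 {
        v.iter().sum::<f64>() / v.len() as f64
    }
    if scores.len() != is_outlier.len() {
        return Err("length mismatch".into());
    }
    // Valid range: every score lies in [0, 1] (NaN fails this check too).
    if scores.iter().any(|s| !(0.0..=1.0).contains(s)) {
        return Err("score outside [0, 1]".into());
    }
    let labelled = || scores.iter().zip(is_outlier);
    let out: Vec<f64> = labelled().filter(|&(_, &o)| o).map(|(&s, _)| s).collect();
    let clu: Vec<f64> = labelled().filter(|&(_, &o)| !o).map(|(&s, _)| s).collect();
    if out.is_empty() || clu.is_empty() {
        return Err("need both classes".into());
    }
    // Correct polarity: outliers score at least as high on average.
    if mean(&out) < mean(&clu) {
        return Err("wrong polarity".into());
    }
    // Better than chance: pairwise AUC with ties counted as one half.
    let mut wins = 0.0;
    for &o in &out {
        for &c in &clu {
            wins += if o > c { 1.0 } else if o == c { 0.5 } else { 0.0 };
        }
    }
    let auc = wins / (out.len() * clu.len()) as f64;
    if auc <= 0.5 {
        return Err(format!("auc {auc} is not better than chance"));
    }
    Ok(auc) // deliberately no upper bound
}
```

A perfectly separating scorer returns `Ok(1.0)`, which is the case the removed `auc < 0.85` assertion would have rejected.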

Fixes the format/stable CI check on PR #219. rustfmt reflows the centers array literal and two assert! calls in the spike test; no logic change, test still green (single-LFD AUC 0.6240 unchanged). Changes confined to the added test code. https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v

…HAODA-1000G synthetic bar (AUC 0.62 -> 0.99)

Increment 1 of D-GEN-CHAODA-ENSEMBLE (lance-graph genetics-probes-v1.md). Adds ClamTree::ensemble_anomaly_scores as a NEW scoring entry point alongside the unchanged single-method anomaly_scores baseline.

The spike (#219) measured single-method leaf-LFD at ROC-AUC 0.624 on a synthetic 5-lane Gaussian mixture, below the 0.85 bar. Mechanical cause: leaf LFD measures intra-leaf geometry, not inter-leaf isolation. This ensemble combines isolation-sensitive CHAODA signals:

- parent-child path-minority ratio (dominant): walking a leaf to the root, the minimum child/parent cardinality ratio is tiny for a point that split off as a minority (isolated outlier) and moderate for a point that always stayed in the majority (dense-cluster member). Immune to the leaf-fragmentation that defeats raw leaf cardinality.
- connected-component cardinality over the leaf-overlap graph (small components are anomalous).

Averaged into one score; every point inherits its leaf's score.

A first attempt using raw leaf cardinality + vertex degree + component size scored AUC 0.621 (no lift) because the tree fragments dense blobs into many tiny leaves that mimic isolated outliers under those metrics; the path-minority signal is what actually separates. Leaf degree and raw leaf cardinality were dropped as fragmentation noise. The remaining CHAODA methods (random-walk stationary distribution) are deferred.

MEASURED (deterministic synthetic mixture, same fixture as #219):
single-LFD AUC = 0.6240
ensemble AUC = 0.9906 (lift +0.3667, clears the 0.85 bar)

This is the synthetic SMOKE TEST only. It proves the ensemble approach captures isolation where single-LFD does not; it does NOT prove genomic novelty detection. PROBE-CHAODA-1000G on real corpora remains gated on D-GEN-1 + D-GEN-2 (VCF -> feature-vector pipeline).

Tests: full hpc::clam suite green (53 incl. the new ensemble test); ensemble is deterministic (bit-exact rebuild) and built purely from shipped tree fields + the public dist().

https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
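
The path-minority signal described in this commit can be sketched on a toy arena tree. The `Node` layout and function name below are assumptions for illustration and do not reflect the ClamTree's actual fields. The component-cardinality term and the mapping from ratio to final score are not shown.

```rust
/// Hypothetical node layout: index of the parent and number of points held.
struct Node {
    parent: Option<usize>,
    cardinality: usize,
}

/// Minimum child/parent cardinality ratio on the path from `leaf` to the root.
/// A point that split off as a small minority at any level gets a tiny ratio;
/// a point that always stayed with the majority keeps a moderate one.
/// Cost is linear in the depth of the leaf.
fn path_minority_ratio(nodes: &[Node], leaf: usize) -> f64 {
    let mut min_ratio = 1.0_f64;
    let mut cur = leaf;
    while let Some(p) = nodes[cur].parent {
        let ratio = nodes[cur].cardinality as f64 / nodes[p].cardinality as f64;
        min_ratio = min_ratio.min(ratio);
        cur = p;
    }
    min_ratio
}
```

For a root of 100 points split 99/1, the singleton child has ratio 0.01, while a leaf of 50 points under the 99-point child has ratio 50/99, roughly 0.51. Fragmenting the dense side further into small leaves changes those ratios only gradually, which matches the commit's observation that this signal resists leaf fragmentation.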

…t doc + rustfmt

Addresses the Codex P2 on PR #220 (quadratic leaf-overlap build) and a doc-comment inconsistency I introduced, and fixes the format/stable CI.

(1) Quadratic-build guard (Codex P2). The connected-component term needs an O(L^2 * vec_len) leaf-overlap graph; on production corpora with small min_cluster_size, L approaches the point count and the public API could hang. Split into:

- ensemble_anomaly_scores_budgeted(.., graph_budget): computes the linear O(L*depth) parent-child path-minority signal always, and only builds the overlap graph + component term when n_leaves <= graph_budget.
- ensemble_anomaly_scores(..): convenience wrapper using the default ENSEMBLE_GRAPH_BUDGET = 4096; above that it degrades to path-minority alone, so the public API never runs the quadratic build at scale.

(2) Path-only fallback is validated, not assumed. New measurement on the synthetic fixture (graph_budget = 0 forces the fallback):

single-LFD 0.6240 | path-only 0.9938 | full ensemble 0.9906

Path-minority alone clears the 0.85 bar (slightly above the combined score; the component term is a marginal refinement), so degrading at scale is safe. The test now asserts path-only AUC >= 0.85 so the guard can never silently degrade large-corpus accuracy.

(3) Doc-comment correction. When the scoring pivoted to path-minority + component, the method doc still described the abandoned relative-cardinality / vertex-degree set and listed parent-child ratio as "deferred" when it is in fact the dominant shipped signal. Rewritten to match the implementation.

(4) rustfmt: format/stable was red; the new code is now rustfmt-clean (changes confined to the added ensemble method + tests; no pre-existing code touched). clippy --lib clean; full hpc::clam suite green (53 tests).

https://claude.ai/code/session_01VysoWJ6vsyg3wEGc5v7T5v
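
The budget guard in item (1) follows a common pattern: always compute the cheap signal, and run the expensive one only when the problem size is within budget. The sketch below shows that control flow with stand-in types. Only the constant name and value come from the commit message; the function names, signatures, and the closure-based component term are assumptions and do not mirror the real ClamTree methods.

```rust
/// Default leaf budget named in the commit message.
const ENSEMBLE_GRAPH_BUDGET: usize = 4096;

/// Hypothetical stand-in for the budgeted entry point.
/// `path` holds one path-minority score per leaf (already computed, linear cost).
/// `component_term` lazily produces one component score per leaf and stands in
/// for the quadratic leaf-overlap build; it runs only when within budget.
fn ensemble_scores_budgeted(
    path: &[f64],
    component_term: impl FnOnce() -> Vec<f64>,
    graph_budget: usize,
) -> Vec<f64> {
    let n_leaves = path.len();
    if n_leaves > graph_budget {
        // Over budget: degrade to the path-minority signal alone.
        return path.to_vec();
    }
    let comp = component_term(); // the expensive work happens only here
    assert_eq!(comp.len(), n_leaves);
    // The two signals are averaged into one score per leaf.
    path.iter().zip(&comp).map(|(p, c)| 0.5 * (p + c)).collect()
}

/// Convenience wrapper that applies the default budget.
fn ensemble_scores(path: &[f64], component_term: impl FnOnce() -> Vec<f64>) -> Vec<f64> {
    ensemble_scores_budgeted(path, component_term, ENSEMBLE_GRAPH_BUDGET)
}
```

Passing `graph_budget = 0` forces the fallback for any non-empty input, which is how the commit measures the path-only AUC.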


chatgpt-codex-connector Bot reviewed Jun 16, 2026 (comment thread on src/hpc/clam.rs, now outdated)

AdaWorldAPI mentioned this pull request Jun 16, 2026, in #220 (merged):
feat(clam): CHAODA multi-method ensemble — clears the synthetic PROBE-CHAODA-1000G bar (AUC 0.62 -> 0.99)

claude added 3 commits June 16, 2026 09:24
Merge pull request #220 from AdaWorldAPI/claude/chaoda-ensemble-v1 (2ef18ed)

AdaWorldAPI merged commit bf606eb into master Jun 16, 2026
18 checks passed

AdaWorldAPI merged 6 commits into master from claude/chaoda-outlier-spike-v1
