Fix silent inf/NaN from zero-coverage control genes #12

Merged
dlopez-bioinfo merged 1 commit into master from 11-zero-coverage-control-genes-silently-corrupt-output-with-infnan on Feb 20, 2026

Conversation

dlopez-bioinfo (Contributor) commented Feb 20, 2026

Summary

  • Detect samples with zero-coverage control genes and warn to stderr with gene names
  • Guard statistics computation against division by zero using np.where
  • Exclude affected samples from zmean_k so healthy samples in the same batch are not corrupted
  • Explicitly set all derived stats (pi_ij, theta_i, z_ik) to NaN for affected samples
  • Add test validating zero-coverage handling
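
The guard described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `normalize_by_controls`, its arguments, and the simple ratio it computes are assumptions made for the example, and the real definitions of pi_ij, theta_i, z_ik and zmean_k are not reproduced here. It only shows the pattern: detect zero-coverage control genes, warn on stderr with gene names, divide safely with `np.where`, leave affected samples out of the batch mean, and set their results to NaN.

```python
import sys

import numpy as np


def normalize_by_controls(target_cov, control_cov, sample_names, gene_names):
    """Hypothetical helper, not the project's API.

    target_cov:  per-sample coverage of the region of interest, shape (n_samples,)
    control_cov: per-sample coverage of each control gene, shape (n_samples, n_genes)
    """
    target_cov = np.asarray(target_cov, dtype=float)
    control_cov = np.asarray(control_cov, dtype=float)

    zero = control_cov == 0      # True where a control gene has no coverage
    bad = zero.any(axis=1)       # samples with at least one such gene

    # Warn on stderr, naming the offending genes for each affected sample.
    for i in np.flatnonzero(bad):
        genes = ", ".join(g for g, z in zip(gene_names, zero[i]) if z)
        print(
            f"WARNING: sample {sample_names[i]} has zero coverage "
            f"in control gene(s): {genes}",
            file=sys.stderr,
        )

    # Guard the division: swap zero denominators for 1.0 so no inf/NaN is
    # produced silently, then mark every affected sample's row as NaN.
    safe = np.where(zero, 1.0, control_cov)
    ratio = target_cov[:, None] / safe
    ratio[bad] = np.nan

    # Batch-level mean per control gene, computed from healthy samples only,
    # so one bad sample cannot contaminate the others.
    if (~bad).any():
        batch_mean = ratio[~bad].mean(axis=0)
    else:
        batch_mean = np.full(control_cov.shape[1], np.nan)

    scaled = ratio / batch_mean  # rows of affected samples stay NaN
    return ratio, batch_mean, scaled, bad
```

With two samples where the first has no reads in the second control gene, the first sample's rows come back as NaN while the second sample's values and the batch mean remain finite.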

Closes #11

Test plan

  • All 7 existing tests pass (python -m unittest discover -v smaca/)
  • flake8 clean
  • New test_zero_coverage_gene_warning validates affected sample gets NaN and healthy sample gets finite results
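
A test of this kind might look like the sketch below. It is an assumption-laden illustration rather than the actual `test_zero_coverage_gene_warning`: `safe_ratio` is a hypothetical stand-in for the code under test, and the sample names and coverage values are invented for the example. It captures stderr to check the warning, then asserts NaN for the affected sample and a finite value for the healthy one.

```python
import contextlib
import io
import sys
import unittest

import numpy as np


def safe_ratio(target, control, names):
    """Hypothetical stand-in for the code under test (not the project's API)."""
    target = np.asarray(target, dtype=float)
    control = np.asarray(control, dtype=float)
    zero = control == 0
    # One warning per sample whose control coverage is zero.
    for name in np.asarray(names)[zero]:
        print(f"WARNING: zero-coverage control gene in sample {name}",
              file=sys.stderr)
    out = target / np.where(zero, 1.0, control)  # no division by zero
    out[zero] = np.nan                           # affected samples become NaN
    return out


class ZeroCoverageSketch(unittest.TestCase):
    def test_affected_nan_healthy_finite(self):
        stderr = io.StringIO()
        with contextlib.redirect_stderr(stderr):
            result = safe_ratio([10.0, 20.0], [0.0, 10.0], ["bad", "good"])

        # The affected sample is NaN, the healthy one stays finite.
        self.assertTrue(np.isnan(result[0]))
        self.assertTrue(np.isfinite(result[1]))

        # Only the affected sample is named in the warning.
        self.assertIn("bad", stderr.getvalue())
        self.assertNotIn("good", stderr.getvalue())
```

Saved as a module, a test like this can be run with `python -m unittest`, the same runner the test plan uses.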

When a BAM has no reads in a control gene, the division by zero
produced silent inf/NaN that could corrupt the entire batch.

Now: warn to stderr, exclude bad samples from zmean_k, and
explicitly set affected sample results to NaN.

dlopez-bioinfo linked an issue on Feb 20, 2026 that may be closed by this pull request
dlopez-bioinfo merged commit 6716694 into master on Feb 20, 2026
3 checks passed

Development

Successfully merging this pull request may close these issues.

Zero-coverage control genes silently corrupt output with inf/NaN
