Releases: msk-access/krewlyzer
0.8.3
0.8.2
Full Changelog: 0.8.1...0.8.2
0.8.0
What's Changed
[0.8.0] - 2026-03-17
Added
- On-target PON Z-scores: Panel mode now computes on-target/off-target PON baselines for MDS, OCF, and FSD features, providing clinically-scoped z-scores in panel assays.
- FSR On-target Output: FSR now emits a separate .FSR.ontarget.tsv file in panel mode.
- FSR Real Genomic Coordinates: Region labels now reflect true genomic window coordinates instead of internal indices.
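As a rough illustration of what a PON z-score expresses, the sketch below scores one sample value against a panel-of-normals baseline, keeping on-target and off-target baselines separate. The function name `pon_zscore`, the `baselines` dictionary, and the numbers are hypothetical; the release notes do not describe how krewlyzer computes or stores its baselines, so this shows only the generic z-score formula.

```python
from statistics import mean, stdev


def pon_zscore(sample_value: float, pon_values: list[float]) -> float:
    """Z-score of one sample feature against a panel-of-normals baseline.

    Hypothetical helper: generic (x - mean) / std, not krewlyzer's own code.
    """
    if len(pon_values) < 2:
        raise ValueError("need at least two PON samples to estimate spread")
    mu = mean(pon_values)
    sigma = stdev(pon_values)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        raise ValueError("PON values are identical; z-score is undefined")
    return (sample_value - mu) / sigma


# Made-up baselines, kept separate for on-target and off-target regions.
baselines = {
    "ontarget": [8.0, 10.0, 12.0],
    "offtarget": [4.0, 5.0, 6.0],
}

z_on = pon_zscore(12.0, baselines["ontarget"])
z_off = pon_zscore(5.0, baselines["offtarget"])
```

Keeping the two baselines apart means an on-target value is only ever compared with on-target normals, which is the scoping the entry above describes.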
Fixed
- Output Format / Compress: All output files now correctly respect --output-format and --compress flags (gzip compression, path handling, GC correction factors loading).
- Rust GC Correction: Added missing PathBuf import; fixed path handling for correction factor files.
- Rust Clippy: Replaced manual string strip with strip_suffix (manual_strip lint).
Documentation
- Corrected stale FSR column names in concepts.md and json-output.md.
- Updated FSR on-target output docs and window/panel-mode descriptions.
- Documented _core.pyi stub maintenance requirements.
CI
- GitHub Actions Node.js 24: Bumped actions/checkout v4→v5, actions/setup-python v5→v6, actions/cache v4→v5 to address Node.js 20 deprecation (enforced June 2, 2026).
Full Changelog: 0.7.0...0.8.0
0.7.0
What's New in 0.7.0
✨ Features
- Configurable Output Formats: --output-format tsv|parquet|both (default: tsv) controls all tabular feature outputs. --compress gzip-compresses TSV outputs (.tsv.gz). WPS outputs remain Parquet regardless of setting.
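The rules in this entry can be sketched as a small path-planning function. This is only an illustration of the documented behaviour (tsv, parquet, or both; gzip applies to TSV; WPS is always Parquet). The helper `planned_outputs` and the `<prefix>.<feature>.<ext>` naming pattern are assumptions made for the example, not krewlyzer's API.

```python
from pathlib import Path


def planned_outputs(
    prefix: str,
    feature: str,
    output_format: str = "tsv",
    compress: bool = False,
) -> list[Path]:
    """List the tabular files one feature would write (hypothetical helper)."""
    if output_format not in {"tsv", "parquet", "both"}:
        raise ValueError(f"unsupported output format: {output_format!r}")

    # Documented exception: WPS stays Parquet whatever the flags say.
    if feature == "WPS":
        return [Path(f"{prefix}.WPS.parquet")]

    paths = []
    if output_format in {"tsv", "both"}:
        # --compress only affects TSV outputs, which become .tsv.gz.
        suffix = ".tsv.gz" if compress else ".tsv"
        paths.append(Path(f"{prefix}.{feature}{suffix}"))
    if output_format in {"parquet", "both"}:
        paths.append(Path(f"{prefix}.{feature}.parquet"))
    return paths
```

For example, `planned_outputs("sample", "FSR", "both", compress=True)` yields `sample.FSR.tsv.gz` and `sample.FSR.parquet` under these assumptions.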
🐛 Bug Fixes
- build-pon: Explicitly force output_format="tsv" in all process_sample() calls to prevent silent failures when default output format changes
- Feature Serializer: mds_z field now correctly included in JSON output for the from_outputs() code path
- OCF file naming: OCF.tsv = all reads (on + off combined); OCF.offtarget.tsv = true panel off-target score. Clarified in docs and code comments
- Rust wps.rs: Removed erroneous * dereference on node.metadata (E0614)
- Rust gc_correction.rs: Prefixed unused valid_regions_path param with _ to silence compiler warning while retaining API symmetry
- MkDocs Snippets: Fixed broken --8<-- "CHANGELOG.md" includes by updating pymdownx.snippets.base_path to ['.', 'docs']
🔧 Code Quality
- Lint CI Job: New parallel lint job (ruff · black · mypy · cargo clippy -- -D warnings) runs on all push/PR events
- 11 Clippy warnings resolved: repeat_vec_with_capacity, ptr_arg (×2), needless_range_loop (×4), collapsible_match, lines_filter_map_ok (×2), field_reassign_with_default
- Mypy clean: 27 type errors resolved across 8 files; Rust extension stub (_core.pyi) added
- Pinned linter versions: black==26.1.0, ruff==0.15.4, mypy==1.19.1 for reproducible CI
📚 Documentation
- Output Format Options: New section in reference/output-files.md documenting --output-format, --compress, WPS always-Parquet exception, and --generate-json
- OCF Variant Clarification: 3-variant table explaining OCF.tsv vs .ontarget.tsv vs .offtarget.tsv
- Post-0.6.0 Audit (12 files): Fixed metadata.json → metadata.tsv references, updated test counts, added Nextflow --output_format / --compress params
- Docker tag: docs/index.md updated to explicit :0.7.0 per release policy
🏗️ Full Changelog
0.6.0
[0.6.0] - 2026-02-28
Added
- mFSD Base Quality Filtering: --min-baseq / -Q (default 20) gates variant evidence by base quality
- mFSD GC Correction: Rust-native LOESS GC bias correction for variant fragment size distributions
- mFSD Duplex Weighting: Proper consensus fragment handling via --duplex
- Region MDS --sample-name: Consistent output naming without post-hoc rename
- Feature Serializer: Auto-load fsc_counts, region_mds, uxm in from_outputs()
- IRIS Batch Submission: scripts/run_krewlyzer_iris.sh for SLURM/IRIS cluster runs with --generate_json
- nf-core Institutional Configs: custom_config_base param and IRIS profile support
- Versioned Documentation: Implemented mike for dev/stable doc versions
- Nextflow mfsd Module: Full standalone params (--reference, --correction-factors, --mapq, --minlen, --maxlen, --min-baseq, --duplex, --no-skip-duplicates)
- Nextflow runall: fsc_counts.tsv output declaration, --min-baseq wired
- mFSD Integration Tests: 161 lines of new test coverage
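To make the base-quality gate concrete, here is a minimal sketch of filtering per-read observations at a variant position with a threshold of 20. It assumes the standard Phred+33 quality encoding used in SAM/BAM text and an inclusive comparison (quality >= threshold). The function names and sample data are hypothetical, and whether krewlyzer compares inclusively is not stated in these notes.

```python
def phred_from_ascii(qual_char: str, offset: int = 33) -> int:
    """Decode one Phred+33 quality character into an integer score."""
    return ord(qual_char) - offset


def keep_observation(qual_char: str, min_baseq: int = 20) -> bool:
    """Return True when the base quality meets the threshold (assumed inclusive)."""
    return phred_from_ascii(qual_char) >= min_baseq


# Made-up (allele, quality character) pairs observed at one variant position.
observations = [("A", "I"), ("T", "#"), ("T", "5")]

# 'I' decodes to 40, '#' to 2, and '5' to 20, so the middle read is dropped.
kept = [allele for allele, qual in observations if keep_observation(qual)]
```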
Fixed
- mFSD MAF Parsing: Header-based column lookup (fixes column-index mismatch with different MAF flavors)
- Nextflow FILTER_MAF: Complete overhaul: eliminated join operator blocking, replaced regex with substring match, fixed SyntaxError in versions.yml, dynamic maxForks for SLURM
- Nextflow Workflow Streaming: Fixed RUNALL blocking from remainder:true, failOnMismatch, channel round-robin; used multiMap for proper routing
- Nextflow RUNALL Outputs: Added 14 output declarations, fixed BreakPointMotif casing, explicit publishDir
- Region MDS Nextflow: --sample-name replaces mv workaround
- Nextflow Config: Executor queueSize placement, -qs CLI flag, global publishDir removal
- WPS CLI Tests: Fixed --input flag (was positional arg); recovered 2 skipped tests
- Pandas FutureWarning: Fixed pd.concat with all-NA columns in PON test fixture
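The MAF parsing fix above swaps positional indexing for header-based lookup. A minimal sketch of that idea, using only the standard library, is shown below. The function `read_maf_variants` and the inline sample data are hypothetical; the column names follow the common MAF convention, but which columns krewlyzer actually reads is an assumption.

```python
import csv
import io


def read_maf_variants(handle):
    """Yield (chrom, start, ref, alt) by column name rather than by position."""
    # MAF files may carry '#' comment lines ahead of the header row.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    for row in reader:
        yield (
            row["Chromosome"],
            int(row["Start_Position"]),
            row["Reference_Allele"],
            row["Tumor_Seq_Allele2"],
        )


# Two made-up MAF flavors: same variant, different column order and extras.
maf_a = (
    "#version 2.4\n"
    "Hugo_Symbol\tChromosome\tStart_Position\tReference_Allele\tTumor_Seq_Allele2\n"
    "TP53\t17\t7577120\tC\tT\n"
)
maf_b = (
    "Chromosome\tStart_Position\tExtra\tTumor_Seq_Allele2\tReference_Allele\tHugo_Symbol\n"
    "17\t7577120\tx\tT\tC\tTP53\n"
)

variants_a = list(read_maf_variants(io.StringIO(maf_a)))
variants_b = list(read_maf_variants(io.StringIO(maf_b)))
```

Because each field is looked up by name, both flavors parse to the same tuple even though their columns sit at different indices.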
Changed
- Code Quality: Black formatted 71 files, ruff fixed 129 lint errors, cargo clippy applied
- Ruff Config: Added [tool.ruff] to pyproject.toml with documented E402/F821 ignores
- Agent Config: .agent/ → .agent/rules/ with alwaysApply frontmatter
Documentation
- 45-item Audit: Corrections across 25 doc files including .csv → .tsv (7 files), .WPS.tsv.gz → .parquet (3 files), phantom --output-format removed, Docker versions → X.Y.Z, parameters.md 12→28, outputs.md 14→41, JSON schema corrected, developer guide Rust table 10→19, architecture pipeline signature updated
- PDF Embedding: Fixed rendering with mkdocs-pdf plugin
Full Changelog: 0.5.2...0.6.0
Documentation Fix
Full Changelog: 0.5.2...0.5.3
Update to have a separate Duplex BAM Input
What's Changed
- Add .agent/DEVELOPMENT.md with development guide and QC checklist by @rhshah in #14
- Feature/dual bam support by @rhshah in #15
Full Changelog: 0.5.1...0.5.2
Docker image with data
Full Changelog: 0.5.0...0.5.1
Version with Panel features
What's Changed
- Feature/pon framework by @rhshah in #8
- Feature/fsc ml features by @rhshah in #9
- Add dual GC correction and on-target baselines for panel mode by @rhshah in #10
- Feature/output format standardization by @rhshah in #11
- Add Region MDS feature for per-gene motif diversity by @rhshah in #12
Full Changelog: 0.3.2...0.5.0