Test/phase 2 integration by joshfactorial · Pull Request #134 · ncsa/rusty-neat

joshfactorial · 2026-05-20T14:06:24Z

Adding integration tests

Adds a new rneat/tests/ harness that exercises the real binary boundary (CLI → config → model → output) via assert_cmd. Four integration suites with 12 tests total, plus one prod fix that the suite caught on its first run.

New harness:
- tests/common/mod.rs: shared helpers (binary command, fixtures, config builders, decompression). Also provides a GenReadsConfig builder with paired-ended, model, thread, and seed knobs.
- New dev-deps: assert_cmd, predicates.

cli_smoke.rs (4 tests):
- `rneat --help` lists all 6 subcommands.
- Each subcommand's `--help` exits 0 and mentions --configuration-yaml.
- Missing config file → non-zero exit + stderr error message.
- No arguments → non-zero exit + help text on stderr.

pipeline_e2e.rs (2 tests):
- gen-reads with default model produces a structurally well-formed FASTQ (multiple of 4 lines, '@'/'+' markers, seq.len == qual.len).
- gen-seq-error-model with binned_quality_bins → gen-reads → only bin-valued qualities appear in the output FASTQ.

Prod fix caught by the second pipeline test: gen-reads previously loaded `quality_score_model` independently from `sequence_error_model`, so the QualityScoreModel embedded in a trained SequencingErrorModel was silently ignored. When a user set `sequence_error_model:` without a separate `quality_score_model:`, gen-reads quietly fell back to the built-in default, meaning binned-quality training had no effect on output. Fixed in gen_reads/utils/runner.rs by falling through to the SeqErrorModel's embedded QSM when no explicit override is configured. This matches the user-facing docstring in gen_reads_template.yml.

determinism.rs (3 tests):
- Same seed, single-threaded → same record multiset.
- Same seed, multi-threaded → same record multiset.
- Different seeds → different output (seed argument is load-bearing).

Comparisons are on decompressed contents (gzip headers carry mtime).
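The record-multiset comparison used by the determinism tests could be sketched roughly as below. This is an illustration only, not the code in determinism.rs: `record_multiset` is a hypothetical helper name, and the sketch assumes the FASTQ has already been decompressed into a string with exactly four lines per record.

```rust
use std::collections::HashMap;

/// Count each 4-line FASTQ record, ignoring the order in which records appear.
/// Returns None if the line count is not a multiple of four.
/// (Hypothetical helper for illustration; not taken from the rneat test suite.)
fn record_multiset(fastq: &str) -> Option<HashMap<[&str; 4], usize>> {
    let lines: Vec<&str> = fastq.lines().collect();
    if lines.len() % 4 != 0 {
        return None;
    }
    let mut counts = HashMap::new();
    for rec in lines.chunks_exact(4) {
        // Key on the whole record so duplicates are counted, not collapsed.
        *counts.entry([rec[0], rec[1], rec[2], rec[3]]).or_insert(0) += 1;
    }
    Some(counts)
}

fn main() {
    // Same records in a different order: equal as multisets.
    let run_a = "@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nFFFF\n";
    let run_b = "@r2\nTTGA\n+\nFFFF\n@r1\nACGT\n+\nIIII\n";
    // One base differs in the first record: not equal.
    let run_c = "@r1\nACGA\n+\nIIII\n@r2\nTTGA\n+\nFFFF\n";

    assert_eq!(record_multiset(run_a), record_multiset(run_b));
    assert_ne!(record_multiset(run_a), record_multiset(run_c));
}
```

Keying on the full record with a count, rather than using a plain set, means a run that emits the same read twice is still distinguished from one that emits it once.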
Multiset rather than byte-identical because rneat iterates HashMaps during contig assembly, so the line order in the output is non-deterministic even with num_threads=1; the record *set* is stable.

fastq_validation.rs (3 tests):
- Single-ended FASTQ passes strict structural validation (ACGTN-only seq, printable-ASCII qual, seq.len == qual.len) and every read's length matches the configured read_len.
- Paired-end run produces both _r1 and _r2 with equal record counts; R1 names end in /1, R2 names end in /2, and name stems match pairwise.
- Every quality byte decodes to a valid Phred+33 score in [0, 93].

cargo test --workspace: 12 new integration tests + 355 existing, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
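As a rough illustration of the structural checks listed for fastq_validation.rs, the sketch below validates decompressed FASTQ text using only the standard library. It is not the code from this PR: `validate_fastq` is a hypothetical name, and the sketch assumes one line per field and a single fixed read length.

```rust
/// Validate FASTQ text structurally and return the number of records.
/// (Hypothetical helper for illustration; not taken from the rneat test suite.)
fn validate_fastq(text: &str, read_len: usize) -> Result<usize, String> {
    let lines: Vec<&str> = text.lines().collect();
    if lines.len() % 4 != 0 {
        return Err(format!("line count {} is not a multiple of 4", lines.len()));
    }
    for (i, rec) in lines.chunks_exact(4).enumerate() {
        let (name, seq, sep, qual) = (rec[0], rec[1], rec[2], rec[3]);
        if !name.starts_with('@') {
            return Err(format!("record {i}: name line lacks '@' marker"));
        }
        if !sep.starts_with('+') {
            return Err(format!("record {i}: separator line lacks '+' marker"));
        }
        if seq.len() != read_len {
            return Err(format!("record {i}: read length {} != {read_len}", seq.len()));
        }
        if seq.len() != qual.len() {
            return Err(format!("record {i}: seq and qual lengths differ"));
        }
        if !seq.bytes().all(|b| matches!(b, b'A' | b'C' | b'G' | b'T' | b'N')) {
            return Err(format!("record {i}: non-ACGTN base in sequence"));
        }
        // Phred+33: bytes 33 ('!') through 126 ('~') decode to scores 0 through 93.
        if !qual.bytes().all(|b| (33..=126).contains(&b)) {
            return Err(format!("record {i}: quality byte outside Phred+33 range"));
        }
    }
    Ok(lines.len() / 4)
}

fn main() {
    let good = "@r1\nACGT\n+\nIIII\n@r2\nNNGA\n+\n!!~~\n";
    assert_eq!(validate_fastq(good, 4), Ok(2));

    // 'X' is not a valid base.
    assert!(validate_fastq("@r1\nACGX\n+\nIIII\n", 4).is_err());
    // A space (byte 32) decodes to Phred -1, which is out of range.
    assert!(validate_fastq("@r1\nACGT\n+\nII I\n", 4).is_err());
}
```

Returning the record count makes it straightforward to also assert that paired _r1 and _r2 files contain the same number of records.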

joshfactorial changed the base branch from main to develop May 20, 2026 14:06

joshfactorial linked an issue May 20, 2026 that may be closed by this pull request

Fill out testing #104

Closed

joshfactorial merged commit 6af1373 into develop May 20, 2026
1 check passed

joshfactorial merged 1 commit into develop from test/phase-2-integration
