
Performance: Eliminate per-sequence char[] allocations in seven QC modules #199

Open
ewels wants to merge 1 commit into s-andrews:master from ewels:perf/tochararray-churn

Conversation

Contributor

@ewels ewels commented May 21, 2026

String.toCharArray() allocates a fresh char[] the same length as the string: at 2 bytes per char, that is roughly 300 bytes per call for a ~150-base read. Seven modules each did this once per sequence, so a single 50 M-read file produced ~100 GB of short-lived char arrays (7 × 50 M × ~300 bytes ≈ 105 GB). The GC kept up, but at the cost of heap headroom and GC time.

This branch replaces every such loop with a String reference, its length, and charAt(i), which on JDK 17 intrinsifies to the same machine code as char[] indexing, minus the allocation and copy. Where the inner loop reads the same base more than once, the new code caches it in a local char b so that charAt, and its bounds check, runs once per iteration. PerSequenceGCContent.truncateSequence now returns a String instead of a char[] for the same reason.
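To make the before/after pattern concrete, here is a minimal sketch of the two loop shapes. It is not the actual FastQC code: the class name `CharAtSketch`, both method names, and the GC-counting body are hypothetical and only illustrate the transformation described above.

```java
// Hypothetical sketch; class and method names are illustrative, not FastQC's.
public class CharAtSketch {

    // Old pattern: copies the whole sequence into a fresh char[] on every call.
    static int countGcWithCopy(String sequence) {
        char[] bases = sequence.toCharArray(); // allocates 2 bytes per base
        int gc = 0;
        for (int i = 0; i < bases.length; i++) {
            if (bases[i] == 'G' || bases[i] == 'C') {
                gc++;
            }
        }
        return gc;
    }

    // New pattern: index the String directly and cache each base in a local,
    // so charAt is called once per iteration even though b is read twice.
    static int countGcWithCharAt(String sequence) {
        int length = sequence.length();
        int gc = 0;
        for (int i = 0; i < length; i++) {
            char b = sequence.charAt(i);
            if (b == 'G' || b == 'C') {
                gc++;
            }
        }
        return gc;
    }

    public static void main(String[] args) {
        String read = "GATTACAGCN";
        System.out.println(countGcWithCopy(read));   // prints 4
        System.out.println(countGcWithCharAt(read)); // prints 4
    }
}
```

Both methods return the same count; the second one simply never creates the intermediate array.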

Affects: BasicStats, NContent, PerBaseQualityScores, PerBaseSequenceContent, PerSequenceGCContent, PerSequenceQualityScores, PerTileQualityScores.
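The truncateSequence change mentioned above can be sketched the same way. This is an illustration only: the class `TruncateSketch`, the `binSize` parameter, and the round-down-to-a-multiple rule are assumptions made for the example and may not match the real method's logic. The point is that returning a String lets the no-truncation path hand back the original reference instead of a copy.

```java
// Hypothetical sketch; the truncation rule and names are assumptions,
// not the actual PerSequenceGCContent implementation.
public class TruncateSketch {

    // Rounds sequences longer than binSize down to a multiple of binSize.
    // Returning a String means callers can use charAt() directly and no
    // char[] has to be created on either path.
    static String truncate(String sequence, int binSize) {
        int length = sequence.length();
        if (length > binSize) {
            int truncated = (length / binSize) * binSize;
            return sequence.substring(0, truncated);
        }
        return sequence; // no-truncation path: same reference, nothing copied
    }

    public static void main(String[] args) {
        System.out.println(truncate("ABCDEFG", 3)); // prints ABCDEF
        System.out.println(truncate("AB", 3));      // prints AB
    }
}
```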

The benchmark report shows a small speed increase and a drop in peak RSS memory usage for single files. When running with multiple files, however, memory usage drops significantly (~30%). All fastqc_data.txt, summary.txt, and fastqc_report.html files remain byte-identical to master. Full report: report.html

[Screenshot: benchmark report]

Each per-sequence module called String.toCharArray() once per read,
allocating a fresh char[] each time. Switching to a String reference
plus charAt() removes that allocation without changing the algorithm.
PerSequenceGCContent#truncateSequence now returns the truncated String
directly so the per-read char[] in the no-truncation path also goes
away.

Affects: BasicStats, NContent, PerBaseQualityScores,
PerBaseSequenceContent, PerSequenceGCContent, PerSequenceQualityScores,
PerTileQualityScores.

Co-Authored-By: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Contributor Author

ewels commented May 21, 2026

I expect this to be the last performance-related PR for a while. @pditommaso did push a whole load of other changes as well, but none of them seems to have a significant impact on run time (even though they look sensible), so I'm not sure they're worth the code churn.

I thought this one was worth pushing forward mostly because of the memory savings when running with 2 FastQ files, which is a pretty common setup.
