Performance: Eliminate per-sequence char[] allocations in seven QC modules#199
Open
ewels wants to merge 1 commit into
Open
Performance: Eliminate per-sequence char[] allocations in seven QC modules#199ewels wants to merge 1 commit into
char[] allocations in seven QC modules#199ewels wants to merge 1 commit into
Conversation
Each per-sequence module called String.toCharArray() once per read, allocating a fresh char[] each time. Switching to a String reference plus charAt() removes that allocation without changing the algorithm. PerSequenceGCContent#truncateSequence now returns the truncated String directly so the per-read char[] in the no-truncation path also goes away. Affects: BasicStats, NContent, PerBaseQualityScores, PerBaseSequenceContent, PerSequenceGCContent, PerSequenceQualityScores, PerTileQualityScores. Co-Authored-By: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Contributor
Author
|
I expect this to be the last performance-related PR for a bit. @pditommaso did push a whole load of other changes as well, but none seem to make a significant impact on run time (even if they seem sensible changes), so I'm not sure that they're worth the code changes. I thought this one was worth pushing forward mostly because of the memory savings when running with 2 FastQ files, which is a pretty common setup. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
String.toCharArray()allocates a freshchar[]the same length as the string, ~300 bytes per call. Seven modules each did this per sequence, so a single 50 M-read file produced ~100 GB of char arrays. The GC handled it, but at the cost of heap headroom and GC time.This branch replaces every such loop with
String + length + charAt, which on JDK 17 intrinsifies to the same machine code aschar[]indexing, minus the allocation and copy. Where the inner loop reads the same base more than once, the new code caches it into a localchar bso the bounds check fires once per iteration.PerSequenceGCContent.truncateSequenceswitches from returning achar[]to returning aStringfor the same reason.Affects: BasicStats, NContent, PerBaseQualityScores, PerBaseSequenceContent, PerSequenceGCContent, PerSequenceQualityScores, PerTileQualityScores.
Benchmark report shows a small speed increase and a drop in peak RSS memory usage for single files. When running with multiple files however the memory usage is significantly less (~30% less). All
fastqc_data.txt,summary.txt, andfastqc_report.htmlfiles remain byte-identical to master. Full report: report.htmlScreenshot