Performance upgrades #198
Merged
AnalysisRunner now runs as one reader thread plus three processor threads.

The reader batches Sequences (1024 per batch) and pushes each batch reference onto N ArrayBlockingQueues. Each processor drains its own queue and runs an evenly split subset of the QCModule array, so modules stay single-threaded per processor and no in-module locking is needed.

Progress callbacks (analysisUpdated) are fired from the reader thread at the same cadence as the previous single-threaded version (every batch boundary, gated on a 5% file-position advance).

Co-Authored-By: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
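The reader/processor layout described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `PipelineSketch`, `Module`, `CountingModule`, the queue capacity of 4, and the end-of-input sentinel are all hypothetical, and plain strings stand in for Sequence objects. Progress callbacks are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class PipelineSketch {
    /** Hypothetical stand-in for QCModule. */
    interface Module {
        void process(String sequence);
    }

    /** Toy module; it needs no locking because only one processor thread calls it. */
    static final class CountingModule implements Module {
        long count;
        @Override public void process(String sequence) { count++; }
    }

    /** End-of-input marker, compared by identity (an assumption of this sketch). */
    private static final List<String> END = new ArrayList<>();

    static void run(Iterable<String> sequences, Module[] modules, int processors, int batchSize) {
        List<BlockingQueue<List<String>>> queues = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int p = 0; p < processors; p++) {
            BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(4); // capacity is arbitrary
            queues.add(queue);
            // Evenly split, contiguous slice of the module array for this processor.
            int from = p * modules.length / processors;
            int to = (p + 1) * modules.length / processors;
            Thread worker = new Thread(() -> {
                try {
                    for (List<String> received = queue.take(); received != END; received = queue.take()) {
                        for (String s : received) {
                            for (int m = from; m < to; m++) {
                                modules[m].process(s);
                            }
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }
        try {
            // The calling thread plays the reader: one batch reference goes to every queue.
            List<String> batch = new ArrayList<>(batchSize);
            for (String s : sequences) {
                batch.add(s);
                if (batch.size() == batchSize) {
                    for (BlockingQueue<List<String>> q : queues) q.put(batch);
                    batch = new ArrayList<>(batchSize);
                }
            }
            if (!batch.isEmpty()) {
                for (BlockingQueue<List<String>> q : queues) q.put(batch);
            }
            for (BlockingQueue<List<String>> q : queues) q.put(END);
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("reader interrupted", e);
        }
    }
}
```

Each batch is shared read-only between processors, and every module is touched by exactly one thread, which is why the sketch needs no synchronisation beyond the queues themselves.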
AnalysisQueue treats -t as a total-thread budget and splits it between
outer concurrency (files in parallel) and inner concurrency (per-file
reader + processor pipeline):
    processorsPerFile = min(MAX_PROCESSORS_PER_FILE, totalThreads - 1)
    outerSlots = max(1, totalThreads / (1 + processorsPerFile))
When -t is unset, OfflineRunner now tells AnalysisQueue how many files
the run has via configure(); the default becomes
min(THREADS_PER_FILE * max(1, expectedFiles), availableProcessors), so
a single file gets the full per-file pipeline and many files scale up
to the host's CPU count without the user needing to set -t.
A budget of one CPU makes AnalysisRunner take its single-threaded path
so -t 1 produces byte-identical behaviour to the unbatched runner.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
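The budget arithmetic in this commit message can be worked through with a small sketch. The constant values are assumptions: the message names MAX_PROCESSORS_PER_FILE and THREADS_PER_FILE but does not state them, so 3 and 4 are inferred here from "one reader thread plus three processor threads". The class name `ThreadBudgetSketch` is hypothetical, and integer division is assumed for outerSlots.

```java
final class ThreadBudgetSketch {
    // Assumed values, not confirmed by the source.
    static final int MAX_PROCESSORS_PER_FILE = 3;
    static final int THREADS_PER_FILE = 1 + MAX_PROCESSORS_PER_FILE;

    /** Inner concurrency: processor threads per file, leaving one thread for the reader. */
    static int processorsPerFile(int totalThreads) {
        return Math.min(MAX_PROCESSORS_PER_FILE, totalThreads - 1);
    }

    /** Outer concurrency: files analysed in parallel (integer division, at least 1). */
    static int outerSlots(int totalThreads) {
        return Math.max(1, totalThreads / (1 + processorsPerFile(totalThreads)));
    }

    /** Default budget when -t is unset. */
    static int defaultThreads(int expectedFiles, int availableProcessors) {
        return Math.min(THREADS_PER_FILE * Math.max(1, expectedFiles), availableProcessors);
    }

    public static void main(String[] args) {
        for (int t : new int[] {1, 2, 4, 8, 16}) {
            System.out.printf("-t %d: processorsPerFile=%d, outerSlots=%d%n",
                    t, processorsPerFile(t), outerSlots(t));
        }
    }
}
```

Under these assumed constants, -t 1 gives zero processors per file (the single-threaded path), and -t 8 gives three processors per file across two files in parallel.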
~3x run time speedup using parallel processing
Pulling in the new threading model and performance upgrades, which give about a 4x speed improvement. Thanks to @ewels and other Seqera people for pulling this together.