Fix dup_length handling for short reads #190

Merged: s-andrews merged 1 commit into s-andrews:master from ehsanestaji:codex/fix-dup-length-short-reads on May 20, 2026

Conversation

@ehsanestaji
Contributor

Fixes #145.

When fastqc.dup_length / --dup_length is longer than a read, the overrepresented sequence module currently calls substring(0, dupLength) and crashes with StringIndexOutOfBoundsException.

This change caps the duplication truncation length at the actual sequence length, so short reads are kept whole instead of failing.
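
As a rough illustration of the capping idea (not the actual diff), here is a minimal standalone sketch. The class and method names `DupLengthSketch` and `truncateForDuplication` are hypothetical and do not come from the FastQC source, and treating a non-positive `dupLength` as "do not truncate" is an assumption made only for this example.

```java
// Hypothetical sketch; names are illustrative, not taken from the FastQC source.
final class DupLengthSketch {

    // Returns the prefix used for duplicate tracking without reading past the end of the read.
    // Assumption: a non-positive dupLength means "do not truncate".
    static String truncateForDuplication(String sequence, int dupLength) {
        if (dupLength <= 0) {
            return sequence;
        }
        // Cap the end index at the read length so substring() cannot throw
        // StringIndexOutOfBoundsException for short reads.
        int end = Math.min(dupLength, sequence.length());
        return sequence.substring(0, end);
    }

    public static void main(String[] args) {
        String shortRead = "ACGTACGTACGTACGT"; // 16 bp, shorter than dup_length=20
        System.out.println(truncateForDuplication(shortRead, 20)); // read is kept whole
        System.out.println(truncateForDuplication(shortRead, 8));  // read is cut to 8 bp
    }
}
```

With an uncapped `substring(0, 20)` the first call would throw on a 16bp read; with the cap, reads longer than `dupLength` are truncated exactly as before and shorter reads pass through unchanged.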

Also adds an integration regression test using the existing 16bp minimal.fastq fixture with fastqc.dup_length=20.

Validation:

  • Reproduced the original crash with fastqc.dup_length=20 against test/data/minimal.fastq before the fix.
  • Compiled the touched production class and new regression test with Java 11.
  • Ran DupLengthTest: 1 test found, 1 succeeded, 0 failed.
  • Ran unit test package through JUnit launcher: 8 tests found, 8 succeeded, 0 failed.
  • Ran git diff --check.

@s-andrews
Owner

Thanks for submitting this. The fix looks good to me, and I'll merge it for the next release.

@s-andrews marked this pull request as ready for review on May 20, 2026, 14:39
@s-andrews merged commit 78a5182 into s-andrews:master on May 20, 2026
Development

Successfully merging this pull request may close these issues.

Fastqc crashes if given --dup_length too long

2 participants