25 commits
c027350
add larger nf-core runners
LuisHeinzlmeier Feb 13, 2026
3559b14
add zenodo and citation
LuisHeinzlmeier Feb 13, 2026
bc88744
update citation
LuisHeinzlmeier Feb 13, 2026
ce91aa9
add missing bam files to docs
LuisHeinzlmeier Feb 13, 2026
a30283e
add random comment to trigger CI tests
LuisHeinzlmeier Feb 13, 2026
396a05c
remove nf-core TODO's
LuisHeinzlmeier Feb 17, 2026
b24b1ad
nf-core pipelines lint --fix rocrate_readme_sync
LuisHeinzlmeier Feb 17, 2026
f322742
use BibTeX for citations
LuisHeinzlmeier Feb 17, 2026
25c9c0c
try another style for citations
LuisHeinzlmeier Feb 20, 2026
90696cf
try another style for citations (pre-commit)
LuisHeinzlmeier Feb 20, 2026
8e41fd6
change dropdown syntax to html
LuisHeinzlmeier Feb 20, 2026
500e955
adjust syntax
LuisHeinzlmeier Feb 20, 2026
c37666b
add detailed docs for demuxlet
LuisHeinzlmeier Feb 28, 2026
fd466e3
update format
LuisHeinzlmeier Feb 28, 2026
ea3520a
update format 2
LuisHeinzlmeier Feb 28, 2026
c6220c7
Fix Numba caching error in container by setting NUMBA_CACHE_DIR
LuisHeinzlmeier Feb 28, 2026
00e4de4
setup test_full
LuisHeinzlmeier Mar 9, 2026
eb2846c
fix hash modules for full_test compatibility
LuisHeinzlmeier Mar 11, 2026
29df45c
remove demuxem TODOs
LuisHeinzlmeier Mar 11, 2026
4689f4e
update tests and snapshots
LuisHeinzlmeier Mar 13, 2026
89b5772
current test setup
LuisHeinzlmeier Mar 16, 2026
e8ff787
reduce computing/storage resources of test and test_full
LuisHeinzlmeier Mar 17, 2026
a079d9b
closes #107
LuisHeinzlmeier Mar 17, 2026
b1add81
update nextflow_schema.json
LuisHeinzlmeier Mar 17, 2026
fc9b1a1
update docs
LuisHeinzlmeier Mar 17, 2026
12 changes: 9 additions & 3 deletions .github/workflows/nf-test.yml
@@ -27,7 +27,9 @@ env:
jobs:
nf-test-changes:
name: nf-test-changes
runs-on: ubuntu-latest
runs-on: # use self-hosted runners
- runs-on=${{ github.run_id }}-nf-test-changes
- runner=4cpu-linux-x64
outputs:
shard: ${{ steps.set-shards.outputs.shard }}
total_shards: ${{ steps.set-shards.outputs.total_shards }}
@@ -59,7 +61,9 @@ jobs:
name: "${{ matrix.profile }} | ${{ matrix.NXF_VER }} | ${{ matrix.shard }}/${{ needs.nf-test-changes.outputs.total_shards }}"
needs: [nf-test-changes]
if: ${{ needs.nf-test-changes.outputs.total_shards != '0' }}
runs-on: ubuntu-latest
runs-on: # use self-hosted runners
- runs-on=${{ github.run_id }}-nf-test
- runner=4cpu-linux-x64
strategy:
fail-fast: false
matrix:
@@ -115,7 +119,9 @@ jobs:
confirm-pass:
needs: [nf-test]
if: always()
runs-on: ubuntu-latest
runs-on: # use self-hosted runners
- runs-on=${{ github.run_id }}-confirm-pass
- runner=2cpu-linux-x64
steps:
- name: One or more tests failed (excluding latest-everything)
if: ${{ contains(needs.*.result, 'failure') }}
30 changes: 25 additions & 5 deletions README.md
@@ -7,7 +7,7 @@

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://github.com/codespaces/new/nf-core/hadge)
[![GitHub Actions CI Status](https://github.com/nf-core/hadge/actions/workflows/nf-test.yml/badge.svg)](https://github.com/nf-core/hadge/actions/workflows/nf-test.yml)
[![GitHub Actions Linting Status](https://github.com/nf-core/hadge/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/hadge/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/hadge/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.XXXXXXX-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.XXXXXXX)
[![GitHub Actions Linting Status](https://github.com/nf-core/hadge/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-core/hadge/actions/workflows/linting.yml)[![AWS CI](https://img.shields.io/badge/CI%20tests-full%20size-FF9900?labelColor=000000&logo=Amazon%20AWS)](https://nf-co.re/hadge/results)[![Cite with Zenodo](http://img.shields.io/badge/DOI-10.5281/zenodo.10634731-1073c8?labelColor=000000)](https://doi.org/10.5281/zenodo.10634731)
[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)

[![Nextflow](https://img.shields.io/badge/version-%E2%89%A525.04.0-green?style=flat&logo=nextflow&logoColor=white&color=%230DC09D&link=https%3A%2F%2Fnextflow.io)](https://www.nextflow.io/)
@@ -102,7 +102,6 @@ We thank the following people for their extensive assistance in the development
- [Luis Heinzlmeier](https://github.com/LuisHeinzlmeier)
- [Nico Trummer](https://github.com/nictru)
- [Seo Hyon Kim](https://github.com/seohyonkim)
<!-- TODO nf-core: If applicable, make list of people who have also contributed -->

## Contributions and Support

@@ -112,10 +111,31 @@ For further information or help, don't hesitate to get in touch on the [Slack `#

## Citations

<!-- TODO nf-core: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file. -->
<!-- If you use nf-core/hadge for your analysis, please cite it using the following doi: [10.5281/zenodo.XXXXXX](https://doi.org/10.5281/zenodo.XXXXXX) -->
If you use nf-core/hadge for your analysis, please cite it as follows:

<!-- TODO nf-core: Add bibliography of tools and data used in your pipeline -->
> **hadge: a comprehensive pipeline for donor deconvolution in single-cell studies.**
>
> Fabiola Curion, Xichen Wu, Lukas Heumos, Mariana Gonzales Andre, Lennard Halle, Melissa Grant-Peters, Charlotte Rich-Griffin, Hing-Yuen Yeung, Calliope A. Dendrou, Herbert B. Schiller & Fabian J. Theis.
>
> _Genome Biol._ 2024 Apr 26. doi: [10.1186/s13059-024-03249-z](https://doi.org/10.1186/s13059-024-03249-z).

<details><summary>BibTeX</summary>

```bibtex
@article{curion2024hadge,
  title={hadge: a comprehensive pipeline for donor deconvolution in single-cell studies},
  author={Curion, Fabiola and Wu, Xichen and Heumos, Lukas and Andr{\'e}, Mylene Mariana Gonzales and Halle, Lennard and Ozols, Matiss and Grant-Peters, Melissa and Rich-Griffin, Charlotte and Yeung, Hing-Yuen and Dendrou, Calliope A and others},
  journal={Genome Biology},
  volume={25},
  number={1},
  pages={109},
  year={2024},
  publisher={Springer}
}
```

</details>

An extensive list of references for the tools used by the pipeline can be found in the [`CITATIONS.md`](CITATIONS.md) file.

4 changes: 2 additions & 2 deletions bin/update_snapshots.sh
@@ -24,7 +24,7 @@ for test_file in "${test_files[@]}"; do
test_profile="test"
fi

command="nf-test test tests/${test_file}.nf.test --profile ${test_profile},docker --update-snapshot"
command="nf-test test tests/${test_file}.nf.test --profile ${test_profile},apptainer --update-snapshot"

echo "Updating snapshot for: $test_file"
echo "Running: ${command}"
@@ -36,7 +36,7 @@ for test_file in "${test_files[@]}"; do
# test if testing is consistent
if [[ "$CHECK_CONSISTENCY" == "true" ]]; then
echo "Re-running test to verify snapshot consistency for: $test_file"
command="nf-test test tests/${test_file}.nf.test --profile ${test_profile},docker"
command="nf-test test tests/${test_file}.nf.test --profile ${test_profile},apptainer"
echo "Running: ${command}"
eval "$command"
echo "✓ Consistency check passed for: $test_file"
4 changes: 2 additions & 2 deletions conf/test.config
@@ -24,12 +24,12 @@ params {

// Input data
mode = 'rescue'
hash_tools = 'hasheddrops,bff,gmm-demux'
hash_tools = 'htodemux,demuxem,bff'
genetic_tools = 'freemuxlet,vireo,souporcell'
input = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/samplesheet/samplesheet_rescue.csv'
genome = 'GRCh38'
fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/chr21/sequence/genome.fasta'
bam_qc = true
outdir = "outdir_test"

// all possible modules
// hash_tools = 'htodemux,hasheddrops,multiseq,demuxem,gmm-demux,bff,hashsolo'
9 changes: 5 additions & 4 deletions conf/test_donor_match.config
@@ -25,8 +25,9 @@ params {
// Input data
mode = 'donor_match'
input = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/samplesheet/samplesheet_donor_match.csv'
demultiplexing_result = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/testdata/donor_match_assignment.csv'
vireo_filtered_variants = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/testdata/donor_match_filtered_variants.tsv'
cell_genotype = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/testdata/donor_match.cells.vcf.gz'
gt_donors = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/testdata/donor_match_GT_donors.vireo.vcf.gz'
demultiplexing_result = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/dataset_test/donor_match_assignment.csv'
vireo_filtered_variants = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/dataset_test/donor_match_filtered_variants.tsv'
cell_genotype = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/dataset_test/donor_match.cells.vcf.gz'
gt_donors = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/dataset_test/donor_match_GT_donors.vireo.vcf.gz'
outdir = "outdir_test_donor_match"
}
30 changes: 23 additions & 7 deletions conf/test_full.config
@@ -10,15 +10,31 @@
----------------------------------------------------------------------------------------
*/

process {
resourceLimits = [
cpus: 60,
memory: '64.GB',
time: '23.h'
]

withName: SOUPORCELL {
cpus = 60
memory = 64.GB
time = 8.h
}
}

params {
config_profile_name = 'Full test profile'
config_profile_description = 'Full test dataset to check pipeline function'

// Input data for full size test
// TODO nf-core: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
// TODO nf-core: Give any required params for the test so that command line flags are not needed
input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_full_illumina_amplicon.csv'

// Genome references
genome = 'R64-1-1'
// Input data
mode = 'rescue'
hash_tools = 'htodemux,hasheddrops,multiseq,demuxem,gmm-demux,bff,hashsolo'
genetic_tools = 'demuxlet,freemuxlet,vireo,souporcell'
input = '/lustre/groups/ml01/code/luis.heinzlmeier/hadge/conf/test_full_new.csv'
genome = 'GRCh38'
bam_qc = false
find_variants = false
outdir = "outdir_test_full"
}
2 changes: 1 addition & 1 deletion conf/test_genetic.config
@@ -28,5 +28,5 @@ params {
input = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/samplesheet/samplesheet_genetic.csv'
genome = 'GRCh38'
fasta = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/chr21/sequence/genome.fasta'
bam_qc = true
outdir = "outdir_test_genetic"
}
5 changes: 2 additions & 3 deletions conf/test_hashing.config
@@ -24,8 +24,7 @@ params {

// Input data
mode = 'hashing'
hash_tools = 'htodemux,hasheddrops,multiseq,gmm-demux,bff,hashsolo'
hash_tools = 'htodemux,hasheddrops,multiseq,demuxem,gmm-demux,bff,hashsolo'
input = 'https://github.com/nf-core/test-datasets/raw/refs/heads/hadge/samplesheet/samplesheet_hashing.csv'

// TODO demuxem: include demuxem to hash_tools if #81 is fixed
outdir = "outdir_test_hashing"
}
62 changes: 44 additions & 18 deletions docs/usage.md
@@ -41,8 +41,8 @@ Finally, it assigns SNPs to cells to determine donor identity but requires addit

```csv title="samplesheet.csv"
sample,bam,vcf,n_samples,barcodes
id1,donor_genotype_chr21.vcf,2,barcodes.tsv
id2,donor_genotype_chr21.vcf,2,barcodes.tsv
id1,chr21.bam,donor_genotype_chr21.vcf,2,barcodes.tsv
id2,chr21.bam,donor_genotype_chr21.vcf,2,barcodes.tsv
id3,chr21.bam,donor_genotype_chr21.vcf,2,barcodes.tsv
```
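
The corrected rows above can be sanity-checked before launching the pipeline. A minimal sketch in Python — the inline samplesheet and the check itself are illustrative, not part of the pipeline:

```python
import csv
import io

# Columns required by the genetic-mode samplesheet, as documented here.
REQUIRED = ["sample", "bam", "vcf", "n_samples", "barcodes"]

samplesheet = """sample,bam,vcf,n_samples,barcodes
id1,chr21.bam,donor_genotype_chr21.vcf,2,barcodes.tsv
id2,chr21.bam,donor_genotype_chr21.vcf,2,barcodes.tsv
"""

rows = list(csv.DictReader(io.StringIO(samplesheet)))
for row in rows:
    # every required column must be present and non-empty
    missing = [col for col in REQUIRED if not row.get(col)]
    assert not missing, f"{row['sample']}: missing {missing}"
    # n_samples is the number of multiplexed donors, a positive integer
    assert int(row["n_samples"]) >= 1
```

A row that omits the `bam` column (the bug fixed in this diff) fails the `missing` assertion immediately, which is cheaper than a mid-run crash.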

@@ -142,7 +142,7 @@ id3,rna.tar.gz,hto.tar.gz,chr21.bam,donor_genotype_chr21.vcf,2,barcodes.tsv
| `rna_matrix` | Full path to the RNA-Seq count matrices provided in a 10x Genomics format and compressed as `.tar.gz`. |
| `hto_matrix` | Full path to the hashing count matrices provided in a 10x Genomics format and compressed as `.tar.gz`. |
| `bam` | Full path to the alignment file (`.bam`). |
| `vcf` | Full path to the list of common SNPs (`.vcf`). |
| `vcf` | Full path to the common SNP genotypes VCF file (`.vcf`). |
| `n_samples` | The number of multiplexed donors. |
| `barcodes` | Full path to the list of cell barcodes (e.g., `barcodes.tsv` from Cell Ranger) |

@@ -155,21 +155,47 @@ id3,rna.tar.gz,hto.tar.gz,chr21.bam,donor_genotype_chr21.vcf,2,barcodes.tsv
| hashing | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| donor_match | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ |

| Module | sample | rna_matrix | hto_matrix | bam | barcodes | n_samples | vcf |
| ----------- | :----: | :--------: | :--------: | :-: | :------: | :-------: | :-: |
| htodemux | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| multiseq | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| bff | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| demuxem | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| gmm-demux | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| hasheddrops | ✅ | ✅\* | ✅ | ❌ | ❌ | ❌ | ❌ |
| hashsolo | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| vireo | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
| demuxlet | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| freemuxlet | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | ✅ |
| souporcell | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ |

\* if `params.hasheddrops_runEmptyDrops` is true
| Module | sample | rna_matrix | hto_matrix | bam | barcodes | n_samples | vcf<sup>1</sup> |
| ----------- | :----: | :------------: | :--------: | :-: | :------: | :-------: | :-------------: |
| htodemux | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| multiseq | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| bff | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| demuxem | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| gmm-demux | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| hasheddrops | ✅ | ✅<sup>2</sup> | ✅ | ❌ | ❌ | ❌ | ❌ |
| hashsolo | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| vireo | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
| demuxlet | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅<sup>3</sup> |
| freemuxlet | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ | ✅ |
| souporcell | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ |

<sup>1</sup> The requirements for the VCF file differ between genetic deconvolution methods.
Check out [Demuxafy](https://demultiplexing-doublet-detecting-docs.readthedocs.io/en/latest/DemultiplexingSoftwares.html) to find the right VCF file for the methods you want to use.
`POPSCLE_DSCPILEUP` (needed for `freemuxlet` and `demuxlet`) requires the VCF file to be sorted in the same contig order as the BAM file. If you encounter an error caused by this, consider sorting the VCF with `picard SortVcf`.

<sup>2</sup> if `params.hasheddrops_runEmptyDrops` is true

<sup>3</sup> reference SNP genotypes for each individual ([demuxlet docs](https://demultiplexing-doublet-detecting-docs.readthedocs.io/en/latest/Demuxlet.html))

:::
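
The sorting requirement in footnote 1 boils down to: contigs shared by the two files must appear in the same order in the VCF header as in the BAM header. A dependency-free sketch of that check (extracting the contig lists, e.g. via `samtools view -H` and `bcftools view -h`, is left out; the function name is ours):

```python
def contigs_in_same_order(bam_contigs, vcf_contigs):
    """POPSCLE_DSCPILEUP expects the VCF to follow the BAM contig order.
    Compare only contigs present in both lists, preserving their order."""
    bam_shared = [c for c in bam_contigs if c in set(vcf_contigs)]
    vcf_shared = [c for c in vcf_contigs if c in set(bam_contigs)]
    return bam_shared == vcf_shared

# A mismatch means the VCF needs re-sorting against the BAM's dictionary,
# e.g. with `picard SortVcf`.
assert contigs_in_same_order(["chr1", "chr2"], ["chr1", "chr2"]) is True
assert contigs_in_same_order(["chr1", "chr2"], ["chr2", "chr1"]) is False
```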

:::tip{collapse title="Recommendations for naming HTO-labels and barcodes"}

1. Avoid single DNA base letters as suffixes

- **Incorrect:** `HTO-A`, `HTO-C`, `HTO-G`, `HTO-T`
- **Reason:** The `BFF` module uses `cellhashR`'s `ProcessCountMatrix()`, which internally calls `SimplifyHtoNames()` and incorrectly strips single DNA base letters, collapsing `HTO-A`, `HTO-C`, `HTO-G` all to `HTO` and causing a crash.

2. Avoid barcode sequences as part of the label

- **Incorrect:** `HTO-1-ACTGTCTAACGG`
- **Reason:** `SimplifyHtoNames()` strips the barcode suffix in `BFF`, causing the same HTO to appear as `HTO-1` in `BFF` output but `HTO-1-ACTGTCTAACGG` in other methods, making cross-method comparison unreliable.

3. Avoid using the same trailing suffix on all barcodes

- **Incorrect:** `AAACCCAAGAAACACT-1` (`-1` appended to every barcode)
- **Reason:** In the `DEMUXEM` module, `pegasusio.read_input()` only removes the suffix from RNA barcodes, but not from HTO barcodes, which leads to a known issue (see [#21](https://github.com/lilab-bcb/demuxEM/issues/21)).

:::

An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline.
3 changes: 0 additions & 3 deletions main.nf
@@ -26,9 +26,6 @@ include { getGenomeAttribute } from './subworkflows/local/utils_nfcore_hadg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

// TODO nf-core: Remove this line if you don't need a FASTA file
// This is an example of how to use getGenomeAttribute() to fetch parameters
// from igenomes.config using `--genome`
params.fasta = getGenomeAttribute('fasta')

/*
@@ -1,5 +1,8 @@
#!/usr/bin/env python3

import os
os.environ["NUMBA_CACHE_DIR"] = "./tmp/numba"

# versions
import platform
import yaml
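
For context on this fix: Numba resolves its cache location once, when it is first imported, so `NUMBA_CACHE_DIR` must be exported before any numba-dependent import. In a container whose site-packages tree is read-only, this redirects JIT caches to a writable path. A minimal sketch — the `./tmp/numba` path mirrors the diff above; nothing else is assumed about the container:

```python
import os

# Set BEFORE importing numba (or demuxEM/pegasusio, which pull it in):
# numba reads this variable at import time, not per call.
os.environ["NUMBA_CACHE_DIR"] = "./tmp/numba"
os.makedirs(os.environ["NUMBA_CACHE_DIR"], exist_ok=True)
# From here on, JIT caches land in ./tmp/numba instead of the
# read-only site-packages directory inside the container image.
```
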
14 changes: 0 additions & 14 deletions modules/local/dropletutils/mtxconvert/templates/convert.R
@@ -4,7 +4,6 @@ library(DropletUtils)

mtx_dir <- "${input_mtx_dir}"


sce <- read10xCounts(mtx_dir) # Read to SingleCellExperiment object

print(sce)
@@ -18,19 +17,6 @@ if ("${write_csv}" == "true") {
write.csv(as.matrix(count_matrix), file = "${prefix}.csv", row.names = TRUE)
}

# TODO demuxem: remove if demuxEM issue is solved (https://github.com/theislab/hadge/issues/81)
# Write to h5 file
# write10xCounts(
# path = "${prefix}.h5",
# x = counts(sce),
# barcodes = colData(sce)\$Barcode,
# gene.id = rownames(sce),
# gene.symbol = if (!is.null(rowData(sce)\$Symbol)) rowData(sce)\$Symbol else rownames(sce),
# gene.type = if (!is.null(rowData(sce)\$Type)) rowData(sce)\$Type else rep("Gene Expression", nrow(sce)),
# type = "HDF5",
# version = "3", # <-- ensures /matrix layout instead of /unknown
# overwrite = TRUE
# )
write10xCounts("${prefix}.h5", count_matrix, type = "HDF5")

################################################