Master runner for multiple nf-core Nextflow pipelines on SLURM using Apptainer via the biocontainers module. This repo only contains configuration, wrapper scripts, templates, and validators. No real data is included.
Concepts
Scratch vs results
All intermediates and caches live under a single scratch root set by NGS_SCRATCH. Final outputs live in each run directory under results/.
Scratch layout
$NGS_SCRATCH/work/<pipeline>/<run_id>
$NGS_SCRATCH/cache/nf
$NGS_SCRATCH/cache/singularity
$NGS_SCRATCH/tmp
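As a rough illustration, the scratch layout above could be created by a helper like the one below. The function name ensure_scratch_layout is hypothetical, and the real logic in bin/run_nfcore.sh may differ.

```shell
# Hypothetical helper (not part of this repo): create the scratch layout for one run.
# Usage: ensure_scratch_layout <scratch_root> <pipeline> <run_id>
ensure_scratch_layout() {
  local root="$1" pipeline="$2" run_id="$3"
  # mkdir -p is idempotent, so calling this before every run is safe.
  mkdir -p \
    "$root/work/$pipeline/$run_id" \
    "$root/cache/nf" \
    "$root/cache/singularity" \
    "$root/tmp"
}
```

A call such as ensure_scratch_layout "$NGS_SCRATCH" cutandrun my_run_id would then produce the four directories listed above.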
Run directory layout
<run_dir>/inputs/
<run_dir>/results/
<run_dir>/logs/
Resume and caching
bin/run_nfcore.sh always passes -resume to Nextflow and enforces the scratch layout, so a rerun reuses cached task results from the work directory and runs stay reproducible and restartable.
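To make the relationship between -resume and the scratch layout concrete, here is a minimal dry-run sketch that only prints a Nextflow command. The function name print_nextflow_cmd and its argument order are assumptions for illustration, and the command that bin/run_nfcore.sh actually builds may include more options. The flags themselves (-r, -profile, -work-dir, -resume) are standard Nextflow options, and --outdir is the usual nf-core output parameter.

```shell
# Sketch only (hypothetical helper): print, but do not execute, a Nextflow command
# that keeps intermediates in scratch and final outputs in the run directory.
# Usage: print_nextflow_cmd <scratch_root> <pipeline> <run_id> <run_dir> <revision> <profile>
print_nextflow_cmd() {
  local scratch="$1" pipeline="$2" run_id="$3" run_dir="$4" revision="$5" profile="$6"
  printf '%s\n' \
    "nextflow run nf-core/$pipeline \\" \
    "  -r $revision \\" \
    "  -profile $profile \\" \
    "  -work-dir $scratch/work/$pipeline/$run_id \\" \
    "  -resume \\" \
    "  --outdir $run_dir/results"
}
```

Because the work directory is derived from the pipeline and run ID, a rerun with the same run ID points -resume at the same cached tasks.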
Setup
- Set a scratch root
export NGS_SCRATCH="$HOME/scratch/ngs"
- Initialize the environment
source bin/env.sh
- Ensure scripts are executable if needed
chmod +x bin/*.sh pipelines/*/run.sh
Notes
bin/env.sh attempts module load biocontainers and module load nextflow. If the modules are not available, it prints a warning, and you can load the tools another way. It then verifies that nextflow and apptainer are available.
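The availability check described above can be sketched as follows. This is an illustration only: require_tools is a hypothetical name, and bin/env.sh may perform the check differently.

```shell
# Hypothetical check (not taken from bin/env.sh): warn about every required
# tool that is not on PATH. Returns 0 if all tools are found, 1 otherwise.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "WARNING: $tool not found on PATH" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For this repo the call would be require_tools nextflow apptainer, run after the module load attempts.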
Create A Run Directory
bin/make_run_dir.sh --pipeline cutandrun --run-dir ~/runs/nfcore/2026-02-05_proj_cutandrun_hg38 --copy-template
Run A Pipeline
All pipelines share the same command-line interface. Each run.sh calls the shared launcher and writes standard logs and summaries to <run_dir>/logs/.
Example
pipelines/cutandrun/run.sh --run-dir ~/runs/nfcore/2026-02-05_proj_cutandrun_hg38 --profile slurm_72h_med,singularity --revision 3.2.2
Available pipelines and run scripts
pipelines/fetchngs/run.sh
pipelines/cutandrun/run.sh
pipelines/chipseq/run.sh
pipelines/atacseq/run.sh
pipelines/rnaseq/run.sh
pipelines/hic/run.sh
Pass-through arguments
Use -- to pass extra flags to Nextflow or the nf-core pipeline.
Example
pipelines/rnaseq/run.sh --run-dir <run_dir> --profile slurm_48h_med,singularity -- --max_memory 64.GB
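For readers curious how a wrapper can separate its own options from pass-through flags, the sketch below prints everything that follows the first -- separator, one argument per line. The function name passthrough_args is hypothetical, and the run.sh scripts may parse their arguments differently.

```shell
# Hypothetical helper: print each argument that follows the first "--",
# one per line. Arguments before the separator are ignored here.
passthrough_args() {
  local seen=0 arg
  for arg in "$@"; do
    if [ "$seen" -eq 1 ]; then
      printf '%s\n' "$arg"
    elif [ "$arg" = "--" ]; then
      seen=1
    fi
  done
}
```

Applied to the rnaseq example above, the wrapper options stay with the wrapper and only --max_memory and 64.GB are forwarded.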
Profiles
Use --profile with a comma-separated list like slurm_72h_med,singularity.
Available SLURM time and CPU tiers
slurm_4h_low slurm_4h_med slurm_4h_high
slurm_8h_low slurm_8h_med slurm_8h_high
slurm_24h_low slurm_24h_med slurm_24h_high
slurm_48h_low slurm_48h_med slurm_48h_high
slurm_72h_low slurm_72h_med slurm_72h_high
slurm_167h_low slurm_167h_med slurm_167h_high
All SLURM profiles include
--account=biochem --partition=cpu --qos=standby --mail-user=tsheeley@purdue.edu --mail-type=END,FAIL,BEGIN
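The tier names listed earlier in this section follow the pattern slurm_<hours>h_<tier>. A small check like the one below could catch typos before a job is submitted. It is a sketch: is_slurm_tier is a hypothetical name and is not a script shipped in this repo.

```shell
# Hypothetical check: succeed only for the SLURM tier names listed in this README
# (hours 4, 8, 24, 48, 72, 167 combined with low, med, high).
is_slurm_tier() {
  local hours tier
  for hours in 4 8 24 48 72 167; do
    for tier in low med high; do
      if [ "$1" = "slurm_${hours}h_${tier}" ]; then
        return 0
      fi
    done
  done
  return 1
}
```

For example, is_slurm_tier slurm_72h_med succeeds, while is_slurm_tier slurm_12h_med fails because there is no 12-hour tier.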
Pin Pipeline Versions
Each pipeline's run.sh ships with a placeholder default revision (X.Y.Z). Always pass a real release with --revision so that a new nf-core release cannot silently change the pipeline version you run.
Example
pipelines/chipseq/run.sh --run-dir <run_dir> --profile slurm_72h_high,singularity --revision 2.1.0
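One way to guard against launching with the placeholder still in place is a check like the following. It is a sketch under the assumption that revisions are MAJOR.MINOR.PATCH release tags such as 3.2.2; branch names or commit hashes would be rejected. The name is_pinned_revision is hypothetical.

```shell
# Hypothetical guard: accept only concrete MAJOR.MINOR.PATCH versions.
# The placeholder X.Y.Z and empty values are rejected.
is_pinned_revision() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}
```

A wrapper could call is_pinned_revision "$revision" and exit with an error message when it fails.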
Unified Output Manifest
Generate a best-effort manifest of outputs per sample.
bin/make_output_manifest.py --pipeline cutandrun --run-dir <run_dir> --out <run_dir>/results/manifest.tsv
The manifest schema is documented in schemas/manifest.schema.tsv.
Templates And Validators
Each pipeline has a template samplesheet or inputs file with placeholders only. Validators check required columns and basic path sanity. See schemas/samplesheet_notes.md for notes and refer to nf-core docs for detailed column meanings.
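A required-column check of the kind described above might look like the sketch below. It assumes a comma-separated samplesheet whose header fields are unquoted. The function name check_columns and the column names in the usage note are illustrative, not taken from the validators in this repo.

```shell
# Hypothetical validator: check that a CSV header contains every required column.
# Usage: check_columns <samplesheet.csv> <column>...
# Returns 0 if all columns are present, 1 otherwise.
check_columns() {
  local sheet="$1" header col status=0
  shift
  # Read the header line and drop any carriage return from Windows line endings.
  header="$(head -n 1 "$sheet" | tr -d '\r')"
  for col in "$@"; do
    # Wrapping both sides in commas forces whole-field matches.
    case ",$header," in
      *",$col,"*) ;;
      *) echo "missing column: $col" >&2; status=1 ;;
    esac
  done
  return "$status"
}
```

For instance, check_columns <run_dir>/inputs/samplesheet.csv sample fastq_1 fastq_2 would fail and name any column missing from the header. The exact required columns differ per pipeline, so consult the nf-core docs for each one.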
Execution Summaries
Every run writes the following to <run_dir>/logs/:
report.html
timeline.html
trace.tsv
dag.png
cmd.sh
run_manifest.json
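Four of these files correspond to standard Nextflow reporting options. The sketch below prints the flags that would produce them for a given logs directory. report_flags is a hypothetical helper. cmd.sh and run_manifest.json are presumably written by the wrapper itself, so they are not covered here.

```shell
# Sketch (hypothetical helper): print the Nextflow reporting options that write
# report.html, timeline.html, trace.tsv, and dag.png into a logs directory.
# Usage: report_flags <logs_dir>
report_flags() {
  local logs="$1"
  printf '%s\n' \
    "-with-report $logs/report.html" \
    "-with-timeline $logs/timeline.html" \
    "-with-trace $logs/trace.tsv" \
    "-with-dag $logs/dag.png"
}
```

These options can also be appended by hand to any nextflow run command when debugging outside the wrappers.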