125 commits
a5af10e
adding mskcc modules
Apr 13, 2023
9edde93
adding test yaml for mskccprepnucleo
Apr 13, 2023
f07a6de
name change
Apr 20, 2023
043bec3
removing redundant channel
Apr 20, 2023
a830c37
re-doing first access subworkflow
Apr 27, 2023
6503b00
clean up old workflow, after re-name
May 2, 2023
db72d2c
try fork syncing
May 9, 2023
00fa7d7
Merge branch 'develop' into feature/extractumi
buehlere May 9, 2023
fc28f6e
experimenting with auto sync
May 9, 2023
d669e02
Merge branch 'feature/extractumi' of https://github.com/mskcc-omics-w…
May 9, 2023
7c9816d
testing gitactions
May 9, 2023
228c772
gitactions test
May 9, 2023
2d70a15
Merge pull request #4 from nf-core/master
buehlere May 9, 2023
d612e60
gitactions experiment
May 9, 2023
fde2b5d
Merge branch 'feature/extractumi' of https://github.com/mskcc-omics-w…
May 9, 2023
a78e7ea
trying to fix gitactions
May 9, 2023
06b27e0
Create sync-fork.yml
May 9, 2023
f2c759d
Update sync-fork.yml
May 9, 2023
7ff8837
Update sync-fork.yml
May 9, 2023
03ede3c
Update sync-fork.yml
May 9, 2023
37ad4bd
Merge branch 'master' of https://github.com/mskcc-omics-workflows/mod…
May 9, 2023
879199a
Update sync-fork.yml
May 9, 2023
c21e35b
Update sync-fork.yml
May 9, 2023
cd09579
Update sync-fork.yml
May 9, 2023
31331e4
Update sync-fork.yml
May 9, 2023
cf56e9b
Merge branch 'master' into feature/extractumi
May 9, 2023
9a1f635
remove output folders
May 9, 2023
8ce9df3
adding base template
May 9, 2023
be48046
Update sync-fork.yml
May 10, 2023
caa71e9
Revert "Update sync-fork.yml"
May 10, 2023
e0c8587
Create sync-action.yml
May 10, 2023
0877f99
Merge pull request #10 from nf-core/master
buehlere May 10, 2023
5eee44f
Merge branch 'master' into feature/alignment
May 11, 2023
6b5a9c2
Merge pull request #11 from nf-core/master
buehlere May 12, 2023
d65303a
Merge pull request #12 from nf-core/master
buehlere May 13, 2023
ff7519d
Merge pull request #13 from nf-core/master
buehlere May 15, 2023
ff6690a
Merge pull request #14 from nf-core/master
buehlere May 16, 2023
37ceaf0
Merge pull request #15 from nf-core/master
buehlere May 17, 2023
441d2d9
remove this for separate PR
May 17, 2023
95ef62c
Merge branch 'master' into feature/extractumi
May 17, 2023
05e1212
working alignment example
May 17, 2023
eaef95a
clean up for extractumi
May 17, 2023
4c663a3
Merge branch 'master' into develop
May 17, 2023
7215646
Update sync-action.yml
May 17, 2023
ab7ba98
Delete sync-fork.yml
May 17, 2023
13d45de
Merge pull request #16 from nf-core/master
buehlere May 17, 2023
12be703
Merge branch 'develop' into feature/extractumi
May 17, 2023
35baf68
Update sync-action.yml
May 17, 2023
f2f4e04
Update sync-action.yml
May 17, 2023
5912f29
Revert "Merge pull request #16 from nf-core/master"
May 17, 2023
668223d
Merge pull request #17 from nf-core/master
buehlere May 18, 2023
d3d728d
Update sync-action.yml
May 18, 2023
1b166e5
Merge pull request #20 from nf-core/master
buehlere May 18, 2023
4b5b4ba
Merge branch 'develop' into feature/extractumi
May 18, 2023
e49802d
cleanup
May 18, 2023
744b72e
Merge branch 'develop' into feature/alignment
May 18, 2023
98ecd46
Merge pull request #23 from nf-core/master
buehlere May 22, 2023
e3e34fa
Merge pull request #25 from nf-core/master
buehlere May 23, 2023
a49348f
Merge pull request #27 from nf-core/master
buehlere May 24, 2023
bf8e125
Merge pull request #29 from nf-core/master
buehlere May 25, 2023
fa5bb4d
Trigger github actions when PR is made against develop
anoronh4 May 25, 2023
bdb938f
split github workflows into different files for running on develop PRs
anoronh4 May 25, 2023
4a27c56
Merge pull request #32 from nf-core/master
buehlere May 26, 2023
78a953f
Disable sentieon testing in CI pytests for PRs against develop
anoronh4 May 26, 2023
3c06e80
changed names of all github action workflows that trigger in PRs agai…
anoronh4 May 26, 2023
2c128f2
Merge pull request #34 from nf-core/master
buehlere May 27, 2023
2d80a5a
Merge pull request #36 from nf-core/master
buehlere May 30, 2023
c7e7c4e
Merge pull request #38 from nf-core/master
buehlere May 31, 2023
8485cc7
Merge pull request #40 from nf-core/master
buehlere Jun 1, 2023
365d9eb
Merge pull request #42 from nf-core/master
buehlere Jun 2, 2023
320b705
Merge pull request #44 from nf-core/master
buehlere Jun 3, 2023
0b85cd9
Merge pull request #46 from nf-core/master
buehlere Jun 5, 2023
84aa571
Merge pull request #48 from nf-core/master
buehlere Jun 6, 2023
ea666e8
Merge pull request #50 from nf-core/master
buehlere Jun 8, 2023
c1b05ee
Merge pull request #30 from mskcc-omics-workflows/enhancement/run_wor…
buehlere Jun 8, 2023
b82a132
Merge branch 'develop' into feature/extractumi
Jun 8, 2023
e32f380
prettier formatting
Jun 8, 2023
c588273
fix formatting for linters
Jun 8, 2023
601c863
removing tools-test-dataset
Jun 8, 2023
a3ae978
re-doing md5 sum
Jun 8, 2023
10d52c3
Update test.yml
Jun 8, 2023
a804e7c
redoing md5sums
Jun 8, 2023
579028c
Merge pull request #52 from nf-core/master
buehlere Jun 9, 2023
ad64d4c
Update main.nf
Jun 9, 2023
f1d34f7
Merge pull request #54 from nf-core/master
buehlere Jun 10, 2023
b4e4c01
Merge pull request #56 from nf-core/master
buehlere Jun 13, 2023
e640e59
Merge pull request #58 from nf-core/master
buehlere Jun 22, 2023
79efdf9
Merge branch 'develop' into feature/alignment
Jun 22, 2023
9c5e0b8
adding bwa2 init
Jun 26, 2023
42c879b
updating test
Jun 27, 2023
64cbb79
updating how data is pulled for test
Jun 27, 2023
3e727a0
updating alignment, first version I'm expecting to work
Jun 28, 2023
a102d07
remove dangling features
Jun 28, 2023
ca2ab7e
Delete .gitmodules
Jun 28, 2023
8a12ed3
fix formatting
Jul 6, 2023
772cb59
fixing test
Jul 6, 2023
ea145c8
Merge branch 'feature/alignment' of https://github.com/mskcc-omics-wo…
Jul 6, 2023
2b004a9
fixing indexing
Jul 18, 2023
cff132b
update testing
Jul 18, 2023
45accbd
Update main.nf
Jul 18, 2023
e8724f1
Merge branch 'develop' into feature/extractumi
Aug 8, 2023
73fedb5
Delete sync-action.yml
Aug 8, 2023
1902b7f
Merge branch 'develop' into feature/extractumi
Aug 8, 2023
0b5ff75
remove poorly synced files
Aug 8, 2023
053d201
more poorly synced cleanup
Aug 8, 2023
75a0cfe
name change
Aug 9, 2023
ac38f09
removing old name workflow
Aug 9, 2023
10669ef
undo bad syncing
Aug 9, 2023
0c7c483
Merge branch 'develop' into feature/alignment
Aug 9, 2023
b11d933
merge with develop
Aug 9, 2023
a88f7ec
Delete test_data_msk.config
Aug 9, 2023
93c3b3b
Update main.nf
Aug 9, 2023
a6c1275
Merge branch 'feature/extractumi' into feature/extract_align
Aug 9, 2023
afeb38e
working extract to align
Aug 9, 2023
3dbcae9
Update meta.yml
Aug 9, 2023
824be8d
Update meta.yml
Aug 9, 2023
0fad17d
fixing testing
Aug 9, 2023
8d27871
Merge branch 'feature/extract_align' of https://github.com/mskcc-omic…
Aug 9, 2023
b9d6cdb
Update main.nf
Aug 9, 2023
73549b2
fixing md5sums?
Aug 9, 2023
66fb57c
update testing ymls
Aug 9, 2023
bdca9d6
updates, still struggling with test
Aug 10, 2023
8da5b73
trying to update test
Aug 10, 2023
5b56c3d
please work test
Aug 10, 2023
23b2cd3
prettier
Aug 10, 2023
27 changes: 27 additions & 0 deletions install_data.sh
@@ -0,0 +1,27 @@
#!/usr/bin/env bash


# Test data is hosted on Google Drive at:
# https://drive.google.com/file/d/1GtT8jsBGwRoQC-5wHh06r8RFkiFBuirp/view?usp=sharing

fileid=1GtT8jsBGwRoQC-5wHh06r8RFkiFBuirp

filename=test_nucleo.tar.gz
foldername=test_nucleo

# Skip if already have test data
[[ -f $filename ]] && exit 0
[[ -d $foldername ]] && exit 0

curl -c ./cookie -s -k -L "https://drive.google.com/uc?export=download&id=$fileid" > /dev/null

curl -k -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=$(awk '/download/ {print $NF}' ./cookie)&id=${fileid}" -o "${filename}"

# Suppress Linux warnings for tar.gz files created on macOS
if [[ "$OSTYPE" == "linux-gnu" ]]; then
    tar --warning=no-unknown-keyword -xzvf "$filename"
elif [[ "$OSTYPE" == "darwin"* ]]; then
    tar -xzvf "$filename"
fi

rm "$filename"
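
Review note on the extraction step: the script matches `$OSTYPE` exactly against `linux-gnu`, so on a platform that reports something else (for example `linux-gnueabihf`) neither branch runs and the archive is still deleted afterwards. A minimal sketch of one way to avoid that follows. `tar_flags_for` is a hypothetical helper name that is not part of this PR, and the prefix match on `linux-gnu` is an assumption about which platforms ship GNU tar.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in the PR): print the tar flags to use for a given
# OSTYPE string. Anything that does not look like GNU/Linux gets plain flags,
# so extraction always happens before the archive is removed.
tar_flags_for() {
    local ostype="$1"
    case "$ostype" in
        linux-gnu*) echo "--warning=no-unknown-keyword -xzvf" ;;
        *)          echo "-xzvf" ;;
    esac
}
```

In the script this could be used as `tar $(tar_flags_for "$OSTYPE") "$filename"`, leaving the command substitution unquoted on purpose so the flags split into separate arguments.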
63 changes: 63 additions & 0 deletions subworkflows/nf-core/fastq_alignsort_bwa_picard/main.nf
@@ -0,0 +1,63 @@
// import
// bwa-mem2 is included as an alternative aligner
include { BWA_MEM } from '../../../modules/nf-core/bwa/mem/main'
include { BWA_INDEX } from '../../../modules/nf-core/bwa/index/main'
include { BWAMEM2_MEM } from '../../../modules/nf-core/bwamem2/mem/main'
include { BWAMEM2_INDEX } from '../../../modules/nf-core/bwamem2/index/main'
include { PICARD_ADDORREPLACEREADGROUPS } from '../../../modules/nf-core/picard/addorreplacereadgroups/main'

workflow FASTQ_ALIGNSORT_BWA_PICARD {

    take:
    fastqs    // channel: [ val(meta), [ fastq ] ]
    reference // [ val(meta), fasta ]
    bwa       // value: 1 (bwa mem) or 2 (bwa-mem2)

    main:

    versions = Channel.empty()

    // switch statement to determine which bwa to use; this is a passed parameter
    switch(bwa){
        case 1:
            BWA_INDEX ( reference )
            versions = versions.mix(BWA_INDEX.out.versions)
            aligned_bam = BWA_MEM ( fastqs, BWA_INDEX.out.index, true ).bam.map {
                meta, bam ->
                def new_id = 'aligned_bam'
                [[id: new_id], bam ]
            }
            versions = versions.mix(BWA_MEM.out.versions)
            break
        case 2:
            // INDEX
            BWAMEM2_INDEX (reference)
            versions = versions.mix(BWAMEM2_INDEX.out.versions)
            // BWA MEM2
            aligned_bam = BWAMEM2_MEM ( fastqs, BWAMEM2_INDEX.out.index, true ).bam.map {
                meta, bam ->
                def new_id = 'aligned_bam'
                [[id: new_id], bam ]
            }
            versions = versions.mix(BWAMEM2_MEM.out.versions)
            break
        default:
            throw new Exception("The argument bwa must be either 1 or 2, not ${bwa}.")
    }

    // Picard add or replace read groups
    PICARD_ADDORREPLACEREADGROUPS(aligned_bam).bam.map {
        meta, bam ->
        def new_id = 'grouped_aligned_bam'
        [[id: new_id], bam ]
    }.set {grouped_bam}
    versions = versions.mix(PICARD_ADDORREPLACEREADGROUPS.out.versions)

    // final output
    emit:

    bam = PICARD_ADDORREPLACEREADGROUPS.out.bam // channel: [ val(meta), [ bam ] ]

    versions = versions // channel: [ versions.yml ]
}

48 changes: 48 additions & 0 deletions subworkflows/nf-core/fastq_alignsort_bwa_picard/meta.yml
@@ -0,0 +1,48 @@
name: "fastq_alignsort_bwa_picard"
## TODO nf-core: Add a description of the subworkflow and list keywords
description: Sort SAM/BAM/CRAM file
keywords:
  - sort
  - bam
  - sam
  - cram
## TODO nf-core: Add a list of the modules used in the subworkflow
modules:
  - samtools/sort
  - samtools/index
## TODO nf-core: List all of the variables used as input, including their types and descriptions
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test' ]
  - bam:
      type: file
      description: BAM/CRAM/SAM file
      pattern: "*.{bam,cram,sam}"
## TODO nf-core: List all of the variables used as output, including their types and descriptions
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test' ]
  - bam:
      type: file
      description: Sorted BAM/CRAM/SAM file
      pattern: "*.{bam,cram,sam}"
  - bai:
      type: file
      description: BAM/CRAM/SAM samtools index
      pattern: "*.{bai,crai,sai}"
  - csi:
      type: file
      description: CSI samtools index
      pattern: "*.csi"
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
authors:
  - "@buehlere"
@@ -0,0 +1,68 @@
// TODO nf-core: If in doubt look at other nf-core/subworkflows to see how we are doing things! :)
//               https://github.com/nf-core/modules/tree/master/subworkflows
//               You can also ask for help via your pull request or on the #subworkflows channel on the nf-core Slack workspace:
//               https://nf-co.re/join
// TODO nf-core: A subworkflow SHOULD import at least two modules

include { FGBIO_FASTQTOBAM } from '../../../modules/nf-core/fgbio/fastqtobam/main'
include { PICARD_MERGESAMFILES } from '../../../modules/nf-core/picard/mergesamfiles/main'
include { GATK4_SAMTOFASTQ } from '../../../modules/nf-core/gatk4/samtofastq/main'
include { FASTP } from '../../../modules/nf-core/fastp/main'

workflow FASTQ_EXTRACTUMI_FGBIO_PICARD_GATK4_FASTP {

    take:
    // TODO nf-core: edit input (take) channels
    ch_fastq // channel: [ val(meta), [ fastq ] ]

    main:

    ch_versions = Channel.empty()

    // FGBIO_FASTQTOBAM: get unmerged bams
    // ch_fastq is a channel, so its elements can be processed in parallel;
    // see https://www.nextflow.io/docs/latest/faq.html?highlight=parallel
    FGBIO_FASTQTOBAM (
        ch_fastq
    )
    FGBIO_FASTQTOBAM.out.bam.map{
        meta, bam ->
        [bam]
    }.collect().map{
        bams ->
        [[id: 'unmerged_bams'], bams ]
    }.set{unmerged_bams}
    ch_versions = ch_versions.mix(FGBIO_FASTQTOBAM.out.versions) // write out versioning

    // PICARD_MERGESAMFILES: merge bam files
    PICARD_MERGESAMFILES (
        unmerged_bams
    ).bam.map {
        meta, bam ->
        def new_id = 'merged_bam'
        [[id: new_id], bam ]
    }.set {merged_bam}
    ch_versions = ch_versions.mix(PICARD_MERGESAMFILES.out.versions)
    // GATK4_SAMTOFASTQ: get fastqs from merged bam
    GATK4_SAMTOFASTQ (
        merged_bam
    ).fastq.map {
        meta, fastq ->
        def new_id = 'merged_fastq'
        [[id: new_id], fastq ]
    }.set {merged_fastq}
    ch_versions = ch_versions.mix(GATK4_SAMTOFASTQ.out.versions)

    // FASTP: run fastp on fastqs
    FASTP (
        merged_fastq, [], false, false
    )
    ch_versions = ch_versions.mix(FASTP.out.versions)
    // final emit
    emit:
    // TODO nf-core: edit emitted channels
    fastq = FASTP.out.reads

    versions = ch_versions // channel: [ versions.yml ]

}
@@ -0,0 +1,48 @@
name: "fastq_extractumi_fgbio_picard_gatk4_fastp"
## TODO nf-core: Add a description of the subworkflow and list keywords
description: Sort SAM/BAM/CRAM file
keywords:
  - sort
  - bam
  - sam
  - cram
## TODO nf-core: Add a list of the modules used in the subworkflow
modules:
  - samtools/sort
  - samtools/index
## TODO nf-core: List all of the variables used as input, including their types and descriptions
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test' ]
  - bam:
      type: file
      description: BAM/CRAM/SAM file
      pattern: "*.{bam,cram,sam}"
## TODO nf-core: List all of the variables used as output, including their types and descriptions
output:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test' ]
  - bam:
      type: file
      description: Sorted BAM/CRAM/SAM file
      pattern: "*.{bam,cram,sam}"
  - bai:
      type: file
      description: BAM/CRAM/SAM samtools index
      pattern: "*.{bai,crai,sai}"
  - csi:
      type: file
      description: CSI samtools index
      pattern: "*.csi"
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
authors:
  - "@buehlere"
8 changes: 8 additions & 0 deletions tests/config/pytest_modules.yml
@@ -3696,6 +3696,10 @@ subworkflows/fasta_newick_epang_gappa:
  - subworkflows/nf-core/fasta_newick_epang_gappa/**
  - tests/subworkflows/nf-core/fasta_newick_epang_gappa/**

subworkflows/fastq_alignsort_bwa_picard:
  - subworkflows/nf-core/fastq_alignsort_bwa_picard/**
  - tests/subworkflows/nf-core/fastq_alignsort_bwa_picard/**

subworkflows/fastq_align_bowtie2:
  - subworkflows/nf-core/fastq_align_bowtie2/**
  - tests/subworkflows/nf-core/fastq_align_bowtie2/**
@@ -3736,6 +3740,10 @@ subworkflows/fastq_download_prefetch_fasterqdump_sratools:
  - subworkflows/nf-core/fastq_download_prefetch_fasterqdump_sratools/**
  - tests/subworkflows/nf-core/fastq_download_prefetch_fasterqdump_sratools/**

subworkflows/fastq_extractumi_fgbio_picard_gatk4_fastp:
  - subworkflows/nf-core/fastq_extractumi_fgbio_picard_gatk4_fastp/**
  - tests/subworkflows/nf-core/fastq_extractumi_fgbio_picard_gatk4_fastp/**

subworkflows/fastq_fastqc_umitools_fastp:
  - subworkflows/nf-core/fastq_fastqc_umitools_fastp/**
  - tests/subworkflows/nf-core/fastq_fastqc_umitools_fastp/**
37 changes: 37 additions & 0 deletions tests/subworkflows/nf-core/fastq_alignsort_bwa_picard/main.nf
@@ -0,0 +1,37 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { FASTQ_EXTRACTUMI_FGBIO_PICARD_GATK4_FASTP } from '../../../../subworkflows/nf-core/fastq_extractumi_fgbio_picard_gatk4_fastp/main.nf'
include { FASTQ_ALIGNSORT_BWA_PICARD } from '../../../../subworkflows/nf-core/fastq_alignsort_bwa_picard/main.nf'

workflow test_fastq_alignsort_bwa_picard {
    // download the test data by running install_data.sh
    def bashScriptFile = new File('install_data.sh')

    def processBuilder = new ProcessBuilder('bash', bashScriptFile.toString())
    processBuilder.redirectOutput(ProcessBuilder.Redirect.INHERIT)
    processBuilder.redirectError(ProcessBuilder.Redirect.INHERIT)

    def process = processBuilder.start()
    process.waitFor()

    // run extract umi
    fastq = [
        [[id:'gene1', single_end:false], [file('test_nucleo/fastq/seracare_0-5_R1_001ad.fastq.gz'), file('test_nucleo/fastq/seracare_0-5_R2_001ad.fastq.gz')]],
        [[id:'gene2', single_end:false], [file('test_nucleo/fastq/seracare_0-5_R1_001ae.fastq.gz'), file('test_nucleo/fastq/seracare_0-5_R2_001ae.fastq.gz')]]
    ]
    ch_fastq = Channel.fromList(fastq)
    FASTQ_EXTRACTUMI_FGBIO_PICARD_GATK4_FASTP ( ch_fastq )



    // channels enable parallel execution: https://www.nextflow.io/docs/latest/faq.html?highlight=parallel
    // test data
    reference = [
        [id:'reference'],
        file('test_nucleo/reference/chr14_chr16.fasta')
    ]
    // workflow
    FASTQ_ALIGNSORT_BWA_PICARD ( FASTQ_EXTRACTUMI_FGBIO_PICARD_GATK4_FASTP.out.fastq, reference, 1)
}
@@ -0,0 +1,12 @@
executor.cpus = 12
executor.memory = 15.GB
process {

    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }
    withName: PICARD_ADDORREPLACEREADGROUPS {
        ext.args = "--RGID 4 --RGLB 'lib1' --RGPL 'ILLUMINA' --RGPU 'unit1' --RGSM 20"
    }
    withName: BWA_MEM {
        ext.args2 = { sort_bam ? "" : "-bh" }
    }
}
45 changes: 45 additions & 0 deletions tests/subworkflows/nf-core/fastq_alignsort_bwa_picard/test.yml
@@ -0,0 +1,45 @@
- name: fastq_alignsort_bwa_picard test_fastq_alignsort_bwa_picard
  command: nextflow run ./tests/subworkflows/nf-core/fastq_alignsort_bwa_picard -entry test_fastq_alignsort_bwa_picard -c ./tests/config/nextflow.config
  tags:
    - bwa
    - bwa/index
    - bwa/mem
    - bwamem2
    - bwamem2/index
    - bwamem2/mem
    - picard
    - picard/addorreplacereadgroups
    - subworkflows
    - subworkflows/fastq_alignsort_bwa_picard
  files:
    - path: output/bwa/bwa/chr14_chr16.amb
      md5sum: 00fb74627e074db6238dcd9bc08dc48a
    - path: output/bwa/bwa/chr14_chr16.ann
      md5sum: d8825e2fcb3cd372cd61ededfe283025
    - path: output/bwa/bwa/chr14_chr16.bwt
      md5sum: 45637ec2c011d0f73cac6c470c5b5d2b
    - path: output/bwa/bwa/chr14_chr16.pac
      md5sum: 46f856371d59e859295497c967478d31
    - path: output/bwa/bwa/chr14_chr16.sa
      md5sum: 466dbbbce2fb9528e760477ccdc2ea5b
    - path: output/bwa/merged_fastq.bam
      md5sum: cb3f0b70c106c737ccd21eb590611bab
    - path: output/fastp/merged_fastq.fastp.html
    - path: output/fastp/merged_fastq.fastp.json
      md5sum: 8c9cfed703141dd32818ce8abe9d2465
    - path: output/fastp/merged_fastq.fastp.log
    - path: output/fastp/merged_fastq_1.fastp.fastq.gz
      md5sum: fdba450358c761a3e4264164da8b60b1
    - path: output/fastp/merged_fastq_2.fastp.fastq.gz
      md5sum: ca18410c7ebc49c778fe3dd059e8fee9
    - path: output/fgbio/gene1.bam
      md5sum: fb9749c10bbf3917dbf739de827276f7
    - path: output/fgbio/gene2.bam
      md5sum: 4596c2c2a8f94d770410246fb03f3baa
    - path: output/gatk4/merged_bam_1.fastq.gz
      md5sum: 8d87d76a1a07892a4bd0b2709338831e
    - path: output/gatk4/merged_bam_2.fastq.gz
      md5sum: 94c76d2f77a71d81ac26adcf94720ecd
    - path: output/picard/aligned_bam.bam
      md5sum: 975d449b09f28bc48456b0c6942f39d4
    - path: output/picard/unmerged_bams.bam
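
Review note on the checksums: several commits in this PR redo the md5sums by hand. A small sketch of how the `files:` entries above could be regenerated from a finished test run is given here. `yml_entry` is a hypothetical helper name that is not part of this PR, and the four-space list indentation is an assumption about how the entries sit under `files:`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in the PR): print one test.yml "files" entry
# (path plus md5sum) for the file given as the first argument.
yml_entry() {
    local path="$1" sum
    if command -v md5sum >/dev/null 2>&1; then
        # GNU coreutils: output is "<hash>  <file>", keep the hash only
        sum=$(md5sum "$path" | awk '{print $1}')
    else
        # macOS fallback: md5 -q prints the hash only
        sum=$(md5 -q "$path")
    fi
    printf '    - path: %s\n      md5sum: %s\n' "$path" "$sum"
}
```

For example, `for f in output/fgbio/*.bam; do yml_entry "$f"; done` would print entries that can be pasted into test.yml. Files whose contents vary between runs (the fastp HTML report and log are listed above without checksums) should keep a bare `path:` entry.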