1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -47,6 +47,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#274](https://github.com/nf-core/phaseimpute/pull/274) - Fix issue with compressed reference genome by adding `.gzi` file for `BCFTOOLS_MPILEUP`
- [#275](https://github.com/nf-core/phaseimpute/pull/275) - Fix nf-test errors with latest-everything.
- [#281](https://github.com/nf-core/phaseimpute/pull/281) - Fix `diffchr()` function.
- [#293](https://github.com/nf-core/phaseimpute/pull/293) - Fix nf-core and nextflow linting.

### `Dependencies`

2 changes: 1 addition & 1 deletion README.md
@@ -106,7 +106,7 @@ We thank the following people for their extensive assistance in the development

## Contributions and Support

If you would like to contribute to this pipeline, please see the [contributing guidelines](docs/CONTRIBUTING.md). Further development tips can be found in the [development documentation](docs/development.md).
If you would like to contribute to this pipeline, please see the [contributing guidelines](docs/CONTRIBUTING.md).

For further information or help, don't hesitate to get in touch on the [Slack `#phaseimpute` channel](https://nfcore.slack.com/channels/phaseimpute) (you can join with [this invite](https://nf-co.re/join/slack)).

2 changes: 0 additions & 2 deletions assets/methods_description_template.yml
@@ -3,8 +3,6 @@ description: "Suggested text and references to use when describing pipeline usag
section_name: "nf-core/phaseimpute Methods Description"
section_href: "https://github.com/nf-core/phaseimpute"
plot_type: "html"
## TODO nf-core: Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
## You inject any metadata in the Nextflow '${workflow}' object
data: |
  <h4>Methods</h4>
  <p>Data was processed using nf-core/phaseimpute v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (<a href="https://doi.org/10.1038/s41587-020-0439-x">Ewels <em>et al.</em>, 2020</a>), utilising reproducible software environments from the Bioconda (<a href="https://doi.org/10.1038/s41592-018-0046-7">Grüning <em>et al.</em>, 2018</a>) and Biocontainers (<a href="https://doi.org/10.1093/bioinformatics/btx192">da Veiga Leprevost <em>et al.</em>, 2017</a>) projects.</p>
31 changes: 30 additions & 1 deletion docs/CONTRIBUTING.md
@@ -182,4 +182,33 @@ If you update images or graphics, follow the nf-core [style guidelines](https://

## Pipeline specific contribution guidelines

<!-- TODO nf-core: Add any pipeline specific contribution guidelines here, such as coding styles, procedures, checklists etc. -->
The `nf-core/phaseimpute` pipeline aims to strictly follow the latest Nextflow and nf-core guidelines.
As such, each local module and subworkflow should be properly written and unit-tested with nf-test.

Local components should only be considered when their use is specific to this pipeline.
Otherwise, they should be part of the `nf-core/modules` repository.

### Channel management and combination

All channels need to be identified by a meta map. To follow which information is available, the `meta` argument
is suffixed with a combination of the following capital letters:

- I : individual id
- P : panel id
- R : region used
- M : map used
- T : tool used
- G : reference genome used (is it needed?)
- S : simulation (depth or genotype array)

Therefore, the following channel operation example includes a meta map containing the panel id with the region and tool used:

```nextflow
ch_panel_for_impute.map {
    metaPRT, vcf, index -> ...
}
```
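
As a hedged illustration of the naming convention above, the sketch below drops the tool key from a `metaPRT` map, so the resulting map can be named `metaPR`. The meta keys (`panel_id`, `chr`, `tools`) and the file names are assumptions made for this example and are not taken from the pipeline.

```nextflow
workflow {
    // Hypothetical channel: the meta keys and file names are placeholders.
    ch_panel_for_impute = channel.of(
        [ [panel_id: '1000G', chr: 'chr22', tools: 'glimpse1'], 'panel.vcf.gz', 'panel.vcf.gz.csi' ]
    )

    ch_panel_for_impute
        .map { metaPRT, vcf, index ->
            // Keep only the panel and region keys: the suffix changes from PRT to PR.
            def metaPR = metaPRT.subMap(['panel_id', 'chr'])
            [ metaPR, vcf, index ]
        }
        .view()
}
```

Renaming the closure argument whenever a key is added or removed keeps the suffix in sync with the actual content of the map.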

### Release names

The names of releases are composed of a color and a dog breed.
26 changes: 0 additions & 26 deletions docs/development.md

This file was deleted.

61 changes: 61 additions & 0 deletions modules/local/addcolumns/meta.yml
@@ -0,0 +1,61 @@
name: addcolumns
description: Add metadata information to an existing file as additional columns
keywords:
  - columns
  - metadata
  - awk
tools:
  - gawk:
      description: "GNU awk"
      homepage: "https://www.gnu.org/software/gawk/"
      documentation: "https://www.gnu.org/software/gawk/manual/"
      tool_dev_url: "https://www.gnu.org/prep/ftp.html"
      licence:
        - "GPL v3"
      identifier: ""
input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        The following keys are added in additional columns to the input file.
        e.g. [ id:'test', depth:1, gparray:'illumina', tools:'glimpse', panel:'1000G' ]
  - input:
      type: file
      description: Textual format file
output:
  txt:
    - meta:
        type: map
        description: |
          Groovy Map containing sample information
          e.g. [ id:'test', single_end:false ]
    - txt:
        type: file
        description: Resulting file with additional columns containing the metadata information
        pattern: "*.txt"
  versions_gawk:
    - - ${task.process}:
          type: string
          description: The name of the process
      - gawk:
          type: string
          description: The name of the tool
      - awk -Wversion | sed '1!d; s/.*Awk //; s/,.*//':
          type: eval
          description: The expression to obtain the version of the tool
topics:
  versions:
    - - ${task.process}:
          type: string
          description: The name of the process
      - gawk:
          type: string
          description: The name of the tool
      - awk -Wversion | sed '1!d; s/.*Awk //; s/,.*//':
          type: eval
          description: The expression to obtain the version of the tool
authors:
  - "@louislenezet"
maintainers:
  - "@louislenezet"
13 changes: 8 additions & 5 deletions modules/local/listtofile/meta.yml
@@ -4,11 +4,14 @@ keywords:
  - list
  - gawk
tools:
  - annotate:
      description: |
        Extract the file names from a list of path and register them into a file.
        The corresponding identifier can be added to the file in a second column
        or in a separate file.
  - gawk:
      description: "GNU awk"
      homepage: "https://www.gnu.org/software/gawk/"
      documentation: "https://www.gnu.org/software/gawk/manual/"
      tool_dev_url: "https://www.gnu.org/prep/ftp.html"
      licence:
        - "GPL v3"
      identifier: ""
input:
  - - meta:
        type: map
8 changes: 5 additions & 3 deletions modules/local/vcfchrextract/meta.yml
@@ -6,12 +6,14 @@ keywords:
  - head
  - contig
tools:
  - head:
      description: Extract header from variant calling file.
  - query:
      description: |
        Extracts fields from VCF or BCF files and outputs them in user-defined format.
      homepage: http://samtools.github.io/bcftools/bcftools.html
      documentation: https://samtools.github.io/bcftools/bcftools.html#head
      documentation: http://www.htslib.org/doc/bcftools.html
      doi: 10.1093/bioinformatics/btp352
      licence: ["MIT"]
      identifier: biotools:bcftools
input:
  - meta:
      type: map
2 changes: 1 addition & 1 deletion ro-crate-metadata.json

Large diffs are not rendered by default.

74 changes: 0 additions & 74 deletions subworkflows/local/bam_impute_quilt2/main.nf

This file was deleted.

39 changes: 0 additions & 39 deletions subworkflows/local/bam_impute_quilt2/meta.yml

This file was deleted.

24 changes: 12 additions & 12 deletions subworkflows/local/prepare_genome/main.nf
Collaborator

I don't understand the decision of having a prepare_genome subworkflow. What was the issue before with how things were done? Thanks.

Collaborator Author

This was a recommendation from Maxime to move the genome indexing out of the local phaseimpute utils subworkflow.
I agree that the subworkflow is a bit trivial, but the "complex" channel handling makes it a reasonable fit for a local subworkflow, from my point of view.

@@ -20,23 +20,23 @@ workflow PREPARE_GENOME {
[]
])

def need_faidx = !fasta_fai_path || (is_compressed && !fasta_gzi_path)
if (need_faidx) {
SAMTOOLS_FAIDX(ch_fasta, false)
}

if (fasta_fai_path) {
ch_fai = channel.of(file(fasta_fai_path, checkIfExists:true))
} else {
SAMTOOLS_FAIDX(ch_fasta, false)
ch_fai = SAMTOOLS_FAIDX.out.fai.map{ _meta, fasta_fai -> fasta_fai }
}
if (is_compressed) {
if (fasta_gzi_path) {
ch_gzi = channel.of(file(fasta_gzi_path, checkIfExists:true))
} else if (!fasta_fai_path) {
ch_gzi = SAMTOOLS_FAIDX.out.gzi.map{ _meta, gzi -> gzi }
} else {
SAMTOOLS_FAIDX(ch_fasta, false)
ch_gzi = SAMTOOLS_FAIDX.out.gzi.map{ _meta, gzi -> gzi }
}

if (!is_compressed) {
ch_gzi = channel.of([[]])
} else if (fasta_gzi_path) {
ch_gzi = channel.of(file(fasta_gzi_path, checkIfExists:true))
} else {
ch_gzi = channel.of([])
ch_gzi = SAMTOOLS_FAIDX.out.gzi.map{ _meta, gzi -> gzi }
}

ch_fasta_fai_gzi = ch_fasta
@@ -46,5 +46,5 @@ workflow PREPARE_GENOME {
.collect()

emit:
ch_fasta_fai_gzi
ch_fasta_fai_gzi = ch_fasta_fai_gzi
}
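
The condition that triggers `SAMTOOLS_FAIDX` in the hunk above can be checked in isolation. The sketch below is only an illustration: the helper name `needFaidx` and the example paths are hypothetical, and only the boolean expression is copied from the diff.

```nextflow
// Hypothetical helper mirroring the `need_faidx` expression from the diff.
def needFaidx(fasta_fai_path, fasta_gzi_path, is_compressed) {
    return !fasta_fai_path || (is_compressed && !fasta_gzi_path)
}

workflow {
    // Placeholder paths; null stands for "not provided".
    // No .fai supplied: indexing is needed.
    println(needFaidx(null, null, false))
    // Compressed FASTA with a .fai but no .gzi: indexing is still needed.
    println(needFaidx('genome.fa.gz.fai', null, true))
    // Compressed FASTA with both indexes supplied: nothing to run.
    println(needFaidx('genome.fa.gz.fai', 'genome.fa.gz.gzi', true))
    // Uncompressed FASTA with a .fai supplied: nothing to run.
    println(needFaidx('genome.fa.fai', null, false))
}
```

Computing the flag once means `SAMTOOLS_FAIDX` is invoked at a single place, whichever of the two index files is missing.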
51 changes: 51 additions & 0 deletions subworkflows/local/prepare_genome/meta.yml
@@ -0,0 +1,51 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json
name: "PREPARE_GENOME"
description: |
  Subworkflow to prepare the reference genome for imputation.
  This includes indexing the genome and preparing the necessary files.
keywords:
  - genome
  - reference
  - indexing
components:
  - samtools/faidx
input:
  - genome:
      type: string
      description: Reference genome name
  - fasta_path:
      type: file
      description: Reference genome FASTA file
      pattern: "*.{fa,fasta,fa.gz,fasta.gz}"
  - fasta_index_path:
      type: file
      description: Reference genome FASTA index file
      pattern: "*.{fai,faidx}"
  - fasta_gzi_path:
      type: file
      description: Reference genome FASTA gzi index file (optional)
      pattern: "*.gzi"
output:
  - ch_fasta_fai_gzi:
      type: channel
      description: Channel containing the reference genome FASTA file, its index and gzi index if present
      structure:
        - meta:
            type: map
            description: Metadata map that will be combined with the input data map
        - fasta:
            type: file
            description: Reference genome FASTA file
            pattern: "*.{fa,fasta,fa.gz,fasta.gz}"
        - fai:
            type: file
            description: Reference genome FASTA index file
            pattern: "*.{fai,faidx}"
        - gzi:
            type: file
            description: Reference genome FASTA gzi index file (optional)
            pattern: "*.gzi"
authors:
  - "@louislenezet"
maintainers:
  - "@louislenezet"