Conversation
There was a problem hiding this comment.
Pull request overview
This PR refines the Talos Nextflow pipelines and container build setup by standardising DSL2 usage, aligning output handling with Nextflow “workflow outputs”, and tightening script execution/IO conventions.
Changes:
- Add/standardise
nextflow.enable.dsl=2, remove mostpublishDirusage, and expand workflow outputs to include a PanelApp artefact. - Improve robustness/consistency across modules (e.g.,
set -euo pipefail, safer cleanup, corrected channel usage/capitalisation, VCF/index tuple wiring). - Optimise Docker build and CI version extraction (switch VERSION detection to
ARG VERSION=..., useuv syncin the main Dockerfile).
Reviewed changes
Copilot reviewed 34 out of 34 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
talos_only.nf |
Enables DSL2 and updates how prior annotation outputs are discovered; adds PanelApp to outputs. |
src/talos/cpg_internal_scripts/cpgflow_jobs/make_config.py |
Minor comment wording fix (“fault-tolerant”). |
src/talos/cpg_internal_scripts/CPG_Dockerfile |
Docker build layer/flow adjustments; consolidates stages and aligns version handling. |
nextflow_schema.json |
Adds a Nextflow parameter schema for config validation/documentation. |
nextflow/talos.nf |
Renames PanelApp path variable and emits PanelApp output channel. |
nextflow/modules/talos/ValidateMOI/main.nf |
Removes publishDir; adds stricter shell flags. |
nextflow/modules/talos/UnifiedPanelAppParser/main.nf |
Removes publishDir; adds explicit script: and stricter shell flags. |
nextflow/modules/talos/StartupChecks/main.nf |
Removes commented publishDir; uses set -euo pipefail. |
nextflow/modules/talos/RunHailFiltering/main.nf |
Removes publishDir; uses set -euo pipefail. |
nextflow/modules/talos/HPOFlagging/main.nf |
Removes publishDir; adds explicit script: and stricter shell flags. |
nextflow/modules/talos/CreateTalosHTML/main.nf |
Removes publishDir; uses set -euo pipefail. |
nextflow/modules/prep/ResummariseRawSubmissions/main.nf |
Emits VCF+index as a tuple; adds stricter shell flags; safer cleanup. |
nextflow/modules/prep/ParseManeIntoJson/main.nf |
Adds set -euo pipefail. |
nextflow/modules/prep/ParseAlphaMissense/main.nf |
Adds set -euo pipefail. |
nextflow/modules/prep/MakeClinvarbitrationPm5/main.nf |
Adds set -euo pipefail; safer cleanup. |
nextflow/modules/prep/EncodeAlphaMissense/main.nf |
Adds set -euo pipefail. |
nextflow/modules/prep/DownloadPanelApp/main.nf |
Adds set -euo pipefail. |
nextflow/modules/prep/DownloadClinVarFiles/main.nf |
Adds set -euo pipefail. |
nextflow/modules/prep/CreateRoiFromGff3/main.nf |
Minor comment tweak; adds set -euo pipefail. |
nextflow/modules/prep/ConvertSpliceVarDb/main.nf |
Adds set -euo pipefail. |
nextflow/modules/prep/AnnotateClinvarWithBcftools/main.nf |
Accepts (vcf, vcf_idx) tuple input; adds set -euo pipefail. |
nextflow/modules/annotation/SplitVcf/main.nf |
Removes commented publishDir; adds set -euo pipefail. |
nextflow/modules/annotation/NormaliseAndRegionFilterVcf/main.nf |
Removes commented publishDir; adds set -euo pipefail. |
nextflow/modules/annotation/MergeVcfsWithBcftools/main.nf |
Removes commented publishDir; switches merge output to -Oz; adds set -euo pipefail. |
nextflow/modules/annotation/AnnotatedVcfIntoMatrixTable/main.nf |
Removes publishDir; uses set -euo pipefail; safer checkpoint cleanup. |
nextflow/modules/annotation/AnnotateWithEchtvar/main.nf |
Removes commented publishDir; uses set -euo pipefail. |
nextflow/modules/annotation/AnnotateCsqWithBcftools/main.nf |
Removes commented publishDir; adds set -euo pipefail. |
nextflow/annotation.nf |
Fixes Channel.fromPath capitalisation consistency. |
nextflow.config |
Removes redundant params, uses 1.GB-style memory values, and sets per-process resources. |
main.nf |
Enables DSL2 and emits PanelApp as a workflow output. |
docs/NextflowConfiguration.md |
Updates docs to remove outdir and rely on outputDir. |
README.md |
Updates output path documentation to reference outputDir. |
Dockerfile |
Simplifies stages and installs Python deps via uv sync; switches VERSION to ARG. |
.github/workflows/get_version.py |
Updates version extraction to read ARG VERSION=.... |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .map { row -> tuple( | ||
| row.cohort, | ||
| files("${params.outdir}/${row.cohort}_outputs/*.mt", type: 'dir'), | ||
| Channel.fromPath("${outputDir}/${row.cohort}_outputs/*.mt", checkIfExists: true), | ||
| file(row.pedigree, checkIfExists: true), | ||
| file(row.config, checkIfExists: true), |
There was a problem hiding this comment.
Channel.fromPath(...) returns a channel; embedding it inside the tuple emitted by ch_inputs.map means ch_mts becomes a nested channel rather than a list/path value. Downstream (TALOS expects ch_mts items shaped like tuple(cohort, mts, ...), where mts is a path or collection of paths, not a Channel), so this will break joins and process inputs. Use files("${workflow.outputDir}/${row.cohort}_outputs/*.mt", type: 'dir') (or build a channel of .mt paths and groupTuple(by:0) like ANNOTATION does) instead of Channel.fromPath inside the tuple.
| def extract_version_from_file(file_path: str) -> str | None: | ||
| """ | ||
| Extract the version from a Dockerfile by searching for a line like: | ||
| ENV VERSION=1.0.0 | ||
| """ | ||
| with open(file_path) as f: | ||
| content = f.read() | ||
| pattern = re.compile(r'^\s*ENV\s+VERSION\s*=\s*([^\s]+)', re.MULTILINE) | ||
| pattern = re.compile(r'^\s*ARG\s+VERSION\s*=\s*([^\s]+)', re.MULTILINE) | ||
| match = pattern.search(content) |
There was a problem hiding this comment.
The version-extraction regex now looks for ARG VERSION=..., but the function docstring still says it searches for ENV VERSION=... (and the later error message still references ENV). Update the documentation/error text to match the new ARG VERSION convention to avoid confusing future maintainers.
Claude initiated
Fixes
Proposed Changes
nextflow.enable.dsl=2etc.)"1 GB"instead of1.GB- same result, but though string parsing instead of plain Groovyset -euo pipefailadded to all script blocks),rm -fto prevent failure if files (which are definitely written) stop being writtenscript:directives to some filesRemains a failure around workflow outputs..
Checklist