ML-based splice prediction for variants outside canonical splice windows #297

@iskandr

Background

SpliceOutcomeSet (#262, PR #292) wraps variants that the existing classifier already flags as splice-adjacent: SpliceDonor, SpliceAcceptor, ExonicSpliceSite, IntronicSpliceSite. That covers the canonical donor/acceptor dinucleotides, the last 3 bases of each exon, and the first 3–6 intronic bases.

But plenty of splice-altering variants sit outside that window:

  • Exonic splicing enhancer (ESE) / silencer (ESS) disruption: a silent or missense variant in the middle of an exon can disrupt an ESE motif and cause the exon to be skipped in the mature transcript. varcode today emits a bare Substitution/Silent for these; SpliceOutcomeSet cannot wrap them because the classifier never flagged them.
  • Deep intronic variants that create or disrupt a cryptic splice site hundreds of bases from the canonical boundary.
  • Branch point disruption (~20–50 bp upstream of the acceptor).

Detecting these requires ML-based prediction (SpliceAI, SpliceTransformer, MMSplice, Pangolin) or RNA evidence. Without either, the possibility-set model silently under-reports splice consequences for exonic/intronic variants outside the canonical window.
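
To make "outside the canonical window" concrete, here is a minimal sketch of the check. It assumes a variant has already been reduced to an offset from its nearest exon boundary; `BoundaryOffset`, `in_canonical_window`, and `needs_ml_scoring` are hypothetical names, not existing varcode API, and the window sizes are taken from the description above (using 6 as the outer intronic bound), so the real classifier's thresholds may differ.

```python
from dataclasses import dataclass

# Window sizes taken from the issue text; the actual classifier may differ.
EXONIC_WINDOW = 3    # "last 3 bases of each exon"
INTRONIC_WINDOW = 6  # upper end of "first 3-6 intronic bases"


@dataclass(frozen=True)
class BoundaryOffset:
    """Hypothetical helper: position of a variant relative to the nearest exon boundary."""
    exonic: bool   # True if the variant lies inside the exon
    distance: int  # 1 = the base immediately adjacent to the boundary


def in_canonical_window(offset: BoundaryOffset) -> bool:
    """True if the existing classifier would already flag the variant as splice-adjacent."""
    limit = EXONIC_WINDOW if offset.exonic else INTRONIC_WINDOW
    return 1 <= offset.distance <= limit


def needs_ml_scoring(offset: BoundaryOffset) -> bool:
    """Mid-exon (ESE/ESS), branch-point and deep intronic variants all land here."""
    return not in_canonical_window(offset)
```

Under these assumptions, a silent variant 40 bases into an exon or an intronic variant 300 bases from the boundary falls through to `needs_ml_scoring`, which is exactly the population this issue is about.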

Scope

  • Integrate a splice-prediction scorer as an optional dependency (SpliceAI is the obvious first target).
  • When a variant is outside the canonical window but the scorer flags a high-probability splice change, wrap it in a SpliceOutcomeSet (or the successor multi-outcome abstraction from the OutcomeSet generalization work).
  • Keep the dependency optional: the default splice_outcomes=True uses canonical-window classification only, and ML-informed wrapping is a second opt-in (e.g. splice_outcomes="ml" or splice_scorer=...).
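
A rough sketch of how the opt-in could be wired, to make the proposal easier to discuss. Everything here is hypothetical: `SpliceScorer`, `DictScorer`, `PlainEffect`, `MLSpliceCandidate`, and `annotate` are illustrative stand-ins rather than varcode or SpliceAI API, the 0.5 threshold is a placeholder, and `MLSpliceCandidate` only stands in for whatever SpliceOutcomeSet (or its successor) ends up looking like. Only the `splice_outcomes` and `splice_scorer` parameter names come from the bullet above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol, Union


class SpliceScorer(Protocol):
    """Hypothetical interface an optional backend (e.g. a SpliceAI wrapper) would implement."""

    def max_delta_score(self, variant_key: str) -> float:
        """Return the largest predicted splice-change probability for the variant."""
        ...


class DictScorer:
    """Toy scorer backed by precomputed scores, so the sketch runs without any ML dependency."""

    def __init__(self, scores: Dict[str, float]):
        self.scores = scores

    def max_delta_score(self, variant_key: str) -> float:
        return self.scores.get(variant_key, 0.0)


@dataclass(frozen=True)
class PlainEffect:
    variant_key: str
    name: str  # e.g. "Silent" or "Substitution"


@dataclass(frozen=True)
class MLSpliceCandidate:
    """Stand-in for a SpliceOutcomeSet wrapping an effect the classifier did not flag."""
    effect: PlainEffect
    score: float


def annotate(
    effect: PlainEffect,
    in_canonical_window: bool,
    splice_outcomes: Union[bool, str] = True,
    splice_scorer: Optional[SpliceScorer] = None,
    threshold: float = 0.5,  # placeholder cutoff, not a recommendation
) -> Union[PlainEffect, MLSpliceCandidate]:
    # Default behaviour is unchanged: ML wrapping only happens on explicit opt-in,
    # and variants already in the canonical window stay with the existing classifier.
    if splice_outcomes != "ml" or in_canonical_window:
        return effect
    if splice_scorer is None:
        raise ValueError('splice_outcomes="ml" requires a splice_scorer')
    score = splice_scorer.max_delta_score(effect.variant_key)
    if score >= threshold:
        return MLSpliceCandidate(effect=effect, score=score)
    return effect
```

Taking the scorer as an injected object keeps the ML package out of the import path entirely unless the caller constructs one, which is one way to satisfy the "optional dependency" requirement.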

Related
