Rename FASTA lookup helper: sequence_lookup_with_ens_fallback → lookup_sequence_with_version_fallback #352

Merged: iskandr merged 1 commit into main from rename-version-fallback-helper on May 13, 2026

Conversation

iskandr (Contributor) commented May 13, 2026

Summary

The helper I added in #350 / v2.9.6 was misnamed. ENS isn't the fallback — both Ensembl and GENCODE protein/transcript IDs start with ENS. The actual fallback is to a version-stripped form of the same ID; the ENS-prefix check is just a guard that says "this ID has a version we know how to strip safely" (in contrast to e.g. TAIR AT1G01010.1 where .1 is an isoform suffix, not a version).
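A minimal sketch of the fallback described above, not the actual pyensembl code: the function name, its signature, and the plain-dict stand-in for the FASTA index are assumptions made for illustration. It tries the literal ID first and retries with the trailing `.N` removed only when the ID starts with `ENS`.

```python
from typing import Mapping, Optional


def lookup_with_version_fallback_sketch(
    sequences: Mapping[str, str], identifier: str
) -> Optional[str]:
    """Hypothetical stand-in for the renamed helper.

    `sequences` plays the role of a parsed FASTA index (ID -> sequence).
    """
    # 1. Literal lookup: works whenever the ID matches the FASTA key exactly.
    seq = sequences.get(identifier)
    if seq is not None:
        return seq

    # 2. Fallback: strip a trailing ".<digits>" version, but only for IDs
    #    that start with "ENS", where the suffix is known to be a version.
    base, dot, suffix = identifier.rpartition(".")
    if identifier.startswith("ENS") and dot and suffix.isdigit():
        return sequences.get(base)

    # Anything else (e.g. a TAIR isoform ID) is left untouched.
    return None
```

With an index keyed by unversioned IDs, `ENST00000456328.2` resolves through the fallback, while `AT1G01010.1` is never rewritten to `AT1G01010`, so an isoform suffix cannot be mistaken for a version.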

Renamed to lookup_sequence_with_version_fallback and the docstring is rewritten to correctly describe both Ensembl and GENCODE as having versioned IDs (they just split that information differently across the GTF / FASTA pair).
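To make the "both are versioned, just split differently" point concrete, here is a hedged sketch that normalizes the two attribute layouts into a common `(id, version)` pair. The function name is hypothetical and the attribute dicts are hand-written examples; this is not how pyensembl parses GTF files. It assumes Ensembl-style records carry a separate `transcript_version` attribute and GENCODE-style records embed the version in `transcript_id`.

```python
from typing import Mapping, Optional, Tuple


def split_id_and_version(
    attrs: Mapping[str, str], feature: str = "transcript"
) -> Tuple[str, Optional[int]]:
    """Return (unversioned_id, version) from GTF-like attributes (illustrative only)."""
    raw_id = attrs[f"{feature}_id"]

    # Ensembl style: the version lives in its own "<feature>_version" attribute.
    explicit = attrs.get(f"{feature}_version")
    if explicit is not None:
        return raw_id, int(explicit)

    # GENCODE style: the version is embedded as a trailing ".<digits>".
    base, dot, suffix = raw_id.rpartition(".")
    if raw_id.startswith("ENS") and dot and suffix.isdigit():
        return base, int(suffix)

    # Non-ENS IDs (e.g. TAIR) keep their suffix; no version is inferred.
    return raw_id, None


ensembl_style = {"transcript_id": "ENST00000456328", "transcript_version": "2"}
gencode_style = {"transcript_id": "ENST00000456328.2"}
same = split_id_and_version(ensembl_style) == split_id_and_version(gencode_style)
```

Both example records normalize to the same pair, which is why a lookup keyed on the unversioned ID can serve either source.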

Internal helper only — not exported from pyensembl/__init__.py, was added today in v2.9.6, so no external callers can depend on the old name. No back-compat shim.

Three call sites updated:

  • Transcript.sequence
  • Transcript.protein_sequence
  • Genome.transcript_sequence(id) / Genome.protein_sequence(id)

Bumps version to 2.9.7.

Test plan

  • pytest tests/test_versioned_protein_fasta.py tests/test_mouse.py tests/test_tair10_complete.py tests/test_versions.py tests/test_sequence_data.py tests/test_transcript_sequences.py (20 passed locally)
  • ./lint.sh
  • CI on PR

…on_fallback

The old name was misleading: the fallback isn't to "ENS" (both Ensembl
and GENCODE IDs start with ENS); the fallback is to a version-stripped
form. The ENS prefix is just a safety gate so we don't strip non-Ensembl
.N suffixes (e.g. TAIR isoform suffixes like AT1G01010.1).

Docstring rewritten to reflect that both Ensembl and GENCODE produce
versioned IDs - the formats just split that information differently
(Ensembl in a separate *_version attribute, GENCODE embedded in the ID
itself). The helper exists for the GENCODE case where the GTF stores
the full versioned ID and a literal FASTA lookup would miss.

No back-compat shim: this helper was added today in v2.9.6 and isn't
exposed in __init__.py, so no external callers can depend on the old
name.

Bump version to 2.9.7.
@coveralls

Coverage Status

coverage: 84.971% (remained the same) for rename-version-fallback-helper into main

iskandr merged commit 5e08bb6 into main on May 13, 2026 (10 checks passed)
iskandr deleted the rename-version-fallback-helper branch on May 13, 2026 14:37

2 participants