This repository provides a reproducible workflow for antisense oligonucleotide (ASO) masking experiments using AlphaGenome. The analysis proceeds in four steps:
- Visualize the gene and context-specific model outputs
- Generate ASO-masked sequence variants across a target window
- Predict effects and compute ASO impact scores
- Visualize ASO scores as sequence logos around the target exon
The alphagenome library is used as-is via its public API.
Get an AlphaGenome API key: https://deepmind.google.com/science/alphagenome/
Create and activate a Python environment (Python ≥3.10 recommended):
conda create -y -n alphagenome_env python=3.11
conda activate alphagenome_env
pip install -e alphagenome biopython pandas numpy matplotlib
Download primary assembly FASTA:
cd data
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_46/GRCh38.primary_assembly.genome.fa.gz
gunzip GRCh38.primary_assembly.genome.fa.gz
cd ..
Note: AlphaGenome currently supports GENCODE up to v46.
Edit parameters (e.g., config_private/SMN2.json) before running aso.ipynb:
- gene_symbol: Target gene (e.g., SMN2)
- dna_api_key: AlphaGenome API key
- ontology_terms: Context identifiers (UBERON/CL ontology). See results/metadata*.csv
- requested_outputs: Prediction types (e.g., RNA_SEQ, SPLICE_SITE_USAGE)
- exon_intervals: Genomic start/end of target exon
- flank: Bases around exon to include in ASO window (<250 recommended)
- ASO_length: Length of the ASO masking window
- strand: Track strand filter (+, -, stranded, unstranded, all)
- track_filter: Substring to filter tracks (optional)
- SNV: Single nucleotide variant positions (0-based index; currently SNV is the only supported variant type).
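As a rough illustration of the parameters above, the following sketch builds a configuration as a Python dictionary and runs a few sanity checks before it is written to JSON. The value types (lists versus strings), the placeholder values, the set of required keys, and the `check_config` helper are all assumptions made for this example; they are not taken from the notebook, so compare against an existing file such as config_private/SMN2.json before relying on them.

```python
import json

# Hypothetical configuration: every value below is a placeholder, not real data.
config = {
    "gene_symbol": "SMN2",
    "dna_api_key": "YOUR_API_KEY",
    "ontology_terms": ["UBERON:XXXXXXX"],   # placeholder ID; see results/metadata*.csv
    "requested_outputs": ["RNA_SEQ", "SPLICE_SITE_USAGE"],
    "exon_intervals": [1000, 1054],         # placeholder start/end, not SMN2 coordinates
    "flank": 200,
    "ASO_length": 18,
    "strand": "all",
    "track_filter": "",                      # optional
    "SNV": [],
}

# Assumption: everything except track_filter is treated as required.
REQUIRED_KEYS = {
    "gene_symbol", "dna_api_key", "ontology_terms", "requested_outputs",
    "exon_intervals", "flank", "ASO_length", "strand", "SNV",
}
ALLOWED_STRANDS = {"+", "-", "stranded", "unstranded", "all"}


def check_config(cfg):
    """Return a list of problems found in cfg; an empty list means none were found."""
    problems = []
    missing = sorted(REQUIRED_KEYS - set(cfg))
    if missing:
        problems.append("missing keys: " + ", ".join(missing))
    if "strand" in cfg and cfg["strand"] not in ALLOWED_STRANDS:
        problems.append("strand must be one of " + ", ".join(sorted(ALLOWED_STRANDS)))
    if cfg.get("flank", 0) >= 250:
        problems.append("flank is 250 or more (under 250 is recommended)")
    return problems


if __name__ == "__main__":
    print(check_config(config))          # list of problems, if any
    print(json.dumps(config, indent=2))  # text that could be saved as a config file
```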
Tips:
- Ensure results_dir exists or let the notebook create it; ASO outputs are saved under results_dir/ASO/.
- For SNVs, set position relative to the resized interval start (0-based); the notebook converts to 1-based for AlphaGenome.
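The SNV coordinate convention in the tip above can be sketched as follows. This is only an illustration of the arithmetic, assuming the resized interval start is itself a 0-based genomic coordinate; the function name `snv_to_one_based` is hypothetical and does not come from the notebook.

```python
def snv_to_one_based(interval_start, relative_position):
    """Map a 0-based offset inside the resized interval to a 1-based genomic position.

    Assumes interval_start is a 0-based genomic coordinate.
    """
    if relative_position < 0:
        raise ValueError("relative_position must be non-negative")
    # The offset moves along the interval; the final +1 switches from 0-based to 1-based.
    return interval_start + relative_position + 1


if __name__ == "__main__":
    # The first base of an interval starting at 0-based coordinate 1000
    # is reported as 1-based position 1001.
    print(snv_to_one_based(1000, 0))
```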
ASO experiments mask a short window (length = ASO_length) with N bases while sliding across a target region around the exon. For each masked variant, AlphaGenome predicts selected outputs (e.g., RNA-seq coverage, splice-site usage). Differences relative to the reference are aggregated into ASO impact scores and visualized as sequence logos over the window.
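The masking step described above can be sketched in plain Python. This is a minimal illustration, not the notebook's implementation: it assumes a stride of one base, masks that lie fully inside the window, and coordinates given as 0-based offsets into the sequence string. The `impact_score` helper uses a simple mean difference as a stand-in, since the aggregation actually used for the ASO impact scores is not specified here.

```python
def masked_variants(sequence, window_start, window_end, aso_length):
    """Yield (mask_start, masked_sequence) for each ASO placement in [window_start, window_end)."""
    if aso_length <= 0:
        raise ValueError("aso_length must be positive")
    if window_end - window_start < aso_length:
        raise ValueError("window is shorter than the ASO")
    # Slide the mask one base at a time, keeping it fully inside the window.
    for start in range(window_start, window_end - aso_length + 1):
        masked = sequence[:start] + "N" * aso_length + sequence[start + aso_length:]
        yield start, masked


def impact_score(reference_track, masked_track):
    """Hypothetical score: mean of (masked - reference) over paired track values."""
    if len(reference_track) != len(masked_track):
        raise ValueError("tracks must have the same length")
    diffs = [m - r for r, m in zip(reference_track, masked_track)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    toy_sequence = "ACGTACGTAC"  # toy sequence, not genomic data
    for start, masked in masked_variants(toy_sequence, 2, 8, 3):
        print(start, masked)
```

In a real run, each masked sequence would be sent to AlphaGenome and the returned tracks compared with the reference prediction; that call is omitted here because its exact form depends on the alphagenome API and the notebook.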
- ${results_dir}/ASO/{config}_ASO_scores.csv: Per-ASO scores (one row per mask position) for each requested output type.
- ${results_dir}/ASO/{config}_ASO_{OUTPUT}.bed: Top and bottom ASOs colored by effect size for genome browser tracks.
- ${results_dir}/ASO/{config}_ASO_{OUTPUT}_full.bed: Full set of ASO windows (all positions) for the output type.
- SeqLogo plots and overlaid tracks are rendered inline in aso.ipynb for quick inspection.
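To make the idea of a colored top/bottom BED track concrete, here is a small sketch that selects the highest- and lowest-scoring ASO windows and formats them as BED9 lines with an itemRgb column. The column layout, the red/blue color choice, the naming scheme, and the `top_bottom_bed` function are assumptions for illustration; the files written by the notebook may differ.

```python
def top_bottom_bed(windows, chrom, n_each=2):
    """Format the n_each lowest and highest scoring windows as BED9 lines.

    windows: list of (start, end, score) tuples with 0-based, half-open coordinates.
    """
    ranked = sorted(windows, key=lambda w: w[2])
    if len(ranked) <= 2 * n_each:
        selected = ranked  # too few windows to split, so keep them all
    else:
        selected = ranked[:n_each] + ranked[-n_each:]
    lines = []
    for start, end, score in selected:
        # Assumed color scheme: red for positive scores, blue otherwise.
        rgb = "255,0,0" if score > 0 else "0,0,255"
        name = f"ASO_{start}_{score:+.3f}"
        # BED9 columns: chrom, start, end, name, score, strand,
        # thickStart, thickEnd, itemRgb. The BED score column must be an
        # integer from 0 to 1000, so the ASO score is kept in the name instead.
        fields = [chrom, start, end, name, 0, ".", start, end, rgb]
        lines.append("\t".join(str(f) for f in fields))
    return lines


if __name__ == "__main__":
    # Toy windows with made-up scores, for illustration only.
    toy = [(100, 118, 0.5), (101, 119, -0.2), (102, 120, 0.1),
           (103, 121, -0.9), (104, 122, 0.0)]
    print("\n".join(top_bottom_bed(toy, "chr5", n_each=1)))
```

Genome browsers generally honor the itemRgb column only when the track line sets itemRgb="On", so a header line may be needed when loading such a file as a custom track.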