A Python-based tool for designing and evaluating CRISPR guide RNAs (sgRNAs) with a focus on longevity-related genes.
- Fetch gene sequences from NCBI using Biopython
- Design candidate sgRNAs with NGG PAM sequences
- Score sgRNAs based on multiple criteria:
  - GC content optimization (40-60%)
  - Homopolymer detection and penalization
  - Off-target prediction using BLAST
  - Position-based efficiency scoring
- Visualization of sgRNA scores and rankings
- Comparative analysis with existing tools
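
The design and scoring criteria listed above can be illustrated with a minimal sketch. This is not the code in src/design.py or src/scoring.py. The function names, the forward-strand-only scan, the 20-nt protospacer length, the homopolymer threshold, and the penalty weights are all assumptions chosen for illustration.

```python
import re

# Assumption: a 20-nt protospacer immediately followed by an NGG PAM.
# The lookahead lets overlapping candidates be reported.
PAM_PATTERN = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")


def find_candidates(sequence):
    """Return (position, protospacer) pairs on the forward strand only."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in PAM_PATTERN.finditer(seq)]


def gc_fraction(guide):
    """Fraction of G and C bases in the guide."""
    return (guide.count("G") + guide.count("C")) / len(guide)


def longest_homopolymer(guide):
    """Length of the longest run of a single repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"A+|C+|G+|T+", guide)), default=0)


def simple_score(guide, max_run=4):
    """Hypothetical score: start at 1.0, subtract 0.5 per failed criterion."""
    score = 1.0
    if not 0.40 <= gc_fraction(guide) <= 0.60:
        score -= 0.5  # GC content outside the 40-60% window
    if longest_homopolymer(guide) >= max_run:
        score -= 0.5  # homopolymer run of max_run bases or longer
    return score


if __name__ == "__main__":
    demo = "ACGT" * 5 + "AGG"  # one 20-mer followed by an AGG PAM
    for pos, guide in find_candidates(demo):
        print(pos, guide, simple_score(guide))
```

A fuller implementation would also scan the reverse complement and add the off-target and position-based terms. They are omitted here because the off-target term depends on a BLAST search and the position-based term is not specified above.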
- Clone this repository:
    git clone https://github.com/yourusername/crispr_design.git
    cd crispr_design
- Create a virtual environment and activate it:
    python -m venv venv
    source venv/bin/activate # On Windows, use: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt

crispr_design/
├── src/ # Source code
│ ├── sequence.py # Gene sequence retrieval
│ ├── design.py # sgRNA design algorithms
│ ├── scoring.py # Scoring functions
│ └── visualization.py # Data visualization
├── tests/ # Unit tests
├── data/ # Gene sequences and results
├── notebooks/ # Jupyter notebooks for analysis
└── requirements.txt # Project dependencies
The examples/ directory contains sample outputs from analyzing key longevity-associated genes:
- foxo3_candidates.txt: List of candidate sgRNAs with their scores and properties
- foxo3_gc_distribution.png: Visualization of GC content distribution across candidate sgRNAs
- apoe_candidates.txt: List of candidate sgRNAs with their scores and properties
- apoe_gc_distribution.png: Visualization of GC content distribution across candidate sgRNAs
- sirt1_candidates.txt: List of candidate sgRNAs with their scores and properties
- sirt1_gc_distribution.png: Visualization of GC content distribution across candidate sgRNAs
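
As a rough illustration of what a GC content distribution summarizes, the sketch below bins a list of guide sequences by GC percentage using only the standard library. It does not reproduce src/visualization.py or the file format of the candidate lists, neither of which is shown here. The function names and the 10-point bin width are assumptions.

```python
from collections import Counter


def gc_percent(guide):
    """GC content of a guide as a percentage."""
    guide = guide.upper()
    return 100.0 * (guide.count("G") + guide.count("C")) / len(guide)


def gc_histogram(guides, bin_width=10):
    """Count guides per GC% bin; each key is the lower edge of its bin.

    A guide with exactly 100% GC is placed in the top bin.
    """
    counts = Counter()
    for guide in guides:
        lower = int(gc_percent(guide) // bin_width) * bin_width
        counts[min(lower, 100 - bin_width)] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical guides, not taken from the example files.
    guides = ["ACGT" * 5, "A" * 20, "G" * 10 + "C" * 10, "GC" * 5 + "AT" * 5]
    for lower, count in gc_histogram(guides).items():
        print(f"{lower:3d}-{lower + 10:3d}% {'#' * count}")
```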
- Set your NCBI email in the configuration:
    from Bio import Entrez
    Entrez.email = "your.email@example.com"  # Required for NCBI data access
- Basic sgRNA design workflow:
    from src.sequence import fetch_gene_sequence
    from src.design import find_sgrnas
    from src.scoring import score_sgrna

    # Example using FOXO3
    sequence = fetch_gene_sequence("NM_001455.4")  # FOXO3 mRNA (RefSeq transcript)
    # Or APOE: NM_000041.4
    # Or SIRT1: NM_012238.5

    # Find candidate sgRNAs
    candidates = find_sgrnas(sequence)

    # Score candidates
    scores = [score_sgrna(sg) for sg in candidates]

Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- NCBI for providing gene sequence data
- Biopython community for their excellent tools
- CRISPR research community for scoring guidelines