
feat: add analyze-fasta skill for FASTA sequence analysis #38

Open
santiago-rodriguezs wants to merge 1 commit into anthropics:main from santiago-rodriguezs:feat/analyze-fasta-skill

santiago-rodriguezs commented Apr 7, 2026

Summary

Add analyze-fasta as a new skill to the Life Sciences marketplace. It analyzes FASTA files (nucleotide and protein) with automatic sequence type detection, comprehensive bioinformatics metrics, and interactive HTML reports.

  • 762-line Python script powered by Biopython
  • Auto-detects nucleotide vs protein sequences
  • Three output formats: text, JSON, HTML
  • Nucleotide analysis: GC%, base composition, dinucleotides, ORF detection in 3 forward reading frames, molecular weight, N50
  • Protein analysis: molecular weight, isoelectric point, instability index, GRAVY, aromaticity, secondary structure prediction, charged/aromatic residues
  • HTML reports with composition bars, metric cards, ORF tables, JSON viewer, and a "How it works" methodology section
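The nucleotide-side features listed above can be illustrated with a short sketch. This is not the code from `analyze_fasta.py` (which is built on Biopython and is not shown on this page); it is a minimal standard-library approximation. The function names, the 90% detection threshold, and the `min_codons` cutoff are all assumptions made for illustration.

```python
from collections import Counter

# Characters treated as nucleotide codes (assumption: IUPAC ambiguity
# codes other than N are ignored for simplicity).
NUCLEOTIDE_ALPHABET = set("ACGTUN")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def detect_sequence_type(seq: str, threshold: float = 0.9) -> str:
    """Guess 'nucleotide' or 'protein' from the share of A/C/G/T/U/N characters."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    hits = sum(1 for ch in seq if ch in NUCLEOTIDE_ALPHABET)
    return "nucleotide" if hits / len(seq) >= threshold else "protein"


def gc_percent(seq: str) -> float:
    """GC content as a percentage of unambiguous (A/C/G/T) bases."""
    counts = Counter(seq.upper())
    called = sum(counts[base] for base in "ACGT")
    return 100.0 * (counts["G"] + counts["C"]) / called if called else 0.0


def n50(lengths: list[int]) -> int:
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def find_forward_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for ATG...stop ORFs in the 3 forward frames.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs
```

Scanning only the forward frames, as the PR describes, misses genes encoded on the reverse strand; a fuller scan would repeat the loop on the reverse complement.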

Designed so that Claude runs the script and then interprets the JSON output in biological context (organism inference, functional predictions, follow-up suggestions).
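As an illustration of the kind of value that would be interpreted in biological context, two of the protein metrics named above, GRAVY and aromaticity, are simple enough to sketch by hand. The code below is a standard-library approximation using the Kyte-Doolittle hydropathy scale, not the script's implementation; Biopython's `Bio.SeqUtils.ProtParam.ProteinAnalysis` offers comparable `gravy()` and `aromaticity()` methods, though this page does not say which calls the script uses. The `describe_gravy` helper is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _validated(seq: str) -> str:
    """Upper-case the sequence and reject empty or non-standard input."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - KYTE_DOOLITTLE.keys()
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return seq


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = _validated(seq)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aromaticity(seq: str) -> float:
    """Fraction of residues that are Phe, Trp, or Tyr."""
    seq = _validated(seq)
    return sum(seq.count(aa) for aa in "FWY") / len(seq)


def describe_gravy(value: float) -> str:
    """Hypothetical helper: a coarse reading of the sign of a GRAVY score."""
    return "overall hydrophobic" if value > 0 else "overall hydrophilic"
```

A positive GRAVY score suggests a more hydrophobic protein and a negative one a more hydrophilic protein; any finer-grained interpretation would depend on context that a single number does not carry.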

Files Changed

  • analyze-fasta/SKILL.md — Skill definition with usage instructions and biological interpretation guidelines
  • analyze-fasta/scripts/analyze_fasta.py — Main analysis script (762 lines, Biopython)
  • analyze-fasta/LICENSE.txt — Apache 2.0
  • .claude-plugin/marketplace.json — Added entry to plugins array

Requirements

  • Python 3.10+
  • biopython, numpy (pip install biopython numpy)

Usage

# Install the skill
/plugin install analyze-fasta@life-sciences

# Use via Claude Code
/analyze-fasta input.fasta

# Or run directly
python3 scripts/analyze_fasta.py input.fasta --json
python3 scripts/analyze_fasta.py input.fasta --html
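
For callers that want the results programmatically, the direct invocation above could be wrapped as follows. This is a sketch under assumptions: the helper names are hypothetical, and it assumes `--json` writes a single JSON document to standard output, which this page does not confirm. Only the two flags shown above are used.

```python
import json
import subprocess
import sys

# Flags taken from the usage lines above; no others are assumed to exist.
SUPPORTED_FORMATS = {"json", "html"}


def build_command(fasta_path: str, fmt: str = "json",
                  script: str = "scripts/analyze_fasta.py") -> list[str]:
    """Build the argument list for a direct run of the analysis script."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return [sys.executable, script, fasta_path, f"--{fmt}"]


def run_analysis(fasta_path: str) -> dict:
    """Run the script with --json and parse its output.

    Assumption: the JSON report is printed to stdout. If the script writes
    to a file instead, read that file rather than result.stdout.
    """
    result = subprocess.run(
        build_command(fasta_path, "json"),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the command as a list rather than a shell string avoids quoting problems with FASTA paths that contain spaces.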

Tested With

  • Protein sequence: Butyrate kinase from Clostridium butyricum (326 aa, UniProt Q9ZJI3)
  • Nucleotide sequence: Human chromosome 13 BRCA2 region (~85 kb, NCBI NC_000013.11)

Text, JSON, and HTML outputs were all verified.

Add a skill that analyzes FASTA files (nucleotide and protein) with
automatic sequence type detection, comprehensive bioinformatics metrics,
and interactive HTML reports. Designed to be used with Claude for
biological interpretation of results.

Features:
- Auto-detects nucleotide vs protein sequences
- Nucleotide analysis: GC%, base composition, dinucleotides, ORF detection
  in 3 forward reading frames, molecular weight, N50
- Protein analysis: molecular weight, isoelectric point, instability index,
  GRAVY, aromaticity, secondary structure prediction, charged/aromatic residues
- Three output formats: text, JSON, HTML
- HTML reports with composition bars, metric cards, ORF tables, and a
  "How it works" section explaining the analysis pipeline

Requirements: biopython, numpy (Python 3.10+)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Faridmurzone

lgtm


2 participants