DeepFold-PLM: Accelerating Protein Structure Prediction via Efficient Homology Search Using Protein Language Models

🧬 Overview

DeepFold-PLM accelerates protein structure prediction by combining protein language models with vector embedding databases, enabling fast multiple sequence alignment (MSA) construction and improved structure prediction.

Architecture of DeepFold-PLM pipeline

Key Features

  • ⚡ 47x Faster MSA Generation: Dramatically accelerated multiple sequence alignment construction
  • 📈 Enhanced Diversity: Increased sequence diversity for better coevolutionary information
  • 🚀 Superior Performance: Outperforms AlphaFold's JAX implementation for sequences longer than 3,000 residues
  • ⚡ Optimized Attention: 6x faster than PyTorch baseline with custom CUDA kernels
  • 🔧 Multi-GPU Scaling: Linear performance scaling across 1-4 NVIDIA A100 GPUs
  • 🌐 User-Friendly Interface: Real-time analysis through web service
  • 🔌 API Access: plmMSA API access with automatic pairing capabilities

🚀 Quick Start

plmMSA

✨ Try our fast plmMSA API - Get MSA results in seconds with automatic pairing support! Fully compatible with ColabFold and MMseqs2 API formats for seamless integration into your existing workflows.

Easy Integration with ColabFold:

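from colabfold.batch import run

# queries, result_dir, and use_templates are defined by the caller.
results = run(
    queries=queries,
    result_dir=result_dir,
    use_templates=use_templates,
    host_url="https://df-plm.deepfold.org/api/colab",  # send MSA requests to the plmMSA server
    # ... other parameters
)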

Easy Integration with Boltz:

boltz predict 8JEL.yaml --use_msa_server --msa_server_url "https://df-plm.deepfold.org/api/colab"
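
When several inputs need to be processed, the same command can be assembled from Python. The following is a minimal sketch: it uses only the flags shown in the command above, and the helper names `boltz_command` and `to_shell` are hypothetical, not part of Boltz or DeepFold-PLM.

```python
import shlex

# MSA server URL taken from the Boltz command above.
MSA_SERVER_URL = "https://df-plm.deepfold.org/api/colab"


def boltz_command(input_path, msa_server_url=MSA_SERVER_URL):
    """Return the argument list for one `boltz predict` run (hypothetical helper)."""
    return [
        "boltz",
        "predict",
        str(input_path),
        "--use_msa_server",
        "--msa_server_url",
        msa_server_url,
    ]


def to_shell(args):
    """Render an argument list as a single shell-safe command line."""
    return shlex.join(args)
```

The argument list can be passed directly to `subprocess.run(args, check=True)`, which avoids shell quoting issues; `to_shell` is only for logging or copying the command.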

REST API Example:

# Submit MSA job for protein complex
curl -X POST 'https://df-plm.deepfold.org/api/plmmsa/v1/submit' \
-H 'Content-Type: application/json' \
-d '{
    "mode": "unpaired+paired", 
    "sequences": [
        "MAHHHHHHVAVDAVSFTLLQDQLQSVLDTLSEREAGVVRLRFGLTDGQPRTLDEIGQVYGVTRERIRQIESKTMSKLRHPSRSQVLRDYLDGSSGSGTPEERLLRAIFGEKA",
        "MRYAFAAEATTCNAFWRNVDMTVTALYEVPLGVCTQDPDRWTTTPDDEAKTLCRACPRRWLCARDAVESAGAEGLWAGVVIPESGRARAFALGQLRSLAERNGYPVRDHRVSAQSA"
    ]
}'

# Check job status (replace YOUR_JOB_ID with actual job ID)
curl -X GET 'https://df-plm.deepfold.org/api/plmmsa/v1/job/YOUR_JOB_ID'
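
The two curl requests above can also be sketched in Python with the standard library only. This is a minimal sketch, not an official client: the endpoint paths, the `mode` value, and the `sequences` field are taken from the curl example, while the helper names (`build_submit_request`, `build_status_request`, `send`) are hypothetical. It also assumes the server replies with JSON; the response schema, including the field that carries the job ID, is not described in this section, so `send` returns the decoded body as-is.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://df-plm.deepfold.org/api/plmmsa/v1"


def build_submit_request(sequences, mode="unpaired+paired"):
    """Build the POST request that submits an MSA job (hypothetical helper)."""
    body = json.dumps({"mode": mode, "sequences": list(sequences)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/submit",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_status_request(job_id):
    """Build the GET request that checks a job's status (hypothetical helper)."""
    safe_id = urllib.parse.quote(str(job_id), safe="")
    return urllib.request.Request(f"{BASE_URL}/job/{safe_id}", method="GET")


def send(request, timeout=30):
    """Send a request and decode the body, assuming the server returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical flow would call `send(build_submit_request([...]))`, read the job ID from the returned body, and then call `send(build_status_request(job_id))` at intervals until the job finishes.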

See plmMSA for more information.

DeepFold PyTorch

Performance Comparison

🚀 Our optimized PyTorch implementation achieves significant speedups through:

  • ⚡️ Multi-GPU parallelization
  • 🔧 Custom CUDA kernels
  • 💪 High-throughput processing

These optimizations enable large-scale structural biology research and production deployments.

See DeepFold for more information.

🖥️ Website

Under construction. Explore the experimental site: https://df-plm.deepfold.org/

📚 Citation

If you use DeepFold-PLM in your research, please cite our paper:

@article{kim2025deepfold,
  title={DeepFold-PLM: Accelerating Protein Structure Prediction via Efficient Homology Search Using Protein Language Models},
  author={Kim, Minsoo and Bae, Hanjin and Jo, Gyeongpil and Kim, Kunwoo and Lee, Sung Jong and Yoo, Jejoong and Joo, Keehyoung},
  journal={Bioinformatics},
  volume={41},
  number={11},
  doi={10.1093/bioinformatics/btaf579},
  year={2025},
  publisher={Oxford University Press (OUP)},
  pages={1--13},
  url={https://df-plm.deepfold.org/}
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.