A Snakemake workflow for high-throughput AlphaFold 3 structure predictions. This workflow has the following advantages over standard AlphaFold 3 runs:
- Separated data and inference pipelines for better resource utilization.
- Implements the assemble-from-monomers technique described in the official AlphaFold 3 documentation for predicting multimers. See here for details.
- Treats each random seed as a separate job when multiple seeds are used, substantially increasing throughput for large-scale sampling campaigns.
- Efficient on HPC systems with multi-GPU nodes that lack support for consumable resources (i.e., nodes are allocated exclusively to a single user). See the configuration documentation at config/README.md for details.
- 📖 Better documentation to make setup & usage smoother
- 🔄 Support for different running modes, including:
  - 🧲 Pulldown
  - 💊 Virtual screening
  - 🔬 All-vs-all pairwise interactions
  - ⚖️ Stoichiometry screen
  - 🎲 Massive sampling
Detailed information about input data and workflow configuration can also be found in the config/README.md.
If you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this repository or its DOI.
To run the workflow from the command line, clone the repository and change into its directory:

git clone https://github.com/ntnn19/AlphaFold3_workflow.git
cd path/to/AlphaFold3_workflow

Use this option if you prefer to keep a copy of the SIF file in a specific directory instead of letting Snakemake build it automatically. Otherwise, you can skip this step.
See here or here for instructions.
Run the following command to build the Singularity container that supports parallel inference runs:

singularity build alphafold3_parallel.sif docker://ntnn19/alphafold3:latest_parallel_a100_40gb

Or

singularity build alphafold3_parallel.sif docker://ntnn19/alphafold3:latest_parallel_a100_80gb

Install mamba or micromamba if not already installed.
Then, set up and activate the environment using the following commands:
mamba env create -p $(pwd)/venv -f environment.yml
mamba activate $(pwd)/venv

For Maxwell users:

module load maxwell mamba
. mamba-init
mamba env create -p $(pwd)/venv -f environment.yml
mamba activate $(pwd)/venv

Or, if using micromamba:

micromamba env create -p $(pwd)/venv -f environment.yml
eval "$(micromamba shell hook --shell=<YOUR SHELL>)"
micromamba activate $(pwd)/venv

Make sure to download the required AlphaFold3 databases and weights before proceeding.
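Because the workflow depends on the databases and weights being in place, a small pre-flight check can catch a missing or empty directory before any job starts. The following is a minimal sketch, not part of the workflow itself: the function name `check_af3_inputs` and both path arguments are hypothetical, and the check only confirms that each directory exists and is non-empty, not that its contents are complete.

```shell
#!/usr/bin/env bash
# Pre-flight check (illustrative only): confirm that the AlphaFold 3 database
# and weights directories exist and are non-empty.
# check_af3_inputs and its two path arguments are hypothetical placeholders.

check_af3_inputs() {
    local db_dir="$1" params_dir="$2" status=0
    local dir
    for dir in "$db_dir" "$params_dir"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        elif [ -z "$(ls -A "$dir")" ]; then
            echo "empty directory: $dir" >&2
            status=1
        fi
    done
    # Non-zero return value means at least one directory failed the check.
    return "$status"
}

# Example (paths are placeholders):
#   check_af3_inputs /path/to/public_databases /path/to/models && snakemake --dry-run
```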
Adjust options in the default config file config/config.yaml.
Before running the complete workflow, you can perform a dry run using:

snakemake --dry-run

To run the workflow with test files using Singularity, add a link to a container registry in config.yaml, for example af3_container: "oras://ghcr.io/<user>/<repository>:<version>" for GitHub's container registry.
Run the workflow with:

snakemake --cores 2 --use-singularity --singularity-args "--nv -B <LOCAL_AF3_SEQUENCE_DATABASE>:/root/public_databases -B <YOUR_AF3_PARAMETERS>:/root/models" --configfile .test/config/custom/config.yaml --directory .test/config/custom

Other example JSON, TSV, and YAML configuration files for testing other configurations are available in .test/config. Users without access to a GPU should remove the --nv flag.
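Since the only difference between the GPU and CPU invocations is the --nv flag, the argument string can be assembled by a small helper instead of being edited by hand. This is a sketch under assumptions: `build_singularity_args` is a hypothetical name, and treating a working `nvidia-smi` as evidence of a usable GPU is a heuristic rather than something the workflow prescribes. The bind targets are the ones from the run command.

```shell
#!/usr/bin/env bash
# Illustrative helper: assemble the string passed to --singularity-args.
# build_singularity_args is a hypothetical name; the bind targets
# /root/public_databases and /root/models come from the README's run command.

build_singularity_args() {
    local db_dir="$1" params_dir="$2" use_gpu="${3:-auto}"
    local args="-B ${db_dir}:/root/public_databases -B ${params_dir}:/root/models"
    if [ "$use_gpu" = "auto" ]; then
        # Heuristic (assumption): a working nvidia-smi means a GPU is usable.
        if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
            use_gpu="yes"
        else
            use_gpu="no"
        fi
    fi
    if [ "$use_gpu" = "yes" ]; then
        args="--nv ${args}"
    fi
    printf '%s\n' "$args"
}

# Example (paths are placeholders):
#   snakemake --cores 2 --use-singularity \
#     --singularity-args "$(build_singularity_args /path/to/databases /path/to/models)" \
#     --configfile .test/config/custom/config.yaml --directory .test/config/custom
```

Passing `yes` or `no` as the third argument overrides the detection, which is useful on login nodes where the driver is visible but jobs run elsewhere.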
The profiles/ directory can contain any number of workflow-specific profiles that users can choose from.
The profiles README.md provides more details.
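To see which profiles are available before choosing one, the directory can be scanned for subdirectories that contain a profile configuration file. The sketch below assumes the usual Snakemake layout in which each profile is a subdirectory holding a `config.yaml` (or, for Snakemake 8 and later, a `config.v8+.yaml`); `list_profiles` is a hypothetical helper, and the actual profile names in this repository are not assumed here.

```shell
#!/usr/bin/env bash
# Illustrative helper: print the names of subdirectories that look like
# Snakemake profiles. list_profiles is a hypothetical name; the layout
# (one config.yaml or config.v8+.yaml per profile directory) is an assumption.

list_profiles() {
    local root="${1:-profiles}" dir
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when the directory has no subdirectories.
        [ -d "$dir" ] || continue
        if [ -f "${dir}config.yaml" ] || [ -f "${dir}config.v8+.yaml" ]; then
            basename "$dir"
        fi
    done
}

# Example: pick one of the printed names and pass its path to Snakemake:
#   list_profiles profiles
#   snakemake --profile profiles/<name>
```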
- Nathan Nagar @ CSSB/LIV
Köster, J., Mölder, F., Jablonski, K. P., Letcher, B., Hall, M. B., Tomkins-Tinch, C. H., Sochat, V., Forster, J., Lee, S., Twardziok, S. O., Kanitz, A., Wilm, A., Holtgrewe, M., Rahmann, S., & Nahnsen, S. Sustainable data analysis with Snakemake. F1000Research, 10:33, 2021. https://doi.org/10.12688/f1000research.29032.2.
- Add configuration-specific config/README.md file.
- Add scoring report.
- Update snakemake version to latest in environment.yml.
If you find this useful, please consider giving it a star! ⭐
