GWASPipe is a Python-based computational pipeline that streamlines the management and processing of genome-wide association study (GWAS) summary statistics. It automates complex workflows for quality control, standardization, and visualization, making multi-study harmonization more reproducible, efficient, and less error-prone.
- Modular Architecture: Organized into reusable components (order_alleles, utils)
- Automated Workflows: Handles normalization, allele harmonization, filtering, and QC metrics
- Flexible Configuration: Uses YAML configuration files for customizable processing
- Comprehensive Reporting: Generates QC reports and visualizations
- High Performance: Leverages parallel processing and optimized algorithms
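To make "allele harmonization" concrete, here is a minimal sketch of the general idea: when a study reports its effect allele and non-effect allele swapped relative to a reference, the effect size changes sign and the effect allele frequency is complemented. This illustrates the technique only and is not GWASPipe's or gwaslab's implementation. The `Variant` class and `harmonize` function are hypothetical names, and strand flips and palindromic variants are deliberately ignored.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Variant:
    """Hypothetical container for one summary-statistics row."""
    chrom: int
    pos: int
    ea: str      # effect allele
    nea: str     # non-effect allele
    beta: float  # effect size for the effect allele
    eaf: float   # effect allele frequency


def harmonize(variant: Variant, ref: str, alt: str) -> Optional[Variant]:
    """Align a variant so that its effect allele is the reference panel's alt allele.

    Returns the variant unchanged if it is already aligned, a flipped copy if
    the two alleles are swapped, and None if the alleles do not match the
    reference pair at all (strand flips are not attempted in this sketch).
    """
    if (variant.ea, variant.nea) == (alt, ref):
        return variant
    if (variant.ea, variant.nea) == (ref, alt):
        # Swapped alleles: the effect now refers to the other allele, so the
        # sign of beta flips and the frequency becomes its complement.
        return replace(
            variant,
            ea=alt,
            nea=ref,
            beta=-variant.beta,
            eaf=1.0 - variant.eaf,
        )
    return None


if __name__ == "__main__":
    v = Variant(chrom=1, pos=1000, ea="A", nea="G", beta=0.25, eaf=0.25)
    print(harmonize(v, ref="A", alt="G"))
```

Returning `None` for mismatches keeps the decision about dropping or logging such variants with the caller, which is one reasonable design among several.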
- Python: 3.11 or higher
- Dependencies: See pyproject.toml for the complete list
- Key Packages:
  - gwaslab - Core GWAS processing library
  - pandas, numpy - Data manipulation
  - click, cloup - Command-line interface
  - loguru - Advanced logging
  - ruamel.yaml - YAML configuration
# Clone the repository
git clone https://github.com/ht-diva/gwaspipe.git
cd gwaspipe
# Create and activate environment
conda env create -f environment_docker.yml
conda activate gwaspipe
# Install package
make install

# Pull the Docker image
docker pull ghcr.io/ht-diva/gwaspipe:latest
# Run container
docker run -v $(pwd):/data ghcr.io/ht-diva/gwaspipe gwaspipe --help

Process GWAS summary statistics with a single command:
gwaspipe \
-c examples/config_sumstats_harmonization.yml \
-i examples/input_data.tsv.gz \
-f regenie \
-o results/

The gwaspipe.order_alleles module provides comprehensive allele ordering functionality:
from gwaspipe.order_alleles import order_alleles
import pandas as pd
from gwaslab.info.g_Log import Log
# Basic usage
df = pd.DataFrame({
'CHR': [1, 2, 3],
'POS': [1000, 2000, 3000],
'EA': ['A', 'T', 'C'],
'NEA': ['T', 'A', 'G'],
'STATUS': [9999999, 9999999, 9999999]
})
log = Log()
result = order_alleles(df, log=log)

GWASPipe uses YAML configuration files to define processing pipelines. See Getting Started Guide for detailed configuration examples.
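Because a run is driven by one configuration file plus a few flags, processing several studies with the same configuration can be scripted. The sketch below only assembles the `-c`, `-i`, `-f` and `-o` flags used in the single-command example. The helper names `build_command` and `run_batch`, the input file names, and the one-output-directory-per-study layout are assumptions for illustration; whether gwaspipe creates missing output directories is not confirmed here.

```python
import subprocess
from pathlib import Path


def build_command(config: str, input_file: str, input_format: str, output_dir: str) -> list:
    """Assemble a gwaspipe invocation from the four flags in the example above."""
    return [
        "gwaspipe",
        "-c", config,
        "-i", input_file,
        "-f", input_format,
        "-o", output_dir,
    ]


def run_batch(config, inputs, input_format, output_root, dry_run=True):
    """Build (and optionally run) one gwaspipe command per input file.

    With dry_run=True nothing is executed; the commands are only returned,
    which makes it easy to inspect them first.
    """
    commands = []
    for input_file in inputs:
        # Assumed layout: one output directory per study, named after the
        # input file name up to its first dot.
        stem = Path(input_file).name.split(".")[0]
        output_dir = (Path(output_root) / stem).as_posix()
        cmd = build_command(config, input_file, input_format, output_dir)
        commands.append(cmd)
        if not dry_run:
            # check=True raises CalledProcessError if gwaspipe exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical input files; replace with real paths.
    for cmd in run_batch(
        "examples/config_sumstats_harmonization.yml",
        ["data/study_a.tsv.gz", "data/study_b.tsv.gz"],
        "regenie",
        "results",
    ):
        print(" ".join(cmd))
```

Passing the command as a list rather than a shell string avoids quoting problems with unusual file names.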
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch: git checkout -b feature/your-feature
- Make changes and add tests
- Commit changes: git commit -m "Add feature description"
- Push branch: git push origin feature/your-feature
- Open a Pull Request
GWASPipe is released under the MIT License.