A Python tool, based on GWASLab, for assembling a computational pipeline to standardise, QC, harmonise, convert and plot summary statistics from genome-wide association studies (GWAS).
To use GWASPipe, you'll need:
- Python 3.10 or higher
- The following dependencies installed:
  - click: for command-line interface
  - cloup: for file handling and metadata management
  - loguru: for logging
  - ruamel-yaml: for YAML parsing and generation
  - pandas: for data manipulation and analysis
  - pyarrow: for high-performance data processing
  - numpy: for numerical computations
  - matplotlib: for plotting
  - gwaslab: for handling sumstats

For details, see pyproject.toml.
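The requirements above can be checked before installing or running the tool. The following is a minimal sketch, not part of GWASPipe itself: the helper names `python_ok` and `find_missing` are hypothetical, and the distribution names are taken from the list above (with `ruamel.yaml` assumed to be the installed name of ruamel-yaml). pyproject.toml remains the authoritative source for names and versions.

```python
import sys
from importlib import metadata

# Distribution names copied from the dependency list in this README.
# Assumption: ruamel-yaml is installed under the name "ruamel.yaml".
REQUIRED = [
    "click", "cloup", "loguru", "ruamel.yaml", "pandas",
    "pyarrow", "numpy", "matplotlib", "gwaslab",
]


def python_ok(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter is at least the minimum (major, minor)."""
    return tuple(version_info[:2]) >= minimum


def find_missing(names):
    """Return the subset of distribution names that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("Missing packages:", find_missing(REQUIRED))
```

This only checks that each package is present; it does not verify version constraints.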
You can install GWASPipe in one of the following ways.
- Clone the repository using Git:
git clone https://github.com/your-username/gwaspipe.git
- Create a conda environment:
conda env create -n gwaspipe -f environment.yml
- Activate it:
conda activate gwaspipe
- Install gwaspipe:
make install
This will install gwaspipe into the isolated conda environment created above.
- Run the tool:
gwaspipe --help
GWASPipe can also be run with Docker, using the Dockerfile in the repository.
- Build the image locally:
docker build -t gwaspipe:latest .
- Run it:
docker run -t -i gwaspipe:latest gwaspipe --help
A Linux Docker image is also available from the GitHub Packages repository.
GWASPipe provides a command-line interface (CLI) for easy usage. You can customize the behavior of the tool by providing configuration files in YAML format.
The CLI takes the following arguments:
- -c or --config: Path to the configuration file
- -i or --input: Path to the summary statistics file
- -f or --format: Format of the input file (vcf, gwaslab, regenie, fastgwa, ldsc, fuma, pickle, metal_het)
- -o or --output: Path where results should be saved
- -s or --input_file_separator: Input file separator
- -q or --quiet: Set log verbosity
- --study_label: Input study label, valid only for VCF files
- --pid: Preserve ID
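To show how these options fit together, here is a minimal sketch that assembles a gwaspipe command line from Python. The helper `build_gwaspipe_command` is hypothetical, as are the file paths in the usage example; only the option names and the list of formats come from the argument list above. The sketch leaves out --quiet and --pid because this README does not say whether they take a value, and it assumes --input_file_separator and --study_label each take one.

```python
import shlex

# Input formats listed for -f/--format in this README.
FORMATS = {"vcf", "gwaslab", "regenie", "fastgwa", "ldsc", "fuma", "pickle", "metal_het"}


def build_gwaspipe_command(config, input_path, input_format, output,
                           separator=None, study_label=None):
    """Build an argument list for the gwaspipe CLI (hypothetical helper)."""
    if input_format not in FORMATS:
        raise ValueError(f"unsupported format: {input_format!r}")
    # The README states that --study_label is valid only for VCF input.
    if study_label is not None and input_format != "vcf":
        raise ValueError("--study_label is valid only for VCF files")

    cmd = [
        "gwaspipe",
        "--config", config,
        "--input", input_path,
        "--format", input_format,
        "--output", output,
    ]
    if separator is not None:
        cmd += ["--input_file_separator", separator]
    if study_label is not None:
        cmd += ["--study_label", study_label]
    return cmd


if __name__ == "__main__":
    # All paths and the study label below are placeholders.
    cmd = build_gwaspipe_command(
        "examples/config_harmonize_VCF_sumstats.yml",
        "sumstats.vcf.gz",
        "vcf",
        "results",
        study_label="my_study",
    )
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once gwaspipe is installed; run `gwaspipe --help` to confirm the exact behaviour of each option.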
Example configuration files are provided in the examples directory.
The following example configuration files demonstrate how to customize GWASPipe:
- config_harmonize_VCF_sumstats.yml: a sample configuration file for harmonizing summary statistics in GWAS-VCF format
We welcome contributions to GWASPipe! If you'd like to contribute, please follow these guidelines:
- Fork the repository on GitHub:
https://github.com/your-username/gwaspipe.git
- Create a new branch for your changes:
git checkout -b feature/your-feature-name
- Make your changes and commit them:
git add . && git commit -m "Your commit message"
- Push your changes to the remote repository:
git push origin feature/your-feature-name
- Open a pull request on GitHub to propose your changes
We appreciate your contributions and look forward to seeing GWASPipe grow!