LYSSA: A Rabies Analysis Pipeline

Introduction

This pipeline analyzes rabies virus sequencing data generated on PacBio or MinION platforms. It performs quality control, species identification, abundance estimation, read alignment, SNP calling, and variant annotation.

Prerequisites

Nextflow is required. Installation instructions are at https://github.com/nextflow-io/nextflow. HiPerGator users do not need to install it.

Singularity/Apptainer is required. Installation instructions are at https://singularity-tutorial.github.io/01-installation/. HiPerGator users do not need to install it.

SLURM is required. HiPerGator users do not need to install it.

Python 3 is required, together with the "pandas" package. If pandas is not already available in your Python 3 installation, install it with pip3 install pandas.

LongQC is required. Install it on your local computer from its GitHub repository (https://github.com/yfukasawa/LongQC). HiPerGator users do not need to install it.

PacBio SMRT Link stand-alone tools are required. For installation instructions, see the file "How_to_install_smrtlink_tools.txt" in the pipeline.

The Kraken2/Bracken RefSeq index "PlusPF" is required. Download the PlusPF index (over 77 GB) from https://benlangmead.github.io/aws-indexes/k2 into a folder named "PlusPF" on your local computer, then extract the tar.gz archive. HiPerGator users do not need to download it; it is already downloaded and configured in the pipeline.
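The extraction step for the PlusPF index could be scripted roughly as follows. This is only a sketch: `extract_index` is a hypothetical helper that is not part of the pipeline, and the archive path in the example call is a placeholder, because the actual file name depends on which PlusPF release you download from the index page.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LYSSA): unpack a downloaded Kraken2/Bracken
# index archive into a target folder such as "PlusPF".
extract_index() {
    local archive="$1"   # path to the downloaded .tar.gz (name varies by release)
    local dest="$2"      # destination folder, e.g. /path/to/PlusPF

    if [ ! -f "$archive" ]; then
        echo "Archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # -x extract, -z gunzip, -f archive file, -C switch to the destination first
    tar -xzf "$archive" -C "$dest"
}

# Example call (placeholder paths):
# extract_index /path/to/downloaded_PlusPF_archive.tar.gz /path/to/PlusPF
```

Because the index is over 77 GB, it is worth confirming that the destination file system has enough free space for both the archive and its extracted contents before running this.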

Recommended conda environment installation

conda create -n LYSSA -c conda-forge python=3.10 pandas

conda activate LYSSA

Pipeline summary

Quality control (LongQC)
Species identification (Kraken2)
Species abundance estimation (Bracken)
Read alignment (pbmm2)
SNP calling (BCFtools)
Variant annotation (SnpEff)

Pipeline Overview

How to run

1. Rename your data files so that they look like "bc2024bc2024.bam" and "bc2024bc2024.bam.pbi". You can use the script "rename.sh" in the pipeline to rename them.
2. Put the renamed data files (*.bam and *.bam.pbi) into the "pbbams" directory of the pipeline.
3. Open the file "params.yaml" and set each parameter to a full path:
   input : the full path to the pbbams dir of the pipeline on your computer. It looks like "/<the full path to the pipeline dir in your computer>/pbbams".
   output : the full path to the output dir of the pipeline on your computer. It looks like "/<the full path to the pipeline dir in your computer>/output".
   reference : the full path to the reference dir of the pipeline on your computer. It looks like "/<the full path to the pipeline dir in your computer>/reference".
   snpeffconfig : the full path to the configs dir of the pipeline on your computer. It looks like "/<the full path to the pipeline dir in your computer>/configs".
   db : the full path to the Kraken2/Bracken database (PlusPF) on your computer. It looks like "/<the full path to the parent dir of PlusPF folder in your computer>/PlusPF".
   qc : the full path to the LongQC dir on your computer. It looks like "/<the full path to the parent dir of LongQC folder in your computer>/LongQC".
   Note: HiPerGator users do not need to change the parameters "db" and "qc". Keep the default settings.
4. Go to the top directory of the pipeline and run

sbatch ./lyssa.sh
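Before submitting the job, it can help to confirm that the directories named in params.yaml exist and that every BAM file has its .pbi index next to it. The sketch below shows one way to do that. Both functions are hypothetical helpers, not part of the pipeline, and the paths in the example call are placeholders for the values you set in params.yaml.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers (not part of LYSSA).

# Report every argument that is not an existing directory.
# Returns 0 only if all of them exist.
check_dirs() {
    local status=0
    local dir
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "Missing directory: $dir" >&2
            status=1
        fi
    done
    return "$status"
}

# Report every *.bam in a directory that lacks a matching *.bam.pbi index.
# Returns 0 only if at least one BAM is present and all BAMs are indexed.
check_bam_pairs() {
    local dir="$1"
    local status=0
    local found=0
    local bam
    for bam in "$dir"/*.bam; do
        # With no matches the glob stays literal, so skip anything that is not a file.
        [ -f "$bam" ] || continue
        found=1
        if [ ! -f "$bam.pbi" ]; then
            echo "Missing index for: $bam" >&2
            status=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "No .bam files found in: $dir" >&2
        status=1
    fi
    return "$status"
}

# Example (placeholder paths; substitute the values from your params.yaml):
# check_dirs /path/to/pipeline/pbbams /path/to/pipeline/reference /path/to/PlusPF \
#   && check_bam_pairs /path/to/pipeline/pbbams \
#   && sbatch ./lyssa.sh
```

Chaining the checks with `&&` means the job is only submitted when both pass, which avoids waiting in the queue for a run that would fail on a missing input.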

Note:

To receive an email notification when the pipeline finishes, enter your email address on the line "#SBATCH --mail-user=" in the file lyssa.sh.
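The address can be typed in with any text editor. For scripted setups, the edit could be sketched as below. `set_mail_user` is a hypothetical helper, not part of the pipeline, and it assumes the address contains none of the characters "|", "&", or "\", which have special meaning in the sed expression used here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LYSSA): fill in the address on the
# "#SBATCH --mail-user=" line of a SLURM script.
set_mail_user() {
    local script="$1"   # e.g. lyssa.sh
    local email="$2"    # assumed to contain no "|", "&", or "\" characters
    local tmp
    local status

    if ! grep -q '^#SBATCH --mail-user=' "$script"; then
        echo "No '#SBATCH --mail-user=' line in $script" >&2
        return 1
    fi
    tmp=$(mktemp) || return 1
    # Replace whatever follows "--mail-user=" with the given address, then
    # write the result back in place so the file's permissions are kept.
    sed "s|^#SBATCH --mail-user=.*|#SBATCH --mail-user=$email|" "$script" > "$tmp" \
        && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example call (placeholder address):
# set_mail_user lyssa.sh your.name@example.org
```

SLURM only sends mail for the event types requested with `--mail-type`, so it is worth checking that lyssa.sh also contains a `#SBATCH --mail-type=` line. Whether it does is an assumption about the script's contents.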

