*Currently deployed for cattle SV & SNP discovery in the Bovine Long-Read Consortium (BovLRC)*
This pipeline was developed for the Bovine Long-Read Consortium (BovLRC). It runs multiple bioinformatics tools to detect single nucleotide polymorphisms (SNPs) and structural variants (SVs). The currently deployed version is 0.0.2. It is designed to handle data from both Oxford Nanopore (ONT) and PacBio, although so far it has only been tested with ONT data. It is written in Nextflow DSL2.
Initial setup:
- Clone this GitHub repository
git clone https://github.com/tuannguyen8390/AgVic_CLRC.git
- Obtain & install Docker/Shifter/Singularity
- Installation guide for Docker can be found here
- Installation guide for Shifter can be found here
- Installation guide for Singularity can be found here
- Pull assets (genome) and perform some initial setup
- Run the following command to pull assets (genome) and perform some initial setup (choose 1 among Shifter/Docker/Singularity only)
nextflow run setup.nf -profile shifter/docker/singularity
- Test run the pipeline (choose 1 among Shifter/Docker/Singularity only)
Edit the nextflow.config files to suit your local environment
nextflow run setup.nf -profile shifter/docker/singularity,test
- Run the pipeline
nextflow run main.nf -profile shifter/docker/singularity
- (Optional, alternative to the previous step) If you run on AWS, use the following command to run the pipeline
nextflow run main.nf -profile shifter/docker/singularity,awsbatch
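
The profile strings in the commands above list alternatives: `shifter/docker/singularity` stands for exactly one engine name, optionally followed by `,test` or `,awsbatch`. As a minimal sketch of that rule, the Python helper below assembles the command line. The names `build_command`, `ENGINES`, and `EXTRAS` are hypothetical and exist only for this illustration; they are not part of the pipeline.

```python
# Hypothetical helper: illustrates how the -profile value is composed.
ENGINES = {"shifter", "docker", "singularity"}   # choose exactly one
EXTRAS = {"test", "awsbatch"}                    # optional add-on profiles


def build_command(script, engine, extras=()):
    """Return the nextflow command for one engine plus optional extra profiles."""
    if engine not in ENGINES:
        raise ValueError(f"choose one of {sorted(ENGINES)}, got {engine!r}")
    unknown = [name for name in extras if name not in EXTRAS]
    if unknown:
        raise ValueError(f"unknown extra profile(s): {unknown}")
    # Nextflow accepts several profiles as a comma-separated list.
    profile = ",".join([engine, *extras])
    return f"nextflow run {script} -profile {profile}"


if __name__ == "__main__":
    print(build_command("setup.nf", "singularity"))
    print(build_command("main.nf", "docker", ["awsbatch"]))
```

Passing the literal string `shifter/docker/singularity` raises a `ValueError` here, which mirrors the point that the slash-separated form is a placeholder and not a valid profile name.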
The pipeline reads two metadata spreadsheets from the meta folder:
- metadata_SR.csv : metadata for short-read data
- metadata_LR.csv : metadata for long-read data
Use these files as templates when editing your own. To run with your own files, pass --LR_MetaDir and/or --SR_MetaDir.
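
Before pointing the pipeline at a custom spreadsheet, it can help to confirm that it has the same columns as the shipped template. The sketch below does that and builds the extra command-line arguments. It makes no claim about what the columns are, and it assumes (this is not confirmed by the text above) that `--LR_MetaDir` and `--SR_MetaDir` each take the path to a CSV file. The function names and example paths are hypothetical.

```python
import csv


def header_of(lines):
    """Return the column names from the first row of CSV text (any iterable of lines)."""
    return next(csv.reader(lines), [])


def same_columns(template_path, custom_path):
    """True if a custom metadata CSV has the same header row as the template."""
    with open(template_path, newline="") as template, open(custom_path, newline="") as custom:
        return header_of(template) == header_of(custom)


def metadata_args(lr_meta=None, sr_meta=None):
    """Build the optional metadata arguments; each is assumed to be a path to a CSV file."""
    args = []
    if lr_meta is not None:
        args += ["--LR_MetaDir", str(lr_meta)]
    if sr_meta is not None:
        args += ["--SR_MetaDir", str(sr_meta)]
    return args


if __name__ == "__main__":
    # Hypothetical path to a user-edited copy of the long-read template.
    print(" ".join(["nextflow", "run", "main.nf", "-profile", "docker",
                    *metadata_args(lr_meta="my_meta/metadata_LR.csv")]))
```

Either argument can be omitted, which matches the "and/or" wording: supplying only one of them leaves the other spreadsheet at its default location.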
- QC :
- FiltLong : QC for both long reads and short reads (DEFAULT)
- NanoFilt + FMLRC2 : NanoFilt for QC of long-read samples, and FMLRC2 + NanoFilt for QC of short-read samples.
- Mapping:
- Minimap2 : (DEFAULT)
- SNP Caller: all callers run in parallel, with one job per chromosome (1 to 29 and X, because the pipeline is currently deployed for cattle)
- Clair3 : (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
- PEPPER : by default, data from flowcells < 10.4 are analyzed with PEPPER
- DEEPVARIANT : by default, data from flowcells >= 10.4 and HiFi data are analyzed with DEEPVARIANT (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
- SV Caller: all callers run in parallel
- Sniffles2 (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
- DYSGU (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
- CuteSV2 (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
- Reporting
- PRE/POST QC : NanoPlot
- Alignment Depth : Mosdepth
- Extra process for Nanopore
- PorechopABI
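
The SNP-calling defaults listed above can be summarised as a small decision rule. The Python sketch below is one reading of that rule, not the pipeline's actual Nextflow code: Clair3 always runs, PEPPER is added for flowcells below 10.4, and DeepVariant is added for flowcells at or above 10.4 and for HiFi data. The platform labels `"ONT"` and `"HIFI"`, the flowcell string format (for example `R9.4.1` or `10.4`), and all function names are assumptions made for illustration.

```python
# Cattle autosomes 1 to 29 plus X: one SNP-calling job per chromosome.
CHROMOSOMES = [str(n) for n in range(1, 30)] + ["X"]


def parse_flowcell(version):
    """Turn a version such as 'R9.4.1' or '10.4' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.upper().lstrip("R").split("."))


def snp_callers(platform, flowcell=None):
    """Return the SNP callers that would run by default (assumed reading of the list above)."""
    callers = ["Clair3"]  # runs for every sample
    if platform == "HIFI" or (flowcell is not None and parse_flowcell(flowcell) >= (10, 4)):
        callers.append("DeepVariant")
    elif flowcell is not None:
        callers.append("PEPPER")  # flowcell older than 10.4
    return callers


def snp_jobs(platform, flowcell=None):
    """Enumerate (caller, chromosome) pairs, i.e. the independent jobs that can run in parallel."""
    return [(caller, chrom) for caller in snp_callers(platform, flowcell) for chrom in CHROMOSOMES]


if __name__ == "__main__":
    for flowcell in ("R9.4.1", "R10.4.1"):
        print(flowcell, snp_callers("ONT", flowcell), len(snp_jobs("ONT", flowcell)), "jobs")
```

Comparing versions as integer tuples avoids the problem that a three-part version such as 9.4.1 cannot be parsed as a single number.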
I have no doubt that there will be some problems :). It runs on my end, but perhaps not on yours. If that is the case, please email Tuan Nguyen.