This project is a successor to the C-WAP pipeline and is intended to process SARS-CoV-2 wastewater samples to determine relative variant abundance.
The results generated by this pipeline are not CLIA certified and should not be considered diagnostic.
CDCgov/aquascope is a bioinformatics best-practice pipeline for early detection of SARS-CoV-2 variants of concern, sequenced through shotgun metagenomic sequencing, from wastewater.
The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers, making installation trivial and results highly reproducible.
- Read QC: FastQC
- Trimming reads: Fastp
- Aligning short reads: Minimap2
- Freyja variant classification: Freyja
- Present QC for raw reads: MultiQC
- Install Nextflow (>=21.04.0).
- Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility (please only use Conda as a last resort; see docs).
- Prepare the `assets/samplesheet.csv`:
  - Use the `assets/test_highcoverage_samplesheet.csv` as an example.
  - Create custom sample sheets using the `fastq_dir_to_samplesheet.py` script. Usage: `fastq_dir_to_samplesheet.py /absolute/path/to/fastq/dir -st <forward/reverse/unstranded> samplesheetName.csv`
    - FASTQ files must be paired end, following the `_R1`, `_R2` naming convention.
    - Strandedness must be known or "unstranded". NOTE: DNASeq is by default `unstranded`, while RNASeq is usually `stranded`.
- Prepare the configuration files:
  - `nextflow.config` is prepared with default parameters; update as needed.
  - `test.config` is prepared with default parameters; update as needed.
- Run the pipeline with a profile and an entry workflow: `nextflow run main.nf -profile <docker/singularity/podman/shifter/charliecloud/conda/institute> -entry <QUALITY_ALIGN, FREYJA_ONLY, AQUASCOPE>`
  - The `-profile test` options run the test parameters and samples on test data only:
    - test_illumina: Runs Illumina data
    - test_ont: Runs ONT data
    - test_bams: Runs BAM data
  - Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use `-profile <institute>` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment. NOTE: CDC users can only use singularity on SciComp resources.
  - If you are using `singularity`, the pipeline will auto-detect this and attempt to download the Singularity images directly. If you persistently observe issues downloading Singularity images directly due to timeout or network issues, please use the `--singularity_pull_docker_container` parameter to pull and convert the Docker image instead.
  - If you are using `conda`, it is highly recommended to use the `NXF_CONDA_CACHEDIR` or `conda.cacheDir` settings to store the environments in a central location for future pipeline runs.
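
The sample sheet requirements in the quick start (paired-end files following the `_R1`/`_R2` convention, plus a known strandedness) can be illustrated with a short Python sketch. This is not the project's `fastq_dir_to_samplesheet.py`: the function names, the `<sample>_R1.fastq.gz` file layout, and the column names `sample,fastq_1,fastq_2,strandedness` are assumptions made for illustration, so compare against the example sample sheet in `assets/` before relying on the output.

```python
import csv
from pathlib import Path

# Strandedness values taken from the script usage shown in the quick start.
VALID_STRANDEDNESS = {"forward", "reverse", "unstranded"}


def pair_fastqs(fastq_dir, suffix=".fastq.gz"):
    """Map each sample name to its (R1, R2) files.

    Assumes files are named <sample>_R1<suffix> and <sample>_R2<suffix>.
    """
    r1_files, r2_files = {}, {}
    for path in sorted(Path(fastq_dir).iterdir()):
        if not path.name.endswith(suffix):
            continue  # ignore anything that is not a FASTQ file
        stem = path.name[: -len(suffix)]
        if stem.endswith("_R1"):
            r1_files[stem[:-3]] = path
        elif stem.endswith("_R2"):
            r2_files[stem[:-3]] = path
    # Paired-end input is required, so every sample needs both mates.
    unpaired = sorted(set(r1_files) ^ set(r2_files))
    if unpaired:
        raise ValueError(f"Samples missing a mate file: {unpaired}")
    return {name: (r1_files[name], r2_files[name]) for name in sorted(r1_files)}


def write_samplesheet(fastq_dir, out_csv, strandedness="unstranded"):
    """Write one row per sample and return the number of samples written."""
    if strandedness not in VALID_STRANDEDNESS:
        raise ValueError(f"strandedness must be one of {sorted(VALID_STRANDEDNESS)}")
    pairs = pair_fastqs(fastq_dir)
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Hypothetical header; the real pipeline may expect different columns.
        writer.writerow(["sample", "fastq_1", "fastq_2", "strandedness"])
        for sample, (r1, r2) in pairs.items():
            writer.writerow([sample, str(r1.resolve()), str(r2.resolve()), strandedness])
    return len(pairs)
```

For example, `write_samplesheet("/absolute/path/to/fastq/dir", "samplesheetName.csv", "unstranded")` would write absolute paths, mirroring the absolute-path argument in the script usage. Failing early on an unpaired file is a deliberate choice here, since the pipeline expects paired-end input.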
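
The run step combines a profile with one of the three entry workflows, and the conda note recommends a shared cache directory. A minimal Python sketch of assembling that command and environment before launching is shown below. The helper names `build_run_command`, `build_environment`, and `launch` are hypothetical; only the `-profile` and `-entry` options and the `NXF_CONDA_CACHEDIR` variable come from the quick start.

```python
import os
import shlex
import subprocess

# Entry workflows listed in the quick start.
ENTRY_POINTS = ("QUALITY_ALIGN", "FREYJA_ONLY", "AQUASCOPE")


def build_run_command(profile, entry, script="main.nf"):
    """Return the argument list for `nextflow run`, rejecting unknown entries."""
    if not profile:
        raise ValueError("a profile is required")
    if entry not in ENTRY_POINTS:
        raise ValueError(f"-entry must be one of {ENTRY_POINTS}, got {entry!r}")
    return ["nextflow", "run", script, "-profile", profile, "-entry", entry]


def build_environment(profile, conda_cache_dir=None, base_env=None):
    """Copy the environment and, for conda runs, point Nextflow at a shared cache."""
    env = dict(os.environ if base_env is None else base_env)
    # Profiles may be comma-separated, so check each one.
    if conda_cache_dir and "conda" in profile.split(","):
        env["NXF_CONDA_CACHEDIR"] = str(conda_cache_dir)
    return env


def launch(profile, entry, conda_cache_dir=None):
    """Print and run the command; raises CalledProcessError if Nextflow fails."""
    cmd = build_run_command(profile, entry)
    print("Running:", shlex.join(cmd))
    env = build_environment(profile, conda_cache_dir)
    return subprocess.run(cmd, env=env, check=True)
```

Calling `launch("singularity", "AQUASCOPE")` from the repository root would start the full workflow, assuming `nextflow` is on the `PATH`. Validating the entry name in Python simply surfaces typos before Nextflow starts.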
For more detailed documentation, please visit our user-guides.
Aquascope was largely developed by OAMD's SciComp Team, with inputs from NWSS and the DCIPHER Team at Palantir. Detailed contributions can be found in our user-guides.
If you would like to contribute to this pipeline, please see the contributing guidelines.
An extensive list of references for the tools used by the pipeline can be found in the `CITATIONS.md` file.