CCBGpipe

CCBGpipe is a pipeline for completing circular bacterial genomes from a single barcoded MinION run using a sampling strategy.

To run with Docker

Because CCBGpipe relies on many analytical tools, we strongly recommend running it with Docker. Alternatively, you can install all dependencies as listed in the Installation section.

git clone https://github.com/espirado/CCBGpipe.git

cd CCBGpipe

docker build -t "ccbgpipe:v1" ./

docker run -h ccbgpipe --name ccbgpipe -t -i -v /:/MyData ccbgpipe:v1 /bin/bash
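
The docker run command above mounts the host's entire root filesystem at /MyData. If only one data directory is needed inside the container, a narrower mount may be preferable. The following is a minimal sketch that builds such a command; the helper name docker_run_command and the example path /data/minion are hypothetical, while the flags are the ones already used above.

```python
import os
import shlex


def docker_run_command(host_dir, image="ccbgpipe:v1", name="ccbgpipe"):
    """Build a docker run command that mounts only host_dir at /MyData.

    host_dir is a hypothetical example argument; bind mounts need an
    absolute host path, so it is normalised first.
    """
    host_dir = os.path.abspath(host_dir)
    return [
        "docker", "run",
        "-h", name,            # container hostname
        "--name", name,        # container name
        "-t", "-i",            # interactive terminal
        "-v", f"{host_dir}:/MyData",
        image, "/bin/bash",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_run_command("/data/minion")))
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without Docker.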

Inside the Docker container (at the root@ccbgpipe:/# prompt), install Java:
    apt-get update
    apt-get install -y software-properties-common
    add-apt-repository ppa:webupd8team/java
    apt-get update && apt-get install oracle-java8-installer

Please note: the Oracle JDK license changed on April 16, 2019, so the installer above may no longer be available.
As an alternative, you can install Zulu (https://www.azul.com/downloads/zulu/) to provide Java:
    apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 0xB1998361219BD9C9
    apt-add-repository 'deb http://repos.azulsystems.com/ubuntu stable main'
    echo 'deb http://repos.azulsystems.com/debian stable main' > /etc/apt/sources.list.d/zulu.list
    apt-get update
    apt-get install zulu-8

Installation

Installation from source

cd /opt

git clone https://github.com/jade-nhri/CCBGpipe.git

cd CCBGpipe/CCBGpipe

chmod +x *.py

export PATH="$PATH:/opt/CCBGpipe/CCBGpipe/"

Dependencies

Before installing the dependencies, you may need to install some prerequisite libraries, preferably with a package manager. On Ubuntu these are:

  • cmake
  • liblzma-dev
  • libbz2-dev
  • libz-dev
  • libncurses-dev
  • libcurl4-gnutls-dev
  • libssl-dev
  • make
  • wget
  • python3-all-dev
  • parallel
  • networkx
  • pandas
  • pyfastaq

Quick usage

  • To extract fastq (joinedreads.fastq) and fast5 files using extract.py. Both joinedreads.fastq and the fast5 files are written to outpath/fast5/barcodeXX/.

extract.py path-to-raw_reads outpath (e.g., extract.py raw-reads albacore)

  • To create a Run folder and enter it

mkdir Run && cd Run

  • To get high-quality and long-length reads from joinedreads.fastq using runGetFastq.py

runGetFastq.py path-to-fast5 (e.g., runGetFastq.py ../albacore/fast5)

  • To get miniasm assemblies using runmini.py

runmini.py

  • To run canu with the sampling strategy by using runAssembly.py

runAssembly.py

  • To run racon and nanopolish for consensus sequence generation using runConsensus.py with fast5 files

runConsensus.py path-to-fast5 (e.g., runConsensus.py ../albacore/fast5/)

  • To get circular genomes by using finalize.py

finalize.py outpath (e.g., finalize.py ../results)
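
The Quick usage steps above can be chained in a small driver script. The following is a minimal sketch, not part of CCBGpipe. The function names are hypothetical, the default paths are the example values used above, and it assumes the CCBGpipe scripts are on PATH and that outpath is given relative to the directory containing Run.

```python
import subprocess
from pathlib import Path


def quick_usage_steps(raw_reads="raw-reads", outpath="albacore", results="../results"):
    """Return (working_dir, command) pairs mirroring the Quick usage steps."""
    fast5 = f"../{outpath}/fast5"  # path as seen from inside the Run folder
    return [
        (".", ["extract.py", raw_reads, outpath]),
        ("Run", ["runGetFastq.py", fast5]),
        ("Run", ["runmini.py"]),
        ("Run", ["runAssembly.py"]),
        ("Run", ["runConsensus.py", fast5 + "/"]),
        ("Run", ["finalize.py", results]),
    ]


def run_pipeline(steps, dry_run=True):
    """Run each step in its working directory, stopping at the first failure."""
    for workdir, cmd in steps:
        if dry_run:
            print(f"[{workdir}] {' '.join(cmd)}")
            continue
        Path(workdir).mkdir(exist_ok=True)  # creates the Run folder on first use
        subprocess.run(cmd, cwd=workdir, check=True)


if __name__ == "__main__":
    run_pipeline(quick_usage_steps(), dry_run=True)
```

With dry_run=True the script only prints the planned commands. Setting it to False executes them, and check=True aborts the run as soon as one step exits with an error, so later steps never start from incomplete output.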

Basecalling with Guppy instead of Albacore

Running with guppy

/opt/ont-guppy-cpu/bin/guppy_basecaller -i path-to-raw_reads -s outpath

  • or

cd /opt/ont-guppy-cpu/bin && ./guppy_basecaller

  • To extract fastq files using guppy_basecaller

guppy_basecaller -i path-to-raw_reads -s outpath (e.g., guppy_basecaller -i Fast5 -s guppy_out)

  • To de-multiplex

guppy_barcoder -i inpath -s outpath (e.g., guppy_barcoder -i guppy_out -s barcoding)

  • To produce read_id list and joinedreads.fastq for each barcode

preprocess.py -b path-to-barcoding_summary.txt -s path-to-sequencing_summary.txt -o outpath (e.g., preprocess.py -b barcoding/barcoding_summary.txt -s guppy_out/sequencing_summary.txt -o outdir)

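  • To collect the fast5 files for each barcode from its read_id list using filter_reads

filter_reads --recursive --multi --workers 32 Fast5/ fast5/barcode01 outdir/barcode01/barcode01_readid.tsv (example for barcode01)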
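
The filter_reads example above covers a single barcode. For a multiplexed run the same command would be repeated once per barcode. The sketch below generates those commands; the helper name filter_reads_commands is hypothetical, and the arguments simply follow the pattern of the barcode01 example (input directory, per-barcode output directory, per-barcode read_id list).

```python
def filter_reads_commands(barcodes, raw_dir="Fast5/", fast5_root="fast5",
                          list_root="outdir", workers=32):
    """Build one filter_reads command per barcode, following the barcode01 example."""
    commands = []
    for barcode in barcodes:
        commands.append([
            "filter_reads", "--recursive", "--multi", "--workers", str(workers),
            raw_dir,                                          # raw fast5 input
            f"{fast5_root}/{barcode}",                        # per-barcode output
            f"{list_root}/{barcode}/{barcode}_readid.tsv",    # read_id list
        ])
    return commands


if __name__ == "__main__":
    # Example: barcode01 to barcode12; adjust to the barcodes actually used.
    for cmd in filter_reads_commands([f"barcode{i:02d}" for i in range(1, 13)]):
        print(" ".join(cmd))
```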

  • With the data produced by the steps above, you can run CCBGpipe, starting by creating a Run folder

mkdir Run && cd Run

runGetFastq.py path-to-fast5 (e.g., runGetFastq.py ../outdir/)

runmini.py

runAssembly.py

runConsensus.py path-to-fast5 (e.g., runConsensus.py ../fast5/)

finalize.py outpath (e.g., finalize.py ../results)
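
Before starting the Guppy-based run, it can help to confirm that the preprocessing outputs are in place for every barcode. The following is a minimal sketch of such a check. The function name missing_inputs is hypothetical, and the location of joinedreads.fastq (outdir/barcodeXX/) is an assumption inferred from the read_id list path used above.

```python
from pathlib import Path


def missing_inputs(outdir, fast5_dir, barcodes):
    """Return the expected per-barcode paths that do not exist yet."""
    missing = []
    for barcode in barcodes:
        expected = [
            Path(outdir) / barcode / f"{barcode}_readid.tsv",
            Path(outdir) / barcode / "joinedreads.fastq",  # assumed location
            Path(fast5_dir) / barcode,                      # filtered fast5 files
        ]
        missing.extend(str(path) for path in expected if not path.exists())
    return missing


if __name__ == "__main__":
    problems = missing_inputs("outdir", "fast5", ["barcode01"])
    if problems:
        print("Missing inputs:")
        for path in problems:
            print("  " + path)
    else:
        print("All expected inputs found.")
```

An empty result suggests the Run folder steps can start; any listed path points to the preprocessing step that still needs to be run for that barcode.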