Partial runs [Deprecated]

[Deprecated warning] The instructions in this chapter only work with MOSCA up to version 1.6.1. I'm leaving them here in case someone is still interested in using this information.

You may not want to run the entire MOSCA workflow. Below are some examples of tasks that are better carried out by running parts of MOSCA separately. The following commands assume you have installed MOSCA as instructed.

Preprocess NGS reads

MOSCA's preprocessing script can be used standalone, as it automatically downloads all resources required.

python ~/anaconda3/envs/mosca/share/MOSCA/scripts/preprocess.py -i {your input reads (e.g. mg_R1.fq,mg_R2.fq)} -t {number of threads} -o {output directory} -adaptdir {resources directory}/adapters -rrnadbs {resources directory}/rRNA_databases -d {data_type (either "dna" or "mrna")} -rd {resources directory} -n --minlen {minimum length of reads to keep} --avgqual {minimum average quality of reads to keep}

Run MOSCA without replicates

MOSCA's differential expression analysis module requires replicates. The rest of MOSCA's analysis is still possible without replicates by bypassing that module, as follows:

First, preprocess your datasets as explained above.
Join your reads by sample by running the following commands for each "forward" and "reverse" file:

cat {forward_file} >> {output}/Preprocess/{sample}_forward.fastq
cat {reverse_file} >> {output}/Preprocess/{sample}_reverse.fastq
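
When a sample is spread over several preprocessed files, the joining step can be wrapped in a small shell function. This is a minimal sketch and not part of MOSCA: `join_reads` is a hypothetical helper name, and it assumes the `{sample}_forward.fastq` / `{sample}_reverse.fastq` naming that the assembly step expects. Unlike the `>>` commands above, it overwrites the target file, so rerunning it does not duplicate reads.

```shell
# join_reads OUTPUT_DIR SAMPLE DIRECTION FILE...
# Hypothetical helper (not part of MOSCA): concatenates every FILE, in the
# order given, into OUTPUT_DIR/Preprocess/SAMPLE_DIRECTION.fastq.
join_reads() {
    if [ "$#" -lt 4 ]; then
        echo "usage: join_reads OUTPUT_DIR SAMPLE DIRECTION FILE..." >&2
        return 1
    fi
    output_dir=$1
    sample=$2
    direction=$3   # "forward" or "reverse"
    shift 3
    mkdir -p "$output_dir/Preprocess"
    # ">" overwrites the target, so a rerun does not append the reads twice
    cat "$@" > "$output_dir/Preprocess/${sample}_${direction}.fastq"
}
```

For example, `join_reads {output} {sample} forward {forward_file_1} {forward_file_2}`, followed by the same call with `reverse` and the reverse files. List the files in the same order in both calls so that read pairs stay aligned.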

Perform assembly by running the following command for each sample:

python ~/anaconda3/envs/mosca/share/MOSCA/scripts/assembly.py -r {output}/Preprocess/{sample}_forward.fastq,{output}/Preprocess/{sample}_reverse.fastq -t {threads} -o {output}/Assembly/{sample} -a {assembler (either "metaspades" or "megahit")} -m {max_memory}

Optionally, perform binning by running the following command for each sample:

python ~/anaconda3/envs/mosca/share/MOSCA/scripts/binning.py -c {output}/Assembly/{sample}/contigs.fasta -t {threads} -o {output}/Binning/{sample} -r {output}/Preprocess/{sample}_forward.fastq,{output}/Preprocess/{sample}_reverse.fastq -mset {markerset (either "107" or "40")}

Perform gene calling and annotation over the contigs by running the following command for each sample:

python ~/anaconda3/envs/mosca/share/MOSCA/scripts/annotation.py -i {output}/Assembly/{sample}/contigs.fasta -t {threads} -o {output}/Annotation/{sample} -em {error_model} -db {path/to/diamond_database.(fasta/dmnd)} -mts {diamond_max_target_seqs} --assembled

Run UPIMAPI for each sample

upimapi.py -i {output}/Annotation/{sample}/aligned.blast -o {output}/Annotation/uniprotinfo --blast --full-id

Run reCOGnizer for each sample

recognizer.py -f {output}/Annotation/{sample}/fgs.faa -t {threads} -o {output}/Annotation/{sample} -rd {path/to/resources_directory} --remove-spaces

Run quantification, all at once

python ~/anaconda3/envs/mosca/share/MOSCA/scripts/quantification_analyser.py -e {path/to/experiments_file} -t {threads} -o {output} -if {input_format_of_experiments_file ("excel" or "tsv")}

Join all information

python ~/anaconda3/envs/mosca/share/MOSCA/scripts/join_information.py -e {path/to/experiments_file} -t {threads} -o {output} -if {input_format_of_experiments_file ("excel" or "tsv")} -nm {normalization_method ("TMM" or "RLE")}

Run KEGGCharter

kegg_charter.py -f {output}/MOSCA_Entry_Report.xlsx -o {output}/KEGG_maps -mm {metabolic_maps comma-separated (e.g. 00030,00680,...)} -gcol {mg_names comma-separated} -tcol {mt_names comma-separated} -tc 'Taxonomic lineage ({taxa_level})' -not {number_of_taxa} -keggc 'Cross-reference (KEGG)'

Run final reporting

python ~/anaconda3/envs/mosca/share/MOSCA/scripts/report.py -e {path/to/experiments_file} -o {output} -ldir ~/anaconda3/envs/mosca/share/MOSCA/resources -if {input_format_of_experiments_file ("excel" or "tsv")}
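
The per-sample steps above are linked through the directory layout under {output}: assembly reads from Preprocess, and binning and annotation read the contigs written by assembly. The sketch below only prints the assembly, binning and annotation commands for each sample, so the paths can be checked before anything is run. `print_sample_commands` is a hypothetical helper and not part of MOSCA. The script paths and flags are copied from the commands above, the sample names are made up, and placeholders such as {max_memory} are left unfilled on purpose.

```shell
# Location of the MOSCA scripts, as used in the commands above
scripts_dir="$HOME/anaconda3/envs/mosca/share/MOSCA/scripts"

# print_sample_commands OUTPUT_DIR THREADS SAMPLE
# Hypothetical helper (not part of MOSCA): prints, without running them,
# the assembly, binning and annotation commands for one sample.
print_sample_commands() {
    output_dir=$1
    threads=$2
    sample=$3
    # Joined reads and assembled contigs, following the layout used above
    reads="$output_dir/Preprocess/${sample}_forward.fastq,$output_dir/Preprocess/${sample}_reverse.fastq"
    contigs="$output_dir/Assembly/$sample/contigs.fasta"

    printf '%s\n' "python $scripts_dir/assembly.py -r $reads -t $threads -o $output_dir/Assembly/$sample -a {assembler} -m {max_memory}"
    printf '%s\n' "python $scripts_dir/binning.py -c $contigs -t $threads -o $output_dir/Binning/$sample -r $reads -mset {markerset}"
    printf '%s\n' "python $scripts_dir/annotation.py -i $contigs -t $threads -o $output_dir/Annotation/$sample -em {error_model} -db {path/to/diamond_database.(fasta/dmnd)} -mts {diamond_max_target_seqs} --assembled"
}

# Example: print the commands for two samples (names are illustrative only)
for sample in sample_A sample_B; do
    print_sample_commands results 4 "$sample"
done
```

Printing the commands instead of running them keeps the sketch safe to try on any machine. Once the remaining placeholders are filled in, the printed lines can be reviewed and then run by hand.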
