This directory contains the scripts used to map and analyze RNAseq data in batch on a cluster (written specifically for slurm job submission on sherlock.stanford.edu).
Two pipelines are currently available: the tophat2-bowtie2-cufflinks pipeline (slow) or the Kallisto-Sleuth pipeline (fast).
-
Bowtie requires an indexed representation of the target DNA to align to. To generate the index (here generating Long indices, see Bowtie Documentation for more info), edit the 'bowtieIndexL.sh' file to point to your target fasta/fa.gz.
-
Run indexing on slurm cluster:
> sbatch bowtieIndexL.sh
-
Move all fastq files to $SCRATCH on cluster
-
Edit tophatBatch.sh to point to your inputs and outputs (index file, target fasta, fastqs).
-
From within $SCRATCH directory (or dir where fastq's live), run tophatBatch.sh.
> sh /path/to/tophatBatch.sh
This will generate all of the slurm command files for the directory listed in the *Batch.sh file.
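As a rough illustration of this generation step, the sketch below writes one slurm command file per fastq in a directory. It does not reproduce tophatBatch.sh: the function name `write_slurm_files`, the `#SBATCH` values, the `_batchOutput` naming, and the uncompressed `.fastq` extension are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: write one "<sample>.tophat.slurm" file per fastq.
# The #SBATCH settings and the index path are placeholders, not the
# values used by tophatBatch.sh.
write_slurm_files() {
    local fastq_dir="$1"   # directory holding the fastq files
    local index="$2"       # bowtie2 index prefix (placeholder)
    local fq sample
    for fq in "$fastq_dir"/*.fastq; do
        [ -e "$fq" ] || continue            # skip if the glob matched nothing
        sample="$(basename "$fq" .fastq)"
        # Unquoted EOF: ${sample}, ${index} and ${fq} are expanded now,
        # so each generated file is specific to one sample.
        cat > "$fastq_dir/${sample}.tophat.slurm" <<EOF
#!/bin/bash
#SBATCH --job-name=${sample}_tophat
#SBATCH --time=12:00:00
#SBATCH --mem=16G
tophat2 -o ${sample}_batchOutput ${index} ${fq}
EOF
    done
}
```

Calling `write_slurm_files "$SCRATCH" /path/to/index` would leave one `*tophat.slurm` file next to each fastq, ready for submission.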
- Then run:
> for i in *tophat.slurm; do
> sbatch $i
> done
-
Confirm that all tophat2 runs have completed: 'align_summary.txt' and 'mapped_reads.bam' should now be in each "*batchOutput/" directory, where * is the sample ID info.
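Checking many output directories by hand is error-prone, so this check could be scripted. The sketch below is a minimal example; the function name `check_tophat_outputs` is hypothetical, and it only tests that the two files named above exist and are non-empty, not that the alignments themselves are valid.

```shell
#!/usr/bin/env bash
# Report any tophat2 output directory that lacks one of the expected files.
# Returns 1 if any file is missing or empty, 0 otherwise (including the
# case where no *batchOutput/ directory is found at all).
check_tophat_outputs() {
    local parent="${1:-.}"   # directory containing the *batchOutput/ folders
    local dir f status=0
    for dir in "$parent"/*batchOutput/; do
        [ -d "$dir" ] || continue           # glob matched nothing
        for f in align_summary.txt mapped_reads.bam; do
            if [ ! -s "$dir$f" ]; then      # missing or empty file
                echo "incomplete: $dir$f"
                status=1
            fi
        done
    done
    return "$status"
}
```

Run from $SCRATCH, `check_tophat_outputs .` lists each missing file, so the affected samples can be resubmitted before moving on to cufflinks.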
-
From within $SCRATCH (where fastq's live), run cufflinksBatch.sh.
> sh /path/to/cufflinksBatch.sh
This will generate all of the slurm command files for cufflinks based on the folders from tophat.
- Then run:
> for i in *cufflinks.slurm; do
> sbatch $i
> done
-
Kallisto requires an indexed representation of the target DNA to align to. To generate the index, edit the 'kallistoIndex.sh' file to point to your target fasta/fa.gz.
-
Run indexing on slurm cluster, like sherlock.stanford.edu:
> sbatch kallistoIndex.sh
Kallisto Manual: https://pachterlab.github.io/kallisto
- Move all fastq files to $SCRATCH on cluster (faster I/O)
- From within $SCRATCH directory (or dir where fastq's live), run 'kallistoBatch.sh'.
> sh /path/to/kallistoBatch.sh
This will generate all of the slurm command files for the directory listed in the *Batch.sh file.
- Then run:
> for i in *kallisto.slurm; do
> sbatch $i
> done
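The submission loop can also be wrapped in a small helper that supports a dry run, which helps when a directory holds many command files. This is a sketch only: `submit_all` and the `SUBMIT_CMD` variable are hypothetical names, not part of the scripts in this directory.

```shell
#!/usr/bin/env bash
# Submit every slurm command file whose name ends with the given suffix.
# SUBMIT_CMD is a hypothetical override so the loop can be previewed with
# "echo" before any job is actually sent to the scheduler.
submit_all() {
    local suffix="$1"                      # e.g. kallisto.slurm
    local submit="${SUBMIT_CMD:-sbatch}"   # defaults to real submission
    local f
    for f in *"$suffix"; do
        [ -e "$f" ] || continue            # no matching files: do nothing
        $submit "$f"
    done
}
```

`SUBMIT_CMD=echo submit_all kallisto.slurm` prints the files that would be submitted, and `submit_all kallisto.slurm` submits them with sbatch. The same helper works for the tophat and cufflinks files by changing the suffix.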
Sleuth Manual: https://pachterlab.github.io/sleuth/about
- Place all "KallistoOutput/" directories into same parent directory.
- Edit SleuthAnalysis.R to match your file directory and experimental model structure.
- Run Sleuth to generate outputs.
- To view interactive report of results (requires 'Shiny' package) type sleuth_live(SO), where 'SO' is your sleuth object.