This WILDS WDL workflow performs alignment using the two-pass methodology of STAR and subsequently analyzes that alignment via DESeq2. It is intended to be a relatively straightforward demonstration of an RNA sequencing pipeline within the context of the WILDS ecosystem.
For Fred Hutch users who are new to WDL, we recommend using PROOF to submit this workflow directly to the on-premises HPC cluster, as it simplifies interaction with Cromwell and provides a user-friendly front-end for job submission and tracking. To do this:
- Start by either cloning or downloading a copy of this repository to your local machine.
  - Cloning: git clone https://github.com/getwilds/ww-star-deseq2.git
  - Downloading: Click the green "Code" button in the top right corner, then click "Download ZIP".
- Update ww-star-deseq2-inputs.json with your sample names (omics_sample_name) and FASTQ file paths (R1 and R2).
- Update ww-star-deseq2-options.json with your preferred location for output data (final_workflow_outputs_dir).
- Submit the WDL file along with your custom JSON files to the Fred Hutch cluster via PROOF by following our SciWiki documentation.
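If you prefer to script the options update rather than edit the file by hand, a minimal Python sketch might look like the following. The key name final_workflow_outputs_dir comes from the step above; the function name set_output_dir is our own, and any output directory you pass in is your choice, not a value defined by this repository.

```python
import json


def set_output_dir(options_path, output_dir):
    """Set final_workflow_outputs_dir in an existing options JSON file.

    All other keys already in the file (for example the call caching
    settings) are left untouched.
    """
    with open(options_path) as handle:
        options = json.load(handle)

    options["final_workflow_outputs_dir"] = output_dir

    with open(options_path, "w") as handle:
        json.dump(options, handle, indent=2)
        handle.write("\n")

    return options
```

Calling set_output_dir("ww-star-deseq2-options.json", "<your output directory>") rewrites the file in place and returns the updated settings so you can inspect them before submitting.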
Additional Notes:
- Keep in mind that all file paths in the JSON files must be visible to the Fred Hutch cluster, e.g. /fh/fast/ or an AWS S3 bucket. Input file paths on your local machine won't work in PROOF.
- Specific reference genome files can be provided as inputs, but if none are provided, the workflow will automatically download a GRCh38 reference genome and use that. For the first go-around, we recommend starting with the default reference files.
- To avoid duplication of reference genome data, we highly recommend executing this workflow with call caching enabled in the options JSON (write_to_cache, read_from_cache, already set to true here).
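Because local paths are an easy mistake to make, it can help to scan the inputs JSON before submitting. The sketch below is a heuristic of our own, not part of the workflow: it walks whatever structure the JSON has, treats strings starting with "/", "~", "./" or "../" as local paths, and reports those outside /fh/fast/. That prefix is taken from the example in the note above; other locations may also be visible to the cluster, so extend the tuple as needed.

```python
import json

# Assumption: only the example prefix from the note is listed here.
# Add any other cluster-visible locations you rely on.
VISIBLE_LOCAL_PREFIXES = ("/fh/fast/",)


def iter_strings(value):
    """Yield every string found anywhere inside a parsed JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def looks_like_local_path(text):
    """Heuristic: local paths start with '/', '~', './' or '../'."""
    return text.startswith(("/", "~", "./", "../"))


def find_suspect_paths(inputs):
    """Return local-looking paths that are outside the visible prefixes."""
    return [
        text
        for text in iter_strings(inputs)
        if looks_like_local_path(text)
        and not text.startswith(VISIBLE_LOCAL_PREFIXES)
    ]


def check_inputs_file(path):
    """Load an inputs JSON file and return its suspect paths."""
    with open(path) as handle:
        return find_suspect_paths(json.load(handle))
```

An empty list from check_inputs_file("ww-star-deseq2-inputs.json") means nothing looked like a local-only path; S3 URIs are never flagged because they do not match the local-path heuristic.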
For users outside of Fred Hutch or more advanced users who would like to run the workflow locally, command line execution is relatively straightforward:
java -jar cromwell-86.jar run ww-star-deseq2.wdl --inputs ww-star-deseq2-inputs.json --options ww-star-deseq2-options.json
Although Cromwell is demonstrated here, this pipeline is not specific to Cromwell and can be run using whichever WDL execution method you prefer (miniwdl, Terra, HealthOmics, etc.).
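As a rough illustration of switching engines, the Python sketch below assembles the Cromwell command shown above alongside a miniwdl equivalent. The helper name build_command is hypothetical. The miniwdl form uses its -i flag for the inputs JSON and leaves out the options file, since that file is in Cromwell's own format; output location and call caching would need to be configured through miniwdl's own settings instead.

```python
import shlex

WDL = "ww-star-deseq2.wdl"
INPUTS = "ww-star-deseq2-inputs.json"
OPTIONS = "ww-star-deseq2-options.json"


def build_command(engine, cromwell_jar="cromwell-86.jar"):
    """Return the argument list for running the workflow with an engine."""
    if engine == "cromwell":
        return [
            "java", "-jar", cromwell_jar, "run", WDL,
            "--inputs", INPUTS,
            "--options", OPTIONS,
        ]
    if engine == "miniwdl":
        # The Cromwell options JSON is not passed to miniwdl.
        return ["miniwdl", "run", WDL, "-i", INPUTS]
    raise ValueError(f"unsupported engine: {engine}")


def as_shell(engine):
    """Render the command as a single shell-safe string."""
    return shlex.join(build_command(engine))
```

The returned list can be passed directly to subprocess.run, or as_shell can be used to print the command for copying into a terminal.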
For questions, bugs, and/or feature requests, reach out to the Fred Hutch Data Science Lab (DaSL) at wilds@fredhutch.org, or open an issue on our issue tracker.
If you would like to contribute to this WILDS WDL workflow, see our contribution guidelines as well as our WILDS Contributor Guide for more details.
Distributed under the MIT License. See LICENSE for details.