Commit

Merge branch 'data-aws' into 'dev'
Move data to AWS

See merge request epi2melabs/workflows/wf-transcriptomes!81
sarahjeeeze committed Nov 16, 2022
2 parents 9a8b197 + 5de212c commit f852dde
Showing 5 changed files with 72 additions and 26 deletions.
4 changes: 2 additions & 2 deletions .gitlab-ci.yml
Original file line number Diff line number Diff line change
@@ -43,7 +43,7 @@ docker-run:
NF_IGNORE_PROCESSES: preprocess_reads,merge_transcriptomes
- if: $MATRIX_NAME == "differential_expression"
variables:
NF_BEFORE_SCRIPT: tar -xzvf test_data/differential_expression.tar.gz
NF_BEFORE_SCRIPT: wget -O differential_expression.tar.gz https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-isoforms/differential_expression.tar.gz && tar -xzvf differential_expression.tar.gz
NF_WORKFLOW_OPTS: "--fastq differential_expression/differential_expression_fastq \
--de_analysis \
--ref_genome differential_expression/hg38_chr20.fa \
@@ -52,7 +52,7 @@ docker-run:
NF_IGNORE_PROCESSES: preprocess_reads,merge_transcriptomes
- if: $MATRIX_NAME == "only_differential_expression"
variables:
NF_BEFORE_SCRIPT: tar -xzvf test_data/differential_expression.tar.gz
NF_BEFORE_SCRIPT: wget -O differential_expression.tar.gz https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-isoforms/differential_expression.tar.gz && tar -xzvf differential_expression.tar.gz
NF_WORKFLOW_OPTS: "--fastq differential_expression/differential_expression_fastq \
--de_analysis \
--ref_genome differential_expression/hg38_chr20.fa \
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -17,7 +17,7 @@ repos:
pass_filenames: false
additional_dependencies:
- epi2melabs
- repo: https://gitlab.com/pycqa/flake8
- repo: https://github.com/pycqa/flake8
rev: 3.7.9
hooks:
- id: flake8
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -7,8 +7,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [unreleased]
### Updated
- Removed sanitize option
- Reduce size of differential expression data.
### Added
- Demo differential expression data in repository.
- Improved DE explanation in docs
- Option to turn off transcript assembly steps with param transcript_assembly
### Fixed
45 changes: 34 additions & 11 deletions README.md
@@ -120,15 +120,25 @@ tar -xzvf test_data.tar.gz
**Example execution of a workflow for reference-based transcript assembly and fusion detection**
```
OUTPUT=~/output;
nexflow run epi2me-labs/wf-transcriptomes --fastq ERR6053095_chr20.fastq --ref_genome chr20/hg38_chr20.fa --ref_annotation chr20/gencode.v22.annotation.chr20.gtf \
--jaffal_refBase chr20/ --jaffal_genome hg38_chr20 --jaffal_annotation genCode22" --out_dir outdir -w workspace_dir -profile conda -resume
nexflow run epi2me-labs/wf-transcriptomes \
--fastq ERR6053095_chr20.fastq \
--ref_genome chr20/hg38_chr20.fa \
--ref_annotation chr20/gencode.v22.annotation.chr20.gtf \
--jaffal_refBase chr20/ \
--jaffal_genome hg38_chr20 \
--jaffal_annotation "genCode22" \
--out_dir outdir -w workspace_dir
```

**Example workflow for denovo transcript assembly**
```
OUTPUT=~/output
nextflow run . --fastq test_data/fastq --denovo --ref_genome test_data/SIRV_150601a.fasta -profile local --out_dir ${OUTPUT} -w ${OUTPUT}/workspace \
--sample sample_id -resume
nextflow run . --fastq test_data/fastq \
--denovo \
--ref_genome test_data/SIRV_150601a.fasta \
--out_dir ${OUTPUT} \
-w ${OUTPUT}/workspace \
--sample sample_id
```
A full list of options can be seen in nextflow_schema.json. Below are some commonly used ones.

@@ -223,17 +233,30 @@ barcode06,treated
```

You will also need to provide a reference genome and a reference annotation file.
Here is an example cmd to run the workflow using the test_data provided.

Here is an example cmd to run the workflow. First you will need to download the data with wget.
eg.
```
wget -O differential_expression.tar.gz https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-isoforms/differential_expression.tar.gz && tar -xzvf differential_expression.tar.gz
OUTPUT=~/output;
nexflow run epi2me-labs/wf-transcriptomes --fastq test_data/differential_expression_fastq \
nextflow run epi2me-labs/wf-transcriptomes \
--fastq differential_expression/differential_expression_fastq \
--de_analysis \
--ref_genome test_data/hg38_chr20.fa \
--ref_annotation test_data/gencode.v22.annotation.chr20.gtf \
--direct_rna
--ref_genome differential_expression/hg38_chr20.fa \
--ref_annotation differential_expression/gencode.v22.annotation.chr20.gtf \
--direct_rna --minimap_index_opts \-k15
```
You can also run the differential expression section of the workflow on its own by providing a reference transcriptome and setting the transcriptome assembly parameter to false.
eg.
```
nextflow run epi2me-labs/wf-transcriptomes \
--fastq differential_expression/differential_expression_fastq \
--de_analysis \
--ref_genome differential_expression/hg38_chr20.fa \
--ref_annotation differential_expression/gencode.v22.annotation.chr20.gtf \
--direct_rna --minimap_index_opts \-k15 \
--ref_transcriptome differential_expression/ref_transcriptome.fasta \
--transcriptome_assembly false
```


## Workflow outputs
* an HTML report document detailing the primary findings of the workflow.
45 changes: 34 additions & 11 deletions docs/quickstart.md
@@ -38,15 +38,25 @@ tar -xzvf test_data.tar.gz
**Example execution of a workflow for reference-based transcript assembly and fusion detection**
```
OUTPUT=~/output;
nexflow run epi2me-labs/wf-transcriptomes --fastq ERR6053095_chr20.fastq --ref_genome chr20/hg38_chr20.fa --ref_annotation chr20/gencode.v22.annotation.chr20.gtf \
--jaffal_refBase chr20/ --jaffal_genome hg38_chr20 --jaffal_annotation genCode22" --out_dir outdir -w workspace_dir -profile conda -resume
nexflow run epi2me-labs/wf-transcriptomes \
--fastq ERR6053095_chr20.fastq \
--ref_genome chr20/hg38_chr20.fa \
--ref_annotation chr20/gencode.v22.annotation.chr20.gtf \
--jaffal_refBase chr20/ \
--jaffal_genome hg38_chr20 \
--jaffal_annotation "genCode22" \
--out_dir outdir -w workspace_dir
```

**Example workflow for denovo transcript assembly**
```
OUTPUT=~/output
nextflow run . --fastq test_data/fastq --denovo --ref_genome test_data/SIRV_150601a.fasta -profile local --out_dir ${OUTPUT} -w ${OUTPUT}/workspace \
--sample sample_id -resume
nextflow run . --fastq test_data/fastq \
--denovo \
--ref_genome test_data/SIRV_150601a.fasta \
--out_dir ${OUTPUT} \
-w ${OUTPUT}/workspace \
--sample sample_id
```
A full list of options can be seen in nextflow_schema.json. Below are some commonly used ones.

@@ -141,17 +151,30 @@ barcode06,treated
```

You will also need to provide a reference genome and a reference annotation file.
Here is an example cmd to run the workflow using the test_data provided.

Here is an example cmd to run the workflow. First you will need to download the data with wget.
eg.
```
wget -O differential_expression.tar.gz https://ont-exd-int-s3-euwst1-epi2me-labs.s3.amazonaws.com/wf-isoforms/differential_expression.tar.gz && tar -xzvf differential_expression.tar.gz
OUTPUT=~/output;
nexflow run epi2me-labs/wf-transcriptomes --fastq test_data/differential_expression_fastq \
nextflow run epi2me-labs/wf-transcriptomes \
--fastq differential_expression/differential_expression_fastq \
--de_analysis \
--ref_genome test_data/hg38_chr20.fa \
--ref_annotation test_data/gencode.v22.annotation.chr20.gtf \
--direct_rna
--ref_genome differential_expression/hg38_chr20.fa \
--ref_annotation differential_expression/gencode.v22.annotation.chr20.gtf \
--direct_rna --minimap_index_opts \-k15
```
You can also run the differential expression section of the workflow on its own by providing a reference transcriptome and setting the transcriptome assembly parameter to false.
eg.
```
nextflow run epi2me-labs/wf-transcriptomes \
--fastq differential_expression/differential_expression_fastq \
--de_analysis \
--ref_genome differential_expression/hg38_chr20.fa \
--ref_annotation differential_expression/gencode.v22.annotation.chr20.gtf \
--direct_rna --minimap_index_opts \-k15 \
--ref_transcriptome differential_expression/ref_transcriptome.fasta \
--transcriptome_assembly false
```


## Workflow outputs
* an HTML report document detailing the primary findings of the workflow.
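
The CI `NF_BEFORE_SCRIPT` entries and both documentation hunks in this commit repeat the same `wget ... && tar -xzvf ...` one-liner. A slightly more defensive version of that step might look like the sketch below. It is not part of the commit: `fetch_demo_data` is a hypothetical helper name, and skipping the download when a non-empty archive already exists is an assumption about what a re-run should do.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_demo_data is a hypothetical helper, not part of the workflow.
set -euo pipefail

fetch_demo_data() {
    local url="$1"       # URL of the .tar.gz archive
    local archive="$2"   # local file name to download to

    # Assumption: a non-empty archive from an earlier run can be reused.
    if [ ! -s "$archive" ]; then
        wget -O "$archive" "$url"
    fi

    # List the archive first so a truncated download fails before unpacking.
    tar -tzf "$archive" > /dev/null
    tar -xzf "$archive"
}
```

Calling `fetch_demo_data <url> differential_expression.tar.gz` with the S3 URL from the diff would then leave a `differential_expression/` directory in the working directory, matching the paths used in the example commands.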
