Merge pull request nf-core#1502 from lescai/cnvkit_update
Update CNVkit
maxulysse authored May 13, 2024
2 parents 6e2325a + bb0ec8d commit b6b4e52
Showing 13 changed files with 263 additions and 3 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -9,8 +9,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- [#1502](https://github.com/nf-core/sarek/pull/1502) - export CNVs into VCF format in `bam_variant_calling_cnvkit`

### Changed

- [#1502](https://github.com/nf-core/sarek/pull/1502) - Improved handling of CNVkit reference
- [#1502](https://github.com/nf-core/sarek/pull/1502) - Specific CNV call step, with recommended settings for germline
- [#1508](https://github.com/nf-core/sarek/pull/1508) - Sync `TEMPLATE` with `tools` `2.14.0`
- [#1513](https://github.com/nf-core/sarek/pull/1513) - Back to dev
- [#1518](https://github.com/nf-core/sarek/pull/1518) - Sync `TEMPLATE` with `tools` `2.14.1`
30 changes: 30 additions & 0 deletions conf/modules/cnvkit.config
@@ -25,6 +25,36 @@ process {
]
}

withName: '.*:BAM_VARIANT_CALLING_CNVKIT:CNVKIT_CALL' {
ext.when = { params.tools && params.tools.split(',').contains('cnvkit') }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/cnvkit/${meta.id}/" },
pattern: "*{cns}"
]
}
withName: '.*:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_VARIANT_CALLING_CNVKIT:CNVKIT_CALL' {
ext.prefix = { "${cns.baseName}.germline.call" }
ext.args = "--filter ci"
}
withName: '.*:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_CNVKIT:CNVKIT_CALL' {
ext.prefix = { "${cns.baseName}.somatic.call" }
}
withName: '.*:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_CNVKIT:CNVKIT_CALL' {
ext.prefix = { "${cns.baseName}.tumor_only.call" }
}

withName: 'CNVKIT_EXPORT' {
ext.args = "vcf"
ext.prefix = { "${meta.id}.cnvcall" }
ext.when = { params.tools && params.tools.split(',').contains('cnvkit') }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/cnvkit/${meta.id}/" },
pattern: "*{vcf}"
]
}

withName: 'CNVKIT_GENEMETRICS' {
ext.prefix = { "${cnr.baseName}.genemetrics" }
ext.when = { params.tools && params.tools.split(',').contains('cnvkit') }
25 changes: 25 additions & 0 deletions docs/usage.md
@@ -1136,3 +1136,28 @@ Currently, Sentieon's version of BQSR, QualCal, is not available in Sarek. Recen
Resource requests are difficult to generalize and are often dependent on input data size. Currently, the number of cpus and memory requested by default were adapted from tests on 5 ICGC paired whole-genome sequencing samples with approximately 40X and 80X depth.
For targeted data analysis, this is overshooting by a lot. In this case resources for each process can be limited by either setting `--max_memory` and `--max_cpus` or tailoring the request by process name as described [here](#resource-requests). If you are using Sarek for a certain data type regularly, and would like to make these requests available to others on your system, an institution-specific, pipeline-specific config file can be added [here](https://github.com/nf-core/configs/tree/master/conf/pipeline/sarek).

## CNV calling with CNVkit

The CNV calling in Sarek implements the approach proposed by [CNVkit](https://cnvkit.readthedocs.io/en/stable/).
It is possible to call CNVs with whole-genome or targeted capture data (exome and amplicons): depending on the sequencing approach, Sarek applies different [settings](https://cnvkit.readthedocs.io/en/stable/nonhybrid.html) as recommended by CNVkit.

### Reference background

Because this class of CNV calling algorithms relies on detecting variation in the coverage profile, defining a background reference from control data is known to improve calling in targeted and hybrid capture applications. This helps ensure accurate profiling, especially in the off-target regions.
We recommend creating a background reference with the nf-core pipeline [createpanelrefs](https://nf-co.re/createpanelrefs).

:warning: When creating a coverage reference, pay particular attention to the following:

- the control samples should be processed with the same targeted capture and sequencing technology as the samples being analysed
- if BAM files are used to compute the background, they should have been processed with the same pipeline used to call the CNVs
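
Once a background reference exists, it can be handed to Sarek as a parameter. The fragment below is a minimal sketch as a Nextflow config, assuming that the `cnvkit_reference` parameter used by the pipeline's workflow code accepts the path to a pre-built CNVkit reference (`.cnn`) file; the file name and path are hypothetical.

```groovy
// custom.config: hypothetical file name and path, for illustration only
params {
    tools            = 'cnvkit'                   // enable the CNVkit steps
    cnvkit_reference = '/path/to/background.cnn'  // assumed: pre-built CNVkit reference
}
```

A file like this would typically be supplied with Nextflow's `-c` option; the same two values could equally be given on the command line as `--tools` and `--cnvkit_reference`.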

### Germline calling

Sarek implements the [recommended germline settings](https://cnvkit.readthedocs.io/en/stable/germline.html), i.e. it applies the `--filter ci` option in the CNVkit call step.
This is defined at the config level by adding the option to `ext.args`, so the user can choose a different approach by changing the arguments in a custom config.
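
As an illustration of such an override, the sketch below reuses the germline process selector from `conf/modules/cnvkit.config` and swaps the default filter for CNVkit's `--filter cn`. It is an example of the mechanism, not a recommended setting, and it assumes the custom config is loaded after the pipeline's own configuration.

```groovy
// custom.config: a sketch, not the pipeline's shipped configuration
process {
    withName: '.*:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_VARIANT_CALLING_CNVKIT:CNVKIT_CALL' {
        // Replaces the default "--filter ci" entirely (ext.args is overwritten, not appended to).
        // "--filter cn" merges adjacent segments that share the same called copy number.
        ext.args = "--filter cn"
    }
}
```

Because `ext.args` is overwritten rather than extended, any default option that should be kept has to be repeated in the new string.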

### Somatic calling

The [available options](https://cnvkit.readthedocs.io/en/stable/tumor.html) for tumour analysis depend heavily on the specific design being analysed. Sarek therefore does not implement any of these choices, i.e. it runs the CNVkit call step with default settings.
We encourage users to verify whether particular settings might be more appropriate for their data.
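
Should different call settings turn out to suit a tumour dataset better, they can be supplied in the same way as for germline calling. The sketch below targets the somatic selector from `conf/modules/cnvkit.config` and passes explicit thresholds to `cnvkit.py call`; the threshold values are illustrative only and would need to be chosen for the data at hand.

```groovy
// custom.config: a sketch with illustrative values, not a recommendation
process {
    withName: '.*:BAM_VARIANT_CALLING_SOMATIC_ALL:BAM_VARIANT_CALLING_CNVKIT:CNVKIT_CALL' {
        // Hard log2 thresholds for integer copy-number calls (example values)
        ext.args = "--method threshold --thresholds=-1.1,-0.4,0.3,0.7"
    }
}
```

Tumour-only runs use a separate selector (`BAM_VARIANT_CALLING_TUMOR_ONLY_ALL`), so they would need their own `withName` block.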
10 changes: 10 additions & 0 deletions modules.json
@@ -76,6 +76,16 @@
"git_sha": "f53b071421340e6fac0806c86ba030e578e94826",
"installed_by": ["modules"]
},
"cnvkit/call": {
"branch": "master",
"git_sha": "a64788f5ad388f1d2ac5bd5f1f3f8fc81476148c",
"installed_by": ["modules"]
},
"cnvkit/export": {
"branch": "master",
"git_sha": "a64788f5ad388f1d2ac5bd5f1f3f8fc81476148c",
"installed_by": ["modules"]
},
"cnvkit/genemetrics": {
"branch": "master",
"git_sha": "a64788f5ad388f1d2ac5bd5f1f3f8fc81476148c",
7 changes: 7 additions & 0 deletions modules/nf-core/cnvkit/call/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

36 changes: 36 additions & 0 deletions modules/nf-core/cnvkit/call/main.nf
48 changes: 48 additions & 0 deletions modules/nf-core/cnvkit/call/meta.yml
7 changes: 7 additions & 0 deletions modules/nf-core/cnvkit/export/environment.yml
35 changes: 35 additions & 0 deletions modules/nf-core/cnvkit/export/main.nf
43 changes: 43 additions & 0 deletions modules/nf-core/cnvkit/export/meta.yml

17 changes: 15 additions & 2 deletions subworkflows/local/bam_variant_calling_cnvkit/main.nf
@@ -4,7 +4,9 @@
// For all modules here:
// A when clause condition is defined in the conf/modules.config to determine if the module should be run

include { CNVKIT_BATCH } from '../../../modules/nf-core/cnvkit/batch/main'
include { CNVKIT_CALL } from '../../../modules/nf-core/cnvkit/call/main'
include { CNVKIT_EXPORT } from '../../../modules/nf-core/cnvkit/export/main'
include { CNVKIT_GENEMETRICS } from '../../../modules/nf-core/cnvkit/genemetrics/main'

workflow BAM_VARIANT_CALLING_CNVKIT {
@@ -21,12 +23,23 @@ workflow BAM_VARIANT_CALLING_CNVKIT {

CNVKIT_BATCH(cram, fasta, fasta_fai, targets, reference, generate_pon)

// right now we do not use an input VCF to improve the calling of B alleles
// based on SNV frequencies from the VCF file
// in the future we might consider to add this, by connecting the emission from
// SNV variant calling modules
CNVKIT_CALL(CNVKIT_BATCH.out.cns.map{ meta, cns -> [meta, cns[2], []]})

// export to VCF for compatibility with other tools
CNVKIT_EXPORT(CNVKIT_CALL.out.cns)

ch_genemetrics = CNVKIT_BATCH.out.cnr.join(CNVKIT_BATCH.out.cns).map{ meta, cnr, cns -> [meta, cnr, cns[2]]}
CNVKIT_GENEMETRICS(ch_genemetrics)

versions = versions.mix(CNVKIT_BATCH.out.versions)
versions = versions.mix(CNVKIT_GENEMETRICS.out.versions)

emit:
cnv_calls_raw = CNVKIT_CALL.out.cns // channel: [ meta, cns ]
cnv_calls_export = CNVKIT_EXPORT.out.output // channel: [ meta, export_format ]
versions // channel: [ versions.yml ]
}
3 changes: 2 additions & 1 deletion subworkflows/local/bam_variant_calling_germline_all/main.nf
@@ -26,6 +26,7 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
skip_tools // Mandatory, list of tools to skip
cram // channel: [mandatory] meta, cram
bwa // channel: [mandatory] meta, bwa
cnvkit_reference // channel: [optional] cnvkit reference
dbsnp // channel: [mandatory] meta, dbsnp
dbsnp_tbi // channel: [mandatory] dbsnp_tbi
dbsnp_vqsr
@@ -87,7 +88,7 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
fasta,
fasta_fai,
intervals_bed_combined.map{ it -> [[id:it[0].baseName], it] },
[[id:"null"], []]
params.cnvkit_reference ? cnvkit_reference.map{ it -> [[id:it[0].baseName], it] } : [[:],[]]
)
versions = versions.mix(BAM_VARIANT_CALLING_CNVKIT.out.versions)
}
1 change: 1 addition & 0 deletions workflows/sarek/main.nf
@@ -712,6 +712,7 @@ workflow SAREK {
params.skip_tools,
cram_variant_calling_status_normal,
[ [ id:'bwa' ], [] ], // bwa_index for tiddit; not used here
cnvkit_reference,
dbsnp,
dbsnp_tbi,
dbsnp_vqsr,