Add NGSCheckMate #993
base: dev
Inclusion probably a call for @drpatelh, but a couple of minor comments in the meantime
@@ -768,6 +772,15 @@ workflow RNASEQ {
ch_versions = ch_versions.mix(DUPRADAR.out.versions.first())
}

if (params.ngscheckmate_bed) {
BAM_NGSCHECKMATE (
Is there any way the output could be baked into the MultiQC report?
That has been suggested over in sarek too, something I could consider.
docs/output.md
Outdated
</details>

[NGSCheckMate](https://github.com/parklab/NGSCheckMate) is a tool to verify that samples come from the same individual, by examining a set of single nucleotide polymorphisms (SNPs). This calculates correlations between the samples, and then applies a depth-dependent model of allele fractions to call samples as being related or not. The principal output is a dendrogram, where samples that are .
Unfinished sentence
ch_versions = ch_versions.mix(BCFTOOLS_MPILEUP.out.versions)

BCFTOOLS_MPILEUP
.out
Indent the ops (here and below)?
This is in the nf-core subworkflow, but I'll try and remember to do it when I convert to nf-test.
Hmm, we could check for `chr` prefix mismatches between the BED file and the reference; that would be relatively simple, I guess?
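A minimal sketch of what such a check could look like, written in Python rather than the pipeline's Nextflow so that it is self-contained. It assumes the contig names of the reference (for example from a `.fai` index or a BAM header) are already available as a list of strings. The function names `contig_style`, `bed_contigs` and `chr_mismatch` are hypothetical and are not part of this PR or of NGSCheckMate.

```python
def contig_style(names):
    """Classify contig names as 'chr' (all prefixed), 'plain' (none prefixed),
    'mixed', or 'empty'."""
    names = list(names)
    if not names:
        return "empty"
    prefixed = sum(1 for name in names if name.startswith("chr"))
    if prefixed == len(names):
        return "chr"
    if prefixed == 0:
        return "plain"
    return "mixed"


def bed_contigs(bed_lines):
    """Collect contig names (first column) from BED lines,
    skipping blank lines and comment/track/browser header lines."""
    contigs = set()
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        contigs.add(line.split()[0])
    return contigs


def chr_mismatch(bed_lines, reference_contigs):
    """Report whether the BED file and the reference share any contig names.

    A mismatch is flagged when the BED file has contigs but none of them
    appear in the reference, which is what happens when one side uses
    'chr1' and the other uses '1'.
    """
    bed = bed_contigs(bed_lines)
    ref = set(reference_contigs)
    shared = bed & ref
    return {
        "bed_style": contig_style(bed),
        "ref_style": contig_style(ref),
        "shared": len(shared),
        "mismatch": bool(bed) and not shared,
    }


if __name__ == "__main__":
    # Illustrative input only: a 'chr'-prefixed BED against an unprefixed reference.
    example_bed = ["chr1\t100\t101", "chr2\t200\t201"]
    print(chr_mismatch(example_bed, ["1", "2", "MT"]))
```

Comparing the sets of names directly, rather than only looking at the prefix, also catches a BED file built for a different assembly, at the cost of needing the reference contig list up front.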
This PR adds the NGSCheckMate tool to the pipeline. I find this an essential tool, and it is part of my initial QC for all the sequencing we perform. The tool takes a BED file with a set of SNPs and tries to determine whether samples come from the same individual. It was developed for, and is generally used in, humans (SNP BED files are available for hg19/hg38/GRCh37/GRCh38), but it could be used in other species if a suitable set of common SNPs was provided.
This should be configured to run automatically on GRCh37/GRCh38; for hg19/hg38 I think the files need adding to igenomes, but I need to check.
This ensures that no sample swaps have occurred by checking:
It works on human-derived cell lines, but won't separate different treatments applied to the same cell line.
Testing.
I have used a BED file on the test datasets to get it to run on yeast, but I haven't tried the full tests yet.
- `nf-core lint`
- `nextflow run . -profile test,docker --outdir <OUTDIR>`
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).