-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create tools for SV VCF cleaning #8996
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great overall. I have some suggestions on style and some places where you can reuse other classes. I did not mark all places where you can add final
to variable declarations, just a few cases. You may not need to create a separate "Engine" class for the internals here unless you think some of the components would be reusable in another step or if it makes testing easier.
src/main/java/org/broadinstitute/hellbender/tools/walkers/sv/SVCleanPt1a.java
Outdated
Show resolved
Hide resolved
src/main/java/org/broadinstitute/hellbender/tools/walkers/sv/SVCleanPt1a.java
Outdated
Show resolved
Hide resolved
src/main/java/org/broadinstitute/hellbender/tools/walkers/sv/SVCleanPt1a.java
Outdated
Show resolved
Hide resolved
src/main/java/org/broadinstitute/hellbender/tools/walkers/sv/SVCleanPt1a.java
Outdated
Show resolved
Hide resolved
src/main/java/org/broadinstitute/hellbender/tools/walkers/sv/SVCleanPt1a.java
Outdated
Show resolved
Hide resolved
src/main/java/org/broadinstitute/hellbender/tools/walkers/sv/SVCleanPt1a.java
Outdated
Show resolved
Hide resolved
@Argument( | ||
fullName = OUTPUT_REVISED_EVENTS_LIST_LONG_NAME, | ||
doc="Output list of revised genotyped events" | ||
) | ||
private GATKPath outputRevisedEventsList; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be possible to build this info directly into the VCF rather than have this extra file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated accordingly - added this as a new flag in the INFO field.
src/main/java/org/broadinstitute/hellbender/tools/spark/sv/utils/GATKSVVCFConstants.java
Outdated
Show resolved
Hide resolved
|
||
private void processSVType(VariantContext variant, VariantContextBuilder builder) { | ||
final String svType = variant.getAttributeAsString(GATKSVVCFConstants.SVTYPE, null); | ||
if (svType != null && variant.getAlleles().stream().noneMatch(allele -> allele.getDisplayString().contains(GATKSVVCFConstants.ME))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to parse the alleles with GATKSVVariantContextUtils.getSymbolicAlleleSymbols()
and check for ME
. Sometimes the alt is just <INS:ME>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated accordingly - thanks for the pointer.
failSet = readLastColumn(failList); | ||
passSet = readLastColumn(passList); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use TableReader
here. It would require you to add header lines to the lists in the WDL, which would be okay too. See TableUtils.reader()
and look at some implementations to see examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated accordingly - thanks for the tip. For visibility, I have made corresponding changes to GATK-SV in broadinstitute/gatk-sv#733.
…VCleanPt1a.java Co-authored-by: Mark Walker <markw@broadinstitute.org>
…VCleanPt1a.java Co-authored-by: Mark Walker <markw@broadinstitute.org>
…VCleanPt1a.java Co-authored-by: Mark Walker <markw@broadinstitute.org>
…VCleanPt1a.java Co-authored-by: Mark Walker <markw@broadinstitute.org>
…VCleanPt1a.java Co-authored-by: Mark Walker <markw@broadinstitute.org>
This PR is intended to introduce several new tools related to the CleanVcf workflow in GATK-SV, which the use of these tools being documented in broadinstitute/gatk-sv#733. These tools are intended to introduce several enhancements over the existing implementation, including but not limited to: