This repository has been archived by the owner on Jul 17, 2023. It is now read-only.
denovo_scoring.py extension - failure to read sample IDs in VCF #292
Comments
I encountered the same error. After unzipping the diploidSV.vcf.gz file, the script ran successfully.
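One possible reading of this workaround is that the script opens the file as plain text instead of decompressing it; that is an assumption and is not confirmed anywhere in this thread. A minimal sketch of making an uncompressed copy with the Python standard library before scoring is shown below. `decompress_vcf` is a hypothetical helper name and is not part of Manta.

```python
import gzip
import shutil


def decompress_vcf(gz_path, out_path):
    """Write an uncompressed copy of a gzip- or bgzip-compressed VCF."""
    # bgzip output is a sequence of gzip members, which the gzip module
    # reads back to back as a single stream.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

The uncompressed copy can then be passed to the scoring script in place of the .vcf.gz file.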
Could you show the code you used to run
#!/usr/bin/env python2
#
# Manta - Structural Variant and Indel Caller
# Copyright (c) 2013-2019 Illumina, Inc.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see http://www.gnu.org/licenses/.

import sys

def check_genotype(probandGT, fatherGT, motherGT):

def add_dq(tokens, probandIx, dq):

def process_vcf(vcfFile, probandID,

if __name__=='__main__':
I am currently working with manta/1.5.0 for SV calling of short read human WGS data - accessible to me as Illumina Isaac-aligned BAMs - for family-based (trio) calling. I am working in a (remote) trusted research environment in which manta has been installed and implemented by root administrators. For data security reasons, I cannot provide screenshots or verbatim output.
I am keen to generate a benchmark set of de novo SVs in my data. I've been working on a single test trio in a cohort of ~120.
I performed the configuration and execution steps with the three family BAMs as input; absolute file paths for the BAMs were given in the config step.
I have subsequently attempted to run denovo_scoring.py on the output diploidSV.vcf.gz VCF, as per README.md:
python $MANTA_INSTALL_FOLDER/libexec/denovo_scoring.py <proband_id> <father_id> <mother_id>
The job terminates with the following .stderr output:
The sample ID <proband_id>,<father_id>,<mother_id> does not exist in the vcf. Program exits
In diploidSV.vcf.gz, the sample IDs appear to have been parsed correctly and appear in the final header row of the VCF as:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT <PROBAND_ID> <FATHER_ID> <MOTHER_ID>
The IDs are correct and appear in the same order as the input. They share a common format, "AB1234-CDE_F56", with alphanumeric and special characters at fixed positions. The parsed IDs in the header are free of leading file paths and file extensions.
I'm not sure why the script is not detecting these as the correct sample IDs, as they are clearly present.
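As a way to check independently which sample IDs a reader actually sees in the header, here is a minimal diagnostic sketch. It does not reproduce the logic of denovo_scoring.py; the function names `read_vcf_sample_ids` and `missing_sample_ids` are hypothetical. It assumes a tab-delimited `#CHROM` line with nine fixed columns before the sample columns, and it detects gzip compression from the file's first two bytes rather than its extension.

```python
import gzip


def read_vcf_sample_ids(path):
    """Return the sample IDs from the #CHROM header line of a VCF."""
    # Detect gzip/bgzip by its two magic bytes, not by the file extension.
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as fh:
        for line in fh:
            if line.startswith("#CHROM"):
                # The first nine columns are fixed; samples follow FORMAT.
                return line.rstrip("\r\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # header ended without a #CHROM line
    return []


def missing_sample_ids(path, wanted_ids):
    """Return the requested IDs that are absent from the VCF header."""
    present = set(read_vcf_sample_ids(path))
    return [sid for sid in wanted_ids if sid not in present]
```

If `missing_sample_ids` returns an empty list for both the compressed and the uncompressed file, the IDs themselves are intact and the problem more likely lies in how the file is being opened or how the arguments are passed.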
Any advice from the original engineers or the community would be appreciated.