Commit
Merge pull request #305 from cidgoh/cancogen-testfile-0155
covid19 exampleInput 0.15.5
Showing 7 changed files with 6 additions and 6 deletions.
6 changes: 3 additions & 3 deletions
...9/exampleInput/invalidTestData_0-15-4.csv → ...9/exampleInput/invalidTestData_0-15-5.csv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
Database Identifiers,,,,,,,,,,,,Sample collection and processing,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Host Information,,,,,,,,,,,,,,,,,Host vaccination information,,,,,,,,,,,Host exposure information,,,,,,,,,,,,,,Host reinfection information,,,,,,Sequencing,,,,,,,,,,,,Bioinformatics and QC metrics,,,,,,,,,,,,,,,,,,,,,Lineage and Variant information,,,,,,Pathogen diagnostic testing,,,,,,,,,Contributor acknowledgement, | ||
specimen collector sample ID,third party lab service provider name,third party lab sample ID,case ID,Related specimen primary ID,IRIDA sample name,umbrella bioproject accession,bioproject accession,biosample accession,SRA accession,GenBank accession,GISAID accession,sample collected by,sample collector contact email,sample collector contact address,sequence submitted by,sequence submitter contact email,sequence submitter contact address,sample collection date,sample collection date precision,sample received date,geo_loc_name (country),geo_loc_name (state/province/territory),geo_loc_name (city),organism,isolate,purpose of sampling,purpose of sampling details,NML submitted specimen type,Related specimen relationship type,anatomical material,anatomical part,body product,environmental material,environmental site,collection device,collection method,collection protocol,specimen processing,specimen processing details,lab host,passage number,passage method,biomaterial extracted,host (common name),host (scientific name),host health state,host health status details,host health outcome,host disease,host age,host age unit,host age bin,host gender,host residence geo_loc name (country),host residence geo_loc name (state/province/territory),host subject ID,symptom onset date,signs and symptoms,pre-existing conditions and risk factors,complications,host vaccination status,number of vaccine doses received,vaccination dose 1 vaccine name,vaccination dose 1 vaccination date,vaccination dose 2 vaccine name,vaccination dose 2 vaccination date,vaccination dose 3 vaccine name,vaccination dose 3 vaccination date,vaccination dose 4 vaccine name,vaccination dose 4 vaccination date,vaccination history,location of exposure geo_loc name (country),destination of most recent travel (city),destination of most recent travel (state/province/territory),destination of most recent travel (country),most recent travel departure date,most recent travel return date,travel point of entry type,border 
testing test day type,travel history,exposure event,exposure contact level,host role,exposure setting,exposure details,prior SARS-CoV-2 infection,prior SARS-CoV-2 infection isolate,prior SARS-CoV-2 infection date,prior SARS-CoV-2 antiviral treatment,prior SARS-CoV-2 antiviral treatment agent,prior SARS-CoV-2 antiviral treatment date,purpose of sequencing,purpose of sequencing details,sequencing date,library ID,amplicon size,library preparation kit,flow cell barcode,sequencing instrument,sequencing protocol name,sequencing protocol,sequencing kit number,amplicon pcr primer scheme,raw sequence data processing method,dehosting method,consensus sequence name,consensus sequence filename,consensus sequence filepath,consensus sequence software name,consensus sequence software version,breadth of coverage value,depth of coverage value,depth of coverage threshold,r1 fastq filename,r2 fastq filename,r1 fastq filepath,r2 fastq filepath,fast5 filename,fast5 filepath,number of base pairs sequenced,consensus genome length,Ns per 100 kbp,reference genome accession,bioinformatics protocol,lineage/clade name,lineage/clade analysis software name,lineage/clade analysis software version,variant designation,variant evidence,variant evidence details,gene name 1,diagnostic pcr protocol 1,diagnostic pcr Ct value 1,gene name 2,diagnostic pcr protocol 2,diagnostic pcr Ct value 2,gene name 3,diagnostic pcr protocol 3,diagnostic pcr Ct value 3,authors,DataHarmonizer provenance | ||
sample123,Switch Health,abc12345,case4444,NMLsample2222,prov_rona_99,PRJNA623807,PRJNA608651,SAMN14180202,SRR11177792,MN908947.3,EPI_ISL_436489,,switch@email.ca,"123 Main Street, City, Province",National Microbiology Laboratory (NML),RespLab@lab.ca,"123 Sunnybrooke St, Toronto, Ontario, M4P 1L6, Canada",2018-03-01,,30-Apr,Canda,BC,Thunder Bay,Severe acute respiratory syndrome coronavirus 2,hCov-19/CANADA/BC-prov_rona_99/2020,Surveillance testing,Not Provided,Not Applicable, Reinfection testing,Not Applicable,Lungs,Not Applicable,Not Applicable,Not Applicable,Swab,Not Applicable,SOP123,Not Provided,Not Provided,Not Applicable,Not Applicable,Not Applicable,Not Provided,Batman,Homo chiroptera,Sick, Hospitalized (ICU),Recovered,,89,,80 - 89,Female,Cnada,British Columbia,PHN1234,2022-02-23,Cough;Fever,Not Provided,Not Provided,Fully Vaccinated,3,Pfizer-BioNTech (Comirnaty),2021-07-01,Pfizer-BioNTech (Comirnaty),2021-11-02,Moderna (Spikevax),2022-02-01,,,,United States of America,Portland,Oregon,United States of America,2022-03-02,05-2020,Air,day 10,,Occupational exposure (retail),direct human to human,Attendee,"Occupational, Residency or Patronage Exposure",,Prior infection,SARS-CoV-2/human/USA/CA-CDPH-001/2020,2021-06-01,Prior antiviral treatment,remdesivir,2021-06-05, Surveillance of international border crossing by air travel,Not Provided,,XYZ_123345,1200bp,Nextera XT,FAB06069, Illumina NextSeq 2000,SeqProt1234,"Genomes were generated through amplicon sequencing of 1200 bp amplicons with Freed schema primers. 
Libraries were created using Illumina DNA Prep kits, and sequence data was produced using Miseq Micro v2 (500 cycles) sequencing kits.",1234546,Freed,Trimmomatic 0.38,,ncov123assembly3,ncov123assembly.fasta,User/Documents/RespLab/Data/ncov123assembly.fasta,iVar,1.3,95%,400x,100x,ABC123_S1_L001_R1_001.fastq.gz,ABC123_S1_L001_R2_001.fastq.gz,/User/Documents/RespLab/Data/ABC123_S1_L001_R1_001.fastq.gz,/User/Documents/RespLab/Data/ABC123_S1_L001_R2_001.fastq.gz,rona123assembly.fast5,User/Documents/RespLab/Data/rona123assembly.fast5,387566,38677,330,NC_045512.2,https://github.com/phac-nml/ncov2019-artic-nf,B.1.1.7,Pangolin,2.1.10,VOC,Sequencing,"Lineage-defining mutations: ORF1ab (K1655N), Spike (K417N, E484K, N501Y, D614G, A701V), N (T205I), E (P71L).",E gene (orf4),,21.2,Spike (orf2),,19.2,,,,"Tejinder Singh, Fei Hu, Joe Blogs",DataHarmonizer provenance: v0.15.3 | ||
sample1234,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,DataHarmonizer provenance: v0.15.3 | ||
sample1234,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,DataHarmonizer provenance: v0.15.3 | ||
sample123,Switch Health,abc12345,case4444,NMLsample2222,prov_rona_99,PRJNA623807,PRJNA608651,SAMN14180202,SRR11177792,MN908947.3,EPI_ISL_436489,SharEd hospital Laboratory,switch@email.ca,"123 Main Street, City, Province",National Microbiology Laboratory (NML),RespLab@lab.ca,"123 Sunnybrooke St, Toronto, Ontario, M4P 1L6, Canada",2018-03-01,,30-Apr,Canda,BC,Thunder Bay,Severe acute respiratory syndrome coronavirus 2,hCov-19/CANADA/BC-prov_rona_99/2020,Surveillance testing,Not Provided,Not Applicable, Reinfection testing,Not Applicable,Lungs,Not Applicable,Not Applicable,Not Applicable,Swab,Not Applicable,SOP123,Not Provided,Not Provided,Not Applicable,Not Applicable,Not Applicable,Not Provided,Batman,Homo chiroptera,Sick, Hospitalized (ICU),Recovered,,89,,80 - 89,Female,Cnada,British Columbia,PHN1234,2022-02-23,Cough;Fever,Not Provided,Not Provided,Fully Vaccinated,3,Pfizer-BioNTech (Comirnaty),2021-07-01,Pfizer-BioNTech (Comirnaty),2021-11-02,Moderna (Spikevax),2022-02-01,,,,United States of America,Portland,Oregon,United States of America,2022-03-02,05-2020,Air,day 10,,Occupational exposure (retail),direct human to human,Attendee,"Occupational, Residency or Patronage Exposure",,Prior infection,SARS-CoV-2/human/USA/CA-CDPH-001/2020,2021-06-01,Prior antiviral treatment,remdesivir,2021-06-05, Surveillance of international border crossing by air travel,Not Provided,,XYZ_123345,1200bp,Nextera XT,FAB06069, Illumina NextSeq 2000,SeqProt1234,"Genomes were generated through amplicon sequencing of 1200 bp amplicons with Freed schema primers. 
Libraries were created using Illumina DNA Prep kits, and sequence data was produced using Miseq Micro v2 (500 cycles) sequencing kits.",1234546,Freed,Trimmomatic 0.38,,ncov123assembly3,ncov123assembly.fasta,User/Documents/RespLab/Data/ncov123assembly.fasta,iVar,1.3,95%,400x,100x,ABC123_S1_L001_R1_001.fastq.gz,ABC123_S1_L001_R2_001.fastq.gz,/User/Documents/RespLab/Data/ABC123_S1_L001_R1_001.fastq.gz,/User/Documents/RespLab/Data/ABC123_S1_L001_R2_001.fastq.gz,rona123assembly.fast5,User/Documents/RespLab/Data/rona123assembly.fast5,387566,38677,330,NC_045512.2,https://github.com/phac-nml/ncov2019-artic-nf,B.1.1.7,Pangolin,2.1.10,VOC,Sequencing,"Lineage-defining mutations: ORF1ab (K1655N), Spike (K417N, E484K, N501Y, D614G, A701V), N (T205I), E (P71L).",E gene (orf4),,21.2,Spike (orf2),,19.2,,,,"Tejinder Singh, Fei Hu, Joe Blogs",DataHarmonizer provenance: v0.15.4 | ||
sample1234,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,DataHarmonizer provenance: v0.15.4 | ||
sample1234,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,DataHarmonizer provenance: v0.15.4 |
Binary file not shown.
2 changes: 1 addition & 1 deletion
...d19/exampleInput/validTestData_0-15-4.csv → ...d19/exampleInput/validTestData_0-15-5.csv
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
Database Identifiers,,,,,,,,,,,,Sample collection and processing,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Host Information,,,,,,,,,,,,,,,,,Host vaccination information,,,,,,,,,,,Host exposure information,,,,,,,,,,,,,,Host reinfection information,,,,,,Sequencing,,,,,,,,,,,,Bioinformatics and QC metrics,,,,,,,,,,,,,,,,,,,,,Lineage and Variant information,,,,,,Pathogen diagnostic testing,,,,,,,,,Contributor acknowledgement, | ||
specimen collector sample ID,third party lab service provider name,third party lab sample ID,case ID,Related specimen primary ID,IRIDA sample name,umbrella bioproject accession,bioproject accession,biosample accession,SRA accession,GenBank accession,GISAID accession,sample collected by,sample collector contact email,sample collector contact address,sequence submitted by,sequence submitter contact email,sequence submitter contact address,sample collection date,sample collection date precision,sample received date,geo_loc_name (country),geo_loc_name (state/province/territory),geo_loc_name (city),organism,isolate,purpose of sampling,purpose of sampling details,NML submitted specimen type,Related specimen relationship type,anatomical material,anatomical part,body product,environmental material,environmental site,collection device,collection method,collection protocol,specimen processing,specimen processing details,lab host,passage number,passage method,biomaterial extracted,host (common name),host (scientific name),host health state,host health status details,host health outcome,host disease,host age,host age unit,host age bin,host gender,host residence geo_loc name (country),host residence geo_loc name (state/province/territory),host subject ID,symptom onset date,signs and symptoms,pre-existing conditions and risk factors,complications,host vaccination status,number of vaccine doses received,vaccination dose 1 vaccine name,vaccination dose 1 vaccination date,vaccination dose 2 vaccine name,vaccination dose 2 vaccination date,vaccination dose 3 vaccine name,vaccination dose 3 vaccination date,vaccination dose 4 vaccine name,vaccination dose 4 vaccination date,vaccination history,location of exposure geo_loc name (country),destination of most recent travel (city),destination of most recent travel (state/province/territory),destination of most recent travel (country),most recent travel departure date,most recent travel return date,travel point of entry type,border 
testing test day type,travel history,exposure event,exposure contact level,host role,exposure setting,exposure details,prior SARS-CoV-2 infection,prior SARS-CoV-2 infection isolate,prior SARS-CoV-2 infection date,prior SARS-CoV-2 antiviral treatment,prior SARS-CoV-2 antiviral treatment agent,prior SARS-CoV-2 antiviral treatment date,purpose of sequencing,purpose of sequencing details,sequencing date,library ID,amplicon size,library preparation kit,flow cell barcode,sequencing instrument,sequencing protocol name,sequencing protocol,sequencing kit number,amplicon pcr primer scheme,raw sequence data processing method,dehosting method,consensus sequence name,consensus sequence filename,consensus sequence filepath,consensus sequence software name,consensus sequence software version,breadth of coverage value,depth of coverage value,depth of coverage threshold,r1 fastq filename,r2 fastq filename,r1 fastq filepath,r2 fastq filepath,fast5 filename,fast5 filepath,number of base pairs sequenced,consensus genome length,Ns per 100 kbp,reference genome accession,bioinformatics protocol,lineage/clade name,lineage/clade analysis software name,lineage/clade analysis software version,variant designation,variant evidence,variant evidence details,gene name 1,diagnostic pcr protocol 1,diagnostic pcr Ct value 1,gene name 2,diagnostic pcr protocol 2,diagnostic pcr Ct value 2,gene name 3,diagnostic pcr protocol 3,diagnostic pcr Ct value 3,authors,DataHarmonizer provenance | ||
sample1234,Switch Health,abc12345,case4444,NMLsample2222,prov_rona_99,PRJNA623807,PRJNA608651,SAMN14180202,SRR11177792,MN908947.3,EPI_ISL_436489,Switch Health,switch@email.ca,"123 Main Street, City, Province",National Microbiology Laboratory (NML),RespLab@lab.ca,"123 Sunnybrooke St, Toronto, Ontario, M4P 1L6, Canada",2022-03-01,day,2022-03-15,Canada,British Columbia,Thunder Bay,Severe acute respiratory syndrome coronavirus 2,hCov-19/CANADA/BC-prov_rona_99/2020,Diagnostic testing,Not Provided,Not Applicable, Reinfection testing,Not Applicable,Nasopharynx (NP); Oropharynx (OP),Not Applicable,Not Applicable,Not Applicable,Swab,Not Applicable,SOP123,Not Provided,Not Provided,Not Applicable,Not Applicable,Not Applicable,Not Provided,Human,Homo sapiens,Symptomatic, Hospitalized (ICU),Recovered,COVID-19,34,year,30 - 39,Female,Canada,British Columbia,PHN1234,2022-02-23,Cough;Fever,Not Provided,Not Provided,Fully Vaccinated,3,Pfizer-BioNTech (Comirnaty),2021-07-01,Pfizer-BioNTech (Comirnaty),2021-11-02,Moderna (Spikevax),2022-02-01,,,,United States of America,Portland,Oregon,United States of America,2022-03-02,2022-03-11,Air,day 10,, Convention," Close contact (face-to-face, no direct contact)",Attendee,"Occupational, Residency or Patronage Exposure",,Prior infection,SARS-CoV-2/human/USA/CA-CDPH-001/2020,2021-06-01,Prior antiviral treatment,remdesivir,2021-06-05, Surveillance of international border crossing by air travel,Not Provided,2022-04-04,XYZ_123345,1200bp,Nextera XT,FAB06069, Illumina NextSeq 2000,SeqProt1234,"Genomes were generated through amplicon sequencing of 1200 bp amplicons with Freed schema primers. 
Libraries were created using Illumina DNA Prep kits, and sequence data was produced using Miseq Micro v2 (500 cycles) sequencing kits.",1234546,Freed,Trimmomatic 0.38,Nanostripper,ncov123assembly3,ncov123assembly.fasta,User/Documents/RespLab/Data/ncov123assembly.fasta,iVar,1.3,95%,400x,100x,ABC123_S1_L001_R1_001.fastq.gz,ABC123_S1_L001_R2_001.fastq.gz,/User/Documents/RespLab/Data/ABC123_S1_L001_R1_001.fastq.gz,/User/Documents/RespLab/Data/ABC123_S1_L001_R2_001.fastq.gz,rona123assembly.fast5,User/Documents/RespLab/Data/rona123assembly.fast5,387566,38677,330,NC_045512.2,https://github.com/phac-nml/ncov2019-artic-nf,B.1.1.7,Pangolin,2.1.10,Variant of Concern (VOC),Sequencing,"Lineage-defining mutations: ORF1ab (K1655N), Spike (K417N, E484K, N501Y, D614G, A701V), N (T205I), E (P71L).",E gene (orf4),,21.2, RdRp gene (nsp12),,19.2,,,,"Tejinder Singh, Fei Hu, Joe Blogs",DataHarmonizer provenance: v0.15.4 | ||
sample1234,Switch Health,abc12345,case4444,NMLsample2222,prov_rona_99,PRJNA623807,PRJNA608651,SAMN14180202,SRR11177792,MN908947.3,EPI_ISL_436489,Shared Hospital Laboratory,shl@email.ca,"123 Main Street, City, Province",National Microbiology Laboratory (NML),RespLab@lab.ca,"123 Sunnybrooke St, Toronto, Ontario, M4P 1L6, Canada",2022-03-01,day,2022-03-15,Canada,British Columbia,Thunder Bay,Severe acute respiratory syndrome coronavirus 2,hCov-19/CANADA/BC-prov_rona_99/2020,Diagnostic testing,Not Provided,Not Applicable, Reinfection testing,Not Applicable, Nasopharynx (NP); Oropharynx (OP),Not Applicable,Not Applicable,Not Applicable,Swab,Not Applicable,SOP123,Not Provided,Not Provided,Not Applicable,Not Applicable,Not Applicable,Not Provided,Human,Homo sapiens,Symptomatic, Hospitalized (ICU),Recovered,COVID-19,34,year,30 - 39,Female,Canada,British Columbia,PHN1234,2022-02-23,Cough;Fever,Not Provided,Not Provided,Fully Vaccinated,3,Pfizer-BioNTech (Comirnaty),2021-07-01,Pfizer-BioNTech (Comirnaty),2021-11-02,Moderna (Spikevax),2022-02-01,,,,United States of America,Portland,Oregon,United States of America,2022-03-02,2022-03-11,Air,day 10,, Convention," Close contact (face-to-face, no direct contact)",Attendee,"Occupational, Residency or Patronage Exposure",,Prior infection,SARS-CoV-2/human/USA/CA-CDPH-001/2020,2021-06-01,Prior antiviral treatment,remdesivir,2021-06-05, Surveillance of international border crossing by air travel,Not Provided,2022-04-04,XYZ_123345,1200bp,Nextera XT,FAB06069, Illumina NextSeq 2000,SeqProt1234,"Genomes were generated through amplicon sequencing of 1200 bp amplicons with Freed schema primers. 
Libraries were created using Illumina DNA Prep kits, and sequence data was produced using Miseq Micro v2 (500 cycles) sequencing kits.",1234546,Freed,Trimmomatic 0.38,Nanostripper,ncov123assembly3,ncov123assembly.fasta,User/Documents/RespLab/Data/ncov123assembly.fasta,iVar,1.3,95%,400x,100x,ABC123_S1_L001_R1_001.fastq.gz,ABC123_S1_L001_R2_001.fastq.gz,/User/Documents/RespLab/Data/ABC123_S1_L001_R1_001.fastq.gz,/User/Documents/RespLab/Data/ABC123_S1_L001_R2_001.fastq.gz,rona123assembly.fast5,User/Documents/RespLab/Data/rona123assembly.fast5,387566,38677,330,NC_045512.2,https://github.com/phac-nml/ncov2019-artic-nf,B.1.1.7,Pangolin,2.1.10,Variant of Concern (VOC),Sequencing,"Lineage-defining mutations: ORF1ab (K1655N), Spike (K417N, E484K, N501Y, D614G, A701V), N (T205I), E (P71L).",E gene (orf4),,21.2, RdRp gene (nsp12),,19.2,,,,"Tejinder Singh, Fei Hu, Joe Blogs",DataHarmonizer provenance: v0.15.4 |