Following the instructions on Accessing GTEX v8 phenotypic data here.
When PyPFB is run on the export of the GTEx data, the resulting sequencing.tsv file contains duplicates of almost all rows. The other TSVs do not have duplicates. It is not clear whether the duplicates exist in the PFB file or are generated when PyPFB converts it to TSV. Given that sequencing.tsv is the only file that shows this problem, the more likely explanation is that the duplication is present in the PFB.
46 rows in sequencing.tsv do not appear to be duplicates. These are the sequencing files related to the project as a whole rather than to samples (see parent_type). This also suggests the duplicates are present in the PFB/Avro and are not generated by PyPFB.
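For anyone wanting to reproduce the observation, here is a minimal sketch (not part of PyPFB) that counts exact duplicate rows in a TSV and groups them by `parent_type`. It uses only the Python standard library. The function name `duplicate_summary` and the path passed to it are illustrative assumptions. It only inspects the TSV output, so it cannot by itself show whether the duplicates originate in the PFB/Avro file.

```python
import csv
from collections import Counter


def duplicate_summary(path, group_col="parent_type"):
    """Count distinct and repeated rows in a TSV, grouped by one column.

    `group_col` defaults to the parent_type column mentioned in this issue;
    the function name and return shape are illustrative, not a PyPFB API.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        col = header.index(group_col)
        # Rows are compared as whole tuples, so only exact duplicates match.
        counts = Counter(tuple(row) for row in reader)

    summary = {}
    for row, n in counts.items():
        group = row[col] if col < len(row) else ""
        entry = summary.setdefault(group, {"distinct_rows": 0, "repeated_rows": 0})
        entry["distinct_rows"] += 1
        if n > 1:
            # This distinct row appears more than once in the file.
            entry["repeated_rows"] += 1
    return summary
```

Calling `duplicate_summary("sequencing.tsv")` (path assumed) returns one entry per `parent_type` value. If the description above holds, the project-level rows should show no repeated rows while the sample-level rows should.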
See also the related pull request, which deals with linking between TSVs.