This is the first of several class content fields originating from the DSP ClinVar ingest stream that will need to be parsed and stored formally in the final transformed messages.
This first element is the array of HGVS expressions embedded in the Variation.content serialized JSON field.
Each Variation object will have zero, one, or more HGVS elements in the stringified JSON content attribute.
The JSON path $.HGVSlist.HGVS may be either an array (when more than one expression exists) or a single element (when only one exists). I believe there is no $.HGVSlist.HGVS node at all when no HGVS expressions exist for a variation, but it may instead be an empty array or an empty single node (can't recall right now).
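Since the node can show up in several forms, the parser probably wants to normalize it to a plain list before doing anything else. A minimal sketch in Python, assuming the content arrives as a JSON string; the function name extract_hgvs_nodes and the "type" key in the usage example are hypothetical placeholders, not names taken from the ingest stream:

```python
import json
from typing import Any, Dict, List


def extract_hgvs_nodes(content: str) -> List[Dict[str, Any]]:
    """Return the HGVS nodes found at $.HGVSlist.HGVS as a list (possibly empty)."""
    parsed = json.loads(content) if content else {}
    if not isinstance(parsed, dict):
        return []

    # HGVSlist may be absent, null, or (defensively) not an object.
    hgvs_list = parsed.get("HGVSlist")
    if not isinstance(hgvs_list, dict):
        return []

    node = hgvs_list.get("HGVS")
    if node is None:
        return []

    # A single expression is serialized as a bare object rather than an array.
    nodes = node if isinstance(node, list) else [node]

    # Drop empty placeholders such as {} or null entries.
    return [n for n in nodes if isinstance(n, dict) and n]
```

For example, extract_hgvs_nodes('{"HGVSlist": {"HGVS": {"type": "genomic"}}}') returns a one-element list, and a content string with no HGVSlist at all returns an empty list, so downstream code only ever has to deal with a list.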
Each HGVS node will need to be parsed into a structure with the following shape:
Some general patterns that may help explain how these fields are typically populated...
if the type is genomic then only the nucleotideXXX values will be included
if the type is transcript then isManeSelect may be TRUE; otherwise it defaults to FALSE
if the type is transcript and it's a protein-coding expression then the corresponding derived protein expression will likely be provided
if the type is protein only (or something like that) then only the protein expression will be provided (I think)
The molecularConsequence fields will optionally be available for many of the HGVS expressions that have a transcript nucleotide expression.
We will need to finalize the destination structure for this data in our GeneGraph model. For general reference, these fields will ultimately end up in the VariationDescriptor class that is associated with the core VCV and SCV statements being transformed.
NOTE: eventually we will be extracting ALL the data from the various Class.content fields. In the initial MVP for the standardization of ClinVar into GeneGraph, we will be identifying fields from the ClinicalAssertionObservation.content JSON and possibly from GeneAssociation.content.