The iri column is coming in from kg-phenio, through monarch-ingest. It's not yet defined in the schema, so Solr represents it as a multivalued column, which isn't what we want.
For the moment, #474 is going out of its way to trim the iri field out of Solr documents to avoid problems when creating pydantic instances, and this issue is so that we don't lose track of that hack.
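For readers unfamiliar with the workaround, the trimming described above amounts to something like the following. This is only a minimal sketch, not the code from #474: the helper name `strip_fields` and the example document are hypothetical.

```python
from typing import Any, Iterable


def strip_fields(doc: dict[str, Any], drop: Iterable[str] = ("iri",)) -> dict[str, Any]:
    """Return a copy of a Solr document without the named fields.

    Hypothetical helper: removes fields that are not in the schema so the
    remaining keys can be passed to a model constructor.
    """
    unwanted = set(drop)
    return {key: value for key, value in doc.items() if key not in unwanted}


# Made-up document: Solr returns the undeclared iri field as a list.
solr_doc = {"id": "EX:0000001", "name": "example entity", "iri": ["http://example.org/EX_0000001"]}
clean_doc = strip_fields(solr_doc)
```

The cleaned dict can then be unpacked into the pydantic model, which no longer sees the unexpected multivalued field.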
On the monarch-ingest / linkml-solr side, we probably want to avoid passing extra fields from the tsv file to Solr. It would probably have been better to get an index-time error.
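One way to get that kind of early failure is to compare the tsv header against the field names the schema defines before anything is sent to Solr. This is a rough sketch under assumptions: `check_tsv_columns` is a hypothetical function, and the set of allowed names is assumed to have been collected from the schema elsewhere.

```python
import csv
import io


def check_tsv_columns(tsv_file, allowed: set[str]) -> list[str]:
    """Read the header row and raise if any column is not in `allowed`.

    Hypothetical pre-indexing check; `allowed` stands in for the slot names
    defined in the schema.
    """
    reader = csv.reader(tsv_file, delimiter="\t")
    header = next(reader, [])
    extra = [column for column in header if column not in allowed]
    if extra:
        raise ValueError(f"columns not defined in schema: {', '.join(extra)}")
    return header


# Made-up file contents: the iri column is not in the allowed set.
sample = io.StringIO("id\tname\tiri\nEX:0000001\texample entity\thttp://example.org/EX_0000001\n")
try:
    check_tsv_columns(sample, allowed={"id", "name"})
    failure = None
except ValueError as err:
    failure = str(err)
```

Raising here, rather than silently dropping the column, keeps the mismatch visible at load time.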
As for iri itself, right now we handle that expansion via CURIEs in the app, so if we include it, it would only be populated for phenio. We could choose to populate it for other entities as well, or we could leave it out of our kg-phenio ingest and keep handling CURIE expansion only at the code level.
kevinschaper changed the title from "iri field wasn't in schema, but still made it into Solr from kg-phenio" to "Validate kgx files against monarch-app schema" on Mar 26, 2024
I want to add a note that I tried this out and found a lot of spurious failures where linkml-validate complained about types: nodes whose name is a number failed for not being a string, and single values in multivalued fields were flagged for not being lists. We probably want to run the validator as a module rather than from the CLI, so that we can swallow some categories of errors, or we want to validate against a more strictly typed file.
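To illustrate the second option, here is a minimal sketch of turning tsv rows into more explicitly typed records before validation. It assumes that multivalued columns use `|` as the list separator and that the set of multivalued field names is known from the schema. The function name `typed_records` and the sample columns are hypothetical.

```python
import csv
import io


def typed_records(tsv_file, multivalued: set[str], list_sep: str = "|"):
    """Yield one dict per row with explicit types.

    Every value stays a string (so a numeric-looking name is not re-typed),
    and every multivalued field becomes a list, even with a single value.
    """
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        record = {}
        for key, value in row.items():
            if key is None or value is None or value == "":
                continue  # skip empty cells and cells beyond the header
            if key in multivalued:
                record[key] = value.split(list_sep)
            else:
                record[key] = value
        yield record


# Made-up rows: a numeric-looking name and a single-valued multivalued field.
sample = io.StringIO("id\tname\tsynonym\nEX:0000001\t1234\tonly one\nEX:0000002\tfoo\ta|b\n")
records = list(typed_records(sample, multivalued={"synonym"}))
```

Records shaped like this, serialized as JSON for example, give the validator unambiguous types to check, which should avoid both kinds of complaints described above.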