From 9f0ad63f198efbaf509c09eedd1737d9d104f781 Mon Sep 17 00:00:00 2001
From: Nico Matentzoglu
Date: Fri, 22 Jul 2022 17:50:58 +0300
Subject: [PATCH 01/14] Create Phenio.md

---
 docs/Sources/Phenio.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
 create mode 100644 docs/Sources/Phenio.md

diff --git a/docs/Sources/Phenio.md b/docs/Sources/Phenio.md
new file mode 100644
index 00000000..05c70b95
--- /dev/null
+++ b/docs/Sources/Phenio.md
@@ -0,0 +1,14 @@
+# Phenio - Phenomics Integrated Ontology
+
+Phenio provides the "semantic backbone" of the Monarch Knowledge Graph.
+Designed as an application ontology, it comprises a variety of ontological concepts, in particular all the ones
+that are "core entities" in the Monarch Knowledge Graph (KG), such as diseases, phenotypes and anatomical entities.
+
+Here, we document how Phenio "flows" through the system, from how it is created to how it is utilised. Note that while forming
+an integral part of the Monarch KG, it does not have a "Koza Ingest" configuration like all the other sources, but is
+instead ingested into Monarch KG straight via a OWL->obographs->KGX transform.
+
+Under construction, materials:
+
+- https://github.com/monarch-initiative/monarch-ingest/blob/91eaacaab23941bb58130c52040c5e118f2c1a21/monarch_ingest/cli_utils.py#L93
+- https://docs.google.com/presentation/d/1YEYMeP0vBpV52ezDDJfn2xmOOEGROlOT-CXjEk6LIkA/edit#slide=id.p

From 162bd2fb133765a0fbf53a8a316d984a48e3d97e Mon Sep 17 00:00:00 2001
From: Glass
Date: Wed, 25 Oct 2023 11:12:57 -0600
Subject: [PATCH 02/14] Update Phenio.md

---
 docs/Sources/Phenio.md | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/docs/Sources/Phenio.md b/docs/Sources/Phenio.md
index 05c70b95..e6e9ef65 100644
--- a/docs/Sources/Phenio.md
+++ b/docs/Sources/Phenio.md
@@ -1,14 +1,21 @@
-# Phenio - Phenomics Integrated Ontology
+# Phenio
 
-Phenio provides the "semantic backbone" of the Monarch Knowledge Graph.
+An ontology for accessing and comparing knowledge concerning phenotypes across species and genetic backgrounds.
+
+Phenio provides the "semantic backbone" of the Monarch Knowledge Graph.
 Designed as an application ontology, it comprises a variety of ontological concepts, in particular all the ones
 that are "core entities" in the Monarch Knowledge Graph (KG), such as diseases, phenotypes and anatomical entities.
 
-Here, we document how Phenio "flows" through the system, from how it is created to how it is utilised. Note that while forming
-an integral part of the Monarch KG, it does not have a "Koza Ingest" configuration like all the other sources, but is
-instead ingested into Monarch KG straight via a OWL->obographs->KGX transform.
+Note that while forming an integral part of the Monarch KG, it does not have a "Koza Ingest" configuration like all the other sources,
+but is instead ingested into Monarch KG straight via a `OWL -> obographs -> KGX` transform.
+
+For more information, see:
+
+- [NCATS Translater Phenio Overview](https://github.com/NCATSTranslator/Translator-All/wiki/phenio)
+- [KGHub Phenio](https://github.com/Knowledge-Graph-Hub/kg-phenio)
+- [Monarch Phenio](https://github.com/monarch-initiative/phenio)
+- [Documentation](https://monarch-initiative.github.io/phenio/)
 
-Under construction, materials:
+## Source Code
 
-- https://github.com/monarch-initiative/monarch-ingest/blob/91eaacaab23941bb58130c52040c5e118f2c1a21/monarch_ingest/cli_utils.py#L93
-- https://docs.google.com/presentation/d/1YEYMeP0vBpV52ezDDJfn2xmOOEGROlOT-CXjEk6LIkA/edit#slide=id.p
+https://github.com/monarch-initiative/phenio

From d6d5c5937965d17d32dbeda6434277bd9cb872c6 Mon Sep 17 00:00:00 2001
From: Harry Caufield
Date: Wed, 25 Oct 2023 13:55:18 -0400
Subject: [PATCH 03/14] Expanded PHENIO doc

---
 docs/Sources/Phenio.md | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/docs/Sources/Phenio.md b/docs/Sources/Phenio.md
index e6e9ef65..059b7f5a 100644
--- a/docs/Sources/Phenio.md
+++ b/docs/Sources/Phenio.md
@@ -1,15 +1,27 @@
-# Phenio
+# PHENIO
 
-An ontology for accessing and comparing knowledge concerning phenotypes across species and genetic backgrounds.
+PHENIO is an ontology for accessing and comparing knowledge concerning phenotypes across species and genetic backgrounds.
 
 Phenio provides the "semantic backbone" of the Monarch Knowledge Graph.
-Designed as an application ontology, it comprises a variety of ontological concepts, in particular all the ones
-that are "core entities" in the Monarch Knowledge Graph (KG), such as diseases, phenotypes and anatomical entities.
+Designed as an application ontology, PHENIO integrates a variety of ontological concepts, in particular
+the "core entities" in the Monarch Knowledge Graph (KG), including diseases, phenotypes and anatomical entities.
 
-Note that while forming an integral part of the Monarch KG, it does not have a "Koza Ingest" configuration like all the other sources,
+Note that while forming an integral part of the Monarch KG, PHENIO does not have a "Koza Ingest" configuration like all the other sources,
 but is instead ingested into Monarch KG straight via a `OWL -> obographs -> KGX` transform.
 
-For more information, see:
+## Sources
+
+PHENIO integrates several different types of hierarchical relationships from a variety of sources.
+
+These include:
+* Chemical entities and relationships from [CHEBI](https://www.ebi.ac.uk/chebi/)
+* Disease entities and relationships from [MONDO](https://mondo.monarchinitiative.org/)
+* Abnormal phenotypes of humans ([HPO](https://hpo.jax.org/app/)), mouse and other mammalian species ([MPO](https://www.informatics.jax.org/vocab/mp_ontology)), the nematode worm Caenorhabditis elegans ([WBBT](http://www.obofoundry.org/ontology/wbphenotype.html)), and zebrafish ([ZFA](http://www.obofoundry.org/ontology/zfa.html)).
+
+[A full list of files used in the construction of PHENIO is available here.](https://monarch-initiative.github.io/phenio/odk-workflows/RepositoryFileStructure/)
+
+## More Information
+For more information, see:
 
 - [NCATS Translater Phenio Overview](https://github.com/NCATSTranslator/Translator-All/wiki/phenio)
 - [KGHub Phenio](https://github.com/Knowledge-Graph-Hub/kg-phenio)
 - [Monarch Phenio](https://github.com/monarch-initiative/phenio)

From de903b3abfec15a0bee3b7ae95389a1bc081ca08 Mon Sep 17 00:00:00 2001
From: Kevin Schaper
Date: Wed, 25 Oct 2023 17:26:31 -0700
Subject: [PATCH 04/14] Using a closurizer branch and passing in the grouping_key fields

---
 pyproject.toml                  |  3 ++-
 scripts/load_solr.sh            | 20 ++++++++++++++++++--
 src/monarch_ingest/cli_utils.py |  4 +++-
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/pyproject.toml b/pyproject.toml
index 4ab94c4d..66a4cb69 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -19,7 +19,8 @@ python = ">=3.10,<3.12"
 kghub-downloader = "^0.3.2"
 koza = "^0.3.0"
 cat-merge = ">=0.2.0"
-closurizer = "^0.3.2"
+#closurizer = "^0.3.2"
+closurizer = {git="https://github.com/monarch-initiative/closurizer.git", branch="grouping_key"}
 kgx = ">=2.1"
 multi-indexer = "0.0.5"
 botocore = "^1.31"
diff --git a/scripts/load_solr.sh b/scripts/load_solr.sh
index b31ce7a7..dce18c57 100755
--- a/scripts/load_solr.sh
+++ b/scripts/load_solr.sh
@@ -13,7 +13,11 @@ fi
 
 echo "Download the schema from monarch-py"
 # This replaces poetry run monarch schema > model.yaml
-curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/v0.15.8/backend/src/monarch_py/datamodels/model.yaml
+
+# temporarily retrieve from a branch that has the sssom changes
+
+curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/schema-sssom-and-grouping/backend/src/monarch_py/datamodels/model.yaml
+#curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/v0.15.8/backend/src/monarch_py/datamodels/model.yaml
 curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/v0.15.8/backend/src/monarch_py/datamodels/similarity.yaml
 
 echo "Starting the server"
@@ -21,7 +25,7 @@ poetry run lsolr start-server
 sleep 30
 
 echo "Adding cores"
-poetry run lsolr add-cores entity association
+poetry run lsolr add-cores entity association sssom
 sleep 10
 
 # todo: ideally, this will live in linkml-solr
@@ -37,12 +41,24 @@ echo "Adding association schema"
 poetry run lsolr create-schema -C association -s model.yaml -t Association
 sleep 5
 
+echo "Adding sssom schema"
+poetry run lsolr create-schema -C sssom -s model.yaml -t Mapping
+sleep 5
+
 # todo: this also should live in linkml-solr, and copy-fields should be based on the schema
 echo "Add dynamic fields and copy fields declarations"
 scripts/add_entity_copyfields.sh
 scripts/add_association_copyfields.sh
 sleep 5
 
+# todo: this should probably happen after associations, but putting it first for testing
+echo "Loading SSSOM mappings"
+grep -v "^#" data/monarch/mondo.sssom.tsv > headless.mondo.sssom.tsv
+# todo: copy the mappings to output/mappings as part of an earlier step
+poetry run lsolr bulkload -C sssom -s model.yaml headless.mondo.sssom.tsv
+poetry run lsolr bulkload -C sssom -s model.yaml data/monarch/gene_mappings.tsv
+poetry run lsolr bulkload -C sssom -s model.yaml data/monarch/chebi-mesh.biomappings.sssom.tsv
+
 echo "Loading entities"
 poetry run lsolr bulkload -C entity -s model.yaml output/monarch-kg_nodes.tsv
 
diff --git a/src/monarch_ingest/cli_utils.py b/src/monarch_ingest/cli_utils.py
index b7e488fc..44725fa2 100644
--- a/src/monarch_ingest/cli_utils.py
+++ b/src/monarch_ingest/cli_utils.py
@@ -319,11 +319,13 @@ def apply_closure(
                 output_file=output_file,
                 fields=['subject',
                         'object',
+                        'qualifiers',
                         'frequency_qualifier',
                         'onset_qualifier',
                         'sex_qualifier',
                         'stage_qualifier'],
-                evidence_fields=['has_evidence', 'publications'])
+                evidence_fields=['has_evidence', 'publications'],
+                grouping_fields=['subject', 'negated', 'predicate', 'object'])
     sh.gzip(output_file, force=True)
 
 def load_sqlite():

From 09f549fc74bcc344fd2e6f30e5b4ef8b445e640d Mon Sep 17 00:00:00 2001
From: Kevin Schaper
Date: Fri, 27 Oct 2023 16:16:03 -0700
Subject: [PATCH 05/14] Use the latest schema from monarch-app by url (since the KG build will be ahead of the release), bring in closurizer 0.4.0 to switch to duckdb

---
 poetry.lock          | 128 ++++++++++++++++++++++++++++++++-----------
 pyproject.toml       |   3 +-
 scripts/load_solr.sh |   9 +--
 3 files changed, 100 insertions(+), 40 deletions(-)

diff --git a/poetry.lock b/poetry.lock
index fa99d9bd..1066405b 100644
--- a/poetry.lock
+++ b/poetry.lock
@@ -357,18 +357,18 @@ colorama = {version = "*", markers = "platform_system == \"Windows\""}
 
 [[package]]
 name = "closurizer"
-version = "0.3.2"
+version = "0.4.0"
 description = "Add closure expansion fields to kgx files following the Golr pattern"
 optional = false
 python-versions = ">=3.8,<4.0"
 files = [
-    {file = "closurizer-0.3.2-py3-none-any.whl", hash = "sha256:656b5a3275b3d88e7a9ce91fae5dd43cd044b73f4b5842d32e0da98c256c7a99"},
-    {file = "closurizer-0.3.2.tar.gz", hash = "sha256:9db271727db4b85ccc2a661afd361b0efd27e62bf4627b014e6fb9289b315edd"},
+    {file = "closurizer-0.4.0-py3-none-any.whl", hash = "sha256:3f49cb5edea4c673079752d2e9256282a50d3c5e857a068346792ee1070d8182"},
+    {file = "closurizer-0.4.0.tar.gz", hash = "sha256:df655756264300861c33dddf3231aa158a7a68fa00f5cf7374b3889eb196ccdf"},
 ]
 
 [package.dependencies]
 click = ">=8,<9"
-petl = ">=1.7.10,<2.0.0"
+duckdb = ">=0.9.1,<0.10.0"
 SQLAlchemy = ">=1.4.37,<2.0.0"
 
 [[package]]
@@ -521,6 +521,54 @@ files = [
     {file = "docutils-0.18.1.tar.gz", hash = "sha256:679987caf361a7539d76e584cbeddc311e3aee937877c87346f31debc63e9d06"},
 ]
 
+[[package]]
+name = "duckdb"
+version = "0.9.1"
+description = "DuckDB embedded database"
+optional = false
+python-versions = ">=3.7.0"
+files = [
+    {file = "duckdb-0.9.1-cp310-cp310-macosx_10_9_universal2.whl", hash =
"sha256:6c724e105ecd78c8d86b3c03639b24e1df982392fc836705eb007e4b1b488864"}, + {file = "duckdb-0.9.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:75f12c5a3086079fb6440122565f1762ef1a610a954f2d8081014c1dd0646e1a"}, + {file = "duckdb-0.9.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:151f5410c32f8f8fe03bf23462b9604349bc0b4bd3a51049bbf5e6a482a435e8"}, + {file = "duckdb-0.9.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9c1d066fdae22b9b711b1603541651a378017645f9fbc4adc9764b2f3c9e9e4a"}, + {file = "duckdb-0.9.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1de56d8b7bd7a7653428c1bd4b8948316df488626d27e9c388194f2e0d1428d4"}, + {file = "duckdb-0.9.1-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:1fb6cd590b1bb4e31fde8efd25fedfbfa19a86fa72789fa5b31a71da0d95bce4"}, + {file = "duckdb-0.9.1-cp310-cp310-win32.whl", hash = "sha256:1039e073714d668cef9069bb02c2a6756c7969cedda0bff1332520c4462951c8"}, + {file = "duckdb-0.9.1-cp310-cp310-win_amd64.whl", hash = "sha256:7e6ac4c28918e1d278a89ff26fd528882aa823868ed530df69d6c8a193ae4e41"}, + {file = "duckdb-0.9.1-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:5eb750f2ee44397a61343f32ee9d9e8c8b5d053fa27ba4185d0e31507157f130"}, + {file = "duckdb-0.9.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:aea2a46881d75dc069a242cb164642d7a4f792889010fb98210953ab7ff48849"}, + {file = "duckdb-0.9.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:ed3dcedfc7a9449b6d73f9a2715c730180056e0ba837123e7967be1cd3935081"}, + {file = "duckdb-0.9.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0c55397bed0087ec4445b96f8d55f924680f6d40fbaa7f2e35468c54367214a5"}, + {file = "duckdb-0.9.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3261696130f1cfb955735647c93297b4a6241753fb0de26c05d96d50986c6347"}, + {file = "duckdb-0.9.1-cp311-cp311-musllinux_1_1_x86_64.whl", hash = 
"sha256:64c04b1728e3e37cf93748829b5d1e028227deea75115bb5ead01c608ece44b1"}, + {file = "duckdb-0.9.1-cp311-cp311-win32.whl", hash = "sha256:12cf9fb441a32702e31534330a7b4d569083d46a91bf185e0c9415000a978789"}, + {file = "duckdb-0.9.1-cp311-cp311-win_amd64.whl", hash = "sha256:fdfd85575ce9540e593d5d25c9d32050bd636c27786afd7b776aae0f6432b55e"}, + {file = "duckdb-0.9.1-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:704700a4b469e3bb1a7e85ac12e58037daaf2b555ef64a3fe2913ffef7bd585b"}, + {file = "duckdb-0.9.1-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cf55b303b7b1a8c2165a96e609eb30484bc47481d94a5fb1e23123e728df0a74"}, + {file = "duckdb-0.9.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b70e23c14746904ca5de316436e43a685eb769c67fe3dbfaacbd3cce996c5045"}, + {file = "duckdb-0.9.1-cp37-cp37m-musllinux_1_1_x86_64.whl", hash = "sha256:77379f7f1f8b4dc98e01f8f6f8f15a0858cf456e2385e22507f3cb93348a88f9"}, + {file = "duckdb-0.9.1-cp37-cp37m-win32.whl", hash = "sha256:92c8f738489838666cae9ef41703f8b16f660bb146970d1eba8b2c06cb3afa39"}, + {file = "duckdb-0.9.1-cp37-cp37m-win_amd64.whl", hash = "sha256:08c5484ac06ab714f745526d791141f547e2f5ac92f97a0a1b37dfbb3ea1bd13"}, + {file = "duckdb-0.9.1-cp38-cp38-macosx_10_9_universal2.whl", hash = "sha256:f66d3c07c7f6938d3277294677eb7dad75165e7c57c8dd505503fc5ef10f67ad"}, + {file = "duckdb-0.9.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:c38044e5f78c0c7b58e9f937dcc6c34de17e9ca6be42f9f8f1a5a239f7a847a5"}, + {file = "duckdb-0.9.1-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:73bc0d715b79566b3ede00c367235cfcce67be0eddda06e17665c7a233d6854a"}, + {file = "duckdb-0.9.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d26622c3b4ea6a8328d95882059e3cc646cdc62d267d48d09e55988a3bba0165"}, + {file = "duckdb-0.9.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3367d10096ff2b7919cedddcf60d308d22d6e53e72ee2702f6e6ca03d361004a"}, + 
{file = "duckdb-0.9.1-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:d88a119f1cb41911a22f08a6f084d061a8c864e28b9433435beb50a56b0d06bb"}, + {file = "duckdb-0.9.1-cp38-cp38-win32.whl", hash = "sha256:99567496e45b55c67427133dc916013e8eb20a811fc7079213f5f03b2a4f5fc0"}, + {file = "duckdb-0.9.1-cp38-cp38-win_amd64.whl", hash = "sha256:5b3da4da73422a3235c3500b3fb541ac546adb3e35642ef1119dbcd9cc7f68b8"}, + {file = "duckdb-0.9.1-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:eca00c0c2062c0265c6c0e78ca2f6a30611b28f3afef062036610e9fc9d4a67d"}, + {file = "duckdb-0.9.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:eb5af8e89d40fc4baab1515787ea1520a6c6cf6aa40ab9f107df6c3a75686ce1"}, + {file = "duckdb-0.9.1-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:9fae3d4f83ebcb47995f6acad7c6d57d003a9b6f0e1b31f79a3edd6feb377443"}, + {file = "duckdb-0.9.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:16b9a7efc745bc3c5d1018c3a2f58d9e6ce49c0446819a9600fdba5f78e54c47"}, + {file = "duckdb-0.9.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:66b0b60167f5537772e9f5af940e69dcf50e66f5247732b8bb84a493a9af6055"}, + {file = "duckdb-0.9.1-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:4f27f5e94c47df6c4ccddf18e3277b7464eea3db07356d2c4bf033b5c88359b8"}, + {file = "duckdb-0.9.1-cp39-cp39-win32.whl", hash = "sha256:d43cd7e6f783006b59dcc5e40fcf157d21ee3d0c8dfced35278091209e9974d7"}, + {file = "duckdb-0.9.1-cp39-cp39-win_amd64.whl", hash = "sha256:e666795887d9cf1d6b6f6cbb9d487270680e5ff6205ebc54b2308151f13b8cff"}, + {file = "duckdb-0.9.1.tar.gz", hash = "sha256:603a878746015a3f2363a65eb48bcbec816261b6ee8d71eee53061117f6eef9d"}, +] + [[package]] name = "elastic-transport" version = "8.4.0" @@ -985,6 +1033,17 @@ files = [ {file = "ijson-3.2.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:4a3a6a2fbbe7550ffe52d151cf76065e6b89cfb3e9d0463e49a7e322a25d0426"}, {file = "ijson-3.2.3-cp311-cp311-win32.whl", hash = 
"sha256:6a4db2f7fb9acfb855c9ae1aae602e4648dd1f88804a0d5cfb78c3639bcf156c"}, {file = "ijson-3.2.3-cp311-cp311-win_amd64.whl", hash = "sha256:ccd6be56335cbb845f3d3021b1766299c056c70c4c9165fb2fbe2d62258bae3f"}, + {file = "ijson-3.2.3-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:055b71bbc37af5c3c5861afe789e15211d2d3d06ac51ee5a647adf4def19c0ea"}, + {file = "ijson-3.2.3-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:c075a547de32f265a5dd139ab2035900fef6653951628862e5cdce0d101af557"}, + {file = "ijson-3.2.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:457f8a5fc559478ac6b06b6d37ebacb4811f8c5156e997f0d87d708b0d8ab2ae"}, + {file = "ijson-3.2.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9788f0c915351f41f0e69ec2618b81ebfcf9f13d9d67c6d404c7f5afda3e4afb"}, + {file = "ijson-3.2.3-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:fa234ab7a6a33ed51494d9d2197fb96296f9217ecae57f5551a55589091e7853"}, + {file = "ijson-3.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bdd0dc5da4f9dc6d12ab6e8e0c57d8b41d3c8f9ceed31a99dae7b2baf9ea769a"}, + {file = "ijson-3.2.3-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:c6beb80df19713e39e68dc5c337b5c76d36ccf69c30b79034634e5e4c14d6904"}, + {file = "ijson-3.2.3-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:a2973ce57afb142d96f35a14e9cfec08308ef178a2c76b8b5e1e98f3960438bf"}, + {file = "ijson-3.2.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:105c314fd624e81ed20f925271ec506523b8dd236589ab6c0208b8707d652a0e"}, + {file = "ijson-3.2.3-cp312-cp312-win32.whl", hash = "sha256:ac44781de5e901ce8339352bb5594fcb3b94ced315a34dbe840b4cff3450e23b"}, + {file = "ijson-3.2.3-cp312-cp312-win_amd64.whl", hash = "sha256:0567e8c833825b119e74e10a7c29761dc65fcd155f5d4cb10f9d3b8916ef9912"}, {file = "ijson-3.2.3-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:eeb286639649fb6bed37997a5e30eefcacddac79476d24128348ec890b2a0ccb"}, {file = 
"ijson-3.2.3-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:396338a655fb9af4ac59dd09c189885b51fa0eefc84d35408662031023c110d1"}, {file = "ijson-3.2.3-cp36-cp36m-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0e0243d166d11a2a47c17c7e885debf3b19ed136be2af1f5d1c34212850236ac"}, @@ -1594,6 +1653,16 @@ files = [ {file = "MarkupSafe-2.1.3-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:5bbe06f8eeafd38e5d0a4894ffec89378b6c6a625ff57e3028921f8ff59318ac"}, {file = "MarkupSafe-2.1.3-cp311-cp311-win32.whl", hash = "sha256:dd15ff04ffd7e05ffcb7fe79f1b98041b8ea30ae9234aed2a9168b5797c3effb"}, {file = "MarkupSafe-2.1.3-cp311-cp311-win_amd64.whl", hash = "sha256:134da1eca9ec0ae528110ccc9e48041e0828d79f24121a1a146161103c76e686"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:f698de3fd0c4e6972b92290a45bd9b1536bffe8c6759c62471efaa8acb4c37bc"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:aa57bd9cf8ae831a362185ee444e15a93ecb2e344c8e52e4d721ea3ab6ef1823"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ffcc3f7c66b5f5b7931a5aa68fc9cecc51e685ef90282f4a82f0f5e9b704ad11"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:47d4f1c5f80fc62fdd7777d0d40a2e9dda0a05883ab11374334f6c4de38adffd"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:1f67c7038d560d92149c060157d623c542173016c4babc0c1913cca0564b9939"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:9aad3c1755095ce347e26488214ef77e0485a3c34a50c5a5e2471dff60b9dd9c"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-musllinux_1_1_i686.whl", hash = "sha256:14ff806850827afd6b07a5f32bd917fb7f45b046ba40c57abdb636674a8b559c"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-musllinux_1_1_x86_64.whl", hash = 
"sha256:8f9293864fe09b8149f0cc42ce56e3f0e54de883a9de90cd427f191c346eb2e1"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-win32.whl", hash = "sha256:715d3562f79d540f251b99ebd6d8baa547118974341db04f5ad06d5ea3eb8007"}, + {file = "MarkupSafe-2.1.3-cp312-cp312-win_amd64.whl", hash = "sha256:1b8dd8c3fd14349433c79fa8abeb573a55fc0fdd769133baac1f5e07abf54aeb"}, {file = "MarkupSafe-2.1.3-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:8e254ae696c88d98da6555f5ace2279cf7cd5b3f52be2b5cf97feafe883b58d2"}, {file = "MarkupSafe-2.1.3-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cb0932dc158471523c9637e807d9bfb93e06a95cbf010f1a38b98623b929ef2b"}, {file = "MarkupSafe-2.1.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9402b03f1a1b4dc4c19845e5c749e3ab82d5078d16a2a4c2cd2df62d57bb0707"}, @@ -1957,32 +2026,6 @@ files = [ {file = "pathspec-0.11.2.tar.gz", hash = "sha256:e0d8d0ac2f12da61956eb2306b69f9469b42f4deb0f3cb6ed47b9cce9996ced3"}, ] -[[package]] -name = "petl" -version = "1.7.14" -description = "A Python package for extracting, transforming and loading tables of data." 
-optional = false -python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*" -files = [ - {file = "petl-1.7.14.tar.gz", hash = "sha256:d4802e3c4804bf85f2267a0102fcad35c61e6a37c90d9e1a1674331f35a90a7f"}, -] - -[package.extras] -avro = ["fastavro (>=0.24.0)"] -bcolz = ["bcolz (>=1.2.1)"] -db = ["SQLAlchemy (>=1.3.6,<2.0)"] -hdf5 = ["cython (>=0.29.13)", "numexpr (>=2.6.9)", "numpy (>=1.16.4)", "tables (>=3.5.2)"] -http = ["aiohttp (>=3.6.2)", "requests"] -interval = ["intervaltree (>=3.0.2)"] -numpy = ["numpy (>=1.16.4)"] -pandas = ["pandas (>=0.24.2)"] -remote = ["fsspec (>=0.7.4)"] -smb = ["smbprotocol (>=1.0.1)"] -whoosh = ["whoosh"] -xls = ["xlrd (>=2.0.1)", "xlwt (>=1.3.0)"] -xlsx = ["openpyxl (>=2.6.2)"] -xpath = ["lxml (>=4.4.0)"] - [[package]] name = "platformdirs" version = "3.10.0" @@ -2448,6 +2491,7 @@ files = [ {file = "PyYAML-6.0.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:69b023b2b4daa7548bcfbd4aa3da05b3a74b772db9e23b982788168117739938"}, {file = "PyYAML-6.0.1-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:81e0b275a9ecc9c0c0c07b4b90ba548307583c125f54d5b6946cfee6360c733d"}, {file = "PyYAML-6.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ba336e390cd8e4d1739f42dfe9bb83a3cc2e80f567d8805e11b46f4a943f5515"}, + {file = "PyYAML-6.0.1-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:326c013efe8048858a6d312ddd31d56e468118ad4cdeda36c719bf5bb6192290"}, {file = "PyYAML-6.0.1-cp310-cp310-win32.whl", hash = "sha256:bd4af7373a854424dabd882decdc5579653d7868b8fb26dc7d0e99f823aa5924"}, {file = "PyYAML-6.0.1-cp310-cp310-win_amd64.whl", hash = "sha256:fd1592b3fdf65fff2ad0004b5e363300ef59ced41c2e6b3a99d4089fa8c5435d"}, {file = "PyYAML-6.0.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:6965a7bc3cf88e5a1c3bd2e0b5c22f8d677dc88a455344035f03399034eb3007"}, @@ -2455,8 +2499,15 @@ files = [ {file = 
"PyYAML-6.0.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:42f8152b8dbc4fe7d96729ec2b99c7097d656dc1213a3229ca5383f973a5ed6d"}, {file = "PyYAML-6.0.1-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:062582fca9fabdd2c8b54a3ef1c978d786e0f6b3a1510e0ac93ef59e0ddae2bc"}, {file = "PyYAML-6.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d2b04aac4d386b172d5b9692e2d2da8de7bfb6c387fa4f801fbf6fb2e6ba4673"}, + {file = "PyYAML-6.0.1-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:e7d73685e87afe9f3b36c799222440d6cf362062f78be1013661b00c5c6f678b"}, {file = "PyYAML-6.0.1-cp311-cp311-win32.whl", hash = "sha256:1635fd110e8d85d55237ab316b5b011de701ea0f29d07611174a1b42f1444741"}, {file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"}, + {file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"}, + {file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"}, + {file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6c22bec3fbe2524cde73d7ada88f6566758a8f7227bfbf93a408a9d86bcc12a0"}, + {file = "PyYAML-6.0.1-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:8d4e9c88387b0f5c7d5f281e55304de64cf7f9c0021a3525bd3b1c542da3b0e4"}, + {file = "PyYAML-6.0.1-cp312-cp312-win32.whl", hash = "sha256:d483d2cdf104e7c9fa60c544d92981f12ad66a457afae824d146093b8c294c54"}, + {file = "PyYAML-6.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:0d3304d8c0adc42be59c5f8a4d9e3d7379e6955ad754aa9d6ab7a398b59dd1df"}, {file = "PyYAML-6.0.1-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:50550eb667afee136e9a77d6dc71ae76a44df8b3e51e41b77f6de2932bfe0f47"}, {file = "PyYAML-6.0.1-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = 
"sha256:1fe35611261b29bd1de0070f0b2f47cb6ff71fa6595c077e42bd0c419fa27b98"}, {file = "PyYAML-6.0.1-cp36-cp36m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:704219a11b772aea0d8ecd7058d0082713c3562b4e271b849ad7dc4a5c90c13c"}, @@ -2473,6 +2524,7 @@ files = [ {file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a0cd17c15d3bb3fa06978b4e8958dcdc6e0174ccea823003a106c7d4d7899ac5"}, {file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:28c119d996beec18c05208a8bd78cbe4007878c6dd15091efb73a30e90539696"}, {file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7e07cbde391ba96ab58e532ff4803f79c4129397514e1413a7dc761ccd755735"}, + {file = "PyYAML-6.0.1-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:49a183be227561de579b4a36efbb21b3eab9651dd81b1858589f796549873dd6"}, {file = "PyYAML-6.0.1-cp38-cp38-win32.whl", hash = "sha256:184c5108a2aca3c5b3d3bf9395d50893a7ab82a38004c8f61c258d4428e80206"}, {file = "PyYAML-6.0.1-cp38-cp38-win_amd64.whl", hash = "sha256:1e2722cc9fbb45d9b87631ac70924c11d3a401b2d7f410cc0e3bbf249f2dca62"}, {file = "PyYAML-6.0.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:9eb6caa9a297fc2c2fb8862bc5370d0303ddba53ba97e71f08023b6cd73d16a8"}, @@ -2480,6 +2532,7 @@ files = [ {file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5773183b6446b2c99bb77e77595dd486303b4faab2b086e7b17bc6bef28865f6"}, {file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b786eecbdf8499b9ca1d697215862083bd6d2a99965554781d0d8d1ad31e13a0"}, {file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc1bf2925a1ecd43da378f4db9e4f799775d6367bdb94671027b73b393a7c42c"}, + {file = "PyYAML-6.0.1-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:04ac92ad1925b2cff1db0cfebffb6ffc43457495c9b3c39d3fcae417d7125dc5"}, {file = 
"PyYAML-6.0.1-cp39-cp39-win32.whl", hash = "sha256:faca3bdcf85b2fc05d06ff3fbc1f83e1391b3e724afa3feba7d13eeab355484c"}, {file = "PyYAML-6.0.1-cp39-cp39-win_amd64.whl", hash = "sha256:510c9deebc5c0225e8c96813043e62b680ba2f9c50a08d3724c7f28a747d1486"}, {file = "PyYAML-6.0.1.tar.gz", hash = "sha256:bfdf460b1736c775f2ba9f6a92bca30bc2095067b8a9d77876d1fad6cc3b4a43"}, @@ -2778,7 +2831,8 @@ files = [ {file = "ruamel.yaml.clib-0.2.7-cp310-cp310-win32.whl", hash = "sha256:763d65baa3b952479c4e972669f679fe490eee058d5aa85da483ebae2009d231"}, {file = "ruamel.yaml.clib-0.2.7-cp310-cp310-win_amd64.whl", hash = "sha256:d000f258cf42fec2b1bbf2863c61d7b8918d31ffee905da62dede869254d3b8a"}, {file = "ruamel.yaml.clib-0.2.7-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:045e0626baf1c52e5527bd5db361bc83180faaba2ff586e763d3d5982a876a9e"}, - {file = "ruamel.yaml.clib-0.2.7-cp311-cp311-macosx_12_6_arm64.whl", hash = "sha256:721bc4ba4525f53f6a611ec0967bdcee61b31df5a56801281027a3a6d1c2daf5"}, + {file = "ruamel.yaml.clib-0.2.7-cp311-cp311-macosx_13_0_arm64.whl", hash = "sha256:1a6391a7cabb7641c32517539ca42cf84b87b667bad38b78d4d42dd23e957c81"}, + {file = "ruamel.yaml.clib-0.2.7-cp311-cp311-manylinux2014_aarch64.whl", hash = "sha256:9c7617df90c1365638916b98cdd9be833d31d337dbcd722485597b43c4a215bf"}, {file = "ruamel.yaml.clib-0.2.7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl", hash = "sha256:41d0f1fa4c6830176eef5b276af04c89320ea616655d01327d5ce65e50575c94"}, {file = "ruamel.yaml.clib-0.2.7-cp311-cp311-win32.whl", hash = "sha256:f6d3d39611ac2e4f62c3128a9eed45f19a6608670c5a2f4f07f24e8de3441d38"}, {file = "ruamel.yaml.clib-0.2.7-cp311-cp311-win_amd64.whl", hash = "sha256:da538167284de58a52109a9b89b8f6a53ff8437dd6dc26d33b57bf6699153122"}, @@ -3119,6 +3173,7 @@ files = [ {file = "SQLAlchemy-1.4.49-cp27-cp27mu-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:03db81b89fe7ef3857b4a00b63dedd632d6183d4ea5a31c5d8a92e000a41fc71"}, {file = 
"SQLAlchemy-1.4.49-cp310-cp310-macosx_11_0_x86_64.whl", hash = "sha256:95b9df9afd680b7a3b13b38adf6e3a38995da5e162cc7524ef08e3be4e5ed3e1"}, {file = "SQLAlchemy-1.4.49-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a63e43bf3f668c11bb0444ce6e809c1227b8f067ca1068898f3008a273f52b09"}, + {file = "SQLAlchemy-1.4.49-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ca46de16650d143a928d10842939dab208e8d8c3a9a8757600cae9b7c579c5cd"}, {file = "SQLAlchemy-1.4.49-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:f835c050ebaa4e48b18403bed2c0fda986525896efd76c245bdd4db995e51a4c"}, {file = "SQLAlchemy-1.4.49-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9c21b172dfb22e0db303ff6419451f0cac891d2e911bb9fbf8003d717f1bcf91"}, {file = "SQLAlchemy-1.4.49-cp310-cp310-win32.whl", hash = "sha256:5fb1ebdfc8373b5a291485757bd6431de8d7ed42c27439f543c81f6c8febd729"}, @@ -3128,26 +3183,35 @@ files = [ {file = "SQLAlchemy-1.4.49-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5debe7d49b8acf1f3035317e63d9ec8d5e4d904c6e75a2a9246a119f5f2fdf3d"}, {file = "SQLAlchemy-1.4.49-cp311-cp311-win32.whl", hash = "sha256:82b08e82da3756765c2e75f327b9bf6b0f043c9c3925fb95fb51e1567fa4ee87"}, {file = "SQLAlchemy-1.4.49-cp311-cp311-win_amd64.whl", hash = "sha256:171e04eeb5d1c0d96a544caf982621a1711d078dbc5c96f11d6469169bd003f1"}, + {file = "SQLAlchemy-1.4.49-cp312-cp312-macosx_10_9_universal2.whl", hash = "sha256:f23755c384c2969ca2f7667a83f7c5648fcf8b62a3f2bbd883d805454964a800"}, + {file = "SQLAlchemy-1.4.49-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8396e896e08e37032e87e7fbf4a15f431aa878c286dc7f79e616c2feacdb366c"}, + {file = 
"SQLAlchemy-1.4.49-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:66da9627cfcc43bbdebd47bfe0145bb662041472393c03b7802253993b6b7c90"}, + {file = "SQLAlchemy-1.4.49-cp312-cp312-win32.whl", hash = "sha256:9a06e046ffeb8a484279e54bda0a5abfd9675f594a2e38ef3133d7e4d75b6214"}, + {file = "SQLAlchemy-1.4.49-cp312-cp312-win_amd64.whl", hash = "sha256:7cf8b90ad84ad3a45098b1c9f56f2b161601e4670827d6b892ea0e884569bd1d"}, {file = "SQLAlchemy-1.4.49-cp36-cp36m-macosx_10_14_x86_64.whl", hash = "sha256:36e58f8c4fe43984384e3fbe6341ac99b6b4e083de2fe838f0fdb91cebe9e9cb"}, {file = "SQLAlchemy-1.4.49-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:b31e67ff419013f99ad6f8fc73ee19ea31585e1e9fe773744c0f3ce58c039c30"}, + {file = "SQLAlchemy-1.4.49-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ebc22807a7e161c0d8f3da34018ab7c97ef6223578fcdd99b1d3e7ed1100a5db"}, {file = "SQLAlchemy-1.4.49-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:c14b29d9e1529f99efd550cd04dbb6db6ba5d690abb96d52de2bff4ed518bc95"}, {file = "SQLAlchemy-1.4.49-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c40f3470e084d31247aea228aa1c39bbc0904c2b9ccbf5d3cfa2ea2dac06f26d"}, {file = "SQLAlchemy-1.4.49-cp36-cp36m-win32.whl", hash = "sha256:706bfa02157b97c136547c406f263e4c6274a7b061b3eb9742915dd774bbc264"}, {file = "SQLAlchemy-1.4.49-cp36-cp36m-win_amd64.whl", hash = "sha256:a7f7b5c07ae5c0cfd24c2db86071fb2a3d947da7bd487e359cc91e67ac1c6d2e"}, {file = "SQLAlchemy-1.4.49-cp37-cp37m-macosx_11_0_x86_64.whl", hash = "sha256:4afbbf5ef41ac18e02c8dc1f86c04b22b7a2125f2a030e25bbb4aff31abb224b"}, {file = "SQLAlchemy-1.4.49-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = 
"sha256:24e300c0c2147484a002b175f4e1361f102e82c345bf263242f0449672a4bccf"}, + {file = "SQLAlchemy-1.4.49-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:393cd06c3b00b57f5421e2133e088df9cabcececcea180327e43b937b5a7caa5"}, {file = "SQLAlchemy-1.4.49-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:201de072b818f8ad55c80d18d1a788729cccf9be6d9dc3b9d8613b053cd4836d"}, {file = "SQLAlchemy-1.4.49-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7653ed6817c710d0c95558232aba799307d14ae084cc9b1f4c389157ec50df5c"}, {file = "SQLAlchemy-1.4.49-cp37-cp37m-win32.whl", hash = "sha256:647e0b309cb4512b1f1b78471fdaf72921b6fa6e750b9f891e09c6e2f0e5326f"}, {file = "SQLAlchemy-1.4.49-cp37-cp37m-win_amd64.whl", hash = "sha256:ab73ed1a05ff539afc4a7f8cf371764cdf79768ecb7d2ec691e3ff89abbc541e"}, {file = "SQLAlchemy-1.4.49-cp38-cp38-macosx_11_0_x86_64.whl", hash = "sha256:37ce517c011560d68f1ffb28af65d7e06f873f191eb3a73af5671e9c3fada08a"}, {file = "SQLAlchemy-1.4.49-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a1878ce508edea4a879015ab5215546c444233881301e97ca16fe251e89f1c55"}, + {file = "SQLAlchemy-1.4.49-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:95ab792ca493891d7a45a077e35b418f68435efb3e1706cb8155e20e86a9013c"}, {file = "SQLAlchemy-1.4.49-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:0e8e608983e6f85d0852ca61f97e521b62e67969e6e640fe6c6b575d4db68557"}, {file = "SQLAlchemy-1.4.49-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = 
"sha256:ccf956da45290df6e809ea12c54c02ace7f8ff4d765d6d3dfb3655ee876ce58d"}, {file = "SQLAlchemy-1.4.49-cp38-cp38-win32.whl", hash = "sha256:f167c8175ab908ce48bd6550679cc6ea20ae169379e73c7720a28f89e53aa532"}, {file = "SQLAlchemy-1.4.49-cp38-cp38-win_amd64.whl", hash = "sha256:45806315aae81a0c202752558f0df52b42d11dd7ba0097bf71e253b4215f34f4"}, {file = "SQLAlchemy-1.4.49-cp39-cp39-macosx_11_0_x86_64.whl", hash = "sha256:b6d0c4b15d65087738a6e22e0ff461b407533ff65a73b818089efc8eb2b3e1de"}, {file = "SQLAlchemy-1.4.49-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a843e34abfd4c797018fd8d00ffffa99fd5184c421f190b6ca99def4087689bd"}, + {file = "SQLAlchemy-1.4.49-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:738d7321212941ab19ba2acf02a68b8ee64987b248ffa2101630e8fccb549e0d"}, {file = "SQLAlchemy-1.4.49-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:1c890421651b45a681181301b3497e4d57c0d01dc001e10438a40e9a9c25ee77"}, {file = "SQLAlchemy-1.4.49-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d26f280b8f0a8f497bc10573849ad6dc62e671d2468826e5c748d04ed9e670d5"}, {file = "SQLAlchemy-1.4.49-cp39-cp39-win32.whl", hash = "sha256:ec2268de67f73b43320383947e74700e95c6770d0c68c4e615e9897e46296294"}, @@ -3568,4 +3632,4 @@ testing = ["big-O", "jaraco.functools", "jaraco.itertools", "more-itertools", "p [metadata] lock-version = "2.0" python-versions = ">=3.10,<3.12" -content-hash = "67d172d649a7aa6751c9eccc71a2b2fd2c509dd5e0263cbbcd4754dce1f4ff42" +content-hash = "53b50642d03ff2691d2621fbe06eb6e7ab599a7174f9a7829796d54e7842b910" diff --git a/pyproject.toml b/pyproject.toml index 66a4cb69..32462d22 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -19,8 +19,7 @@ python = ">=3.10,<3.12" kghub-downloader = "^0.3.2" koza = "^0.3.0" 
cat-merge = ">=0.2.0" -#closurizer = "^0.3.2" -closurizer = {git="https://github.com/monarch-initiative/closurizer.git", branch="grouping_key"} +closurizer = "0.4.0" kgx = ">=2.1" multi-indexer = "0.0.5" botocore = "^1.31" diff --git a/scripts/load_solr.sh b/scripts/load_solr.sh index dce18c57..462f1f19 100755 --- a/scripts/load_solr.sh +++ b/scripts/load_solr.sh @@ -12,13 +12,10 @@ if test -f "output/monarch-kg-denormalized-edges.tsv.gz"; then fi echo "Download the schema from monarch-py" -# This replaces poetry run monarch schema > model.yaml +# This replaces poetry run monarch schema > model.yaml and just awkwardly pulls from a github raw link -# temporarily retrieve from a branch that has the sssom changes - -curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/schema-sssom-and-grouping/backend/src/monarch_py/datamodels/model.yaml -#curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/v0.15.8/backend/src/monarch_py/datamodels/model.yaml -curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/v0.15.8/backend/src/monarch_py/datamodels/similarity.yaml +curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/main/backend/src/monarch_py/datamodels/model.yaml +curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/main/backend/src/monarch_py/datamodels/similarity.yaml echo "Starting the server" poetry run lsolr start-server From 1086338e5f3bee4511a4da7601701d76faa7757f Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Fri, 27 Oct 2023 16:26:52 -0700 Subject: [PATCH 06/14] Go back to getting the schema from a branch, because it can't go into main on monarch-app until after the ingest build runs. awkward. very awkward. 
--- scripts/load_solr.sh | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/scripts/load_solr.sh b/scripts/load_solr.sh index 462f1f19..fcbf25bd 100755 --- a/scripts/load_solr.sh +++ b/scripts/load_solr.sh @@ -14,8 +14,9 @@ fi echo "Download the schema from monarch-py" # This replaces poetry run monarch schema > model.yaml and just awkwardly pulls from a github raw link -curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/main/backend/src/monarch_py/datamodels/model.yaml -curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/main/backend/src/monarch_py/datamodels/similarity.yaml +# temporarily retrieve from a branch that has the sssom changes, they can't be merged until the new build runs +curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/schema-sssom-and-grouping/backend/src/monarch_py/datamodels/model.yaml +curl -O https://raw.githubusercontent.com/monarch-initiative/monarch-app/schema-sssom-and-grouping/backend/src/monarch_py/datamodels/similarity.yaml echo "Starting the server" poetry run lsolr start-server From 04e2fd30cfcecce977486406476a6ad51c5985f3 Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Fri, 27 Oct 2023 16:36:06 -0700 Subject: [PATCH 07/14] Fix hardcoded field names in copyfields command --- scripts/add_association_copyfields.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/add_association_copyfields.sh b/scripts/add_association_copyfields.sh index 596ea92b..0bf9cf58 100755 --- a/scripts/add_association_copyfields.sh +++ b/scripts/add_association_copyfields.sh @@ -36,7 +36,7 @@ curl -X POST -H 'Content-type:application/json' --data-binary '{ # now add copyfields declarations for subject_label, subject_closure_label, object_label, object_closure_label -for field in subject_label subject_closure_label subject_taxon subject_taxon_label predicate object_label object_closure_label object_taxon object_taxon_label primary_knowledge_source 
qualifier_label onset_qualifier_label frequency_qualifier_label sex_qualifier_label +for field in subject_label subject_closure_label subject_taxon subject_taxon_label predicate object_label object_closure_label object_taxon object_taxon_label primary_knowledge_source qualifiers_label onset_qualifier_label frequency_qualifier_label sex_qualifier_label do curl -X POST -H 'Content-type:application/json' --data-binary "{ \"add-copy-field\": { From 85515a633dbffd772cf036e976734e1c2e7c8840 Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Mon, 30 Oct 2023 14:08:45 -0700 Subject: [PATCH 08/14] fix release update script --- scripts/update_latest_release.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/update_latest_release.sh b/scripts/update_latest_release.sh index 66c81aba..0ccbeb5b 100755 --- a/scripts/update_latest_release.sh +++ b/scripts/update_latest_release.sh @@ -3,7 +3,7 @@ # This script will push a local copy of the Solr, Sqlite and denormalized edge artifacts up to all # all copies of the bucket for a given release. 
It needs to be run from the root of the repo -RELEASE=$(gsutil ls gs://data-public-monarchinitiative/monarch-kg-dev/latest/ | grep -Eo "(\d){4}-(\d){2}-(\d){2}") +export RELEASE=$(gsutil ls gs://data-public-monarchinitiative/monarch-kg-dev/latest/ | grep -Eo "(\d){4}-(\d){2}-(\d){2}") echo "Updating Solr, SQLite and denormalized edge files for $RELEASE" gsutil cp output/monarch-kg.db.gz gs://monarch-archive/monarch-kg-dev/$RELEASE/ From 08e828fa6532895a3729472320e2aca94822737a Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Mon, 30 Oct 2023 14:26:43 -0700 Subject: [PATCH 09/14] make release update file moving a little more specific --- scripts/update_latest_release.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/update_latest_release.sh b/scripts/update_latest_release.sh index 0ccbeb5b..8b11434c 100755 --- a/scripts/update_latest_release.sh +++ b/scripts/update_latest_release.sh @@ -10,5 +10,5 @@ gsutil cp output/monarch-kg.db.gz gs://monarch-archive/monarch-kg-dev/$RELEASE/ gsutil cp output/monarch-kg-denormalized-edges.tsv.gz gs://monarch-archive/monarch-kg-dev/$RELEASE/ gsutil cp output/solr.tar.gz gs://monarch-archive/monarch-kg-dev/$RELEASE/ -gsutil cp -r "gs://monarch-archive/monarch-kg-dev/$RELEASE/*" gs://data-public-monarchinitiative/monarch-kg-dev/$RELEASE/ -gsutil cp -r "gs://monarch-archive/monarch-kg-dev/$RELEASE/*" gs://monarch-archive/monarch-kg/latest/ +gsutil cp "gs://monarch-archive/monarch-kg-dev/$RELEASE/*.gz" gs://data-public-monarchinitiative/monarch-kg-dev/$RELEASE/ +gsutil cp "gs://monarch-archive/monarch-kg-dev/$RELEASE/*.gz" gs://monarch-archive/monarch-kg/latest/ From d65456eb3667a47d960e21f766b1ec65f1b4f774 Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Mon, 30 Oct 2023 15:02:20 -0700 Subject: [PATCH 10/14] Update ingest artifact redo jenkinsfile --- Jenkinsfile-redo-solr | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/Jenkinsfile-redo-solr b/Jenkinsfile-redo-solr index 
dba905eb..a707af5b 100644 --- a/Jenkinsfile-redo-solr +++ b/Jenkinsfile-redo-solr @@ -17,6 +17,16 @@ pipeline { description: 'Re-run denormalization step', name: 'RUN_CLOSURIZER' ), + booleanParam( + defaultValue: false, + description: 'Load Solr', + name: 'SOLR' + ), + booleanParam( + defaultValue: false, + description: 'Load sqlite', + name: 'SQLITE' + ), booleanParam( defaultValue: false, description: 'Upload to bucket', @@ -69,11 +79,21 @@ pipeline { } } stage('solr') { + when { + expression { + return params.SOLR + } + } steps { sh 'poetry run ingest solr' } } stage('sqlite') { + when { + expression { + return params.SQLITE + } + } steps { sh 'poetry run ingest sqlite' } From 0e07ab502be8304457b4c8b61d1954683d94796e Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Wed, 1 Nov 2023 14:57:52 -0700 Subject: [PATCH 11/14] Update closurizer to 0.4.1 to fix string vs array aggregation problem in denormalized edge output --- poetry.lock | 8 ++++---- pyproject.toml | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/poetry.lock b/poetry.lock index 1066405b..765a3e1a 100644 --- a/poetry.lock +++ b/poetry.lock @@ -357,13 +357,13 @@ colorama = {version = "*", markers = "platform_system == \"Windows\""} [[package]] name = "closurizer" -version = "0.4.0" +version = "0.4.1" description = "Add closure expansion fields to kgx files following the Golr pattern" optional = false python-versions = ">=3.8,<4.0" files = [ - {file = "closurizer-0.4.0-py3-none-any.whl", hash = "sha256:3f49cb5edea4c673079752d2e9256282a50d3c5e857a068346792ee1070d8182"}, - {file = "closurizer-0.4.0.tar.gz", hash = "sha256:df655756264300861c33dddf3231aa158a7a68fa00f5cf7374b3889eb196ccdf"}, + {file = "closurizer-0.4.1-py3-none-any.whl", hash = "sha256:b25d03d017b098c0d11a1238df2807d4cba8af23af9b4eafcd48888b60fb91dd"}, + {file = "closurizer-0.4.1.tar.gz", hash = "sha256:20d99424b1cc036c5c4328cb14848f7ca9806a1c21a14b9f4bd9231a2d091660"}, ] [package.dependencies] @@ -3632,4 +3632,4 @@ 
testing = ["big-O", "jaraco.functools", "jaraco.itertools", "more-itertools", "p [metadata] lock-version = "2.0" python-versions = ">=3.10,<3.12" -content-hash = "53b50642d03ff2691d2621fbe06eb6e7ab599a7174f9a7829796d54e7842b910" +content-hash = "67a68dc4ef89e34b84026c73139e95272adb275410cb4dde7b4bbe1bafa862a9" diff --git a/pyproject.toml b/pyproject.toml index 32462d22..461c438a 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -19,7 +19,7 @@ python = ">=3.10,<3.12" kghub-downloader = "^0.3.2" koza = "^0.3.0" cat-merge = ">=0.2.0" -closurizer = "0.4.0" +closurizer = "0.4.1" kgx = ">=2.1" multi-indexer = "0.0.5" botocore = "^1.31" From 82a04ecb5128ab889cffb16c96af627f3c3bbe49 Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Wed, 1 Nov 2023 18:50:32 -0700 Subject: [PATCH 12/14] Use pigz instead of gzip --- scripts/after_download.sh | 2 +- scripts/load_solr.sh | 2 +- scripts/load_sqlite.sh | 8 ++++---- src/monarch_ingest/cli_utils.py | 2 +- 4 files changed, 7 insertions(+), 7 deletions(-) diff --git a/scripts/after_download.sh b/scripts/after_download.sh index 66c35194..f1da15a8 100755 --- a/scripts/after_download.sh +++ b/scripts/after_download.sh @@ -1,7 +1,7 @@ #!/bin/sh # Make a simple text file of all the gene IDs in Alliance -zcat data/alliance/BGI_*.gz | jq '.data[].basicGeneticEntity.primaryId' | gzip > data/alliance/alliance_gene_ids.txt.gz +zcat data/alliance/BGI_*.gz | jq '.data[].basicGeneticEntity.primaryId' | pigz > data/alliance/alliance_gene_ids.txt.gz # Make an id, name map of DDPHENO terms sqlite3 -cmd ".mode tabs" -cmd ".headers on" data/dictybase/ddpheno.db "select subject as id, value as name from rdfs_label_statement where predicate = 'rdfs:label' and subject like 'DDPHENO:%'" > data/dictybase/ddpheno.tsv diff --git a/scripts/load_solr.sh b/scripts/load_solr.sh index fcbf25bd..7b2e617c 100755 --- a/scripts/load_solr.sh +++ b/scripts/load_solr.sh @@ -78,4 +78,4 @@ chmod -R a+rX solr-data tar czf solr.tar.gz -C solr-data data mv solr.tar.gz 
output/ -gzip --force output/monarch-kg-denormalized-edges.tsv +pigz --force output/monarch-kg-denormalized-edges.tsv diff --git a/scripts/load_sqlite.sh b/scripts/load_sqlite.sh index d427dfb0..fd7a582b 100755 --- a/scripts/load_sqlite.sh +++ b/scripts/load_sqlite.sh @@ -27,8 +27,8 @@ sqlite3 output/monarch-kg.db "create index if not exists denormalized_edges_obje echo "Cleaning up..." rm output/monarch-kg_*.tsv -gzip --force output/qc/monarch-kg-dangling-edges.tsv -gzip --force output/monarch-kg-denormalized-edges.tsv +pigz --force output/qc/monarch-kg-dangling-edges.tsv +pigz --force output/monarch-kg-denormalized-edges.tsv echo "Populate phenio db term_association..." cp data/monarch/phenio.db.gz output/phenio.db.gz @@ -36,5 +36,5 @@ gunzip output/phenio.db.gz sqlite3 -cmd "attach 'monarch-kg.db' as monarch" phenio.db "insert into term_association (id, subject, predicate, object, evidence_type, publication, source) select id, subject, predicate, object, has_evidence as evidence_type, publications as publication, primary_knowledge_source as source from monarch.edges where predicate = 'biolink:has_phenotype' and negated <> 'True'" echo "Compressing databases" -gzip --force output/phenio.db -gzip --force output/monarch-kg.db +pigz --force output/phenio.db +pigz --force output/monarch-kg.db diff --git a/src/monarch_ingest/cli_utils.py b/src/monarch_ingest/cli_utils.py index 44725fa2..c608532a 100644 --- a/src/monarch_ingest/cli_utils.py +++ b/src/monarch_ingest/cli_utils.py @@ -326,7 +326,7 @@ def apply_closure( 'stage_qualifier'], evidence_fields=['has_evidence', 'publications'], grouping_fields=['subject', 'negated', 'predicate', 'object']) - sh.gzip(output_file, force=True) + sh.pigz(output_file, force=True) def load_sqlite(): sh.bash("scripts/load_sqlite.sh") From 83e462dd18d8d6025cbe99434bd1d02b7d8f62ba Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Wed, 1 Nov 2023 18:57:04 -0700 Subject: [PATCH 13/14] Fix release finding regex in update_lastest_relase.sh 
--- scripts/update_latest_release.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/update_latest_release.sh b/scripts/update_latest_release.sh index 8b11434c..d947568b 100755 --- a/scripts/update_latest_release.sh +++ b/scripts/update_latest_release.sh @@ -3,7 +3,7 @@ # This script will push a local copy of the Solr, Sqlite and denormalized edge artifacts up to all # all copies of the bucket for a given release. It needs to be run from the root of the repo -export RELEASE=$(gsutil ls gs://data-public-monarchinitiative/monarch-kg-dev/latest/ | grep -Eo "(\d){4}-(\d){2}-(\d){2}") +export RELEASE=$(gsutil ls gs://data-public-monarchinitiative/monarch-kg-dev/latest/ | grep -o '[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}') echo "Updating Solr, SQLite and denormalized edge files for $RELEASE" gsutil cp output/monarch-kg.db.gz gs://monarch-archive/monarch-kg-dev/$RELEASE/ From c47dc45640dcde83a5d6d5c06c50ce766f4f6982 Mon Sep 17 00:00:00 2001 From: Kevin Schaper Date: Wed, 1 Nov 2023 18:59:37 -0700 Subject: [PATCH 14/14] Fix upload locations in upload_latest_release.sh --- scripts/update_latest_release.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/update_latest_release.sh b/scripts/update_latest_release.sh index d947568b..9ab68d1a 100755 --- a/scripts/update_latest_release.sh +++ b/scripts/update_latest_release.sh @@ -11,4 +11,4 @@ gsutil cp output/monarch-kg-denormalized-edges.tsv.gz gs://monarch-archive/monar gsutil cp output/solr.tar.gz gs://monarch-archive/monarch-kg-dev/$RELEASE/ gsutil cp "gs://monarch-archive/monarch-kg-dev/$RELEASE/*.gz" gs://data-public-monarchinitiative/monarch-kg-dev/$RELEASE/ -gsutil cp "gs://monarch-archive/monarch-kg-dev/$RELEASE/*.gz" gs://monarch-archive/monarch-kg/latest/ +gsutil cp "gs://monarch-archive/monarch-kg-dev/$RELEASE/*.gz" gs://data-public-monarchinitiative/monarch-kg-dev/latest/
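Note on the regex change in PATCH 13/14: `\d` is a Perl-style character class and is not part of POSIX extended regular expressions, so its behavior under `grep -E` is not portable. The patched script uses bracket expressions with basic-regex intervals instead. The sketch below runs that same pattern against a made-up listing line. The `listing` value is an assumption for illustration only, and the real `gsutil ls` output may be shaped differently.

```shell
#!/bin/sh
# Hypothetical listing line (an assumption, not captured from a real bucket).
listing='gs://data-public-monarchinitiative/monarch-kg-dev/latest/2023-10-30/'

# Same pattern as the patched script: POSIX bracket expressions with
# basic-regex interval syntax, so no -E flag and no \d are needed.
release=$(printf '%s\n' "$listing" | grep -o '[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}')

echo "$release"
```

If the listing contained more than one date-shaped string, `grep -o` would print each match on its own line, so a caller that needs a single value may want to pipe the result through `head -n 1`.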