From ec4fed27a6e8758f5be38b4fcb50f47e407a223b Mon Sep 17 00:00:00 2001
From: Damien Goutte-Gattat
Date: Thu, 9 Jan 2025 12:46:28 +0000
Subject: [PATCH] Update documentation.

Update the documentation about bridge files and collected/composite
ontologies to reflect the recent changes in how those products are
managed.
---
 docs/bridges.md               | 88 ++++++++++++++++++++++++++++++++---
 docs/combined_multispecies.md | 43 +++++++++++------
 2 files changed, 110 insertions(+), 21 deletions(-)

diff --git a/docs/bridges.md b/docs/bridges.md
index 1839b2c200..fe21fc63dd 100644
--- a/docs/bridges.md
+++ b/docs/bridges.md
@@ -157,16 +157,17 @@ current source of truth.
 | `uberon-bridge-to-go.owl` | [Gene ontology](http://obofoundry.org/ontology/go) (GO) | GO xrefs in Uberon |
 | `uberon-bridge-to-hao.owl` | [Hymenoptera anatomy ontology](http://obofoundry.org/ontology/hao) (HAO) | HAO xrefs in Uberon |
 | `uberon-bridge-to-hba.owl` | [Human brain atlas](https://human.brain-map.org/) (HBA) | HBA xrefs in Uberon |
-| `uberon-bridge-to-hsapdv.owl` | [Human developmental stages](http://obofoundry.org/ontology/hsapdv) (HsapDv) | HsapDv xrefs in Uberon |
+| `uberon-bridge-to-hsapdv.owl` | [Human developmental stages](http://obofoundry.org/ontology/hsapdv) (HsapDv) | [SSLSO-maintained](https://github.com/obophenotype/developmental-stage-ontologies) mapping set |
 | `uberon-bridge-to-kupo.owl` | [Kidney and urinary pathway ontology](https://jbiomedsem.biomedcentral.com/articles/10.1186/2041-1480-2-S2-S7) (KUPO) | KUPO xrefs in Uberon |
 | `uberon-bridge-to-ma.owl` | [Mouse adult gross anatomy](http://obofoundry.org/ontology/ma) (MA) | MA xrefs in Uberon |
 | `uberon-bridge-to-mba.owl` | [Mouse brain atlas](https://mouse.brain-map.org/) (MBA) | Externally provided custom bridge, maintained [here](https://github.com/brain-bican/mouse_brain_atlas_ontology) |
-| `uberon-bridge-to-mmusdv.owl` | [Mouse developmental stages](http://obofoundry.org/ontology/mmusdv) (MmusDv) | MmusDv xrefs in Uberon |
+| `uberon-bridge-to-mmusdv.owl` | [Mouse developmental stages](http://obofoundry.org/ontology/mmusdv) (MmusDv) | [SSLSO-maintained](https://github.com/obophenotype/developmental-stage-ontologies) mapping set |
 | `uberon-bridge-to-ncit.owl` | [NCI thesaurus OBO edition](http://obofoundry.org/ontology/ncit) (NCIT) | NCIT xrefs in Uberon |
 | `uberon-bridge-to-oges.owl` | ? | OGES xrefs in Uberon |
 | `uberon-bridge-to-pba.owl` | Primate brain atlas (PBA) | PBA xrefs in Uberon |
-i| `uberon-bridge-to-sctid.owl` | SNOMED CT (SCTID) | SCTID xrefs in Uberon |
+| `uberon-bridge-to-sctid.owl` | SNOMED CT (SCTID) | SCTID xrefs in Uberon |
 | `uberon-bridge-to-spd.owl` | [Spider ontology](http://obofoundry.org/ontology/spd) (SPD) | SPD xrefs in Uberon |
+| `uberon-bridge-to-sslso.owl` | [Species-specific life stages ontology](https://github.com/obophenotype/developmental-stage-ontologies) (SSLSO) | [SSLSO-maintained](https://github.com/obophenotype/developmental-stage-ontologies) mapping set |
 | `uberon-bridge-to-tads.owl` | [Tick anatomy ontology](http://obofoundry.org/ontology/tads) (TADS) | TADS xrefs in Uberon |
 | `uberon-bridge-to-tgma.owl` | [Mosquito gross anatomy ontology](http://obofoundry.org/ontology/tgma) (TGMA) | TGMA xrefs in Uberon |
 | `uberon-bridge-to-wbbt.owl` | [C. elegans gross anatomy ontology](http://obofoundry.org/ontology/wbbt) (WBbt) | WBbt xrefs in Uberon |
@@ -223,10 +224,10 @@ Makefile is located). Here is what happens then:
    were derived from the cross-references of foreign ontologies
    (step 2), and the sets that we obtained directly from foreign
    ontologies (step 3).
-5. Production of the bridges. We apply a large SSSOM/T-OWL ruleset (in
-   `tmp/bridges.rules`, derived from a M4 source in `bridges/rules.m4`)
-   to produce bridging axioms from the mappings in the combined mapping
-   set obtained at step 4.
+5. Production of the bridges. We apply a large SSSOM/T-OWL ruleset
+   (derived from `bridge/bridges.rules`, see below for more details about
+   that ruleset) to produce bridging axioms from the mappings in the
+   combined mapping set obtained at step 4.
 6. Dispatching into individual bridges. The axioms produced at step 5
    are written to different bridges depending on the ontologies they are
    bridging (e.g., an axiom that bridges a Uberon term and a ZFA term
@@ -243,3 +244,76 @@ previously committed to the repository.
 Uberon maintainers should run `make refresh-bridges` periodically to
 refresh external resources and commit refreshed versions to the
 repository, similarly to what is needed to refresh the imports.
+
+### The bridging SSSOM/T-OWL ruleset
+
+The SSSOM/T-OWL ruleset mentioned in step 5 above is where most of the
+logic for the production of bridge files is described. For a general
+overview of the SSSOM/T-OWL language, please refer to the corresponding
+[documentation](https://incenp.org/dvlpt/sssom-java/sssom-ext/sssomt-owl.html)
+in the SSSOM-Java project.
+
+Because the ruleset is highly repetitive, with many almost identical
+rules for every species we need to bridge to, it is partially
+automatically generated from a taxa list found in the `config/taxa.yaml`
+file.
+
+An entry in the taxa list should look like the following:
+
+```yaml
+taxon_id: NCBITaxon:7227
+label: D melanogaster
+bridging:
+  - prefix: FBbt
+    namespace: http://purl.obolibrary.org/obo/FBbt_
+    label: D melanogaster
+    name: fbbt
+  - prefix: FBdv
+    namespace: http://purl.obolibrary.org/obo/FBdv_
+    label: D melanogaster
+    name: fbdv
+```
+
+* `taxon_id` is the NCBI taxonomic identifier for the species we are
+  bridging to.
+* `label` is the human-readable name of the taxon.
+* `bridging` is the list of bridges (there may be more than one, as in
+  the example above) for that taxon.
+
+Each bridge is in turn described by:
+
+* the `prefix` identifying the terms the bridge should include;
+* the corresponding `namespace` the prefix expands to;
+* the human-readable `label` of the taxon that should be appended to the
+  label of each bridged term to form the “OBO Foundry Unique Label”;
+* the `name` of the bridge, which will be used to form the name of the
+  bridge file (`uberon-bridge-to-NAME.owl`).
+
+So, the example above states that `FBbt:*` terms (terms in the
+`http://purl.obolibrary.org/obo/FBbt_` namespace) mapped to Uberon terms
+should yield bridging axioms in a `uberon-bridge-to-fbbt.owl` bridge
+file, while `FBdv:*` terms (in the
+`http://purl.obolibrary.org/obo/FBdv_` namespace) mapped to Uberon terms
+should yield bridging axioms in a `uberon-bridge-to-fbdv.owl` file. In
+all cases, the bridged terms should be annotated with an “OBO Foundry
+Unique Label” annotation of the form “Uberon label (D melanogaster)”.
+
+Note that:
+
+* If a bridge does not have an explicit `namespace`, a default namespace
+  of `http://purl.obolibrary.org/obo/PREFIX_` is used.
+* If a bridge does not have an explicit `label`, the top-level `label`
+  for the taxon is used.
+* If a bridge does not have an explicit `name`, the default name is the
+  lowercase version of the prefix.
+
+Those default rules mean that the example above can be written more
+simply as:
+
+```yaml
+taxon_id: NCBITaxon:7227
+label: D melanogaster
+bridging:
+  - prefix: FBbt
+  - prefix: FBdv
+```
diff --git a/docs/combined_multispecies.md b/docs/combined_multispecies.md
index ebcdc02b7f..b7efe76a12 100644
--- a/docs/combined_multispecies.md
+++ b/docs/combined_multispecies.md
@@ -45,8 +45,8 @@ Uberon defines several collected ontologies for different taxonomic
 levels. The custom [Uberon
 Makefile](https://github.com/obophenotype/uberon/blob/master/src/ontology/uberon.Makefile),
 in its “Composite pipeline” section, is the definitive source of truth
-for the various collected ontologies that are available, but as of April
-2024 the list is as follows (for simplicity, bridge files are not
+for the various collected ontologies that are available, but as of
+January 2025 the list is as follows (for simplicity, bridge files are not
 mentioned):
 
 | Product | Components |
@@ -65,13 +65,13 @@ mentioned):
 | collected-tetrapod | collected-amniote + collected-xenopus |
 | collected-vertebrate | collected-tetrapod + collected-zebrafish |
 | collected-metazoan | collected-vertebrate + collected-drosophila + collected-worm + CEPH + CTENO + PORO |
+| collected-lifestages | Uberon’s life-stages subset + all available species-specific life stages ontologies |
 
-Note that only `collected-metazoan` is regularly built and provided as a
-release artifact, available through a permanent URL in both
-[OBO](http://purl.obolibrary.org/obo/uberon/collected-metazoan.obo) and
-[OWL](http://purl.obolibrary.org/obo/uberon/collected-metazoan.owl)
-formats. Other products, if they are needed, must be built on demand
-(see further below for instructions on how to do that).
+Note that only `collected-metazoan`, `collected-vertebrate`, and
+`collected-lifestages` are regularly built and provided as release
+artifacts, available through permanent URLs in OBO, OWL, and
+Obograph-JSON formats. Other products, if they are needed, must be built
+on demand (see further below for instructions on how to do that).
 
 #### Advantages
 
@@ -152,12 +152,10 @@ Because composite ontologies are derived from the collected ontologies,
 each collected ontology has a corresponding composite ontology.
 Therefore, you may refer to the list of collected ontologies above.
 
-As for the collected ontologies, only `composite-metazoan` is built
-regularly and provided as a pre-built artifact, available through a
-permanent URL in both
-[OBO](http://purl.obolibrary.org/obo/uberon/composite-metazoan.obo) and
-[OWL](http://purl.obolibrary.org/obo/uberon/composite-metazoan.owl)
-formats. Other products, if they are needed, must be built on demand.
+As for the collected ontologies, only `composite-metazoan`,
+`composite-vertebrate`, and `composite-lifestages` are built regularly
+and provided as pre-built artifacts. Other products, if they are needed,
+must be built on demand.
 
 #### Advantages and disadvantages
 
@@ -212,3 +210,20 @@ To build a given composite ontology, simply run:
 ```sh
 sh run.sh make composite-.owl
 ```
+
+## Adding a species to a collected/composite ontology
+
+(This is _not_ exhaustive documentation, but is intended as a rough
+guide for future reference.)
+
+1. Add a “local mirror” for the species-specific ontology to be
+   included. Follow the examples of the Makefile rules for the existing
+   local mirrors.
+2. Ensure that mappings between Uberon/CL terms and the species-specific
+   terms are available -- either maintained in Uberon directly, or
+   fetched from a remote source (likely the species-specific ontology).
+3. Add a [bridge file](bridges.md) (or _several bridges_, if needed) for
+   the species-specific ontology.
+4. Add the local mirror and any corresponding bridge file to the list of
+   source files in the `COLLECTED_xxx_SOURCES` variable (where `xxx` is
+   the name of the collected/composite ontology, e.g. `metazoan`).
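
Reviewer note on the `docs/bridges.md` hunk: the three default rules for taxa-list entries (default `namespace`, `label`, and `name`) can be sketched in a few lines of Python. This is only an illustration of the rules as documented in the patch, not the code Uberon actually uses to generate the ruleset. The function name `expand_taxon_entry`, the constant `OBO_BASE`, and the `filename` key are hypothetical names introduced here, and the entry is written as a plain dictionary rather than parsed from `config/taxa.yaml`.

```python
from typing import Any

# Base of the default namespace described in the docs
# (http://purl.obolibrary.org/obo/PREFIX_).
OBO_BASE = "http://purl.obolibrary.org/obo/"


def expand_taxon_entry(entry: dict[str, Any]) -> list[dict[str, str]]:
    """Fill in the documented defaults for every bridge of one taxon entry.

    Hypothetical helper: it mirrors the rules stated in docs/bridges.md,
    not the actual generator used by the Uberon pipeline.
    """
    expanded = []
    for bridge in entry.get("bridging", []):
        prefix = bridge["prefix"]
        # Default name: lowercase version of the prefix.
        name = bridge.get("name", prefix.lower())
        expanded.append({
            "prefix": prefix,
            # Default namespace: OBO PURL base + prefix + underscore.
            "namespace": bridge.get("namespace", f"{OBO_BASE}{prefix}_"),
            # Default label: the top-level label of the taxon.
            "label": bridge.get("label", entry["label"]),
            "name": name,
            # Bridge file name pattern given in the docs.
            "filename": f"uberon-bridge-to-{name}.owl",
        })
    return expanded


# The shortened example entry from the documentation.
short_entry = {
    "taxon_id": "NCBITaxon:7227",
    "label": "D melanogaster",
    "bridging": [{"prefix": "FBbt"}, {"prefix": "FBdv"}],
}

for bridge in expand_taxon_entry(short_entry):
    print(bridge["prefix"], bridge["namespace"], bridge["filename"])
```

Under these assumptions, the shortened entry expands to the same values as the fully explicit entry shown in the documentation, which is the equivalence the docs claim.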