Add get_sssom_mappings_by_curie to BioportalImplementation class #17

Merged
merged 3 commits into from
Apr 14, 2022
1 change: 1 addition & 0 deletions .gitignore
@@ -1,4 +1,5 @@
.idea
.vscode
__pycache__
.ipynb_checkpoints/
docs/_build/
18 changes: 9 additions & 9 deletions docs/concepts.rst
@@ -4,7 +4,7 @@ Ontology Concepts
=================

Here we describe some of the over-arching concepts in this library. Note that distinct :ref:`datamodels` may impose
-their own specific views of the world, but the concepts here are intended as a kind of lingua-franca
+their own specific views of the world, but the concepts here are intended as a kind of lingua-franca.

Ontology
--------
@@ -16,15 +16,15 @@ paper from 2000, a collection of thousands of inter-related terms. However, the
very flexible and malleable, and might include things like:

* things that are more "schema-like", such as schema.org or PROV
-* formal logical artefacts like BFO
+* formal logical artifacts like BFO
* an "instance graph" for example of countries and their connections
* a knowledge base encoded in RDF
* the entirety of wikidata
* in OWL, an ontology is just a collection of axioms

We try to be as pluralistic as possible and provide a way to access all of the above using
-the appropriate abstractions. However, the main community served in "classic" ontologies such
-as those found in the OBO library or those encoded in OWL
+the appropriate abstractions. However, the main community served is "classic" ontologies such
+as those found in the OBO library or those encoded in OWL.

Ontology Element
----------------
@@ -55,7 +55,7 @@ When working with a specific datamodel these may be partitioned more strictly. F
- AnnotationProperties
- DatatypeProperties

-The BasicOntologyInterface does not discriminate between different kinds of element. This can be confusing,
+The BasicOntologyInterface does not discriminate between different kinds of elements. This can be confusing,
if you ask for all elements thinking you might get back only the "terms" but you would also get elements for
relationship types, subsets, etc.

@@ -70,7 +70,7 @@ what appears to be one ontology has pieces of other ontologies incorporated in.

This library is designed to handle all of these scenarios. In the BasicOntologyInterface, you don't have to worry about imports,
you just get a view where everything appears as if it were in a single ontologies (even this ontologies actually a combination of
-ontologies). Other interfaces let you explore the compositional structure in more detail
+ontologies). Other interfaces let you explore the compositional structure in more detail.

URIs and CURIEs and identifiers
-------------------------------
@@ -86,14 +86,14 @@ Most methods in the interfaces in this library accept CURIEs, but these can alwa
Prefix Maps
-----------

-A prefix map maps between prefixes and their URI base expansions
+A prefix map maps between prefixes and their URI base expansions.

Relationships / Edges
---------------------

.. note ::
-It may seem surprising but the Owl standard has no construct that directly corresponds to what we call
+It may seem surprising but the OWL standard has no construct that directly corresponds to what we call
a relationship here.
Mappings
@@ -107,7 +107,7 @@ Statements and Axioms

.. note ::
-You only need to understand this if you are working with the OwlInterface or the RdfInterface
+You only need to understand this if you are working with the OwlInterface or the RdfInterface.
Subsets
-------
46 changes: 42 additions & 4 deletions src/oaklib/implementations/bioportal/bioportal_implementation.py
@@ -1,22 +1,36 @@
-import requests
 import logging
 from dataclasses import dataclass, field
-from typing import Any, List, Dict, Union, Iterator, Iterable, Tuple
+from typing import Any, Dict, Iterable, Iterator, List, Tuple, Union
+from urllib.parse import quote

+import requests
 from oaklib.datamodels.text_annotator import TextAnnotation
 from oaklib.interfaces.basic_ontology_interface import PREFIX_MAP
-from oaklib.interfaces.search_interface import SearchInterface, SearchConfiguration
+from oaklib.interfaces.mapping_provider_interface import MappingProviderInterface
+from oaklib.interfaces.search_interface import SearchConfiguration, SearchInterface
 from oaklib.interfaces.text_annotator_interface import TextAnnotatorInterface
+from oaklib.types import CURIE
 from oaklib.utilities.apikey_manager import get_apikey_value
+from sssom import Mapping
+from sssom.sssom_datamodel import MatchTypeEnum

 REST_URL = "http://data.bioontology.org"

 ANNOTATION = Dict[str, Any]

+# See:
+# https://www.bioontology.org/wiki/BioPortal_Mappings
+# https://github.com/agroportal/project-management/wiki/Mappings
+SOURCE_TO_PREDICATE = {
+    'CUI': 'skos:closeMatch',
+    'LOOM': 'skos:closeMatch',
+    'REST': 'skos:relatedMatch', # maybe??
Collaborator
yeh it's not super clear which skos property to use (other than SAME_URI), I think this is good enough to get started!

+    'SAME_URI': 'skos:exactMatch',
+}


 @dataclass
-class BioportalImplementation(TextAnnotatorInterface, SearchInterface):
+class BioportalImplementation(TextAnnotatorInterface, SearchInterface, MappingProviderInterface):
     """
     Implementation over bioportal endpoint
@@ -115,4 +129,28 @@ def basic_search(self, search_term: str, config: SearchConfiguration = SearchCon
collection = obj['collection']


+    # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+    # Implements: MappingProviderInterface
+    # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+    def get_sssom_mappings_by_curie(self, curie: CURIE) -> Iterable[Mapping]:
+        [prefix, _] = curie.split(':', 2)
+        class_uri = quote(self.curie_to_uri(curie), safe='')
+        # This may return lots of duplicate mappings
+        # See: https://github.com/ncbo/ontologies_linked_data/issues/117
+        req_url = f'{REST_URL}/ontologies/{prefix}/classes/{class_uri}/mappings'
+        response = requests.get(req_url, headers=self._headers(), params={'display_links': 'false', 'display_context': 'false'})
+        body = response.json()
+        for result in body:
+            yield self.result_to_mapping(result)


+    def result_to_mapping(self, result: dict) -> Mapping:
+        mapping = Mapping(
+            subject_id=result['classes'][0]['@id'],
+            predicate_id=SOURCE_TO_PREDICATE[result['source']],
+            match_type=MatchTypeEnum.Unspecified,
+            object_id=result['classes'][1]['@id'],
+            mapping_provider=result['@type']
+        )
+        return mapping
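
A minimal offline sketch of the URL construction that `get_sssom_mappings_by_curie` performs in the diff above, plus one possible way to drop the duplicate mappings that the code comment warns about. The prefix map entry, the helper names (`curie_to_uri`, `mappings_url`, `dedupe`), and the record layout are assumptions made for illustration; they are not part of oaklib or of the BioPortal API. The sketch splits the CURIE with `maxsplit=1`, because `split(':', 2)` can yield three parts and then fail to unpack when the local identifier itself contains a colon.

```python
from urllib.parse import quote

REST_URL = "http://data.bioontology.org"

# Hypothetical prefix map, for illustration only.
PREFIX_MAP = {"UBERON": "http://purl.obolibrary.org/obo/UBERON_"}


def curie_to_uri(curie: str) -> str:
    """Expand a CURIE using the illustrative PREFIX_MAP."""
    prefix, local_id = curie.split(":", 1)
    return PREFIX_MAP[prefix] + local_id


def mappings_url(curie: str) -> str:
    """Build a class-mappings URL in the same shape as the diff above."""
    prefix, _ = curie.split(":", 1)
    # safe='' percent-encodes ':' and '/' so the URI fits in one path segment
    class_uri = quote(curie_to_uri(curie), safe="")
    return f"{REST_URL}/ontologies/{prefix}/classes/{class_uri}/mappings"


def dedupe(records):
    """Yield each (subject, predicate, object) triple once, preserving order."""
    seen = set()
    for rec in records:
        key = (rec["subject_id"], rec["predicate_id"], rec["object_id"])
        if key not in seen:
            seen.add(key)
            yield rec


url = mappings_url("UBERON:0002544")
```

Whether deduplication belongs in the implementation or in the caller is a design choice this PR leaves open; the sketch only shows that it can be done lazily without materializing the full response.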
30 changes: 17 additions & 13 deletions tests/test_implementations/test_bioportal.py
@@ -2,32 +2,36 @@
 import logging
 import unittest

+from linkml_runtime.dumpers import yaml_dumper
 from oaklib.implementations.bioportal.bioportal_implementation import BioportalImplementation

-from tests import HUMAN, NEURON
+from tests import DIGIT, HUMAN, NEURON

 class TestBioportal(unittest.TestCase):

     def setUp(self) -> None:
         oi = BioportalImplementation()
-        self.has_apikey = True
         try:
             oi.load_bioportal_api_key()
         except ValueError:
-            logging.info('Skipping bioportal tests, no API key set')
-            self.has_apikey = False
+            self.skipTest('Skipping bioportal tests, no API key set')
         self.oi = oi

     def test_text_annotator(self):
-        if self.has_apikey:
-            results = list(self.oi.annotate_text('hippocampal neuron from human'))
-            for ann in results:
-                logging.info(ann)
-            assert any(r for r in results if r.object_id == HUMAN)
-            assert any(r for r in results if r.object_id == NEURON)
+        results = list(self.oi.annotate_text('hippocampal neuron from human'))
+        for ann in results:
+            logging.info(ann)
+        assert any(r for r in results if r.object_id == HUMAN)
+        assert any(r for r in results if r.object_id == NEURON)


     def test_search(self):
-        if self.has_apikey:
-            results = list(itertools.islice(self.oi.basic_search('tentacle pocket'), 20))
-            assert 'CEPH:0000259' in results
+        results = list(itertools.islice(self.oi.basic_search('tentacle pocket'), 20))
+        assert 'CEPH:0000259' in results


+    def test_mappings(self):
+        mappings = list(self.oi.get_sssom_mappings_by_curie(DIGIT))
+        for m in mappings:
+            print(yaml_dumper.dumps(m))
+        assert any(m for m in mappings if m.object_id == 'http://purl.obolibrary.org/obo/NCIT_C73791')
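
The test change above replaces the `has_apikey` flag with `self.skipTest(...)` in `setUp`, so tests without a key are reported as skipped instead of passing silently. A self-contained sketch of that pattern follows; the environment variable `EXAMPLE_API_KEY`, the class name, and `run_suite` are hypothetical and stand in for the real `load_bioportal_api_key()` lookup.

```python
import os
import unittest


class ApiKeyGatedTest(unittest.TestCase):
    def setUp(self) -> None:
        # EXAMPLE_API_KEY is a hypothetical variable name used only in this sketch.
        self.apikey = os.environ.get("EXAMPLE_API_KEY")
        if not self.apikey:
            # skipTest raises unittest.SkipTest, so the test body never runs
            self.skipTest("no API key set")

    def test_needs_key(self):
        self.assertTrue(self.apikey)


def run_suite() -> unittest.TestResult:
    """Run the gated test case and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApiKeyGatedTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With the variable unset, the result records one skipped test and no failures, which makes the missing key visible in the test report.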