Add pages on how to contribute a new data source (#78)
* init data contributors section

* init subpages

* add page on contributing data using biothings sdk/x-bte method

* fix filename typo

* Update plover.md

* add link from data-contributor guide to roadmap

* add data-contributor-guide links to roadmap

* Merged and slight restructuring of links to turnkey option pages (which were moved into the development-guide/tutorials folder but their content was otherwise left intact)

* External link to Plater; add a bit more descriptive text to the main tutorial page.

---------

Co-authored-by: Andrew Su <asu@scripps.edu>
Co-authored-by: Andrew Su <andrew.su@gmail.com>
Co-authored-by: Amy Glen <49423686+amykglen@users.noreply.github.com>
Co-authored-by: RichardBruskiewich <richard.bruskiewich@delphinai.com>
5 people authored Sep 26, 2024
1 parent 91fe95a commit c9633d3
Showing 3 changed files with 47 additions and 11 deletions.
14 changes: 14 additions & 0 deletions docs/development-guide/tutorials/biothings-sdk.md
@@ -0,0 +1,14 @@
# BioThings SDK

This method is for users who have a dataset of biomedical knowledge (e.g., assertions of relationships between two biomedical entities) and want to build a clean, high-performance API/web service that can be used both inside AND outside of the Translator formats/ecosystem.

## Workflow

1. Use the [BioThings Software Development Kit (SDK)](https://docs.biothings.io/en/latest/index.html) to create the API (Python): [specific instructions](https://github.com/biothings/biothings_explorer/blob/main/docs/README-contributing-new-data-source.md), [context](https://github.com/biothings/biothings_explorer/blob/main/docs/README-types-of-apis.md#biothings). For Translator, we strongly prefer an "association" format, in which each JSON document represents one relationship/edge between two biomedical entities and has `subject`, `association`, and `object` properties. The Service Provider team may help with developing and deploying the API.
2. Write a SmartAPI YAML file with x-bte annotation to describe this API and its data: [specific instructions](https://github.com/biothings/biothings_explorer/blob/main/docs/README-writing-x-bte.md), [json-schema](https://github.com/NCATSTranslator/translator_extensions/tree/main/x-bte). The BioThings Explorer (BTE) tool can ingest this YAML file and act as a wrapper, transforming Translator-format (TRAPI) queries into corresponding queries to your API and then transforming your API's responses into Translator-format (TRAPI) responses.
   * This requires knowledge of the [Biolink Model](https://github.com/biolink/biolink-model), the data model the Translator consortium uses.
   * The Translator NodeNormalizer (NodeNorm) and NameResolver tools can also be very useful.
3. Register this SmartAPI YAML file in the [SmartAPI Registry](https://smart-api.info/), which Translator uses to collect all tools in its ecosystem.
4. Contact the developer team for the BioThings Explorer (BTE) tool and ask them to add your API to their config of APIs to use. They will review your SmartAPI YAML file and API and may ask for changes first.
5. Test that you can retrieve knowledge graph (KG) edges from your dataset using BTE (direct queries).
6. [Maintain your API and SmartAPI YAML file](https://github.com/biothings/biothings_explorer/blob/main/docs/README-maintaining-a-data-source.md), which may involve continued communication and collaboration with the BTE and Service Provider teams.
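
To make the "association" format from step 1 more concrete, here is a minimal sketch of what one such document might look like, together with a small check for the three required top-level properties. Only `subject`, `association`, and `object` come from the guide above. The `_id` key, every nested field name, and all identifier values are placeholders chosen for illustration, and `missing_association_keys` is a hypothetical helper, not part of the BioThings SDK.

```python
import json

# Hypothetical example of one "association" document: a single edge between
# two biomedical entities. Only the three top-level keys (subject,
# association, object) come from the guide; the `_id` key, the nested field
# names, and all values are illustrative placeholders.
example_doc = {
    "_id": "example-edge-0001",
    "subject": {"id": "EX:0000001", "type": "SmallMolecule"},
    "association": {"predicate": "treats", "source": "my-dataset"},
    "object": {"id": "EX:0000002", "type": "Disease"},
}

REQUIRED_KEYS = ("subject", "association", "object")


def missing_association_keys(doc: dict) -> list:
    """Return the required top-level keys that are absent from `doc`."""
    return [key for key in REQUIRED_KEYS if key not in doc]


if __name__ == "__main__":
    # Pretty-print the document and confirm no required key is missing.
    print(json.dumps(example_doc, indent=2))
    print(missing_association_keys(example_doc))
```

A check like this could run over a sample of documents before the API is built, so that structural problems surface before the SmartAPI YAML file is written against the data.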
26 changes: 15 additions & 11 deletions docs/development-guide/tutorials/component_builders_roadmap.md
@@ -8,14 +8,18 @@ Even with Phase 2 project teams, already familiar with the platform, the mainten

The purpose of this document is to provide a concrete one-stop, step-by-step road map about component design and implementation which serves all the above development needs.

# Resources

1. Turnkey Options for Knowledge Provider creation:
- [BioThings Service Provider](../../architecture/kp/service-provider.md)
- [PLOVER](https://github.com/RTXteam/PloverDB)
- [Plater](https://github.com/TranslatorSRI/Plater)
5. [Biolink Model](https://biolink.github.io/biolink-model/working-with-the-model/)
6. [Implementing TRAPI](https://github.com/NCATSTranslator/ReasonerAPI/tree/master/ImplementationGuidance)
7. [Specifying Workflows](workflows.md)
8. [Deploying a Translator Component](../../deployment-guide/index.md)
9. [Registering a TRAPI service](../../architecture/registry.md#adding-an-api-to-the-translator-smartapi-registry)
## Resources

For data owners who would like to make their data accessible within Translator as Knowledge Providers, there are several turnkey options to consider:

* [BioThings SDK](biothings-sdk.md): good for directly wrapping external third-party online databases that have non-TRAPI-compliant web service APIs.
* [Plater](https://github.com/TranslatorSRI/Plater): good for wrapping a local Neo4j database loaded with Biolink Model-compliant knowledge graph(s).
* [PLOVER](plover.md): serves _in-memory_ hosted Biolink Model-compliant datasets, without the complication of maintaining a backend graph database.

To better understand the various standards and component facets, or to build your own Translator component from scratch, review the following topics:

* [Biolink Model](https://biolink.github.io/biolink-model/working-with-the-model/)
* [Implementing TRAPI](https://github.com/NCATSTranslator/ReasonerAPI/tree/master/ImplementationGuidance)
* [Specifying Workflows](workflows.md)
* [Deploying a Translator Component](../../deployment-guide/index.md)
* [Registering a TRAPI service](../../architecture/registry.md#adding-an-api-to-the-translator-smartapi-registry)
18 changes: 18 additions & 0 deletions docs/development-guide/tutorials/plover.md
@@ -0,0 +1,18 @@
# Plover

Plover is a fully **in-memory** Python-based service for hosting/serving Biolink-compliant knowledge graphs as TRAPI Knowledge Provider APIs. It's entirely **Dockerized** and doesn't use any intermediary database.

In answering queries, Plover abides by all Translator Knowledge Provider **reasoning requirements**; it can also normalize the underlying graph and convert query node IDs to the proper equivalent identifiers for the given knowledge graph.

**Multiple KPs** can be run on the same Plover; each is exposed at its own endpoint (see [here](https://github.com/RTXteam/PloverDB/blob/newkg2/README.md#multiple-kps) for more info).

There are two things needed to serve a KG via Plover:
1. **Nodes** and **edges** files for the graph (TSV or JSON Lines format), hosted at a publicly accessible URL (more info [here](https://github.com/RTXteam/PloverDB/blob/newkg2/README.md#nodes-and-edges-files))
2. A Plover **config file** for the graph (more info [here](https://github.com/RTXteam/PloverDB/blob/newkg2/README.md#config-file))
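
As a rough illustration of the JSON Lines option for the nodes and edges files, the sketch below serializes a tiny graph as one JSON object per line and checks that every edge points at a known node. The field names (`id`, `name`, `category`, `subject`, `predicate`, `object`) follow common Biolink/KGX conventions and are assumptions here, as are the placeholder identifiers; the Plover README linked above remains the authority on which properties are actually required.

```python
import json

# Hypothetical records. Field names follow common Biolink/KGX conventions
# and are assumptions, not Plover's documented requirements.
nodes = [
    {"id": "EX:1", "name": "example drug", "category": "biolink:SmallMolecule"},
    {"id": "EX:2", "name": "example disease", "category": "biolink:Disease"},
]
edges = [
    {"id": "e1", "subject": "EX:1", "predicate": "biolink:treats", "object": "EX:2"},
]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


def from_jsonl(text):
    """Parse JSON Lines text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def dangling_edges(nodes, edges):
    """Return edges whose subject or object is not a known node ID."""
    known = {node["id"] for node in nodes}
    return [
        edge for edge in edges
        if edge["subject"] not in known or edge["object"] not in known
    ]


if __name__ == "__main__":
    # Write both files locally; hosting them at a public URL is a separate step.
    with open("nodes.jsonl", "w", encoding="utf-8") as handle:
        handle.write(to_jsonl(nodes))
    with open("edges.jsonl", "w", encoding="utf-8") as handle:
        handle.write(to_jsonl(edges))
    print(dangling_edges(nodes, edges))
```

Checking for dangling edges before upload is a design choice of this sketch, not a documented Plover step; it simply catches edges that reference nodes missing from the nodes file.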

The steps to then **run/deploy** Plover differ slightly depending on what you're trying to do:
1. Run Plover in a **dev fashion** (see [here](https://github.com/RTXteam/PloverDB/blob/newkg2/README.md#how-to-run-a-dev-plover))
2. Deploy a new KP on an **existing Plover** (e.g., the Multiomics Plover) (see [here](https://github.com/RTXteam/PloverDB/blob/newkg2/README.md#how-to-deploy-a-new-kp-to-an-existing-plover))
3. Deploy Plover on a **new instance** (see [here](https://github.com/RTXteam/PloverDB/blob/newkg2/README.md#how-to-deploy-plover-on-a-new-instance))

Plover automatically stands up the required TRAPI `/query` and `/meta_knowledge_graph` endpoints, as well as an endpoint that exposes SRI test triples for the KP (`/sri_test_triples`). It also provides a few other dev-oriented endpoints, described [here](https://github.com/RTXteam/PloverDB/blob/newkg2/README.md#provided-endpoints).
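
For orientation, a request to the `/query` endpoint carries a TRAPI message whose query graph names the nodes and edges being asked about. The sketch below builds a one-hop query payload, assuming the TRAPI 1.x field names (`message`, `query_graph`, `ids`, `categories`, `predicates`). The identifier, category, and predicate values are placeholders, and `build_one_hop_query` is a hypothetical helper rather than anything Plover provides.

```python
import json


def build_one_hop_query(subject_id, object_category, predicate):
    """Build a one-hop TRAPI query: a known subject node linked to an
    unpinned object node of the given category via the given predicate."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": [subject_id]},
                    "n1": {"categories": [object_category]},
                },
                "edges": {
                    "e0": {
                        "subject": "n0",
                        "object": "n1",
                        "predicates": [predicate],
                    },
                },
            }
        }
    }


# Placeholder values; substitute identifiers that exist in your own graph.
payload = build_one_hop_query("EX:1", "biolink:Disease", "biolink:treats")
body = json.dumps(payload)

if __name__ == "__main__":
    # `body` would be sent as the JSON body of a POST request to the KP's
    # /query endpoint; the host is deployment-specific and not shown here.
    print(json.dumps(payload, indent=2))
```

The payload is built without sending it so the example stays independent of any particular deployment; any HTTP client can POST `body` with a `Content-Type: application/json` header.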
