Commit

Merge pull request #559 from monarch-initiative/update-prefixes
Rename GOA to GO
glass-ships authored Jan 22, 2024
2 parents 06a27d9 + 077d560 commit 9fd9880
Showing 12 changed files with 771 additions and 760 deletions.
12 changes: 6 additions & 6 deletions docs/Sources/goa.md → docs/Sources/go.md
@@ -1,4 +1,4 @@
# Gene Ontology Annotation (GOA) Database
# Gene Ontology (GO) Annotation Database

The Gene Ontology Annotation Database compiles high-quality [Gene Ontology (GO)](http://www.geneontology.org/) annotations to proteins in the [UniProt Knowledgebase (UniProtKB)](https://www.uniprot.org/), RNA molecules from [RNACentral](http://rnacentral.org/) and protein complexes from the [Complex Portal](https://www.ebi.ac.uk/complexportal/home).

@@ -44,7 +44,7 @@ __**Associations**__
* negated
* has_evidence
* aggregating_knowledge_source (["infores:monarchinitiative"])
* primary_knowledge_source (infores:goa)
* primary_knowledge_source

OR

@@ -56,7 +56,7 @@ OR
* negated
* has_evidence
* aggregating_knowledge_source (["infores:monarchinitiative"])
* primary_knowledge_source (infores:goa)
* primary_knowledge_source

* **biolink:MacromolecularMachineToBiologicalProcessAssociation**:
* id (random uuid)
@@ -66,7 +66,7 @@ OR
* negated
* has_evidence
* aggregating_knowledge_source (["infores:monarchinitiative"])
* primary_knowledge_source (infores:goa)
* primary_knowledge_source

* **biolink:MacromolecularMachineToCellularComponentAssociation**:
* id (random uuid)
@@ -76,7 +76,7 @@ OR
* negated
* has_evidence
* aggregating_knowledge_source (["infores:monarchinitiative"])
* primary_knowledge_source (infores:goa)
* primary_knowledge_source

__**Possible Additional Gene to Gene Ontology Term Association?**__

@@ -88,7 +88,7 @@ __**Possible Additional Gene to Gene Ontology Term Association?**__
* negated
* has_evidence
* aggregating_knowledge_source (["infores:monarchinitiative"])
* primary_knowledge_source (infores:goa)
* primary_knowledge_source

## Citation

2 changes: 1 addition & 1 deletion mkdocs.yaml
@@ -23,7 +23,7 @@ nav:
- CTD: 'Sources/ctd.md'
# - Dictybase: 'Sources/dictybase.md'
# - Flybase: 'Sources/flybase.md'
- GOA: 'Sources/goa.md'
- GO: 'Sources/go.md'
- HGNC: 'Sources/hgnc.md'
- HPOA: 'Sources/hpoa.md'
# - MGI: 'Sources/mgi.md'
1,368 changes: 693 additions & 675 deletions poetry.lock

Large diffs are not rendered by default.

58 changes: 29 additions & 29 deletions src/monarch_ingest/download.yaml
@@ -194,77 +194,77 @@
local_name: data/flybase/entity_publication_fb.tsv.gz
tag: flybase_publication_to_gene

### GOA
### GO

# Homo sapiens (human)
- url: http://current.geneontology.org/annotations/goa_human.gaf.gz
local_name: data/goa/9606.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/9606.go_annotations.gaf.gz
tag: go_annotation

# Mus musculus (house mouse)
- url: http://current.geneontology.org/annotations/mgi.gaf.gz
local_name: data/goa/10090.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/10090.go_annotations.gaf.gz
tag: go_annotation

# Rattus norvegicus (Norway rat)
- url: http://current.geneontology.org/annotations/rgd.gaf.gz
local_name: data/goa/10116.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/10116.go_annotations.gaf.gz
tag: go_annotation

# Canis lupus familiaris (dog)
- url: http://current.geneontology.org/annotations/goa_dog.gaf.gz
local_name: data/goa/9615.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/9615.go_annotations.gaf.gz
tag: go_annotation

# Bos taurus (cow)
- url: http://current.geneontology.org/annotations/goa_cow.gaf.gz
local_name: data/goa/9913.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/9913.go_annotations.gaf.gz
tag: go_annotation

# Sus scrofa (pig)
- url: http://current.geneontology.org/annotations/goa_pig.gaf.gz
local_name: data/goa/9823.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/9823.go_annotations.gaf.gz
tag: go_annotation

# Gallus gallus (chicken)
- url: http://current.geneontology.org/annotations/goa_chicken.gaf.gz
local_name: data/goa/9031.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/9031.go_annotations.gaf.gz
tag: go_annotation

# Danio rerio (Zebrafish)
- url: http://current.geneontology.org/annotations/zfin.gaf.gz
local_name: data/goa/7955.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/7955.go_annotations.gaf.gz
tag: go_annotation

# Drosophila melanogaster (fruit fly)
- url: http://current.geneontology.org/annotations/fb.gaf.gz
local_name: data/goa/7227.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/7227.go_annotations.gaf.gz
tag: go_annotation

# Caenorhabditis elegans (nematodes)
- url: http://current.geneontology.org/annotations/wb.gaf.gz
local_name: data/goa/6239.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/6239.go_annotations.gaf.gz
tag: go_annotation

# Dictyostelium discoideum
- url: http://current.geneontology.org/annotations/dictybase.gaf.gz
local_name: data/goa/44689.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/44689.go_annotations.gaf.gz
tag: go_annotation

# Various species in the Aspergillus genus
- url: http://current.geneontology.org/annotations/aspgd.gaf.gz
local_name: data/goa/5052.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/5052.go_annotations.gaf.gz
tag: go_annotation

# Saccharomyces cerevisiae (baker's yeast)
- url: http://current.geneontology.org/annotations/sgd.gaf.gz
local_name: data/goa/4932.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/4932.go_annotations.gaf.gz
tag: go_annotation

# Schizosaccharomyces pombe
- url: http://current.geneontology.org/annotations/pombase.gaf.gz
local_name: data/goa/4896.go_annotations.gaf.gz
tag: goa_go_annotation
local_name: data/go/4896.go_annotations.gaf.gz
tag: go_annotation

# HGNC
-
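
The `download.yaml` entries above all follow one naming pattern: `data/go/<taxon id>.go_annotations.gaf.gz`, replacing the earlier `data/goa/` directory. A minimal sketch of that convention, assuming nothing beyond the paths shown in the diff. The helper names `go_annotation_path` and `migrate_legacy_path` are hypothetical and are not part of `monarch_ingest`:

```python
from pathlib import PurePosixPath


def go_annotation_path(taxon_id: str, data_dir: str = "data/go") -> str:
    """Build the local GAF path for a taxon under the renamed layout."""
    return str(PurePosixPath(data_dir) / f"{taxon_id}.go_annotations.gaf.gz")


def migrate_legacy_path(path: str) -> str:
    """Rewrite an old data/goa/... path to its data/go/... equivalent.

    Only a path component that is exactly "goa" is renamed, so file names
    such as goa_human.gaf.gz are left untouched.
    """
    parts = PurePosixPath(path).parts
    return str(PurePosixPath(*("go" if part == "goa" else part for part in parts)))


human = go_annotation_path("9606")
migrated = migrate_legacy_path("data/goa/10090.go_annotations.gaf.gz")
```

Matching on whole path components rather than on the substring `goa` is deliberate here: the upstream URLs (for example `goa_human.gaf.gz`) keep their names in this commit, and only the local directory changes.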
4 changes: 2 additions & 2 deletions src/monarch_ingest/ingests.yaml
@@ -18,8 +18,8 @@ dictybase_gene:
config: 'ingests/dictybase/gene.yaml'
dictybase_gene_to_phenotype:
config: 'ingests/dictybase/gene_to_phenotype.yaml'
goa_go_annotation:
config: 'ingests/goa/go_annotation.yaml'
go_annotation:
config: 'ingests/go/annotation.yaml'
hgnc_gene:
config: 'ingests/hgnc/gene.yaml'
hpoa_disease_to_phenotype:
Original file line number Diff line number Diff line change
@@ -4,12 +4,11 @@
Gene to GO term Associations
(to MolecularActivity, BiologicalProcess and CellularComponent)
"""
from typing import List
import uuid

from koza.cli_runner import get_koza_app

from monarch_ingest.ingests.goa.goa_utils import (
from monarch_ingest.ingests.go.annotation_utils import (
parse_identifiers,
get_biolink_classes,
lookup_predicate,
@@ -18,7 +17,7 @@
from loguru import logger


koza_app = get_koza_app("goa_go_annotation")
koza_app = get_koza_app("go_annotation")

# for row in koza_app.source: # doesn't play nice with tests
while (row := koza_app.get_row()) is not None:
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
name: 'goa_go_annotation'
name: 'go_annotation'

format: 'csv' # is 'gaf' specifically recognized, or does it need to be specifically recognized?
delimiter: '\t'
@@ -8,21 +8,21 @@ comment_char: '!'
files:
# Need to filter out 5052 - Contains data from multiple species of the
# genus Aspergillus, taxon:5052, not just Aspergillus nidulans
- './data/goa/4932.go_annotations.gaf.gz'
- './data/goa/4896.go_annotations.gaf.gz'
- './data/goa/5052.go_annotations.gaf.gz'
- './data/goa/6239.go_annotations.gaf.gz'
- './data/goa/7227.go_annotations.gaf.gz'
- './data/goa/7955.go_annotations.gaf.gz'
- './data/goa/9031.go_annotations.gaf.gz'
- './data/goa/9606.go_annotations.gaf.gz'
- './data/goa/9615.go_annotations.gaf.gz'
- './data/goa/9823.go_annotations.gaf.gz'
- './data/goa/9913.go_annotations.gaf.gz'
- './data/goa/10090.go_annotations.gaf.gz'
- './data/goa/10116.go_annotations.gaf.gz'
- './data/goa/44689.go_annotations.gaf.gz'
# - './data/goa/162425.go_annotations.gaf.gz'
- './data/go/4932.go_annotations.gaf.gz'
- './data/go/4896.go_annotations.gaf.gz'
- './data/go/5052.go_annotations.gaf.gz'
- './data/go/6239.go_annotations.gaf.gz'
- './data/go/7227.go_annotations.gaf.gz'
- './data/go/7955.go_annotations.gaf.gz'
- './data/go/9031.go_annotations.gaf.gz'
- './data/go/9606.go_annotations.gaf.gz'
- './data/go/9615.go_annotations.gaf.gz'
- './data/go/9823.go_annotations.gaf.gz'
- './data/go/9913.go_annotations.gaf.gz'
- './data/go/10090.go_annotations.gaf.gz'
- './data/go/10116.go_annotations.gaf.gz'
- './data/go/44689.go_annotations.gaf.gz'
# - './data/go/162425.go_annotations.gaf.gz'

filters:
- inclusion: 'include'
@@ -48,16 +48,12 @@ filters:
- 'taxon:227321'
- 'taxon:559292'

metadata:
ingest_title: 'GOA DB'
ingest_url: 'http://geneontology.org/'
description: 'Gene Ontology Annotations Database'
rights: 'http://geneontology.org/docs/go-citation-policy/'
metadata: !include ./src/monarch_ingest/ingests/go/metadata.yaml

global_table: './src/monarch_ingest/translation_table.yaml'

# Evidence Code to ECO term mappings file
local_table: './src/monarch_ingest/ingests/goa/gaf-eco-mapping.yaml'
local_table: './src/monarch_ingest/ingests/go/gaf-eco-mapping.yaml'

# http://geneontology.org/docs/go-annotation-file-gaf-format-2.2/
columns:
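
The ingest config above declares the GAF files as `format: 'csv'` with a tab delimiter and `!` as the comment character. The following is a rough sketch of what those three settings mean when reading a GAF file by hand, using only the standard library. It is not how Koza reads the files; `iter_gaf_rows` is a hypothetical helper, and the sample record is made up and truncated to five columns:

```python
import csv
import io


def iter_gaf_rows(handle, comment_char="!"):
    """Yield tab-separated rows from a GAF text stream, skipping comment lines."""
    data_lines = (line for line in handle if not line.startswith(comment_char))
    # QUOTE_NONE keeps any quote characters in the data as literal text.
    yield from csv.reader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)


# Made-up, truncated sample; real GAF records have more columns.
sample = io.StringIO(
    "!gaf-version: 2.2\n"
    "UniProtKB\tP12345\tGENE1\tenables\tGO:0003674\n"
)
rows = list(iter_gaf_rows(sample))
```

For the gzipped files listed under `files:`, the same function should work on a handle opened in text mode with `gzip.open(path, "rt")`.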
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
Some Gene Ontology Annotation ingest utility functions.
"""
from re import sub, IGNORECASE, compile, Pattern
from typing import Any, Optional, Tuple, List, Dict
from typing import Optional, Tuple, List, Dict

from loguru import logger

Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
name: 'goa'
name: 'GO'

dataset_description:
ingest_title: 'GOA DB'
ingest_title: 'GO'
ingest_url: 'http://geneontology.org/'
description: 'Gene Ontology Annotations Database'
rights: 'http://geneontology.org/docs/go-citation-policy/'
20 changes: 9 additions & 11 deletions src/monarch_ingest/utils/ingest_utils.py
@@ -1,30 +1,29 @@
import os, pkgutil
import os
import pkgutil
from pathlib import Path
import yaml

from koza.io.yaml_loader import UniqueIncludeLoader


def get_ingests():
return yaml.safe_load(pkgutil.get_data("monarch_ingest", 'ingests.yaml'))
return yaml.safe_load(pkgutil.get_data("monarch_ingest", "ingests.yaml"))


def get_ingest(source: str):
ingests = get_ingests()
return yaml.load(
pkgutil.get_data("monarch_ingest", ingests[source]['config']), yaml.FullLoader
)
return yaml.load(pkgutil.get_data("monarch_ingest", ingests[source]["config"]), UniqueIncludeLoader)


def file_exists(file):
return (Path(file).is_file() and os.stat(file).st_size > 1000)
return Path(file).is_file() and os.stat(file).st_size > 1000


def ingest_output_exists(source, output_dir):
ingests = get_ingests()

ingest_config = yaml.load(
pkgutil.get_data("monarch_ingest", ingests[source]['config']), yaml.FullLoader
)

ingest_config = yaml.load(pkgutil.get_data("monarch_ingest", ingests[source]["config"]), UniqueIncludeLoader)

has_node_properties = "node_properties" in ingest_config
has_edge_properties = "edge_properties" in ingest_config

@@ -37,4 +36,3 @@ def ingest_output_exists(source, output_dir):
return False

return True
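
The `file_exists` helper in this diff treats a file as present only if it is a regular file larger than 1000 bytes, so empty or near-empty outputs count as missing. A small self-contained sketch of that check follows. The `min_bytes` parameter is an assumption added for illustration; the original hard-codes the threshold:

```python
import tempfile
from pathlib import Path


def file_exists(file, min_bytes=1000):
    """Return True only for a regular file strictly larger than min_bytes."""
    path = Path(file)
    return path.is_file() and path.stat().st_size > min_bytes


with tempfile.TemporaryDirectory() as tmp:
    small = Path(tmp) / "small.tsv"
    small.write_text("header\n")  # a few bytes: treated as missing
    large = Path(tmp) / "large.tsv"
    large.write_text("x" * 2000)  # above the threshold: treated as present
    missing = Path(tmp) / "missing.tsv"  # never created
    results = (file_exists(small), file_exists(large), file_exists(missing))
```

Because `is_file()` is evaluated first, the size lookup is never attempted on a path that does not exist.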

Original file line number Diff line number Diff line change
@@ -8,7 +8,7 @@
from biolink.pydanticmodel_v2 import Association
from loguru import logger

from monarch_ingest.ingests.goa.goa_utils import parse_identifiers
from monarch_ingest.ingests.go.annotation_utils import parse_identifiers


@pytest.mark.parametrize(
@@ -50,23 +50,23 @@ def source_name():
"""
:return: string source name of GO Annotations ingest
"""
return "goa_go_annotation"
return "go_annotation"


@pytest.fixture
def script():
"""
:return: string path to GO Annotations ingest script
"""
return "./src/monarch_ingest/ingests/goa/go_annotation.py"
return "./src/monarch_ingest/ingests/go/annotation.py"


@pytest.fixture(scope="package")
def local_table():
"""
:return: string path to Evidence Code to ECO term mappings file
"""
return "src/monarch_ingest/ingests/goa/gaf-eco-mapping.yaml"
return "src/monarch_ingest/ingests/go/gaf-eco-mapping.yaml"


@pytest.fixture
@@ -301,7 +301,7 @@ def test_rows():


@pytest.fixture
def basic_goa(mock_koza, source_name, test_rows, script, global_table, local_table):
def basic_go(mock_koza, source_name, test_rows, script, global_table, local_table):
"""
Mock Koza run for GO annotation ingest.
@@ -449,12 +449,12 @@ def basic_goa(mock_koza, source_name, test_rows, script, global_table, local_tab
}


def test_association(basic_goa):
if not len(basic_goa):
def test_association(basic_go):
if not len(basic_go):
logger.warning("test_association() null test?")
return

association = basic_goa[2]
association = basic_go[2]
assert association
assert association.subject in result_expected.keys()
