Crowdsourcing biocuration: The Community Assessment of Community Annotation with Ontologies (CACAO) Figure Generation Code
The input files and code used to generate the graphical figures in the CACAO manuscript are provided here.
`requirements.txt` lists the versioned Python packages used to generate the figures. Conda was used as the package manager.
- The `cacao_expanded_info.dat` file is a modified GPAD file and a precursor to the final quality-checked file sent to GO. Taxon information and various CACAO-specific fields are stored in additional columns. Like a GPAD file, it is tab-delimited.
  - The taxon information was retrieved using ete3.
- The `cacao_dcnt-tinfo.txt` and `uniprot_dcnt-tinfo.txt` files are results from the GOATOOLS analysis. The descendant count (dcnt) values for GO terms used in CACAO were calculated here.
- The `goa_uniprot_all_noiea_20200101.gaf` file is provided, but can also be found in the GO Data Archive.
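Because `cacao_expanded_info.dat` is tab-delimited like a GPAD file, it can be read with the standard library alone. The snippet below is a minimal sketch, not code from this repository: the function name `read_tab_delimited` is hypothetical, the sample rows are made up, and skipping lines that start with `!` is an assumption carried over from GPAD header conventions that may not apply to this file.

```python
import csv
import io


def read_tab_delimited(handle, comment_prefix="!"):
    """Yield each non-comment row of a tab-delimited file as a list of strings."""
    # Assumption: header/comment lines, if present, start with "!" as in GPAD.
    data_lines = (
        line
        for line in handle
        if line.strip() and not line.startswith(comment_prefix)
    )
    # QUOTE_NONE keeps any quote characters in the fields untouched.
    yield from csv.reader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)


# Made-up sample standing in for the real file; the columns are illustrative only.
sample = io.StringIO(
    "!gpa-version: 1.1\n"
    "UniProtKB\tP12345\tenables\tGO:0003674\n"
    "UniProtKB\tQ67890\tinvolved_in\tGO:0008150\n"
)
rows = list(read_tab_delimited(sample))
```

To read the real file, pass an open file handle (for example `open("cacao_expanded_info.dat", newline="")`) in place of `sample`.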
- `cacao_taxon_pie.py` generates the taxonomy pie chart.
- `cacao_go_pie.py` generates the GO aspect pie chart.
- `cacao_dcnt.py` generates the descendant count (dcnt) box plot comparison.
- Code was formatted using Black.
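For readers unfamiliar with the descendant count (dcnt) mentioned above, the idea can be illustrated on a toy hierarchy. This is a sketch under stated assumptions, not the GOATOOLS implementation and not the code used for the manuscript: the function `descendant_counts` and the five-term graph are hypothetical, and the real values in the `*_dcnt-tinfo.txt` files come from the GOATOOLS analysis of the GO graph.

```python
def descendant_counts(children):
    """Map each term to the number of distinct terms below it in a DAG.

    `children` maps a term to its direct child terms (hypothetical toy input).
    """
    cache = {}

    def descendants(term):
        # Memoize so shared sub-graphs are only traversed once.
        if term not in cache:
            found = set()
            for child in children.get(term, ()):
                found.add(child)
                found |= descendants(child)
            cache[term] = found
        return cache[term]

    return {term: len(descendants(term)) for term in children}


# Toy graph: "c" has two parents, so it must be counted only once under "root".
toy_dag = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d"],
    "c": [],
    "d": [],
}
dcnt = descendant_counts(toy_dag)
```

In this toy graph the leaf terms have a count of zero and the root has the largest count, which is why a distribution of dcnt values can be compared between annotation sets in a box plot.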