A ColDP archive for the vespid wasps in the Catalogue of Life, originally created by ZOBODAT.
While under testing, the dataset is indexed here: https://www.dev.checklistbank.org/dataset/1037
References were normalised, reducing the count from 2037 to 113 unique references.
We keep the data as tab-delimited files (TSV) that together represent a ColDP archive. The TSV files can be edited locally with a simple text editor (e.g. Sublime Text with the Rainbow CSV package), with spreadsheet applications, with tools like OpenRefine, or with online editors like Datablist.
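Manual edits in a text editor or spreadsheet can silently add or drop a tab. As a minimal sketch (the function name `check_field_counts` is hypothetical and not part of this repository), the following Python helper reports every line whose number of tab-separated fields differs from the header line. It assumes UTF-8 files with one record per line.

```python
def check_field_counts(tsv_path):
    """Return (line_number, field_count) for each line whose field count
    differs from that of the header line (line 1)."""
    problems = []
    expected = None
    with open(tsv_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.rstrip("\n").split("\t")
            if expected is None:
                # The first line is treated as the header and sets the expected width.
                expected = len(fields)
            elif len(fields) != expected:
                problems.append((line_number, len(fields)))
    return problems

# Example call (file name taken from this README):
#   check_field_counts("NameUsage.tsv")
```

An empty result means every line has as many fields as the header; it does not check the values themselves.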
The schema definition in schema-pg.sql allows for simple import/export of the TSV files into a local PostgreSQL database.
Larger changes to the data can then be done in a more controlled environment with relational constraints.
Loading:
\copy reference from 'Reference.tsv' NULL AS ''
\copy name_usage from 'NameUsage.tsv' NULL AS ''
Writing:
\copy (SELECT id, doi, type, author, editor, issued, title, container_author, container_title, volume, issue, edition, page, publisher, publisher_place, collection_title, collection_editor, link, accessed, version, isbn, issn, remarks, citation FROM reference) to 'Reference.tsv' NULL AS ''
\copy (SELECT id, parent_id, basionym_id, status, rank, scientific_name, authorship, name_status, name_reference_id, name_published_id_year, name_published_id_page, name_published_id_page_link, reference_id, scrutinizer, scrutinizer_date, extinct, temporal_range_start, temporal_range_end, remarks FROM name_usage) to 'NameUsage.tsv' NULL AS ''
Important: The \copy commands above do not write a header line. After writing, the previously existing header line needs to be added back manually at the top of each file!
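Re-adding the header can also be scripted. The following is a minimal Python sketch, not part of this repository: the helper names `read_header` and `restore_header` are hypothetical. It assumes the header is read from the TSV file before the export overwrites it, and that the files are UTF-8 encoded.

```python
from pathlib import Path

def read_header(tsv_path):
    """Return the first line of the file, without its line ending."""
    with open(tsv_path, encoding="utf-8") as handle:
        return handle.readline().rstrip("\r\n")

def restore_header(tsv_path, header):
    """Prepend the header line unless the file already starts with it.
    Returns True if the file was changed."""
    path = Path(tsv_path)
    body = path.read_text(encoding="utf-8")
    first_line = body.split("\n", 1)[0].rstrip("\r")
    if first_line == header:
        return False
    path.write_text(header + "\n" + body, encoding="utf-8")
    return True

# Intended order of steps (file name taken from this README):
#   header = read_header("Reference.tsv")     # before running \copy ... to
#   ... run the \copy (SELECT ...) to 'Reference.tsv' command in psql ...
#   restore_header("Reference.tsv", header)   # after the export
```

Because `restore_header` first checks the existing first line, running it twice does not duplicate the header.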