Enhancing a CSV that describes another CSV's headers
Sometimes one CSV stores metadata about another CSV's headers.
For example, manual/enviro-reports-and-indicators.csv
contains the data:
ID No.,Title,Organization,Year
16,City of Bowie State of the Environment Report,Department of Planning and Economic Development,2009
and manual/definitions-of-fields.csv
defines each of its column headings:
Column Heading,Definition
ID No.,A unique reference number to facilitate the identification of resources.
When the data CSV is converted, its first column becomes the predicate
http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/dataset/environmental-reports/vocab/enhancement/1/id_no
and the first row of the metadata CSV describes that same header (and therefore the resulting predicate) of the data CSV.
The objective is to name the subjects of the metadata CSV rows to match the predicates created during conversion of the data CSV, which can be done with the following enhancement (see Using template variables to construct new values):
conversion:domain_template "[/sd]vocab/enhancement/[e]/[#1]";
This names the subject of the first row in the metadata CSV conversion:
http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/dataset/environmental-reports/vocab/enhancement/1/ID_No
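To make the expansion concrete, here is a rough Python sketch of how the template could be filled in for the first metadata row. It is not the converter's implementation: the function names `uri_safe` and `expand_domain_template` are hypothetical, the cleanup rule (runs of non-alphanumeric characters become a single underscore) is an assumption chosen to reproduce the URI shown above, and the reading of `[/sd]` as the source/dataset base URI, `[e]` as the enhancement identifier, and `[#1]` as the value of column 1 is inferred from the example.

```python
import re

# Assumed values, copied from the URIs shown in this page.
SOURCE_DATASET_BASE = (
    "http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/"
    "dataset/environmental-reports/"
)
ENHANCEMENT_ID = "1"


def uri_safe(value: str) -> str:
    """Hypothetical cleanup: collapse runs of non-alphanumerics to '_' and trim the ends."""
    return re.sub(r"[^A-Za-z0-9]+", "_", value).strip("_")


def expand_domain_template(template: str, row: list[str]) -> str:
    """Fill in [/sd], [e], and [#N] (a 1-based column reference) for one row."""
    result = template.replace("[/sd]", SOURCE_DATASET_BASE)
    result = result.replace("[e]", ENHANCEMENT_ID)
    # Replace each [#N] with the cleaned value of column N.
    return re.sub(
        r"\[#(\d+)\]",
        lambda m: uri_safe(row[int(m.group(1)) - 1]),
        result,
    )


# First row of manual/definitions-of-fields.csv.
row = [
    "ID No.",
    "A unique reference number to facilitate the identification of resources.",
]
subject = expand_domain_template("[/sd]vocab/enhancement/[e]/[#1]", row)

# Predicate produced for the first column of the data CSV.
predicate = SOURCE_DATASET_BASE + "vocab/enhancement/1/id_no"

print(subject)
print(subject == predicate)
```

Under these assumptions the two URIs share everything except the final path segment.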
The only remaining mismatch is letter case (ID_No versus id_no), so we can use the Codebook Enhancement to rewrite the input value:
conversion:enhance [
   ov:csvCol 1;
   conversion:interpret [
      conversion:symbol         "ID No.";
      conversion:interpretation "id_no";
   ];
];
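The effect of this codebook entry can be pictured as a simple lookup applied to the cell value in column 1. The Python sketch below is only an illustration under stated assumptions: the names `CODEBOOK` and `interpret` are hypothetical, and leaving unmatched values unchanged is an assumed fallback, not documented converter behavior.

```python
# Codebook for column 1: symbol found in the CSV -> interpretation to use instead.
CODEBOOK = {"ID No.": "id_no"}

# Assumed expansion of "[/sd]vocab/enhancement/[e]/" for this dataset.
VOCAB_BASE = (
    "http://logd.tw.rpi.edu/source/epa-gov-mcmahon-ethan/"
    "dataset/environmental-reports/vocab/enhancement/1/"
)


def interpret(cell: str, codebook: dict[str, str]) -> str:
    """Return the codebook interpretation of a cell, or the cell itself if none is listed."""
    return codebook.get(cell, cell)


# Subject of the first metadata row once the codebook has been applied.
subject = VOCAB_BASE + interpret("ID No.", CODEBOOK)
print(subject)
```

With the entry in place, the subject of the first metadata row ends in id_no, matching the predicate from the data CSV. Other column headings would presumably need their own symbol and interpretation pairs.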
This pattern can also be used to describe datasets, as can be seen when enhancing data.gov's Dataset 92 (their data catalog for all other datasets).