Codebook enhancements

Timothy Lebo edited this page Feb 14, 2012 · 17 revisions
csv2rdf4lod-automation is licensed under the [Apache License, Version 2.0](https://github.com/timrdf/csv2rdf4lod-automation/wiki/License)

Codebook enhancements are specified by conversion:interpret; see that page for an introduction, discussion, and examples.

See also conversion:Enhancement and Enhancement parameters.

The rest of this page discusses details not necessary for normal use.

Querying raw values from SPARQL endpoint and creating the symbol/interpretation pairs

Script: distinct-values-2-symbol-interps.pl
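
As a rough illustration of the second half of that task (turning distinct raw values into symbol/interpretation pairs), here is a minimal Java sketch. It is not the Perl script's logic. The class and method names are hypothetical, the SPARQL query step is omitted and the distinct values are passed in as a list, the Turtle shape of the emitted pairs is an assumption, and the suggested interpretation (lowercase with underscores, modeled on the "ID No." to "id_no" example further down this page) is only a placeholder for a person to review.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical sketch; not part of csv2rdf4lod-automation.
class SymbolInterpSketch {

    // Assumed normalization: lowercase, runs of non-alphanumerics become "_",
    // leading and trailing "_" are trimmed.
    static String suggestInterpretation(String symbol) {
        return symbol.toLowerCase(Locale.ROOT)
                     .replaceAll("[^a-z0-9]+", "_")
                     .replaceAll("^_+|_+$", "");
    }

    // Escape backslashes and double quotes for use inside a Turtle string literal.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // One pair per distinct raw value, in sorted order.
    // The emitted Turtle shape is an assumption made for illustration.
    static List<String> toPairs(List<String> rawValues) {
        List<String> pairs = new ArrayList<>();
        for (String symbol : new TreeSet<>(rawValues)) {
            pairs.add("conversion:interpret [ conversion:symbol \"" + escape(symbol)
                    + "\"; conversion:interpretation \""
                    + escape(suggestInterpretation(symbol)) + "\" ];");
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> raw = java.util.Arrays.asList("ID No.", "Name", "ID No.");
        for (String line : toPairs(raw)) {
            System.out.println(line);
        }
    }
}
```

Duplicates collapse because the values go through a sorted set first, so repeated raw values produce a single pair.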

Doing the above, but less elegantly

Script: symbol interpretation.awk

Java implementation

edu.rpi.tw.data.csv.querylets.column.CodebookQuerylet obtains any codes that should be applied according to the input parameters; it returns a hashmap of java:String to sesame:Value. When processing bindings, the CodebookQuerylet prints a line like the following to stderr (using the example at Enhancing a CSV that describes another CSV's headers):

CodebookQuerylet(1) .ID No.. -> ."id_no".

edu.rpi.tw.data.csv.impl.ValueHandlerFactory uses CodebookQuerylet to obtain the codes and pass them when instantiating the ValueHandler for a column.

edu.rpi.tw.data.csv.CSVtoRDF#visit passes the value of the CSV cell (after an optional conversion:delimit_object) to the ValueHandler.
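
The three steps above (look up the codes, hand them to the column's handler, then feed each cell value to that handler) can be sketched as follows. This is a simplified stand-in, not the converter's actual code: every class and method name here is hypothetical, plain strings stand in for sesame:Value, the codebook is hard-coded instead of queried, and falling back to the raw cell value when no code matches is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the codebook flow; none of these names exist in csv2rdf4lod.
class CodebookFlowSketch {

    // Stand-in for the CodebookQuerylet result: symbol -> interpretation.
    // The real querylet maps String to sesame Value; String is used here for simplicity.
    static Map<String, String> lookUpCodes() {
        Map<String, String> codes = new HashMap<>();
        codes.put("ID No.", "id_no");
        return codes;
    }

    // Stand-in for a ValueHandler that was given the codes at construction time.
    static class SketchValueHandler {
        private final Map<String, String> codes;

        SketchValueHandler(Map<String, String> codes) {
            this.codes = new HashMap<>(codes);
        }

        // Assumption: a cell with no matching code passes through unchanged.
        String handle(String cellValue) {
            String interpretation = codes.get(cellValue);
            return interpretation != null ? interpretation : cellValue;
        }
    }

    // Stand-in for the factory step: obtain the codes, pass them to the handler.
    static SketchValueHandler createHandler() {
        return new SketchValueHandler(lookUpCodes());
    }

    public static void main(String[] args) {
        SketchValueHandler handler = createHandler();
        // Stand-in for the visit step: each cell value goes to the handler.
        for (String cell : new String[] {"ID No.", "Name"}) {
            System.out.println(cell + " -> " + handler.handle(cell));
        }
    }
}
```

Resolving the codes once per column, when the handler is created, keeps the per-cell work down to a single map lookup.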