Data Conversion
This is the first functionality to be implemented within the service. This functionality lets the user create Linked Data from his own raw data.
Unfortunately, we are not yet able to generate Linked Data automatically, so the service needs the user's help to convert the generic data to Linked Data. Below is an illustration of the process of converting data to Linked Data.
A quick explanation of the steps in the image above:
- Load data set - The application needs to be able to interact with the data
- Classify columns - The application needs to know which columns contain URIs and which contain literals
- Link data - The application needs to know the relations between columns/classes
- Download result - The user is able to download and/or publish his data set
When comparing this functionality with the original software, OpenRefine with the Google Refine extension, classification and linking are combined in one interface. The choice to divide the steps within the application was made in order to enforce that a literal can never be the subject of a triple.
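To illustrate the constraint this separation enforces, here is a minimal sketch; the `canLink` function and the `classification` map are illustrative, not the project's actual code. In RDF, only URIs (and blank nodes) may appear in the subject position of a triple, so a column classified as literal can be rejected as a link source up front.

```js
// Illustrative sketch: `classification` maps each column name to
// either 'URI' or 'Literal' (see the classification step below).
function canLink(classification, sourceColumn) {
  // RDF allows only URIs (and blank nodes) as subjects, so a column
  // holding literal values can never be the source of a link.
  return classification[sourceColumn] !== 'Literal';
}
```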
OpenRefine is developed for handling and transforming large amounts of data. The Google Refine extension supplies you with a way to create an RDF skeleton. Within this view, you need to create links in a row-based structure. A better way to represent the structure is a linked graph.
The data conversion component consists of one main component and four sub-components
- DataCreation
The DataCreation component is responsible for handling the data that needs to persist and for converting the data to the formats its sub-components use.
The DataCreation component consists of a tab view populated with the four steps the user takes. The user is not able to click on a tab to navigate between stages, because if the data changes the application cannot guarantee that the following steps are still valid. The tabs give the user an idea of how far along he is in the process.
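A minimal sketch of this idea, assuming a React component as used elsewhere in the application; the component name and class names are illustrative:

```jsx
const STEPS = ['Load data set', 'Classify columns', 'Link data', 'Download result'];

function StepTabs({ currentStep }) {
  return (
    <ul className="tabs">
      {STEPS.map((label, index) => (
        // Tabs only show progress; navigation by clicking is disabled, so the
        // user cannot jump to a step whose input may no longer be valid.
        <li key={label} className={index === currentStep ? 'active' : 'disabled'}>
          {label}
        </li>
      ))}
    </ul>
  );
}
```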
The first part of the conversion is loading the data into the web app. The design of the UI is simple: as we expect the uploaded data to have a table structure, we present the user with the data the app received. There were some issues with the alignment of the table and the headers when uploading a dataset with many columns. This is currently resolved by giving the columns a fixed width.
- The user is able to select a CSV file from his computer in order to initiate the data conversion
- The user is able to see a visual representation of the data he is going to convert, to make sure that the data will be processed correctly
- The user is not allowed to continue when he has no file or an invalid file selected
The Data Input component has two sub-components:
- Interaction bar
- Table representation
The interaction bar contains two buttons: one with which the user selects the file to upload and one to continue the process.
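A minimal sketch of the interaction bar, assuming a React component; the prop and handler names are assumptions, not the project's actual identifiers:

```jsx
function InteractionBar({ onFileChange, onContinue, canContinue }) {
  return (
    <div className="interaction-bar">
      {/* File picker: selecting a CSV file starts the upload handling. */}
      <input type="file" accept=".csv" onChange={onFileChange} />
      {/* Continue stays disabled until a valid file has been selected. */}
      <button onClick={onContinue} disabled={!canContinue}>
        Continue
      </button>
    </div>
  );
}
```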
The table representation shows the user the data from the file he has selected. If no file is selected, it simply displays "No data loaded".
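A sketch of the corresponding render logic; the `data` prop (an array of rows, the first row being the header) is an assumption:

```jsx
function TableRepresentation({ data }) {
  // Without a selected file there is nothing to render yet.
  if (!data || data.length === 0) {
    return <p>No data loaded</p>;
  }
  const [header, ...rows] = data;
  return (
    <table>
      <thead>
        <tr>{header.map(column => <th key={column}>{column}</th>)}</tr>
      </thead>
      <tbody>
        {rows.map((row, i) => (
          <tr key={i}>{row.map((cell, j) => <td key={j}>{cell}</td>)}</tr>
        ))}
      </tbody>
    </table>
  );
}
```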
As data exists in many forms, the application only accepts CSV files, since the application is tailored to the assignment. For parsing the CSV files, the library babyparse is used. Below is an example of how babyparse is used within the event handler when a user picks a file.
```js
import Baby from 'babyparse';

handleFileChange(event) {
  // Grab the selected file so its name can be stored alongside the data.
  const file = event.target.files[0];
  const reader = new FileReader();
  reader.addEventListener('load', () => {
    // Parse the CSV text once and hand the resulting rows to the parent.
    const result = Baby.parse(reader.result);
    this.props.setData(result.data, file.name);
  });
  reader.readAsText(file, 'UTF-8');
}
```
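Note that FileReader reads the file asynchronously: the 'load' handler only fires once the file contents are available as text, which is why the parsed rows are passed to setData inside the handler rather than directly after the readAsText call.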
The second part of the data conversion is to classify the data types. Once the user has done this, we know which columns contain URIs and which columns contain literal values.
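A minimal sketch of what such a classification could look like as a data structure; the column names are illustrative:

```js
// Illustrative result of the classification step: every column is marked
// as either a URI column (values become resources) or a literal column.
const classification = {
  station: 'URI',
  city: 'URI',
  measurement: 'Literal',
  date: 'Literal',
};
```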