Data Conversion
This is the first functionality to be implemented within the service. This functionality lets the user create Linked Data from his own raw data.
Unfortunately, we are not yet able to generate Linked Data automatically, so the service needs the user's help to convert the generic data to Linked Data. Below is an illustration of the process of converting data to Linked Data.
A quick explanation of the steps in the image above:
- Load data set - The application needs to be able to interact with the data
- Classify columns - The application needs to know which columns contain URIs and which contain literals
- Link data - The application needs to know the relations between columns/classes
- Download result - The user is able to download and/or publish his data set
When comparing this functionality with the original software, OpenRefine with the Google Refine extension, classification and linking are combined in one interface. The choice to divide the steps within the application was made in order to enforce that a literal can never be the subject of a triple.
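To illustrate the constraint this separation enforces, here is a minimal sketch; the `canLink` function and the `classification` map are illustrative, not the project's actual code. In RDF, only URIs (and blank nodes) may appear in the subject position of a triple, so a column classified as literal can be rejected as a link source up front.

```js
// Illustrative sketch: `classification` maps each column name to
// either 'URI' or 'Literal' (see the classification step below).
function canLink(classification, sourceColumn) {
  // RDF allows only URIs (and blank nodes) as subjects, so a column
  // holding literal values can never be the source of a link.
  return classification[sourceColumn] !== 'Literal';
}
```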
OpenRefine is developed for handling and transforming large amounts of data. The Google Refine extension supplies you with a way to create an RDF skeleton. Within this view, you need to create links in a row-based structure. A better way to represent the structure is a linked graph.
The data conversion component consists of one main component and four sub-components
- DataCreation
The DataCreation component is responsible for handling the data that needs to persist and for converting the data to the formats its sub-components use.
The DataCreation component consists of a tab view populated with the four steps the user takes. The user is not able to click on a tab to navigate between stages, because if the data changes the application cannot guarantee that the following steps are still valid. The tabs give the user an idea of how far along he is in the process.
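A minimal sketch of this idea, assuming a React component as used elsewhere in the application; the component name and class names are illustrative:

```jsx
const STEPS = ['Load data set', 'Classify columns', 'Link data', 'Download result'];

function StepTabs({ currentStep }) {
  return (
    <ul className="tabs">
      {STEPS.map((label, index) => (
        // Tabs only show progress; navigation by clicking is disabled, so the
        // user cannot jump to a step whose input may no longer be valid.
        <li key={label} className={index === currentStep ? 'active' : 'disabled'}>
          {label}
        </li>
      ))}
    </ul>
  );
}
```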
The first part of the conversion is loading the data into the web app. The design of the UI is simple: as we expect the uploaded data to have a table structure, we present the user with the data the app received. There were some issues with the alignment of the table and the headers when uploading a dataset with many columns. This is currently resolved by giving the columns a fixed width.
- The user is able to select a CSV file from his computer in order to initiate the data conversion
- The user is able to see a visual representation of the data he is going to convert, to make sure that the data will be processed correctly
- The user is not allowed to continue when he has no file or an invalid file selected
The Data Input component has two sub-components:
- Interaction bar
- Table representation
The interaction bar contains two buttons: one with which the user selects the file to upload and one to continue the process.
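A minimal sketch of the interaction bar, assuming a React component; the prop and handler names are assumptions, not the project's actual identifiers:

```jsx
function InteractionBar({ onFileChange, onContinue, canContinue }) {
  return (
    <div className="interaction-bar">
      {/* File picker: selecting a CSV file starts the upload handling. */}
      <input type="file" accept=".csv" onChange={onFileChange} />
      {/* Continue stays disabled until a valid file has been selected. */}
      <button onClick={onContinue} disabled={!canContinue}>
        Continue
      </button>
    </div>
  );
}
```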
The table representation shows the user the data from the file he has selected. If no file is selected, it simply displays "No data loaded".
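A sketch of the corresponding render logic; the `data` prop (an array of rows, the first row being the header) is an assumption:

```jsx
function TableRepresentation({ data }) {
  // Without a selected file there is nothing to render yet.
  if (!data || data.length === 0) {
    return <p>No data loaded</p>;
  }
  const [header, ...rows] = data;
  return (
    <table>
      <thead>
        <tr>{header.map(column => <th key={column}>{column}</th>)}</tr>
      </thead>
      <tbody>
        {rows.map((row, i) => (
          <tr key={i}>{row.map((cell, j) => <td key={j}>{cell}</td>)}</tr>
        ))}
      </tbody>
    </table>
  );
}
```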
As data exists in many forms, the application only accepts CSV files, since the application is tailored to the assignment. For parsing the CSV files, the library babyparse is used. Below is an example of how babyparse is used within the event handler when a user picks a file.
```js
import Baby from 'babyparse';

handleFileChange(event) {
  // Grab the selected file so its name can be stored alongside the data.
  const file = event.target.files[0];
  const reader = new FileReader();
  reader.addEventListener('load', () => {
    // Parse the CSV text once and hand the resulting rows to the parent.
    const result = Baby.parse(reader.result);
    this.props.setData(result.data, file.name);
  });
  reader.readAsText(file, 'UTF-8');
}
```
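Note that FileReader reads the file asynchronously: the 'load' handler only fires once the file contents are available as text, which is why the parsed rows are passed to setData inside the handler rather than directly after the readAsText call.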
The second part of the data conversion is to classify the data types. Once the user has done this, we know which columns contain URIs and which columns contain literal values.
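A minimal sketch of what such a classification could look like as a data structure; the column names are illustrative:

```js
// Illustrative result of the classification step: every column is marked
// as either a URI column (values become resources) or a literal column.
const classification = {
  station: 'URI',
  city: 'URI',
  measurement: 'Literal',
  date: 'Literal',
};
```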