The Research Object Composer is a web service that facilitates the creation of Research Objects, constrained to a pre-defined profile.
It uses JSON as an intermediary format for populating the Research Objects, and uses JSON Schemas (with some modifications) as "Profiles" to validate their correctness/completeness.
The Research Objects are serialized as zipped RO Crates/BagIt bags, and deposited in a pre-configured repository.
To run the application locally using the Maven wrapper:
./mvnw spring-boot:run
To start your own instance using Docker Compose:
docker-compose up -d
To run the Jupyter Notebook tutorial:
- Download this git repository or the introduction.ipynb file from GitHub (click raw to save)
- Install Anaconda (Python 3 variant)
- In the Anaconda Prompt or Terminal, create a new Conda environment for the RO Composer and start Jupyter Notebook:
conda create -n rocomposer jupyter
conda activate rocomposer
jupyter notebook introduction.ipynb
To configure the Research Object Composer, please refer to the available properties, and the relevant Spring documentation for information on how to change them.
For example, the database username can be changed by setting the environment variable SPRING_DATASOURCE_USERNAME=joebloggs
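The environment variable name above follows Spring Boot's general convention for mapping a property name to an environment variable: replace dots with underscores, drop dashes, and upper-case the result. A minimal sketch of that convention follows; the helper name `property_to_env_var` is made up for illustration and is not part of the Research Object Composer.

```python
def property_to_env_var(property_name: str) -> str:
    """Convert a Spring property name to its environment-variable form."""
    # Spring Boot convention: dots become underscores, dashes are removed,
    # and the whole name is upper-cased.
    return property_name.replace(".", "_").replace("-", "").upper()


# spring.datasource.username is the standard Spring Boot property behind
# the SPRING_DATASOURCE_USERNAME variable mentioned above.
print(property_to_env_var("spring.datasource.username"))
# prints SPRING_DATASOURCE_USERNAME
```

The same rule should apply to any other property you want to override from the environment.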
Profiles are specified as JSON Schemas in public/schemas, and are named <name>.schema.json.
Schemas are automatically discovered and turned into Research Object Profiles when the application boots, and are available at /profiles/<name>.
Schemas with filenames prefixed with an underscore (_) are not automatically loaded as Research Object Profiles, but can still be referenced by other schemas (i.e. using $ref) and used in validation.
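The naming rules above can be sketched as a small function. This is only an illustration of the convention, not the Composer's actual discovery code; the function name `discover_profiles` and the file `data_bundle.schema.json` are hypothetical.

```python
from pathlib import PurePosixPath

SCHEMA_SUFFIX = ".schema.json"


def discover_profiles(filenames):
    """Map profile names to profile paths for schemas that are auto-loaded."""
    profiles = {}
    for filename in filenames:
        name = PurePosixPath(filename).name
        if not name.endswith(SCHEMA_SUFFIX):
            continue  # not a schema file
        if name.startswith("_"):
            continue  # helper schema: usable via $ref, but not a profile
        profile_name = name[: -len(SCHEMA_SUFFIX)]
        profiles[profile_name] = f"/profiles/{profile_name}"
    return profiles


# data_bundle.schema.json is a made-up example file name.
found = discover_profiles([
    "public/schemas/_base.schema.json",
    "public/schemas/data_bundle.schema.json",
])
print(found)  # {'data_bundle': '/profiles/data_bundle'}
```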
_base.schema.json contains several definitions that are understood natively by the Research Object Composer and should be referenced by any implementing schemas:
- RemoteItem: Metadata for a remote file, including its URL, file size, checksums etc.
- Metadata: Metadata for the Research Object itself, including title, description, authors etc.
They can be referenced like so:
{
  "properties": {
    "_metadata": {
      "$ref": "/schemas/_base.schema.json#/definitions/Metadata"
    }
  }
}
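A reference like this has two parts: the schema path before the `#`, and a JSON Pointer after it that walks into that schema. The sketch below shows how such a reference could be resolved. The `resolve_ref` helper and the contents of the stand-in base schema are assumptions for illustration; the real Metadata definition lives in _base.schema.json and is not reproduced here.

```python
def resolve_ref(ref, schemas):
    """Resolve a '<path>#<JSON pointer>' reference against loaded schemas."""
    path, _, pointer = ref.partition("#")
    node = schemas[path]
    # The first element of the split is the empty string before the leading '/'.
    for token in pointer.split("/")[1:]:
        # Undo JSON Pointer escaping (RFC 6901): ~1 is '/', ~0 is '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


# Placeholder content only: the real definition is richer than this.
schemas = {
    "/schemas/_base.schema.json": {
        "definitions": {
            "Metadata": {"type": "object"},
        }
    }
}

metadata_schema = resolve_ref(
    "/schemas/_base.schema.json#/definitions/Metadata", schemas
)
print(metadata_schema)  # {'type': 'object'}
```

This sketch only walks through objects; array indices in a pointer are not handled.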
$baggable is a special schema keyword that tells the Research Object Composer which properties contain references to remote files, and where they should be included in the bag. It is a simple map of <property name> : <relative path in the bag>.
For example the following schema:
{
  "$schema": "http://json-schema.org/draft-07/schema",
  "type": "object",
  "$baggable": {
    "data": "/"
  },
  "properties": {
    "_metadata": {
      "$ref": "/schemas/_base.schema.json#/definitions/Metadata"
    },
    "data": {
      "type": "array",
      "items": {
        "$ref": "/schemas/_base.schema.json#/definitions/RemoteItem"
      }
    }
  },
  "required": [
    "data"
  ]
}
...tells the Research Object Composer to bag any items found under the data property at the root of the bag (/).
To be bagged correctly, each property listed in $baggable must contain an object that conforms to the RemoteItem schema, or an array of objects that conform to that schema.
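To make the $baggable rule concrete, the following sketch pairs each remote item in a Research Object's JSON with the bag path declared for its property, accepting either a single object or an array. It is an illustration under assumptions, not the Composer's implementation: `plan_bag` is a hypothetical name, and the `url` key in the sample items is a guess at a RemoteItem field (the actual field names are defined in _base.schema.json).

```python
def plan_bag(schema, research_object):
    """Return (bag_path, item) pairs for every property listed in $baggable."""
    plan = []
    for prop, bag_path in schema.get("$baggable", {}).items():
        value = research_object.get(prop)
        if value is None:
            continue  # property not set in this Research Object
        # A baggable property holds one RemoteItem or an array of them.
        items = value if isinstance(value, list) else [value]
        plan.extend((bag_path, item) for item in items)
    return plan


schema = {"$baggable": {"data": "/"}}
research_object = {
    "data": [
        {"url": "https://example.org/a.csv"},  # "url" is an assumed field name
        {"url": "https://example.org/b.csv"},
    ]
}

for bag_path, item in plan_bag(schema, research_object):
    print(bag_path, item["url"])
```

With the schema above, both items are planned for the root of the bag (/).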
_metadata is not a schema keyword, but a recommended property of the Research Object JSON. It contains a minimal set of metadata about the Research Object itself, such as title, description, license, authors etc.
It should reference /schemas/_base.schema.json#/definitions/Metadata.
The Research Object Composer will provide this metadata (and perform any necessary mapping) to the configured repository when depositing a Research Object.