Suggestion: make ro-crate-metadata.json a File and not a CreativeWork #394

Open
multimeric opened this issue Jan 17, 2025 · 3 comments

@multimeric
Contributor

The spec says:

The _RO-Crate JSON-LD_ MUST contain a self-describing
**RO-Crate Metadata File Descriptor** with
the `@id` value `ro-crate-metadata.json` (or `ro-crate-metadata.jsonld` in legacy
crates) and `@type` [CreativeWork]. This descriptor MUST have an [about]
property referencing the _Root Data Entity_, which SHOULD have an `@id` of `./`.

I find this unusual because:

  • File aka MediaObject is already a subclass of CreativeWork in Schema.org
  • It contradicts the directions on data entities:
    _Data Entities_ representing files MUST have `"File"` as a value for `@type`. `File` is an RO-Crate alias for <http://schema.org/MediaObject>. The term _File_ here is liberal, and includes "downloadable" resources where `@id` is an absolute URI.

Also, it's unclear if the spec is saying it can only have this one type, or if it must at least have this type. For example, can I have:

        {
          "@id": "ro-crate-metadata.json",
          "@type": [
            "File",
            "CreativeWork"
          ],
          "about": {
            "@id": "."
          },
          "conformsTo": {
            "@id": "https://w3id.org/ro/crate/1.1"
          }
        }

@elichad
Contributor

elichad commented Jan 17, 2025

The Metadata Descriptor ro-crate-metadata.json is not considered a data entity. I'm not sure if we state this explicitly anywhere in the spec, but it is implicit - e.g. it is described outside the "Data Entities" section, and it is not linked to hasPart on the Root Data Entity (as is required for data entities, see linked line below). Perhaps we should add a note to 1.2 to make this clearer as it's a common misunderstanding.

Where files and folders are represented as _Data Entities_ in the RO-Crate JSON-LD, these MUST be linked to, either directly or indirectly, from the [Root Data Entity](root-data-entity) using the [hasPart] property. Directory hierarchies MAY be represented with nested [Dataset] _Data Entities_, or the Root Dataset MAY refer to files anywhere in the hierarchy using [hasPart].
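
As a rough illustration of the `hasPart` rule quoted above, the sketch below walks `hasPart` links from the Root Data Entity and collects every `@id` it reaches. The function name `reachable_via_has_part` and the example graph are hypothetical, and the code assumes the root has the `@id` `./`. It is not taken from any RO-Crate library.

```python
def reachable_via_has_part(graph, root_id="./"):
    """Collect the @id of every entity linked from the root via hasPart,
    directly or through nested Dataset entities."""
    by_id = {entity["@id"]: entity for entity in graph}
    seen = set()
    stack = [root_id]
    while stack:
        entity = by_id.get(stack.pop(), {})
        parts = entity.get("hasPart", [])
        if isinstance(parts, dict):  # a single reference instead of a list
            parts = [parts]
        for part in parts:
            part_id = part["@id"]
            if part_id not in seen:
                seen.add(part_id)
                stack.append(part_id)
    return seen


# Hypothetical crate: one file at the top level, one in a nested Dataset.
example_graph = [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork", "about": {"@id": "./"}},
    {"@id": "./", "@type": "Dataset", "hasPart": [{"@id": "data.csv"}, {"@id": "sub/"}]},
    {"@id": "sub/", "@type": "Dataset", "hasPart": {"@id": "sub/notes.txt"}},
    {"@id": "data.csv", "@type": "File"},
    {"@id": "sub/notes.txt", "@type": "File"},
]

linked = reachable_via_has_part(example_graph)
print(sorted(linked))
print("ro-crate-metadata.json" in linked)
```

In this example the Metadata Descriptor is present in the graph but is never reached through `hasPart`, which matches the point that it is not treated as a data entity.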

> Also, it's unclear if the spec is saying it can only have this one type, or if it must at least have this type. For example, can I have:

Yes, multiple types are allowed. This is a few lines further down in the spec:

In all cases, `@type` MAY be an array in order to also specify a more specific type, e.g. `"@type": ["File", "ComputationalWorkflow"]`
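
A minimal sketch of how a consumer might handle this, assuming it only needs to check that `CreativeWork` is among the types: normalise `@type` to a list before testing it. The helper names `type_list` and `is_metadata_descriptor` are hypothetical and not part of any RO-Crate tooling.

```python
DESCRIPTOR_IDS = ("ro-crate-metadata.json", "ro-crate-metadata.jsonld")


def type_list(entity):
    """Return @type as a list, whether it was given as a string or an array."""
    value = entity.get("@type", [])
    return [value] if isinstance(value, str) else list(value)


def is_metadata_descriptor(entity):
    """True if the entity has a descriptor @id and CreativeWork among its types."""
    return entity.get("@id") in DESCRIPTOR_IDS and "CreativeWork" in type_list(entity)


# The descriptor from the question, with both types listed.
multi_typed = {
    "@id": "ro-crate-metadata.json",
    "@type": ["File", "CreativeWork"],
    "about": {"@id": "./"},
    "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
}
# The form given in the spec text, with a single type.
single_typed = {
    "@id": "ro-crate-metadata.json",
    "@type": "CreativeWork",
    "about": {"@id": "./"},
}

print(is_metadata_descriptor(multi_typed))
print(is_metadata_descriptor(single_typed))
```

Both forms pass this check, because the test is "contains `CreativeWork`" rather than "equals `CreativeWork`".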

@multimeric
Contributor Author

Great, thanks. I guess I'm wondering why it's not considered a File or data entity when it physically is a file inside the crate. I could envisage a scenario where a non-RO-Crate person is working with a GUI viewer that describes all of the files in a directory but they notice that there is also a file called ro-crate-metadata.json that doesn't show up as a file despite being a file on the filesystem.
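
That scenario can be sketched as follows, assuming a hypothetical viewer that lists only entities typed `File`. The function `files_missing_from_listing` and the example paths are made up for illustration.

```python
def type_list(entity):
    """Return @type as a list, whether it was given as a string or an array."""
    value = entity.get("@type", [])
    return [value] if isinstance(value, str) else list(value)


def files_missing_from_listing(paths_on_disk, graph):
    """Paths that exist on disk but have no entity typed File in the graph."""
    listed = {entity["@id"] for entity in graph if "File" in type_list(entity)}
    return sorted(path for path in paths_on_disk if path not in listed)


# Hypothetical crate directory containing two files.
paths_on_disk = ["data.csv", "ro-crate-metadata.json"]
graph = [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork", "about": {"@id": "./"}},
    {"@id": "./", "@type": "Dataset", "hasPart": [{"@id": "data.csv"}]},
    {"@id": "data.csv", "@type": "File"},
]

hidden = files_missing_from_listing(paths_on_disk, graph)
print(hidden)
```

Here `ro-crate-metadata.json` is reported as missing from the listing even though it is on disk, because its only type is `CreativeWork`.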

@elichad
Contributor

elichad commented Jan 17, 2025

Yeah, it's a slightly strange one. I believe it's to do with the fact that the metadata is not the data itself, but a description of it. Theoretically this doesn't mean it can't use File, but File is closely tied to data entities/payload files in the RO-Crate. (I actually am not sure if an entity being a File always means it's a data entity - I don't think it's explicit but it's often assumed, and it's just come up in another discussion too).

We have a similar thing with the ro-crate-preview.html and related files, where they may appear in the crate folder but aren't considered part of the payload. It's spelled out a bit more in 1.2-DRAFT than in 1.1:

The _RO-Crate Website_ is not considered a part of the RO-Crate, but serves as a way to make metadata available in a user-appropriate format. The `ro-crate-preview.html` file and the `ro-crate-preview_files/` directory and any contents SHOULD NOT be included in the `hasPart` property of the _Root Data Entity_ or any other `Dataset` entity within an RO-Crate.
Metadata about parts of the _RO-Crate Website_ MAY be included in an RO-Crate as in the following example. Metadata such as an `author` property, `dateCreated` or other provenance can be included, including details about the software that created them, as described in [Software used to create files](./provenance.html#software-used-to-create-files).
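
A small sketch of a check for the SHOULD NOT rule in the 1.2-DRAFT text above: it reports any `hasPart` reference that points at the preview file or the preview directory. The function names are hypothetical, and the code assumes `@id` values are written as plain relative paths without a leading `./`.

```python
PREVIEW_FILE = "ro-crate-preview.html"
PREVIEW_DIR = "ro-crate-preview_files/"


def is_website_path(entity_id):
    """True for the preview file, the preview directory, or anything inside it."""
    return (
        entity_id == PREVIEW_FILE
        or entity_id == PREVIEW_DIR.rstrip("/")
        or entity_id.startswith(PREVIEW_DIR)
    )


def website_parts(graph):
    """Return (container @id, part @id) pairs where hasPart references the RO-Crate Website."""
    offenders = []
    for entity in graph:
        parts = entity.get("hasPart", [])
        if isinstance(parts, dict):  # a single reference instead of a list
            parts = [parts]
        for part in parts:
            if is_website_path(part["@id"]):
                offenders.append((entity["@id"], part["@id"]))
    return offenders


# Hypothetical graph where the preview file was wrongly added to hasPart.
graph = [
    {"@id": "./", "@type": "Dataset",
     "hasPart": [{"@id": "data.csv"}, {"@id": "ro-crate-preview.html"}]},
    {"@id": "data.csv", "@type": "File"},
    {"@id": "ro-crate-preview.html", "@type": "CreativeWork", "about": {"@id": "./"}},
]

print(website_parts(graph))
```

The preview entity can still be described in the graph, as the draft text allows. The check only flags it when it is linked through `hasPart`.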
