It should be clear which specification version a CSIP/SIP/AIP/DIP is compatible with #703
Comments
Use cases:
|
That is usually achieved by having the profile pointer go to the correct version of the profile. So instead of pointing to the general one, point to the versioned one: https://earkcsip.dilcis.eu/profile/E-ARK-CSIP-v2-0-4.xml |
While that would be one way to mark the intended version this doesn't seem to be reliably possible at the moment as I can't find an equivalent versioned document URL for the profiles of v2.1.0 of the specifications? Only "archived" versions of the profiles of previous versions of the specifications get versions in their names but the current version would need to have that as well. It would be good to have that and a general requirement to use stable URLs as part of CSIP6. On the side of the profile itself, is it documented anywhere that the
The <URI LOCTYPE="URL" ASSIGNEDBY="local">https://earkcsip.dilcis.eu/profile/E-ARK-CSIP.xml</URI> This is equally the case for the vocabularies which I think should be thought of as bound to particular versions as well?
Some of the allowed values contain version strings, I assume the other entries at the beginning are all legacy and shouldn't be used for new packages? If additional (validation) resources are available (e.g. an XML schema for the CITS ERMS v2.1.0) the same considerations should apply I believe. |
In my opinion, pointing to the patch version does not make much sense, but the specific version should be stated. Take the PREMIS schema as an example: https://www.loc.gov/standards/premis/v3/premis-v3-0.xsd The namespace is "http://www.loc.gov/premis/v3", which identifies the major version, and we assume schemas will maintain backward compatibility within the major version. PREMIS also states the version in an attribute.
<xs:complexType name="premisComplexType">
<xs:sequence>
<xs:element ref="object" maxOccurs="unbounded"/>
<xs:element ref="event" minOccurs="0" maxOccurs="unbounded"/>
<xs:element ref="agent" minOccurs="0" maxOccurs="unbounded"/>
<xs:element ref="rights" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
<xs:attribute name="version" type="version3" use="required"/>
</xs:complexType>
<!--
************** version definition
-->
<xs:simpleType name="version3">
<xs:restriction base="xs:string">
<xs:enumeration value="3.0"/>
</xs:restriction>
</xs:simpleType>
The same could be accomplished in the METS profile, by pointing to the METS profile major version and adding an attribute to state the specific version used. In the meantime @karinbredenberg's solution could be used, but as @prettybits states, we should publish the latest version under its own version name and add information about the intended use in the specification. |
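To illustrate the PREMIS pattern described in this comment, here is a minimal Python sketch that reads the major version from the namespace and the specific version from the `version` attribute. The sample document and the helper name `premis_versions` are made up for illustration and are not part of PREMIS or of any E-ARK tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a minimal PREMIS 3 root element (not a valid full record).
SAMPLE = '<premis xmlns="http://www.loc.gov/premis/v3" version="3.0"/>'


def premis_versions(xml_text):
    """Return (namespace_major, declared_version) for a PREMIS document."""
    root = ET.fromstring(xml_text)
    # ElementTree writes a namespaced tag as "{namespace}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    # The last path segment of the namespace carries the major version, e.g. "v3".
    namespace_major = namespace.rsplit("/", 1)[-1]
    # The specific version is stated separately in the "version" attribute.
    return namespace_major, root.get("version")


print(premis_versions(SAMPLE))
```

A METS profile equivalent would need an agreed place for the specific version, which is the open question in this thread.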
The specifications have a fair number of "moving parts" some of which are versioned, some of which are not. Even with the existing state of play, this causes problems as:
The official registry of METS profiles does not support profile versioning. See https://www.loc.gov/standards/mets/profile_docs/mets.profile.v2-0.html#related_profile. This means the profile has no convenient attribute to record the version. E-ARK uses the optional
Old versions of the IP specifications are archived and made available under versioned file names. Using the CSIP as an example:
This approach has been introduced as needed and is inconsistent, at least where the current version is concerned. The versioning of METS Profiles is made trickier by their lack of support for version numbering. Creating an extension to the profile schema to do so seems an extreme solution. Continuing to use the

The URI/URL for the profiles should reflect the version number in the URL path, as should every other resource. This should be consistently applied across all URL forms unless there is a compelling reason not to do so. A proposed style for the URL is to version for the major and minor numbers only.

PROPOSED CHANGES
All profiles to use the root element ID attribute consistently:
Treatment of the leading The URI/URL for the profiles should reflect the version number in the URL path, as should every other resource. This should be consistently applied across all URL forms unless there is a compelling reason not to do so. A proposed style for the URL is to use the explicit version in the https://earkcsip.dilcis.eu/profile/v2/ will resolve to a directory listing showing all the current version 2 profiles. Each version of a profile will reside under the appropriate major version and include the full version number in the file name, e.g. below the v2 directory above:
ALL references to a profile must include a full version number to avoid ambiguous references. The supporting extension schema, vocabularies and vocabulary RNG schema will be versioned in an identical manner, e.g.
While we don't have different versions of these the changes will be made to retain consistency and open the possibility of future new versions. The extension schema namespace URI WILL NOT be versioned. The vocabularies will be versioned as follows:
and so on. For completeness, the underlying vocabulary RNG schema will be versioned as well. There are currently no plans to amend these. |
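As a rough sketch of how a consumer might read a version back out of a URL built along the proposed lines (major version directory, full version number in the file name), assuming the file name follows the archived pattern seen earlier in this thread (`E-ARK-CSIP-v2-0-4.xml`): the exact file names of the proposal are not shown here, so the pattern, the function name `parse_profile_url` and the example URL are all assumptions.

```python
import re

# Assumption: "<name>-v<major>-<minor>-<patch>.xml" under "/profile/v<major>/".
PROFILE_URL = re.compile(
    r"/profile/v(?P<dir_major>\d+)/"
    r"(?P<name>[A-Za-z-]+?)-v(?P<major>\d+)-(?P<minor>\d+)-(?P<patch>\d+)\.xml$"
)


def parse_profile_url(url):
    """Return (profile_name, (major, minor, patch)) or None if unversioned."""
    match = PROFILE_URL.search(url)
    if match is None:
        return None  # e.g. a versionless "latest" URL
    major, minor, patch = (int(match[key]) for key in ("major", "minor", "patch"))
    if int(match["dir_major"]) != major:
        return None  # directory and file name disagree on the major version
    return match["name"], (major, minor, patch)


# Hypothetical URL in the proposed style.
print(parse_profile_url("https://earkcsip.dilcis.eu/profile/v2/E-ARK-CSIP-v2-1-0.xml"))
# The current versionless URL yields None.
print(parse_profile_url("https://earkcsip.dilcis.eu/profile/E-ARK-CSIP.xml"))
```

Returning `None` for a versionless URL makes the ambiguity explicit rather than silently assuming the latest version.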
The issue is going to be discussed by the DILCIS Board |
Additional comments received from @prettybits during the Salamanca event. It should be clear exactly which version of METS Profiles was used. For this purpose, it would be good to have a standalone attribute for the METS Profile ID. The profiles should also always state the exact version number; a "latest" URL should not be used in a METS XML, as it will then be unclear - especially after a while - which version was used. |
Hi everyone, I read through all of the comments and I have to say that I'm a bit lost here. In my understanding, this topic has to do with choosing the correct package validation procedure. When processing an information package (IP) we need to parse the METS file placed in the root folder of the IP, and to accomplish that we need to know by which rules it should be validated. This means that the METS file should clearly identify the METS Profile it adheres to, so that the right validator/parser is invoked by the validation process. The way this is done is by specifying the value of the In the following example found on the SIP GitHub repository (https://github.com/DILCISBoard/E-ARK-SIP/tree/rel/v2.1.1/examples) we can see that the profile that is identified to validate this METS file is
<mets
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.loc.gov/METS/"
xmlns:xlink="http://www.w3.org/1999/xlink"
OBJID="a46ab3d0-c710-4d73-b58d-e93e30b53a80"
TYPE="ERMS"
xlink:CONTENTTYPESPECIFICATION="SMURFERMS"
PROFILE="http://www.dasboard.eu/specifications/sip/v03/METS.xml"
xsi:schemaLocation="http://www.loc.gov/METS/ schemas/mets.xsd"
LABEL="root level METS file for an IP">
<metsHdr CREATEDATE="2017-01-31T13:07:22.6970809+02:00" RECORDSTATUS="NEW" LASTMODDATE="2017-01-31T13:07:22.6970809+02:00">
...
I would argue that this should be used as the token to identify the right METS parser. I don't think it should be necessary to get the linked METS profile and inspect its ID to get the token that we need. That being said, I believe that having a well-defined @id attribute on the METS profile is a good idea and something we should work on; I just don't think it is mandatory to fulfil the original goal of this issue. The METS file generators out there should be enhanced to clearly identify the METS profile URL. The existing guides and specifications should stress this in their text. @luis100 Am I missing something here? |
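A minimal sketch of the idea in this comment, reading only the `PROFILE` attribute of the root `mets` element so it can serve as the token for choosing a parser. It uses Python's standard `xml.etree.ElementTree.iterparse`; the function name `read_profile` and the trimmed sample are illustrative assumptions, not taken from an existing validator.

```python
import io
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Trimmed version of the example above, kept only to exercise the function.
SAMPLE = (
    b'<mets xmlns="http://www.loc.gov/METS/" '
    b'PROFILE="http://www.dasboard.eu/specifications/sip/v03/METS.xml">'
    b"<metsHdr/></mets>"
)


def read_profile(stream):
    """Return the PROFILE attribute of the root <mets> element, or None."""
    # "start" events fire as soon as an opening tag is read, so the root
    # element and its attributes are available before the rest is processed.
    for _event, element in ET.iterparse(stream, events=("start",)):
        if element.tag == f"{{{METS_NS}}}mets":
            return element.get("PROFILE")
        return None  # the root element is not a METS element
    return None


print(read_profile(io.BytesIO(SAMPLE)))
```

An empty or malformed file raises `xml.etree.ElementTree.ParseError`, which a caller would need to handle before any validator is selected.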
I think you are missing some aspects. One is that profiles were not being published with an explicit version; they were renamed with the version only when archived (i.e. when a new version came out), so the URL for the latest version was always versionless, which creates issues. This is the profile/spec publication part of the issue. Also, the METS Profile version and the "METS parser", or more generally the specification version and the specification validator, may not be one-to-one. There are changes in the specification and in the METS Profile that should be versioned but that don't necessarily create a new validator implementation. Generally, PATCH versions should not require changes in the validator implementation. The other issue is the "flavour" we are validating: are we validating CSIP, SIP, AIP, CITS-SIARD, CITS-ERMS? Which versions of which? We want a hierarchical validation structure, but how do we identify the versions consistently? For example, CITS-SIARD needs to use the PROFILE https://citssiard.dilcis.eu/profile/E-ARK-SIARD-ROOT.xml, which does not have a version, nor does it identify whether the package is a SIP/AIP/DIP or which spec version of those it follows. Finally, a side note: all implementation suggestions here point to including this in the METS profile URL that is inside the METS file. Please note that knowing the profiles and versions is necessary to select the piece of code that will validate this file, so we will need to read the file twice: once to get the package type and spec version (although this could be an empty file or malformed XML file), and again to actually do the validation. Any format detection tool (like Droid/PRONOM) will also need to use this information to identify the format and version of the file (if ZIPPED). |
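The point that PATCH versions should not require a new validator can be sketched as a lookup keyed on specification, major and minor version only. The registry contents, the validator labels and the function name `select_validator` below are hypothetical placeholders, not names from an existing implementation.

```python
# Hypothetical registry: (specification, major, minor) -> validator label.
VALIDATORS = {
    ("CSIP", 2, 0): "csip-validator-2.0",
    ("CSIP", 2, 1): "csip-validator-2.1",
}


def select_validator(spec, version):
    """Pick a validator for a (major, minor, patch) version, ignoring patch."""
    major, minor, _patch = version  # patch is deliberately not part of the key
    return VALIDATORS.get((spec, major, minor))


print(select_validator("CSIP", (2, 0, 4)))
print(select_validator("CSIP", (2, 1, 0)))
print(select_validator("SIP", (2, 1, 0)))
```

A real hierarchical validation would also need the package flavour (SIP/AIP/DIP, CITS) as part of the key, which is exactly the information the comment says is missing from the current PROFILE URLs.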
@carlwilson please look at this |
@karinbredenberg In order to vote for an approach on this topic, we really need to nail down the options available for voting. We will not have time during the DILCIS Board meeting to come up with ways of addressing this issue. The options available for voting should be clear to everyone. |
The suggestion is in Carl's description: the METS profile file names include the version number from the start, within a folder structure where the major version number is the folder for all subversions. The ID attribute in the mets element is used for all. The same is to be implemented for the extension schema and vocabularies. |
The suggestion is:
Board members acknowledgment of the issue:
Voting: tick the box in front of your name to say yes to the suggestion.
|
7 DILCIS Board members have acknowledged the issue. The suggested solution will be included in the next version. |
Related to comment keeps/commons-ip#166 (comment)
When validating a CSIP/SIP/AIP/DIP, it is difficult to know which version of the specification we should validate against, or at least which version of the specification a package was created for.