Introducing disjointedness between Information and Non-Information Resources #619

Open
2 of 15 tasks
plbt5 opened this issue Jul 26, 2024 · 3 comments · May be fixed by #622 or #623

Comments

@plbt5
Contributor

plbt5 commented Jul 26, 2024

Note

(Submitted by @plbt5 and @ajnelson-nist.)

This proposal is split off from Issue 606. This proposal does not address any of the UUID discussion from 606.

Background

UCO does not currently account for explicitly representing the distinction between a physical resource that extends in time and space, like a device or a person, and a digital (web) resource that only lives in the cyber-domain, e.g., <https://caseontology.org/index.html>. Since the two are clearly disjoint from each other, and because there are many objects in UCO that are either one or the other, such disjointedness currently must be specified case by case. See for instance Issue #536, which partially addressed a question around a graph-individual representing a downloadable file (e.g., <http://example.org/file.zip>).

We recall the distinction, already identified by RDF, between information and non-information resources. This ended up in RFC 9110, HTTP Semantics. We have depicted their application and distinctions in Figure 1 below:

[Image: Distinction between URIs/URN/URL and Information Resources versus Non-information Resources]
Figure 1 - Information and non-information resources: their relationship and differences

(Note: For the purposes of this proposal, please consider URI and IRI as synonymous.)

The distinction between an Information Resource (IR) and a Non-Information Resource (NIR) cannot be determined from the URI itself, only from the response that one gets from the server. If the URI concerns an NIR, the server cannot respond with the resource itself as data, because there does not yet exist anything like Elephant-Over-IP or Paul-Over-IP (a.k.a. "Beam me up, Scotty") in the protocols. Instead, the server will respond with an HTTP 303 status, redirecting to a URI that is an Information Resource. Visiting the NIR thus discloses information about the NIR as opposed to the real thing itself.

This kind of webserver behavior leaves the determination about whether a resource is an NIR or an IR as a matter of perception by the client. For instance, some services may differentially serve a page to some users, but not others, like with an international hotel that gives its home page to in-country visitors, but a language-specific page for external-appearing visitors. This is a case where the home page is perceived as an IR by in-country visitors, and as an NIR by out-of-country visitors. One graph holding perspectives from multiple geographies must be able to tolerate a resource being both an IR and an NIR.

Meanwhile, other RDF resources encoded in a graph belong to a set of concepts that will never be information resources, such as people or devices. Hence, we find a need to further specialize the non-information resource with a class of things that will never be information resources.

This distinction is instrumental for a lot of things that are built with RDF(S) and OWL, and it is something that UCO should at least recognize as current practice.

Requirements

Requirement 1

Allow UCO to unequivocally determine in a graph whether a resource is either never an information resource or possibly an information resource.

Requirement 2

A single web resource MUST be able to be represented as an IR and/or an NIR as appropriate in different situations, e.g., due to perception about authorization, location, specific targeting and more.

A resource can be both an IR and an NIR because it can be perceived as an IR or an NIR depending on constraints or business rules as implemented by the server, e.g., serving pages in different languages when requested from different geographical locations.

(Proposal flow note: This proposal suggests a solution and competencies before providing a risk/benefit analysis.)

Solution suggestion

The implementation would first need to introduce the distinction between Non-Information and Information Resources. This would become two additional near-top-level classes, under core:UcoThing. This would be a nod to the concepts really being RDFS concepts, but not defined with RDFS IRIs. We should also avoid entailing RDFS semantics of rdfs:Resource being the top-level class, because of the tension such would create with OWL and owl:Thing being the top-level class.

Next, another distinction should be introduced to acknowledge Never-Information Resources, these being a specialization of the NIR and disjoint with the IR. Because the IR and the NIR themselves are not made disjoint, UCO can follow the reality where an IR can change into an NIR, as explained in Competency 1.

To that end, we suggest introducing the following concepts in UCO:

core:NonInformationResource 
	rdfs:subClassOf core:UcoThing ;
	.
core:InformationResource 
	rdfs:subClassOf core:UcoThing ;
	.
core:NeverInformationResource 
	rdfs:subClassOf core:NonInformationResource ;
	owl:disjointWith core:InformationResource ;
	.
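Because the competencies in this proposal test consistency through SHACL validation, the disjointedness could also be mirrored as a SHACL shape. The following is only a minimal sketch: the shape IRI is a hypothetical name, the namespace IRI is assumed to be UCO's usual core namespace, and nothing here is taken from the linked pull requests.

```turtle
@prefix core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Hypothetical shape name, for illustration only.
# Targets every instance of core:InformationResource (including instances of
# subclasses, when the rdfs:subClassOf triples are available to the validator)
# and reports a violation if that node is also a core:NeverInformationResource.
core:InformationResource-disjointWith-NeverInformationResource-shape
	a sh:NodeShape ;
	sh:message "A node cannot be both an InformationResource and a NeverInformationResource."@en ;
	sh:not [
		a sh:NodeShape ;
		sh:class core:NeverInformationResource ;
	] ;
	sh:targetClass core:InformationResource ;
	.
```

Note that the shape deliberately says nothing about core:NonInformationResource, so a node typed as both an InformationResource and a NonInformationResource would still conform.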

We also introduce observable:WebResource as a parent to observable:WebPage, to acknowledge web resources that are not yet known to be an IR or an NIR, and to acknowledge web pages, which are always to be considered an IR:

observable:WebResource
	rdfs:subClassOf observable:ObservableObject ;
	.
observable:WebPage
	rdfs:subClassOf observable:WebResource ;
	rdfs:subClassOf core:InformationResource ;
	.

Visually, this renders as follows, with green nodes new classes, and the red link a new disjointedness:

flowchart BT

  core_UcoThing[core:UcoThing]
  core_InformationResource[core:InformationResource]
  core_NonInformationResource[core:NonInformationResource]
  core_NeverInformationResource[core:NeverInformationResource]
  core_UcoObject[core:UcoObject]
  core_Item[core:Item]
  observable_Observable[observable:Observable]
  observable_ObservableObject[observable:ObservableObject]
  observable_WebResource[observable:WebResource]
  observable_WebPage[observable:WebPage]

style core_InformationResource stroke:#0f0;
style core_NeverInformationResource stroke:#0f0;
style core_NonInformationResource stroke:#0f0;
style observable_WebResource stroke:#0f0;

core_InformationResource -- ⊂ --> core_UcoThing
core_NonInformationResource -- ⊂ --> core_UcoThing
core_NeverInformationResource -- ⊂ --> core_NonInformationResource
core_InformationResource x-- ⋂=∅ --x core_NeverInformationResource
linkStyle 3 color:red,stroke:red;

core_UcoObject -- ⊂ --> core_UcoThing
core_Item -- ⊂ --> core_UcoObject
observable_Observable -- ⊂ --> core_UcoObject
observable_ObservableObject -- ⊂ --> core_Item
observable_ObservableObject -- ⊂ --> observable_Observable
observable_WebResource -- ⊂ --> observable_ObservableObject
observable_WebPage -- ⊂ --> core_InformationResource
observable_WebPage -- ⊂ --> observable_WebResource

Apart from the above additions to UCO, we suggest performing an initial alignment. The Risks section should make clear the benefit of such alignment, particularly pertaining to some existing practices (outside of UCO) on designating graph nodes with RDF types analogous to UCO's identity:Person and observable:WebPage. The rationale followed is: can this owl:Thing ever be downloaded with some browser or command-line tool?

action:Action
	rdfs:subClassOf core:NeverInformationResource ;
	.
core:Event
	rdfs:subClassOf core:NeverInformationResource ;
	.
core:UcoInherentCharacterizationThing
	rdfs:subClassOf core:NeverInformationResource ;
	.
identity:Organization
	rdfs:subClassOf core:NeverInformationResource ;
	.
identity:Person
	rdfs:subClassOf core:NeverInformationResource ;
	.
observable:Device
	rdfs:subClassOf core:NeverInformationResource ;
	.
observable:URL
	rdfs:subClassOf core:NeverInformationResource ;
	.

Competencies demonstrated

Competency 1

Say the webpage of a multilingual company (MC) is being accessed by two market analysts in a multinational organization, who routinely contribute to a shared knowledge base in the organization. Their offices are in different countries that happen to use languages MC supports, Japan and France. MC's default language is Japanese.

The Japanese analyst visits the home page, https://mc.example.co.jp/, and is served content from that URL. The French analyst visits the home page, https://mc.example.co.jp/, and is 303-redirected to https://mc.example.co.jp/lang-fr/ by server-side client-geolocation rules.

Neither analyst knows the other is trying to access https://mc.example.co.jp/.

Competency Question 1.1

What are the representations of the Japanese analyst and the French analyst, using InformationResource, NonInformationResource, NeverInformationResource, WebResource, and/or WebPage?

Result 1.1

The Japanese analyst:

<https://mc.example.co.jp/>
	a observable:WebPage ;
	.

The French analyst:

<https://mc.example.co.jp/>
	a
		core:NonInformationResource ,
		observable:WebResource
		;
	.
<https://mc.example.co.jp/lang-fr/>
	a observable:WebPage ;
	.

Even if pooled in the shared knowledge base, this total knowledge view remains consistent (i.e. does not raise SHACL validation errors).

<https://mc.example.co.jp/>
	a
		core:NonInformationResource ,
		observable:WebPage
		;
	.
<https://mc.example.co.jp/lang-fr/>
	a observable:WebPage ;
	.

This provides an example of a web resource that is, by differential service, contingently an InformationResource and/or a NonInformationResource.

Competency Question 1.2

Are the views consistent when pooled into one graph without any notes on time of observation (i.e., does not raise SHACL validation issues)?

Result 1.2

Yes. The testing in PR 610 confirms no SHACL violations are raised. The visual display of the classes and how this example doesn't hit a class-disjointedness issue is as follows (using "⊂" for subclassing (rdfs:subClassOf), "⋂=∅" for class-disjointedness (owl:disjointWith), and "∈" for instantiation (rdf:type)).

flowchart BT

subgraph TBox
  core_UcoThing[core:UcoThing]
  core_InformationResource[core:InformationResource]
  core_NonInformationResource[core:NonInformationResource]
  core_NeverInformationResource[core:NeverInformationResource]
  core_UcoObject[core:UcoObject]
  core_Item[core:Item]
  observable_Observable[observable:Observable]
  observable_ObservableObject[observable:ObservableObject]
  observable_WebResource[observable:WebResource]
  observable_WebPage[observable:WebPage]
end

subgraph ABox
  wp1[https://mc.example.co.jp/]
  wp2[https://mc.example.co.jp/lang-fr/]
end

style core_InformationResource stroke:#0f0;
style core_NeverInformationResource stroke:#0f0;
style core_NonInformationResource stroke:#0f0;
style observable_WebResource stroke:#0f0;

core_InformationResource -- ⊂ --> core_UcoThing
core_NonInformationResource -- ⊂ --> core_UcoThing
core_NeverInformationResource -- ⊂ --> core_NonInformationResource
core_InformationResource x-- ⋂=∅ --x core_NeverInformationResource
linkStyle 3 color:red,stroke:red;

core_UcoObject -- ⊂ --> core_UcoThing
core_Item -- ⊂ --> core_UcoObject
observable_Observable -- ⊂ --> core_UcoObject
observable_ObservableObject -- ⊂ --> core_Item
observable_ObservableObject -- ⊂ --> observable_Observable
observable_WebResource -- ⊂ --> observable_ObservableObject
observable_WebPage -- ⊂ --> core_InformationResource
observable_WebPage -- ⊂ --> observable_WebResource

wp1 -- ∈\n(per French analyst) --> core_NonInformationResource
wp1 -- ∈\n(per French analyst) --> observable_WebResource
wp1 -- ∈\n(per Japanese analyst) --> observable_WebPage
wp2 -- ∈\n(per French analyst) --> observable_WebPage

Competency 2

This competency gives a scenario provided as a Risk in the first version of this proposal.

There is a user interface design option available for web services that choose to provide content for browser-based users and RDF-based users. They can choose to separate the RDF individuals from the web pages documenting those individuals; or, they can choose to provide the browser-friendly contents (i.e., HTML, maybe with graphics) describing an individual at that individual's IRI.

Suppose a personnel indexing service is deployed that uses home pages as person identifiers for an example organization. Their knowledge graph is available to a graph consumer who also uses UCO, and we assume the IR/NIR/Never-IR distinction of this proposal is adopted. This statement is in the graph provided by the service:

<http://example.org/~bob>
	a foaf:Person ;
	foaf:givenName "Bob" ;
	.

And, http://example.org/~bob, when visited in a browser, is served as HTML. A crawler used by the graph consumer logs this in its knowledge graph, after stumbling on Bob's home page through an intranet traversal:

<http://example.org/~bob>
	a observable:WebPage ;
	.

Competency Question 2.1

What encodings are possible to describe the graph-individual <http://example.org/~bob>?

This question stems from UCO's demonstrations to date, and is presented to motivate the need for UCO to clarify its classes URL and WebPage in particular.

Result 2.1

  1. <http://example.org/~bob> a observable:WebPage . - The graph-individual pulls down in a browser as HTML. From the crawler's perspective, this is a WebPage.
  2. <http://example.org/~bob> a identity:Person . - The graph-individual has a type of foaf:Person in the personnel service's graph, so it feels natural to translate that statement over to UCO's identity:Person.

Unfortunately, if both of those interpretations were taken, an inconsistency would be reached: identity:Person is under core:NeverInformationResource, and observable:WebPage is under core:InformationResource, entailing membership in two disjoint sets.

  3. <http://example.org/~bob> a observable:URL . - The graph-individual can be seen as describing itself. However, this is another instance of the confusion discussed in Issues #534 ("How does one represent a downloadable file in UCO?") and #536 ("File and URL should be designated disjoint classes"), which addressed modeling a URL that yields a file-download on visit. In Issue 536, a disjointedness between URL and File was adopted, but several significant questions were left unaddressed.

This proposal takes a step towards addressing the question of what higher-level classes should be made disjoint, rather than piecemeal assignment of some ObservableObject subclasses.
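The inconsistency between the WebPage and Person interpretations in this result can be spelled out triple by triple. This is a minimal sketch, assuming the class alignments from the solution suggestion and UCO's usual namespace IRIs; the entailed types are written out explicitly only for illustration, since a reasoner or SHACL validator would derive them from the rdfs:subClassOf axioms.

```turtle
@prefix core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix identity: <https://ontology.unifiedcyberontology.org/uco/identity/> .
@prefix observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .

# Asserted by the crawler.
<http://example.org/~bob>
	a observable:WebPage ;
	.

# Asserted when foaf:Person is naively translated to identity:Person.
<http://example.org/~bob>
	a identity:Person ;
	.

# Entailed by the proposed subclass alignments:
#   observable:WebPage rdfs:subClassOf core:InformationResource
#   identity:Person    rdfs:subClassOf core:NeverInformationResource
<http://example.org/~bob>
	a
		core:InformationResource ,
		core:NeverInformationResource
		;
	.

# Because core:NeverInformationResource owl:disjointWith core:InformationResource,
# no individual can hold both types, so the pooled graph is inconsistent.
```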

Competency Question 2.2

How can the personnel indexing service's graph integrate into the UCO-based graph?

Result 2.2

There is some challenge in integrating the personnel indexing service's graph into an environment where information resources and non-information resources are held disjoint.

Integration of such a data source would need to split the resource http://example.org/~bob into independent entities, likely with a new identity:Person node. Other assertions on Bob from the personnel graph, such as name information, would likely need to migrate into Facets defined in the UCO identity: namespace, rather than be carried over with the FOAF vocabulary. In this case, some FOAF vocabulary can still be used to preserve links.

The below graph would be derived from the personnel graph, and added to the crawler's knowledge base. The personnel graph would not be directly added.

<http://example.org/~bob>
	a observable:WebPage ;
	.
kb:Person-a3d3af3d-ea1d-47f6-bc02-ac334ded6549
	a identity:Person ;
	core:name "Bob" ;
	core:hasFacet kb:SimpleNameFacet-5e939a71-078c-4ddd-a6fe-3635288b3f24 ;
	.
kb:SimpleNameFacet-5e939a71-078c-4ddd-a6fe-3635288b3f24
	a identity:SimpleNameFacet ;
	identity:givenName "Bob" ;
	.
kb:Relationship-6c57d1cd-8a10-4163-98bd-93d3d2e15b00
	a core:Relationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Has_Company_Homepage" ;
	core:source kb:Person-a3d3af3d-ea1d-47f6-bc02-ac334ded6549 ;
	core:target <http://example.org/~bob> ;
	.

# Preserve link between new Bob node and Bob's homepage with FOAF vocabulary.  Add FOAF types entailed by linking properties.
<http://example.org/~bob>
	a foaf:Document ;
	foaf:primaryTopic kb:Person-a3d3af3d-ea1d-47f6-bc02-ac334ded6549 ;
	.
kb:Person-a3d3af3d-ea1d-47f6-bc02-ac334ded6549
	a foaf:Person ;
	foaf:homepage <http://example.org/~bob> ;
	.

# Carry some of the FOAF data to UCO Person node.
kb:Person-a3d3af3d-ea1d-47f6-bc02-ac334ded6549
	foaf:givenName "Bob" ;
	.

Risk / Benefit analysis

Benefits

Adding the specialization class NeverInformationResource moves further toward realizing an assumed disjunction in RFC 9110's HTTP Semantics between "Information Resource" and "Non Information Resource". In practice, InformationResource and NonInformationResource can be conflated when graphs are built from multiple perspectives. This proposal prevents some conflations that should not be possible, especially ones where physical things could accidentally be implied to be downloadable.

Aligning WebPage with a higher-level concept should bring a better understanding of how to use it. This is needed since UCO's WebPage and URL can become mixed with other concepts due to the fundamental nature of RDF being about using IRIs and UCO describing URLs. observable:WebPage has been lacking to date in UCO demonstrations, which has raised confusion in Ontology Committee calls. Chances to clarify this class should be taken.

Understanding what WebPage is and isn't may be especially important in resolving ReactionsListFacet from #374. A social media post is often viewable as a web page, so UCO usage could easily see something like this in some adopter's graph analyzing some (example) social network:

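@prefix ex: <http://example.org/ontology/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix uco-observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ex:SocialCompanyPost
	a owl:Class ;
	rdfs:comment "A social media post on the network provided by Example Social Company, Inc."@en ;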
	rdfs:subClassOf
		uco-observable:WebPage ,
		uco-observable:Post
		;
	.
ex:esciRepostOf
	a owl:ObjectProperty ;
	rdfs:domain ex:SocialCompanyPost ;
	rdfs:range ex:SocialCompanyPost ;
	.
ex:esciText
	a owl:DatatypeProperty ;
	rdfs:domain ex:SocialCompanyPost ;
	rdfs:range xsd:string ;
	.
	
<http://social.example.com/ExampleUser2/1722027818.0>
	a ex:SocialCompanyPost ;
	ex:esciRepostOf <http://social.example.com/ExampleUser1/1722027486.0> ;
	ex:esciText "lol" ;
	.
<http://social.example.com/ExampleUser1/1722027486.0>
	a ex:SocialCompanyPost ;
	ex:esciText "wow" ;
	.

Risks

Competency 2 illustrates a significant quality-control consideration for how to integrate data from non-UCO graphs. Agreement on fundamentals is one of the significant challenges of cross-graph interoperability.

The heuristic of "Can this ever be downloaded?" might, or might not, be a sufficient guideline for determining what would be NeverInformationResources. This could be challenging for some things where records and events are closely tied together. For instance, a Bitcoin transaction has tightly-intertwined elements of (UCO) Actions and EventRecords. The action is someone transferring coins, which would (by this proposal's action:Action alignment) be a NeverInformationResource; however, the action doesn't fully happen without the record being an InformationResource retrievable from the blockchain. This seems like a situation where it's tempting to say one "downloads the action," which the proposers assume is not a kind of statement UCO should wish to support. This particular "downloading the action" statement can be avoided by adding a specific disjointedness between action:Action and observable:EventRecord; but, the higher-order disjointedness in this proposal satisfies the same separation, stemming from actions being never-information resources, and leaving it open whether event records can be information resources.
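One way to picture the separation described in this risk is to keep the transfer and its record as two distinct nodes. The sketch below is illustrative only: the kb: namespace, the UUIDs, and the kindOfRelationship string "Recorded_By" are hypothetical, and the namespace IRIs are assumed to be UCO's usual ones.

```turtle
@prefix action: <https://ontology.unifiedcyberontology.org/uco/action/> .
@prefix core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix kb: <http://example.org/kb/> .
@prefix observable: <https://ontology.unifiedcyberontology.org/uco/observable/> .

# The transfer of coins itself. Under the proposed alignment, action:Action is a
# NeverInformationResource, so this node can never be "downloaded".
kb:Action-2f0c3d1e-5a47-4c1b-9e0d-7b6a1c2d3e4f
	a action:Action ;
	.

# The record of the transfer. The proposal leaves open whether an event record
# can be an information resource.
kb:EventRecord-8d9e0f1a-2b3c-4d5e-8f60-1a2b3c4d5e6f
	a observable:EventRecord ;
	.

# Hypothetical link between the two nodes, following the Relationship pattern
# used in Result 2.2.
kb:Relationship-c4d5e6f7-0a1b-4c2d-9e3f-4a5b6c7d8e9f
	a core:Relationship ;
	core:isDirectional true ;
	core:kindOfRelationship "Recorded_By" ;
	core:source kb:Action-2f0c3d1e-5a47-4c1b-9e0d-7b6a1c2d3e4f ;
	core:target kb:EventRecord-8d9e0f1a-2b3c-4d5e-8f60-1a2b3c4d5e6f ;
	.
```

With this split, a statement about retrieving data from the blockchain attaches to the record node, not to the action node.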

If the alignment core:UcoInherentCharacterizationThing rdfs:subClassOf core:NeverInformationResource . is accepted, the current statement in the ontology core:UcoInherentCharacterizationThing rdfs:subClassOf core:UcoThing . becomes entailed, and no longer needs to be explicitly stated from some perspectives, including with respect to SHACL, and with respect to entailment schemes (whether RDFS or OWL). However, this divide is one of the foundational statements of UCO, that there are "domain objects" (UcoObject and subclasses) and "non-domain objects" (things that only inhere and characterize other things, and cannot exist without those other things). Removal of the triple core:UcoInherentCharacterizationThing rdfs:subClassOf core:UcoThing . makes this divide less apparent, because core:UcoInherentCharacterizationThing is no longer among the direct subclasses of core:UcoThing; but, the divide is still present from the axiom core:UcoInherentCharacterizationThing owl:disjointWith core:UcoObject .
This appears to the proposers to be an appropriate adjustment of UCO's foundations, because UCO's foundations include design tenets of RDF.
The alignment of core:UcoInherentCharacterizationThing assumes, and going forward decides, that it has no subclasses that will ever be downloadable. Were they downloadable, it seems they would be domain objects (further, ObservableObjects) under UcoObject. To date, it seems the only inherent characterization thing subclass that comes close to blurring the downloadable-or-not divide by bundling URLs is observable:URLHistoryEntry, but that class uses observable:url and observable:referrerURL to refer to separate observable:URLs.
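The entailment discussed in this risk can be traced through the subclass chain. The following is a minimal sketch that uses only axioms stated in this proposal and assumes UCO's usual core namespace IRI; the entailed triple is shown as a comment.

```turtle
@prefix core: <https://ontology.unifiedcyberontology.org/uco/core/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Proposed alignment, alongside the existing disjointedness axiom.
core:UcoInherentCharacterizationThing
	rdfs:subClassOf core:NeverInformationResource ;
	owl:disjointWith core:UcoObject ;
	.

# Class axioms from the solution suggestion.
core:NeverInformationResource
	rdfs:subClassOf core:NonInformationResource ;
	.
core:NonInformationResource
	rdfs:subClassOf core:UcoThing ;
	.

# Entailed by the transitivity of rdfs:subClassOf, so it would no longer need
# to be asserted explicitly:
#   core:UcoInherentCharacterizationThing rdfs:subClassOf core:UcoThing .
```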

Visual summary

This figure illustrates the added classes and alignments. Current disjointedness axioms are also illustrated.

flowchart BT

subgraph AlwaysIR
  observable_WebPage[observable:WebPage]
  core_InformationResource[core:InformationResource]
end
subgraph MaybeIR
  core_IdentityAbstraction[core:IdentityAbstraction]
  core_Item[core:Item]
  core_NonInformationResource[core:NonInformationResource]
  core_UcoObject[core:UcoObject]
  core_UcoThing[core:UcoThing]
  identity_Identity[identity:Identity]
  observable_Observable[observable:Observable]
  observable_ObservableObject[observable:ObservableObject]
  observable_WebResource[observable:WebResource]
end
subgraph NeverIR
  action_Action[action:Action]
  core_Event[core:Event]
  core_NeverInformationResource[core:NeverInformationResource]
  core_UcoInherentCharacterizationThing[core:UcoInherentCharacterizationThing]
  identity_Organization[identity:Organization]
  identity_Person[identity:Person]
  observable_Device[observable:Device]
  observable_URL[observable:URL]
end

style core_InformationResource stroke:#0f0
style core_NonInformationResource stroke:#0f0
style core_NeverInformationResource stroke:#0f0
style observable_WebResource stroke:#0f0

core_InformationResource -- ⊂ --> core_UcoThing
core_NonInformationResource -- ⊂ --> core_UcoThing
core_NeverInformationResource -- ⊂ --> core_NonInformationResource
core_InformationResource x-- ⋂=∅ --x core_NeverInformationResource
linkStyle 3 color:red,stroke:red;

action_Action x-- ⋂=∅ --x core_Event
linkStyle 4 color:red,stroke:red;
action_Action -- ⊂ --> core_UcoObject
action_Action -- ⊂ --> core_NeverInformationResource
core_Event -- ⊂ --> core_NeverInformationResource
core_Event -- ⊂ --> core_UcoObject
core_IdentityAbstraction -- ⊂ --> core_UcoObject
core_Item -- ⊂ --> core_UcoObject
core_UcoInherentCharacterizationThing -- ⊂ --> core_NeverInformationResource
core_UcoInherentCharacterizationThing x-- ⋂=∅ --x core_UcoObject
linkStyle 12 color:red,stroke:red;
core_UcoObject -- ⊂ --> core_UcoThing
identity_Identity -- ⊂ --> core_IdentityAbstraction
identity_Organization -- ⊂ --> core_NeverInformationResource
identity_Organization -- ⊂ --> identity_Identity
identity_Person -- ⊂ --> core_NeverInformationResource
identity_Person -- ⊂ --> identity_Identity
observable_Device -- ⊂ --> core_NeverInformationResource
observable_Device -- ⊂ --> observable_ObservableObject
observable_Observable -- ⊂ --> core_UcoObject
observable_ObservableObject -- ⊂ --> core_Item
observable_ObservableObject -- ⊂ --> observable_Observable
observable_URL -- ⊂ --> core_NeverInformationResource
observable_URL -- ⊂ --> observable_ObservableObject
observable_WebResource -- ⊂ --> observable_ObservableObject
observable_WebPage -- ⊂ --> core_InformationResource
observable_WebPage -- ⊂ --> observable_WebResource

Coordination

  • Tracking in Jira ticket OCUCO-319
  • Administrative review completed, proposal announced to Ontology Committees (OCs) on 2024-07-30
  • Requirements to be discussed in OC meeting, 2024-08-20
  • Requirements Review vote has not occurred
  • Requirements development phase completed.
  • Solution announced to OCs on TODO-date
  • Solutions Approval to be discussed in OC meeting, date TBD
  • Solutions Approval vote has not occurred
  • Solutions development phase completed.
  • Backwards-compatible implementation merged into develop for the next release
  • develop state with backwards-compatible implementation merged into develop-2.0.0
  • Backwards-incompatible implementation merged into develop-2.0.0
  • Milestone linked
  • Documentation logged in pending release page
  • Prerelease publication: CASE develop branch updated to track UCO's updated develop branch
  • Prerelease publication: CASE develop-2.0.0 branch updated to track UCO's updated develop-2.0.0 branch
@ajnelson-nist ajnelson-nist added this to the UCO 1.4.0 milestone Jul 29, 2024
ajnelson-nist added a commit that referenced this issue Jul 30, 2024
A follow-on patch will regenerate Make-managed files.

References:
* #619

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
@ajnelson-nist ajnelson-nist linked a pull request Jul 30, 2024 that will close this issue
ajnelson-nist added a commit that referenced this issue Jul 30, 2024
References:
* #619

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
@ajnelson-nist ajnelson-nist linked a pull request Jul 30, 2024 that will close this issue
@ajnelson-nist
Contributor

One point I found interesting in doing the alignments is that I think on a closer review, much of UCO could end up being considered as core:NeverInformationResources. The smaller set of classes is core:InformationResource, which at the moment just has observable:WebPage. Finding other subclasses of core:InformationResource seems to be an interesting challenge.

I have one potential subclass, which I haven't proposed separately because I've had trouble pinning where in UCO it would be. I'm providing it here just for discussion, not to augment the proposal.

I think an S3 Object would be another subclass of core:InformationResource and observable:ObservableObject. For instance, take this S3 Object from the DFRWS 2021 challenge:

@prefix drafting: <http://example.org/ontology/drafting/> .

<s3://digitalcorpora/corpora/dfrws/challenge-2021/1_Skimmer_mSD.zip>
    a drafting:S3Object ;
    .

This can be seen in Digital Corpora's object browser here, which offers an alternative (downloadable) view of the S3 object via an HTTP portal.

As a graph-individual, how would that S3 object be classified? I think the UCO definition, including this IR/NIR proposal, would go at least this far:

drafting:S3Object
    a owl:Class , sh:NodeShape ;
    rdfs:subClassOf
        core:InformationResource ,
        observable:ObservableObject
        ;
    .

I don't think it's a subclass of File, because S3 is an object store, not a file system.

@sbarnum
Contributor

sbarnum commented Aug 20, 2024

I still do not see the justification for this complexity here and still see a fundamental invalidity at least in the example if maybe(?) not in the underlying proposed change.

The validity issue is that throughout the examples, and I believe implied in the proposed solution, IRI/URI identifiers are used for UcoObjects that are not necessarily (and in at least one case apparently intentionally) globally unique, which is an absolute requirement for CDO object identifiers.
This is a hard stop issue.
We have a long-agreed-to suggested (SHOULD) form for CDO ids (e.g., "https://foo.bar.co/people/4fbcece4-afd5-4cd6-b70e-aab77f930b00") that guarantees global uniqueness as long as the producer assures the provided path and UUID are unique within their controlled domain.
Even if a producer chooses not to use the suggested form they are still required to guarantee global uniqueness of ids for all of their produced content.
This is a HARD stop issue because if this requirement goes away or even softens, the entire ecosystem falls apart.

On the broader issues, I am a bit confused.
Is the core purpose for this to be able to support the HTTP scenarios outlined in Competency 1 and 2?
I don't see any justifying rationale for this very fundamental change outside of those two scenarios.
If it is for these scenarios and ones very close to them then I would assert that attempting to use object IRI ids to provide requisite support is the wrong approach.
UCO already has the building blocks (URL, WebPage, HTTPConnection, Relationship, and maybe some other of the networking ObservableObjects) to support these scenarios in a more expressive and flexible manner while avoiding the ID uniqueness issue.
I would be happy to get together on a call for a couple of hours and talk through how to model these scenarios with existing capabilities.
I have read through the long details above a few times, and I don't think I am missing any edge cases or anything fundamental, but holding a working session would hopefully expose if I am wrong.

I would strongly object to progressing this any further until such a working session could occur.

@ajnelson-nist
Contributor

@sbarnum I welcome a discussion to further explore motivations of this and some of CDO's fundamental objectives. In particular, we should discuss whether CDO can be used to extend existing concepts of other vocabularies and knowledge bases, which I believe (and I think some others believe) to be a key objective of ontological interoperability. I believe there is a significant risk if CDO avoids this objective - a risk of inducing an information silo.

WebPage and web resources with a download-a-file disposition have also been a source of confusion. I would very much like to work through examples of these.
