UCO should clarify role of informal type string fields vs. implemented classes #640

Open
11 of 16 tasks
ajnelson-nist opened this issue Oct 28, 2024 · 1 comment · Fixed by #641

ajnelson-nist commented Oct 28, 2024

Background

UCO has several owl:DatatypeProperty definitions that house plain strings (sometimes drawn from semi-open vocabularies) used to describe a type. For example, observable:accountType describes a type that is some specialization of observable:Account, and the vocabulary suggests some values (ldap, nis, openid, etc.).

These properties are provided so that UCO users can strike a balance between specificity and agility. For instance, an account type might be so niche within an investigation that there would be little to no benefit in attempting to standardize it. Less-niche types, however, suffer when they are not standardized. For instance, the "Crossover WMD" CASE example uses the string "Phone" as a value of observable:accountType, even though the class observable:PhoneAccount is in UCO.

This proposal introduces a property core:informalType to clarify the usage of these plain string fields that proxy formal types, by inlining documentation into one place within the ontology.

Requirements

Requirement 1

String fields that are used as a substitute for OWL-encoded or SHACL-reviewed classes must include a link to documentation on their role in data creation, management, and interchange.

Requirement 2

Any grouping of these informal-typing string fields must not impede usage of semi-open vocabularies.

Risk / Benefit analysis

Benefits

This proposal introduces structural-purpose documentation, and a structural link to that embedded documentation, to many properties throughout UCO.

Risks

Subproperties have the potential to introduce comprehension complexity from multiple rdfs:domains and rdfs:ranges going up the superproperty hierarchy. Requirement 2 is intended to avoid this situation with the new property, and UCO happens to not use rdfs:domain on any of the properties impacted by this proposal. So, no new risks are currently believed to be associated with this change.

Competencies demonstrated

Competency 1

Competency Question 1.1

What are the informal type properties currently in UCO?

Result 1.1

The answer is drawn from the implementing PR (link coming momentarily after posting) and from this SPARQL query, which assumes core:informalType has been introduced:

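PREFIX core: <https://ontology.unifiedcyberontology.org/uco/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?nProperty
WHERE {
  # Match every property one or more sub-property steps below core:informalType.
  ?nProperty
    rdfs:subPropertyOf+ core:informalType ;
    .
}
ORDER BY ?nProperty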

The result comes from running that query against test/uco_monolithic.ttl after make check was run.
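
For readers less familiar with SPARQL property paths: the rdfs:subPropertyOf+ pattern matches every property reachable from core:informalType by one or more sub-property links. The following is a minimal sketch of that traversal in plain Python, with (child, parent) tuples standing in for triples. It is not how the result below was produced, and the ex: names are hypothetical, added only to show a multi-step chain and a non-match.

```python
from collections import defaultdict


def subproperties_of(pairs, ancestor):
    """Return every property linked to `ancestor` by one or more
    rdfs:subPropertyOf steps, sorted like the query's ORDER BY."""
    children = defaultdict(set)
    for child, parent in pairs:
        children[parent].add(child)

    found = set()
    stack = [ancestor]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return sorted(found)


# (child, parent) pairs standing in for rdfs:subPropertyOf triples.
# All ex: names are hypothetical.
SUB_PROPERTY_OF = [
    ("observable:accountType", "core:informalType"),
    ("tool:toolType", "core:informalType"),
    ("ex:middleType", "core:informalType"),
    ("ex:leafType", "ex:middleType"),
    ("ex:unrelated", "ex:somethingElse"),
]

informal_properties = subproperties_of(SUB_PROPERTY_OF, "core:informalType")
```

Here ex:leafType is found through ex:middleType, which is the case the + in the property path covers and a plain rdfs:subPropertyOf pattern would miss.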

?nProperty
configuration:dependencyType
configuration:itemType
core:eventType
location:addressType
marking:definitionType
observable:MSISDNType
observable:accountLogonType
observable:accountType
observable:actionType
observable:archiveType
observable:audioType
observable:blockType
observable:callType
observable:cellSiteType
observable:contentType
observable:dataType
observable:deviceType
observable:diskPartitionType
observable:diskType
observable:driveType
observable:eventType
observable:extFileType
observable:fileSystemType
observable:hiveType
observable:icmpType
observable:imageType
observable:libraryType
observable:messageType
observable:mimeType
observable:passwordType
observable:peType
observable:pictureType
observable:rangeOffsetType
observable:serviceType
observable:startType
observable:triggerSessionChangeType
observable:triggerType
observable:urlTransitionType
observable:whoisContactType
tool:toolType

Solution suggestion

core:informalType is introduced as a property. Its lengthy documentation string is given here with line-breaks:

Informal Type serves as a parent property for string-valued properties meant to describe a type without implementing a class design. This property hierarchy supports a balancing point between semantic specificity and operational agility.

The known benefits of describing types rather than implementing them include swift extensibility of some existing, or possibly non-existing, subclass hierarchy in UCO without requiring training in ontological development, taxonomic specification, or OWL, SHACL, or RDF maintenance logistics.

The known detractions of using string-literals for type descriptions include that used vocabularies may require careful maintenance among data-sharing parties;
that vocabularies require independent logistics (external to UCO) for providing definitions (i.e., dictionary-style semantics) to string-literals chosen;
and that string-literals cannot by themselves encode hierarchical structure or entailments, such as the informal device type string 'ExamplePhone 8 P4321' entailing 'ExamplePhone 8', 'ExamplePhone', or 'ExamplePhone models discontinued in 2020'.

Usage of Informal Type to house strings should be weighed against usage of classes when classes are available, and should periodically be reviewed for potential additions to UCO's class hierarchy or downstream extensions thereof.

All properties P that currently house an informal type are set as sub-properties of core:informalType: P rdfs:subPropertyOf core:informalType. Note this includes one property in CASE, investigation:investigationForm.

In satisfaction of Requirement 2, core:informalType does not have an rdfs:range. Though all of the child properties have the option of xsd:string in their range, some use a union of xsd:string with a semi-open vocabulary datatype. Even though Issue 629 will adjust typing of vocabulary members to be xsd:string, the sub-property linkage may interact poorly with owl:unionOf, since multiple rdfs:range statements are interpreted in OWL as an intersection.
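
The range concern can be illustrated with a small Python sketch. It assumes a deliberately simplified model in which each rdfs:range is a set of datatype names treated as disjoint atoms, and every range that applies to a value (the property's own plus any inherited through rdfs:subPropertyOf) is intersected. The vocabulary datatype name is used for illustration only.

```python
XSD_STRING = "xsd:string"
# Illustrative name for a semi-open vocabulary datatype.
VOCAB = "vocabulary:AccountTypeVocab"


def effective_range(own_ranges, inherited_ranges):
    """Intersect every applicable range.

    Each range is a set of admitted datatypes (a set with several members
    models an owl:unionOf). Returns None when no range is stated at all,
    meaning nothing is restricted.
    """
    admitted = None
    for rng in list(own_ranges) + list(inherited_ranges):
        admitted = set(rng) if admitted is None else admitted & set(rng)
    return admitted


# A child property whose range is the union of xsd:string and a vocabulary.
child_ranges = [{XSD_STRING, VOCAB}]

# Hypothetical alternative: the parent property declares a range of xsd:string.
with_parent_range = effective_range(child_ranges, [{XSD_STRING}])

# As proposed: the parent property has no rdfs:range.
without_parent_range = effective_range(child_ranges, [])
```

Under this model, a parent range of xsd:string would narrow the child's effective range and exclude vocabulary-typed values, while leaving the parent without a range keeps the child's union intact.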

Coordination

  • Tracking in Jira ticket OCUCO-149
  • Administrative review completed, proposal announced to Ontology Committees (OCs) on 2024-10-28
  • Requirements to be discussed in OC meeting, 2024-11-21
  • Requirements Review vote occurred, passing, on 2024-11-21
  • Requirements development phase completed.
  • Solution announced to OCs on 2024-11-22
  • Solutions Approval to be discussed in OC meeting, 2024-12-10
  • Solutions Approval vote occurred, passing, on 2024-12-10
  • Solutions development phase completed.
  • Backwards-compatible implementation merged into develop for the next UCO release
  • Backwards-compatible implementation merged into develop for the next CASE release
  • develop state with backwards-compatible implementation merged into UCO develop-2.0.0
  • develop state with backwards-compatible implementation merged into CASE develop-2.0.0
  • Backwards-incompatible implementation merged into develop-2.0.0 (N/A)
  • Milestone linked
  • Documentation logged in pending UCO release page
  • Documentation logged in pending CASE release page
@ajnelson-nist ajnelson-nist added this to the UCO 1.4.0 milestone Oct 28, 2024
ajnelson-nist added a commit that referenced this issue Oct 28, 2024
References:
* #640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/CASE-Archive that referenced this issue Nov 22, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue Nov 22, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Nov 22, 2024
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Nov 22, 2024
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Nov 22, 2024
References:
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue Nov 22, 2024
No effects were observed on Make-managed files.

References:
* casework/CASE#162
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Nov 22, 2024
No effects were observed on Make-managed files.

References:
* casework/CASE#162
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Nov 22, 2024
A follow-on patch will regenerate Make-managed files.

References:
* casework/CASE#162
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Nov 22, 2024
References:
* casework/CASE#162
* ucoProject/UCO#640

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
@ajnelson-nist
Contributor Author

I realized there is one point that wasn't discussed on the last OCs call, around entailment (/inferencing/knowledge expansion).

Using rdfs:subPropertyOf to link properties to core:informalType as a parent-property has an implication that is seen with either RDFS or OWL entailment. If we have this graph ...

ex:someThingType1
	a owl:DatatypeProperty ;
	rdfs:subPropertyOf core:informalType ;
	.

ex:someThingType2
	a owl:DatatypeProperty ;
	rdfs:subPropertyOf core:informalType ;
	.

kb:Thing-1
	ex:someThingType1 "Foo" ;
	ex:someThingType2 "Bar" ;
	.

... then applying either RDFS or OWL entailment would result in these extra statements being added to the knowledge base:

kb:Thing-1
	core:informalType
		"Bar" ,
		"Foo"
		;
	.

So, the end result of kb:Thing-1 would be the following after entailment:

kb:Thing-1
	ex:someThingType1 "Foo" ;
	ex:someThingType2 "Bar" ;
	core:informalType
		"Bar" ,
		"Foo"
		;
	.

I believe this is consistent with core:informalType only meaning the linked string proxies some type. Even if Foo and Bar are proxies for types from completely orthogonal type hierarchies, it seems fine to me that core:informalType by itself only notes that Foo and Bar are proxies for types in some unspecified hierarchy or hierarchies.
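
The entailment step can also be sketched outside of a reasoner. The following is a minimal illustration in plain Python, assuming triples are modeled as tuples and only the sub-property rule (rdfs7 in the RDF 1.1 Semantics: if p is a sub-property of q, then s p o entails s q o) is applied. The function name is hypothetical.

```python
def entail_subproperties(triples, sub_property_of):
    """Apply the sub-property rule until no new triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            for child, parent in sub_property_of:
                if p == child and (s, parent, o) not in inferred:
                    inferred.add((s, parent, o))
                    changed = True
    return inferred


# The rdfs:subPropertyOf statements from the example graph.
SUB_PROPERTY_OF = [
    ("ex:someThingType1", "core:informalType"),
    ("ex:someThingType2", "core:informalType"),
]

# The asserted statements about kb:Thing-1.
ASSERTED = {
    ("kb:Thing-1", "ex:someThingType1", "Foo"),
    ("kb:Thing-1", "ex:someThingType2", "Bar"),
}

entailed = entail_subproperties(ASSERTED, SUB_PROPERTY_OF)
informal_values = sorted(
    o for s, p, o in entailed
    if s == "kb:Thing-1" and p == "core:informalType"
)
```

After the rule runs, the core:informalType values alone no longer say which sub-property each string came from; that information stays available only through the original ex:someThingType1 and ex:someThingType2 statements.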

I also believe this entailment issue is low-risk to UCO at the moment. It would affect any class that happens to use two or more "informal type" properties simultaneously, but according to this query (run, like the query above, against the monolithic build made in the unit tests), that only happens twice:

PREFIX core: <https://ontology.unifiedcyberontology.org/uco/core/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sh: <http://www.w3.org/ns/shacl#>
SELECT DISTINCT ?nClass ?nProperty1 ?nProperty2
WHERE {
  ?nClass
    rdfs:subClassOf* core:UcoThing ;
    sh:property / sh:path ?nProperty1 ;
    sh:property / sh:path ?nProperty2 ;
    .

  ?nProperty1
    rdfs:subPropertyOf+ core:informalType ;
    .
  ?nProperty2
    rdfs:subPropertyOf+ core:informalType ;
    .
  # Casting to strings and using '<' both enforces distinctness, and
  # cuts away symmetric matches.
  FILTER (STR(?nProperty1) < STR(?nProperty2))
}
ORDER BY ?nClass ?nProperty1 ?nProperty2

Results:

| | ?nClass | ?nProperty1 | ?nProperty2 |
| 0 | observable:TriggerType | observable:triggerSessionChangeType | observable:triggerType |
| 1 | observable:WindowsServiceFacet | observable:serviceType | observable:startType |
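
The effect of the string-comparison FILTER in that query can be seen in a small Python sketch. It assumes a dictionary of class-to-property-path data in place of the SHACL shapes; ex:OneTypeOnly and ex:notInformal are hypothetical entries added to show non-matches.

```python
def informal_property_pairs(class_paths, informal):
    """List (class, property1, property2) rows where a class uses two
    distinct informal-type properties, keeping one row per pair."""
    rows = set()
    for cls, paths in class_paths.items():
        candidates = [p for p in paths if p in informal]
        for p1 in candidates:
            for p2 in candidates:
                # '<' drops p1 == p2 and keeps only one ordering of each pair.
                if p1 < p2:
                    rows.add((cls, p1, p2))
    return sorted(rows)


INFORMAL = {
    "observable:accountType",
    "observable:serviceType",
    "observable:startType",
    "observable:triggerSessionChangeType",
    "observable:triggerType",
}

# Property paths per class; ex: names are hypothetical.
CLASS_PATHS = {
    "observable:WindowsServiceFacet": [
        "observable:startType",
        "observable:serviceType",
        "ex:notInformal",
    ],
    "observable:TriggerType": [
        "observable:triggerType",
        "observable:triggerSessionChangeType",
    ],
    "ex:OneTypeOnly": ["observable:accountType"],
}

pairs = informal_property_pairs(CLASS_PATHS, INFORMAL)
```

Without the comparison, each class with two informal-type properties would yield four rows (two self-pairs and two mirrored pairs) instead of one.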

The properties observable:serviceType and observable:startType are undocumented, have no associated vocabularies, and to date haven't been publicly demonstrated (i.e., don't appear in the list of concepts ever used among CASE examples). I'm not sure how strange a pair of informalType values could end up looking on an observable:WindowsServiceFacet instance.

observable:TriggerType (which, as an aside, seems like it should have been named observable:Trigger) has a vocabulary associated with observable:triggerType, but no vocabulary associated with triggerSessionChangeType. As with serviceType and startType, none of these three concepts has been publicly demonstrated to date. So, I'm not sure how strange a pair of informalType values could look on an observable:TriggerType instance.

In summary, there seems to be low risk coming from RDFS or OWL entailment applied to this proposal.
