ttlser is the product of long frustration with the majority of commonly used Turtle serializers, which reorder triples on additions or deletions and so produce spurious diffs (see this blog post for an overview of the issues). The main use case motivating ttlser is to produce human-readable diffs of ontology files that display the meaningful changes and not reorderings. Specifically, ttlser was developed to minimize diffs for ttl files that are stored in git. For additional information on Turtle see the grammar.
- A single newline `\n` occurs after all lines.
- A second newline shall occur only in the following cases.
  - After the last line of the prefix section.
  - After every section header.
  - After the closing line (the one with a period `.`) of a `rdf:type` block.
- Indentation.
  - There shall be no indentation for `@prefix` lines.
  - There shall be no indentation for lines representing top level triples (e.g. `rdf:type` lines).
  - There shall be no indentation for section header lines.
  - Lines representing triples with lower priority predicates (e.g. `rdfs:subClassOf`) shall have one additional indentation block of 4 spaces preceding them in addition to the number of indentation blocks preceding the line for the highest priority triple with which they share their subject. For example a `rdfs:subClassOf` triple line sharing a subject with a top level `owl:Class` triple line should have exactly 1 indentation block of 4 spaces preceding the `r` in `rdfs:subClassOf`.
  - Elements of an `rdf:List` shall all have only 1 additional indentation block beyond that of a normal object.
- All opening parentheses shall occur on the same line as the subject they represent.
- All closing parentheses and brackets shall occur on the same line, each separated by a single space (lisp style).
- The opening parenthesis of an `rdf:List` shall be followed by a newline.
- Opening brackets shall NOT be followed by a newline.
- There shall be 1 space between subject, predicate, object, parenthesis, square brackets, `;`, and `.`.
- There shall be NO space preceding a comma `,` separating a list of predicate-objects sharing the same subject.
Alphabetical ordering in this document means the following.
- Orderings are defined over a set of string representations of the qname forms of subjects, predicates, or objects. Anonymous BNodes should be considered to be null and thus should not be considered when sorting alphabetically.
- Values that do not have a qname representation (e.g. `<http://example.org>` or `"Hello world"`) and that are not BNodes shall be taken as is.
- The ordering shall be a natural sort (such that `'a9'` comes before `'a10'` and `'a11111111'`) with an exception described in the next point.
- The ordering shall put `'a'` after `'A'` but before `'B'`. Essentially this can be interpreted to mean that capital vs lowercase should be ignored when ordering between different letters (e.g. `A` vs `B` or `c` vs `D`) but should be taken into account when breaking ties where there are two identical strings that differ only in their capitalization. This means that `'bb'` comes before `'BBb'`.
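The comparison rules above can be sketched as a Python sort key. This is a minimal illustration and not the code used by ttlser: the name `natural_key` is hypothetical, and placing digit runs before letter runs is an assumption that the rules above do not settle. BNode handling is left out.

```python
import re

# Capturing group so that re.split keeps the digit runs in the result.
_DIGITS = re.compile(r'([0-9]+)')


def natural_key(value):
    """Hypothetical sort key: natural sort, case-insensitive, case breaks ties."""
    chunks = []
    for i, chunk in enumerate(_DIGITS.split(value)):
        if not chunk:
            continue
        if i % 2:
            # Odd positions are the captured digit runs: compare as integers.
            chunks.append((0, int(chunk), ''))
        else:
            # Even positions are text: compare without regard to case.
            chunks.append((1, 0, chunk.lower()))
    # The raw string breaks ties, which puts 'A' before 'a'.
    return tuple(chunks), value


print(sorted(['a10', 'a11111111', 'a9'], key=natural_key))
print(sorted(['B', 'a', 'A'], key=natural_key))
print(sorted(['BBb', 'bb'], key=natural_key))
```

Splitting the key into a case-insensitive part and a raw-string tie-breaker keeps the two concerns separate: the first decides order between different letters, the second only matters when two strings differ in capitalization alone.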
- Class orderings and predicate orderings are as listed at the start of `CustomTurtleSerializer`. In theory these orderings could be maintained in a separate file that any conforming serializer could import.
- Likewise section headers are as specified in the `SECTIONS` portion of `CustomTurtleSerializer`.
- `@prefix` lines shall be ordered alphabetically by `(prefix, namespace)` pairs. For example `@prefix c: <http://c.org>` will precede `@prefix C: <http://cc.org>`. Another way to get the same ordering as using prefix namespace pairs is to sort the set of whole prefix lines alphabetically.
- Within a section (demarcated by a header) the ordering of entries shall first be in order of their top level class and then alphabetically.
- Orderings of the contents of `rdf:List`s shall be alphabetical.
- Orderings of `owl:Restriction`s shall be alphabetically first by the object of their `owl:onProperty` statement, then alphabetically by `owl:allValuesFrom` vs `owl:someValuesFrom`, then alphabetically by the `*ValuesFrom` object.
- Ordering of literals shall be by type in the following order: BooleanLiteral, NumericLiteral, RDFLiteral.
  - Ordering of BooleanLiterals shall be false, true.
  - Ordering of NumericLiterals shall be by pairs of `(numeric value, original string representation)`.
  - Ordering of RDFLiterals shall be alphabetical by the triple `(value, datatype, language)` with the empty string `''` being substituted for either datatype or language if either is missing.
- Ordering of elements at any given nesting level (skipping invalid combinations such as a literal as a subject) shall be: literal, iri, blankNodePropertyList, collection. For example in a collection the order would be `(true 1 1.0 1.00 1e+00 "1" A:1 <http://a.org/1> [ a owl:Thing ] (1 "1"))`.
- Ordering of `owl:Axiom`s shall be by the triple of objects for `(owl:annotatedSource owl:annotatedProperty owl:annotatedTarget)`.
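The literal ordering rules can be illustrated with a small sort key. This is a sketch under stated assumptions and not ttlser's implementation: `Lit`, `KIND_RANK`, and `literal_key` are hypothetical names, literals are assumed to be already classified by kind, numeric values are compared as Python floats, and plain string comparison stands in for the alphabetical ordering defined earlier in this document.

```python
from typing import NamedTuple, Optional


class Lit(NamedTuple):
    """Hypothetical stand-in for a parsed Turtle literal."""
    kind: str                       # 'boolean', 'numeric', or 'rdf'
    text: str                       # lexical form as written in the file
    datatype: Optional[str] = None
    language: Optional[str] = None


# BooleanLiteral, then NumericLiteral, then RDFLiteral.
KIND_RANK = {'boolean': 0, 'numeric': 1, 'rdf': 2}


def literal_key(lit):
    rank = KIND_RANK[lit.kind]
    if lit.kind == 'boolean':
        # False sorts before True, so 'false' precedes 'true'.
        return (rank, lit.text == 'true')
    if lit.kind == 'numeric':
        # Numeric value first, original spelling as the tie-breaker.
        return (rank, float(lit.text), lit.text)
    # A missing datatype or language is replaced by the empty string.
    return (rank, lit.text, lit.datatype or '', lit.language or '')


literals = [
    Lit('rdf', '1'),
    Lit('numeric', '1e+00'),
    Lit('numeric', '1.00'),
    Lit('boolean', 'true'),
    Lit('numeric', '1'),
    Lit('numeric', '1.0'),
]
print([lit.text for lit in sorted(literals, key=literal_key)])
# ['true', '1', '1.0', '1.00', '1e+00', '1']
```

The printed order matches the literal portion of the collection example above: the boolean first, the four spellings of the number one ordered by their original string, and the string literal `"1"` last.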
This is currently implemented in `serializers.py` by finding a total ordering on all URIs and Literals, and then using the ranks on those nodes to calculate ranks for any BNode that is their parent. This is done using a fixed-point function on the ranks of BNodes. This provides a global total ordering for all triples that can then be used to produce deterministic output recursively. Ordering rules involving predicate precedence are implemented by selecting the order in which predicates or groups of predicates appear in the list at the beginning of `CustomTurtleSerializer`.
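One way to picture the fixed-point step is the simplified sketch below. It is not the code in `serializers.py`, and the name `rank_nodes` is hypothetical. It assumes that terms are plain strings, that BNodes are recognized by a `_:` prefix, that named nodes are ranked with Python's default string ordering rather than the alphabetical ordering defined above, and that BNodes sort after named nodes. BNodes that take part in a cycle are never ranked by this sketch.

```python
def rank_nodes(triples):
    """Hypothetical helper: map every rankable term to a comparable sort key.

    triples is a list of (subject, predicate, object) string tuples.
    """
    def is_bnode(term):
        return term.startswith('_:')

    terms = {term for triple in triples for term in triple}

    # Step 1: a total ordering on all named nodes and literals.
    named = sorted(term for term in terms if not is_bnode(term))
    ranks = {term: (0, i) for i, term in enumerate(named)}

    # Step 2: rank each BNode from the ranks of what it points to,
    # repeating until a full pass ranks nothing new (the fixed point).
    pending = {term for term in terms if is_bnode(term)}
    changed = True
    while changed:
        changed = False
        for bnode in sorted(pending):
            children = [(p, o) for s, p, o in triples if s == bnode]
            if all(o in ranks for _, o in children):
                signature = tuple(sorted((ranks[p], ranks[o]) for p, o in children))
                ranks[bnode] = (1, signature)
                pending.discard(bnode)
                changed = True
    return ranks


triples = [
    ('ex:A', 'rdfs:subClassOf', '_:b1'),
    ('_:b1', 'owl:onProperty', 'ex:p'),
    ('_:b1', 'owl:someValuesFrom', '_:b2'),
    ('_:b2', 'rdf:type', 'owl:Class'),
    ('ex:A', 'rdfs:subClassOf', '_:b0'),
    ('_:b0', 'owl:onProperty', 'ex:a'),
]
ranks = rank_nodes(triples)
objects = [o for s, p, o in triples if s == 'ex:A']
print(sorted(objects, key=ranks.__getitem__))
```

Because a BNode's key is built only from the keys of its predicates and objects, relabeling the BNodes does not change the resulting order, which is the property that keeps diffs stable.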