-
Notifications
You must be signed in to change notification settings - Fork 353
Structural Diffs
Most tree diff implementations focus on XML, and there's a great overview of techniques in this blog post. Daniel Ehrenberg also has an expanded 7-page PDF of his blog post, which I've mirrored here (the storage service says the PDF is under a "Attribution Non-Commercial (BY-NC)" license).
A Jane Street intern built a tree differ using an A* algorithm.
This compares two s-expressions and builds a minimal new one with the new section marked with :date-switch
.
(Jane Street also has patdiff, but that
seems to be a line-oriented diff with some whitespace/integer display polish. It doesn't understand that
e.g. whitespace in "foo "
is meaningful).
json-diff provides a proper structural diff for JSON files.
graphtage compares structured data by parsing into a generic file format, then displaying a diff. It finds the optimal edit sequence, and even allows things like diffing JSON against YAML.
sdiff and diff-sexp explore s-expression oriented diffs.
Autochrome is a structural diff for Clojure, using Dijkstra pathfinding. It does not track moves.
psydiff and ydiff apply structural diffing to Python and Lisp respectively, and output an HTML page of the result. They consider ASTs excluding comments.
diffsitter uses tree sitter for parsing, then runs longest-common-subsequence on the leaves of the tree.
https://victorcmiraldo.github.io/data/MiraldoPhD.pdf (handles merging as well as diffing). Source referenced in the paper is https://github.com/VictorCMiraldo/hdiff/tree/v0.0.5 .
Victor has also published An Efficient Algorithm for Type-Safe Structural Diffing in ICFP 2019.
PLDI 2021: https://dl.acm.org/doi/10.1145/3453483.3454052
Builds a tree structure and diffs it for several programming languages: https://github.com/GumTreeDiff/gumtree