Ensure complete coverage when cycles exist #328
Draft
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
WIP: So far this PR only contains a test triggering the problem.
Problem
The resolution behavior w.r.t. cycles is not always preserved when we go through partial paths compared to resolving in the graph directly.
Take the following example:
If we resolve
foo
in the graph directly, we get thefoo
definition as expected. Because we resolve without a precondition variable, following the cycle is safe: we can only follow it as long as there are symbols on the stack to consume.On the other hand, if we compute partial paths first (using defs, refs, root, and exported scopes as end points), things don't work out. The partial path from
foo
to root is okay. But the other path(s) would have to be from root tofoo
, including the cycle. One path excludes the cycle, but is not enough to resolve thefoo
reference later. Any path involving the cycle will be rejeected, because the partial path does have a precondition variable, so we could follow it arbitrary number of times.(One special case is supported, when the cycle coincides exactly with an accepted path. So if scope 1 was an exported scope, the cycle would be included as a partial path. But this requires a lot of care when defining the stack graph rules to ensure this is the case.)
Goal
The overall goal of partial paths is that they are purely an optimization. Precomputing partial data should not make some solutions unavailable. This gives us the following two requirements:
If a path is produced in the full graph, it should also be produced by stitching the partial paths. Caveat: Partial path sets sometimes restrict the kinds of start or end nodes of paths, in which case this holds only for paths in the full graph that also follow that restriction.
We must ensure that we do not accidentally introduce multiple ways of stitching the same path. This could cause exponential blowup during path stitching.
Solution
How can we solve this? We'll start with a naive solution that will address the first requirement, but violate the second. Then we describe a modification that fixes that.
The idea is to implicitly split a path involving a cycle in components, and keep those components separately, instead of rejecting their compositions, as we do now. So for our example, we'll encounter the following path during partial path finding:
We'll decompose this into the following paths:
<%1> ($1) | [root] -> [scope 1] | <%1> ($1)
<"bar",%1> ($1) | [scope 1] -> [pop "bar"] -> [scope 1] | <%1> ($1)
<%1> ($1) | [scope 1] | <%1> ($1)
.The prefix and the cycle itself are added to the partial path set, but not further extended. The postfix seed is not added to the partial path set itself, but extended and added to the set when it hits an accepted endpoint.
This solution fulfills requirement 1 but violates requirement 2. To see why this is the case, we have to realize that the prefix has been encountered before during path finding, and will be extended to
[def "foo"]
. Similarly, the new seed will also be extended to[def "foo"]
. So there will be two ways to construct the path[root] -> [scope 1] -> [def "foo"]
. Once directly, and once by combining the prefix and the extended seed.To solve this, we need to ensure that the prefix and seed can only be combined, if the cycle was involved in the path as well. We can achieve this by picking two new, unique symbol q. Using these we construct the paths above with modified pre- and postconditions.
<"bar",%1> ($1) | [root] -> [scope 1] -> [pop "bar"] -> [scope 1] | <%1,q> ($1)
<q,"bar",%1> ($1) | [scope 1] -> [pop "bar"] -> [scope 1] | <q,%1> ($1)
<q,%1> ($1) | [scope 1] | <%1> ($1)
.As before, the seed is the only path that is further extended. The new symbol work as a guard, ensuring these paths cannot be combined with other paths in the set, but only with each other for the specific case where the cycle is involved. When paths are stitched together, the q symbol will disappear, resulting in the same path as would be produced by resolving directly in the graph.
In this example, it might still work correctly if the q symbol is not used, because
[scope 1]
is not an endpoint for the partial path set we're computing. However, if the cycle can start and end in a node that is a possible endpoint (let's say it was marked as exported), the q precondition will prevent stitching[ref "foo"] -> [root] -> [scope 1]
with extensions of the seed.