Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix matching of literals in dynamic rules #258

Open
wants to merge 2 commits into
base: ale/3.0
Choose a base branch
from

Conversation

gkronber
Copy link
Collaborator

The literal_position returned by the ematcher may be invalidated by merging of two eclasses in another rule.
This PR changes the ematcher rules to always return the hash of the enode (even for literals).

@codecov-commenter
Copy link

codecov-commenter commented Jan 23, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.34%. Comparing base (b960db9) to head (54be8da).

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff             @@
##           ale/3.0     #258      +/-   ##
===========================================
- Coverage    81.42%   81.34%   -0.09%     
===========================================
  Files           19       19              
  Lines         1497     1501       +4     
===========================================
+ Hits          1219     1221       +2     
- Misses         278      280       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

Benchmark Results

egg-sym egg-cust MT@54be8dadf71... MT@b960db95f36... egg-sym/MT@54b... egg-cust/MT@54... MT@b960db95f36...
egraph_addexpr 1.43 ms 5.57 ms 5.22 ms 0.257 0.938
basic_maths_simpl2 13.7 ms 5.28 ms 22.2 ms 21.1 ms 0.618 0.238 0.952
prop_logic_freges_theorem 2.52 ms 1.56 ms 1.1 ms 1.06 ms 2.29 1.42 0.964
calc_logic_demorgan 60.6 μs 34.4 μs 77.9 μs 75.3 μs 0.778 0.441 0.966
calc_logic_freges_theorem 22.4 ms 11.5 ms 46.8 ms 45.5 ms 0.478 0.245 0.973
basic_maths_simpl1 6.38 ms 2.89 ms 4.78 ms 4.73 ms 1.33 0.604 0.99
egraph_constructor 0.0886 μs 0.0983 μs 0.0989 μs 0.901 1.01
prop_logic_prove1 36.5 ms 14 ms 48.7 ms 43.9 ms 0.75 0.288 0.901
prop_logic_demorgan 79.8 μs 45.1 μs 95.5 μs 93.1 μs 0.836 0.473 0.975
while_superinterpreter_while_10 18.3 ms 17.9 ms 0.979
prop_logic_rewrite 114 μs 116 μs 1.01
time_to_load 122 ms 121 ms 0.991

Benchmark Plots

A plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

@@ -144,15 +144,11 @@ end
Instantiate argument for dynamic rule application in e-graph
"""
function instantiate_actual_param!(bindings, g::EGraph, i)
const_hash = v_pair_last(bindings[i])
const_hash == 0 || return get_constant(g, const_hash)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could introduce a hash collision if the hash of some literal value is 0

Copy link
Collaborator Author

@gkronber gkronber Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True.
Alternative: we could set v_pair_first to a negative or zero value to indicate that this is a literal match, because we do not need the eclassid for literals when we return the hashvalue in v_pair_last.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zero value would work I believe

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to change it, but appearently we require the ecid (for literal matches) for this function:

instantiate_enode!(bindings, @nospecialize(g::EGraph), p::PatVar)::Id = v_pair_first(bindings[p.idx])

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be extremly unlikely to produce a hash collision as the number of literals is expected to be small compared to the number of enodes.

Copy link
Collaborator Author

@gkronber gkronber Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought a bit more about the hash collision argument. If we are concerned about hash collisions for literals, then we would need to change the definition of the egraph type, and all handling of literals, as we always just use the hash for literals.

mutable struct EGraph{ExpressionType,Analysis}
...
  "Hashcons the constants in the e-graph"
  constants::Dict{UInt64,Any}
...
end

quote
id = $(Symbol(:σ, addr))
eclass = g[id]
node_idx = $(Symbol(:enode_idx, addr)) - 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably get rid of the idx thing and just store the hash?

Copy link
Collaborator Author

@gkronber gkronber Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean to track hashes instead of enode_idxs in the pattern matcher?
Seems not straight-forward as the ematcher seems to require the enode_idxs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about this a bit, and we probably don't need the indexes, but just the hashes. If so, we could also avoid using UInt128 in the buffer and pairing e-class ids with indexes in the buffer. As usual, if an element of the buffer has all bits set to 1 then it's a delimiter between different matches (of many variables). If it has all bits set to 0 then it could simply be discarded and signal eqsat_apply! that the next element is a literal hash and not an e-class ID.

I can experiment and push to a branch :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants