Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sauto times out on reasonably small example #77

Open
samuelgruetter opened this issue Jul 27, 2020 · 7 comments
Open

sauto times out on reasonably small example #77

samuelgruetter opened this issue Jul 27, 2020 · 7 comments
Labels
enhancement New feature or request

Comments

@samuelgruetter
Copy link

I ran into several issues with Coq's builtin firstorder tactic, so I was very excited when I saw the documentation of coqhammer's sauto tactic: A general proof search tactic that allows me to specify which lemmas to use, which inductives to split on, which constructors to use, etc seems to be exactly the tool I was looking for.

So I tried it out on the following example (extracted to be standalone from a compiler correctness proof I'm working on):

From Hammer Require Import Hammer.
From Hammer Require Import Tactics.

Axiom K: Type.
Axiom V: Type.
Axiom map: Type.
Axiom get: map -> K -> option V.
(* in putmany m1 m2, keys of m2 override those of m1 *)
Axiom putmany: map -> map -> map.
(* `split m m1 m2` means that m can be split into two disjoint maps m1 and m2 *)
Axiom split : map -> map -> map -> Prop.

Axiom split_spec: forall (M A B: map),
    split M A B <-> forall k,
      (exists v, get M k = Some v /\
                 ((get A k = Some v /\ get B k = None) \/
                  (get B k = Some v /\ get A k = None))) \/
      (get M k = None /\ get A k = None /\ get B k = None).

Axiom putmany_spec: forall (m1 m2: map) k,
    (exists v, get (putmany m1 m2) k = Some v /\
               (get m2 k = Some v \/ (get m2 k = None /\ get m1 k = Some v))) \/
    (get (putmany m1 m2) k = None /\ get m2 k = None /\ get m1 k = None).

Goal forall m mKeep mGive frame m2 mL mStack,
    split m mKeep mGive ->
    split m2 m mL ->
    split mL mStack frame ->
    split m2 (putmany mKeep mL) mGive.
Proof.
  intros *. intros A1 A2 A3.
  pose proof (proj1 (split_spec _ _ _) A1) as B1.
  pose proof (proj1 (split_spec _ _ _) A2) as B2.
  pose proof (proj1 (split_spec _ _ _) A3) as B3.
  apply (proj2 (split_spec _ _ _)).
  clear A1 A2 A3.

  pose proof (putmany_spec mKeep mL) as B4.

  sauto inv: - ctrs: - cases: - ecases: off.

However, this sauto invocation does not return within 30s.

So I thought that maybe I'm not using the options correctly, and that the hammer plugin would tell me the right options to use, so I installed vampire and ran hammer:

Extracting features...
Running provers (8 threads)...
cvc4: No such file or directory
Error: Error running cvc4.
cvc4: No such file or directory
Error: Error running cvc4.
Error: Error running eprover.
Vampire (nbayes-64) succeeded
- dependencies: B4, B2, hammer_tests.putmany_spec, B1
Reconstructing the proof...
Trying reconstruction batch 1...
Trying reconstruction batch 2...
Trying reconstruction batch 3...
Trying reconstruction batch 4...
Trying reconstruction batch 5...
Trying reconstruction batch 6...
Trying reconstruction batch 7...

Error:
Hammer failed: proof reconstruction failed.
You may try increasing the reconstruction time limit with 'Set Hammer ReconstrLimit N' (default: 5s).
Other options are to disable the ATP which found this proof (Unset Hammer CVC4/Vampire/Eprover/Z3), or try to prove the goal manually using the displayed dependencies. Note that if the proof found by the ATP is inherently classical, it can never be reconstructed with CoqHammer's intuitionistic proof search procedure. As a last resort, you may also try enabling legacy reconstruction tactics with 'From Hammer Require Reconstr'.

... but the way it invoked proof reconstruction failed as well (after ~20s or so).

On the other hand, the following script solves the goal quite quickly, which makes me think that this goal is "not so hard":

  intro k.
  specialize (B1 k).
  specialize (B2 k).
  specialize (B3 k).
  specialize (B4 k).
  Time firstorder congruence. (* 0.61 secs *)
Qed.
  • Is there a way to make sauto succeed on this goal?
  • Do you have a recommended way of debugging proof reconstruction failures?

Coqhammer version: Latest commit (28200f8) on the coq8.11 branch
Coq version: 8.11.0

@lukaszcz
Copy link
Owner

lukaszcz commented Jul 27, 2020

Thank you for this example, but even after a brief look I see it is probably one of the examples that "sauto" will have problems with.

I would say the deeper reason is probably that this seems to be a "forward" proof, and the proof search procedure of sauto tries to construct the proof "backward" (a bit like eauto), which might be very inefficient here. Especially that there are many universally quantified existentials and disjunctions, so it will just try to eliminate them a bit "blindly" with backtracking.

But that's not how lemmas are usually formulated in Coq libraries.

Even though the "core" underlying proof search procedure of "sauto" is complete for a first-order fragment (without the many heuristics that make if effective on typical Coq problems), I don't expect "sauto" to solve really "difficult" first-order problems (or even some not-so-difficult ones). It's more like a "supercharged" version of "auto".

You may also try "l: on", "q: on", "limit:" or "depth:", "exh:". Maybe it will work then. Especially "l: on" might (or might not) help, because it disables "einv:" and "sinv:" which may be a problem with many disjunctions.

Though if the "hammer" tactic failed at proof reconstruction, you probably won't manage to find options that work.

@samuelgruetter
Copy link
Author

Thanks for the quick reply! I played a bit more with l:, q:, limit:, and exh:, none of which made it work with what I tried.
But sauto depth: 3. succeeds in 35s (while sauto depth: 2. fails).

@lukaszcz
Copy link
Owner

lukaszcz commented Jul 27, 2020

In fact, the following works, which suggests that forward reasoning is the problem:

intro k.
specialize (B1 k).
specialize (B2 k).
specialize (B3 k).
specialize (B4 k).
sauto inv: - ctrs: - cases: -.

"sauto" takes 2sec here.

The "fcrush" tactic also works, but takes a few moments (5sec). "fcrush" is essentially "sauto" with more heuristic forward reasoning.

So "hammer" reconstruction might actually succeed if you increase proof reconstruction time limit.

@samuelgruetter
Copy link
Author

Ok, so if I do Set Hammer ReconstrLimit 20., hammer takes about 1 minute, and succeeds by telling me to use hecrush use: putmany_spec., which takes 6s.

@lukaszcz lukaszcz added the enhancement New feature or request label Jul 27, 2020
@lukaszcz
Copy link
Owner

The slow runtime is indeed not acceptable in this case. I'll leave this issue open as an enhancement, because fixing this requires more fundamental work on efficient forward-reasoning proof search in sauto/hammer.

@samuelgruetter
Copy link
Author

I see, thanks 👍

I also tried to use the hecrush that worked on the isolated standalone example in the actual context, and there it didn't finish for several minutes. That's something I also observed with Coq's firstorder tactic: It's fast on an isolated example, but as soon as you try to use it in on the "same" example but in a bigger context, it becomes infeasably slow.

For now I pushed the few experiments I did to https://github.com/mit-plv/bedrock2/tree/coqhammer

@lukaszcz
Copy link
Owner

lukaszcz commented Jul 30, 2020

I also tried to use the hecrush that worked on the isolated standalone example in the actual context, and there it didn't finish for several minutes. That's something I also observed with Coq's firstorder tactic: It's fast on an isolated example, but as soon as you try to use it in on the "same" example but in a bigger context, it becomes infeasably slow.

That's unsurprising, and to some extent unavoidable. If you add hypotheses to the context you increase the search space very fast - the problem is not even decidable.

It is also possible to combine "sauto" with "firstorder" or any other tactic using the "finish:", "final:", "simp:", "solve:" options. This goal can be solved in 3sec by:

qauto depth: 1 finish: (firstorder congruence).

That's the fastest combination I've come up with, after fiddling for a few minutes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants