Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ensure transitive property for hash value comparisons #388

Closed
wants to merge 3 commits into from

Conversation

filipesilva
Copy link
Contributor

Fix #386

@filipesilva filipesilva force-pushed the transitive-hash-comp branch from 719f122 to 7078de7 Compare March 5, 2021 21:24
@filipesilva filipesilva force-pushed the transitive-hash-comp branch from 7078de7 to 466cf0d Compare March 5, 2021 21:27
@tonsky
Copy link
Owner

tonsky commented Mar 5, 2021

Can you fix it for clj version too?

@tonsky
Copy link
Owner

tonsky commented Mar 5, 2021

Does your test fail without the fix? I was under impression that entities were missing in broken DB, but you are checking that after deleting everything in DB it remains empty.

@filipesilva
Copy link
Contributor Author

filipesilva commented Mar 5, 2021

Yes, I did verify the test fails before the fix. Verified it again just now for sanity, and that seq is ([:block/uid "wRmp6bXAx"] [:block/uid "rfL-iQOZm"]) (instead of empty) before the fix. These are the two ents that cannot be looked up due to the bug.

Regarding CLJ support, I imagine it'd look like this:

(defn value-compare [x y]
  (cond
    (= x y) 0
    (not (identical? (type x) (type y)))  (compare (type x) (type y))
    #?@(:clj  [(instance? Number x)       (clojure.lang.Numbers/compare x y)])
    #?@(:clj  [(instance? Comparable x)   (.compareTo ^Comparable x y)]
        :cljs [(satisfies? IComparable x) (-compare x y)])
    #?@(:cljs [(or (string? x) (array? x) (true? x) (false? x)) (garray/defaultCompare x y)])
    :else (- (hash x) (hash y))))

WDYT?

@tonsky
Copy link
Owner

tonsky commented Mar 5, 2021

  1. I think we should replace type with class (type returns something from metadata or class, class is always a class).
  2. Figure out how to compare classes (they are not comparable).
  3. Type comparison should go after Comparable check, otherwise we’ll fail to compare Int and Long, Vector and Seq etc
user=> (compare (type 1) (type "a"))
Execution error (ClassCastException) at user/eval15 (REPL:1).
class java.lang.Class cannot be cast to class java.lang.Comparable (java.lang.Class and java.lang.Comparable are in module java.base of loader 'bootstrap')

@filipesilva
Copy link
Contributor Author

  1. Will do class for CLJ, but CLJS doesn't have that fn.
  2. WDYT of (compare (str (class x)) (str (class y)))?
  3. There might be some extra subtlety there, because string is comparable but will throw when compared to number
(instance? Comparable "abc")
true
(.compareTo "abc" 3)
Execution error (ClassCastException) at user/eval17 (REPL:1).
class java.lang.Long cannot be cast to class java.lang.String (java.lang.Long and java.lang.String are in module java.base of loader 'bootstrap')

@filipesilva
Copy link
Contributor Author

Another problematic case, this one for CLJS

(satisfies? IComparable [1 2 3]) ;; => true
(-compare [1 2 3] 2) ;; => error, cannot compare

@filipesilva
Copy link
Contributor Author

Push a change that checks for different classes after the comparable check, but also catches errors and checks for different classes there too. LMK what you think.

@tonsky
Copy link
Owner

tonsky commented Mar 5, 2021

Judging from http://www.docjar.com/html/api/java/lang/Class.java.html, Class toString allocates new string, which is slow. Better use Class.getName, which doesn’t allocate.

As for CLJS, I wonder what type result of type is? But there’s type->str, so maybe we can just use that? https://github.com/clojure/clojurescript/blob/69b66374d857a176c5a9cd8cc0cbb7506876640d/src/main/cljs/cljs/core.cljs#L328-L331

Otherwise, looks good. Nice touch on try/catch!

@filipesilva
Copy link
Contributor Author

On CLJS type will return the constructor itself, so I'm kinda wary of using type->str because I (maybe?) makes a new string, but at any rate two fns with the same name and body will be the same string, which sounds wrong. garray.defaultCompare will just really call > and that seems to compare the fns in a consistent way anyway, so I used it. If you'd still like me to use type->str though, happy to oblige.

For CLJ it does sound better to use get name, will change that.

@tonsky
Copy link
Owner

tonsky commented Mar 5, 2021

Looks good. I’m curious about function comparsion, asked here https://twitter.com/nikitonsky/status/1367984972519313410?s=20. Probably by tomorrow we’ll have an answer, I’ll merge it then.

Thanks for the great work and contribution! I’m glad we got to the bottom of this.

@tonsky
Copy link
Owner

tonsky commented Mar 6, 2021

Ok, seems like comparing functions is going through toString()

One problem here is that == doesn’t go through toString, but < > <= >= do:

> let a = () => 1;
undefined

> let b = () => 1;
undefined

> a < b
false

> b < a
false

> a <= b
true

> b <= a
true

> a == b
false

> a === b
false

Another is that in some cases functions return [native code] as their body, makign toString comparison return false positive:

> (function f(){ return 1; }.bind(0)).toString()
"function() {
    [native code]
}"

> (function f(){ return 2; }.bind(0)).toString()
"function() {
    [native code]
}"

> (function f(){ return 1; }.bind(0)) < (function f(){ return 2; }.bind(0))
false

> (function f(){ return 1; }.bind(0)) > (function f(){ return 2; }.bind(0))
false

> (function f(){ return 1; }.bind(0)) == (function f(){ return 2; }.bind(0))
false

I think we can go through (garray/defaultCompare (type->str (type a)) (type->str (type b))). It will make functions returning [native code] being equal. That’s unfortunate, but I don’t see how we can’t work around it.

I’ll add the change and merge the PR now.

@tonsky
Copy link
Owner

tonsky commented Mar 6, 2021

Merged and released 1.0.6. Thanks again!

@tonsky tonsky closed this Mar 6, 2021
@tonsky
Copy link
Owner

tonsky commented Mar 6, 2021

Oops, I mean 1.0.7 :)

@filipesilva
Copy link
Contributor Author

I think the end result of this JS behaviour is not a problem, because we only compare types when they are not identical, and identical doesn't seem to suffer from the problem of == as you described.

Which is to say, when we compare we already know they are not identical, and JS may give false positives ("the comparator ret for these two is 0") on the >/< check, but it doesn't give false negatives ("this one is bigger than that one"). Which, as I'm writing this I realise, means it could lead to things not being inserted in the index at all.

Not sure what the best thing to do in JS would be here really. But I think these changes are a step forward regardless.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

avet index out of sync with datoms
2 participants