error of unhashable type #259

Open
behrica opened this issue Dec 8, 2023 · 15 comments

@behrica
Contributor

behrica commented Dec 8, 2023

Doing this

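(require '[libpython-clj2.require :as pyreq])
(pyreq/require-python 'builtins)          ; needed for builtins/tuple below
(pyreq/require-python 'sklearn.datasets)
(def newsgroups (sklearn.datasets/fetch_20newsgroups :subset "all" :remove (builtins/tuple ["headers" "footers" "quotes"])))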

and then opening newsgroups in the cider-inspector gives an error.
Seems to happen only in cider-inspector ...

user> *e
;; => #error {
 :cause "TypeError: unhashable type: 'numpy.ndarray'\n"
 :via
 [{:type clojure.lang.ExceptionInfo
   :message nil
   :data #:clojure.error{:phase :print-eval-result}
   :at [clojure.main$repl$read_eval_print__9206 invoke "main.clj" 442]}
  {:type java.lang.Exception
   :message "TypeError: unhashable type: 'numpy.ndarray'\n"
   :at [libpython_clj2.python.ffi$check_error_throw invokeStatic "ffi.clj" 707]}]
 :trace
 [[libpython_clj2.python.ffi$check_error_throw invokeStatic "ffi.clj" 707]
  [libpython_clj2.python.ffi$check_error_throw invoke "ffi.clj" 705]
  [libpython_clj2.python.base$hash_code invokeStatic "base.clj" 180]
  [libpython_clj2.python.base$hash_code invokePrim "base.clj" -1]
  [libpython_clj2.python.bridge_as_jvm$generic_pyobject$reify__23789 hashCode "bridge_as_jvm.clj" 231]
  [clojure.lang.Util hasheq "Util.java" 173]
  [clojure.lang.Murmur3 hashOrdered "Murmur3.java" 107]
  [clojure.lang.ASeq hasheq "ASeq.java" 91]
  [clojure.lang.Util dohasheq "Util.java" 177]
  [clojure.lang.Util hasheq "Util.java" 168]
  [clojure.lang.PersistentHashMap hash "PersistentHashMap.java" 120]
  [clojure.lang.PersistentHashMap$TransientHashMap doAssoc "PersistentHashMap.java" 327]
  [clojure.lang.ATransientMap assoc "ATransientMap.java" 64]
  [clojure.lang.PersistentHashMap create "PersistentHashMap.java" 56]
  [clojure.lang.PersistentHashMap create "PersistentHashMap.java" 100]
  [clojure.lang.PersistentArrayMap createHT "PersistentArrayMap.java" 64]
  [clojure.lang.PersistentArrayMap assoc "PersistentArrayMap.java" 258]
  [clojure.lang.PersistentArrayMap assoc "PersistentArrayMap.java" 30]
  [clojure.lang.RT assoc "RT.java" 827]
  [clojure.core$assoc__5481 invokeStatic "core.clj" 193]
  [clojure.core$assoc__5481 invoke "core.clj" 192]
  [clojure.lang.Atom swap "Atom.java" 65]
  [clojure.core$swap_BANG_ invokeStatic "core.clj" 2371]
  [clojure.core$memoize$fn__6946 doInvoke "core.clj" 6388]
  [clojure.lang.RestFn invoke "RestFn.java" 421]
  [orchard.inspect$eval7170$fn__7175$fn__7188 invoke "inspect.clj" 660]
  [clojure.core$group_by$fn__8597 invoke "core.clj" 7224]
  [clojure.core.protocols$fn__8249 invokeStatic "protocols.clj" 168]
  [clojure.core.protocols$fn__8249 invoke "protocols.clj" 124]
  [clojure.core.protocols$fn__8204$G__8199__8213 invoke "protocols.clj" 19]
  [clojure.core.protocols$seq_reduce invokeStatic "protocols.clj" 31]
  [clojure.core.protocols$fn__8236 invokeStatic "protocols.clj" 75]
  [clojure.core.protocols$fn__8236 invoke "protocols.clj" 75]
  [clojure.core.protocols$fn__8178$G__8173__8191 invoke "protocols.clj" 13]
  [clojure.core$reduce invokeStatic "core.clj" 6886]
  [clojure.core$group_by invokeStatic "core.clj" 7214]
  [clojure.core$group_by invoke "core.clj" 7214]
  [orchard.inspect$eval7170$fn__7175 invoke "inspect.clj" 658]
  [clojure.lang.MultiFn invoke "MultiFn.java" 234]
  [orchard.inspect$inspect_render invokeStatic "inspect.clj" 792]
@behrica changed the title from "error of unhahable type" to "error of unhashable type" on Dec 8, 2023
@behrica
Contributor Author

behrica commented Dec 8, 2023

Root cause is this:

(def newsgroups (sklearn.datasets/fetch_20newsgroups :subset "all" :remove (builtins/tuple [ "headers" "footers" "quotes"])))
(.hashCode newsgroups)

failing with:

 Unhandled java.lang.Exception
   TypeError: unhashable type: 'Bunch'

@behrica
Contributor Author

behrica commented Dec 8, 2023

The prevailing opinion in the Java community seems to be that .hashCode implementations should never throw exceptions.

@behrica
Contributor Author

behrica commented Dec 8, 2023

But this seems to be a special case in Python, where a Python type is not hashable.
Bunch extends dict, and dict is not hashable in Python.
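
To make the Python side concrete, here is a minimal sketch in plain Python. FakeBunch is a hypothetical stand-in for sklearn's Bunch (the only fact taken from above is that Bunch extends dict); it suggests that a dict subclass inherits the unhashable behavior unless it defines __hash__ itself.

```python
class FakeBunch(dict):
    """Hypothetical stand-in for a dict subclass such as sklearn's Bunch."""


def hash_error(obj):
    """Return the TypeError message raised by hash(obj), or None if hashing works."""
    try:
        hash(obj)
    except TypeError as exc:
        return str(exc)
    return None


# dict sets __hash__ to None, and subclasses inherit that setting.
print(dict.__hash__)               # None
print(FakeBunch.__hash__)          # None
print(hash_error({"a": 1}))        # an "unhashable type" message
print(hash_error(FakeBunch(a=1)))  # an "unhashable type" message
print(hash_error("text"))          # None, because str is hashable
```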

@jjtolton
Contributor

jjtolton commented Dec 9, 2023

Interesting. I can tell you that the user interface philosophy so far has been:

  1. Default to Clojure idioms, unless adopting Clojure idioms would prevent certain Python behavior -- e.g., automatically casting a Python list to a vector would not allow using .append() style methods on the Python list.
  2. Allow opt-in Python idioms where appropriate, e.g., :bind-ns allows a user to have a Python module bound to a Clojure namespace symbol.

This is the first time I'm aware of that there has been a conflict with a Java idiom. For instance, hash([]) throwing an error is expected Python behavior. I suppose the acceptable solution would be: "tools that expect Java objects should get objects that behave like Java objects, but Python code that expects Python objects should get objects that behave like Python objects." I'm sure there's a more elegant way to phrase that, and I can already see the conceptual difficulty with figuring out how to approach the problem.

The simple approach would be to patch the hashing behavior for Java, so maybe that's best. I don't think many libpython-clj users would be overly upset that calling hash on an unhashable object would return nil rather than throw an error.

@cnuernber
Collaborator

Hmm. Or embrace and extend the python dict type somehow to support clojure's algorithm for hashing.

@behrica
Contributor Author

behrica commented Dec 9, 2023

I think returning 0 is also acceptable for a Java hashCode implementation (better than nil).

@behrica
Contributor Author

behrica commented Dec 9, 2023

There are some reports that Clerk cannot render the "newsgroups" objects; not sure if it is for the same reason.
https://clojurians.slack.com/archives/CLR5FD4ET/p1701986985003569

@behrica
Contributor Author

behrica commented Dec 9, 2023

Maybe the code here:

(throw (Exception. ^String error-str))))

should catch "TypeError: unhashable type"
and not throw, but return 0 instead.

Just to address the point that being non-hashable is "expected" in Python, while in Java it is not.

@jjtolton
Contributor

jjtolton commented Dec 9, 2023

Hmm. Or embrace and extend the python dict type somehow to support clojure's algorithm for hashing.

As a hacker, I would love this approach and I think it would fulfill the original intent of introspectable datastructures. Less hacker-inclined devs may fume a bit at the potential implications of sets of mutable dicts and dicts as keys, and other potential footguns. Not sure what the effort would be to make this behavior opt-in or toggle-able.

@behrica
Contributor Author

behrica commented Dec 9, 2023

As a far simpler example, we can take this for discussion:

(hash
 (py/->py-dict {:a 1}))

This should NOT fail in my view, but it does:


1. Unhandled java.lang.Exception
  TypeError: unhashable type: 'dict'


                  ffi.clj:  707  libpython-clj2.python.ffi/check-error-throw
                  ffi.clj:  705  libpython-clj2.python.ffi/check-error-throw
                 base.clj:  180  libpython-clj2.python.base/hash-code
                 base.clj:   -1  libpython-clj2.python.base/hash-code
        bridge_as_jvm.clj:  231  libpython-clj2.python.bridge-as-jvm/generic-python-as-map/reify
                Util.java:  173  clojure.lang.Util/hasheq
                 core.clj: 5197  clojure.core/hash
                 core.clj: 5190  clojure.core/hash
                     REPL:   45  testLibPy.testLibPy/ev
                     

@behrica
Contributor Author

behrica commented Dec 9, 2023

In my view, in the same way we have a default behavior for toString:

(str
 (py/->py-dict {:a 1}))

which should return "a string" for any libpython-clj object, we should return "a number" for any libpython-clj object when .hashCode is called on it
(and similarly for equals()).

Which algorithm to use to calculate the hash code is then a less important consideration
(returning 0 would already be better than throwing an exception).

@jjtolton
Contributor

jjtolton commented Dec 9, 2023

Well this discussion has also inspired me to open a new issue, because analytically I think a very important tool that is currently missing is the equivalent of py->clj and clj->py, analogous to js->clj and clj->js. Then it would be rather straightforward to do the (.hashCode (py->clj dict)). The obvious issue of course is that there is not a 1:1 correspondence, and it may be only marginally more useful than casting to json.

The issue with str, @behrica, is that Python objects are free to implement (or not) their own str implementation, and there would be a lot of unhelpful and borderline random behavior when using str as a hashing key.

@behrica
Contributor Author

behrica commented Dec 9, 2023

I am not suggesting using str as a hashing key.
In my view, we should catch "TypeError: unhashable type"
here:

(throw (Exception. ^String error-str))))

and either:
- return 0 as the hash code, or
- return the id() of the object as the hash code.

Both will comply with the hashCode/equals rules of Java, I believe:
https://www.baeldung.com/java-equals-hashcode-contracts
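
The two fallbacks above can be sketched in plain Python rather than in the libpython-clj source. The names safe_hash and to_jvm_int are hypothetical and not part of the library, and folding the value into a signed 32-bit range is an assumption of this sketch about what a Java int needs. One caveat: the id() variant only satisfies the hashCode/equals contract if equality for such objects is identity-based. If two distinct dicts with the same content compare equal, the constant 0 is the variant that stays consistent.

```python
def to_jvm_int(value):
    """Fold an arbitrary Python int into the signed 32-bit range of a Java int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def safe_hash(obj, use_id=False):
    """Hash obj, falling back to 0 (or id(obj)) when Python reports it as unhashable."""
    try:
        return to_jvm_int(hash(obj))
    except TypeError:
        # Unhashable in Python (dict, list, ...): never raise to the caller.
        return to_jvm_int(id(obj)) if use_id else 0


d1, d2 = {"a": 1}, {"a": 1}
print(safe_hash(d1))                   # 0
print(safe_hash(d1) == safe_hash(d2))  # True: equal dicts get equal hashes
print(safe_hash(d1, use_id=True) == safe_hash(d1, use_id=True))  # True: stable per object
```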

@behrica
Contributor Author

behrica commented Dec 9, 2023

Or check here:

whether the Python object is hashable:
the attribute "__hash__" can be checked for being nil, which is the case for unhashable objects.

(->
 (py/->python "")
 (py/get-attr "__hash__"))
;; => #object[tech.v3.datatype.ffi.Pointer 0x5406e4ba "{:address 0x00007EFD4368E8B0 }"]
;;
(->
 (py/->py-dict {:a 1})
 (py/get-attr "__hash__"))
;; => nil
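
The same check can be expressed in plain Python, as a rough sketch (looks_hashable is a hypothetical helper name). Note the limitation: a non-None __hash__ is necessary but not sufficient, since a tuple defines __hash__ yet hashing it still raises when it contains an unhashable element. A check like this would therefore probably still need a guard around the actual hash call.

```python
def looks_hashable(obj):
    """True if the object's type defines a __hash__ that is not None."""
    return getattr(type(obj), "__hash__", None) is not None


print(looks_hashable(""))         # True
print(looks_hashable({"a": 1}))   # False
print(looks_hashable((1, [2])))   # True, although hashing it fails

try:
    hash((1, [2]))
except TypeError as exc:
    # The tuple passes the attribute check but its list element is unhashable.
    print("still raises:", exc)
```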

@behrica
Contributor Author

behrica commented Jan 16, 2024

This issue prevents Clerk and libpython-clj from working well together.
Clerk relies on "value hashing" for caching, and crashes when the hash calculation of a value throws an exception.
