Description
clara.rules.platform/group-by-seq
can cause a subtle defect when Clojure record are involved. More generally, the problem is that c.r.p/group-by-seq
uses a java.util.LinkedHashMap
to perform the grouping operation. This is a Java-based collection type and is only aware of Java based equality semantics, ie Object.equals()
and Object.hashCode()
.
Clojure has it's own equality semantics taken into account in =
and hash
. There are some occasions where the difference between the Clojure vs Java equality semantics can cause functionally different behavior.
c.r.p/group-by-seq
is used internally within the Clara engine for various performance-sensitive operations. The j.u.LinkedHashMap
was chosen to be used within c.r.p/group-by-seq
to streamline this performance-sensitive operation. a j.u.LinkedHashMap
is convenient since it is mutable, for fast building of the ephemeral data structure needed. It also provides insert-order ordering determinism. This is useful to guarantee consistent execution times through the Clara engine from one JVM instance to the next, given the same input.
A concrete example of a problem with j.u.LinkedHashMap
's use of Java-based equality semantics has came up with regard to Clojure record types.
Clojure record types use their type information in clojure.core/=
comparisons and clojure.core/hash
. However, they also are meant to interop with Java, that is unaware of their record types, and instead they pose as plain java.util.Map
impl's from the Java interop perspective. This means that Clojure records Object.equals()
and Object.hashCode()
implementations do not take the type of records into account.
e.g.
(defrecord Foo [])
(defrecord Bar[])
(= (->Foo) (->Bar)) ;;= false
(.equals (->Foo) (->Bar)) ;;= true
This is a problem in the Clara engine. One example is that the engine it uses c.r.p/group-by-seq
on :bindings
of a rule. The :bindings
may contained Clojure records. The Clojure records should be compared with Clojure equality semantics, but instead they are using Java semantics.
@thegeez in the #clara Slack channel gave a reproducing case that can show this defect manifest itself. I'll copy it here:
(defrecord ClauseOne [b])
(defrecord ClauseTwo [b])
(defrule rule-one
[ClauseOne (= ?b b)]
[ClauseTwo (= ?b b)]
=>
(println "Match found:" ?b )
)
(defrecord WrapAlice [num])
(defrecord WrapBob [num])
(comment
(clear-ns-productions!)
(let [alice (->WrapAlice 1)
bob (->WrapBob 1)]
(println "(= alice bob)" (= alice bob))
(println "(.equals alice bob)" (.equals alice bob))
(-> (mk-session)
(insert
(->ClauseOne alice)
;; Line below is important, without it the wrong behavior does not surface
(->ClauseTwo alice)
(->ClauseTwo bob)
)
(fire-rules)
))
;;(= alice bob) false
;;(.equals alice bob) true
;; Match found: #clara_bug.core.WrapAlice{:num 1}
;; Match found: #clara_bug.core.WrapBob{:num 1} ;; <- should not happen
)
To fix this, it seems likely that the j.u.LinkedHashMap
should not be used. There aren't necessarily any Clojure equivalents that can capture the performance characteristics, and the order determinism that is desired here. Also, if there are, Clara will (and should) be reluctant to add more dependencies, due to the burden that places on consumers of the library who may use conflicting versions of those same dependencies (aka "jar hell").
Edit: The comments below mention a wrapper on the items in the j.u.LinkedHashMap
and I think that likely is the better approach than what I originally said below.
My first thought would be to perhaps just consider a custom implementation of this function that had the correct behavior with respect to Clojure equality semantics and our ordering guarantees, yet retained comparable performance characteristics. This probably could be done with a combination of a Clojure transient collection and perhaps a separate structure for tracking the ordering of the incoming input. More thought would be needed.
Related issues to the background of this: