Fix a bug in group-by's combiner, and also a bug in into
group-by's use of `merge-with` caused some interesting special cases;
notably, it would fail to use the combiner identity correctly, causing
emitted results to have the wrong type. This patch replaces merge-with
with a custom implementation that starts every group's combiner phase
with a combiner identity. We also use transients for a performance
boost when the number of chunks is large.
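
A minimal sketch of the type problem described above, not the tesser code itself. The inner fold here is hypothetical: its reducers emit vectors, while its combiner accumulates into a set. `merge-with` only calls its function for keys present in both maps, and never starts from the combiner identity, so the merged values keep the reducer's type. The names `chunk-1`, `chunk-2`, and `combine-groups` are assumptions made for illustration.

```clojure
;; Hypothetical inner fold: reducer output is a vector, combiner accumulator is a set.
(def combiner-identity (constantly #{}))
(def combiner into)

;; Post-reducer output of two chunks, keyed by group.
(def chunk-1 {:a [1 2]})
(def chunk-2 {:a [2 3] :b [4]})

;; Old approach: merge-with never consults the combiner identity,
;; so values stay vectors (and :b never passes through the combiner at all).
(def merged
  (reduce (fn [m1 m2] (merge-with combiner m1 m2))
          {}
          [chunk-1 chunk-2]))
;; merged => {:a [1 2 2 3], :b [4]}

;; Seeded approach: every group starts from the combiner identity.
(defn combine-groups [m1 m2]
  (reduce (fn [m [k v2]]
            (assoc m k (combiner (get m k (combiner-identity)) v2)))
          m1
          m2))

(def seeded (reduce combine-groups {} [chunk-1 chunk-2]))
;; seeded => {:a #{1 2 3}, :b #{4}}
```

The seeded version yields sets for every group, including groups that appear in only one chunk.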

`into` would crash for large numbers of chunks, because nested lazy
concat calls overflow the stack when realized. We now conj directly
into the given collection.
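
A rough sketch of the failure mode, assuming a default JVM stack size. Reducing with `concat` builds one lazy layer per chunk, and realizing the result has to descend through all of them; reducing with `into` builds the collection eagerly. The chunk count of 100000 and the names below are illustrative only.

```clojure
;; 100000 tiny chunks, standing in for post-reducer outputs.
(def chunks (repeat 100000 [1]))

;; Old approach: each concat wraps the previous lazy seq in another layer.
(def nested (reduce concat [] chunks))

;; Realizing the nested seq typically overflows the stack on a default JVM;
;; the try/catch keeps this sketch runnable either way.
(def concat-outcome
  (try
    (dorun nested)
    :ok
    (catch StackOverflowError _
      :stack-overflow)))

;; New approach: conj each chunk's elements directly into the target collection.
(def combined (reduce into [] chunks))
(def total (count combined))
```

Because `into` is eager, the stack depth stays constant no matter how many chunks are combined.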
aphyr committed Sep 8, 2016
1 parent caf71b5 commit de2986f
Showing 1 changed file with 18 additions and 9 deletions.
27 changes: 18 additions & 9 deletions core/src/tesser/core.clj
@@ -742,9 +742,9 @@
   {:reducer-identity vector
    :reducer conj
    :post-reducer identity
-   :combiner-identity vector
-   :combiner core/concat
-   :post-combiner (partial core/into coll)})
+   :combiner-identity (constantly coll)
+   :combiner core/into
+   :post-combiner identity})
 
 (defwraptransform post-combine
   "Transforms the output of a fold by applying a function to it.
@@ -798,12 +798,21 @@
                       ; necessary.
                       (get acc category (reducer-identity-))
                       input))))
-   :post-reducer identity
-   :combiner-identity hash-map
-   :combiner (fn combiner [m1 m2]
-               (merge-with combiner- m1 m2))
-   :post-combiner (fn post-combiner [m]
-                    (map-vals post-combiner- m))})
+   :post-reducer identity
+   :combiner-identity (comp transient hash-map)
+   :combiner (fn combiner [m1 m2]
+               (core/reduce (fn [m pair]
+                              (let [k (key pair)
+                                    v2 (val pair)
+                                    v1 (get m k ::not-found)
+                                    v1 (if (= v1 ::not-found)
+                                         (combiner-identity-)
+                                         v1)]
+                                (assoc! m k (combiner- v1 v2))))
+                            m1
+                            m2))
+   :post-combiner (fn post-combiner [m]
+                    (map-vals post-combiner- (persistent! m)))})
 
 (deftransform facet
   "Your inputs are maps, and you want to apply a fold to each value
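
A standalone sketch of the transient-based combiner pattern from the second hunk, outside of tesser. The accumulator is a transient map, each group is seeded with the combiner identity on first sight, and the map is made persistent only once at the end. The function name `combine-groups!` and the sample chunks are hypothetical; the inner fold here is a plain sum.

```clojure
(defn combine-groups!
  "Folds the persistent map m2 into the transient map tm, seeding
  unseen keys with (combiner-identity) before applying combiner."
  [combiner-identity combiner tm m2]
  (reduce (fn [tm [k v2]]
            (let [v1 (get tm k ::not-found)
                  v1 (if (= v1 ::not-found)
                       (combiner-identity)
                       v1)]
              ;; assoc! returns the transient to keep using.
              (assoc! tm k (combiner v1 v2))))
          tm
          m2))

(def result
  (persistent!
    (reduce (partial combine-groups! (constantly 0) +)
            (transient {})
            [{:a 1} {:a 2 :b 5} {:b 1}])))
;; result => {:a 3, :b 6}
```

Using a sentinel such as `::not-found` rather than `nil` keeps groups whose accumulated value is legitimately `nil` from being reseeded.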
