-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue #381 Reduce the number of Eval calls #396
Issue #381 Reduce the number of Eval calls #396
Conversation
[type (with-meta | ||
(eng/->AlphaNode env beta-children alpha-fn) | ||
{:alpha-expr (:alpha-expr (meta alpha-map))})]) | ||
[type (eng/->AlphaNode id env beta-children alpha-fn)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was torn here, i was temped to add the AlphaNodes to the id-to-node
since alpha nodes have ids now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. I'm not sure what utility there would be in it. It may be useful at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it could be useful for debugging and REPL exploration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there seems to be value, i will add them
{:std (double std) | ||
:mean (double mean)})) | ||
|
||
(defn run-performance-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rolled my own here, rather than using a library like criterium
because criterium makes assumptions based on the classes loaded and compiled. This lead to infinite looping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"because criterium makes assumptions based on the classes loaded and compiled" => Could you elaborate? What sorts of assumptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WilliamParker I agree the comment should explain it so we don't forget why.
I am fairly sure the rationale behind this though is that criterium
tries to warm up the JVM by stabilizing the JIT and GC metrics first prior to timing. I believe that class loading is one of the metrics that criterium
waits to see become more stable.
This performance test intentionally is creating classes on the fly via eval
, so that conflicts with what criterium
is trying to ignore.
Ethan may have a clearer description of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WilliamParker as @mrrodriguez mentioned criterium
does try to warm up for JIT, here. Again as Mike says, it seems that criterium
is looking for a stable state before running benchmarks. While ideal for most situations, compilation in this case will never reach a steady state. Effectively making the test a really expensive while (true)
block.
I will probably put a comment on this function mentioning the reasoning.
Issue #381 |
|
||
record-holder (ArrayList.) | ||
;; The rulebase should either be given from the base-session or found in | ||
;; the restored session-state. | ||
maybe-base-rulebase (when (and (not rulebase-only?) base-rulebase) | ||
base-rulebase) | ||
|
||
compilation-partition-size (or compilation-partition-size 1250) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
significance of 1250 here? what is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its pretty arbitrary, originally i had 1k but on testing i was able to load a larger rulebase with 1.25k.
Honestly, it could probably be a bit higher, it would just come down to the size of the functions being compiled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be good to say what 1250 is; bytes? A comment as to its arbitrary nature would be good as well so the next one through feels better if it needs to be altered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer accurate, this has changed
"Takes a map in form produced by extract-exprs and evaluates the values(expressions) of the map in a batched manner. | ||
This allows the eval calls to be more effecient, rather than evaluating each expression on its own." | ||
[key->expr :- {(schema/tuple sc/Int sc/Keyword) schema/SExpr} | ||
partiton-size :- sc/Int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor spelling mistake, i'll clean that up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just went through the testing for the most part. Working on the other parts.
(println description) | ||
(println "==========================================") | ||
(println (str "Mean: " mean "ms")) | ||
(println (str "Standard Deviation : " std "ms" \newline)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consistency, do "Deviation:" instead of "Deviation :"
:let [next-fact (symbol (str seed-sym "prime")) | ||
production (assoc base-production | ||
:lhs [{:type (keyword seed-sym) | ||
:constraints ['(= this ?binding) `(filter-fn ~'this)]}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use syntax quote so that =
is expanded. Even though Clara will accept the bare =
symbol in many places, I think it is a bad practice due to it being able to be shadowed. Clara should be using clojure.core/=
, not a bare =
.
`(= ~'this ~'?binding)
This applies elsewhere in this ns too.
[clara.rules.durability.fressian :as dura]) | ||
(:import [java.io ByteArrayOutputStream ByteArrayInputStream])) | ||
|
||
(defn filter-fn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand how this would work. gensym
should return a new and unique symbols every time it is called. So this filter-fn
will always be false.
If that is intentional, some sort of explanation doc string would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As these rules never fire, this is a placeholder.
(for [seed-sym seed-syms | ||
:let [next-fact (symbol (str seed-sym "prime")) | ||
production (assoc base-production | ||
:lhs [{:accumulator '(clara.rules.accumulators/all) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clara.rules.accumulators
needs to be required explicitly by the ns running these rules, which I believe is this one. There shouldn't be assumptions about other namespaces already requiring your direct dependencies. That can lead to compilation-order dependent runs that fail occasionally.
(into {} | ||
(for [seed-sym seed-syms | ||
:let [production (assoc base-production | ||
:lhs [{:type [seed-sym] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the :type
wrapped in a vector here? I think it should be :type seed-sym
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, (keyword seed-sym)
, but you are right
(testing "Session serialization performance" | ||
(run-performance-test | ||
{:description "Generated Session serialization" | ||
:func #(.serialize (dura/create-session-serializer (ByteArrayOutputStream.)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't interact with the API via the impl detail that protocols may expose interfaces.
This should be called via the protocol fn var.
[clara.rules.durability :as dur]
...
(dur/serialize ...)
Also, I'd recommend not calling [clara.rules.durability.fressian :as dura]
since that is confusing.
Perhaps, [clara.rules.durability.fressian :as fres]
or something
(run-performance-test | ||
{:description "Generated Session deserialization" | ||
:func #(-> (dura/create-session-serializer (ByteArrayInputStream. session-bytes)) | ||
(.deserialize session-bytes {:rulebase-only? true})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, call the protocol fn via it's var, not impl detail Java interop.
(run-performance-test | ||
{:description "Generated Session deserialization" | ||
:func #(-> (dura/create-session-serializer (ByteArrayInputStream. session-bytes)) | ||
(.deserialize session-bytes {:rulebase-only? true})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is :rulebase-only?
only used in the last deserialize case? Shouldn't it be set for all serialize and deserialize calls?
|
||
(let [session (com/mk-session rules) | ||
os (ByteArrayOutputStream.)] | ||
(testing "Session serialization performance" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be more appropriate: "Session rulebase durability performance". This isn't testing the memory serialization aspects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've went through all but
extract-exprs
and compile-exprs
(big ones).
However, without those, I can see the picture of how this is working out. I will soon go through those. I am sure I'll have questions/comments there as well.
@@ -563,6 +550,11 @@ | |||
the session was serialized without the rulebase, i.e. :with-rulebase? = false, so it needs a rulebase | |||
to be 'attached' back onto it to be usable. | |||
|
|||
* :compilation-partition-size - The maximum number of expressions that will be evaluated per call to eval. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if :forms-per-eval
would be more revealing of a name. I'm not positive that I love this name.
It is also just an unfortunate configuration option. It's fairly hard to have any good idea what this number should be. I guess if it needs to be tweaked, it'd be via trial/error.
get-alphas-fn]) | ||
get-alphas-fn | ||
;; A map of [node-id field-name] to function. | ||
node-identifier-to-fn]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more consistent name would be node-id-to-fn
. This is a weird clash with id-to-node
above.
I guess you are trying to be more explicit that the id is represented as a [node-id field-name]
, but I don't think that "identifier" captures that since that's what "id" is an abbreviation for already.
Perhaps abandon the naming schema for this and go with something like node-expr-fn-lookup
. I don't think that all associated structures have to be named in a way that explicitly describes the structure of their key-val pairs.
condition)] | ||
condition) | ||
|
||
retrieve-fn (fn [id field] (get id-key->expr [id field]))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommended rename: compiled-expr-fn
. This is too generic and confusing to read later on.
@@ -1263,105 +1439,70 @@ | |||
id :- sc/Int | |||
is-root :- sc/Bool | |||
children :- [sc/Any] | |||
compile-expr ; Function to do the evaluation. | |||
id-key->expr :- {(schema/tuple sc/Int sc/Keyword) IFn} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar as I mentioned above in this ns, perhaps calling it expr-fn-lookup
is more meaningful. In this local scope, it is less important that where I mentioned above though.
|
||
;; Sort the ids to compile based on dependencies. | ||
[{:keys [id-to-production-node id-to-condition-node id-to-new-bindings forward-edges backward-edges]} :- schema/BetaGraph | ||
id-key->expr :- {(schema/tuple sc/Int sc/Keyword) IFn}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same renaming suggestions as above.
[type (with-meta | ||
(eng/->AlphaNode env beta-children alpha-fn) | ||
{:alpha-expr (:alpha-expr (meta alpha-map))})]) | ||
[type (eng/->AlphaNode id env beta-children alpha-fn)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe. I'm not sure what utility there would be in it. It may be useful at some point.
@@ -1677,7 +1775,8 @@ | |||
fact-type-fn | |||
ancestors-fn | |||
activation-group-sort-fn | |||
activation-group-fn] | |||
activation-group-fn | |||
expressions] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another place I'd consider renaming as mentioned above. expressions
is probably the most vague.
env (assoc :env env)) | ||
{:alpha-expr alpha-expr}))) | ||
[alpha-nodes :- [schema/AlphaNode] | ||
id-key->expr :- {(schema/tuple sc/Int sc/Keyword) IFn}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same renaming suggestions as above.
productions))) | ||
productions)) | ||
|
||
(sc/defn extract-exprs :- {(schema/tuple sc/Int sc/Keyword) schema/SExpr} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This schema is referred to a lot. It'd be helpful to extract it with a s/defschema
so it is easier to read through the many places it is used.
Suggested NodeExprFnLookup
I forgot to update the changelog, so i will be adding that shortly as well |
condition)] | ||
(case (or (:node-type beta-node) | ||
;; If there is no :node-type then the node is the ::root-condition | ||
;; however, creating a case for nil could potentially casue weird effects if something about the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "casue" -> "cause"
(:id-to-condition-node beta-graph)))) | ||
|
||
(sc/defn compile-exprs :- schema/NodeFnLookup | ||
"Takes a map in form produced by extract-exprs and evaluates the values(expressions) of the map in a batched manner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grammar: "in form produced" -> "in the form produced"
|
||
(sc/defn compile-exprs :- schema/NodeFnLookup | ||
"Takes a map in form produced by extract-exprs and evaluates the values(expressions) of the map in a batched manner. | ||
This allows the eval calls to be more effecient, rather than evaluating each expression on its own." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a line after this should say
"See #381 for more details."
;; Grouping by ns, most expressions will not have a defined ns, only expressions from production nodes. | ||
;; These expressions must be evaluated in the ns where they were defined, as they will likely contain code | ||
;; that is namespace dependent. | ||
(for [[ns pairs] (group-by (comp :ns meta key) key->expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Maybe avoid naming confusion here, call this [ns group]
and then below bind the partition-all
result to expr-batch
.
It was a bit confusing to read pairs
so much and also not that informative of what it is.
;; Partitioning the number of forms to be evaluated, Java has a limit to the size of methods if we were | ||
;; evaluate all expressions at once it would likely exceed this limit and throw an exception. | ||
pairs (partition-all partition-size pairs) | ||
:let [node-keys (map #(nth % 0) pairs)]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you avoiding using first
here for some reason? If it is perf, I believe that pairs
in this line is a seq of map entries. If that is the case, key
is the fastest.
;; evaluate all expressions at once it would likely exceed this limit and throw an exception. | ||
pairs (partition-all partition-size pairs) | ||
:let [node-keys (map #(nth % 0) pairs)]] | ||
(mapv (fn [node-key expr] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: this is mapv vector
, the anonymous fn just reimplements it's 2-arity case.
(with-bindings (if ns | ||
{#'*ns* (-> ns symbol the-ns)} | ||
{}) | ||
(batching-try-eval node-keys (mapv #(nth % 1) pairs)))))))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above, second
, or better, val
if it is always a map entry.
;; If none of the rules are the issue, it is likely that the | ||
;; size of the code trying to be evaluated has exceeded the limit | ||
;; set by java. | ||
(throw (ex-info "The batching size of expressions has exceeded the limit set by java." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ex-info
has an 3-arity version designed to take the cause
, ie ex-info [msg map cause]
So
(ex-info "The batching size of expressions has exceeded the limit set by java." {} e)
I'm not sure the :original-exception
will add value. Perhaps if that exception has ex-data
to include. Also, it may be useful to put the node-keys
in this map.
;; Grouping by ns, most expressions will not have a defined ns, only expressions from production nodes. | ||
;; These expressions must be evaluated in the ns where they were defined, as they will likely contain code | ||
;; that is namespace dependent. | ||
(for [[ns pairs] (group-by (comp :ns meta key) key->expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The group-by
will result in a non-deterministic order here. It may be best (for debugging perhaps), to be deterministic.
Perhaps a sort-by key
would be reasonable to add?
;; 1250 is an arbitrary number, this could be lower or higher depending on the | ||
;; rulebase that is being compiled, with regards to the average size of the | ||
;; forms being evaluated. | ||
forms-per-eval (:forms-per-eval options 1250) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to have the 1250 in a var
forms-per-eval-default
somewhere and refer to that in all the places that you currently refer to 1250. That way the comments and the compiler vs deserialization impl doesn't get out of sync.
Also, how did you come up with 1250? Is this a threshold you found to work well in practice? It does depend on size of expressions of course.
As an aside to this discussion:
I think ideally, rules would be written in a way where there were not large inline code blocks in the LHS conditions, accumulators, and/or the RHS action forms.
I know from experience that people sometimes tend to make really large RHS forms though (I don't recommend it).
e.g.
Bad
(defrule my-rule
=>
(let [<lots of bindings and logic>]
(insert! some-result))
Good
Good
(defrule my-rule
=>
;; `do-bigger-logic-fn` is evaluated under normal compilation conditions (better).
;; Also, things can be broken up and made more readable by naming functions around them.
;; Also, these pure fn's can be independently verified/tested.
(let [some-result (do-bigger-logic-fn <bindings needed from LHS>)]
(insert! some-result))
If rules were written this way, I'd think the :forms-per-eval
could be quite a bit higher. I realize that'd be a breaking change for impl's that have abused the RHS by making huge action expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EthanEChristian and I talked a lot about this offline. It actually may be the case that inline forms don't have an effect on the :forms-per-eval
. This is nice if it proves to be the case.
Basically, the reason is that batching-try-eval
eval's a vector of expressions of a fn
form.
e.g.
[(fn [x] ...) (fn [y] ...) ...]
These each get their own classfile, so the big method size ends up just being in the method that creates this vector of fn
references. There is constant overhead in terms of bytecodes for this method then. That constant overhead can be multiplied by teh :forms-per-eval
to get a close approx of the correct value to default to. Keeping it configurable is likely still a good idea, due these sizes being possibly variable across Clj compiler versions, etc.
We can make sure to add those details here shortly once it is completely validated.
@EthanEChristian @mrrodriguez I'll try to take an initial look at this over the weekend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took an initial look over this, I'll probably have more to say later.
project.clj
Outdated
@@ -68,8 +68,15 @@ | |||
;; perhaps because Leiningen is using this as uneval'ed code. | |||
;; For now just duplicate the line. | |||
:test-selectors {:default (complement (fn [x] | |||
(some->> x :ns ns-name str (re-matches #"^clara\.generative.*")))) | |||
:generative (fn [x] (some->> x :ns ns-name str (re-matches #"^clara\.generative.*")))} | |||
(let [backlisted-packages #{"generative" "performance"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spelling: "backlisted" => blacklisted
:let [{:keys [type constraints fact-binding args]} condition]] | ||
|
||
{:type (com/effective-type type) | ||
{:id id | ||
:type (com/effective-type type) | ||
:alpha-fn (com/compile-condition type (first args) constraints fact-binding env) | ||
:children (vec beta-children)}))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems odd to me that we build a map here and then use it to construct alpha nodes later in assemble-session while building the beta nodes directly in this namespace, but changing that is probably out of scope here. Maybe there's a reason for this that I'm not remembering offhand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that is a little weird the only thing i think that is forcing this is, the access to the compiled beta nodes. At this point the beta-children are just ids of the nodes that will be constructed later.
(:require [schema.core :as s]) | ||
#?(:clj | ||
(:import [clojure.lang | ||
IFn]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: You could use the ifn? predicate instead if you want to avoid the need for conditionals. Looks like that has been in ClojureScript since 2012: commit
type (first args) constraints | ||
fact-binding env) | ||
;; Remove all metadata but file and line number | ||
;; to protect from evaluating usafe metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spelling: "usafe" => "unsafe"
:msg "compiling alpha node"}}))) | ||
{} | ||
alpha-graph) | ||
exprs (reduce-kv (fn [prev id production-node] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO a more specific name here than "exprs" would be helpful. This will be a map containing expressions if I'm understanding correctly?
(fn [prev id beta-node] | ||
(let [condition (:condition beta-node) | ||
condition (if (symbol? condition) | ||
(.loadClass (clojure.lang.RT/makeClassLoader) (name condition)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming that anything that is a symbol is a class seems likely to be a bug - we'll need to test it though. Looks like there is some existing logic along these lines. Also will we enter this path from ClojureScript at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is previous functionality, just shuffled around to work in my new approach. It does seem a little questionable, but i assumed that maintaining the old way would be better than changing functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it should be fine to leave as is with regards to this issue. I have a suspicion there is a way to break this (as in make it throw ClassNotFoundException) but I don't think this change would make it any more broken since the assumption is the same before and after. We can look at this separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think resolve
would be more appropriate, however, resolve
has a defect that can throw CNFE as well.
https://dev.clojure.org/jira/browse/CLJ-1403
this may be fixed in 1.10, but that is future thinking.
Even if resolve
returns nil
as expected here, handling the nil/exception case would be nicer than letting it bubble up without much context.
I agree though that this is not related to the work of this PR.
;; If none of the rules are the issue, it is likely that the | ||
;; size of the code trying to be evaluated has exceeded the limit | ||
;; set by java. | ||
(throw (ex-info "The batching size of expressions has exceeded the limit set by java." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is OK to say the likely cause in an exception message, but IMO we shouldn't state it with quite so much certainty. For example, we could have a bug in our implementation that causes such an exception.
(try | ||
(eval exprs) | ||
(catch Exception e | ||
(mapv (fn [expr node-keys] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"doseq" seems more in line with your intent here since this return value isn't used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used mapv
to keep from having to zip the 2 lists back together, if you feel strongly i can use doseq
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, the anonymous function has two args. Never mind, this is OK and it is a minor deal anyway. It might be worth adding a quick comment that we want mapv rather than map to avoid laziness.
Do we have an existing test that validates this "rethrow if an individual expression actually is not compilable" workflow? If not we should probably add one.
;; anyway. | ||
(try | ||
(eval exprs) | ||
(catch Exception e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there are scenarios where the compilation could throw an Error other than a VirtualMachineError (which it wouldn't make sense to try to recover from, see the docs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
catch Exception
shouldn't catch these sorts of errors anyway right?
But you do make a good point, InterruptedException would be a good candidate to not catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: determine what exceptions to not catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a reason why you shouldn't catch all exceptions and rethrow here. I think that the message needs to be clear that it could only "possibly" be a problem with compilation of the expression. However, I don't think it is beneficial to rethrow other exceptions in their "raw" form . It is more useful to wrap the exception and provide it with what context you can. Attach the root cause exception though so it is bubbles up.
More context should always be a better thing in this sort of case.
I think leaving these more critical JVM errors alone is a good idea though. They are a lower-level concern and are in a separate hierarchy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually musing on whether there are any Error types that we should catch, with the idea being to rethrow them with the specific expression that caused them rather than a gigantic collections that probably isn't too useful to the user. I don't know that the traditional assumptions about not recovering from Error necessarily apply to a dynamic code generation and compilation framework - I particularly had LinkageError in mind. We definitely wouldn't want to catch any VirtualMachineError instances though since the JVM's behavior when you do is specifically called out as undefined by the spec.
In practice this probably isn't a huge deal and I don't want to block this on investigating it - I'd expect most bad cases to throw Exception. Maybe just a followup issue to think about later.
[type (with-meta | ||
(eng/->AlphaNode env beta-children alpha-fn) | ||
{:alpha-expr (:alpha-expr (meta alpha-map))})]) | ||
[type (eng/->AlphaNode id env beta-children alpha-fn)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it could be useful for debugging and REPL exploration.
(-> rdr | ||
(read-record d/add-alpha-fn) | ||
d/clj-record-holder-add-fact!)))}} | ||
(create-cached-node-handler AlphaNode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that AlphaNodes have ids we should be able to leverage the cached-node-handlers
@@ -2748,3 +2752,26 @@ | |||
(fire-rules))] | |||
|
|||
(is (= [{:?x inner-one}] (query session test-query))))) | |||
|
|||
(deftest test-maximum-forms-per-eval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WilliamParker, A test demonstrating the batch eval failure with a method size too large.
This test is a little slow, i was considering putting it in the performance tests, but i would be worried that those test would not be run and we might miss changes to compilation that might break the limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does this take to run? Perhaps it should be in another test file - it would still run during Leiningen builds but not as part of the test-rules namespace. I've been trying to move away from having everything in test-rules from a code organization POV anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test takes ~41 secs while running in a REPL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes, that is quite a long-running test. Perhaps we could put it in a "long_running_tests" file that wouldn't be part of a normal REPL workflow; if there are other instances of things like this it might be worthwhile to create a test profile that is run in Travis but exclude the file from "lein test" runs.
It also seems like this could assert on the exception message added about message size too large? Or by the exception info, you could add a key if needed. It would be pretty simple to just recursively get all the messages and make sure one of them matches a regex.
(for [{:keys [id condition beta-children env] :as node} alpha-nodes] | ||
(cond-> {:id id | ||
:type (effective-type (:type condition)) | ||
:alpha-fn (get expr-fn-lookup [id :alpha-expr]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest failing here if the map doesn't contain the value. Perhaps just a function like:
(defn safe-get
[m k]
(let [v (get m k ::not-found)]
(if (= v ::not-found)
(throw (ex-info "Key not found with safe-get") {:map map :key k})
v))))
Since this is at compile time the cost should be quite minimal and it would be useful to fail quickly if there was a problem with the expression lookup.
Applies to compile-node as well.
@@ -563,6 +550,11 @@ | |||
the session was serialized without the rulebase, i.e. :with-rulebase? = false, so it needs a rulebase | |||
to be 'attached' back onto it to be usable. | |||
|
|||
* :forms-per-eval - The maximum number of expressions that will be evaluated per call to eval. | |||
This will be defaulted to 1250 if not provided. Larger batch sizes should see better performance compared |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be defaulted to 1250 if not provided.
Out of date I think - didn't you change this to 5000?
;; Grouping by ns, most expressions will not have a defined ns, only expressions from production nodes. | ||
;; These expressions must be evaluated in the ns where they were defined, as they will likely contain code | ||
;; that is namespace dependent. | ||
(for [[ns group] (sort-by key (group-by (comp :ns meta key) key->expr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not shadow the clojure.core/ns macro here.
;; easier because the metadata contains unevalable forms. | ||
(with-meta [id expr-key] (assoc metadata expr-key s-expr)) | ||
s-expr)) | ||
is-root? #(= #{0} (get backward-edges %)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this a defined function and use everywhere we have this check?
;; set by java. | ||
(throw (ex-info (str "There was a failure while batch evaling the node expressions, " \newline | ||
"but wasn't present when evaling them individually. This likely indicates " \newline | ||
"that the method size exceeded the maximum set by java, see the cause for the actual error.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: This is a limitation set by the JVM, not by Java, which aren't quite the same thing; Clojure compiles to bytecode, not an intermediate representation in Java.
get-alphas-fn]) | ||
get-alphas-fn | ||
;; A map of [node-id field-name] to function. | ||
node-expr-fn-lookup]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to not have a schema on this or was it just an oversight?
;; however, creating a case for nil could potentially cause weird effects if something about the | ||
;; compilation of the beta graph changes. Therefore making an explicit case for ::root-condition | ||
;; and if there was anything that wasn't ::root-condition this case statement will fail rather than | ||
;; failing somewhere else. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to error checking like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like failing instead of waiting to later and being broken. However, I think a it'd be nicer to have that default case
at the end that threw a more meaningful exception than "no branch of case
matched".
eg
(case x
y :foo
z :bar
(throw (ex-info "explanation"))
However, this isn't a critical change for the sake of this review. It's more of a thought about trying to give more meaningful compilation failure errors in the future.
Looked over the compiler changes and left some comments, but overall looks good @EthanEChristian I haven't been through the durability changes yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks solid functionally - I've looked over the whole PR now - but with some maintenance concerns I'd like to address. My primary concerns are:
- The reliance of durability on metadata that isn't part of the schemas.
- The potentially unreliable use of identical? in the safe-get function.
(ThreadLocal.)) | ||
|
||
(defn- add-node-fn [node fn-key meta-key] | ||
(assoc node | ||
fn-key | ||
((.get compile-expr-fn) (:id node) (meta-key (meta node))))) | ||
(get (.get node-fn-cache) [(:id node) meta-key]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "meta-key" seems like more of an "expr-key" now. Previously the argument name some sense as it was looked up on the metadata, but that is no longer the case correct?
(assoc id->expr | ||
;; Letting the key carry all of the metadata, this makes the batch evaluation of the expressions | ||
;; easier because the metadata contains unevalable forms. | ||
(with-meta [id expr-key] (assoc metadata expr-key s-expr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern of having the metadata carry fundamental semantic meaning of the data (without which durability would be broken) that isn't reflected in the NodeExprLookup schema seems like a suboptimal practice. I'd like to at least have the metadata expectations reflected in the schema. Alternatively you could have the key map to a tuple, map, etc. containing both the values, but my primary concern here is that as someone reading the code, I'd expect the schema to be a complete description of the data, or at least the mandatory parts of the data. This seems to break that expectation.
because the metadata contains unevalable forms.
Are you referring to the issues discussed in #243 ? If so could you reference it in the comment?
@@ -2748,3 +2752,26 @@ | |||
(fire-rules))] | |||
|
|||
(is (= [{:?x inner-one}] (query session test-query))))) | |||
|
|||
(deftest test-maximum-forms-per-eval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes, that is quite a long-running test. Perhaps we could put it in a "long_running_tests" file that wouldn't be part of a normal REPL workflow; if there are other instances of things like this it might be worthwhile to create a test profile that is run in Travis but exclude the file from "lein test" runs.
It also seems like this could assert on the exception message added about message size too large? Or by the exception info, you could add a key if needed. It would be pretty simple to just recursively get all the messages and make sure one of them matches a regex.
blacklisted-packages)] | ||
(some->> x :ns ns-name str (re-matches (re-pattern (apply str patterns))))))) | ||
:generative (fn [x] (some->> x :ns ns-name str (re-matches #"^clara\.generative.*"))) | ||
:performance (fn [x] (some->> x :ns ns-name str (re-matches #"^clara\.performance.*")))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having this live in the main Clara project seems reasonable to me, but there is an existing (old) clara-benchmark project. FYI @rbrush in case you have suggestions, objections, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of including this in the main Clara project. The test selectors are a nice solution to performance concerns here.
@@ -31,7 +31,9 @@ | |||
Accumulator | |||
ISystemFact] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any problem here on review.
function will throw an exception." | ||
[m k] | ||
(let [v (get m k ::not-found)] | ||
(if (identical? v ::not-found) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume the reason you used identical? instead of = here was for performance? @EthanEChristian : I realize that as we've discussed previously in another (non-Clara) context it is currently part of Clojure's implementation that equal keywords are identical, and so when comparing two things that are always keywords using identity checks is a way to gain marginally better performance. However, the number of operations here is just the number of nodes in the rules session compiled, and then only at compile-time, correct? I doubt that that will be any kind of serious problem for a number of checks (at compile time) on the order of thousands or tens of thousands, and this pattern isn't a part of Clara's codebase yet. As is, this introduces a maintenance burden of keeping track of whether Clojure has changed this; I don't believe that property is part of any API guarantees.
In this particular case, I think we can resolve this one of two ways
- Using clojure.core/=. I don't think this is likely to have any meaningful performance cost, but if you have data to the contrary please post.
- Creating a let-binding for the keyword so that it actually is guaranteed to be identical by Clojure's public contracts and commenting with the reasoning.
i.e.
(defn safe-get
"A helper function for retrieving a given key from the provided map. If the key doesn't exist within the map this
function will throw an exception."
[m k]
(let [not-found-keyword ::not-found
v (get m k not-found-keyword)]
(if (identical? v not-found-keyword)
(throw (ex-info "Key not found with safe-get" {:map map :key k}))
v)))
@WilliamParker, i can see what you mean with the metadata.
where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noted one code comment that I think is out of date with the latest changes but otherwise this looks good. +1 to merge once that is addressed.
@mrrodriguez you reviewed this a while ago, is there anything left you wanted to address?
@@ -551,12 +554,12 @@ | |||
;; In this case there is nothing to do with memory, so just serialize immediately. | |||
(if (:rulebase-only? opts) | |||
;; The keys of the node-fns should contain enough data to reconstruct the entire map. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
;; The keys of the node-fns should contain enough data to reconstruct the entire map.
I think this comment is outdated with the latest changes.
[k (-> k meta (get (nth k 1)))]))) | ||
(for [[node-key compilation-ctx] expr-lookup] | ||
[node-key [(-> compilation-ctx (get (nth node-key 1))) | ||
compilation-ctx]]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the line and file info will be included with the serialized rulebase correct? That will marginally increase the space consumed by the rulebase, but the size of sessions won't be impacted so I think this is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest of the :compile-ctx
will be there too, such as :condition
and :production
. Previously this information was discarded after compiling the rulebase the first time. We then were making the assumption that it would recompile after deserialization. This way if for some reason a fn
cannot be recompiled, it should give better context to what condition/production the function was being compiled for.
[m k] | ||
(let [v (get m k ::not-found)] | ||
(if (= v ::not-found) | ||
(throw (ex-info "Key not found with safe-get" {:map map :key k})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be :map m
, not :map map
. You are going to return the clojure.core/map
fn as it is written.
function will throw an exception." | ||
[m k] | ||
(let [v (get m k ::not-found)] | ||
(if (= v ::not-found) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To @WilliamParker 's point:
The potentially unreliable use of identical? in the safe-get function.
The likelihood of this changing in clj is very slim. It is even slimmer that it'd change for keyword literals. However, if choosing to remain paranoid for a change like that to happen, I still think an identical?
-based sentinel test is what would typically be done in this sort of case
(let [not-found ::not-found
v (get m k not-found)]
(if (identical? v not-found) <etc>))
Alternatively, I've even seen the more aggressive, guaranteed-unique sentinel used:
(let [not-found (Object.) ;; <---- unique object identity
v (get m k not-found)]
(if (identical? v not-found) <etc>))
However, the keyword one has some potential to be "faster" since the keyword may be constructed as a static constant of the surround fn class, while the Object constructor may be called repeatedly.
You could work around this by getting even more intense and doing:
(def ^:private sentinel-value (Object.))
.... later ....
(let [
v (get m k sentinel-value)
(if (identical? v sentinel-value) <etc>))
However, there is more fun here that you now have a Var.getRawRoot
in your runtime path; could be worked around with direct-linking enabled etc, but at that point, the compile-time consant keyword compilation is looking more appealing. :P
The namespaced keyword is a generally accepted safe clj idiom though.
Either way, in a fn like safe-get
, I'd lean towards using an approach where identical?
is used instead of =
, which can be done by just using the same references in both places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree that the likelihood of it changing is low, but not entirely trivial. I'd rather not take the risk when there is an easy way not to though. I'm OK with using the same reference to enforce identity as you describe.
@@ -1751,6 +1874,37 @@ | |||
productions | |||
(throw (ex-info (str "Non-unique production names: " non-unique) {:names non-unique}))))) | |||
|
|||
(def forms-per-eval-default | |||
;; The default max number of forms that will be evaluated together as a single batch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the analysis. The only thing that is odd to me is you used inline comments instead of making this a doc string of the var. I don't suppose it matters much though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me still.
@WilliamParker @mrrodriguez , If you guys agree, i will probably merge this in sometime over the weekend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to merge
Creating and populating the next 5343 elements of the array: 64,116B | ||
Creating the vector and the return statement: 4B | ||
|
||
This sums to 65,527B just shy of the 65,536B method size limit." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, but this increases the importance of setting up our tests against all relevant Clojure versions in Travis since potentially different/future Clojure versions could have differences here. That is really something we should be doing anyway though.
(let [not-found ::not-found | ||
v (get m k not-found)] | ||
(if (identical? v not-found) | ||
(throw (ex-info "Key not found with safe-get" {:map m :key k})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to this approach to enforcing identity equality
I have changed both the Durability and Compiler evaluation of functions that are embedded in the rulebase, as part of this I also gave AlphaNodes an Id.
I have also added some performance tests that generate a simple rulebase and compile, serialize and deserialize the rulebase. Running performance locally (
lein test :performance
)gave me a before of:
and an after of:
I chose to persist the functions on the rulebase, this allows for the functions to be easily pulled off for serialization and deserialization. Allowing for minimal traversal of the already constructed rulebase, memory usage seems comparable to what it was before and the performance numbers speak for themselves.
@WilliamParker @mrrodriguez @rbrush
When you get some time, could you look over this?