Performance of Schema and Worker Creation #513

ikitommi · 2021-08-17T06:30:21Z

Problem Statement

Malli is optimized selectively for runtime performance. The performance critical paths include: validation, explanation, transformation and parsing. Good runtime performance has been achieved by pushing all possible work into worker initialization, away from the runtime.

Creating a schema and a validator and applying it, all at once:

;; 5.7µs
(m/validate [:or :int :string] 42)
; => true

Building the schema and schema validator separately, ahead of time:

;; 3.0µs
(def schema (m/schema [:or :int :string]))

;; 1.7µs
(def validate (m/validator schema))

;; 4ns
(validate 42)
; => true

The validation is over 1000x faster! For production apps with low latency requirements, one should always precompute the workers if possible. Tools like reitit coercion do this already on behalf of the user.

As it stands out now, the Schema & worker creation performance is only optimised for lines-of-implementation-code, which is not the best metric here. The perf is bad, REALLY bad.

Having a 500ms schema worker compilation phase to emit 1000x faster runtime is totally ok on a hi-performance cloud server but can be a show-stopper on targets like on browsers, especially if running on slow mobile phones.

Flamegraphs (less is better) with m/validate and m/validator:

using m/validator:

using m/validate:

Can we do something to make it better?

Definetely.

Schema and worker creation code should be optimised to a reasonable level for performance (both CPU & Memory) and Malli should support lazy initialisation to avoid stop-the-world pre-calculation of all in single-threaded environments

What can be done

Creating Schema instances
Transforming Schemas
Guidelines and Tools

1. Creating Schema instances

m/schema is just not optimised. We merge registries, eagerly create forms, parse entries etc. Quick wins available.

;; 400ns
(m/schema :int)

all schema instances are created separately - any larger domain model can have hundreds or thousands (or hundreds of thousands) of identically behaving leaf schemas for no good reason. With a smart cache, we could re-use once created instances.

(= (m/schema :int)
   (m/schema :int))
; => false

in the flamegraph below:

just bad code
schema format checking should be optional
can be removed with caching, all :ints are the same

entry parsing is slow. should be easy to make it 10x faster.

;; 44µs
(m/schema
 [:map
  [:x boolean?]
  [:y {:optional true} int?]
  [:z [:map
       [:x boolean?]
       [:y {:optional true} int?]]]])

Oh, the mountains!

2. Transforming Schemas

m/into-schema, m/walk and LensSchema protocol are the secret sauce of enabling generic transformations for schemas. But, elegant functional immutability is not performant by default. Running Schema transformations also currently forces re-creation of schemas.

Identity walker, e.g. "no-op":

;; 26µs
(m/walk schema (m/schema-walker identity))

As the tools are generic, it might be easy to get order(s) of magnitude improvements to this using smart diffing & caching and adding optional new -copy constructors.

3. Guidelines and Tools

Malli allows schemas to persisted (and loaded back), one could run the heavy transformations at development-time, emit just the final results as EDN, to be read back as ready schemas.

;; transform & persist at development/compile
(->> [:map
      [:x boolean?]
      [:y {:optional true} int?]
      [:z [:map
           [:x boolean?]
           [:y {:optional true} int?]]]]
     (mu/closed-schema)
     ;; ... some heavy transformations
     (malli.edn/write-string)
     (spit "malli.edn"))

;; reading it back at runtime
(-> "malli.edn" (slurp) (malli.edn/read-string))

Programming againt the map-syntax can be used like transient & persistent! in Clojure: instead of going through the Schema instance trees, one could just accumulate the map-syntax and screate a schema out of that it the end. See upcoming changes to map syntax.

(-> [:map
     [:x boolean?]
     [:y {:optional true} int?]
     [:z [:map
          [:x boolean?]
          [:y {:optional true} int?]]]]
    (mu/to-map-syntax)
    (assoc-in [:properties :closed] true)
    (assoc-in [:children 2] [:z nil :int])
    (mu/from-map-syntax))
;[:map {:closed true} 
; [:x boolean?] 
; [:y {:optional true} int?] 
; [:z :int]]

besides a new dsl to create schemas, there could be identical dsl to work with map-syntax?

TODO

define performance targets, e.g.
- 2-4x faster schema creation
- 10-20x faster schema transformations
- 75% reduced memory foonprint
work as subtasks into this issue
time / sponsor(s) for making this

The text was updated successfully, but these errors were encountered:

bsless · 2021-08-17T10:48:37Z

Initial observations, regarding simple schema only (referring to this image)

Got no explanation to this gap on the left, however, registy has room to improve:
1.1. Currently uses satisfies?, bad for runtime
1.2. Add IntoRegistry protocol
1.3. Extend to map with simple-registry constructor
1.4. simple-registry can cheat and assume it can access the map via .valAt
separate min and max from opts, bind children count
The entire tower you see is due to the destructuring of ?props
3.1. I can speed it up significantly but it's ugly, unless you want to pull in https://github.com/joinr/structural
Should implement Schemas protocol for all schemas

These get ~2x speedup for (m/schema :int)

ikitommi · 2021-08-17T18:05:12Z

Thanks for observing!

I think 1 & 2 would be a great start for this. With Schema instance cache, the price for creating :int is paid just once, so I suggest we don't add new libs yet.

ikitommi · 2021-08-22T14:00:11Z

You have been busy @bsless! I started to look the schema transformation perf, first 2 loc change made the identity walking of example drop from 26µs to 1.3µs :) Will check the -copy constructor next & re-visit entry parsing after that. I'll check all PRs after that and see how they stack up.

bsless · 2021-08-22T14:03:19Z

Two observations regarding -form:

do we need to preserve meta? It adds some overhead
forms can be wrapped in delay so they are not realized until the schema is printed.

bsless · 2021-08-23T05:15:31Z

@ikitommi could you share the work you've done until now and results / findings? for science if nothing else :)

bsless · 2021-08-23T09:13:12Z

Should we look towards adding some performance regression tests on both platforms?
What kind of things would we like to test? How would results be presented and compared?
Will Github hunt us down and kill us if we start burning CPU in github actions for performance regression tests?
Capabilities:

cold start and steady-state tests
scan Java, GC and JIT settings
for each test case report with confidence level
associate results with git sha & date

ikitommi · 2021-08-23T11:24:56Z

more complete perf suite would be great++

ikitommi · 2021-08-23T11:33:43Z

related to the copy-constructor. A cheap trick to implement it:

reify also IntoSchema alongside Schema
modify -parent to return this

=> all calls to -into-schema are handled by the same instance, making it see all closed variables, one can pass in extra data (e.g. parsed results) etc.

^{:type ::schema}
(reify
  IntoSchema
  (-into-schema [_ properties children options]
    (prn "gonna do a cheap copy for :map")
    (-into-schema parent properties children (assoc options ::parsed parsed)))
  (-type [_] (-type parent))
  (-type-properties [_] (-type-properties parent))
  (-properties-schema [_ options] (-properties-schema parent options))
  (-children-schema [_ options] (-children-schema parent options))

  Schema
  ...
  (-parent [this] this)

the good

each schema can decide wether to implement IntoSchema

the bad

few lines of boilerplate.

bsless · 2021-08-23T11:35:59Z

Something to be wary of if you implement IntoSchema on Schema is that all checks of either Schema or IntoSchema check for Schema first or you'll do double work, no?

ikitommi · 2021-08-23T11:58:48Z

It would foremost still be a Schema, as it is checked first on m/schema. It could be used with m/into-schema, which currently doesn't allow Schema instances. Not 100% sure would there be an issue, please tell if you know one!

But, only use case for this would be to reuse some heavy calculation like parsing. But, already implemented the child equality check, so this is not needed in the the case "nothing changed":

(def schema
  (m/schema
    [:map
     [:x boolean?]
     [:y {:optional true} int?]
     [:z [:map
          [:x boolean?]
          [:y {:optional true} int?]]]]))

;; 26µs => 1.3µs
(m/walk schema (m/schema-walker identity))

... for the case "something changed", this would make sense, the copy-constructor could pass in partial parse results. Not sure if that's worth all the extra code. It could be generic code, but still.

new flamegraph of the identity walker:

bsless · 2021-08-23T12:03:45Z

I meant specifically fns like:

(defn- -schema [?schema options]
  (or (and (or (schema? ?schema) (into-schema? ?schema)) ?schema)
      (-lookup ?schema options)
      (-fail! ::invalid-schema {:schema ?schema})))

It should be okay, though.

But, already implemented the child equality check, so this is not needed in the the case "nothing changed":

Someone will invent something that will surprise us.

ikitommi · 2021-08-23T12:04:44Z

Here's a btw a comparison of a clojure.core identity walker:

;; 10µs
(clojure.walk/postwalk
  identity
  [:map
   [:x boolean?]
   [:y {:optional true} int?]
   [:z [:map
        [:x boolean?]
        [:y {:optional true} int?]]]])

ikitommi · 2021-08-23T12:08:49Z

urgh, the m/-schema is broken - it's used to look both Schema instances and IntoSchema instances. Depending on where it's called it either works or not. Here it fails:

(m/into-schema (m/schema :tuple) {} [:int])
; =throws=> No implementation of method: :-into-schema of protocol: #'malli.core/IntoSchema found for class: malli.core$_tuple_schema$reify$reify__24082

bsless · 2021-08-23T12:27:39Z

more complete perf suite would be great++

Here you go #525 :)

ikitommi · 2021-08-24T13:38:00Z

The current improvements from master in numbers:

(def ?schema
  [:map
   [:x boolean?]
   [:y {:optional true} int?]
   [:z [:map
        [:x boolean?]
        [:y {:optional true} int?]]]])
        
;; 44µs => 9.4µs (4x)
(def schema (m/schema ?schema))

;; 26µs => 1.3µs (20x)
(m/walk schema (m/schema-walker identity))

;; 51µs => 7.2µs (7x)
(mu/closed-schema schema)

Creation

Closed Map

ikitommi · 2021-08-24T13:42:29Z

TODO, maybe:

I have a spike on the instance caching, the perf gains were not good enough with old/slow parsing, now it could show up
the copy-constructor, for entry-schemas, might be good, or not
delayed forms / entries / heavy computations
check that children are not moved over a slow abstraction, e.g. first on vector, turning sequences to vectors (without a good reason) etc.

ikitommi · 2021-08-25T07:44:10Z

some more improvements from #531 (faster parsing & utilities)

;; 1.7µs => 1.1µs
(bench (m/schema [:or :int :string]))

;; 400ns -> 280ns
(bench (m/schema :int))

(def ?schema
  [:map
   [:x boolean?]
   [:y {:optional true} int?]
   [:z [:map
        [:x boolean?]
        [:y {:optional true} int?]]]])

;; 9.4µs -> 8.5µs
(bench (m/schema ?schema))

bsless · 2021-08-25T07:45:51Z

some more improvements from #531 (faster parsing & utilities)

How does front-loading the destructuring work with taking functions as args in simple schema?

ikitommi · 2021-08-25T07:53:20Z

I think it's bit slower with destructuring, but works (destructuring function returns nil) and it's not a very common case. Relevant tests: https://github.com/metosin/malli/blob/master/test/malli/core_test.cljc#L2045-L2089

ikitommi · 2021-08-26T08:46:11Z

[metosin/malli "0.7.0-20210826.064206-1"] has all the currently cumulated improvements in.

ikitommi · 2021-09-12T07:46:27Z

One more small improvement (#539) coming, will check open PRs, but then closing this as the low-hanging fruits are collected, I think performance is more of a journey than a goal. That PR removes the last calls to satisfies?, making things like m/entries, m/deref and m/deref-all order of magnitude faster on JVM, less on the JS runtimes.

Also, entry-schemas can cache their parse-results and m/-set-entries and m/-set-properties can use that. As mostly everything is built on top of those, the perf gains span over most utilities.

Cumulative gains so far, on the JVM:

(def schema
  [:map
   [:x boolean?]
   [:y {:optional true} int?]
   [:z [:map
        [:x boolean?]
        [:y {:optional true} int?]]]])

(def schema (m/schema ?schema))

;; 44µs -> 3.4µs (13x)
(bench (m/schema ?schema))

;; 4.2µs -> 830ns (4.5x)
(bench (mu/assoc schema :w :string))

;; 134µs -> 15µs (9x)
(bench (mu/merge schema schema))

;; 51µs -> 3.9µs (13x)
(bench (mu/closed-schema schema))

There is clearly a need to be a way to create large amount of "raw" schemas programmatically without checking all the intermediate steps for correctness, pre-create validators etc. I think the way to go is to add the first class support, described in #406. As it's all data, there is a clear separation of "constructing things" and "instantiating schemas. Also, I really like the clj-fx syntax, so malli should do that.

Will work on that with the new Clojurists Together funding.

bsless · 2021-09-12T08:03:15Z

It's only tangentially related but I imagine #498 could have benefits in the context of a long running application

ikitommi · 2021-09-13T17:35:24Z

The entry-parsing was doing some things twice, and did a small tweak on merge. cumulative:

;; 134µs
;;  9µs (15x)
(bench (mu/merge schema schema))

;; 44µs
;; 3.4µs (13x)
(m/schema
 [:map
  [:x boolean?]
  [:y {:optional true} int?]
  [:z [:map
       [:x boolean?]
       [:y {:optional true} int?]]]])

ikitommi · 2021-09-14T18:27:25Z

all the stuff in:

➜  ~ clj -Sforce -Sdeps '{:deps {metosin/malli {:mvn/version "0.7.0-SNAPSHOT"}}}'
Downloading: metosin/malli/0.7.0-SNAPSHOT/malli-0.7.0-20210914.182554-2.pom from clojars

ikitommi · 2021-09-19T16:22:06Z

Forked #543 out of this with, first try already gave 15x better perf on schema creation (3.7µs -> 250ns).

ikitommi · 2021-09-19T16:23:13Z

There are some open PRs, but they can be merged later, closing this one.

ikitommi · 2021-10-24T18:13:52Z

Master has now support for both memoized workers and lazy parsing (#550):

(def ?schema
  [:map
   [:x boolean?]
   [:y {:optional true} int?]
   [:z [:map
        [:x boolean?]
        [:y {:optional true} int?]]]])

(def schema (m/schema ?schema))

memoized workers:

;; 1.6µs -> 64ns (25x)
(p/bench (m/validate schema {:x true, :z {:x true}}))

lazy parsing:

;; 44µs -> 2.5µs (18x)
(bench (m/schema ?schema))

;; 44µs -> 240ns (180x, lazy)
(p/bench (m/schema ?schema {::m/lazy-entries true}))

Upcoming Schme AST (#544) will remove most of the parsing overhead, instead of just postponing it.

ikitommi · 2021-10-31T21:12:23Z

Master also has now the Schema AST merged, yielding (hiccup) parse-free schema creation:

(def ast (m/ast ?schema))
;{:type :map,
; :keys {:x {:order 0, :value {:type boolean?}},
;        :y {:order 1, :value {:type int?}
;            :properties {:optional true}},
;        :z {:order 2,
;            :value {:type :map,
;                    :keys {:x {:order 0
;                               :value {:type boolean?}},
;                           :y {:order 1 
;                               :value {:type int?}
;                               :properties {:optional true}}}}}}}

;; 150ns (16x)
(p/bench (m/from-ast ast))

(-> ?schema
    (m/schema)
    (m/ast)
    (m/from-ast)
    (m/form)
    (= ?schema))
; => true

ikitommi added the enhancement New feature or request label Aug 17, 2021

ikitommi mentioned this issue Aug 17, 2021

Should schemas memoize their derived objects? #498

Open

ikitommi added a commit that referenced this issue Aug 18, 2021

initial perf suite for #513

272299a

This was referenced Aug 21, 2021

Faster registry #517

Merged

Add profiling utilities #518

Merged

Implement schemas #519

Closed

Manually destructure simple-schema props #520

Merged

Faster schema constructor vector and into-schema #521

Closed

bsless mentioned this issue Aug 23, 2021

Reimplement -parse-entries #522

Merged

bsless mentioned this issue Aug 24, 2021

Add lightweight keyset wrapper #527

Open

ikitommi mentioned this issue Sep 17, 2021

Schema AST - Fast Schema Creation and Transformations #543

Closed

ikitommi closed this as completed Sep 19, 2021

ikitommi mentioned this issue Mar 20, 2022

Make schema?, into-schema? open to extensions #664

Open

ikitommi mentioned this issue Mar 9, 2023

Fail when map schema is missing a type for a key #864

Merged

ikitommi mentioned this issue Sep 24, 2023

malli.core/into-schema? returns true i.f.f. implements protocol interface (not protocol) #960

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance of Schema and Worker Creation #513

Performance of Schema and Worker Creation #513

ikitommi commented Aug 17, 2021 •

edited

Loading

bsless commented Aug 17, 2021 •

edited

Loading

ikitommi commented Aug 17, 2021

ikitommi commented Aug 22, 2021

bsless commented Aug 22, 2021

bsless commented Aug 23, 2021

bsless commented Aug 23, 2021

ikitommi commented Aug 23, 2021

ikitommi commented Aug 23, 2021 •

edited

Loading

bsless commented Aug 23, 2021

ikitommi commented Aug 23, 2021

bsless commented Aug 23, 2021

ikitommi commented Aug 23, 2021

ikitommi commented Aug 23, 2021

bsless commented Aug 23, 2021

ikitommi commented Aug 24, 2021 •

edited

Loading

ikitommi commented Aug 24, 2021 •

edited

Loading

ikitommi commented Aug 25, 2021

bsless commented Aug 25, 2021

ikitommi commented Aug 25, 2021

ikitommi commented Aug 26, 2021

ikitommi commented Sep 12, 2021 •

edited

Loading

bsless commented Sep 12, 2021

ikitommi commented Sep 13, 2021

ikitommi commented Sep 14, 2021

ikitommi commented Sep 19, 2021

ikitommi commented Sep 19, 2021

ikitommi commented Oct 24, 2021 •

edited

Loading

ikitommi commented Oct 31, 2021

Performance of Schema and Worker Creation #513

Performance of Schema and Worker Creation #513

Comments

ikitommi commented Aug 17, 2021 • edited Loading

Problem Statement

What can be done

1. Creating Schema instances

2. Transforming Schemas

3. Guidelines and Tools

TODO

bsless commented Aug 17, 2021 • edited Loading

ikitommi commented Aug 17, 2021

ikitommi commented Aug 22, 2021

bsless commented Aug 22, 2021

bsless commented Aug 23, 2021

bsless commented Aug 23, 2021

ikitommi commented Aug 23, 2021

ikitommi commented Aug 23, 2021 • edited Loading

the good

the bad

bsless commented Aug 23, 2021

ikitommi commented Aug 23, 2021

bsless commented Aug 23, 2021

ikitommi commented Aug 23, 2021

ikitommi commented Aug 23, 2021

bsless commented Aug 23, 2021

ikitommi commented Aug 24, 2021 • edited Loading

Creation

Closed Map

ikitommi commented Aug 24, 2021 • edited Loading

ikitommi commented Aug 25, 2021

bsless commented Aug 25, 2021

ikitommi commented Aug 25, 2021

ikitommi commented Aug 26, 2021

ikitommi commented Sep 12, 2021 • edited Loading

bsless commented Sep 12, 2021

ikitommi commented Sep 13, 2021

ikitommi commented Sep 14, 2021

ikitommi commented Sep 19, 2021

ikitommi commented Sep 19, 2021

ikitommi commented Oct 24, 2021 • edited Loading

ikitommi commented Oct 31, 2021

ikitommi commented Aug 17, 2021 •

edited

Loading

bsless commented Aug 17, 2021 •

edited

Loading

ikitommi commented Aug 23, 2021 •

edited

Loading

ikitommi commented Aug 24, 2021 •

edited

Loading

ikitommi commented Aug 24, 2021 •

edited

Loading

ikitommi commented Sep 12, 2021 •

edited

Loading

ikitommi commented Oct 24, 2021 •

edited

Loading