Do something about JSON between repc and replicache-sdk-js #167
Comments
I have no idea how close we can get to this, but it would be really neat if we could:
From looking at the profile we should expect this to be insanely fast, as basically the entire profile is doing things other than reading the thing out of IDB and FlatBufferizing it.
Update: I do not think it is realistically possible to avoid some of these copies for a few reasons:
However: the big thing in the profile appears to be JSON encoding, and we can get rid of all of that. We don't actually need to send JSON to JS. We're in process! wasm_bindgen supports relatively rich interop, so we can send all kinds of things. We can create and integrate with all manner of

First idea:

First of all, we don't need to build up a list of things to return to JS. wasm_bindgen supports reflecting stateful structs with methods. We could have an Iterator in Rust that we return from scan() to JS. Once we have that, we can just return

However, this will get rid of all the JSON serialization, which will probably help a ton. So after this it would look like:
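A minimal sketch of how the first idea might look from the JS side, written in TypeScript. `ScanItem`, `ScanIterator`, `FakeScanIterator` and `firstKeys` are hypothetical names invented for illustration; the fake class only stands in for a wasm_bindgen-exported Rust struct so the example runs without wasm. The point it illustrates is that the caller pulls items one at a time as bytes, so nothing is JSON-encoded and items that are never requested never cross the boundary.

```typescript
// Hypothetical shape of one scan result; the real repc types may differ.
interface ScanItem {
  key: string;
  value: Uint8Array; // raw bytes rather than JSON text
}

// Hypothetical handle a wasm_bindgen-exported Rust struct could present to JS.
interface ScanIterator {
  next(): ScanItem | undefined; // undefined signals the end of the scan
}

// In-memory stand-in for the Rust side so the sketch runs without wasm.
class FakeScanIterator implements ScanIterator {
  pulled = 0; // how many items have crossed the (pretend) wasm boundary
  private items: ScanItem[];

  constructor(items: ScanItem[]) {
    this.items = items;
  }

  next(): ScanItem | undefined {
    if (this.pulled >= this.items.length) return undefined;
    return this.items[this.pulled++];
  }
}

// Pulls items one at a time and stops as soon as `limit` keys are collected.
function firstKeys(it: ScanIterator, limit: number): string[] {
  const keys: string[] = [];
  while (keys.length < limit) {
    const item = it.next();
    if (item === undefined) break;
    keys.push(item.key);
  }
  return keys;
}

const fake = new FakeScanIterator([
  { key: "a", value: new Uint8Array([1]) },
  { key: "b", value: new Uint8Array([2, 3]) },
  { key: "c", value: new Uint8Array([4]) },
]);
const firstTwo = firstKeys(fake, 2);
console.log(firstTwo, fake.pulled);
```

Here only two of the three items are ever pulled, which is the behavior a list-returning scan() cannot offer.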
Idea the second

We can go farther. If JS is going to immediately parse the data anyway (into JSON, PB, whatever), it is possible to (unsafely) return a view directly into WASM memory to JS (https://rustwasm.github.io/wasm-bindgen/api/js_sys/struct.Uint8Array.html#method.view). JS must immediately use this data because any Rust allocations can invalidate it.

1-3. same as above. For each loop:
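The hazard with such a view can be sketched from the JS side in TypeScript. This is an illustration under assumptions, not repc code: a bare `WebAssembly.Memory` stands in for the module's linear memory, and `ptr` and `len` are made-up values for where Rust might have written a value. It shows one way a view goes stale, namely memory growth detaching the old buffer; an allocation that reuses the same bytes without growing memory would corrupt the view more quietly.

```typescript
interface StaleViewResult {
  staleLength: number; // length of the old view after memory grew
  copied: number[];    // bytes copied out before memory grew
  fresh: number[];     // bytes re-read through a new view after growth
}

function demoStaleView(): StaleViewResult {
  // Stand-in for the wasm module's linear memory (1 page, growable to 2).
  const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });

  // Pretend Rust wrote a 3-byte value at this hypothetical pointer.
  const ptr = 16;
  const len = 3;
  new Uint8Array(memory.buffer).set([7, 8, 9], ptr);

  // Zero-copy view over linear memory, analogous to what Uint8Array::view yields.
  const view = new Uint8Array(memory.buffer, ptr, len);

  // Safe pattern: decode or copy right away, before Rust runs again.
  const copy = view.slice();

  // A later Rust allocation may grow linear memory; growing detaches the
  // ArrayBuffer that `view` was created over.
  memory.grow(1);

  // The bytes still exist, but only through a view on the new buffer.
  const fresh = new Uint8Array(memory.buffer, ptr, len);

  return {
    staleLength: view.length,
    copied: Array.from(copy),
    fresh: Array.from(fresh),
  };
}

console.log(demoStaleView());
```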
We save one copy, but because of the unsafe pointer this only works if the JS SDK is in charge of decoding values, which conflicts with #162.

Idea the third:

We can go even farther. Recall that this data ultimately comes from a JS ArrayBuffer that we got out of IndexedDB. We had to copy it into a Rust Vector so that we could use it with FlatBuffers (boo), but in JS land, ArrayBuffer supports copyless views: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array/Uint8Array. So instead of handing JS copies (idea #1) or unsafe Rust pointers (idea #2), we can hand JS typed arrays (Uint8Array) which are views onto the original ArrayBuffer we read.

This is pretty invasive to our design because it would require, for every key/value pair we want to return, being able to (a) get back to the original ArrayBuffer we read out of IDB, and (b) find the byte offset for the beginning of that key/value pair in the original ArrayBuffer. But if we did this, then it would look like:

1-3. same as above. For each loop:
... and if we implement #162 then the last two steps move outside the JS SDK and are not our problem anymore. Users who don't use JSON can avoid the overhead.
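A rough TypeScript sketch of the third idea, assuming we somehow know each value's byte offset and length inside the buffer read from IDB. `Span`, `viewOf`, `chunk` and the byte layout are hypothetical and chosen only for illustration; the real FlatBuffer layout is not modeled here. It shows that `new Uint8Array(buffer, byteOffset, length)` creates a view that shares storage with the original ArrayBuffer instead of copying.

```typescript
// Hypothetical record of where one value lives inside the original buffer.
interface Span {
  offset: number; // byte offset of the value in the buffer
  length: number; // byte length of the value
}

// Creates a copyless view: the returned array shares storage with `buf`.
function viewOf(buf: ArrayBuffer, span: Span): Uint8Array {
  return new Uint8Array(buf, span.offset, span.length);
}

// Stand-in for the single ArrayBuffer read out of IndexedDB.
const chunk = new ArrayBuffer(7);
new Uint8Array(chunk).set([0x6b, 0x31, 0xaa, 0xbb, 0x6b, 0x32, 0xcc]);

// Made-up offsets: value one is bytes 2..3, value two is byte 6.
const spans: Span[] = [
  { offset: 2, length: 2 },
  { offset: 6, length: 1 },
];

const valueViews = spans.map((s) => viewOf(chunk, s));

// Every view is backed by the same buffer, so no bytes were copied.
console.log(valueViews.every((v) => v.buffer === chunk));
```

The trade-off named above still applies: this only works if the offsets can be recovered for every key/value pair, and the views keep the whole original buffer alive for as long as any of them is referenced.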
For now, just fix scan; if the fix turns out to be more general, yay.
This is basically done.
Where by basically I mean:
This part is done.
Looks like right now the majority of the perf profile for scan is JSON encoding:
I'm not sure if we can do better by using a better JSON library. But stepping back and thinking about what's happening here:
embed::ScanItem and allocate its key/value string (boo)
Vector<embed::ScanItem> (boo)