API for structured serialized data #3517
It would be nice to have support for circular references in a format like this as well.
One possible option would be creating dynamic modules which could be exported and imported: w3c/FileAPI#97 (comment).
@guest271314 this format shouldn't perform any evaluation (all stored data should be completely static).
The data can be stored as text and only evaluated when necessary. |
This format cannot rely on any form of evaluation; it presents inherent security and performance concerns.
Is a goal to encrypt the data? Plain text could be converted to and from an
The goal here is a static format for storing serialized JavaScript values, analogous to JSON but with support for many more types. This API would most likely be similar to https://nodejs.org/api/v8.html#v8_serialization_api, but without the class or transferables support.
It feels like the intent here is to expose https://html.spec.whatwg.org/multipage/structured-data.html#serializable-objects as some kind of format.
@jakearchibald indeed, that is what inspired me to open this.
There might be security considerations for platform objects (step 18) that are
I think this is worth another look. As I just mentioned in the fetch repo, CBOR with tags is well suited to this problem. It's also part of WebAuthn, so there must be some CBOR logic already happening in all the major browsers.
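To make "CBOR with tags" concrete, here is a minimal hand-rolled sketch (my own illustrative bytes, not a general-purpose codec) encoding a Date as CBOR tag 1, the epoch-based date/time tag from RFC 8949:

```javascript
// Minimal hand-rolled CBOR sketch: a Date becomes tag 1 (epoch date/time)
// wrapping an unsigned 32-bit integer. Illustrative only.
function encodeDateAsCborTag1(date) {
  const secs = Math.floor(date.getTime() / 1000);
  const out = new Uint8Array(6);
  out[0] = 0xc1; // major type 6 (tag), tag number 1: epoch seconds
  out[1] = 0x1a; // major type 0 (unsigned int), 4-byte argument follows
  new DataView(out.buffer).setUint32(2, secs);
  return out;
}

// new Date(0) encodes as the bytes c1 1a 00 00 00 00
console.log(Buffer.from(encodeDateAsCborTag1(new Date(0))).toString('hex'));
```

A real implementation would of course use a full CBOR codec; the point is that tags give CBOR an extension mechanism for exactly the "more types than JSON" problem.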
It's not; I had to implement my own orchestration to make that useful cross-realm. It works, but it's not ideal.
I work with this algorithm daily and I am super sad to see my "yet another beg for it", with reasons and use cases behind it, being shut down due to this long-standing, stale issue that is going nowhere 😥
If anyone is still following this: what is blocking Chromium-based browsers from offering what Node.js has been offering for a long time? That API is allowed where security usually matters the most, the back-end, and it's not flagged as deprecated or anything; security concerns are not even mentioned. It would simply allow users on the Web, too, to store JS values as buffers and get these back without needing user-land projects for that (I maintain a structured-clone polyfill which offers that toJSON and fromJSON convention, but it's weird that I need to offer it because the platform can't). It would be lovely for this group to explain the caveats, blockers, security concerns, or issues with something that internally seems to be already implemented and used outside the Web. I love providing polyfills, and yet I always can't wait for them to become redundant, unnecessary, just overhead, once the Web offers the feature natively. Thank you!
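For context, the toJSON/fromJSON convention mentioned above boils down to type-tagging values so they survive a JSON round-trip. A deliberately minimal sketch with hypothetical helper names (the real polyfill is far more complete: circular references, typed arrays, and so on):

```javascript
// Hypothetical sketch of the user-land convention: tag non-JSON types
// with a marker object so JSON.stringify/parse can carry them.
function toJSONish(value) {
  if (value instanceof Map)
    return { $t: 'Map', v: [...value].map(([k, v]) => [toJSONish(k), toJSONish(v)]) };
  if (value instanceof Set)
    return { $t: 'Set', v: [...value].map(toJSONish) };
  if (value instanceof Date)
    return { $t: 'Date', v: value.toISOString() };
  return value;
}

function fromJSONish(value) {
  if (value && value.$t === 'Map')
    return new Map(value.v.map(([k, v]) => [fromJSONish(k), fromJSONish(v)]));
  if (value && value.$t === 'Set')
    return new Set(value.v.map(fromJSONish));
  if (value && value.$t === 'Date')
    return new Date(value.v);
  return value;
}

const roundTrip = (v) => fromJSONish(JSON.parse(JSON.stringify(toJSONish(v))));
const clone = roundTrip(new Map([['seen', new Set([new Date(0)])]]));
console.log(clone.get('seen') instanceof Set); // true
```

The complaint in the thread is precisely that libraries must keep reinventing this layer because the platform exposes the algorithm only as a live clone, never as bytes.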
I don't want to be the sloppy one here so I've read this thread and I'd like to summarize my thoughts as a user.
agreed, but also all known symbols, as I do it already in my code. Any
My polyfill handles circular references already and I think (???)
My use case, which is in production already, is the following:
Most WASM-targeting PLs are better off the main thread because they block on bootstrap; WASM blocks on bootstrap, everything blocks on bootstrap when it's not "that tiny". That is why many WASM-targeting PLs are choosing the fully async Worker way to provide their PL, but that easily fails on anything REPL-related. You can see a fully working MicroPython REPL here (make it a Pyodide if you like) and see that the main thread is never blocked. I hope one of the use cases is clear here:
I believe an awesome achievement would be to provide any valid type supported by the algorithm and nothing else ... there, Map would be safe.
Yes and no. Operations here are per-browser and don't need to be universally the same ... meaning, to me, there's no need to agree on a standard buffer result; any vendor is free to use whatever convention they like or already use internally. This could be an enabler for the feature to land sooner rather than later, as no bike-shedding is needed for the intermediate buffer:
I hope these thoughts make sense.
This one concerns me, but I wonder why that's not an issue when using
Hopefully I've grabbed all relevant topics and maybe helped move this forward ... it's just a hope; I take no as an answer, I just would like to understand the why, as nothing behind the scenes seems to be missing.
FWIW, that's a non-starter. People will write code that depends on some serialization format sooner rather than later. That's also what makes this hard: it needs to be a universal format that has buy-in from all parties. What makes platform objects hard: e.g., a
All of this might be solvable, but it's a fairly large undertaking that, compared to other issues, hasn't gotten an awful lot of traction.
@annevk I hear you, but that's why I think that should rather be "the starter", or this won't ever happen (or it'll take forever). On the other hand, we already have tons of unpredictable API results on the Web:
All I am saying is that this requirement would benefit a lot of projects that understand the caveats around it; it's like asking JSON.parse to understand php.serialize(value) (metaphorically speaking). But if that's the non-starter for everyone instead, how can we start a conversation about a reasonable format able to represent and satisfy cross-engine requirements? After 1+ year working with WASM, I've learned everyone is using their own convention around FinalizationRegistry and whatnot to make it happen, and that has actually worked to date ... so here I am asking: what is the use case for making it cross-browser when the presented use cases don't need that, and at the documentation level we can all say "you can't do this or that", as is already the case for many other APIs? Thanks.
@annevk last from me ... could FlatBuffers be a starting discussion point for providing such an API? It's already cross-platform/cross-browser and IIRC implemented by most vendors for one reason or another ... I just think that if the "agreement on the format" is what's blocking this, we have prior work around similar topics, and FlatBuffers seemed to address most issues (personal experience with a company that implemented those).
@annevk is there a world in which the desired serialization format can be specified as an argument? That would potentially allow the delivery of immediate value with a vendor-specific format without compromising the longer-term goal of shipping a vendor-neutral format. |
No, part of what makes standardization hard is that you have to think through and solve for the edge cases as you will be stuck with it essentially forever. |
@annevk if that argument, though, is the reason this issue has been stuck for 6 years, is it necessary and productive to block intermediate pragmatic approaches? 'Cause the result is otherwise no progress, and the issue stays stuck "forever" due to arguments about not wanting it to be stuck forever ... I'm seeing a catch-22 / dead loop here, and I'm trying to propose both "no need to standardize the format, keep it opaque and move forward" and also "how about FlatBuffers as a starting point for standardizing it?"
I think the reason it's been stuck is because in part there's not enough web developer demand for this functionality and mostly because nobody has taken it upon themselves to try to solve it. Having a serializer and deserializer though where the intermediate format is exposed but implementation-defined is just not something that I see succeeding. Implementation-defined behavior needs to be extremely well motivated and this does not meet that bar at all. |
So you are saying ... the thing is, Atomics and SharedArrayBuffer got low adoption after Meltdown and Spectre due to tons of friction around these primitives to start with ... I've found a way to circumvent those issues without even needing special headers, so it's time to make these primitives shine again; but of course, while perf is subpar, nobody will use these primitives ... for those who do anyway, having these "niche" APIs working well together is crucial, so another catch-22 to me ... nobody wants to use features nobody needs, because they don't know they might need such features. After all, before Atomics or SharedArrayBuffer existed, who was proposing those APIs? I hope the answer is not "some internal" or "some member of the group", because there's no way through that from users' perspective. Again, I am not trying to be hostile or anything, but rejecting APIs because they are non-existent, so that not even people using all the bricks around them can say "but there are use cases!", feels off from a Web standards user's perspective. I am trying to propose valid use cases that already exist out there (we collaborate with Universities too, and we use all these primitives behind the scenes) and trying to unblock things by proposing APIs already known, such as FlatBuffers ... what else can a user do, if, as you mentioned, it's my fault my interactions here are not productive? I don't see ways around or forward, and it saddens me. This has been open ... since 2018 ... use cases have only increased since then, not decreased; we wouldn't be here discussing this otherwise.
I think if you want this api to exist you will need to convince individual whatwg members that it would be a good idea and get them to implement it or implement it for them (note that this can be difficult, they might each say "we'll do it if another browser does it first" for example) and from that effort you can put together a spec and a test suite. |
@devsnek fair enough ... but again, |
This conversation is going in parallel at TC39 too, and that summarizes my latest thoughts around this matter ... here again for the WHATWG audience: we already have Compression and Decompression Streams, where the user is in charge of picking deflate over gzip (too bad brotli is not an option). So, since there is previous work around this topic, can we let the user decide which "transformer" is desired, as long as all of them are compatible with structuredClone types? Internally, all browsers already have a preferred (ad-hoc) choice for that, so the API I can see is something like:

```js
// 'default' means ... whatever the current browser/engine can provide itself
const serializer = new Serializer('CBOR' || 'syrup' || 'default');
const buffer = serializer.serialize(anyStructuredCloneFriendlyData);
const clone = serializer.unserialize(buffer);
```

It doesn't even need to be synchronous for Atomics.wait use cases, as it can land async and then be resolved into the SharedArrayBuffer; anything similar would work for me, plus it does answer a few points:
I hope this opens a chance to at least think about a similar API that can be incrementally landed so that users of the first kind, the |
I feel like comments in this issue thread are coming from at least two subtly different expectations, and disambiguating them might help. Here's a question that might reveal some implicit assumptions: what does it mean to do this to a Blob or File? If the idea is cold storage or cross-network communication, does that imply it wants to serialize the contents of the Blob or File (which is no longer really doing the same thing structuredClone does)? If the idea is to mimic exactly what structuredClone does but into a plain bucket of bytes, how do we know whether a Blob or File reference found in a particular bucket of bytes is referring to something that can be meaningfully reinflated by the current process?
My assumption is that:
It's true, though, that in this issue the implications for Blob or File are nowhere mentioned or explained, from what I could read; but hopefully it's clear now where I come from and what I'm interested in (exposing somehow the in/out process of that algorithm), so maybe I've answered part of your question?
I believe this is where the misunderstanding of the issue comes from. In the case of a Blob, no data is copied; the Blob object itself is just a pointer to another location where the data is supposed to be accessible. For instance, it can be a pointer to an actual file on the user's drive. When postMessaging to another context, a new pointer to the same location is added "magically" on the new Blob instance, but that pointer wasn't serialized; it's all part of the "opaque" implementation. Maybe the case of an
So for your case
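A quick sketch of the pointer-not-bytes behavior described above (assuming a runtime where Blob is structured-cloneable, such as a browser or recent Node.js):

```javascript
// Cloning a Blob produces a new Blob object backed by the same underlying
// data; the clone carries metadata and a handle, not a re-encoded payload
// the caller can inspect.
const blob = new Blob(['hello'], { type: 'text/plain' });
const clone = structuredClone(blob);

console.log(clone !== blob);         // true: a distinct Blob object...
console.log(clone.size, clone.type); // ...with the same size and MIME type
```

This is exactly why "structuredClone but into a bucket of bytes" is ambiguous for Blob: the in-memory clone never had to decide whether the bytes travel with the envelope.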
If I read MDN correctly, IndexedDB should be able to do that ... how can it restore an opaque entity if that's gone and the pointer no longer has the original reference?
the polyfill I am using (and maintaining) can deal with all structuredClone capable data ... except:
What I would need, in an ideal world, is everything but Blob which is the only case I understand problematic. This "can't Blob" could be a limitation of the |
Problem
JSON is being left in the dust as we get more and more stuff for JS, and it probably won't be getting any updates.
Goals
Intuitions
StructuredSerializeForStorage
Prior art
Other considerations