Skip to content

Conversation

hudlow
Copy link
Contributor

@hudlow hudlow commented May 17, 2025

No description provided.

…ld errors when performing JSON deserialization
Comment on lines +876 to +878
describe("duplicate fields", () => {
// This depends on the ECMA-262-defined JSON.parse() behavior, which is itself
// specified to choose the last field value for a duplicate field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior has been discussed in #1038. I'm not sure these tests are particularly useful, because they basically test JSON.parse 🤔

I just noticed that the ProtoJSON documentation was recently updated with the following:

Multiple values for
singular fields (using duplicate or equivalent JSON keys) are accepted and the
last value is retained, as with binary format parsing. Note that not all
protobuf JSON parser implementations are conformant, and some nonconformant
implementations may reject duplicate keys instead.

With JSON.parse, the last property also wins for non-singular fields:

    const a = fromJsonString(
      proto3_ts.Proto3MessageSchema,
      '{ "repeatedStringField": ["b"], "repeatedStringField": ["a"] }',
    );
    const b = fromJsonString(
      proto3_ts.Proto3MessageSchema,
      '{ "repeatedStringField": ["a"], "repeatedStringField": ["b"] }',
    );

    expect(a.repeatedStringField).toStrictEqual(["a"]);
    expect(b.repeatedStringField).toStrictEqual(["b"]);

Quite different to the binary format, where repeated values accumulate...

Copy link
Contributor Author

@hudlow hudlow May 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior has been discussed in #1038.

I commented there.

I'm not sure these tests are particularly useful, because they basically test JSON.parse 🤔

That's right, and in that sense they are certainly not unit tests. I think they are worthwhile for completeness, but I'm certainly open to being convinced they're not necessary in this testing context.

I just noticed that the ProtoJSON documentation was recently updated

Yeah, I read that and it's part of what prompted me to write tests for the as-is behavior as the "correct" baseline. I was actually doing that in preparation for filing what I now know would have been a dupe of #1038. 🤪

Given this update to the spec, are tests being added to [disappears for an hour] uhm, where was I? Oh, I was going to ask if tests would be added to the conformance suite, but there are already tests verifying the opposite behavior, but the tests are broken. This was quite the bunny trail: protocolbuffers/protobuf#21873

Quite different to the binary format, where repeated values accumulate...

I think you mean to say that ProtoJSON arrays are different from the binary format where array elements can be serialized non-contiguously. 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I noticed the test cases too... Thanks for filing an issue - depending on the outcome, we can decide whether the tests here are worthwhile or not.

FWIW, I didn't see a conformance test confirming that casing other than the original field name or json name must be rejected 😬

Arrays are non-contiguous in the binary format - well said. It just occurred to me that behavior is also different with singular message fields. They can occur multiple times in the binary format, which will merge them, unlike ProtoJSON.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arrays are non-contiguous in the binary format - well said. It just occurred to me that behavior is also different with singular message fields. They can occur multiple times in the binary format, which will merge them, unlike ProtoJSON.

Interesting; I don't think I comprehend the benefit of this for a wire format. (And, as with ignoring duplicated JSON fields, I do see the drawbacks.)

There are also some interesting things around the edges of the merge algorithm, if I'm reading it right:

  • Map values for the same key don't get recursively merged even when they are a mergeable type (a repeated field, a message, or another map). This means (I think) that you couldn't replicate the merge logic in two ProtoJSON representations without considering the underlying types. (Probably not a huge deal, but a small blow to the self-describing benefits of JSON serialization.)
  • Like JSON, Protobuf is missing a set type, so there's no way to signal that a merged repeated field should be deduplicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I comprehend the benefit of this for a wire format.

Is the benefit that you can stream a Protobuf message and update a field along the way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unaware of documentation explaining the design choices.

It could be useful for some (constrained) applications of the event sourcing pattern.

@hudlow hudlow changed the title Add tests for field casing, duplicate field ordering, and unknown field errors when performing JSON deserialization Add tests for duplicate field ordering and unknown field errors when performing JSON deserialization May 22, 2025
@hudlow
Copy link
Contributor Author

hudlow commented Jun 4, 2025

Currently blocked by protocolbuffers/protobuf#21873 which is ongoing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants