Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protojson output is non deterministic #1373

Closed
howardjohn opened this issue Oct 18, 2021 · 5 comments
Closed

protojson output is non deterministic #1373

howardjohn opened this issue Oct 18, 2021 · 5 comments

Comments

@howardjohn
Copy link

What version of protobuf and what language are you using?
v1.27.1, golang (tried 1.15, 1.16, and 1.17)

What did you do?
Wrote a small code snippet using protojson to marshal some protobuf to json.

What did you expect to see?
Output is consistent.

What did you see instead?
Output is not consistent based on compilation. For a given set of code + compiler settings, the output seems deterministic. However, making random changes (literally like _ = 1) will, seemingly randomly, result in a different marshalling output.

I am testing like:

for flags in '' '-gcflags=' '-gcflags=all=' '-gcflags=all=-l' '-gcflags=all=-N'  '-gcflags=all=-l -N'; do
    echo go test -c "$flags" ./pkg/bootstrap/test
    go test -c "$flags" ./pkg/bootstrap/test
    ./test.test -test.v
done

Test code:

package test

import (
	"fmt"
	"testing"

	core "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
	"github.com/golang/protobuf/jsonpb"
	"google.golang.org/protobuf/encoding/protojson"
)

func TestFlake(t *testing.T) {
	out := &core.Node{
		Id:      "foo",
		Cluster: "bar",
	}
	b, err := protojson.MarshalOptions{}.Marshal(out)
	if err != nil {
		t.Fatalf("failed to marshal: %v", err)
	}
	fmt.Println(string(b))
	s, err := (&jsonpb.Marshaler{}).MarshalToString(out)
	fmt.Println(s)
}

Example outcome:

go test -c ./pkg/bootstrap/test
=== RUN   TestFlake
{"id":"foo","cluster":"bar"}
{"id":"foo","cluster":"bar"}
--- PASS: TestFlake (0.00s)
PASS
go test -c -gcflags= ./pkg/bootstrap/test
=== RUN   TestFlake
{"id":"foo","cluster":"bar"}
{"id":"foo","cluster":"bar"}
--- PASS: TestFlake (0.00s)
PASS
go test -c -gcflags=all= ./pkg/bootstrap/test
=== RUN   TestFlake
{"id":"foo","cluster":"bar"}
{"id":"foo","cluster":"bar"}
--- PASS: TestFlake (0.00s)
PASS
go test -c -gcflags=all=-l ./pkg/bootstrap/test
=== RUN   TestFlake
{"id":"foo", "cluster":"bar"}
{"id":"foo","cluster":"bar"}
--- PASS: TestFlake (0.00s)
PASS
go test -c -gcflags=all=-N ./pkg/bootstrap/test
=== RUN   TestFlake
{"id":"foo","cluster":"bar"}
{"id":"foo","cluster":"bar"}
--- PASS: TestFlake (0.00s)
PASS
go test -c -gcflags=all=-l -N ./pkg/bootstrap/test
=== RUN   TestFlake
{"id":"foo","cluster":"bar"}
{"id":"foo","cluster":"bar"}
--- PASS: TestFlake (0.00s)
PASS

Note the 4th result has an extra space. Which of the tests adds space is not deterministic either and changes as random changes to the code are made.

jsonpb results are always constant.

This can be reproduced at howardjohn/istio@5d29413. I am able to reproduce it on two different linux amd64 machines, as well as in docker.

To me, this suggests a compiler bug but opening it here first since it seems to only impact protojson.

@neild
Copy link
Contributor

neild commented Oct 18, 2021

This is operating as intended:
https://developers.google.com/protocol-buffers/docs/reference/go/faq#unstable-json

We make no promises about the long-term stability of Go's implementation of the JSON format for protocol buffers. The specification only specifies what is valid JSON, but provides no specification for a canonical format for how a marshaler ought to exactly format a given message. To avoid giving the illusion that the output is stable, we deliberately introduce minor differences so that byte-for-byte comparisons are likely to fail.

To gain some degree of output stability, we recommend running the output through a JSON formatter.

@howardjohn
Copy link
Author

howardjohn commented Oct 18, 2021

I don't really understand how this library is intended to be used in the real world then. Is it just for debugging, and not production usage? Storing protos as json is pretty useful for a wide variety of use cases. Inserting non determinism makes this extremely hard to do, even returning random spaces in a JSON response blob on a website is a bit sloppy. It is not something that is present in the go ecosystem in the legacy protobuf library, gogo protobuf, or encoding/json; at least things like golang's randomized map order you can reasonably assume its an intended thing. Randomly inserted spaces do not look like anything but a race condition.

Running output through a json formatter is not viable as we experience an incredibly high runtime cost.

In the current state this will block our adoption of the library. We rely on outputting protos to json and triggering events when that response changes. If each build may have completely different results, then that becomes very tricky. Its one thing if its a per-protobuf-version thing - then it will just switch on upgrade which is NBD. But randomly switching for every different binary is much harder to handle.

Its also impossible to unit test. This is clear from protobuf's own tests putting a magic Disable() function that is hidden under internal.

Is it not reasonable to add an option to be stable? Just because the spec doesn't declare that the format MUST be exactly formatted one way, that doesn't mean the library couldn't pick one way and stick to it. Or at the very least making the change boundary a protobuf version.

@dsnet
Copy link
Member

dsnet commented Oct 18, 2021

I don't really understand how this library is intended to be used in the real world then.

Most of the "real world" just marshals and unmarshals data through gRPC and don't care about the exact serialized form. Non-determinism doesn't interfere with storing protos as JSON since you can still unmarshal the data when reading it out of a database.

This is clear from protobuf's own tests putting a magic Disable() function that is hidden under internal.

It's appropriate for a module to disable it's own non-determinism since it's in full control of the implementation. It's similar to how the tests for the Go runtime itself may disable non-determinism of maps and other runtime artifacts for the sake of testing the Go runtime. The Go runtime tests are allowed to do that since it fully controls the Go runtime implementation.

Is it not reasonable to add an option to be stable?

Not unless there is a well-specified approach for how each language should serialize a stable version of a protobuf message.

I recommend asking the protobuf team to prioritize having that behavior be specified. When I was a Google, I wrote a 30 page specification (with examples) and a reference implementation for canonically serializing a message. However, that effort didn't really go anywhere as it wasn't a priority for the protobuf team. The best way to get this to be available is to go to the protobuf team and help them see this a priority.

Just because the spec doesn't declare that the format MUST be exactly formatted one way, that doesn't mean the library couldn't pick one way and stick to it.

If every language did that, then we would just end up with:

  • the Go-flavored version of canonical output,
  • the Java-flavored version of canonical output,
  • the C++-flavored version of canonical output,
  • etc.

And none of them would be compatible with each other. If you depended on stability and then migrated your application from C++ to Go or something, then you're stuck since you depending on an artifact of a specific implementation. Also, you should be able to derive a stable database key from the canonical output and use it to index into a database regardless of which programming language was used to serialize the key. The canonical specifications that I wrote are trying to solve this problem in a way that all the language implementations can have the exact same semantics.

@dsnet
Copy link
Member

dsnet commented Oct 18, 2021

Also, this is probably a duplicate of #1121

@howardjohn
Copy link
Author

Thanks for the info and in #1121. Will close in favor of #1121

jchadwick-buf added a commit to connectrpc/connect-go that referenced this issue Feb 21, 2023
Unfortunately, protobuf offers no story whatsoever for canonicalization
of protobuf or protojson output. It seems the best we can get is to just
make it deterministic within each implementation.

Related issues:
* golang/protobuf#1121
* golang/protobuf#1373
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants