-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
encoding/json: fix and optimize marshal for quoted string #34127
Conversation
This PR (HEAD: d4bf400) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/193604 to see it. Tip: You can toggle comments from me using the |
Message from Daniel Martí: Patch Set 1: Any optimization must include benchmark numbers via benchstat. For example, look at recent speed-ups in the json package. BenchmarkCodeEncoder might be a good start. Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
@mvdan Thanks for your hint about the benchmarks. From my point of view, the change was not primary driven by performance optimization in mind, this came more as a welcome side effect. My motivation for the proposed change is:
Nevertheless I will try to have a look at |
d4bf400
to
d417e71
Compare
This PR (HEAD: d417e71) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/193604 to see it. Tip: You can toggle comments from me using the |
@mvdan I had a look at the benchmarks in The change affects the encoding of struct like: type Foo struct {
Bar string `json:"bar,string"`
} While looking through the commit messages for the recent changes on
I would argue, that this change also slightly improves encoding performance and that it is not very common nor part of any existing benchmarks. Would this be fine as well or do you insist on benchmarks for this particular case? |
Please reply on Gerrit. I'd rather not keep the discussion in two places. |
Message from Daniel Martí: Patch Set 2: Code-Review-1 There were some replies on GitHub - hopefully the author can post them here. Until then, -1 as per my comment above. Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Lucas Bremgartner: Patch Set 2:
Copy'n'paste from #34127 (comment) @mvdan Thanks for your hint about the benchmarks. From my point of view, the change was not primary driven by performance optimization in mind, this came more as a welcome side effect. My motivation for the proposed change is: remove the unnecessary error case Copy'n'paste from #34127 (comment) @mvdan I had a look at the benchmarks in encoding/json and non of them contains struct fields with the string option set in the struct tags. The change affects the encoding of struct like: type Foo struct { ... I would argue, that this change also slightly improves encoding performance and that it is not very common nor part of any existing benchmarks. Would this be fine as well or do you insist on benchmarks for this particular case? Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Lucas Bremgartner: Patch Set 2:
Please also consider the issue #34154, because it is related to this PR (in fact the issue is solved by this PR). Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Lucas Bremgartner: Patch Set 3: Commit message was updated. Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
d417e71
to
b4da9fd
Compare
This PR (HEAD: b4da9fd) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/193604 to see it. Tip: You can toggle comments from me using the |
b4da9fd
to
96aea0f
Compare
This PR (HEAD: 96aea0f) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/193604 to see it. Tip: You can toggle comments from me using the |
Message from Daniel Martí: Patch Set 8: (5 comments) Thanks, I didn't know there was an issue to fix here. My previous CL didn't include benchmark numbers because it wasn't a performance optimization. I was simply deleting dead code, so I imagined that the edge case would be a bit faster. If this CL is about making JSON encoding follow an option correctly, and that's the bug you filed, then no need to add the benchmark. I only insisted on that because the original CL was written in a way that seemed like you were optimizing the code :) Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Since Go 1.2 every string can be marshaled to JSON without error even if it contains invalid UTF-8 byte sequences. Therefore there is no need to use Marshal again for the only reason of enclosing the string in double quotes. Not using Marshal here also removes the error check as there has not been a way for Marshal to fail anyway. name old time/op new time/op delta Issue34127-4 360ns ± 3% 200ns ± 3% -44.56% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Issue34127-4 56.0B ± 0% 40.0B ± 0% -28.57% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Issue34127-4 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5) Fixes golang#34154
96aea0f
to
9b0ac1d
Compare
This PR (HEAD: 9b0ac1d) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/193604 to see it. Tip: You can toggle comments from me using the |
Message from Lucas Bremgartner: Patch Set 10: (5 comments) I have updated the commit message and integrated the test into an existing table driven test. I have also added the ideas, I considered regarding the implementation without make+append. Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
For some reason I cannot comment on the CL. I have a stupid suggestion that might be worth trying. In https://go-review.googlesource.com/c/go/+/194338 an encodeBuffer from Pool is used to optimize allocations. Being in the same package it might also lend itself here instead of the bytes array. However, it contains overhead and cannot be allocated to the desired size. May not be worth it in the end or not even the cleanest thing to do. |
Message from Andreas Goetz: Patch Set 10:
Couldn't 607 just be like this:
without the additional []byte allocation? Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Lucas Bremgartner: Patch Set 10:
This was my very first try as well, when I approached the problem. Unfortunately, this does not work, because e.string adds the double quotes as well (see: go/src/encoding/json/encode.go Line 954 in cf63058
Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Andreas Goetz: Patch Set 10:
Ok, please tell me off if I'm confusing the discussion. Isn't then your patch really equal to:
and basically means that the entire optQuoted block could really be removed? Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Lucas Bremgartner: Patch Set 10:
No, the optQuoted flag is controlled by the "string" option in the json struct tags. The docs (https://golang.org/pkg/encoding/json/#Marshal) says:
This means, that any string, that is encoded, is put into additional (escaped) quotes. E.g.
I suggest, you consult the added test case (https://go-review.googlesource.com/c/go/+/193604/10/src/encoding/json/stream_test.go) to see the expected output from a given input. Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Daniel Martí: Patch Set 10: Run-TryBot+1 Code-Review+2 Thanks for the throrough explanation, Lucas! LGTM. Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Gobot Gobot: Patch Set 10: TryBots beginning. Status page: https://farmer.golang.org/try?commit=ed63614a Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Message from Gobot Gobot: Patch Set 10: TryBot-Result+1 TryBots are happy. Please don’t reply on this GitHub thread. Visit golang.org/cl/193604. |
Since Go 1.2 every string can be marshaled to JSON without error even if it contains invalid UTF-8 byte sequences. Therefore there is no need to use Marshal again for the only reason of enclosing the string in double quotes. Not using Marshal here also removes the error check as there has not been a way for Marshal to fail anyway. name old time/op new time/op delta Issue34127-4 360ns ± 3% 200ns ± 3% -44.56% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Issue34127-4 56.0B ± 0% 40.0B ± 0% -28.57% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Issue34127-4 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5) Fixes #34154 Change-Id: Ib60dc11980f9b20d8bef2982de7168943d632263 GitHub-Last-Rev: 9b0ac1d GitHub-Pull-Request: #34127 Reviewed-on: https://go-review.googlesource.com/c/go/+/193604 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
This PR is being closed because golang.org/cl/193604 has been merged. |
Since Go 1.2 every string can be marshaled to JSON without error even if it contains invalid UTF-8 byte sequences. Therefore there is no need to use Marshal again for the only reason of enclosing the string in double quotes. Not using Marshal here also removes the error check as there has not been a way for Marshal to fail anyway. name old time/op new time/op delta Issue34127-4 360ns ± 3% 200ns ± 3% -44.56% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Issue34127-4 56.0B ± 0% 40.0B ± 0% -28.57% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Issue34127-4 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5) Fixes #34154 Change-Id: Ib60dc11980f9b20d8bef2982de7168943d632263 GitHub-Last-Rev: 9b0ac1d4c5318b6bf9ed7930320f2bd755f9939c GitHub-Pull-Request: golang/go#34127 Reviewed-on: https://go-review.googlesource.com/c/go/+/193604 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
Since Go 1.2 every string can be marshaled to JSON without error even if it contains invalid UTF-8 byte sequences. Therefore there is no need to use Marshal again for the only reason of enclosing the string in double quotes. Not using Marshal here also removes the error check as there has not been a way for Marshal to fail anyway. name old time/op new time/op delta Issue34127-4 360ns ± 3% 200ns ± 3% -44.56% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Issue34127-4 56.0B ± 0% 40.0B ± 0% -28.57% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Issue34127-4 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5) Fixes #34154 Change-Id: Ib60dc11980f9b20d8bef2982de7168943d632263 GitHub-Last-Rev: 9b0ac1d4c5318b6bf9ed7930320f2bd755f9939c GitHub-Pull-Request: golang/go#34127 Reviewed-on: https://go-review.googlesource.com/c/go/+/193604 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
Since Go 1.2 every string can be marshaled to JSON without error even if it contains invalid UTF-8 byte sequences. Therefore there is no need to use Marshal again for the only reason of enclosing the string in double quotes. Not using Marshal here also removes the error check as there has not been a way for Marshal to fail anyway. name old time/op new time/op delta Issue34127-4 360ns ± 3% 200ns ± 3% -44.56% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Issue34127-4 56.0B ± 0% 40.0B ± 0% -28.57% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Issue34127-4 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5) Fixes #34154 Change-Id: Ib60dc11980f9b20d8bef2982de7168943d632263 GitHub-Last-Rev: 9b0ac1d4c5318b6bf9ed7930320f2bd755f9939c GitHub-Pull-Request: golang/go#34127 Reviewed-on: https://go-review.googlesource.com/c/go/+/193604 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
Since Go 1.2 every string can be marshaled to JSON without error even if it contains invalid UTF-8 byte sequences. Therefore there is no need to use Marshal again for the only reason of enclosing the string in double quotes. Not using Marshal here also removes the error check as there has not been a way for Marshal to fail anyway. name old time/op new time/op delta Issue34127-4 360ns ± 3% 200ns ± 3% -44.56% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Issue34127-4 56.0B ± 0% 40.0B ± 0% -28.57% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Issue34127-4 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5) Fixes #34154 Change-Id: Ib60dc11980f9b20d8bef2982de7168943d632263 GitHub-Last-Rev: 9b0ac1d4c5318b6bf9ed7930320f2bd755f9939c GitHub-Pull-Request: golang/go#34127 Reviewed-on: https://go-review.googlesource.com/c/go/+/193604 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org>
Since Go 1.2 every string can be marshaled to JSON without error even if it
contains invalid UTF-8 byte sequences. Therefore there is no need to use
Marshal again for the only reason of enclosing the string in double quotes.
Not using Marshal here also removes the error check as there has not been a
way for Marshal to fail anyway.
name old time/op new time/op delta
Issue34127-4 360ns ± 3% 200ns ± 3% -44.56% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Issue34127-4 56.0B ± 0% 40.0B ± 0% -28.57% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Issue34127-4 3.00 ± 0% 2.00 ± 0% -33.33% (p=0.008 n=5+5)
Fixes #34154