Description
[Posting for feedback; I'm interested in what you think of my idea]
What version of gRPC are you using?
v1.16.0, but the code looks the same on master
What version of Go are you using (go version)?
go version go1.11.5 linux/amd64
What operating system (Linux, Windows, …) and version?
Linux Ubuntu 16.04
What did you do?
Call a lot of gRPCs with compressed payloads in the tens-of-kilobytes range
What did you expect to see?
A fair amount of garbage
What did you see instead?
Tons of garbage
Excerpt from pprof annotated source in alloc_space mode:
github.com/cortexproject/cortex/vendor/google.golang.org/grpc.recvAndDecompress
/go/src/github.com/cortexproject/cortex/vendor/google.golang.org/grpc/rpc_util.go
Total: 0 4.95TB (flat, cum) 43.87%
600 . .
601 . . func recvAndDecompress(p *parser, s *transport.Stream, dc Decompressor, maxReceiveMessageSize int, inPayload *stats.InPayload, compressor encoding.Compressor) ([]byte, error) {
602 . 104.95GB pf, d, err := p.recvMsg(maxReceiveMessageSize)
...
626 . . }
627 . 4.84TB d, err = ioutil.ReadAll(dcReader)
628 . . if err != nil {
629 . . return nil, status.Errorf(codes.Internal, "grpc: failed to decompress the received message %v", err)
630 . . }
631 . . }
632 . . }
In other words, this one line is generating ~40% of the garbage in my program. That line is here.
I believe the underlying cause is that ioutil.ReadAll starts with a buffer of 512 bytes and then (roughly) doubles it until it is big enough. Since we already have the full compressed data, we have a lower bound on the size of buffer required, and we could guess that a buffer some configurable multiple of that size would work well.
Suppose we have compressed data which will uncompress to 10K. By my math, ioutil.ReadAll will allocate slices of 512, 1536, 3584, 7680 and 15872 bytes, 29184 bytes in total. If we guess somewhere between 10K and 20K, we allocate less memory and do far less copying.
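As a rough sanity check, that doubling pattern can be simulated in a few lines of Go. This just models the bytes.Buffer growth rule as ReadAll drives it in this Go version (new capacity = 2×old + 512); the exact reallocation points can shift slightly depending on how the decompressor returns data, but it reproduces the numbers above:

```go
package main

import "fmt"

func main() {
	const minRead = 512        // bytes.MinRead: ReadAll's initial buffer and per-read chunk size
	const uncompressed = 10240 // the 10K example above

	capacity, total := minRead, minRead // the first slice ReadAll allocates
	for capacity < uncompressed {
		capacity = 2*capacity + minRead // bytes.Buffer's growth rule when it runs out of room
		total += capacity               // every growth allocates (and copies into) a fresh slice
	}
	fmt.Println(capacity, total) // prints: 15872 29184
}
```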
Unfortunately ioutil.ReadAll does not expose any knobs to change this behaviour, so its logic would need to be reimplemented inside gRPC; it's not that big a function.
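A minimal sketch of what that replacement could look like, assuming a helper (readAllSized is a made-up name, and the 4x multiplier is just an illustrative choice) that pre-sizes the buffer from the known compressed length; bytes.Buffer still grows if the guess is too small, so a bad guess only costs extra allocations, never correctness:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// readAllSized is a hypothetical replacement for ioutil.ReadAll that starts
// from a caller-supplied capacity guess instead of bytes.MinRead (512).
func readAllSized(r io.Reader, sizeGuess int) ([]byte, error) {
	buf := bytes.NewBuffer(make([]byte, 0, sizeGuess))
	// ReadFrom still doubles the buffer if the guess turns out to be too
	// small, so correctness never depends on the guess being accurate.
	_, err := buf.ReadFrom(r)
	return buf.Bytes(), err
}

func main() {
	// Build a compressed payload so the compressed length is known up front,
	// just as recvAndDecompress knows it after p.recvMsg.
	var compressed bytes.Buffer
	zw := gzip.NewWriter(&compressed)
	zw.Write(bytes.Repeat([]byte("cortex"), 2000)) // ~12KB uncompressed
	zw.Close()

	compressedLen := compressed.Len()
	zr, err := gzip.NewReader(&compressed)
	if err != nil {
		panic(err)
	}
	// Guess: a configurable multiple of the compressed size, 4x here.
	out, err := readAllSized(zr, 4*compressedLen)
	fmt.Println(len(out), err)
}
```

In recvAndDecompress this would amount to replacing `d, err = ioutil.ReadAll(dcReader)` with something like `d, err = readAllSized(dcReader, multiplier*len(d))`, with the multiplier exposed as a tuning option.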