
Use of ioutil.ReadAll in decompression generates lots of garbage #2635

Closed
@bboreham

Description

[Posting for feedback; I'm interested in what you think of my idea]

What version of gRPC are you using?

v1.16.0, but the code looks the same on master

What version of Go are you using (go version)?

go version go1.11.5 linux/amd64

What operating system (Linux, Windows, …) and version?

Linux Ubuntu 16.04

What did you do?

Call a lot of gRPCs with compressed payloads in the tens-of-kilobytes range
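
For concreteness, a minimal sketch of how such compressed calls can be set up on the client side; the address and the use of WithDefaultCallOptions are assumptions for illustration, not taken from my actual code:

    package main

    import (
        "log"

        "google.golang.org/grpc"
        "google.golang.org/grpc/encoding/gzip" // importing this package registers the gzip compressor
    )

    func main() {
        // Every RPC issued through this connection's generated stubs will
        // send (and advertise acceptance of) gzip-compressed messages.
        conn, err := grpc.Dial("localhost:50051",
            grpc.WithInsecure(),
            grpc.WithDefaultCallOptions(grpc.UseCompressor(gzip.Name)),
        )
        if err != nil {
            log.Fatal(err)
        }
        defer conn.Close()
    }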

What did you expect to see?

A fair amount of garbage

What did you see instead?

Tons of garbage

Excerpt from pprof annotated source in alloc_space mode:

github.com/cortexproject/cortex/vendor/google.golang.org/grpc.recvAndDecompress
/go/src/github.com/cortexproject/cortex/vendor/google.golang.org/grpc/rpc_util.go

  Total:           0     4.95TB (flat, cum) 43.87%
    600            .          .            
    601            .          .           func recvAndDecompress(p *parser, s *transport.Stream, dc Decompressor, maxReceiveMessageSize int, inPayload *stats.InPayload, compressor encoding.Compressor) ([]byte, error) { 
    602            .   104.95GB           	pf, d, err := p.recvMsg(maxReceiveMessageSize) 
...
    626            .          .           			} 
    627            .     4.84TB           			d, err = ioutil.ReadAll(dcReader) 
    628            .          .           			if err != nil { 
    629            .          .           				return nil, status.Errorf(codes.Internal, "grpc: failed to decompress the received message %v", err) 
    630            .          .           			} 
    631            .          .           		} 
    632            .          .           	} 

In other words, one line is generating ~40% of the garbage in my program. That line is rpc_util.go line 627, shown in the excerpt above.

I believe the underlying cause is that ioutil.ReadAll starts with a 512-byte buffer and then (roughly) doubles it until it is big enough. Since we already have the full compressed data, we have a lower bound on the required buffer size, and we could guess that a buffer of some configurable multiple of that size would work well.

Suppose we have compressed data that will uncompress to 10K. By my math, ioutil.ReadAll will allocate slices of 512, 1536, 3584, 7680 and 15872 bytes, 29,184 bytes in total. If we guess somewhere between 10K and 20K, we allocate less memory and do far less copying.
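
That arithmetic can be reproduced with a rough model of how bytes.Buffer (which backs ioutil.ReadAll) grows: each new slice is about twice the old capacity plus bytes.MinRead (512). The growth rule here is my reading of the Go 1.11 standard library, so treat the model as an approximation:

    package main

    import "fmt"

    func main() {
        const target = 10 * 1024 // payload uncompresses to 10K
        size, total := 512, 0    // ioutil.ReadAll starts at bytes.MinRead
        for {
            total += size
            fmt.Println("allocate", size)
            if size >= target {
                break
            }
            size = 2*size + 512 // grow: double the capacity, plus bytes.MinRead
        }
        fmt.Println("total allocated:", total) // prints 29184
    }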

Unfortunately ioutil.ReadAll does not expose any knobs to change this behaviour, so the logic would need to be reimplemented. It is not that big.
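
A minimal sketch of the replacement I have in mind; the name readAllSized and the 2x expansion guess are placeholders for whatever configurable multiple we settle on:

    import (
        "bytes"
        "io"
    )

    // readAllSized reads r to EOF like ioutil.ReadAll, but pre-sizes the
    // buffer from a hint so the common case needs a single allocation.
    func readAllSized(r io.Reader, compressedLen int) ([]byte, error) {
        // Guess the payload expands to ~2x its compressed size; bytes.Buffer
        // still grows past the hint if the guess turns out to be too small.
        buf := bytes.NewBuffer(make([]byte, 0, 2*compressedLen))
        _, err := buf.ReadFrom(r)
        return buf.Bytes(), err
    }

One design question is how to cap the hint, perhaps at maxReceiveMessageSize, which recvAndDecompress already has in hand, so that a pathological expansion ratio cannot force a huge up-front allocation.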

Labels: P2, Type: Performance