Do not generate XXX fields in protobufs #1725

ppanyukov · 2019-11-06T12:19:27Z

Changes

Use options in .proto files to not generate XXX fields in Go structs.

We don't need XXX fields - the interface is so stable it is VERY unlikely we'd ever need these at all, ever. By getting rid of them we achieve:

smaller memory footprint
~~smaller payloads on the wire~~ (edit: XXX are not put on the wire it seems)
and most importantly same in-memory layout with core Prometheus structs opening path for all sorts of optimisations such as unsafe casts, leading to better memory use and faster :)
see related issue: Unify labels packages prometheus/prometheus#6029

Why this matters from memory perspective:

size of original storepb.Label type (with XXX fields): 64 bytes
size of new storepb.Label type (without XXX): 32 bytes
that's 2x reduction
considering we allocate massive slices of storepb.Label this matters

Why this matters to have same in-memory layout with core Prometheus structs:

we do tons of conversions between the prom structs and our own storepb types because of these XXX fields
this includes arrays of the same
this is not only slow, it allocates tons of memory, unnecessarily so
since now we don't have XXX fields, we can cast from type to type without allocating anything new
this is especially true for slices

The tests show Store Gateway memory allocations dropped 5% for query execution:

Showing nodes accounting for -359.20MB, 6.19% of 5798.93MB total

From code perspective, this is example of how much less memory is used for one query in blockSeries func:

    683            .          .           		s := seriesEntry{ 
    684    -122.54MB  -122.54MB           			lset: make([]storepb.Label, 0, len(lset)+len(extLset)), 
    685       2.50MB     2.50MB           			refs: make([]uint64, 0, len(chks)), 
    686    -102.04MB  -102.04MB           			chks: make([]storepb.AggrChunk, 0, len(chks)), 
    687            .          .           		}

Verification

Sample data generated with thanosbench using realistic-k8s-1w-small profile
Query executed: count({__name__=~".+"}) by (__name__)
Instrumented BucketStore.Series to dump heap profile at the start and end of the function call.
Compared the profiles using pprof:

go tool pprof -sample_index=alloc_space -edgefraction=0 -lines -base heap-series-1-after-base.pb.gz  -http=:8080 heap-series-1-after-change.pb.gz

PNG attached
Heap profiles attached

heap-series-1-after-base.pb.gz
heap-series-1-after-change.pb.gz

EDIT 1: Notes and test on compatibility

I've done some tests on the compatibility to make sure this is not a breaking change for anything:

Store Gateway: using XXX fields; Querier: not using XXX fields
Store Gateway: not using XXX fields; Querier: using XXX fields

Everything seems to work just fine between the two.

Which of course makes sense since I also discovered that XXX fields are not put on the wire :)

~~smaller payloads on the wire~~ (edit: XXX are not put on the wire it seems)

Basically this means that individual components can use or not use XXX fields and the comms between them are still OK.

Hopefully this should give people more confidence that this PR can be merged without ill effects.

We don't need them. By gettig rid of them we achieve: - smaller memory footprint - smaller payloads on the wire - same in-memory layout with core Prometheus structs opening path for all sorts of optimisations such as unsafe casts, leading to better memory use and faster :) - see related issue: prometheus/prometheus#6029 The tests show SG memory allocations dropped 5% for query execution: - Showing nodes accounting for -359.20MB, 6.19% of 5798.93MB total Signed-off-by: Philip Panyukov <ppanyukov@googlemail.com>

bwplotka

That is amazing! I think that opens many doors for improvements!

Will review later as we prepare for PromCon.

GiedriusS

Could this possibly hinder the upgrade paths in the future for our users if these types will ever be extended? I guess @bwplotka could chime in here about the actual stability of these things (:

We already extended this not such a long time ago:
#1284
#750

But I guess this trade-off could be acceptable given that we reduce the memory usage by ~5%.

ppanyukov · 2019-11-07T15:10:19Z

So, after a bit of digging on these XXX fields, it's a touchy subject apparently. They came and gone and came back again. Apparently there was a lot of upset when these fields disappeared on one release of proto3 after Google decided they don't need them internally and therefore let's ditch them for everyone. So they made it back.

To give some idea what they are and what they are used for:

https://github.com/golang/protobuf/blob/master/README.md

we reserve the right to ... adding, removing, or changing ... fields that start with XXX. These ... should not be considered part of the public API.

https://groups.google.com/forum/#!msg/golang-nuts/F5xFHTfwRnY/sPv5nTVXBQAJ

When tested inside Google, we discovered that these changes were more disruptive than expected. The cause of the issues are mostly due to additional fields added to generated messages by protoc-gen-go:

An XXX_NoUnkeyedLiteral field that the generator now creates to force users to use keyed literals (e.g., foopb.Message{Name: "Golang", Age: 8} as opposed to foopb.Message{"Golang", 8}) to ensure forward compatible usage of messages.

An XXX_unrecognized field that is necessary for proto3 to always preserve unknown fields. This breaks users that assume comparability of proto3 messages. Since XXX_unrecognized is of type []byte, this means that proto3 messages cannot be used as map keys nor directly compared using the == operator.

An XXX_sizecache field that is part of the internal implementation of the table-driven serializer. The presence of this field can break tests that use reflect.DeepEqual to compare some received messages with some golden message. It is recommended that proto.Equal be used instead for this situation.

I think we can certainly ditch XXX_NoUnkeyedLiteral.
Also XXX_unrecognized unless someone thinks we do want to preserve unrecognised fields in messages? Do we ever want this?

The XXX_sizecache is use for their table-driven serialiser which they claim made things "1.3x to 2.1x faster when tested inside Google". So hmm. I don't even know if that is a thing anymore these days or whether it's of any significance for us.

Any thoughts from anyone?

ppanyukov · 2019-11-12T11:40:15Z

Also, this produces same result as using protoc-gen-gogofaster plugin instead of protoc-gen-gogofast we are using now.

bwplotka · 2019-11-13T11:33:16Z

The XXX_sizecache is use for their table-driven serialiser which they claim made things "1.3x to 2.1x faster when tested inside Google". So hmm. I don't even know if that is a thing anymore these days or whether it's of any significance for us.

Yes. It seems to help in parsing latency. I think we can use tradeoff of parsing latency vs the memory used

bwplotka

LGTM! Just small comment to fill.

bwplotka · 2019-11-13T11:34:19Z

pkg/store/storepb/rpc.proto

@@ -11,6 +11,11 @@ option (gogoproto.marshaler_all) = true;
 option (gogoproto.unmarshaler_all) = true;
 option (gogoproto.goproto_getters_all) = false;

+// Do not generate XXX fields. As per https://stackoverflow.com/a/58088012


Let's comment why (:

Done?

// Do not generate XXX fields to reduce memory footprint and opening a door // for zero-copy casts to/from prometheus data types.

Signed-off-by: Philip Panyukov <ppanyukov@googlemail.com>

ppanyukov · 2019-11-13T17:23:11Z

More detailed and precise measurements of the effect of this change.

Memory profile deltas for start/end of each of these functions:

Querier, func (s *ProxyStore) Series(), in pkg/store/proxy.go

BEFORE:

MEM STATS DIFF:   	q-Series 	q-Series - AFTER 	-> Delta
    HeapAlloc  : 	5.25M 		1.50G 			-> 1.49G
    HeapObjects: 	16.95K 		15.76M 			-> 15.75M

MEM PROF DIFF:    	q-Series 	q-Series - AFTER 	-> Delta
    InUseBytes  : 	3.76M 		993.00M 		-> 989.24M      <===!
    InUseObjects: 	141 		26.54K 			-> 26.40K       
    AllocBytes  : 	6.95M 		2.25G 			-> 2.25G
    AllocObjects: 	344 		26.36K 			-> 26.01K

AFTER:

MEM STATS DIFF:   	q-Series 	q-Series - AFTER 	-> Delta
    HeapAlloc  : 	5.28M 		853.92M 		-> 848.64M
    HeapObjects: 	17.01K 		13.27M 			-> 13.25M

MEM PROF DIFF:    	q-Series 	q-Series - AFTER 	-> Delta
    InUseBytes  : 	3.07M 		635.79M 		-> 632.72M      <===!
    InUseObjects: 	18 		14.10K 			-> 14.09K
    AllocBytes  : 	7.20M 		1.61G 			-> 1.60G
    AllocObjects: 	29 		17.93K 			-> 17.90K

RESULT: 36% reduction for InUseBytes.

StoreGateway, func (s *BucketStore) Series(), in pkg/store/bucket.go

BEFORE:

MEM STATS DIFF:   	sg-Series 	sg-Series - AFTER 	-> Delta
    HeapAlloc  : 	5.28M 		1.47G 			-> 1.47G
    HeapObjects: 	18.67K 		12.70M 			-> 12.69M

MEM PROF DIFF:    	sg-Series 	sg-Series - AFTER 	-> Delta
    InUseBytes  : 	3.14M 		1.27G 			-> 1.27G      <===!
    InUseObjects: 	6 		4.42K 			-> 4.41K
    AllocBytes  : 	7.80M 		2.05G 			-> 2.05G
    AllocObjects: 	33 		4.69K 			-> 4.66K

AFTER:

MEM STATS DIFF:   	sg-Series 	sg-Series - AFTER 	-> Delta
    HeapAlloc  : 	5.26M 		1.20G 			-> 1.19G
    HeapObjects: 	18.66K 		12.70M 			-> 12.69M

MEM PROF DIFF:    	sg-Series 	sg-Series - AFTER 	-> Delta
    InUseBytes  : 	1.99M 		985.38M 		-> 983.39M      <===!
    InUseObjects: 	7 		2.41K 			-> 2.40K
    AllocBytes  : 	6.74M 		1.76G 			-> 1.75G
    AllocObjects: 	40 		3.11K 			-> 3.07K

RESULT: 22% reduction for InUseBytes.

* Do not generate XXX fields in protobufs We don't need them. By gettig rid of them we achieve: - smaller memory footprint - smaller payloads on the wire - same in-memory layout with core Prometheus structs opening path for all sorts of optimisations such as unsafe casts, leading to better memory use and faster :) - see related issue: prometheus/prometheus#6029 The tests show SG memory allocations dropped 5% for query execution: - Showing nodes accounting for -359.20MB, 6.19% of 5798.93MB total Signed-off-by: Philip Panyukov <ppanyukov@googlemail.com> * comment to explain why do don't generate XXX fields in protobufs Signed-off-by: Philip Panyukov <ppanyukov@googlemail.com> Signed-off-by: suntianyuan <suntianyuan@baidu.com>

* Do not generate XXX fields in protobufs We don't need them. By gettig rid of them we achieve: - smaller memory footprint - smaller payloads on the wire - same in-memory layout with core Prometheus structs opening path for all sorts of optimisations such as unsafe casts, leading to better memory use and faster :) - see related issue: prometheus/prometheus#6029 The tests show SG memory allocations dropped 5% for query execution: - Showing nodes accounting for -359.20MB, 6.19% of 5798.93MB total Signed-off-by: Philip Panyukov <ppanyukov@googlemail.com> * comment to explain why do don't generate XXX fields in protobufs Signed-off-by: Philip Panyukov <ppanyukov@googlemail.com> Signed-off-by: Aleksey Sin <asin@ozon.ru>

ppanyukov mentioned this pull request Nov 6, 2019

Unify labels packages prometheus/prometheus#6029

Closed

bwplotka reviewed Nov 6, 2019

View reviewed changes

GiedriusS reviewed Nov 7, 2019

View reviewed changes

bwplotka approved these changes Nov 13, 2019

View reviewed changes

comment to explain why do don't generate XXX fields in protobufs

557576d

Signed-off-by: Philip Panyukov <ppanyukov@googlemail.com>

ppanyukov requested a review from bwplotka November 13, 2019 11:46

squat approved these changes Nov 13, 2019

View reviewed changes

bwplotka merged commit 5eb43c1 into thanos-io:master Nov 13, 2019

ppanyukov deleted the feature/remove-pb-overhead-pr branch November 13, 2019 18:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do not generate XXX fields in protobufs #1725

Do not generate XXX fields in protobufs #1725

ppanyukov commented Nov 6, 2019 •

edited

Loading

bwplotka left a comment

GiedriusS left a comment

ppanyukov commented Nov 7, 2019 •

edited

Loading

ppanyukov commented Nov 12, 2019

bwplotka commented Nov 13, 2019

bwplotka left a comment

bwplotka Nov 13, 2019

ppanyukov Nov 13, 2019

ppanyukov commented Nov 13, 2019

Do not generate XXX fields in protobufs #1725

Do not generate XXX fields in protobufs #1725

Conversation

ppanyukov commented Nov 6, 2019 • edited Loading

Changes

Verification

Verification

EDIT 1: Notes and test on compatibility

bwplotka left a comment

Choose a reason for hiding this comment

GiedriusS left a comment

Choose a reason for hiding this comment

ppanyukov commented Nov 7, 2019 • edited Loading

ppanyukov commented Nov 12, 2019

bwplotka commented Nov 13, 2019

bwplotka left a comment

Choose a reason for hiding this comment

bwplotka Nov 13, 2019

Choose a reason for hiding this comment

ppanyukov Nov 13, 2019

Choose a reason for hiding this comment

ppanyukov commented Nov 13, 2019

ppanyukov commented Nov 6, 2019 •

edited

Loading

ppanyukov commented Nov 7, 2019 •

edited

Loading