-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lnwire: fix heap escapes to reduce gc pressure #4884
lnwire: fix heap escapes to reduce gc pressure #4884
Conversation
Nice work! Drive-thru comment: commits should compile, and perhaps the diff could be broken up into smaller PRs to make review easier. |
Will do! Just want to make sure I'm heading to the right direction before more changes are made. |
I think the |
cc @cfromknecht |
ff153de
to
3ef52ef
Compare
I've changed the overall optimization into three steps and broke the commits. Let me know what you think. |
lnwire/writer.go:50:17: `buf` can be `github.com/miekg/dns.Writer` (interfacer)
func WriteBytes(buf *bytes.Buffer, b []byte) error {
^
lnwire/writer.go:56:17: `buf` can be `github.com/miekg/dns.Writer` (interfacer)
func WriteUint8(buf *bytes.Buffer, n uint8) error {
^
lnwire/writer.go:63:18: `buf` can be `github.com/miekg/dns.Writer` (interfacer)
func WriteUint16(buf *bytes.Buffer, n uint16) error { |
No idea, that is very weird. |
If you only use the methods of another example, this would fail for the linter because you could just use the
Since we intentionally want to replace the interface here, you can just add |
Cool, good to know, thanks! To me it's weird that the linter is asking me to use the package |
Ah yeah, probably just happens to match the set of methods we're using here. Pretty random. |
3ef52ef
to
7c26d1b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed up to 78afd37165d6abdf4ee723635355d4082e794143:
I think so far the PR goes in a good direction. I think the TODOs could be removed. Left some comments.
// First, we'll encode all the addresses into an intermediate | ||
// buffer. We need to do this in order to compute the total | ||
// length of the addresses. | ||
buffer := make([]byte, 0, MaxMsgBody) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the length not be calculated for []net.Addr
so a slice doesn't have to be made?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to have the length for TLV records?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand his suggestion, we'd essentially need to run through the set of addresses twice: once to tally up how much data is needed, and a second time to write the bytes in-line.
Doesn't this double buffering defeat the purpose of using bytes.Buffer
here (passed in) vs making a brand new one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The short answer is because it won't escape to heap, running go build -gcflags "-m -m"
under lnd/lnwire
,
./writer.go:393:16: make([]byte, 0, MaxMsgBody) does not escape
./writer.go:394:28: &bytes.Buffer{...} does not escape
So there're no heap allocations.
If I understand his suggestion, we'd essentially need to run through the set of addresses twice: once to tally up how much data is needed, and a second time to write the bytes in-line.
Yeah I think we have to write the address data somewhere before we could know the length of the data. And because we need to encode length before the actual data, [bytes for length][bytes for data]
, we need to use a temp buffer to hold it before we write it to our passed in buffer. Otherwise we could just write to the passed in buffer then calculate how much new data was written to retrieve the length info, ie, [bytes for data][bytes for length]
.
Only if we could first write two empty bytes, then write the data, and "insert" back the length. But I don't think that's feasible with bytes.Buffer
, as suggested here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd just have to iterate over the addresses and tally up the lengths based on the type encountered -- so I don't think it'd require a temporary buffer?
Another Q: If make([]byte, 0, MaxMsgBody)
doesn't escape to the heap, then it's on the stack. Is it a problem if multiple goroutines are calling this function then given that the stack has limited space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd just have to iterate over the addresses and tally up the lengths based on the type encountered -- so I don't think it'd require a temporary buffer?
Yeah we could iterate twice. The first time we calculate the length and the second time we put the length and actual data into the buffer. So it depends on whether you want to trade space with time. Since it lives on the stack, I don't think we need to.
Another Q: If make([]byte, 0, MaxMsgBody) doesn't escape to the heap, then it's on the stack. Is it a problem if multiple goroutines are calling this function then given that the stack has limited space?
Yeah either it's heap or stack I think we both have this oom problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Impressive results! I also like how the large change of updating all the seriliaztion in the codebase was broken up into a few smaller commits to make the setr of changes slightly more digestible.
The diff reads well to me, and I've started to run this on one of my larger testnet nodes that has lndmon
enabled so I can track metrics such as the heap size, total allocated/freed, etc. Will report back with my findings.
// generates a test message for each of the lnwire.Message, calls the | ||
// WriteMessage method and benchmark it. | ||
func BenchmarkWriteMessage(b *testing.B) { | ||
// Create testing messages. We will use a constant seed to make sure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you add b.ReportAllocs
here, then it'll also show information related to the number of allocations/bytes per op.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I used -benchmem
instead, so go test -run=^$ -bench=Write -benchmem
will give,
BenchmarkWriteMessage/Init-4 545257 2330 ns/op 176 B/op 9 allocs/op
|
||
bufPool = sync.Pool{ | ||
New: func() interface{} { | ||
return bytes.NewBuffer(buffer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, does this end copying the buffer each time New
is called, or create a new one in place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's arbitrary. From the godoc,
Get selects an arbitrary item from the Pool, removes it from the Pool, and returns it to the caller. Get may choose to ignore the pool and treat it as empty. Callers should not assume any relation between values passed to Put and the values returned by Get.
I tested locally. If I run the test exactly once (b.N
= 1), it will create two buffers and keeps iterating between the two.
// makeAllMessages is used to create testing messages for each lnwire message | ||
// type. | ||
// | ||
// TODO(yy): the following testing messages are created somewhat arbitrary. We |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by standardize? Like set the set of fields and TLV values that we typically see in practice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah exactly. Atm I'm just filling random bytes into each field, while it's a good start, we may someday get to the point where we want to optimize a single byte (hopefully!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could remove TODO comment
return msg | ||
} | ||
|
||
func newMsgAcceptChannel(t testing.TB, r *rand.Rand) *lnwire.AcceptChannel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, but we could possibly re-use the code we have to generate sample messages of each types for the testing/quick
tests we use for serialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I think I did copy that code and broke them into smaller functions. This could be a TODO in the future to let them share the same code.
@@ -115,7 +116,7 @@ var _ Message = (*AcceptChannel)(nil) | |||
// protocol version. | |||
// | |||
// This is part of the lnwire.Message interface. | |||
func (a *AcceptChannel) Encode(w io.Writer, pver uint32) error { | |||
func (a *AcceptChannel) Encode(w *bytes.Buffer, pver uint32) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the rationale here is that we end up saving a heap escape? I guess in the end, we don't ever really write directly to the wire (where the io.Writer
could be useful), since we need to encrypt the message payload first using our noise construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Encode
calls WriteElements
, which saves several heap escapes when using a concrete type. I think this is where the gain comes from, and the incremental gain from this commit compared to the previous one,
name old allocs/op new allocs/op delta
WriteMessage/Init-4 6.00 ± 0% 6.00 ± 0% ~ (all equal)
WriteMessage/Error-4 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/Ping-4 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/Pong-4 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/MsgOpenChannel-4 48.0 ± 0% 32.0 ± 0% -33.33% (p=0.000 n=10+10)
WriteMessage/MsgAcceptChannel-4 42.0 ± 0% 29.0 ± 0% -30.95% (p=0.000 n=10+10)
WriteMessage/MsgFundingCreated-4 6.00 ± 0% 3.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/MsgFundingSigned-4 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/FundingLocked-4 7.00 ± 0% 5.00 ± 0% -28.57% (p=0.000 n=10+10)
WriteMessage/Shutdown-4 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/ClosingSigned-4 6.00 ± 0% 3.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/UpdateAddHTLC-4 10.0 ± 0% 6.0 ± 0% -40.00% (p=0.000 n=10+10)
WriteMessage/UpdateFulfillHTLC-4 5.00 ± 0% 3.00 ± 0% -40.00% (p=0.000 n=10+10)
WriteMessage/UpdateFailHTLC-4 5.00 ± 0% 2.00 ± 0% -60.00% (p=0.000 n=10+10)
WriteMessage/CommitSig-4 208 ± 0% 105 ± 0% -49.52% (p=0.000 n=10+10)
WriteMessage/RevokeAndAck-4 8.00 ± 0% 6.00 ± 0% -25.00% (p=0.000 n=10+10)
WriteMessage/UpdateFee-4 4.00 ± 0% 2.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/UpdateFailMalformedHTLC-4 8.00 ± 0% 5.00 ± 0% -37.50% (p=0.000 n=10+10)
WriteMessage/ChannelReestablish-4 10.0 ± 0% 6.0 ± 0% -40.00% (p=0.000 n=10+10)
WriteMessage/ChannelAnnouncement-4 25.0 ± 0% 14.0 ± 0% -44.00% (p=0.000 n=10+10)
WriteMessage/NodeAnnouncement-4 33.0 ± 0% 15.0 ± 0% -54.55% (p=0.000 n=10+10)
WriteMessage/ChannelUpdate-4 22.0 ± 0% 10.0 ± 0% -54.55% (p=0.000 n=10+10)
WriteMessage/AnnounceSignatures-4 12.0 ± 0% 6.0 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/QueryShortChanIDs-4 4.01k ± 0% 1.00k ± 0% -74.92% (p=0.000 n=10+10)
WriteMessage/ReplyShortChanIDsEnd-4 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.000 n=10+10)
WriteMessage/QueryChannelRange-4 5.00 ± 0% 3.00 ± 0% -40.00% (p=0.000 n=10+10)
WriteMessage/ReplyChannelRange-4 4.01k ± 0% 1.00k ± 0% -74.94% (p=0.000 n=10+10)
WriteMessage/GossipTimestampRange-4 5.00 ± 0% 3.00 ± 0% -40.00% (p=0.000 n=10+10)
WriteMessage/QueryShortChanIDs#01-4 4.03k ± 0% 1.04k ± 0% -74.29% (p=0.000 n=10+10)
WriteMessage/ReplyChannelRange#01-4 4.03k ± 0% 1.04k ± 0% -74.31% (p=0.000 n=10+10)
I guess in the end, we don't ever really write directly to the wire (where the
io.Writer
could be useful)
A bit confused here🧐 If the io.Writer
works, so does *bytes.Buffer
right?
@@ -76,6 +76,9 @@ func (a addressType) AddrLen() uint16 { | |||
|
|||
// WriteElement is a one-stop shop to write the big endian representation of | |||
// any element which is to be serialized for the wire protocol. | |||
// | |||
// TODO(yy): rm this method once we finish dereferencing it from other | |||
// packages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that would be useful here (still getting through the commits) to make sure we're not breaking anything in the process here would be to retain this method (rename it likely), then use quick.CheckEqual
to ensure that the encoding produced here doesn't deviate with any input. Eventually (post Go 1.7) we'll be able to also use the fuzzing in the standard library to implement differential fuzzing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I think atm only package chanbackup
is using it, will check!
// First, we'll encode all the addresses into an intermediate | ||
// buffer. We need to do this in order to compute the total | ||
// length of the addresses. | ||
buffer := make([]byte, 0, MaxMsgBody) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand his suggestion, we'd essentially need to run through the set of addresses twice: once to tally up how much data is needed, and a second time to write the bytes in-line.
Doesn't this double buffering defeat the purpose of using bytes.Buffer
here (passed in) vs making a brand new one?
a.FirstCommitmentPoint, | ||
tlvRecords, | ||
) | ||
if err := WriteBytes(w, a.PendingChannelID[:]); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra I
in the commit message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh it's meant to be like "part 1" gotcha.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it's missing a dash...will put it back
870895b
to
c4ad0e3
Compare
Just curious about how the heap allocs prior to this PR look like? Do we happen to have that kind of data? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed up to 0274194f961ffb0d5c4a68300da5ddebfd723685, some nits, comments and questions
lnwire/message_test.go
Outdated
r := bytes.NewBuffer(buf.Bytes()) | ||
|
||
// Read the message from the buffer. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove newline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
// makeAllMessages is used to create testing messages for each lnwire message | ||
// type. | ||
// | ||
// TODO(yy): the following testing messages are created somewhat arbitrary. We |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could remove TODO comment
lnwire/message_test.go
Outdated
|
||
return lnwire.NewInitMessage( | ||
rawFeatureVector(), | ||
rawFeatureVector(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May take ExtraData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
lnwire/message_test.go
Outdated
require.NoError(t, err) | ||
|
||
_, err = r.Read(msg.PendingChannelID[:]) | ||
require.NoError(t, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err message here and above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
lnwire/message_test.go
Outdated
t.Helper() | ||
|
||
return &lnwire.Ping{ | ||
NumPongBytes: uint16(r.Int31()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to 16-bits wide?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
I don't have raw data, but I remember it not being so great when using |
c4ad0e3
to
7915dc9
Compare
I mainly have some historical data from my own person |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👒
Just needs an entry in the set of release notes!
MaxMessagePayload and MaxSliceLength are duplicate variables. This commit deletes MaxMessagePayload and keeps MaxSliceLength.
This commit changes the method WriteMessage to use bytes.Buffer to save heap allocations. A unit test is added to check the method is implemented as expected.
This commit changes the WriteElement and WriteElements methods to take a write buffer instead of io.Writer. The corresponding Encode methods are changed to use the write buffer.
This commit breaks the method WriteElement and adds specific writers for each of the data types.
This commit takes 10 types of messages and refactors their Encode method to use specific writers. The following commits will refactor the rest.
This commit takes another 10 message types and refactors their Encode method to use specific writers. The following commit will refactor the rest.
This commit refactors the remaining usage of WriteElements. By replacing the interface types with concrete types for the params used in the methods, most of the encoding of the messages now takes zero heap allocations.
7915dc9
to
a0e958f
Compare
post-merge update: I ran the fuzz tests for lnwire and this pull does not break anything |
After lightningnetwork#4884, many of the cases in WriteElement are now dead.
This PR is meant to fix #3004.
The overall optimization is done in three steps,
WriteMessage
to use write buffer.Encode
andWriteElement
to use write buffer.Encode
.The result,
Top heap allocs,
Build the benchmark
This first step is to build the benchmark for our optimization. Run the following commands to build the profiles. Notice that we need to run the bench test multiple times (I've chosen 10) to gather meaningful statistics for
benchstat
to analyze.Now we can check the heap allocs using,
And checking the heap escapes using
go build -gcflags "-m -m"
.First optimiztion - Using write buffer in
WriteMessage
Run the following commands to gather stats.
See the improvement,
Second optimiztion - Using write buffer in
Encode
Run the following commands to gather stats.
See incremental gain,
You can also view overall gain using,
Third optimiztion - Using concrete types in
Encode
Run the following commands to gather stats.
Check the incremental gain using,
Check the overall gain using
benchstat write-bench-base.prof write-bench-final.prof
.Most of the messages now use zero heap allocations, except when,
serializedPubkey := e.SerializeCompressed()
, as inMsgOpenChannel
,MsgAcceptChannel
,FundingLocked
,RevokeAndAck
andChannelReestablish
.zlib
, found inQueryShortChanIDs
andReplyChannelRange
.Next Step
Optimization can be an endless task. This PR focuses on optimizing the write methods used in
lnwire
. The following steps would be,WriteElement
method and its related usage in other packages.channeldb
to make use of the buffer.lnwire
.channeldb
.