discovery+lnmock: add gossip state machine fuzz tests#10605
discovery+lnmock: add gossip state machine fuzz tests#10605NishantBansal2003 wants to merge 1 commit intolightningnetwork:masterfrom
Conversation
Signed-off-by: Nishant Bansal <nishant.bansal.282003@gmail.com>
Summary of ChangesHello @NishantBansal2003, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the testing suite by adding fuzz tests for the gossip state machine. The new tests aim to improve the robustness and reliability of the gossiper by simulating a wide array of network interactions, including both valid and malformed messages. This proactive approach helps identify edge cases and potential vulnerabilities that might not be caught by traditional unit tests, ultimately contributing to a more stable and secure Lightning Network implementation. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This PR introduces a comprehensive fuzz test for the gossip state machine, which is a great addition for improving the robustness of the discovery package. The test covers a wide range of states and messages, including malformed ones, which is crucial for uncovering edge cases and potential vulnerabilities. The changes to the mock infrastructure to support dynamic behavior are also a good improvement for testability. I've found a few minor issues related to typos in function names and comments that should be addressed to improve code clarity and maintainability.
| return binary.BigEndian.Uint32(data) | ||
| } | ||
|
|
||
| // getUint16 extracts a uint32 from a byte slice. |
There was a problem hiding this comment.
The comment for getUint16 incorrectly states that it extracts a uint32. According to the repository style guide (line 17), every function must be commented with its purpose. An incorrect comment can be misleading.
| // getUint16 extracts a uint32 from a byte slice. | |
| // getUint16 extracts a uint16 from a byte slice. |
References
- Every function must be commented with its purpose and assumptions. (link)
| return binary.BigEndian.Uint16(data) | ||
| } | ||
|
|
||
| // getInt64 extracts a non-negative int32 from a byte slice. |
There was a problem hiding this comment.
The comment for getInt32 incorrectly refers to the function as getInt64. The repository style guide (line 18) states that function comments must begin with the function name.
| // getInt64 extracts a non-negative int32 from a byte slice. | |
| // getInt32 extracts a non-negative int32 from a byte slice. |
References
- Function comments must begin with the function name. (link)
| // udpateBlockHeight updates the best known block height in the fuzz network. | ||
| // The new height is selected from the fuzz data and is guaranteed to be | ||
| // monotonically increasing. | ||
| func (fn *fuzzNetwork) udpateBlockHeight(offset int) int { |
There was a problem hiding this comment.
There's a typo in the function name udpateBlockHeight; it should be updateBlockHeight. This typo is also present in the fuzzState enum (line 106) and in the runGossipStateMachine function (line 1394). For consistency and readability, please correct the function name here and in all its usages. This aligns with the style guide's emphasis on readable code (line 6).
| // udpateBlockHeight updates the best known block height in the fuzz network. | |
| // The new height is selected from the fuzz data and is guaranteed to be | |
| // monotonically increasing. | |
| func (fn *fuzzNetwork) udpateBlockHeight(offset int) int { | |
| // updateBlockHeight updates the best known block height in the fuzz network. | |
| // The new height is selected from the fuzz data and is guaranteed to be | |
| // monotonically increasing. | |
| func (fn *fuzzNetwork) updateBlockHeight(offset int) int { |
References
- Readable code is the most important requirement for any commit created. Typographical errors in function names hinder readability. (link)
morehouse
left a comment
There was a problem hiding this comment.
Approach ACK, nice fuzz target!
| gossiper *AuthenticatedGossiper | ||
| chain *lnmock.MockChain | ||
| notifier *mockNotifier | ||
| blockHeight *int32 |
There was a problem hiding this comment.
Why is blockHeight a pointer?
| } | ||
|
|
||
| // createGossiper creates and starts a gossiper for fuzz testing. | ||
| func createGossiper(t *testing.T, blockHeight *int32, |
There was a problem hiding this comment.
Why is blockHeight a pointer?
| // total number of fuzz state actions. | ||
| numFuzzStates = 13 |
There was a problem hiding this comment.
If we make this the last element to the enum below, it will automatically count the number of actions.
| var ( | ||
| testRBytes, _ = hex.DecodeString("8ce2bc69281ce27da07e6683571319d18e" + | ||
| "949ddfa2965fb6caa1bf0314f882d7") | ||
| testSBytes, _ = hex.DecodeString("299105481d63e0f4bc2a88121167221b67" + | ||
| "00d72a0ead154c03be696a292d24ae") | ||
| testRScalar = new(btcec.ModNScalar) | ||
| testSScalar = new(btcec.ModNScalar) | ||
| _ = testRScalar.SetByteSlice(testRBytes) | ||
| _ = testSScalar.SetByteSlice(testSBytes) | ||
| testSig = ecdsa.NewSignature(testRScalar, testSScalar) | ||
| ) |
There was a problem hiding this comment.
The only global value we actually use from here is testSig, correct?
We should move this entire testSig calculation into createGossiper where it is used and add a comment explaining where it was copied from and why.
|
|
||
| // runGossipStateMachine executes the gossip state machine with fuzz input data. | ||
| func (fn *fuzzNetwork) runGossipStateMachine() { | ||
| fn.t.Helper() |
There was a problem hiding this comment.
Nit: I think we should remove all of the t.Helper() calls in this file.
Getting the full stack trace is very helpful for debugging failures, and t.Helper() hides stack frames.
|
|
||
| filter, ok := malformedMsg.(*lnwire.GossipTimestampRange) | ||
| require.True(fn.t, ok) | ||
| _ = syncer.ApplyGossipFilter(fn.t.Context(), filter) |
There was a problem hiding this comment.
In theory, could we hang here if syncerSema is saturated from previous GossipTimestampRange messages?
| // NodeID | ||
| if canMutate(33) { | ||
| var nodeID [33]byte | ||
| copy(nodeID[:], fn.data[cursor:cursor+33]) | ||
| out.NodeID = nodeID | ||
| cursor += 33 | ||
| } |
There was a problem hiding this comment.
Nit: a lot of the if canMutate(X) { ... } blocks in maybeMalformMessage are identical across message types. Helper functions could be more readable and less code.
| return | ||
| default: | ||
| } | ||
| } |
There was a problem hiding this comment.
This looks like it spin-waits, which likely wastes CPU. Perhaps we should sleep for 1-10ms each loop iteration.
| scid := lnwire.NewShortChanIDFromInt(getUint64( | ||
| fn.data[offset+2 : offset+10], | ||
| )) |
There was a problem hiding this comment.
A random scid will ~never match an actual channel ID the gossiper knows about.
Perhaps we should allow reusing a previously announced scid most of the time, or at least create a TODO for it.
| fn.chain.On("GetBlockHash", int64(scid.BlockHeight)). | ||
| Return(nil, fmt.Errorf("block not found")). | ||
| Once() | ||
| fn.chain.On("GetBlock", tmock.Anything). | ||
| Return(nil, fmt.Errorf("block not found")). | ||
| Once() | ||
| fn.chain.On( | ||
| "GetUtxo", tmock.Anything, tmock.Anything, | ||
| scid.BlockHeight, tmock.Anything, | ||
| ).Return(nil, fmt.Errorf("utxo not found")).Once() |
There was a problem hiding this comment.
What if the scid somehow matches an existing valid one? Will returning errors here cause problems?
Also I worry about registering all 3 of these calls -- are we sure they all eventually get consumed? I could imagine a scenario where GetBlockHash returns an error, so GetBlock and GetUtxo are never called. If that happens, then we'll have unconsumed mock calls that could incorrectly poison future messages.
This commit adds fuzz tests for the gossip state machine. The fuzzer provides inputs, and based on those inputs, I simulate various random gossiper states. These states include connecting and disconnecting peers, sending node announcements, channel announcements, channel updates, announcement signatures, query messages, reply messages, gossip timestamp range messages, updating the block height, and triggering historical sync. Both valid and malformed messages are passed to help uncover any unexpected behavior caused by malicious nodes.
Currently, the execution rate is around ~20 execs/sec, which is not ideal. However, considering the large number of possible inputs and state transitions being exercised, this is expected. To discover more bugs, we likely need to run heavy, continuous fuzzing over an extended period.
I would also appreciate reviewers dedicating some CPU cycles to this to help ensure that the fuzz tests themselves are stable. After running the fuzzer for several days, I have accumulated more than 3,500 corpus entries. With this corpus, we are achieving over 60% coverage (in set mode) of the discovery package, which I believe can be improved further with continued fuzzing.
Also, this PR depends on #10589. So, some code semantics may need to change depending on how that PR evolves, but I believe this is ready for review.
corpus:
corpus.tar.gz
coverage:
coverage.html
cc: @morehouse