Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Networking] GossipSub iWant Flooding Mitigation #4574

Merged
merged 49 commits into from
Aug 29, 2023

Conversation

kc1116
Copy link
Contributor

@kc1116 kc1116 commented Jul 19, 2023

This PR adds iWant flooding detection to the control message RPC validation inspector. iWant validation is performed by taking a sample of all message ID's included in all the iWant control messages of an RPC and performing validation on the sample. An iWant control message ID must not be duplicated and must have been sent in a previous iHave control message by the receiver of the iWant message. This PR simplifies control message validation by replacing Hard and Safety thresholds with a single configuration value MaxSampleSize. Validation will be done on samples up to MaxSampleSize for all control message types, and any control messages not included in that sample will be dropped and resent with a subsequent RPC. Validation for the other control message types will be updated with this new strategy.

Copy link
Contributor

@yhassanzadeh13 yhassanzadeh13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the trajectory of the development looks good, I made a few comments for improvement.

Comment on lines 220 to 223
err := c.sampleCtrlMessages(uint(len(iWants)), sampleSize, swap)
if err != nil {
return fmt.Errorf("failed to sample iwant messages: %w", err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine and is a fail-safe approach, i.e., we don't want to continue the execution if we cannot sample a control message properly. But please note that whatever error a worker processFunc returns will result in the node to crash, hence we may need to be more cautious on what to return as error and what to log.

@kc1116 kc1116 marked this pull request as ready for review August 16, 2023 21:36
@codecov-commenter
Copy link

codecov-commenter commented Aug 16, 2023

Codecov Report

Merging #4574 (37cfa4d) into master (a8f28b9) will decrease coverage by 2.85%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4574      +/-   ##
==========================================
- Coverage   54.71%   51.87%   -2.85%     
==========================================
  Files         917      635     -282     
  Lines       85633    59471   -26162     
==========================================
- Hits        46858    30848   -16010     
+ Misses      35183    26163    -9020     
+ Partials     3592     2460    -1132     
Flag Coverage Δ
unittests 51.87% <ø> (-2.85%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
module/component/component.go 89.92% <ø> (ø)

... and 287 files with indirect coverage changes

@yhassanzadeh13 yhassanzadeh13 changed the title Khalil/6472 iwant flooding detection [Networking] iWant Flooding Mitigation Aug 18, 2023
@yhassanzadeh13 yhassanzadeh13 changed the title [Networking] iWant Flooding Mitigation [Networking] GossipSub iWant Flooding Mitigation Aug 18, 2023
insecure/corruptlibp2p/fixtures.go Outdated Show resolved Hide resolved
swap := func(i, j uint) {
iHaves[i], iHaves[j] = iHaves[j], iHaves[i]
}

activeClusterIDS := c.tracker.GetActiveClusterIds()
count := c.getCtrlMsgCount(validationConfig.ControlMsg, controlMessage)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have observed that the current implementation of getCtrlMsgCount does not meet our specific requirements. Instead of returning the total count of control message types within the RPC, it should provide the total number of message-ids for a specific message type. Take the iHave message as an example; an RPC may contain several iHave messages, each with a distinct set of message-ids.

This situation creates a problem, particularly in the evaluation of count > validationConfig.HardThreshold. In this context, count is representative of the total number of iHave messages, each containing a distinct message-id count, whereas validationConfig.HardThreshold denotes the total number of iHave message-ids. Consequently, the condition within the if statement is often incorrectly assessed as false in many cases. This inconsistency must be addressed to align the logic with our intended functionality.

I kindly request that you refactor this function to accurately reflect the message-id count for the given message type, aligning with the discussion found at this GitHub comment. Thank you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is covered in the follow up PR that adapts this strategy to the other control messages #4642

network/p2p/inspector/validation/errors.go Outdated Show resolved Hide resolved
network/p2p/inspector/validation/errors.go Outdated Show resolved Hide resolved
network/p2p/inspector/validation/utils.go Show resolved Hide resolved
network/p2p/tracer/internal/rpc_sent_tracker.go Outdated Show resolved Hide resolved
Comment on lines 1532 to 1544
func meshTracerFixture(flowConfig *config.FlowConfig, idProvider module.IdentityProvider) *tracer.GossipSubMeshTracer {
meshTracerCfg := &tracer.GossipSubMeshTracerConfig{
Logger: unittest.Logger(),
Metrics: metrics.NewNoopCollector(),
IDProvider: idProvider,
LoggerInterval: time.Second,
HeroCacheMetricsFactory: metrics.NewNoopHeroCacheMetricsFactory(),
RpcSentTrackerCacheSize: flowConfig.NetworkConfig.GossipSubConfig.RPCSentTrackerCacheSize,
RpcSentTrackerWorkerQueueCacheSize: flowConfig.NetworkConfig.GossipSubConfig.RPCSentTrackerQueueCacheSize,
RpcSentTrackerNumOfWorkers: flowConfig.NetworkConfig.GossipSubConfig.RpcSentTrackerNumOfWorkers,
}
return tracer.NewGossipSubMeshTracer(meshTracerCfg)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this test helper be moved to insecure/integration/functional/test/gossipsub/rpc_inspector/utils.go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@gomisha gomisha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Please address comments and the failing unit test before merging.

@gomisha gomisha merged commit 5f9c8a0 into master Aug 29, 2023
35 of 36 checks passed
@gomisha gomisha deleted the khalil/6472-iwant-flooding-detection branch August 29, 2023 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants