Implement Peer-to-Peer Connection Tests #2997

Closed
6 tasks
boulder225 opened this issue Mar 27, 2024 · 0 comments
Assignees
Labels
alpha network test protocol Protocol Team tickets

Comments

@boulder225
Collaborator

boulder225 commented Mar 27, 2024

🎯 Problem to be solved

Verify the connectivity and health of peer-to-peer connections within the Charon cluster.

🛠️ Proposed solution

  • Start a libp2p node, connect to the relay, and then to the other libp2p nodes
  • Implement a test to check if our node can ping all other nodes
  • Implement a test to measure packet loss between peers
  • Implement a test to measure round-trip time (RTT) between peers
  • Accumulate and evaluate the peer-to-peer connection test results, contributing to the overall cluster health assessment
  • Include the peer-to-peer connection test results in the report
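
As a rough illustration of the packet-loss item above, here is a minimal sketch in Go. It assumes packet loss is defined as the percentage of pings that received no reply; the issue does not specify a definition, and `packetLoss` is a hypothetical helper, not part of Charon.

```go
package main

import (
	"errors"
	"fmt"
)

// packetLoss is a hypothetical helper: it returns the percentage of pings
// that got no reply, assuming loss is defined as (sent - received) / sent.
func packetLoss(sent, received int) (float64, error) {
	if sent <= 0 || received < 0 || received > sent {
		return 0, errors.New("invalid ping counters")
	}
	// Multiply before dividing to keep round percentages exact.
	return float64(sent-received) * 100 / float64(sent), nil
}

func main() {
	loss, err := packetLoss(10, 8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("packet loss: %.1f%%\n", loss)
}
```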
@boulder225 boulder225 added alpha network test protocol Protocol Team tickets labels Mar 27, 2024
@boulder225 boulder225 changed the title from "Copy of Implement Network Health Check Skeleton for Charon Cluster" to "Implement Peer-to-Peer Connection Tests" Mar 27, 2024
obol-bulldozer bot pushed a commit that referenced this issue Apr 19, 2024
Continuation of the epic for `alpha test` command.

This PR includes:
- Start of TCP libp2p node.
- Connection to other peer nodes (using hash of the ENRs as an ID for the relay).
- Launching 3 ping tests towards peers:
  - Ping - regular ping done only once. It blocks the rest of the peer tests: if we cannot successfully ping a peer, there is no reason to execute any other tests against that peer.
  - PingMeasure - regular ping done only once, measuring the RTT. The RTT is classified as good, average or bad.
  - PingLoad - continuous pings, spawning a new goroutine each second that keeps pinging and measures the RTT of every ping. The RTT is classified as good, average or bad. The worst observed RTT is used as the result.
- Keeping the TCP node alive (configurable by a flag), so that other peers can conduct tests on their end.

N.B.: There are some hardcoded values that are subject to change:
- Thresholds for good/average/bad in the PingMeasure test (currently set to 200/500ms);
- Thresholds for good/average/bad in the PingLoad test (currently set to 200/500ms);
- Wait time between ping attempts in the Ping test when a peer is not reachable (currently set to 3s);
- Wait time between pings in the PingLoad test, per goroutine (currently set to a random value between 0 and 100ms).

Not in scope of this PR:
- Tests against self
- More elaborate tests towards peers - direct/indirect connections, packet loss, etc.

category: feature
ticket: #2997

2 participants