AppRpcProvider calculation is misleading at best in the naive case #7371

Open
sambacha opened this issue Sep 24, 2023 · 0 comments
sambacha commented Sep 24, 2023

RPC Performance Methodology is naive at best

AppRpcProvider appears to provide a way to gauge RPC provider performance. However, there are issues with the methodology that prevent it from yielding useful information about actual RPC provider performance.

Approximating an Ethereum RPC provider's latency using ping time, or any similar network-level probe of the RPC service, is not recommended as a measure of the actual performance of RPC services, for the following reasons (a sketch of timing a real JSON-RPC request instead follows the list):

  • RPC services are not designed as ICMP network testing services.
  • Many networks rate-limit ICMP requests.
  • ICMP ping or traceroute traffic can be discarded or delayed en route to the RPC provider.
  • The termination point of the TCP/UDP session may not represent the full network path between a user and the service.
  • User requests may be served from locations other than the destination of the initial IP termination point.
  • Even a complete lack of response to ICMP traffic may not reflect any issue with an RPC provider's performance.
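If the goal is to estimate what users actually experience, a more representative probe is to time a real JSON-RPC request end to end. The sketch below (TypeScript; the endpoint URL, method choice, and sample count are illustrative placeholders, not AppRpcProvider code) measures the round-trip latency of eth_blockNumber calls:

```typescript
// Hypothetical sketch: time a real JSON-RPC call (eth_blockNumber) end to end
// instead of relying on ICMP ping. The URL and sample count are placeholders.
async function measureRpcLatency(url: string, samples = 20): Promise<number[]> {
  const latencies: number[] = [];
  for (let i = 0; i < samples; i++) {
    const start = performance.now();
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ jsonrpc: "2.0", id: i, method: "eth_blockNumber", params: [] }),
    });
    await res.json(); // wait for the full response body, not just the headers
    latencies.push(performance.now() - start);
  }
  return latencies;
}

// Usage with a placeholder endpoint:
// const samples = await measureRpcLatency("https://rpc.example.com");
```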

    <------------->   Wider confidence interval
                      High variance and/or low sample size

         <--->   Narrower confidence interval
                 Low variance and/or high sample size

 |---------|---------|---------|---------|
-1%      -0.5%       0%      +0.5%      +1%

The way to shrink confidence intervals is to increase the sample size.
The central limit theorem means that even when we have high-variance data, and even when that data is not normally distributed, taking more and more samples lets us calculate a more and more precise estimate of the true mean of the data.


      <------------------------------->     n=50  X -10% X +10%
                <------------------>        n=100 ✔️ -10% X +10%
                    <----->                 n=200 ✔️ -10% ✔️ +10%

  |---------|---------|---------|---------| difference in runtime
-20%      -10%        0       +10%      +20%

n    = sample size
<--> = confidence interval for percent difference of mean runtimes
✔️    = resolved condition
X    = unresolved condition

In this example, by n=50 we are uncertain whether A is faster or slower than B by more than 10%.
By n=100 we have ruled out that B is faster than A by more than 10%, but we're still not sure if it's slower by more than 10%.
By n=200 we have also ruled out that B is slower than A by more than 10%, so we stop sampling.

Note that we still don't know which is absolutely faster; we just know that, whatever the difference is, B is neither more than 10% faster nor more than 10% slower than A (and if we did want to know, we could add 0% to our conditions).
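To make the stopping rule above concrete, here is a minimal TypeScript sketch (illustrative only, not AppRpcProvider code), assuming two arrays of latency samples in milliseconds. It uses a normal approximation (z = 1.96) for a 95% confidence interval on the percent difference of mean latencies and reports when both ±10% conditions are resolved:

```typescript
// Sketch of the sequential stopping rule described above (illustrative names,
// normal approximation, 95% confidence). Inputs are latency samples in ms.
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function sampleVariance(xs: number[]): number {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
}

// 95% confidence interval for (meanB - meanA) / meanA, expressed in percent.
function percentDiffCI(a: number[], b: number[]): [number, number] {
  const mA = mean(a);
  const mB = mean(b);
  // Standard error of the difference of two independent sample means.
  const se = Math.sqrt(sampleVariance(a) / a.length + sampleVariance(b) / b.length);
  const z = 1.96;
  return [((mB - mA - z * se) / mA) * 100, ((mB - mA + z * se) / mA) * 100];
}

// A threshold is "resolved" once the interval no longer straddles it;
// sampling can stop when both -10% and +10% are resolved.
function isResolved(a: number[], b: number[], thresholdPct = 10): boolean {
  const [low, high] = percentDiffCI(a, b);
  const lowerResolved = low > -thresholdPct || high < -thresholdPct;
  const upperResolved = low > thresholdPct || high < thresholdPct;
  return lowerResolved && upperResolved;
}
```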

Fallback Logic

Consider adding a parameter like stallTimeout, which provides a timeout cutoff (in ms). This is a parameter I use in web3-rpc-failover, alongside a priority and a weight per endpoint.
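As a hedged illustration of how such a stallTimeout could work (a sketch under my own assumptions, not the web3-rpc-failover or AppRpcProvider implementation), a request that does not complete within stallTimeout milliseconds is aborted and retried against the next endpoint in priority order:

```typescript
// Hypothetical failover sketch with stallTimeout, priority, and weight.
// Not the web3-rpc-failover or AppRpcProvider API; names are illustrative.
interface Endpoint {
  url: string;
  priority: number;     // lower values are tried first
  weight: number;       // relative weight among equal priorities (unused here)
  stallTimeout: number; // ms to wait before falling through to the next endpoint
}

async function sendWithFailover(endpoints: Endpoint[], payload: unknown): Promise<Response> {
  const ordered = [...endpoints].sort((a, b) => a.priority - b.priority);
  for (const ep of ordered) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), ep.stallTimeout);
    try {
      // If the endpoint stalls past stallTimeout, the request is aborted
      // and we fall through to the next endpoint in priority order.
      return await fetch(ep.url, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(payload),
        signal: controller.signal,
      });
    } catch {
      // Stalled or failed: try the next endpoint.
    } finally {
      clearTimeout(timer);
    }
  }
  throw new Error("All endpoints stalled or failed");
}
```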
