
Estuary Reputation System #15

Open
jcace opened this issue Dec 19, 2022 · 7 comments
Labels: question Further information is requested

jcace commented Dec 19, 2022

Proposal: Estuary Reputation System

Author
Status Draft
Revision

This is a WIP

Problem Statement

Estuary currently selects SPs at random when making deals. We should build a reputation system that ranks/directs deals towards SPs that perform in a way that is advantageous for our network. We will use this issue to discuss the inputs/calculations for such a reputation system.

Currently, the most important metric we should be concerned with is **retrieval performance**.

Estuary currently does not give Storage Providers any incentive to serve up the CIDs we deal to them. This is problematic, as autoretrieve relies on SPs serving up content in order to work properly. Without working retrievals, it is risky to offload content from our shuttles, as doing so may leave files unretrievable.

Proposed Solution

  • Autoretrieve knows the count of successful/failed retrievals per SP, and we can track this data
  • Using these stats, we can come up with a retrieval-based reputation score and use it to influence how we make deals (@gmelodie has kicked us off below)
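To make the data side concrete, here is a minimal sketch of the per-SP counters we could record from autoretrieve results. The type and field names are illustrative assumptions; autoretrieve itself does not define this structure:

```python
from dataclasses import dataclass

@dataclass
class SPRetrievalStats:
    """Per-SP retrieval counters recorded from autoretrieve outcomes.

    Field names are illustrative, not an existing autoretrieve type.
    """
    sp_id: str
    successful_retrievals: int = 0
    failed_retrievals: int = 0

    @property
    def total_retrievals(self) -> int:
        return self.successful_retrievals + self.failed_retrievals

    def record(self, success: bool) -> None:
        # Called once per retrieval attempt reported by autoretrieve.
        if success:
            self.successful_retrievals += 1
        else:
            self.failed_retrievals += 1
```

These counters are exactly the inputs the reputation formula discussed below needs.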
@jcace jcace added the question Further information is requested label Dec 19, 2022
@jcace jcace assigned jcace and unassigned jcace Dec 19, 2022

gmelodie commented Dec 19, 2022

Here's my idea for a retrieval metric:
For every SP

  1. get all of its retrievals from the past month
  2. calculate the reputation score: reputation = (total_retrievals + successful_retrievals) * successful_retrievals/total_retrievals (which is the same as reputation = (total_retrievals + successful_retrievals) * retrieval_success_rate)
  3. calculate the final score: final_score = reputation * luck (luck is a random number between 0.001 and 1)

Reasoning behind each step above:

  1. If an SP goes offline it'll be ranked lower. If it comes back online, it can quickly recover its position in the ranking.
  2. It allows for a good margin between SPs with regard to both total number of retrievals and success rate (see graph below, from https://gist.github.com/gmelodie/28bc1e1e2ef7cd700600db675b238b69).
  3. Luck makes sure smaller and newer SPs have a chance against the big ones, but also ensures that the big ones keep an advantage in being chosen.

Variables we could tinker with:

  • How far back to look for historic data (one month in the example above)
  • The luck range (0.001 to 1 in the example above)

edit: normalizing, we get reputation = ((total_retrievals + successful_retrievals) * retrieval_success_rate) / (2 * total_retrievals), since 2 * total_retrievals is the maximum the unnormalized score can reach. Then we can multiply that by 100 to get a reputation between 0 and 100.
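The steps above can be sketched in Python. The function names and the zero-retrieval fallback are my assumptions, not part of the proposal:

```python
import random

def reputation_score(total_retrievals: int, successful_retrievals: int) -> float:
    """Normalized reputation in [0, 100] from one SP's past-month retrievals."""
    if total_retrievals == 0:
        return 0.0  # assumption: no data ranks at the bottom until retrievals appear
    success_rate = successful_retrievals / total_retrievals
    raw = (total_retrievals + successful_retrievals) * success_rate
    # The raw score maxes out at 2 * total_retrievals (every retrieval succeeds),
    # so dividing by that normalizes to [0, 1]; scale to [0, 100].
    return raw / (2 * total_retrievals) * 100

def final_score(reputation: float,
                luck_min: float = 0.001, luck_max: float = 1.0) -> float:
    """Apply the luck multiplier so smaller/newer SPs can occasionally outrank big ones."""
    luck = random.uniform(luck_min, luck_max)
    return reputation * luck
```

For example, an SP with 100 retrievals and 50 successes scores (100 + 50) * 0.5 / 200 * 100 = 37.5 before luck is applied.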

[Image: graph of the reputation formula, from the gist linked above]

@jimmylee

30 days to start seems good for historic data, this is awesome gabe


jcace commented Dec 19, 2022

calculate reputation score: reputation = (total_retrievals + successful_retrievals) * successful_retrievals/total_retrievals (which is the same as reputation = (total_retrievals + successful_retrievals) * retrieval_success_rate)

Would it be possible to normalize the reputation to a number between 0 and 100? That way, an SP with a perfect retrieval record and 30+ days of uptime (for example) would have a 100. That is easy to reason about at a glance, versus unbounded numbers that might get very big.

I'm thinking along the lines of what Saturn does with their weight/bias score (see: https://orchestrator.strn.pl/stats - far right column). It's a number between 1-100 and if you have a 100, you know your node is operating perfectly and will get chosen for the Saturn retrievals.

Luck range (0.001 to 1 in the above example)

I like this luck concept, since it allows for some churn. If we normalize the rep. score, then luck could be a random multiplier drawn from a sub-range of 0.0 - 1.0, e.g. 0.5 - 1.0; that would mean an SP with a score of 51 has a chance of competing with a perfect score.
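A small sketch of that selection idea, assuming normalized 0-100 scores and a luck range of [0.5, 1.0] (the function names and the pick-the-max rule are my assumptions about how the score would be used):

```python
import random

def effective_score(reputation: float, rng: random.Random) -> float:
    """Luck-adjusted score: normalized reputation (0-100) times a multiplier from [0.5, 1.0]."""
    return reputation * rng.uniform(0.5, 1.0)

def pick_sp(reputations: dict[str, float], rng: random.Random) -> str:
    """Choose the SP with the highest luck-adjusted score for the next deal."""
    return max(reputations, key=lambda sp: effective_score(reputations[sp], rng))
```

With this range, an SP at 51 can only beat a perfect 100 when its luck draw is near 1.0 and the leader's is near 0.5, so upsets stay rare; widening the luck range increases churn.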

@gmelodie

@jcace I don't think it'll get too big since we're only looking at 30 days, buuuut why not, let's do it.


jcace commented Dec 22, 2022

I think we should roll this out in two phases to provide a "soft entry", allowing retrieval issues to get ironed out before they start causing SPs to miss out on deals:

First phase -> record the stats and make them publicly visible and accessible to SPs, but don't act on them yet. Run it like that for 2-4 weeks or so to build up statistics and get a feel for the network. This would also help with troubleshooting any existing retrieval issues.

Then, second phase -> turn on reputation-based dealmaking

@jcace jcace self-assigned this Dec 22, 2022

jcace commented Jan 6, 2023

@jcace jcace changed the title Estuary - Incentivizing Retrievals Estuary Reputation System Jan 13, 2023

jcace commented Jan 30, 2023

Initially, our reputation score should be strictly a yes/no retrievability score: whether an SP is serving retrievals or not.

Longer term, it will be important for us to define what a "good Estuary retrieval" looks like. What behaviours are we interested in? For example:

  • Time to first byte (TTFB)
  • Bandwidth / transfer speeds, speed variability
  • Size of retrievals (how many bits transferred) vs. # of retrievals

Optimizing each of these parameters changes the type of retrieval we're incentivizing. For instance, transfer speed matters more than TTFB if we're serving large files (e.g. video), whereas TTFB matters more if we're serving many small files (e.g. static websites).
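One hypothetical way to blend those behaviours into a single quality score; the weights, targets, and function name here are illustrative assumptions, not a proposal we've agreed on:

```python
def retrieval_quality(ttfb_ms: float, throughput_mbps: float,
                      w_ttfb: float = 0.5, w_throughput: float = 0.5) -> float:
    """Blend TTFB and transfer speed into a 0-1 quality score.

    Each metric maps to [0, 1] against an illustrative target:
    TTFB of 0 ms scores 1.0, 2000 ms or worse scores 0.0;
    throughput of 100 Mbps or better scores 1.0.
    """
    ttfb_score = max(0.0, 1.0 - ttfb_ms / 2000.0)
    throughput_score = min(1.0, throughput_mbps / 100.0)
    return w_ttfb * ttfb_score + w_throughput * throughput_score
```

Shifting weight toward w_throughput favours SPs serving large files (video), while weighting w_ttfb favours many small, latency-sensitive retrievals (static websites), which is exactly the trade-off described above.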

Saturn's incentive approach prioritized TTFB - read more about it here https://hackmd.io/@cryptoecon/saturn-aliens/%2FYOuJDLUUQieYfpEcAYSCfQ
