-
Notifications
You must be signed in to change notification settings - Fork 17
Add HTTP Retrieval check #87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Do Boxo/Rainbow/Kubo also try a HEAD request first before trying to download? Or just optimistically try a GET? |
They do HEAD if the endpoint supports it. |
How do they know if the endpoint supports without trying? Will they try to download if the endpoint doesn't support HEAD? |
a746211
to
7db6ecf
Compare
There is a first "Connect" step where we probe the peers/http addresses. We probe HEAD and if that doesn't work, GET. We request the identity CID. Whether the peer supports HEAD is stored in the peerstore. With that information, subsequent requests for WANT_HAVE are performed via HEAD if supported, if not, a GET requests is triggered. In our case here, we don't trigger GET requests if HEAD is not supported (which we know after probing). |
(can be reviewed while I make tests) |
From the large providers, do you know which already support HEAD? |
Pinata and Storacha do for sure. |
web/index.html
Outdated
|
||
if (respObj.DataAvailableOverHTTP.Enabled === true) { | ||
if (respObj.DataAvailableOverHTTP.Error !== "") { | ||
outText += "❌ There was an error downloading the CID via HTTP: " + respObj.DataAvailableOverHTTP.Error + "\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hsanjuan Will this be how we show that the endpoint is missing HTTPS or that TLS certificate validation failed?
We need to surface that in a clear way, to help storage providers fix their setups without bothering us for analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't care about certificate validity, but we do care about http2 (which doesn't get setup unless certificate is valid I think).
In that case, when no http/2 present, we would return an error which would be displayed here.
However we return a standard "no valid addresses" error. We would need to be more specific regarding the error, but that needs to be fixed in boxo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't care about certificate validity
I might be missing something but why not? Without valid certs you won't be accessible from a browser or by default from boxo, kubo, etc. so why wouldn't we care?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm ok the browser, that thing... so we do care, sorry.
When running a check with an HTTPS maddr I consistently get a 500 with context deadline exceeded.
|
Adds a mock trustless gateway server and a mock routing v1 server and checks that finding a CID works via both.
This is an issue with the existing code: it will timeout on DHT lookups and fail everything that comes after, including testing bitswap providers found via IPNI. In production the issue is not apparent because AcceleratedDHT is true, and the DHT lookup finishes quickly within the timeout. Fixing that warrants a bit of refactoring, merging of the two check paths (with and without one given multiaddr, it run the same checks) etc. |
It's strange that the with one maddr + cid check would fail because of a DHT lookup, causing it to not achieve it's main goal of checking retrievability from the peer. I just pushed 8745dff (feel free to revert) thinking we can reduce the chances of a timeout by doing more work concurrently (though it didn't help with this). What do you think about only performing the DHT checks after or while we check HTTP retrievability (the primary purpose of this check), since it's not dependent on the results from the DHT/IPNI lookups, and is likely a cheaper check to run anyways. This would result in a useful result, even if we don't have the results for |
probably 8745dff doesn't fix it. As the idea is also that DHT addresses are used to test retrieability later iirc. And I think this will still block. My opinion is that:
|
Otherwise:
|
I think that this working in production or with the accelerated DHT client ins't good reason to not fix this as part of this PR. I don't have a strong opinion on how, but adding a separate timeout for the DHT lookups seems reasonable, given that for routable peers/cids, we should be able to get a response under 5 seconds. If we don't find it in that time window, the chances are pretty slim that waiting out the default 60 second timeout is going to yield any results. Also, we don't even support HTTP providers yet in the DHT, so we're doing all of this work right now, blocking on the main purpose of this check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I pushed two fixes:
- Add separate timeout for dht peer lookup in peer-specific check
- make sure that when we don't filter out http providers from IPNI, when doing a peer-specific check (so that we can correctly render whether an http provider was found in the IPNI)
This PR adds an HTTP retrieval check (close #73)
*/http
or cannot connect, the relevant errors are returned.HEAD
, this is indicatedHEAD
request and inform if the destination has the CID (200). Downloading the content opens us to a waste of bandwidth for a request that the user can perform themselves.Related ipshipyard/roadmaps#12