Add HTTP Retrieval check #87

hsanjuan · 2025-05-22T16:35:48Z

This PR adds an HTTP retrieval check (closes #73).

The multiaddresses of the peer are passed to a function that performs an HTTP retrieval check for the CID.
If none of the multiaddresses are */http, or we cannot connect to them, the relevant errors are returned.
If the endpoint does not support HEAD, this is indicated.
We do not download the CID; we just perform a HEAD request and report whether the destination has the CID (200). Downloading the content would waste bandwidth on a request that users can perform themselves.

2color · 2025-05-23T11:55:12Z

We do not download the CID, we just perform a HEAD request and inform if the destination has the CID (200). Downloading the content opens us to a waste of bandwidth for a request that the user can perform themselves.

Do Boxo/Rainbow/Kubo also try a HEAD request first before trying to download? Or just optimistically try a GET?

hsanjuan · 2025-05-23T11:57:14Z

Do Boxo/Rainbow/Kubo also try a HEAD request first before trying to download? Or just optimistically try a GET?

They do HEAD if the endpoint supports it.

2color · 2025-05-23T12:05:03Z

They do HEAD if the endpoint supports it.

How do they know if the endpoint supports without trying?

Will they try to download if the endpoint doesn't support HEAD?

hsanjuan · 2025-05-26T08:14:48Z

Some screenshots:

hsanjuan · 2025-05-26T08:30:56Z

They do HEAD if the endpoint supports it.

How do they know if the endpoint supports without trying?

Will they try to download if the endpoint doesn't support HEAD?

There is a first "Connect" step where we probe the peer's HTTP addresses. We probe with HEAD and, if that doesn't work, with GET, requesting the identity CID. Whether the peer supports HEAD is stored in the peerstore. With that information, subsequent WANT_HAVE requests are performed via HEAD if supported; if not, a GET request is triggered.

In our case here, we don't trigger GET requests if HEAD is not supported (which we know after probing).

hsanjuan · 2025-05-26T09:37:19Z

(can be reviewed while I make tests)

2color · 2025-05-26T11:22:03Z

They do HEAD if the endpoint supports it.

How do they know if the endpoint supports without trying?
Will they try to download if the endpoint doesn't support HEAD?

There is a first "Connect" step where we probe the peer's HTTP addresses. We probe with HEAD and, if that doesn't work, with GET, requesting the identity CID. Whether the peer supports HEAD is stored in the peerstore. With that information, subsequent WANT_HAVE requests are performed via HEAD if supported; if not, a GET request is triggered.

In our case here, we don't trigger GET requests if HEAD is not supported (which we know after probing).

From the large providers, do you know which already support HEAD?

hsanjuan · 2025-05-26T12:36:49Z

From the large providers, do you know which already support HEAD?

Pinata and Storacha do for sure.

lidel · 2025-05-28T14:25:10Z

web/index.html

+
+        if (respObj.DataAvailableOverHTTP.Enabled === true) {
+          if (respObj.DataAvailableOverHTTP.Error !== "") {
+              outText += "❌ There was an error downloading the CID via HTTP: " + respObj.DataAvailableOverHTTP.Error + "\n"


@hsanjuan Will this be how we show that the endpoint is missing HTTPS or that TLS certificate validation failed?

We need to surface that in a clear way, to help storage providers fix their setups without bothering us for analysis.

hsanjuan · 2025-05-30

We don't care about certificate validity, but we do care about HTTP/2 (which, I think, doesn't get set up unless the certificate is valid).

In that case, when HTTP/2 is not present, we would return an error, which would be displayed here.

However, we currently return a standard "no valid addresses" error. We would need to be more specific regarding the error, but that needs to be fixed in boxo.

aschmahmann · 2025-05-30

We don't care about certificate validity

I might be missing something, but why not? Without valid certs you won't be accessible from a browser, or by default from boxo, kubo, etc., so why wouldn't we care?

hsanjuan · 2025-05-30

Hmm, OK, the browser, that thing... So we do care, sorry.

2color · 2025-05-29T10:56:09Z

When running a check with an HTTPS maddr I consistently get a 500 with context deadline exceeded.

cid: bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi
multiaddr: /dns4/dag.w3s.link/tcp/443/https/p2p/QmUA9D3H7HeCYsirB3KmPSvZh3dNXMZas6Lwgr4fv1HTTp
ipniIndexer: https://cid.contact
timeoutSeconds: 60
httpRetrieval: on

hsanjuan · 2025-05-30T14:55:23Z

When running a check with an HTTPS maddr I consistently get a 500 with context deadline exceeded.

cid: bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi
multiaddr: /dns4/dag.w3s.link/tcp/443/https/p2p/QmUA9D3H7HeCYsirB3KmPSvZh3dNXMZas6Lwgr4fv1HTTp
ipniIndexer: https://cid.contact
timeoutSeconds: 60
httpRetrieval: on

This is an issue with the existing code: it will time out on DHT lookups and fail everything that comes after, including testing bitswap providers found via IPNI. In production the issue is not apparent because AcceleratedDHT is true and the DHT lookup finishes quickly, within the timeout. Fixing that warrants a bit of refactoring, such as merging the two check paths (with and without a given multiaddr, they run the same checks).

2color · 2025-06-02T09:09:49Z

This is an issue with the existing code: it will time out on DHT lookups and fail everything that comes after, including testing bitswap providers found via IPNI. In production the issue is not apparent because AcceleratedDHT is true and the DHT lookup finishes quickly, within the timeout. Fixing that warrants a bit of refactoring, such as merging the two check paths (with and without a given multiaddr, they run the same checks).

It's strange that the check with one maddr + CID would fail because of a DHT lookup, preventing it from achieving its main goal of checking retrievability from the peer. I just pushed 8745dff (feel free to revert), thinking we can reduce the chances of a timeout by doing more work concurrently (though it didn't help with this).

What do you think about only performing the DHT checks after, or while, we check HTTP retrievability (the primary purpose of this check)? It doesn't depend on the results of the DHT/IPNI lookups and is likely a cheaper check to run anyway. This would produce a useful result even if we don't have the results for ProviderRecordFromPeerInDHT, ProviderRecordFromPeerInIPNI, and PeerFoundInDHT (we'd need to adjust for a more graceful failure).

hsanjuan · 2025-06-02T14:21:36Z

8745dff probably doesn't fix it, since the idea is also that DHT addresses are used to test retrievability later, IIRC. And I think this will still block.

My opinion is that:

Things work in production thanks to the accelerated DHT.
There is no need to make the code flow more complicated than it is, and we can leave things as they are.
These issues warrant a refactor and, if that happens, it is worth doing a big refactor so that the same flow is followed in all checks and the given peer address is just a hint that gets added to everything else discovered.

hsanjuan · 2025-06-02T14:22:26Z

Otherwise:

HTTP check can be done in parallel to the rest.
DHT lookup should have its own timeout, so that other checks have time to proceed.

2color · 2025-06-02T14:46:59Z

I think that this working in production, or with the accelerated DHT client, isn't a good reason not to fix this as part of this PR.

I don't have a strong opinion on how, but adding a separate timeout for the DHT lookups seems reasonable, given that for routable peers/cids, we should be able to get a response under 5 seconds. If we don't find it in that time window, the chances are pretty slim that waiting out the default 60 second timeout is going to yield any results.

Also, we don't even support HTTP providers in the DHT yet, so we're doing all of this work right now while blocking the main purpose of this check.

2color

LGTM.

I pushed two fixes:

Add a separate timeout for the DHT peer lookup in the peer-specific check.
Make sure that we don't filter out HTTP providers from IPNI when doing a peer-specific check (so that we can correctly render whether an HTTP provider was found in IPNI).

hsanjuan force-pushed the http-retr branch from 5589a62 to fbc75b0 Compare May 23, 2025 11:57

hsanjuan force-pushed the http-retr branch 2 times, most recently from a746211 to 7db6ecf Compare May 26, 2025 08:13

Add HTTP-retrieval check to ipfs-check

05d1154

hsanjuan force-pushed the http-retr branch from 7db6ecf to 05d1154 Compare May 26, 2025 08:27

hsanjuan marked this pull request as ready for review May 26, 2025 08:31

hsanjuan changed the title ~~Add HTTP Retrieval check (WIP)~~ Add HTTP Retrieval check May 26, 2025

2color reviewed May 26, 2025

daemon.go (outdated)

2color reviewed May 26, 2025

daemon.go (outdated)

2color reviewed May 26, 2025

web/index.html (outdated)

2color added 2 commits May 26, 2025 12:28

fix: indentation

6dd169d

fix: make sure providers with data are shown first

a3ac11f

hsanjuan added 3 commits May 26, 2025 20:14

Enable HTTP check by default

6fdf259

Rename otherInfo to libp2pInfo

89365e6

Add "Enabled" key to check outputs, improve response handling in index.html

c2067a5

hsanjuan requested a review from 2color May 26, 2025 19:00

lidel reviewed May 28, 2025

2color reviewed May 29, 2025

daemon.go (outdated)

2color added 2 commits May 29, 2025 10:49

fix: rendering of peer id for http peers

ae8d9c8

fix: render http error once

659cd69

Add tests for HTTP check

d56eb26

Adds a mock trustless gateway server and a mock routing v1 server and checks that finding a CID works via both.

Same provOutput for a provider with both http and bitswap

295cf59

hsanjuan requested review from 2color and lidel May 30, 2025 15:02

hsanjuan and others added 4 commits May 30, 2025 17:15

http: skip ssl verification only in tests

fbbb380

index: always show http error when set, then show not connected/not found messages.

b37fa57

fix: check peer addrs in dht concurrently

8745dff

fix: bug sorting when addrs is null

ec7f6bb

2color added 2 commits June 3, 2025 09:48

fix: add 15 second timeout to peer lookup in dht

82a3d27

fix: bug where http peers weren't found in ipni

18955f2

2color approved these changes Jun 3, 2025

hsanjuan merged commit 9d34f34 into main Jun 3, 2025
8 checks passed

hsanjuan deleted the http-retr branch June 3, 2025 08:36

lidel mentioned this pull request Jun 12, 2025

Overhaul the IPFS docs section on Troubleshooting — IPFS/2025 ipshipyard/roadmaps#13

Closed
