IPIP-322: Content Routing Hints #322

Closed

@guseggert guseggert commented Sep 19, 2022

(co-authored by @guseggert and @lidel )

### URI

- **Reframe**
- HTTPS URL that ends with `/reframe` MUST be interpreted as a Reframe hint, for example:
Contributor Author

How do we specify different codecs like dag-cbor? Do we need to add this to the Reframe spec for HTTP URLs? And what about non-HTTP transports? Would that need to be something we add to multiaddr?

Member

> How do we specify different codecs like dag-cbor?

That is up to the HTTP client, which controls the `Accept` header sent with the request.

I think the details of `/reframe` over HTTP are specified in reframe/REFRAME_HTTP_TRANSPORT.md.
It already requires the endpoint to be named `/reframe`, but we could add a paragraph to that spec that states it more explicitly.
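
To illustrate the point that the client chooses the codec, here is a minimal sketch that builds (but does not send) an HTTP request whose `Accept` header asks for DAG-CBOR. The endpoint URL is hypothetical, and using `application/vnd.ipld.dag-cbor` here is an assumption for illustration only; the Reframe HTTP transport spec remains the authority on which media types an endpoint actually honors.

```python
from urllib.request import Request

# Hypothetical endpoint; any HTTPS URL ending in /reframe would do.
REFRAME_URL = "https://example.net/reframe"


def build_reframe_request(url: str, media_type: str) -> Request:
    """Build a request whose Accept header names the codec the client wants."""
    return Request(url, headers={"Accept": media_type})


# Assumption: the endpoint understands this media type.
req = build_reframe_request(REFRAME_URL, "application/vnd.ipld.dag-cbor")
print(req.get_header("Accept"))
```

Nothing about the codec appears in the hint itself in this sketch: the same `/reframe` URL could be queried with a different `Accept` value by a different client.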

> what about non-HTTP transports? Would that need to be something we add to multiaddr?

We probably want to register a `/reframe` protocol at https://github.com/multiformats/multiaddr/blob/master/protocols.csv, allowing for multiaddrs like:

/dns4/example.net/tcp/443/tls/http/reframe
/dns4/example.net/tcp/443/wss/reframe

(Also looping in the libp2p team to sanity-check whether this is the correct way of representing it.)
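
As a rough sketch of how a node might recognize such an address, the snippet below splits a textual multiaddr on `/` and checks whether the final component is `reframe`. It uses plain string handling rather than a multiaddr library, and it assumes the `/reframe` protocol has been registered as proposed, which has not happened at the time of this discussion. The function names are hypothetical.

```python
def split_multiaddr(addr: str) -> list:
    """Split a textual multiaddr into its slash-separated components."""
    if not addr.startswith("/"):
        raise ValueError(f"not a multiaddr: {addr!r}")
    return [part for part in addr.split("/") if part]


def is_reframe_multiaddr(addr: str) -> bool:
    """True if the last component is the proposed 'reframe' protocol."""
    parts = split_multiaddr(addr)
    return bool(parts) and parts[-1] == "reframe"


print(is_reframe_multiaddr("/dns4/example.net/tcp/443/tls/http/reframe"))
print(is_reframe_multiaddr("/dns4/example.net/tcp/443/wss/reframe"))
```

Naive splitting like this breaks for protocols whose values may themselves contain slashes, so a real implementation would rely on a proper multiaddr parser.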

@guseggert guseggert self-assigned this Sep 20, 2022

@aschmahmann aschmahmann left a comment

Great to see this being worked on, with some proposed syntax and formats. As linked in the document, this has been a long-standing request (I think perhaps even older than those linked issues).

Comment on lines +74 to +76
## API Requests

We would add an optional `--providers` parameter that allows passing ad-hoc hints scoped to a specific command.
Contributor

This is specific to the kubo HTTP API, right?

Member

Yes, we will move this to the "Notes for implementers" section as an example of a CLI API.


# Specification

The default implicit content router for IPFS nodes is the IPFS public DHT and LAN DHT. Any additional content routers must be opted-in by users when making API requests.
Contributor

This section seems independent of the rest of the specification, which is about specifying routing hints. I suspect it is also the most controversial part, given prior resistance to defining even a single routing system, or a collection of them, as the "default/standard IPFS content routing systems".

This happens to be how the most popular and oldest IPFS implementations (e.g. kubo) have been operating over the last several years, but it could reasonably change over time. Overall this seems to be related to an independent discussion on what should be "required" for an IPFS implementation and/or if/how we should label a collection of protocols and properties that some IPFS implementations have that will make systems easier to reason about (e.g. Bitswap 1.2.0, IPFS Public DHT, libp2p with some set of transports and upgraders, etc.).

I'd try to separate this long-requested and likely quite useful feature from the more nebulous "what is IPFS" kinds of conversations. That discussion still seems like an important one to have separately, with its outcome documented so it can be referenced and/or modified in the future.


The default implicit content router for IPFS nodes is the IPFS public DHT and LAN DHT. Any additional content routers must be opted-in by users when making API requests.

Users may opt-in to additional content routers using “content routing hints”, which give *suggestions* to the IPFS node about where provider records for the given CID may be found. This can include, but is not limited to, Reframe URLs, pubsub topics, multiaddrs, etc. As hints, the IPFS node is free to decide the order and strategy for using hints. If an IPFS node implements support for a hint that is specified below, it must follow the specification for that hint type.
Contributor

Allowing for content routing hints seems fine, but IMO doing this comes with some items that may need addressing:

  • Systems that try to resolve IPFS content will likely need to be able to override user inputs to add or remove content routers.
    • For example, I could see IPFS Companion wanting a configuration for auto-appending content routing systems, so that a favorite system is always checked, or for restricting which types of routing systems are permissible.
  • Gateways might need to be able to return errors or a status indicating which types of content routing systems can be used. For example, if the gateway has blocked multiaddr hints but the user passed one and got a timeout error, the user should probably be told that the multiaddr hint was not allowed.
  • More documentation, and modifications to tools like https://check.ipfs.network/, to help people understand the implications of using new routing systems and how different systems interoperate.
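
The first two items could be sketched roughly as follows: an operator-side policy filters user-supplied hints by kind, appends its own preferred routers, and keeps a list of rejected hints so that they can be reported back instead of surfacing as an opaque timeout. `HintPolicy`, `apply_policy`, and the `(kind, value)` hint representation are hypothetical names invented for this sketch and are not part of the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class HintPolicy:
    """Operator-side policy; all field names here are hypothetical."""
    allowed_kinds: set = field(default_factory=set)     # e.g. {"reframe"}
    always_append: list = field(default_factory=list)   # (kind, value) pairs


def apply_policy(user_hints, policy):
    """Return (accepted, rejected) lists of (kind, value) hints."""
    accepted, rejected = [], []
    for kind, value in user_hints:
        target = accepted if kind in policy.allowed_kinds else rejected
        target.append((kind, value))
    # Operator-configured routers are added regardless of user input.
    for hint in policy.always_append:
        if hint not in accepted:
            accepted.append(hint)
    return accepted, rejected


policy = HintPolicy(
    allowed_kinds={"reframe"},
    always_append=[("reframe", "https://favorite.example/reframe")],
)
accepted, rejected = apply_policy(
    [("reframe", "https://example.net/reframe"),
     ("multiaddr", "/dnsaddr/storage-provider1.com")],
    policy,
)
print(accepted)
print(rejected)
```

A gateway could surface `rejected` in an error body or response header; how exactly to do that is left open in the discussion above.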

@ajnavarro
Member

Nice ideas.

Regarding hints that use Reframe: I cannot see any limits on which requests may be made to a Reframe endpoint once it has been supplied alongside a particular CID.

Maybe we should specify that requests to that Reframe endpoint can only be made if they relate to the "root" CID, to avoid flooding small indexers with unnecessary requests.
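
One possible reading of this suggestion, as a minimal sketch: the node binds each Reframe hint to the CID it arrived with and refuses to query the endpoint for anything else. What counts as "related" is not defined in this thread; the sketch assumes it means the root CID plus any CIDs the node has itself found to be linked from the root DAG. The class and method names are hypothetical, and the CID strings are placeholders.

```python
class ScopedReframeHint:
    """A Reframe endpoint hint that may only be queried for related CIDs."""

    def __init__(self, endpoint: str, root_cid: str):
        self.endpoint = endpoint
        self.root_cid = root_cid
        self._related = {root_cid}

    def mark_related(self, cid: str) -> None:
        """Record a CID the node found to be linked from the root DAG."""
        self._related.add(cid)

    def may_query(self, cid: str) -> bool:
        """True only for the root CID or CIDs previously marked as related."""
        return cid in self._related


hint = ScopedReframeHint("https://example.net/reframe", "root-cid-placeholder")
hint.mark_related("child-cid-placeholder")
print(hint.may_query("root-cid-placeholder"))
print(hint.may_query("unrelated-cid-placeholder"))
```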

Comment on lines +67 to +68
- `/ipfs/{cid}?providers=url,multiaddr,somethingelse?`
- Example: `https://dweb.link/ipfs/bafy..acbd?providers=/dnsaddr/storage-provider1.com`
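
A minimal sketch of how a gateway might read the proposed `providers` parameter: it splits the comma-separated value and labels each entry as a Reframe URL (HTTPS ending in `/reframe`), a multiaddr (leading `/`), or unknown. The function names and the classification labels are assumptions made for illustration; only the query parameter name and the example URL come from the proposal text.

```python
from urllib.parse import parse_qs, urlsplit


def classify_hint(hint: str) -> str:
    """Label a hint by its textual shape (hypothetical labels)."""
    if hint.startswith("https://") and hint.endswith("/reframe"):
        return "reframe"
    if hint.startswith("/"):
        return "multiaddr"
    return "unknown"


def parse_provider_hints(url: str) -> list:
    """Extract (kind, value) pairs from the ?providers= query parameter."""
    raw_values = parse_qs(urlsplit(url).query).get("providers", [])
    hints = []
    for raw in raw_values:
        for item in raw.split(","):
            item = item.strip()
            if item:
                hints.append((classify_hint(item), item))
    return hints


example = "https://dweb.link/ipfs/bafy..acbd?providers=/dnsaddr/storage-provider1.com"
print(parse_provider_hints(example))
```

Because hints are comma-separated, a hint that itself contains a comma or `&` would need percent-encoding; the sketch does not attempt to settle that detail.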
Contributor

In designing both delegated routing and the content routing setup, I thought the goal was to maintain the property we have today of content-addressed data.

If we have to specify the origin for the data, we've lost some of this property.

I guess there's a benefit of discovery of content routers through this mechanism, but I would hope that's not the only way we learn about content routers in kubo.

Member

> if we have to specify the origin for the data, we've lost some of this property.

We share the concern, and that is why this proposal exists.

We will lose it for sure if a public gateway decides to use a specific set of indexers, and to not use others.
A user may know that their data is in place X, but will be forced to use a specific indexer Y because that is the only thing the gateway speaks.

> would hope that's not the only way we learn about content routers in kubo.

This proposal does not replace `Routing.Routers`, but complements it. Gateways should not be gatekeepers. We want to see multiple indexers, and each indexer should not have to lobby for inclusion on some default list in order to provide utility to users.

By giving users the ability to pass additional routing hints, we remove the surface for undesired storage and content routing lock-in caused by the tyranny of the default.

Gateway operators remain in control and can set any custom peering and routing they wish, but users should still be able to improve routing further by providing an optional hint that the gateway/node can leverage for finding providers when the CID can't be found using conventional methods. This course-corrects any routing gaps that may occur.

(Note to self to incorporate this into the spec)
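
The fallback behaviour described in this reply could look roughly like the sketch below: the operator's configured routers are consulted first, and user-supplied hint routers are tried only if those return nothing. This is just one possible ordering, since the proposal leaves order and strategy to the node. The `Router` type and `find_providers` function are hypothetical, and the routers here are stand-in functions rather than real DHT or Reframe clients.

```python
from typing import Callable, List

# Hypothetical router interface: takes a CID, returns provider addresses.
Router = Callable[[str], List[str]]


def find_providers(cid: str,
                   configured: List[Router],
                   hinted: List[Router]) -> List[str]:
    """Try operator-configured routers first, then user-hinted ones."""
    for router in list(configured) + list(hinted):
        providers = router(cid)
        if providers:
            return providers
    return []


def empty_dht(cid: str) -> List[str]:
    """Stand-in for a conventional router that finds nothing."""
    return []


def hinted_router(cid: str) -> List[str]:
    """Stand-in for a router reached through a user-supplied hint."""
    return ["/dnsaddr/storage-provider1.com"]


print(find_providers("cid-placeholder", [empty_dht], [hinted_router]))
```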

BigLep commented Oct 4, 2022

2022-10-04 conversation: turn this spec into an IPIP for the gateway spec, as a quality-of-life item for gateways.

@lidel lidel changed the title feat: Content Routing Hints feat-322: Content Routing Hints Oct 26, 2022
@lidel lidel changed the title feat-322: Content Routing Hints IPIP-322: Content Routing Hints Oct 26, 2022
@BigLep BigLep marked this pull request as draft November 17, 2022 00:58

BigLep commented May 11, 2023

Closing this given that it has been open for 6 months without traction, and I'm not seeing anything in the short term that will push it forward. It can be reopened if/when there is a push for it.

@BigLep BigLep closed this May 11, 2023
Status: Deferred