Add a section about network indexer to IPFS or Filecoin #428

Open
walkerlj0 opened this issue Dec 8, 2022 · 1 comment
Labels
content-request A request for additional content to be created for curriculum IPFS belongs in the IPFS section of the curriculum L Work size Large (3 days-week)

Comments

@walkerlj0
Contributor

walkerlj0 commented Dec 8, 2022

Problem Description

Purpose

Describe what need this project is filling, or what problem it is solving, and who the intended audience or end user is

Proposed Solution

What features, user stories, or product/content you would like as an output

Where can we find out more about this topic?
Torfinn Olson & Ivan Schansy
https://filecoin.io/blog/posts/introducing-the-network-indexer/
https://github.com/ipni/specs/blob/main/IPNI.md
https://www.youtube.com/watch?v=sunA7JO4rHQ&list=PLuhRWgmPaHtSF3oIY3TzrM-Nq5IU_RTXb&index=11
https://www.youtube.com/watch?v=g7iwPIpeSIo&list=PLuhRWgmPaHtSF3oIY3TzrM-Nq5IU_RTXb&index=8
https://docs.cid.contact/filecoin-network-indexer/overview
https://github.com/ipni/storetheindex

Additional context
The network indexer uses both Graphsync and Bitswap, and makes interoperability between IPFS and Filecoin possible

Milestones (Optional)

1) Milestone Name Text

2) Milestone Name Text

Acceptance Criteria

Describe the output and the criteria for considering this task completed

@walkerlj0 walkerlj0 added IPFS belongs in the IPFS section of the curriculum content-request A request for additional content to be created for curriculum L Work size Large (3 days-week) labels Dec 8, 2022
@walkerlj0
Contributor Author

Info dump from launchpad-coloweek-v7

@walker Ford
had a lot of REALLY good questions about some of the more functional aspects of the network indexer during my presentation, and it took me a little time to put together responses. I figured everyone should benefit from the answers, so I've decided to share them here instead of leaving them lodged within the presentation.

Q: Why does the Indexer watch the SP announcement chain rather than the blockchain directly? I'm confused about the benefits of the announcement/advertisement chain.
A: Capturing advertisements in situ, as data is added to the network as a result of a Filecoin deal, is simply the lowest-overhead (most efficient) way to capture this information, as opposed to querying it by crawling the entire chain. Advertisement chain chunks give us a very fast way to find multihashes and return them. Crawling the chain would be very slow and compute intensive in comparison; consider the difference in the number of hops.
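To make the "number of hops" point concrete, here is a minimal sketch of how an indexer could catch up on a provider's advertisement chain. It assumes each advertisement links back to the previous one (as described in the IPNI spec linked above) and that the indexer remembers the last advertisement it processed. The names `Advertisement`, `sync_new_ads`, and `build_index` are hypothetical and are not taken from storetheindex.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Advertisement:
    ad_id: str                   # hypothetical stand-in for the advertisement's CID
    previous_id: Optional[str]   # link to the previous advertisement, None for the first
    multihashes: List[str]       # simplified: real advertisements link to entry chunks


def sync_new_ads(store: Dict[str, Advertisement], head: str,
                 last_seen: Optional[str]) -> List[Advertisement]:
    """Walk back from the chain head until the last processed ad, oldest first."""
    new_ads: List[Advertisement] = []
    cursor: Optional[str] = head
    while cursor is not None and cursor != last_seen:
        ad = store[cursor]
        new_ads.append(ad)
        cursor = ad.previous_id
    new_ads.reverse()  # apply in the order the provider published them
    return new_ads


def build_index(ads: List[Advertisement], provider: str) -> Dict[str, Set[str]]:
    """Map each advertised multihash to the set of providers that announced it."""
    index: Dict[str, Set[str]] = {}
    for ad in ads:
        for mh in ad.multihashes:
            index.setdefault(mh, set()).add(provider)
    return index
```

The work is proportional to the number of new advertisements since the last sync, not to the length of the whole chain.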

Q: How big is the Indexer database?
A: See Disk space utilization. It is ~3TiB, but it was once dramatically larger; the current size is the result of recent improvements that dramatically optimized data storage practices.

Q: Are there plans to shard the Indexer database and distribute it across more nodes?
A: The plan to shard the indexer database across more nodes is described in the Indexer scaling plan.

Q: What kinds of information does the Indexer have that the DHT does not have?
A: The simple answer is 'metadata': who has what, and which protocols it can be retrieved over. The DHT stores IPNS records, which the IPNI does not, although there is a spec for making that happen sometime in the future (see Naam naming system powered by IPNI).

The technical answer:
Recommend reading - Ingestion design doc
(Provider,ContextID, ProviderID,Metadata,Signature,Entries)
Provider: The peer.ID of the libp2p host providing the content.
Addresses: Multiaddrs to provide to clients in order to connect to the provider.
Entries: Link to a data structure that contains the advertised multihashes.
ContextID: Identifier used to subsequently update or delete an advertisement.
ProviderID:
Metadata: Additional data returned in client query responses for any of the CIDs in this advertisement. Expected to start with a varint indicating the format of the remaining metadata. Recommended to keep it below 100 bytes. The reference provider currently supports the Bitswap, Filecoin (Graphsync), and HTTP protocols, as defined in the library.
Signature: Signed by provider private key.
Entries: can be an interlinked chain of entrychunk nodes, or an IPLD HAMT ADL where the keys in the map represent the multihashes and the values are set to true.
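As a rough illustration of the fields above, here is a minimal sketch of an advertisement record plus the varint prefix on the metadata. The `Advertisement` class and helper names are hypothetical, ProviderID is left out because it has no description above, and the protocol codes are assumed to match the multicodec table (check it before relying on them).

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed multicodec transport codes; verify against the multicodec table.
PROTOCOLS = {
    0x0900: "bitswap",
    0x0910: "graphsync-filecoinv1",
    0x0920: "ipfs-gateway-http",
}


def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative int as an unsigned varint (7 bits per byte, LSB first)."""
    if value < 0:
        raise ValueError("uvarint cannot be negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uvarint(data: bytes) -> Tuple[int, bytes]:
    """Decode a leading unsigned varint and return (value, remaining bytes)."""
    value, shift = 0, 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, data[i + 1:]
        shift += 7
    raise ValueError("truncated varint")


@dataclass
class Advertisement:
    provider: str         # peer ID of the libp2p host providing the content
    addresses: List[str]  # multiaddrs clients use to connect to the provider
    entries: str          # link to the structure holding the advertised multihashes
    context_id: bytes     # used later to update or delete this advertisement
    metadata: bytes       # starts with a varint naming the retrieval protocol
    signature: bytes      # produced with the provider's private key (not checked here)


def protocol_name(metadata: bytes) -> str:
    """Read the varint prefix of the metadata and name the retrieval protocol."""
    code, _rest = decode_uvarint(metadata)
    return PROTOCOLS.get(code, f"unknown(0x{code:x})")
```

For example, `protocol_name(encode_uvarint(0x0900))` yields `"bitswap"` under the assumed code table, and any bytes after the prefix are protocol-specific.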

Q: Where do/did the IPFS Hydras fit in with the indexer?
A: 🐍 The hydra nodes acted as a lookup for the DHT, which previously synced with the Indexer as a post-lookup activity. Hydras are presently performing a bridging function, which is the path IPFS gateways would have to use in order to pull data from the network indexer. IPFS gateways now have the option of querying the indexer directly via what we're calling HTTP delegated routing. In the future we will have ambient discovery of content routers, which will greatly improve the speed and efficiency of this process while reducing the IPFS network's sole reliance on the Kademlia DHT as a source of content lookups.
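For a feel of what querying the indexer directly looks like, here is a minimal sketch that builds a lookup URL and pulls provider records out of a response. The `/cid/{cid}` path and the response field names are assumptions based on the cid.contact find API as I understand it, `SAMPLE_RESPONSE` is made-up illustrative data, and the helper names are hypothetical. No network call is made.

```python
from typing import Any, Dict, List, Tuple


def find_url(base_url: str, cid: str) -> str:
    """Build the lookup URL for a CID (path shape is an assumption)."""
    return f"{base_url.rstrip('/')}/cid/{cid}"


def providers_from_response(response: Dict[str, Any]) -> List[Tuple[str, List[str]]]:
    """Collect (provider peer ID, multiaddrs) pairs from a find response."""
    found: List[Tuple[str, List[str]]] = []
    for mh_result in response.get("MultihashResults") or []:
        for provider_result in mh_result.get("ProviderResults") or []:
            provider = provider_result.get("Provider") or {}
            found.append((provider.get("ID", ""), list(provider.get("Addrs") or [])))
    return found


# Illustrative response only; the values are placeholders, not real records.
SAMPLE_RESPONSE = {
    "MultihashResults": [
        {
            "Multihash": "example-multihash",
            "ProviderResults": [
                {
                    "ContextID": "ZGVhbC0x",
                    "Metadata": "gBI=",
                    "Provider": {
                        "ID": "12D3KooWExample",
                        "Addrs": ["/ip4/203.0.113.7/tcp/4001"],
                    },
                }
            ],
        }
    ]
}

providers = providers_from_response(SAMPLE_RESPONSE)
```

Each result pairs a provider with the metadata needed to pick a retrieval protocol, which is the "who has what" information the answer above describes.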
Read more about ambient content routing:

Alternatively, ANY of you are welcome to join the Content Routing Workgroup I formed and participate in these discussions. We currently host biweekly meetings between the IPFS stewards team, Pro-lab, and the Network Indexer team, as well as interested stakeholders.
Read more on the Content routing workgroup page
@lindsay Walker
I'm still promising you some IPNI launchpad content 😉 and we can use this as a good FAQ start for that I think.

Additionally I'd recommend reading more here: https://docs.cid.contact/filecoin-network-indexer/overview
