feat: peer discovery and routing section (#294)

salmad3 · web-flow · commit 37935bca8e1c · 2023-03-03T10:06:16.000-08:00
diff --git a/content/concepts/discovery-routing/_index.md b/content/concepts/discovery-routing/_index.md
@@ -0,0 +1,5 @@
+---
+title : "Peer Discovery and Routing"
+description: "Peer discovery and routing protocols are used to discover and announce services to other peers."
+weight: 7
+---
diff --git a/content/concepts/discovery-routing/kaddht.md b/content/concepts/discovery-routing/kaddht.md
@@ -0,0 +1,74 @@
+---
+title: "Kademlia DHT"
+description: "The libp2p Kad-DHT subsystem is an implementation of the Kademlia
+DHT, a distributed hash table."
+weight: 226
+---
+
+## Overview
+
+The Kademlia Distributed Hash Table (DHT), or Kad-DHT, is a distributed hash table
+that is designed for P2P networks.
+
+Kad-DHT in libp2p is a subsystem based on the
+[Kademlia whitepaper](https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf).
+
+Kad-DHT offers a way to find nodes and data on the network by using a
+[routing table](https://en.wikipedia.org/wiki/Routing_table) that organizes peers based
+on how similar their keys are.
+
+<details>
+  <summary>A deeper look</summary>
+
+  The routing table is organized based on a prefix length and a distance metric.
+  The prefix length helps to group similar keys, and the distance metric helps to
+  find the closest peers to a specific key in the routing table. The table maintains
+  a list of up to `k` closest peers for each possible prefix length between `0` and `L-1`,
+  where `L` is the length of the keyspace in bits, determined by the output size of the
+  hash function used. **Kad-DHT uses SHA-256**, which gives a 256-bit keyspace, so it tries
+  to maintain `k` peers with a shared key prefix for every prefix length between `0` and
+  `255` in its routing table.
+
+  The prefix length measures the proximity of two keys in the routing table and
+  divides the keyspace into smaller subspaces, called "buckets", each containing nodes
+  that share a common prefix of bits in their SHA-256 hash. The prefix length is the
+  number of leading bits that are the same in the SHA-256 hashes of the two keys. The
+  more leading bits that match, the longer the prefix length and the closer the two
+  keys are considered to be.
+
+  The distance metric is a way to calculate the distance between two keys by
+  taking the bitwise exclusive-or (XOR) of the SHA-256 hashes of the two keys and
+  interpreting the result as an unsigned integer. A distance of `0` means the keys
+  are identical, and a distance of `1` means that the hashes differ only in their
+  last bit, which is as close as two distinct keys can be. Keys that share a longer
+  prefix are always closer than keys that share a shorter one.
+
+  This design allows for efficient and effective lookups in the routing table when
+  trying to find nodes or data that share similar prefixes.
+
+</details>
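The prefix length and XOR distance described in the details block above can be made concrete with a short Python sketch. It is illustrative only: the helper names (`kad_key`, `xor_distance`, `common_prefix_len`) and the byte strings standing in for peer identifiers are assumptions made for this example, not part of any libp2p API.

```python
import hashlib

KEY_BITS = 256  # SHA-256 produces a 256-bit keyspace


def kad_key(identifier: bytes) -> int:
    """Hash an identifier into the keyspace and read it as an unsigned integer."""
    return int.from_bytes(hashlib.sha256(identifier).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """XOR distance between two keys; 0 means the keys are identical."""
    return a ^ b


def common_prefix_len(a: int, b: int) -> int:
    """Number of leading bits two keys share (KEY_BITS when they are equal)."""
    return KEY_BITS - (a ^ b).bit_length()


# Hypothetical identifiers, used only to exercise the helpers.
local = kad_key(b"peer-A")
other = kad_key(b"peer-B")

print("distance:", xor_distance(local, other))
print("shared prefix bits:", common_prefix_len(local, other))
```

A peer whose key shares `i` leading bits with the local key would fall into the bucket for prefix length `i`, which is why the shared prefix length is enough to pick a bucket in this model.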
+
+## Peer routing
+
+The Kad-DHT uses a process called "peer routing" to discover nodes in the
+network. When looking for a peer, the local node contacts the `k` closest nodes it
+knows to the remote peer's ID and asks them for closer nodes. The local node repeats
+the process until it finds the peer or determines that it is not in the network.
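The iterative lookup described above can be sketched over a toy in-memory network. This is a simplified model rather than the libp2p implementation: keys are 4-bit integers instead of SHA-256 hashes, each routing table is a plain set, nodes are queried one at a time, and the names `closest`, `iterative_lookup`, and `network` are hypothetical.

```python
def closest(keys, target, k):
    """Return up to k keys ordered by XOR distance to target."""
    return sorted(keys, key=lambda key: key ^ target)[:k]


def iterative_lookup(network, start, target, k):
    """Walk toward target, starting from the routing table of start.

    network maps each node key to the set of keys in that node's routing
    table; it stands in for real network requests in this sketch.
    """
    known = set(network[start])
    queried = {start}
    while True:
        pending = [n for n in closest(known, target, k) if n not in queried]
        if not pending:
            # The k closest known nodes have all answered, so nothing
            # closer can be learned by asking again.
            return closest(known, target, k)
        for node in pending:
            queried.add(node)
            # The queried node answers with the k closest keys it knows of.
            known.update(closest(network[node], target, k))


# Assumed topology: 16 nodes, each knowing the peers that differ from it
# in exactly one bit (one peer per prefix length).
network = {n: {n ^ (1 << i) for i in range(4)} for n in range(16)}

result = iterative_lookup(network, start=0, target=9, k=2)
print(result)
```

In this toy topology every queried node knows a peer that is strictly closer to the target, so the first entry of the result is the target itself. A real lookup also has to handle unresponsive peers, timeouts, and concurrent queries.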
+
+## Content provider routing
+
+Kad-DHT also includes a feature for content provider discovery, where nodes can
+look up providers for a given key. The local node again contacts the `k` closest
+nodes to the key, asking them for providers of the key, closer nodes to the key,
+or both. The local node repeats the process until it finds providers for the key
+or determines that no providers are available in the network.
+
+## Bootstrap process
+
+To maintain a healthy routing table and discover new nodes, the Kad-DHT includes
+a bootstrap process that runs periodically. The process starts by generating a random peer
+ID and looking it up via the peer routing process. The node then adds the closest peers it
+discovers to its routing table and repeats the process multiple times. This process also
+includes looking up its own peer ID to improve awareness of nodes close to itself.
+
+{{< alert icon="💡" context="note" text="See the Kademlia DHT <a class=\"text-muted\" href=\"https://github.com/libp2p/specs/tree/master/kad-dht\">technical specification</a> for more details." />}}
diff --git a/content/concepts/discovery-routing/mDNS.md b/content/concepts/discovery-routing/mDNS.md
@@ -0,0 +1,26 @@
+---
+title: "mDNS"
+description: "mDNS uses a multicast system of DNS records over a local network to enable peer discovery."
+weight: 224
+---
+
+## What is mDNS?
+
+mDNS, or multicast Domain Name System, is a way for nodes to use IP multicast to
+publish and receive DNS records within a local network, as specified in
+[RFC 6762](https://www.rfc-editor.org/rfc/rfc6762). Nodes multicast queries for the
+names or services they're interested in. mDNS is commonly used on home networks
+to allow devices such as computers, printers, and smart TVs to discover each
+other and connect.
+
+## mDNS in libp2p
+
+In libp2p, mDNS is used for peer discovery, allowing peers to find each other on
+the same local network without any configuration. In the basic mDNS node
+discovery flow, a node multicasts a query, and the other nodes on the network
+reply with their multiaddresses.
+
+To learn more about definitions, specific fields, and peer discovery,
+[visit the mDNS libp2p specification](https://github.com/libp2p/specs/blob/master/discovery/mdns.md).
+<!-- ADD DIAGRAM -->
diff --git a/content/concepts/discovery-routing/overview.md b/content/concepts/discovery-routing/overview.md
@@ -0,0 +1,55 @@
+---
+title: "What is Discovery & Routing"
+description: "Peer discovery and routing protocols are used to discover and announce services to other peers and find a peer's location, respectively."
+weight: 221
+---
+
+## Overview
+
+Peer discovery and routing are two essential aspects of P2P networking. In a P2P network,
+each node must be able to discover and communicate with other nodes without the need for
+a central server.
+
+### Peer discovery
+
+Peer discovery is the process of finding other available peers in a P2P network
+and announcing services to them. Peer discovery can be done using various protocols, such
+as broadcasting a message to all peers in the network or using a bootstrap node to
+provide a list of known peers.
+
+### Peer routing
+
+Peer routing, on the other hand, refers to finding a specific peer's location
+in the network. This is typically done by maintaining a routing table or a similar
+data structure that keeps track of the network topology.
+
+Different algorithms can be used to find the "closest" neighboring peers to a given peer ID.
+A peer may use a routing algorithm to find the location of a specific peer and then
+use that information to discover new peers in the vicinity. Additionally, a peer may
+use both peer routing and peer discovery mechanisms in parallel to find new peers and
+route data to them.
+
+{{< alert icon="" context="note">}}
+In practice, the distinction between peer routing and peer
+discovery is not always clear-cut. In a real-world
+implementation, discovery and routing usually happen concurrently.
+{{< /alert >}}
+
+## Discovery and routing in libp2p
+
+libp2p provides a set of modules for different network-level functionality,
+including peer discovery and routing. Peers in libp2p can discover other
+peers using various mechanisms, such as exchanging peer
+[multiaddresses](./../../fundamentals/addressing) over the
+network, querying a directory service, or using a distributed hash table (DHT)
+to store and retrieve information about available peers.
+
+These methods include, but are not limited to:
+
+- [Rendezvous](./../rendezvous): a protocol that allows peers to exchange peer multiaddresses
+  in a secure and private manner.
+- [mDNS](./../mdns): a multicast Domain Name System (DNS) protocol that allows peers to
+  discover other peers on the local network.
+- [DHT](./../kaddht): a Distributed Hash Table. libp2p uses a DHT based on Kademlia, which
+  assigns each piece of content a unique identifier and stores records about the content on
+  the peers whose identifiers are closest to the content's identifier.
diff --git a/content/concepts/discovery-routing/rendezvous.md b/content/concepts/discovery-routing/rendezvous.md
@@ -0,0 +1,86 @@
+---
+title : "Rendezvous"
+description: "The rendezvous protocol can facilitate the routing and discovery of nodes in a P2P network using a common location."
+weight: 223
+---
+
+## What is Rendezvous?
+
+A rendezvous protocol is a routing protocol that enables nodes and resources
+in a P2P network to discover each other. A rendezvous point is used
+as a common location through which two peers can find each other.
+
+Rendezvous points are typically nodes that are well-connected and stable in
+a network and can handle large amounts of traffic and data. They
+serve as a hub through which nodes discover one another.
+
+<details>
+  <summary>Rendezvous is not decentralized</summary>
+
+  It is important to note that Rendezvous is not decentralized but rather
+  federated. While this has its use cases, it also introduces a single
+  point of failure into the network. This can be contrasted with fully decentralized
+  alternatives such as DHT (Distributed Hash Table) and Gossipsub.
+
+  [DHT](kaddht.md) is a distributed network protocol used to store and
+  retrieve data in a P2P network efficiently. It is like a hash table mapping keys
+  to values, allowing for fast lookups and efficient data distribution across the network.
+
+  [Gossipsub](pubsub.md), on the other hand, is a pub-sub (publish-subscribe) protocol
+  that is used to distribute messages and data across a network. It uses a gossip-based
+  mechanism to propagate messages throughout the network, allowing fast and efficient
+  distribution without relying on a central control point.
+
+</details>
+
+## Rendezvous in libp2p
+
+{{< alert icon="💡" context="info" text="The current rendezvous implementation replaces the initial ws-star-rendezvous implementation with rendezvous daemons and a fleet of p2p-circuit relays." />}}
+
+The libp2p rendezvous protocol can be used for different use cases. For example, it can
+be used during bootstrap to discover circuit relays that provide connectivity
+for browser nodes. Generally, a peer can use known rendezvous points to find
+peers that provide network services. Rendezvous is also used throughout the
+lifetime of an application for peer discovery by registering and polling
+rendezvous points. In an application-specific setting, rendezvous points can be
+used to progressively discover peers that can answer specific queries or host
+shards of content.
+
+The libp2p rendezvous protocol allows peers to connect to a rendezvous point and
+register their presence by sending a `REGISTER` message in one or more
+namespaces. Any node implementing the rendezvous protocol can act as a
+rendezvous point, and any peer can connect to a rendezvous point. However, a
+registration must be initiated by the peer that is being registered.
+
+By registering with a rendezvous point, peers allow for their discovery by other peers who
+query the rendezvous point. The query may:
+
+- provide namespace(s), such as `test-app`;
+- optionally provide a maximum number of peers to return;
+- include a cookie obtained from the response to a previous query, so that the
+  response to the current query only contains registrations that weren't part
+  of the previous response.
+  > This simplifies discovery as it reduces the overhead of queried peers and allows for
+  > the pagination of query responses.
+
+There is a default peer registration lifetime of 2 hours. Peers can optionally specify the
+lifetime using a TTL parameter in the `REGISTER` message, with an upper bound of 72 hours.
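The registration and query flow above, including the TTL bounds and cookie-based pagination, can be modelled with a small in-memory sketch. Everything here is an assumption made for illustration: the `RendezvousPoint` class, its method names, string peer IDs, and the integer cookie are hypothetical, and the real protocol exchanges protobuf messages over a libp2p stream with an opaque cookie. Only the 2-hour default and 72-hour upper bound come from the text.

```python
import time

DEFAULT_TTL = 2 * 60 * 60   # default lifetime: 2 hours, in seconds
MAX_TTL = 72 * 60 * 60      # upper bound: 72 hours, in seconds


class RendezvousPoint:
    """Toy in-memory rendezvous point (hypothetical, not a libp2p API)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._registrations = {}  # (namespace, peer_id) -> (expires_at, seq)
        self._seq = 0             # increases with every registration

    def register(self, peer_id, namespace, ttl=DEFAULT_TTL):
        if not 0 < ttl <= MAX_TTL:
            raise ValueError("TTL must be positive and at most 72 hours")
        self._seq += 1
        self._registrations[(namespace, peer_id)] = (self._clock() + ttl, self._seq)

    def discover(self, namespace, limit=None, cookie=0):
        """Return (peer_ids, new_cookie) for live registrations newer than cookie."""
        now = self._clock()
        fresh = sorted(
            (seq, peer_id)
            for (ns, peer_id), (expires_at, seq) in self._registrations.items()
            if ns == namespace and expires_at > now and seq > cookie
        )
        if limit is not None:
            fresh = fresh[:limit]
        new_cookie = fresh[-1][0] if fresh else cookie
        return [peer_id for _, peer_id in fresh], new_cookie


point = RendezvousPoint()
point.register("peer-a", "test-app")
point.register("peer-b", "test-app", ttl=3 * 60 * 60)

# Page through the namespace one registration at a time.
first_page, cookie = point.discover("test-app", limit=1)
second_page, cookie = point.discover("test-app", cookie=cookie)
print(first_page, second_page)
```

Passing the cookie back means the second query skips registrations already returned, which is how the sketch mirrors the pagination behaviour described in the list above.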
+
+The rendezvous protocol runs over libp2p streams using the protocol ID `/rendezvous/1.0.0`.
+
+<!-- TO ADD: Interaction diagrams and context -->
+
+### Rendezvous and publish-subscribe
+
+For effective discovery, rendezvous can be combined with [libp2p publish/subscribe](../messaging/pubsub/overview).
+At a basic level, rendezvous can bootstrap pubsub by discovering peers subscribed to a topic. The rendezvous point
+would be responsible for publishing packets and for subscribing to or unsubscribing from packet shapes.
+
+Pubsub can also be used as a mechanism for building rendezvous services, where a number
+of rendezvous points can federate using pubsub for internal distribution while still
+providing a simple interface to clients.
+
+<!-- DIAGRAMS COMING SOON -->
+
+{{< alert icon="💡" context="note" text="See the rendezvous <a class=\"text-muted\" href=\"https://github.com/libp2p/specs/blob/master/rendezvous/README.md\">technical specification</a> for more details." />}}
diff --git a/content/concepts/discovery/overview.md b/content/concepts/discovery/overview.md