feat: peer discovery and routing section #294

Merged
merged 18 commits into from
Mar 3, 2023
5 changes: 5 additions & 0 deletions content/concepts/discovery-routing/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
title : "Peer Discovery and Routing"
description: "Peer discovery and routing protocols are used to discover and announce services to other peers."
weight: 7
---
74 changes: 74 additions & 0 deletions content/concepts/discovery-routing/kaddht.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
---
title: "Kademlia DHT"
description: "The libp2p Kad-DHT subsystem is an implementation of the Kademlia
DHT, a distributed hash table."
weight: 226
---

## Overview

The Kademlia Distributed Hash Table (DHT), or Kad-DHT, is a distributed hash table
that is designed for P2P networks.

Kad-DHT in libp2p is a subsystem based on the
[Kademlia whitepaper](https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf).

Kad-DHT offers a way to find nodes and data on the network by using a
[routing table](https://en.wikipedia.org/wiki/Routing_table) that organizes peers based
on how similar their keys are.

<details>
<summary>A deeper look</summary>

The routing table is organized based on a prefix length and a distance metric.
The prefix length helps to group similar keys, and the distance metric helps to
find the closest peers to a specific key in the routing table. The table maintains
a list of `k` closest peers for each possible prefix length between `0` and `L-1`,
where `L` is the length of the keyspace, determined by the length of the hash
function used. **Kad-DHT uses SHA-256**, with a keyspace of 256 bits, trying to maintain
`k` peers with a shared key prefix for every prefix length between `0` and `255` in
its routing table.

The prefix length measures the proximity of two keys in the routing table and
divides the keyspace into smaller subspaces, called "buckets", each containing nodes
that share a common prefix of bits in their SHA-256 hash. The prefix length is the
number of leading bits that are identical in the two keys' SHA-256 hashes. The more
leading bits that match, the longer the prefix length and the closer the two keys
are considered to be.

The distance metric is a way to calculate the distance between two keys by
taking the bitwise exclusive-or (XOR) of the SHA-256 hashes of the two keys and
interpreting the result as an unsigned integer. A distance of `0` means the keys
are identical, and a distance of `1` means the hashes differ only in their last
(least significant) bit, which makes the two keys as close as two distinct keys
can be.

This design allows for efficient and effective lookups in the routing table when
trying to find nodes or data that share similar prefixes.

</details>
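
The XOR distance and shared prefix length described in the deeper look can be illustrated with a minimal Python sketch. The function names (`kad_key`, `xor_distance`, `common_prefix_len`) are hypothetical and not part of any libp2p API; the only assumption taken from the text is that keys are SHA-256 hashes in a 256-bit keyspace.

```python
import hashlib

KEY_BITS = 256  # size of the SHA-256 keyspace


def kad_key(data: bytes) -> int:
    """Map arbitrary bytes (e.g. a peer ID) into the keyspace as an integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Distance between two keys: their bitwise XOR, read as an unsigned integer."""
    return a ^ b


def common_prefix_len(a: int, b: int) -> int:
    """Number of leading bits shared by two keys (KEY_BITS if they are identical)."""
    return KEY_BITS - (a ^ b).bit_length()
```

A longer shared prefix always corresponds to a smaller XOR distance, which is why grouping peers by prefix length also groups them by proximity.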

## Peer routing

The Kad-DHT uses a process called "peer routing" to discover nodes in the
network. When looking for a peer, the local node contacts the `k` closest nodes to
the remote peer's ID asking them for closer nodes. The local node repeats the
process until it finds the peer or determines that it is not in the network.
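
The iterative lookup described above can be sketched against a toy, in-memory network. This is an illustration under stated assumptions rather than the libp2p implementation: `network` is a hypothetical dictionary mapping each node ID to the set of peer IDs in its routing table, IDs are small integers standing in for already-hashed keys, and "contacting" a node simply means reading its entry.

```python
def iterative_lookup(network, start, target, k=3):
    """Return up to k of the closest known peers to `target`, closest first.

    network: dict mapping node ID -> set of peer IDs that node knows (assumed model).
    If the target is reachable, it is the first element of the result.
    """
    queried = {start}
    candidates = set(network[start])
    while True:
        # Keep the k candidates closest to the target by XOR distance.
        shortlist = sorted(candidates, key=lambda p: p ^ target)[:k]
        pending = [p for p in shortlist if p not in queried]
        # Stop when the target is found or no unqueried close peers remain.
        if target in candidates or not pending:
            return shortlist
        for peer in pending:
            queried.add(peer)
            # Ask the peer for the nodes it knows; ignore ourselves.
            candidates.update(network[peer] - {start})


# Toy network with 4-bit IDs (hypothetical data).
network = {
    0b0001: {0b0100, 0b1000},
    0b0100: {0b0001, 0b0110},
    0b1000: {0b0001, 0b1100},
    0b1100: {0b1000, 0b1110},
    0b0110: {0b0100},
    0b1110: {0b1100, 0b1111},
    0b1111: {0b1110},
}
found = iterative_lookup(network, start=0b0001, target=0b1111, k=2)
```

When the target is not in the network, the loop ends once every peer in the shortlist has been queried, and the result contains only the closest peers that were discovered.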

## Content provider routing

Kad-DHT also includes a feature for content provider discovery, where nodes can
look up providers for a given key. The local node again contacts the `k` closest
nodes to the key, asking them for providers of the key, closer nodes to the key,
or both. The local node repeats the process until it finds providers for the
key or determines that no providers are in the network.

## Bootstrap process

To maintain a healthy routing table and discover new nodes, the Kad-DHT includes
a bootstrap process that runs periodically. The process starts by generating a random peer
ID and looking it up via the peer routing process. The node then adds the closest peers it
discovers to its routing table and repeats the process multiple times. This process also
includes looking up its own peer ID to improve awareness of nodes close to itself.
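
A single bootstrap round, as described above, might look like the following sketch. It assumes a hypothetical `lookup(target)` callable that performs the peer routing process and returns the closest peer IDs it found; the routing table is modeled as a plain set, which ignores buckets and eviction.

```python
import secrets


def bootstrap(routing_table, self_id, lookup, rounds=3, key_bits=256):
    """Refresh `routing_table` (a set of peer IDs) via random and self lookups.

    lookup: hypothetical callable taking a target key and returning an
    iterable of the closest peer IDs discovered for that key.
    """
    # Random targets spread lookups across the keyspace.
    targets = [secrets.randbits(key_bits) for _ in range(rounds)]
    # Looking up our own ID improves awareness of nodes close to us.
    targets.append(self_id)
    for target in targets:
        for peer in lookup(target):
            if peer != self_id:  # never store ourselves
                routing_table.add(peer)
    return routing_table
```

In a real node this routine would run periodically; the interval and number of rounds are configuration choices not specified here.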

{{< alert icon="💡" context="note" text="See the Kademlia DHT <a class=\"text-muted\" href=\"https://github.com/libp2p/specs/tree/master/kad-dht\">technical specification</a> for more details." />}}
26 changes: 26 additions & 0 deletions content/concepts/discovery-routing/mDNS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
---
title: "mDNS"
description: "mDNS uses a multicast system of DNS records over a local network to enable peer discovery."
weight: 224
---

## What is mDNS?

mDNS, or multicast Domain Name System, is a way for nodes to use IP multicast to
publish and receive DNS records within a local network, as specified in
[RFC 6762](https://www.rfc-editor.org/rfc/rfc6762). Nodes
broadcast topics they're interested in. mDNS is commonly used on home networks
to allow devices such as computers, printers, and smart TVs to discover each
other and connect.

## mDNS in libp2p

In libp2p, mDNS is used for peer discovery, allowing peers to find each other on
the same local network without any configuration. In the basic mDNS peer
discovery flow, a node broadcasts a query, and other nodes on the network
then reply with their multiaddresses.

To learn more about definitions, specific fields, and peer discovery, see the
[mDNS libp2p specification](https://github.com/libp2p/specs/blob/master/discovery/mdns.md).
<!-- ADD DIAGRAM -->
55 changes: 55 additions & 0 deletions content/concepts/discovery-routing/overview.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
---
title: "What is Discovery & Routing"
description: "Peer discovery and routing protocols are used to discover and announce services to other peers and find a peer's location, respectively."
weight: 221
---

## Overview

Peer discovery and routing are two essential aspects of P2P networking. In a P2P network,
each node must be able to discover and communicate with other nodes without the need for
a central server.

### Peer discovery

Peer discovery is the process of finding and announcing services to other available
peers in a P2P network. Peer discovery can be done using various protocols, such
as broadcasting a message to all peers in the network or using a bootstrap node to
provide a list of known peers.

### Peer routing

Peer routing, on the other hand, refers to finding a specific peer's location
in the network. This is typically done by maintaining a routing table or a similar
data structure that keeps track of the network topology.

Different algorithms can be used to find the "closest" neighboring peers to a given peer ID.
A peer may use a routing algorithm to find the location of a specific peer and then
use that information to discover new peers in the vicinity. Additionally, a peer may
use both peer routing and peer discovery mechanisms in parallel to find new peers and
route data to them.

{{< alert icon="" context="note">}}
In practice, the distinction between peer routing and peer
discovery is not always clear-cut, and it's worth noting that in a real-world
implementation, discovery and routing usually happen concurrently.
{{< /alert >}}

## Discovery and routing in libp2p

libp2p provides a set of modules for different network-level functionality,
including peer discovery and routing. Peers in libp2p can discover other
peers using various mechanisms, such as exchanging peer
[multiaddresses](./../../fundamentals/addressing) over the
network, querying a directory service, or using a distributed hash table (DHT)
to store and retrieve information about available peers.

These methods include, but are not limited to:

- [Rendezvous](./../rendezvous): a protocol that allows peers to exchange peer multiaddresses
in a secure and private manner.
- [mDNS](./../mdns): a multicast Domain Name System (DNS) protocol that allows peers to
discover other peers on the local network.
- [DHT](./../kaddht): a Distributed Hash Table. libp2p uses a DHT based on Kademlia, which
assigns each piece of content a unique identifier and stores the content on the peer whose
identifier is closest to the content's identifier.
86 changes: 86 additions & 0 deletions content/concepts/discovery-routing/rendezvous.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
---
title : "Rendezvous"
description: "The rendezvous protocol can facilitate the routing and discovery of nodes in a P2P network using a common location."
weight: 223
---

## What is Rendezvous?

A rendezvous protocol is a routing protocol that enables nodes and resources
in a P2P network to discover each other. A rendezvous point serves
as a common location through which peers can find a route to one another.

Rendezvous points are typically nodes that are well-connected and stable in
a network and can handle large amounts of traffic and data. They
serve as a hub through which nodes discover one another.

<details>
<summary>Rendezvous is not decentralized</summary>

It is important to note that Rendezvous is not decentralized but rather
federated. While this has its use cases, it also introduces a single
point of failure into the network. This can be contrasted with fully decentralized
alternatives such as DHT (Distributed Hash Table) and Gossipsub.

[DHT](kaddht.md) is a distributed network protocol used to store and
retrieve data in a P2P network efficiently. It is like a hash table mapping keys
to values, allowing for fast lookups and efficient data distribution across the network.

[Gossipsub](pubsub.md), on the other hand, is a pub-sub (publish-subscribe) protocol
that is used to distribute messages and data across a network. It uses a gossip-based
mechanism to propagate messages throughout the network, allowing fast and efficient
distribution without relying on a central control point.

</details>

## Rendezvous in libp2p

{{< alert icon="💡" context="info" text="The current rendezvous implementation replaces the initial ws-star-rendezvous implementation with rendezvous daemons and a fleet of p2p-circuit relays." />}}

The libp2p rendezvous protocol serves several use cases. For example, it can
be used during bootstrap to discover circuit relays that provide connectivity
for browser nodes. Generally, a peer can use known rendezvous points to find
peers that provide network services. Rendezvous is also used throughout the
lifetime of an application for peer discovery by registering and polling
rendezvous points. In an application-specific setting, rendezvous points can be
used to progressively discover peers that can answer specific queries or host
shards of content.

The libp2p rendezvous protocol allows peers to connect to a rendezvous point and
register their presence by sending a `REGISTER` message in one or more
namespaces. Any node implementing the rendezvous protocol can act as a
rendezvous point, and any peer can connect to a rendezvous point. However, a peer
can only register itself: the peer that initiates a registration is the peer that
gets registered at the rendezvous point.

By registering with a rendezvous point, peers allow for their discovery by other peers who
query the rendezvous point. The query may:

- provide namespace(s), such as `test-app`;
- optionally provide a maximum number of peers to return;
- include a cookie obtained from the response to a previous query, so that
the response to the current query only contains registrations that weren't part of the
previous response.
> This simplifies discovery as it reduces the overhead of queried peers and allows for
> the pagination of query responses.

There is a default peer registration lifetime of 2 hours. Peers can optionally specify the
lifetime using a TTL parameter in the `REGISTER` message, with an upper bound of 72 hours.
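
The registration and query behavior described above (namespaces, an optional limit, cookie-based pagination, and TTLs) can be modeled with a small in-memory sketch. Everything here is hypothetical and for illustration only: the class and method names are not a libp2p API, and the cookie is modeled as a plain sequence number, whereas a real rendezvous point treats it as an opaque value.

```python
import time

DEFAULT_TTL = 2 * 60 * 60   # default registration lifetime: 2 hours
MAX_TTL = 72 * 60 * 60      # upper bound on the TTL: 72 hours


class RendezvousPoint:
    """Toy in-memory model of a rendezvous point (not the wire protocol)."""

    def __init__(self):
        self._regs = []  # tuples of (seq, namespace, peer_id, expires_at)
        self._seq = 0

    def register(self, namespace, peer_id, ttl=DEFAULT_TTL, now=None):
        if ttl > MAX_TTL:
            raise ValueError("TTL exceeds the 72 hour upper bound")
        now = time.time() if now is None else now
        # A new registration replaces an older one for the same peer and namespace.
        self._regs = [r for r in self._regs if (r[1], r[2]) != (namespace, peer_id)]
        self._seq += 1
        self._regs.append((self._seq, namespace, peer_id, now + ttl))

    def discover(self, namespace, limit=None, cookie=0, now=None):
        """Return (peer_ids, cookie); pass the cookie back to get only newer entries."""
        now = time.time() if now is None else now
        fresh = [r for r in self._regs
                 if r[1] == namespace and r[3] > now and r[0] > cookie]
        if limit is not None:
            fresh = fresh[:limit]
        new_cookie = fresh[-1][0] if fresh else cookie
        return [r[2] for r in fresh], new_cookie
```

Passing the returned cookie into the next `discover` call yields only registrations that were not part of the previous response, which is what makes paginated queries possible.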

The rendezvous protocol runs over libp2p streams using the protocol ID `/rendezvous/1.0.0`.

<!-- TO ADD: Interaction diagrams and context -->

### Rendezvous and publish-subscribe

For effective discovery, rendezvous can be combined with [libp2p publish/subscribe](../messaging/pubsub/overview).
At a basic level, rendezvous can bootstrap pubsub by discovering peers subscribed to a topic. The rendezvous would
be responsible for publishing packets, subscribing, or unsubscribing from packet shapes.

Pubsub can also be used as a mechanism for building rendezvous services, where a number
of rendezvous points can federate using pubsub for internal distribution while still
providing a simple interface to clients.

<!-- DIAGRAMS COMING SOON -->

{{< alert icon="💡" context="note" text="See the rendezvous <a class=\"text-muted\" href=\"https://github.com/libp2p/specs/blob/master/rendezvous/README.md\">technical specification</a> for more details." />}}
30 changes: 0 additions & 30 deletions content/concepts/discovery/overview.md

This file was deleted.