This repository has been archived by the owner on Apr 24, 2023. It is now read-only.

PSA: WebRTC WG at W3C looking for DHT/P2P use cases #177

Open
lidel opened this issue May 14, 2019 · 5 comments

Comments

@lidel
Member

lidel commented May 14, 2019

Hey stargazers,

FYSA: the WebRTC WG at W3C is gathering P2P/DHT use cases for the next version of WebRTC, which will be discussed at the June interim:

arewedistributedyet/arewedistributedyet#22 (comment):

we need someone to present the DHT use cases (filed in w3c/webrtc-nv-use-cases#15) which potentially requires fundamental changes in WebRTC. I have already broken down the use case into requirements and I can also help with the technical details (what WebRTC can and currently cannot do) and presenting the requirements. But I'm not well-versed in DHTs, so someone else needs to join.

This is highly relevant to the work done in the webrtc-star/direct libp2p projects, as it could lead to removing the semi-centralized signaling stars and going beyond direct client-server connections.

If you have any spare bandwidth, notes or feedback in the mentioned issues would be highly valuable!

cc @raulk @autonome @backkem (go-libp2p-webrtc-direct), @vasco-santos (js-libp2p-webrtc-direct), @albrow (libp2p/specs#159)

@jhiesey

jhiesey commented Jun 10, 2019

Currently, WebRTC works well for applications that have a preconfigured signaling server address that all involved browsers connect to via HTTPS or WebSockets. This is a good fit for supporting applications like Google Hangouts and Skype, where WebRTC is a mechanism to avoid sending large amounts of video data through a central server. However, it is a poor fit for distributed applications that want to avoid centralized signaling servers.

Use Case: IPFS with full Kademlia discovery in the browser

IPFS in the browser currently relies on the libp2p-webrtc-star module, which uses a central signaling server, like a traditional WebRTC application. However, IPFS is intended to be fully distributed, so avoiding the central server would be a significant improvement. Then, anyone could host static files for an IPFS data browser to provide a client-side "gateway" to IPFS content.

Fundamentally, IPFS relies on content routing using a Kademlia-style DHT to find data with a given hash. Once a data-seeking peer can communicate with any other peer in the network, it can learn about other peers that are more likely to have the desired data, eventually zeroing in on a set of peers that have it. In principle, it is possible to build such a system that uses the current version of WebRTC for all networking except connecting to the initial peer, but performance with current WebRTC would make such a design impractical. At present, it is feasible for a browser to behave as a Kademlia client using WebSockets, but a system where browsers also answer Kademlia content routing queries is out of reach.

How could this work?

Such an application would contact an initial bootstrap gateway server over HTTPS or WebSockets to do an offer/answer exchange with a few arbitrary other peers of the gateway's choosing. Once the application is bootstrapped to some arbitrary peers in this way, it could use these peers as signaling servers to reach other peers and, using the Kademlia search algorithm, eventually connect to peers that have the desired file data.

Although this still requires bootstrap gateways, the key difference is that for two peers to communicate, they do not need to agree to connect to the same gateway. As long as the network as a whole remains connected, any gateway will eventually allow you to communicate with any other peer in the whole network.
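The connectivity argument above can be illustrated with a small sketch. It assumes a hypothetical topology (the peer names and links below are invented for illustration) and treats every existing connection as a path over which signaling can be relayed. It only shows that two gateways with no direct link still lead to the same set of peers when the graph is connected.

```python
from collections import deque


def reachable_peers(links, start):
    """Return every peer reachable from `start` by relaying over existing links."""
    seen = {start}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for neighbor in links.get(peer, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical topology: the two gateways share no direct link,
# but the network as a whole is connected.
links = {
    "gateway-1": ["a", "b"],
    "gateway-2": ["c"],
    "a": ["gateway-1", "c"],
    "b": ["gateway-1"],
    "c": ["a", "gateway-2", "d"],
    "d": ["c"],
}

via_gateway_1 = reachable_peers(links, "gateway-1")
via_gateway_2 = reachable_peers(links, "gateway-2")
```

Both calls return the full set of six peers, so a peer that bootstrapped through `gateway-1` can still reach `d`, which is only attached near `gateway-2`.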

Why doesn't this work well with current WebRTC?

In principle this design could work, but it would be very complex and slow.

The fundamental Kademlia lookup step, which must be performed repeatedly to find an individual peer or piece of data, involves peer A sending a query to peer B, looking for a given key. Peer B then (in the common case) replies with a set of peers that more closely match the query. Peer A then decides which, if any, of these peers it wishes to query next.
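As a rough sketch of this lookup step, the following simulates it in memory. It is a simplification under stated assumptions: IDs are 4-bit integers, each peer knows only the peers whose ID differs from its own in one bit (a hypothetical topology chosen for readability), and queries are sent one at a time rather than in parallel as a real implementation would. The function names are illustrative and not taken from any libp2p or IPFS module.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the XOR of two IDs, read as an integer."""
    return a ^ b


def closest(peers, key, count):
    """Return up to `count` peers ordered by XOR distance to `key`."""
    return sorted(peers, key=lambda peer: xor_distance(peer, key))[:count]


def iterative_lookup(routing_tables, start, key, k=2):
    """Repeat the lookup step until no unqueried peer remains in the shortlist.

    routing_tables maps a peer ID to the set of peer IDs it knows about.
    Returns the final shortlist and the order in which peers were queried.
    """
    queried = []
    shortlist = closest(routing_tables[start], key, k)
    while True:
        pending = [peer for peer in shortlist if peer not in queried]
        if not pending:
            return shortlist, queried
        peer_b = pending[0]
        queried.append(peer_b)
        # Peer B replies with the peers it knows that are closest to the key.
        reply = closest(routing_tables[peer_b], key, k)
        # Peer A merges the reply and decides whom to query next.
        candidates = (set(shortlist) | set(reply)) - {start}
        shortlist = closest(candidates, key, k)


# Hypothetical 16-peer network: each peer knows the 4 peers one bit-flip away.
routing_tables = {peer: {peer ^ (1 << bit) for bit in range(4)} for peer in range(16)}

found, queried = iterative_lookup(routing_tables, start=2, key=13)
```

In this toy network peer 2 reaches the peer whose ID equals the key after querying four peers. Each of those queries is one "peer A asks peer B" step as described above.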

With traditional Kademlia, as defined in the original paper and implemented in BitTorrent, peer B replies to the lookup with the IP address and UDP port of the closer peer, C. Peer A can then immediately send the same query to peer C with a single UDP packet. With WebRTC on the other hand, peer A would need to generate an offer, send it to peer B, which would then forward it to peer C. Peer C would then need to generate an answer, which would then be forwarded back through peer B to peer A. Only then could the actual DTLS handshake begin. This entire process involves many network round trips, and is consequently quite slow. Furthermore, if peer B has an entry in its routing table for peer C, that entry is only useful if peer B continuously has a WebRTC connection open to peer C to forward offers and answers back and forth. Effectively, every DHT peer needs to keep a WebRTC connection open to every peer in its routing table to make the routing table useful.
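To make the difference concrete, here is a minimal sketch that lists the messages exchanged before peer A receives its first reply from peer C in each model. The signaling hops follow the description above. The `handshake_round_trips` parameter is an assumption standing in for the connection setup that follows signaling; its value in the example is a placeholder, not a measurement.

```python
def udp_kademlia_messages(a, c):
    """Traditional Kademlia: A already has C's address and queries it directly."""
    return [(a, c, "query"), (c, a, "reply")]


def relayed_webrtc_messages(a, b, c, handshake_round_trips):
    """WebRTC: offer and answer are relayed through B before A can talk to C.

    handshake_round_trips is a placeholder for the round trips spent on
    connection setup after signaling; the real number depends on the stack.
    """
    signaling = [
        (a, b, "offer"),
        (b, c, "offer"),
        (c, b, "answer"),
        (b, a, "answer"),
    ]
    handshake = []
    for _ in range(handshake_round_trips):
        handshake += [(a, c, "handshake"), (c, a, "handshake")]
    return signaling + handshake + [(a, c, "query"), (c, a, "reply")]


direct = udp_kademlia_messages("A", "C")
relayed = relayed_webrtc_messages("A", "B", "C", handshake_round_trips=2)
```

Even with an optimistic placeholder for the handshake, the relayed path needs several times as many sequential messages as the two-message UDP exchange, and this cost is paid at every step of the iterative lookup.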

Unfortunately, WebRTC connections are very expensive to create (in terms of number of round trips) and maintain (in terms of CPU time, memory, and not hitting browser limits that make opening more connections fail or even cause a browser crash). This makes such an application impractical currently.

What changes would fix this?

The following changes would improve this situation greatly, from most to least valuable:

  1. (if possible) Non-interactive connection establishment. A peer P could generate a long-lasting, reusable offer O that allows any peer that receives it to open a connection back to P without additional signaling over a side channel. Essentially, offer O behaves like an address. This is likely incompatible with NAT hole punching, however.
  2. Long-lasting, reusable offers. This would significantly simplify establishing connections, since one connection offer could be sent to multiple peers, allowing multiple connections to be established. The reusable part already exists in ORTC as "forking", but it doesn't exist in the mainline WebRTC API.
  3. Minimize connection resource utilization in the browser. This is partially an implementation quality issue, but optimizing the API and wire protocol for cheap connections would help substantially. If browsers could handle hundreds of idle connections reliably without crashing or excessive CPU or memory use this would make DHTs in the browser feasible.
  4. Minimize connection establishment latency. This is mostly a matter of reducing the number of round trips to establish a connection. Replacing DTLS with QUIC may help significantly with this.
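To put the "hundreds of idle connections" in item 3 in perspective, the sketch below gives a rough estimate of how many routing table entries, and therefore open connections, a peer would hold. It assumes uniformly random 160-bit IDs and a bucket size of k = 20 (the values from the original Kademlia paper), and it caps the expected population of each bucket at k, which slightly overestimates the true expectation. The network sizes are hypothetical.

```python
def estimated_routing_table_size(network_size, k=20, id_bits=160):
    """Rough estimate of routing table entries for one peer.

    Bucket i holds peers whose ID shares exactly i leading bits with ours.
    With uniformly random IDs, a fraction 2**-(i + 1) of the network falls
    into bucket i, and each bucket stores at most k peers.
    """
    total = 0.0
    for shared_prefix_bits in range(id_bits):
        expected_in_bucket = network_size * 2.0 ** -(shared_prefix_bits + 1)
        total += min(k, expected_in_bucket)
    return total


small_network = estimated_routing_table_size(1_000)
large_network = estimated_routing_table_size(1_000_000)
```

Under these assumptions the estimate grows roughly with the logarithm of the network size and lands in the low hundreds for a network of a million peers, which matches the order of magnitude named in item 3.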

Use case: WebRTC as a transport for fetching page resources

The Service Worker API allows intercepting HTTPS requests for caching and other modifications, but it is not possible to use WebRTC inside a Service Worker context. Allowing this would make it possible to transparently use WebRTC to fetch resources. This has been attempted by various startups including PeerCDN, Peer5, and several others, but their approach requires custom JavaScript on the page itself to fetch resources. Supporting WebRTC within Service Workers would make this transparent to the page, and would also enable a standard web browser to use more complex protocols like IPFS over WebRTC transparently. This would allow the basic functionality of Beaker Browser to work within an unmodified browser.

To get all of the functionality of Beaker, several more interfaces outside the realm of WebRTC would be necessary, but this would be a good start.

Use case: AirDrop-like local file transfer

Currently, neither WebRTC nor any other browser APIs provide a mechanism to discover other browser peers on the local network. If it were possible for a page to advertise its presence and discover other similar pages (e.g. pages sharing a common application ID string), and subsequently open WebRTC connections to those pages, it would be straightforward to build a web application similar to Apple's AirDrop on top of IPFS or something similar.

@lgrahl

lgrahl commented Jun 10, 2019

Your proposed changes (apart from 1., which I deem technically impossible over NAT, and the last use case) are covered by the slides for our presentation tomorrow.

Can you file the (very cool) AirDrop-like local file transfer use case towards https://github.com/w3c/webrtc-nv-use-cases/issues? It will not make it into the upcoming discussion but I'm certain it will be discussed eventually.


A note I simply cannot not make: There's nothing magic in QUIC that can't be done with DTLS. No need to be radical and replace an entire network stack due to minor deficiencies that can be fixed easily. Also, QUIC is currently not ready for any kind of partial reliability without ugly hacks such as stream hopping.

@jhiesey

jhiesey commented Jun 13, 2019

@lgrahl Yes I'll file the AirDrop use case!

@jhiesey

jhiesey commented Jun 13, 2019

As for proposal 1, I am not familiar enough with actual NAT behavior to know if it is ever possible. If a NAT requires both the source IP and port of an inbound packet to match its table entry before forwarding a packet, then I agree it's not possible. Are there NATs that don't care about the source IP when routing inbound packets?
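For context on this question, RFC 4787 distinguishes three inbound filtering behaviors: endpoint-independent, address-dependent, and address-and-port-dependent. The sketch below is a toy model of a single NAT mapping under each behavior. The class and method names are invented for illustration, and the IP addresses are documentation addresses, not real hosts.

```python
from enum import Enum


class Filtering(Enum):
    ENDPOINT_INDEPENDENT = 1        # any source may use an existing mapping
    ADDRESS_DEPENDENT = 2           # source IP must have been contacted before
    ADDRESS_AND_PORT_DEPENDENT = 3  # source IP and port must both match


class Nat:
    """Toy model of one NAT mapping for a single internal host and port."""

    def __init__(self, filtering):
        self.filtering = filtering
        self.contacted = set()  # (remote_ip, remote_port) pairs sent to so far

    def outbound(self, remote_ip, remote_port):
        """The internal host sends a packet, creating or refreshing the mapping."""
        self.contacted.add((remote_ip, remote_port))

    def accepts_inbound(self, source_ip, source_port):
        """Would the NAT forward an inbound packet from this source?"""
        if not self.contacted:
            return False  # no mapping exists yet
        if self.filtering is Filtering.ENDPOINT_INDEPENDENT:
            return True
        if self.filtering is Filtering.ADDRESS_DEPENDENT:
            return any(ip == source_ip for ip, _ in self.contacted)
        return (source_ip, source_port) in self.contacted


results = {}
for behavior in Filtering:
    nat = Nat(behavior)
    nat.outbound("198.51.100.1", 3478)  # e.g. a packet to a known server
    # A previously unknown peer tries to reach the mapped address.
    results[behavior] = nat.accepts_inbound("203.0.113.9", 5000)
```

Only the endpoint-independent case forwards the packet from the unknown peer, so an address-like reusable offer could in principle work behind such a NAT once a mapping exists, but not behind the two stricter behaviors.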

@lgrahl

lgrahl commented Jun 14, 2019

@feross and I have presented proposal 1 (rephrased as a requirement) even though I don't think it's possible. One never knows what people come up with and we're currently only looking at use cases. But the most important aspect I think is the possibility to reuse the signalling data for multiple peers.

Regarding NAT: It may be possible for a fraction of users to leverage techniques such as UPnP to open specific ports or create an extension to TURN to reuse specific ports. But it would be a long shot.
