This repository has been archived by the owner on Feb 1, 2023. It is now read-only.

Add separate how bitswap works doc #294

Merged 2 commits on Apr 7, 2020
58 changes: 6 additions & 52 deletions README.md
@@ -45,6 +45,8 @@ wants those blocks.

`go-bitswap` provides an implementation of the Bitswap protocol in go.

[Learn more about how Bitswap works](./docs/how-bitswap-works.md)

## Install

`go-bitswap` requires Go >= 1.11 and can be installed using Go modules
@@ -75,8 +77,7 @@ exchange := bitswap.New(ctx, network, bstore)
Parameter Notes:

1. `ctx` is just the parent context for all of Bitswap
2. `network` is a network abstraction provided to Bitswap on top
of libp2p & content routing.
2. `network` is a network abstraction provided to Bitswap on top of libp2p & content routing.
3. `bstore` is an IPFS blockstore

### Get A Block Synchronously
@@ -107,11 +108,11 @@ blockChannel, err := exchange.GetBlocks(ctx, cids)
Parameter Notes:

1. `ctx` is the context for this request, which can be cancelled to cancel the request
2. `cids` is an slice of content IDs for the blocks you're requesting
2. `cids` is a slice of content IDs for the blocks you're requesting

### Get Related Blocks Faster With Sessions

In IPFS, content blocks are often connected to each other through a MerkleDAG. If you know ahead of time that block requests are related, Bitswap can make several optimizations internally in how it requests those blocks in order to get them faster. Bitswap provides a mechanism called a Bitswap session to manage a series of block requests as part of a single higher level operation. You should initialize a bitswap session any time you intend to make a series of block requests that are related -- and whose responses are likely to come from the same peers.
In IPFS, content blocks are often connected to each other through a MerkleDAG. If you know ahead of time that block requests are related, Bitswap can make several optimizations internally in how it requests those blocks in order to get them faster. Bitswap provides a mechanism called a Bitswap Session to manage a series of block requests as part of a single higher level operation. You should initialize a Bitswap Session any time you intend to make a series of block requests that are related -- and whose responses are likely to come from the same peers.

```golang
var ctx context.Context
@@ -125,7 +126,7 @@ var relatedCids []cids.cid
relatedBlocksChannel, err := session.GetBlocks(ctx, relatedCids)
```

Note that new session returns an interface with a GetBlock and GetBlocks method that have the same signature as the overall Bitswap exchange.
Note that `NewSession` returns an interface with `GetBlock` and `GetBlocks` methods that have the same signature as the overall Bitswap exchange.

### Tell bitswap a new block was added to the local datastore

@@ -136,53 +137,6 @@ var exchange bitswap.Bitswap
err := exchange.HasBlock(blk)
```

## Implementation

The following diagram outlines the major tasks Bitswap handles, and their consituent components:

![Bitswap Components](./docs/go-bitswap.png)

### Sending Blocks

Internally, when a message with a wantlist is received, it is sent to the
decision engine to be considered. The decision engine checks the CID for
each block in the wantlist against local storage and creates a task for
each block it finds in the peer request queue. The peer request queue is
a priority queue that sorts available tasks by some metric. Currently,
that metric is very simple and aims to fairly address the tasks of each peer.
More advanced decision logic will be implemented in the future. Task workers
pull tasks to be done off of the queue, retrieve the block to be sent, and
send it off. The number of task workers is limited by a constant factor.

### Requesting Blocks

The want manager handles client requests for new blocks. The 'WantBlocks' method
is invoked for each block (or set of blocks) requested. The want manager ensures
that connected peers are notified of the new block that we want by sending the
new entries to a message queue for each peer. The message queue will loop while
there is work available and:
1. Ensure it has a connection to its peer
2. grab the message to be sent
3. Send the message
If new messages are added while the loop is in steps 1 or 3, the messages are
combined into one to avoid having to keep an actual queue and send multiple
messages. The same process occurs when the client receives a block and sends a
cancel message for it.

### Sessions

Sessions track related requests for blocks, and attempt to optimize transfer speed and reduce the number of duplicate blocks sent across the network. The basic optimization of sessions is to limit asks for blocks to the peers most likely to have that block and most likely to respond quickly. This is accomplished by tracking who responds to each block request, and how quickly they respond, and then optimizing future requests with that information. Sessions try to distribute requests amongst peers such that there is some duplication of data in the responses from different peers, for redundancy, but not too much.

### Finding Providers

When bitswap can't find a connected peer who already has the block it wants, it falls back to querying a content routing system (a DHT in IPFS's case) to try to locate a peer with the block.

Bitswap routes these requests through the ProviderQueryManager system, which rate-limits these requests and also deduplicates in-process requests.

### Providing

As a bitswap client receives blocks, by default it announces them on the provided content routing system (again, a DHT in most cases). This behaviour can be disabled by passing `bitswap.ProvideEnabled(false)` as a parameter when initializing Bitswap. IPFS currently has its own experimental provider system ([go-ipfs-provider](https://github.com/ipfs/go-ipfs-provider)) which will eventually replace Bitswap's system entirely.

## Contribute

PRs are welcome!
60 changes: 60 additions & 0 deletions docs/how-bitswap-works.md
@@ -0,0 +1,60 @@
How Bitswap Works
=================

When a client requests blocks, Bitswap sends the CIDs of those blocks to its peers as "wants". When Bitswap receives a "want" from a peer, it responds with the corresponding block.

### Requesting Blocks

#### Sessions

Bitswap Sessions allow the client to make related requests to the same group of peers. For example, the requests to fetch all the blocks in a file would typically be made with a single Session.

#### Discovery

To discover which peers have a block, Bitswap broadcasts a `want-have` message to all peers it is connected to asking if they have the block.

Any peers that have the block respond with a `HAVE` message. They are added to the Session.

If no connected peers have the block, Bitswap queries the DHT to find peers that have the block.

#### Wants

When the client requests a block, Bitswap sends a `want-have` message with the block CID to all peers in the Session to ask who has the block.

Bitswap simultaneously sends a `want-block` message to one of the peers in the Session to request the block. If the peer does not have the block, it responds with a `DONT_HAVE` message. In that case Bitswap selects another peer and sends the `want-block` to that peer.

If no peers have the block, Bitswap broadcasts a `want-have` to all connected peers, and queries the DHT to find peers that have the block.
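
The `want-block` retry described above can be sketched roughly as follows. This is a minimal illustration, not go-bitswap's actual code: the `Response` type, the `requestBlock` function and the `ask` callback (which stands in for sending `want-block` to a peer and waiting for its reply) are all hypothetical names, and the real implementation is asynchronous rather than a simple loop.

```go
package main

// Response is a hypothetical reply type used only in this sketch.
type Response int

const (
	Block    Response = iota // the peer sent the block
	DontHave                 // the peer replied with DONT_HAVE
)

// requestBlock tries the session peers one at a time. ask is a hypothetical
// stand-in for sending want-block to a peer and waiting for the reply.
// It returns the peer that supplied the block, or false if none did.
func requestBlock(sessionPeers []string, ask func(peer string) Response) (string, bool) {
	for _, p := range sessionPeers {
		if ask(p) == Block {
			return p, true
		}
		// DONT_HAVE: fall through and try the next peer.
	}
	// No session peer had the block. At this point the caller would
	// broadcast want-have to all connected peers and query the DHT.
	return "", false
}
```

A `false` result corresponds to the fallback case in which Bitswap widens the search beyond the Session.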

#### Peer Selection

Bitswap uses a probabilistic algorithm to select which peer to send `want-block` to, favouring peers that:
- sent `HAVE` for the block
- were discovered as providers of the block in the DHT
- were first to send blocks to previous session requests

The selection algorithm includes some randomness so as to allow peers that are discovered later, but are more responsive, to rise in the ranking.
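
One common way to get this kind of "favour good peers, but keep some randomness" behaviour is score-weighted random selection. The sketch below shows that technique under stated assumptions; it is not necessarily the algorithm go-bitswap uses. The `scoredPeer` type, its `Score` field and both function names are hypothetical, and scores are assumed to be non-negative.

```go
package main

import "math/rand"

// scoredPeer is a hypothetical pairing of a peer ID with a non-negative
// score. In this sketch a higher score would reflect having sent HAVE,
// being a DHT provider, or having responded first in the past.
type scoredPeer struct {
	ID    string
	Score float64
}

// pickPeerAt maps r (expected in [0, 1)) onto the peers' cumulative scores,
// so each peer is chosen with probability Score / total.
func pickPeerAt(peers []scoredPeer, r float64) string {
	total := 0.0
	for _, p := range peers {
		total += p.Score
	}
	if total <= 0 {
		return "" // no peers, or no peer with a positive score
	}
	target := r * total
	for _, p := range peers {
		if target < p.Score {
			return p.ID
		}
		target -= p.Score
	}
	// Only reachable through floating-point rounding.
	return peers[len(peers)-1].ID
}

// pickPeer draws the random value and delegates to pickPeerAt.
func pickPeer(peers []scoredPeer) string {
	return pickPeerAt(peers, rand.Float64())
}
```

Because every peer with a positive score has a non-zero chance of being chosen, a peer discovered later can still receive requests and improve its score.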

#### Periodic Search Widening

Periodically, the Bitswap Session selects a random CID from the list of "pending wants" (wants that have been sent but for which no block has been received). It then broadcasts a `want-have` for that CID to all connected peers and queries the DHT for it.

### Serving Blocks

#### Processing Requests

When Bitswap receives a `want-have`, it checks whether the block is in the local blockstore.

If the block is in the local blockstore, Bitswap responds with `HAVE`. If the block is small, Bitswap sends the block itself instead of `HAVE`.

If the block is not in the local blockstore, Bitswap checks the `send-dont-have` flag on the request. If `send-dont-have` is true, Bitswap sends `DONT_HAVE`. Otherwise it does not respond.
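
The decision described in this section can be written as a small function. This is a sketch only: the `reply` type, the function name and the `maxInlineBlockSize` threshold are hypothetical, since the text above says only that "small" blocks are sent in place of `HAVE`.

```go
package main

// reply enumerates the possible responses to a want-have in this sketch.
type reply int

const (
	replyNone     reply = iota // stay silent
	replyHave                  // send HAVE
	replyBlock                 // send the block itself
	replyDontHave              // send DONT_HAVE
)

// maxInlineBlockSize is an assumed threshold in bytes for what counts as a
// "small" block. The value is illustrative, not taken from the document.
const maxInlineBlockSize = 1024

// handleWantHave decides how to answer a want-have request.
// found reports whether the block is in the local blockstore, blockSize is
// its size in bytes, and sendDontHave mirrors the request's send-dont-have flag.
func handleWantHave(found bool, blockSize int, sendDontHave bool) reply {
	if found {
		if blockSize <= maxInlineBlockSize {
			return replyBlock
		}
		return replyHave
	}
	if sendDontHave {
		return replyDontHave
	}
	return replyNone
}
```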

#### Processing Incoming Blocks

When Bitswap receives a block, it checks whether any peers sent `want-have` or `want-block` for the block. If so, it sends `HAVE` or the block itself to those peers.

#### Priority

Bitswap keeps requests from each peer in separate queues, ordered by the priority specified in the request message.

To select which peer to send the next response to, Bitswap chooses the peer with the least amount of data in its send queue. That way it will tend to "keep peers busy" by always keeping some data in each peer's send queue.
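
As a rough illustration of the selection rule above, the function below picks the peer with the least data queued. The function name and the map-based representation are hypothetical, and breaking ties by peer ID is an assumption added only to make the result deterministic; the document does not say how ties are handled.

```go
package main

// nextPeer returns the peer with the fewest bytes in its send queue.
// queuedBytes maps a peer ID to the amount of data currently queued for it
// and is assumed to contain only peers that have pending requests.
// The second result is false when there are no peers to choose from.
func nextPeer(queuedBytes map[string]int) (string, bool) {
	best, found := "", false
	for id, n := range queuedBytes {
		switch {
		case !found:
			best, found = id, true
		case n < queuedBytes[best]:
			best = id
		case n == queuedBytes[best] && id < best:
			best = id // assumed tie-break: lowest peer ID wins
		}
	}
	return best, found
}
```

Serving the emptiest queue first means a peer whose queue has drained is topped up before more data is added to an already busy peer.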