Document piece length selection algorithm
Add a page to the book discussing factors in piece length selection, and
Intermodal's piece length selection algorithm.

type: documentation
pr: #392
fixes:
- #367
casey committed Apr 19, 2020
1 parent 3ed449c commit 09b0ee3
Showing 6 changed files with 262 additions and 21 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -4,7 +4,8 @@ Changelog

UNRELEASED - 2020-04-19
-----------------------
- :books: [`xxxxxxxxxxxx`](https://github.com/casey/intermodal/commits/master) Generate reference sections with `bin/gen` - _Casey Rodarmor <casey@rodarmor.com>_
- :books: [`xxxxxxxxxxxx`](https://github.com/casey/intermodal/commits/master) Document piece length selection algorithm ([#392](https://github.com/casey/intermodal/pull/392)) - Fixes [#367](https://github.com/casey/intermodal/issues/367) - _Casey Rodarmor <casey@rodarmor.com>_
- :books: [`3ed449ce9325`](https://github.com/casey/intermodal/commit/3ed449ce932509ac88bd4837d74c9cbbb0729da9) Generate reference sections with `bin/gen` - _Casey Rodarmor <casey@rodarmor.com>_
- :art: [`a6bf75279181`](https://github.com/casey/intermodal/commit/a6bf7527918178821e080db10e65b057f427200d) Use `invariant` instead of `unwrap` and `expect` - Fixes [#167](https://github.com/casey/intermodal/issues/167) - _Casey Rodarmor <casey@rodarmor.com>_
- :white_check_mark: [`faf46c0f0e6f`](https://github.com/casey/intermodal/commit/faf46c0f0e6fd4e4f8b504d414a3bf02d7d68e4a) Test that globs match torrent contents - Fixes [#377](https://github.com/casey/intermodal/issues/377) - _Casey Rodarmor <casey@rodarmor.com>_
- :books: [`0a754d0bcfcf`](https://github.com/casey/intermodal/commit/0a754d0bcfcfd65127d7b6e78d41852df78d3ea2) Add manual Arch install link - Fixes [#373](https://github.com/casey/intermodal/issues/373) - _Casey Rodarmor <casey@rodarmor.com>_
3 changes: 2 additions & 1 deletion bin/gen/templates/SUMMARY.md
@@ -6,9 +6,10 @@ Summary
{{commands}}

- [Bittorrent](./bittorrent.md)
- [Distributing Large Data Sets](./bittorrent/distributing-large-data-sets.md)
- [Piece Length Selection](./bittorrent/piece-length-selection.md)
- [BEP Support](./bittorrent/bep-support.md)
- [Metainfo Utilities](./bittorrent/metainfo-utilities.md)
- [Distributing Large Data Sets](./bittorrent/distributing-large-data-sets.md)
- [UDP Tracker Protocol](./bittorrent/udp-tracker-protocol.md)

{{references}}
3 changes: 2 additions & 1 deletion book/src/SUMMARY.md
@@ -15,9 +15,10 @@ Summary
- [`imdl torrent verify`](./commands/imdl-torrent-verify.md)

- [Bittorrent](./bittorrent.md)
- [Distributing Large Data Sets](./bittorrent/distributing-large-data-sets.md)
- [Piece Length Selection](./bittorrent/piece-length-selection.md)
- [BEP Support](./bittorrent/bep-support.md)
- [Metainfo Utilities](./bittorrent/metainfo-utilities.md)
- [Distributing Large Data Sets](./bittorrent/distributing-large-data-sets.md)
- [UDP Tracker Protocol](./bittorrent/udp-tracker-protocol.md)

- [References](./references.md)
127 changes: 127 additions & 0 deletions book/src/bittorrent/piece-length-selection.md
@@ -0,0 +1,127 @@
BitTorrent Piece Length Selection
=================================

BitTorrent `.torrent` files contain so-called metainfo that allows BitTorrent
peers to locate, download, and verify the contents of a torrent.

This metainfo includes the piece list, a list of SHA-1 hashes of fixed-size
pieces of the torrent data. The size of these pieces is chosen by the torrent
creator.

Intermodal has a simple algorithm that attempts to pick a reasonable piece
length for a torrent given the size of the contents.

For compatibility with the
[BitTorrent v2 specification](http://bittorrent.org/beps/bep_0052.html), the
algorithm chooses piece lengths that are powers of two, and that are at least
16 KiB.

The maximum automatically chosen piece length is 16 MiB, as piece lengths larger
than 16 MiB have been reported to cause issues for some clients.
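
The constraints above can be written as a small check. This is only an
illustrative sketch, not Intermodal's code: the names `is_power_of_two` and
`is_auto_piece_length` and both constants are assumptions made for this
example.

```python
MIN_PIECE_LENGTH = 16 * 1024              # 16 KiB lower bound
MAX_AUTO_PIECE_LENGTH = 16 * 1024 * 1024  # 16 MiB cap for automatic selection

def is_power_of_two(n):
    # A positive power of two has exactly one bit set.
    return n > 0 and (n & (n - 1)) == 0

def is_auto_piece_length(n):
    # True if n is a length the automatic selection could produce:
    # a power of two between 16 KiB and 16 MiB inclusive.
    return is_power_of_two(n) and MIN_PIECE_LENGTH <= n <= MAX_AUTO_PIECE_LENGTH
```

The 16 MiB cap applies only to automatically chosen lengths, so a check for
user-supplied piece lengths would presumably drop the upper bound.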

Beyond these constraints, several other factors influence the choice of piece
length.


Factors favoring smaller piece length
-------------------------------------

- To avoid uploading bad data, peers only upload data from full pieces, which
can be verified by hash. Decreasing the piece size allows peers to more
quickly obtain a full piece, which decreases the time before they begin
uploading, and receiving data in return.

- Decreasing the piece size decreases the amount of data that must be thrown
away in case of corruption.


Factors favoring larger piece length
------------------------------------

- Increasing the piece size decreases the protocol overhead from requesting
many pieces.

- Increasing the piece size decreases the number of pieces, decreasing the
size of the metainfo.

- Increasing piece length decreases the proportion of disk seeks to disk
reads, which can be beneficial for spinning disks.
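
The effect of piece length on metainfo size can be estimated directly, since
the piece list stores one 20-byte SHA-1 hash per piece. The following is a
rough sketch; the helper names `piece_count` and `piece_list_size` are
hypothetical, and the estimate ignores the rest of the metainfo.

```python
SHA1_DIGEST_SIZE = 20  # bytes per SHA-1 hash in the piece list

def piece_count(content_length, piece_length):
    # Ceiling division: a final partial piece still needs its own hash.
    return -(-content_length // piece_length)

def piece_list_size(content_length, piece_length):
    # Size in bytes of the concatenated piece hashes.
    return piece_count(content_length, piece_length) * SHA1_DIGEST_SIZE
```

For example, 1 GiB of content with 512 KiB pieces needs 2048 hashes, or
40 KiB of piece list, while 16 KiB pieces would need 65536 hashes, or
1.25 MiB.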


Intermodal's Algorithm
----------------------

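In Python, the algorithm used by Intermodal is:

```python
import math

MIN = 16 * 1024         # 16 KiB, the smallest piece length chosen
MAX = 16 * 1024 * 1024  # 16 MiB, the largest piece length chosen

def piece_length(content_length):
    # content_length is the total content size in bytes and must be positive.
    exponent = math.log2(content_length)
    # Roughly 16 * sqrt(content_length), rounded down to a power of two.
    length = 1 << int(exponent / 2 + 4)
    return min(max(length, MIN), MAX)
```

This gives the following piece lengths: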

```
Content -> Piece Length x Count = Piece List Size
16 KiB -> 16 KiB x 1 = 20 bytes
32 KiB -> 16 KiB x 2 = 40 bytes
64 KiB -> 16 KiB x 4 = 80 bytes
128 KiB -> 16 KiB x 8 = 160 bytes
256 KiB -> 16 KiB x 16 = 320 bytes
512 KiB -> 16 KiB x 32 = 640 bytes
1 MiB -> 16 KiB x 64 = 1.25 KiB
2 MiB -> 16 KiB x 128 = 2.5 KiB
4 MiB -> 32 KiB x 128 = 2.5 KiB
8 MiB -> 32 KiB x 256 = 5 KiB
16 MiB -> 64 KiB x 256 = 5 KiB
32 MiB -> 64 KiB x 512 = 10 KiB
64 MiB -> 128 KiB x 512 = 10 KiB
128 MiB -> 128 KiB x 1024 = 20 KiB
256 MiB -> 256 KiB x 1024 = 20 KiB
512 MiB -> 256 KiB x 2048 = 40 KiB
1 GiB -> 512 KiB x 2048 = 40 KiB
2 GiB -> 512 KiB x 4096 = 80 KiB
4 GiB -> 1 MiB x 4096 = 80 KiB
8 GiB -> 1 MiB x 8192 = 160 KiB
16 GiB -> 2 MiB x 8192 = 160 KiB
32 GiB -> 2 MiB x 16384 = 320 KiB
64 GiB -> 4 MiB x 16384 = 320 KiB
128 GiB -> 4 MiB x 32768 = 640 KiB
256 GiB -> 8 MiB x 32768 = 640 KiB
512 GiB -> 8 MiB x 65536 = 1.25 MiB
1 TiB -> 16 MiB x 65536 = 1.25 MiB
2 TiB -> 16 MiB x 131072 = 2.5 MiB
4 TiB -> 16 MiB x 262144 = 5 MiB
8 TiB -> 16 MiB x 524288 = 10 MiB
16 TiB -> 16 MiB x 1048576 = 20 MiB
32 TiB -> 16 MiB x 2097152 = 40 MiB
64 TiB -> 16 MiB x 4194304 = 80 MiB
128 TiB -> 16 MiB x 8388608 = 160 MiB
256 TiB -> 16 MiB x 16777216 = 320 MiB
512 TiB -> 16 MiB x 33554432 = 640 MiB
1 PiB -> 16 MiB x 67108864 = 1.25 GiB
```
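
The rows above can be reproduced with a short script. This is a sketch for
checking the table rather than part of Intermodal; `table_rows` and its
parameters are names chosen for this example, and `piece_length` is repeated
so that the block runs on its own.

```python
import math

MIN = 16 * 1024
MAX = 16 * 1024 * 1024

def piece_length(content_length):
    exponent = math.log2(content_length)
    length = 1 << int(exponent / 2 + 4)
    return min(max(length, MIN), MAX)

def table_rows(min_exponent=14, max_exponent=50):
    # One row per power-of-two content size, from 16 KiB (2**14) to 1 PiB (2**50).
    rows = []
    for exponent in range(min_exponent, max_exponent + 1):
        content = 1 << exponent
        length = piece_length(content)
        count = -(-content // length)  # ceiling division
        rows.append((content, length, count, count * 20))  # 20-byte SHA-1 hashes
    return rows

if __name__ == "__main__":
    for content, length, count, list_size in table_rows():
        print(f"{content} -> {length} x {count} = {list_size} bytes")
```

The script prints raw byte counts instead of the KiB, MiB, and GiB units used
in the table.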


References
----------

### Articles

- [Vuze Wiki](https://wiki.vuze.com/w/Torrent_Piece_Size)

- [TorrentFreak](https://torrentfreak.com/how-to-make-the-best-torrents-081121/)

### Implementations

- [libtorrent](https://github.com/arvidn/libtorrent/blob/a3440e54bb7f65ac6100c3d993c53f887025d660/src/create_torrent.cpp#L367)

- [libtransmission](https://github.com/transmission/transmission/blob/a482100f0cbae8050fd7e954af2cb1311205916e/libtransmission/makemeta.c#L89)

- [dottorrent](https://github.com/kz26/dottorrent/blob/fea5714efe0cde2a55eabfb387295781a78d84bb/dottorrent/__init__.py#L154)

- [Torrent File Editor](https://github.com/torrent-file-editor/torrent-file-editor/blob/811e401b38f26b6d94c4808c54ae2dcc7bbc27dd/mainwindow.cpp#L1210)
127 changes: 127 additions & 0 deletions book/src/bittorrent/piece-length.md
@@ -0,0 +1,127 @@
Piece Length Selection
======================

BitTorrent `.torrent` files contain so-called metainfo that allows BitTorrent
peers to locate, download, and verify the contents of a torrent.

This metainfo includes the piece list, a list of SHA-1 hashes of fixed-size
pieces of the torrent data. The size of these pieces is chosen by the torrent
creator.

Intermodal has a simple algorithm that attempts to pick a reasonable piece
length for a torrent given the size of the contents.

For compatibility with the
[BitTorrent v2 specification](http://bittorrent.org/beps/bep_0052.html), the
algorithm chooses piece lengths that are powers of two, and that are at least
16 KiB.

The maximum automatically chosen piece length is 16 MiB, as piece lengths larger
than 16 MiB have been reported to cause issues for some clients.

Beyond these constraints, several other factors influence the choice of piece
length.


Factors favoring smaller piece length
-------------------------------------

- To avoid uploading bad data, peers only upload data from full pieces, which
can be verified by hash. Decreasing the piece size allows peers to more
quickly obtain a full piece, which decreases the time before they begin
uploading, and receiving data in return.

- Decreasing the piece size decreases the amount of data that must be thrown
away in case of corruption.


Factors favoring larger piece length
------------------------------------

- Increasing the piece size decreases the protocol overhead from requesting
many pieces.

- Increasing the piece size decreases the number of pieces, decreasing the
size of torrent metainfo.

- Increasing piece length decreases the proportion of disk seeks to disk
reads, which can be beneficial for spinning disks.


Intermodal's Algorithm
----------------------

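In Python, the algorithm used by Intermodal is:

```python
import math

MIN = 16 * 1024         # 16 KiB, the smallest piece length chosen
MAX = 16 * 1024 * 1024  # 16 MiB, the largest piece length chosen

def piece_length(content_length):
    # content_length is the total content size in bytes and must be positive.
    exponent = math.log2(content_length)
    # Roughly 16 * sqrt(content_length), rounded down to a power of two.
    length = 1 << int(exponent / 2 + 4)
    return min(max(length, MIN), MAX)
```

This gives the following piece lengths: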

```
Content -> Piece Length x Count = Piece List Size
16 KiB -> 16 KiB x 1 = 20 bytes
32 KiB -> 16 KiB x 2 = 40 bytes
64 KiB -> 16 KiB x 4 = 80 bytes
128 KiB -> 16 KiB x 8 = 160 bytes
256 KiB -> 16 KiB x 16 = 320 bytes
512 KiB -> 16 KiB x 32 = 640 bytes
1 MiB -> 16 KiB x 64 = 1.25 KiB
2 MiB -> 16 KiB x 128 = 2.5 KiB
4 MiB -> 32 KiB x 128 = 2.5 KiB
8 MiB -> 32 KiB x 256 = 5 KiB
16 MiB -> 64 KiB x 256 = 5 KiB
32 MiB -> 64 KiB x 512 = 10 KiB
64 MiB -> 128 KiB x 512 = 10 KiB
128 MiB -> 128 KiB x 1024 = 20 KiB
256 MiB -> 256 KiB x 1024 = 20 KiB
512 MiB -> 256 KiB x 2048 = 40 KiB
1 GiB -> 512 KiB x 2048 = 40 KiB
2 GiB -> 512 KiB x 4096 = 80 KiB
4 GiB -> 1 MiB x 4096 = 80 KiB
8 GiB -> 1 MiB x 8192 = 160 KiB
16 GiB -> 2 MiB x 8192 = 160 KiB
32 GiB -> 2 MiB x 16384 = 320 KiB
64 GiB -> 4 MiB x 16384 = 320 KiB
128 GiB -> 4 MiB x 32768 = 640 KiB
256 GiB -> 8 MiB x 32768 = 640 KiB
512 GiB -> 8 MiB x 65536 = 1.25 MiB
1 TiB -> 16 MiB x 65536 = 1.25 MiB
2 TiB -> 16 MiB x 131072 = 2.5 MiB
4 TiB -> 16 MiB x 262144 = 5 MiB
8 TiB -> 16 MiB x 524288 = 10 MiB
16 TiB -> 16 MiB x 1048576 = 20 MiB
32 TiB -> 16 MiB x 2097152 = 40 MiB
64 TiB -> 16 MiB x 4194304 = 80 MiB
128 TiB -> 16 MiB x 8388608 = 160 MiB
256 TiB -> 16 MiB x 16777216 = 320 MiB
512 TiB -> 16 MiB x 33554432 = 640 MiB
1 PiB -> 16 MiB x 67108864 = 1.25 GiB
```
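
The same selection can also be computed without floating point, which may be
preferable when porting the formula to a language without an exact `log2`.
The sketch below is one possible integer-only formulation, not necessarily
how Intermodal implements it; the name `piece_length_int` is hypothetical. It
relies on `floor(log2(n) / 2 + 4)` being equal to `(floor(log2(n)) + 8) // 2`.

```python
MIN = 16 * 1024
MAX = 16 * 1024 * 1024

def piece_length_int(content_length):
    if content_length <= 0:
        raise ValueError("content_length must be positive")
    # For positive integers, bit_length() - 1 is floor(log2(content_length)).
    floor_log2 = content_length.bit_length() - 1
    # Equivalent to int(log2(content_length) / 2 + 4), using integers only.
    length = 1 << ((floor_log2 + 8) // 2)
    return min(max(length, MIN), MAX)
```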


References
----------

### Articles

- [Vuze Wiki](https://wiki.vuze.com/w/Torrent_Piece_Size)

- [TorrentFreak](https://torrentfreak.com/how-to-make-the-best-torrents-081121/)

### Implementations

- [libtorrent](https://github.com/arvidn/libtorrent/blob/a3440e54bb7f65ac6100c3d993c53f887025d660/src/create_torrent.cpp#L367)

- [libtransmission](https://github.com/transmission/transmission/blob/a482100f0cbae8050fd7e954af2cb1311205916e/libtransmission/makemeta.c#L89)

- [dottorrent](https://github.com/kz26/dottorrent/blob/fea5714efe0cde2a55eabfb387295781a78d84bb/dottorrent/__init__.py#L154)

- [Torrent File Editor](https://github.com/torrent-file-editor/torrent-file-editor/blob/811e401b38f26b6d94c4808c54ae2dcc7bbc27dd/mainwindow.cpp#L1210)
20 changes: 2 additions & 18 deletions src/piece_length_picker.rs
@@ -1,21 +1,5 @@
// The piece length picker attempts to pick a reasonable piece length
// for a torrent given the size of the torrent's contents.
//
// Constraints:
// - Decreasing piece length increases protocol overhead.
// - Decreasing piece length increases torrent metainfo size.
// - Increasing piece length increases the amount of data that must be thrown
// away in case of corruption.
// - Increasing piece length increases the amount of data that must be
// downloaded before it can be verified and uploaded to other peers.
// - Decreasing piece length increases the proportion of disk seeks to disk
// reads. This can be an issue for spinning disks.
// - The BitTorrent v2 specification requires that piece sizes be larger than 16
// KiB.
//
// These constraints could probably be exactly defined and optimized
// using an integer programming solver, but instead we just copy what
// libtorrent does.
//! See [the book](https://imdl.io/book/bittorrent/piece-length.html) for more
//! information on Intermodal's automatic piece length selection algorithm.

use crate::common::*;
