Skip to content

MatthewLM/python-mnemonic

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

{{bip}}

<pre>
  BIP:     BIP-0039
  Title:   Mnemonic code for generating deterministic keys
  Author:  Pavol Rusnak <stick@gk2.sk>
           Marek Palatinus <info@bitcoin.cz>
           Aaron Voisine <voisine@gmail.com>
  Status:  Draft
  Type:    Standards Track
  Created: 10-09-2013
</pre>

==Abstract==

This BIP proposes a scheme for translating binary data (usually master seeds
for deterministic keys, but it can be applied to any binary data) into a group
of easy to remember words also known as mnemonic code or mnemonic sentence.

==Motivation==

Such mnemonic code or mnemonic sentence is much easier to work with than working
with the binary data directly (or its hexadecimal interpretation). The sentence
could be writen down on paper (e.g. for storing in a secure location such as
safe), told over telephone or other voice communication method, or memorized
in ones memory (this method is called brainwallet).

==Backwards Compatibility==

As this BIP is written, only one Bitcoin client (Electrum) implements mnemonic
codes, but it uses a different wordlist than the proposed one.

For compatibility reasons we propose adding a checkbox to Electrum, which will
allow user to indicate if the legacy code is being entered during import or
it is a new one that is BIP-0039 compatible. For exporting, only the new format
will be used, so this is not an issue.

==Rationale==

Our proposal is inspired by implementation used in Electrum, but we enhanced
the wordlist and algorithm so it meets the following criteria:

a) smart selection of words
   - wordlist is created in such way that it's enough to type just first four
     letters to unambiguously identify the word

b) similar words avoided
   - words as "build" and "built", "woman" and "women" or "quick" or "quickly"
     not only make remembering the sentence difficult, but are also more error
     prone and more difficult to guess (see point below)
   - we avoid these words by carefully selecting them during addition

c) sorted wordlists
   - wordlist is sorted which allow more efficient lookup of the code words
     (i.e. implementation can use binary search instead of linear search)
   - this also allows trie (prefix tree) to be used, e.g. for better compression

d) localized wordlists
   - we would like to allow localized wordlists, so it is easier for users
     to remember the code in their native language
   - by using wordlists with no colliding words among languages, it's easy to
     determine which language was used just by checking the first word of
     the sentence

e) mnemonic checksum
   - this leads to better user experience, because user can be notified
     if the mnemonic sequence is wrong, instead of showing the confusing
     data generated from the wrong sequence. 

==Specification==

<pre>
Our proposal implements two methods - "encode" and "decode".

The first method takes a binary data which have to length (L) in bytes divisable
by four and returns a sentence that consists of (L/4*3) words from the wordlist.

The second method takes sentences generated by first method (number of words in
the sentence has to be divisable by 3) and reconstructs the original binary data.

Words can repeat in the sentence more than one time.

Wordlist contains 2048 words (instead of 1626 words in Electrum), allowing
the code to compute the checksum of the whole mnemonic sequence.
Each 32 bits of input data add 1 bit of checksum.

See the following table for relation between input lengths, output lengths and
checksum sizes for the most common usecases:

+--------+---------+---------+----------+
| input  |  input  | output  | checksum |
| (bits) | (bytes) | (words) |  (bits)  |
+--------+---------+---------+----------+
|   128  |    16   |    12   |     4    |
|   160  |    20   |    15   |     5    |
|   192  |    24   |    18   |     6    |
|   224  |    28   |    21   |     7    |
|   256  |    32   |    24   |     8    |
+--------+---------+---------+----------+
</pre>

===Algorithm:===

<pre>
Encoding:
1. Read input data (I).
2. Make sure its length (L) is divisable by 32 bits.
3. Compute the length of the checkum (LC). LC = L/32
4. Split I into chunks of LC bits (I1, I2, I3, ...).
5. XOR them altogether and produce the checksum C. C = I1 xor I2 xor I3 ... xor In.
5. Concatenate I and C into encoded data (E). Length of E is divisable by 33 bits.
6. Keep taking 11 bits from E until there are none left.
7. Treat them as integer W, add word with index W to the output.

Decoding:
1. Read input mnemonic (M).
2. Make sure its wordcount is divisable by 3.
3. Figure out word indexes in a dictionary and output them as binary stream E.
4. Length of E (L) is divisable by 33 bits.
5. Split E into two parts: B and C, where B are first L/33*32 bits, C are last L/33 bits.
6. Make sure C is the checksum of B (using the step 5 from the above paragraph).
7. If it's not we have invalid mnemonic code.
8. Treat B as binary data and return it as output.

</pre>

==Test vectors==

<pre>
input    : 00000000 (32 bits)
mnemonic : abandon abandon abandon (3 words)

input    : 7f7f7f7f (32 bits)
mnemonic : liberty wing ten (3 words)

input    : 80808080 (32 bits)
mnemonic : line advance burial (3 words)

input    : ffffffff (32 bits)
mnemonic : zoo zoo zone (3 words)

input    : 0000000000000000 (64 bits)
mnemonic : abandon abandon abandon abandon abandon abandon (6 words)

input    : 7f7f7f7f7f7f7f7f (64 bits)
mnemonic : liberty wing ten year wave saturday (6 words)

input    : 8080808080808080 (64 bits)
mnemonic : line advance burst abuse always double (6 words)

input    : ffffffffffffffff (64 bits)
mnemonic : zoo zoo zoo zoo zoo young (6 words)

input    : 000000000000000000000000 (96 bits)
mnemonic : abandon abandon abandon abandon abandon abandon abandon abandon abandon (9 words)

input    : 7f7f7f7f7f7f7f7f7f7f7f7f (96 bits)
mnemonic : liberty wing ten year wave sauce worry used letter (9 words)

input    : 808080808080808080808080 (96 bits)
mnemonic : line advance burst abuse always down acid author light (9 words)

input    : ffffffffffffffffffffffff (96 bits)
mnemonic : zoo zoo zoo zoo zoo zoo zoo zoo yell (9 words)

input    : 00000000000000000000000000000000 (128 bits)
mnemonic : abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (12 words)

input    : 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (128 bits)
mnemonic : liberty wing ten year wave sauce worry used liberty wing ten worth (12 words)

input    : 80808080808080808080808080808080 (128 bits)
mnemonic : line advance burst abuse always down acid author line advance burst abandon (12 words)

input    : ffffffffffffffffffffffffffffffff (128 bits)
mnemonic : zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo worth (12 words)

input    : 0000000000000000000000000000000000000000 (160 bits)
mnemonic : abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (15 words)

input    : 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (160 bits)
mnemonic : liberty wing ten year wave sauce worry used liberty wing ten year wave sauce winner (15 words)

input    : 8080808080808080808080808080808080808080 (160 bits)
mnemonic : line advance burst abuse always down acid author line advance burst abuse always down abandon (15 words)

input    : ffffffffffffffffffffffffffffffffffffffff (160 bits)
mnemonic : zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo winner (15 words)

input    : 000000000000000000000000000000000000000000000000 (192 bits)
mnemonic : abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (18 words)

input    : 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (192 bits)
mnemonic : liberty wing ten year wave sauce worry used liberty wing ten year wave sauce worry used liberty way (18 words)

input    : 808080808080808080808080808080808080808080808080 (192 bits)
mnemonic : line advance burst abuse always down acid author line advance burst abuse always down acid author line abandon (18 words)

input    : ffffffffffffffffffffffffffffffffffffffffffffffff (192 bits)
mnemonic : zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo way (18 words)

input    : 00000000000000000000000000000000000000000000000000000000 (224 bits)
mnemonic : abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (21 words)

input    : 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (224 bits)
mnemonic : liberty wing ten year wave sauce worry used liberty wing ten year wave sauce worry used liberty wing ten year useful (21 words)

input    : 80808080808080808080808080808080808080808080808080808080 (224 bits)
mnemonic : line advance burst abuse always down acid author line advance burst abuse always down acid author line advance burst abuse abandon (21 words)

input    : ffffffffffffffffffffffffffffffffffffffffffffffffffffffff (224 bits)
mnemonic : zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo useful (21 words)

input    : 0000000000000000000000000000000000000000000000000000000000000000 (256 bits)
mnemonic : abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon (24 words)

input    : 7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f7f (256 bits)
mnemonic : liberty wing ten year wave sauce worry used liberty wing ten year wave sauce worry used liberty wing ten year wave sauce worry tennis (24 words)

input    : 8080808080808080808080808080808080808080808080808080808080808080 (256 bits)
mnemonic : line advance burst abuse always down acid author line advance burst abuse always down acid author line advance burst abuse always down acid abandon (24 words)

input    : ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff (256 bits)
mnemonic : zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo zoo tennis (24 words)

input    : 19458fe8dace41e60617c6667874f6be (128 bits)
mnemonic : block coast when remind title version bizarre website half seminar disease learn (12 words)

input    : 2d8fc7c27a52995bc165a0dcead7c169f301652f0b5db145 (192 bits)
mnemonic : column leisure utter violin city puppy ahead region sword fiscal usual squad core rebel rough hospital rank continue (18 words)

input    : ea88c6efbc4028994b8b4ebc0345a474 (128 bits)
mnemonic : tuna elegant rose key action ever company hip rotate body hero trip (12 words)

input    : bb8e02d2c0fe4c18f60e173bc38724902395b63b1b47bf18 (192 bits)
mnemonic : rookie include relax local toilet army stuff seed diamond both silent bus deny sure unaware here same shell (18 words)

input    : 0b57161fd701c6c6548db5a71df70166965eec2f16a675acad3ed3460cb3e72f (256 bits)
mnemonic : april rhythm math purchase bounce govern fancy report pocket usage scan slow gun jazz round stage jacket guess explain pledge girl forest organ widow (24 words)

input    : 666fbb72dada90d6c8f627d77450f366 (128 bits)
mnemonic : half leg swing remove post horror carpet settle street pencil dilemma slight (12 words)

input    : 29932c2887ff2c72dd8f9cff1b7ee498119f330d1dd81659 (192 bits)
mnemonic : city october answer aunt verb depart jealous viable yell swallow ride corn board odd cup rocket bench snack (18 words)

input    : 869e2bc8286611bce2159b89c5a8a4029666dc186bbfe8756ac505a4e89f481e (256 bits)
mnemonic : march valley vast fact glad talk maximum recycle member collect chunk age half icon globe junk trial prison rank april exhaust explain bush slice (24 words)

input    : d0263588b1deee8186818319515c6691 (128 bits)
mnemonic : soon craft goal grain uphold drama boat correct block mess blouse carpet (12 words)

input    : 48754ef698334cf9cc5494ccca6a0294bf30224066576a7c (192 bits)
mnemonic : end pretty runway correct curtain lawsuit cover myth slight fault link citizen version awful accord skate holiday tiny (18 words)

input    : ee9e8ce85eadcdf0f5473d7490816d9b1335f7204e7776597ac99f9e29186364 (256 bits)
mnemonic : unveil village derive rumor switch wear stand trap iraqi lunch french data cross wind little sock jeans sky reason dollar thumb mill mistake net (24 words)
</pre>

==Reference Implementation==

Reference implementation including wordlists is available from http://github.com/trezor/python-mnemonic

About

Mnemonic code for generating deterministic keys, BIP39

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%