---
title: Time-lock encryption
description: How do you encrypt a file such that it can be broken after a date but not before?
created: 24 May 2011
tags: computer science, cryptography, Bitcoin
status: finished
belief: highly likely
...
Julian Assange/Wikileaks made some [headlines](http://www.wired.com/threatlevel/2010/07/wikileaks-insurance-file/) in 2010 when they released an "insurance file", a 1.4GB [AES-256](!Wikipedia)-encrypted file available through BitTorrent. It's generally assumed that copies of the encryption key have been left with Wikileaks supporters who will, in the appropriate contingency like Assange being assassinated, leak the key online to the thousands of downloaders of the insurance file, who will then read and publicize whatever contents are in it (speculated to be additional US documents Manning gave Wikileaks); this way, if the worst happens and Wikileaks cannot afford to keep digesting the files to eventually release them at its leisure in the way it calculates will have the most impact, the files will still be released and someone will be very unhappy.
Of course, this is an all-or-nothing strategy. Wikileaks has no guarantees that the file will not be released prematurely, nor guarantees that it will eventually be released. Any one of those Wikileaks supporters could become disaffected and leak the key at any time - or if there's only 1 supporter, they might lose the key to a glitch or become disaffected in the opposite direction and refuse to transmit the key to anyone. (Hope Wikileaks kept backups of the key!) If one trusts the person with the key *absolutely*, that's fine. But wouldn't it be nice if one didn't have to trust another person like that? Cryptography does really well at eliminating the need to trust others, so maybe there're better schemes.
Now, it's hard to imagine how some abstract math could observe an assassination and decrypt embarrassing files. Perhaps a different question could be answered - can you design an encryption scheme which requires no trusted parties but can only be broken after a certain date?
# Uses
This sort of cryptography would be useful for many things; from ["Time-Lock Puzzles in the Random Oracle Model"](http://people.seas.harvard.edu/~salil/research/timelock.pdf) (Mahmoody et al 2011):
> In addition to the basic use of 'sending messages to the future', there are many other potential uses of timed-release crypto. Rivest, Shamir and Wagner 1996 suggest, among other uses, delayed [digital cash](!Wikipedia) payments, [sealed-bid auctions](http://crypto.stackexchange.com/questions/2507/can-i-encrypt-user-input-in-a-way-i-cant-decrypt-it-for-a-certain-period-of-tim) and [key escrow](!Wikipedia). [Boneh & Naor 2000](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108.7127&rep=rep1&type=pdf "Timed Commitments") define timed commitments and timed signatures and show that they can be used for fair contract signing, honesty-preserving auctions and more.
[Document embargoes](http://www.cs.odu.edu/~mln/pubs/ms/haq-ms-2008.pdf "'Using timed-release cryptography to mitigate preservation risk of embargo periods', Haq 2008") (eg. legal or classified documents or confessions or [coercible diaries](http://www2.warwick.ac.uk/fac/soc/law/elj/jilt/2001_3/miller/ "'Creating A Subpoena-Proof Diary: A Technological Solution to A Legal Problem', Miller & Gao 2001")) and [receipt-free voting](/docs/2002-magkos.pdf "'Software-based Receipt-freeness in On-line Elections', Magkos et al 2002") are other applications; a cute albeit completely useless application is ["Offline Submission with RSA Time-Lock Puzzles"](http://www.cn.uni-duesseldorf.de/publications/library/Jerschow2010a.pdf), Jerschow & Mauve 2010:
> Our main contribution is an offline submission protocol which enables an author being currently offline to commit to his document before the deadline by continuously solving an RSA puzzle based on that document. When regaining Internet connectivity, he submits his document along with the puzzle solution which is a proof for the timely completion of the document.
When Ross Ulbricht/Dread Pirate Roberts of [Silk Road]() was arrested in October 2013 and his computer with his Bitcoin hoard seized, I thought of another use: conditional transfers of wealth. Ulbricht could create a time-locked copy of his bitcoins and give it to a friend. Because bitcoins can be transferred, he can at any time use his own unlocked copy to render the copy useless and doesn't have to worry about the friend stealing all the bitcoins from him, but if he is, say, "shot resisting arrest", his friend can still - eventually - recover the coins. This specific Bitcoin scenario may be [possible with the Bitcoin protocol](https://en.bitcoin.it/wiki/Contracts), in which case one has a nice offline digital cash system, as [betterunix](https://news.ycombinator.com/item?id=6509826) suggests:
> Here is one possible use case: imagine an offline digital cash system, so i.e. the bank will not accept the same token twice. To protect against an unscrupulous seller, the buyer pays by giving the seller a time-lock puzzle with the tokens; if the goods are not delivered by some deadline, the buyer will deposit the tokens at the bank, thus preventing the seller from doing so. Otherwise the seller solves the puzzle and makes the deposit. This is basically an anonymity-preserving escrow service, though in practice there are probably simpler approaches.
One not-so-cute use is in defeating [antivirus software](!Wikipedia). ["Anti-Emulation Through Time-Lock Puzzles"](http://tuts4you.com/download.php?view.2348), Ebringer 2008 outlines it: one starts a program with a small time-lock puzzle which must be solved before the program does anything evil, in the hopes that the antivirus scanner will give up or stop watching before the puzzle has been solved and the program decrypts the evil payload; the puzzle's math backing means no antivirus software can analyze or solve the puzzle first. The basic functionality cannot be blacklisted as it is used by legitimate cryptography software such as [OpenSSL](!Wikipedia) which would be expensive collateral damage.
# No trusted third-parties
Note that this bars a lot of the [usual suggestions](http://www.halfbakery.com/idea/Do_20not_20decrypt_20until_20_2e_2e_2e) for cryptography schemes. For example, take the general approach of [key escrow](!Wikipedia) (eg. [Bellare & Goldwasser](http://groups.csail.mit.edu/cis/pubs/shafi/1997-ccs.pdf) 1996): if you trust some people, you can just adopt a [secret sharing](!Wikipedia) protocol where they XOR together their keys to get the master key for the publicly distributed encrypted file. Or if you only trust some of those people (but are unsure which will try to betray you and either release early or late), you can adopt a threshold scheme where any _k_ of the _n_ people suffice to reconstruct the master key, like [Rabin & Thorpe](ftp://ftp.deas.harvard.edu/techreports/tr-22-06.pdf) 2006. (And you can connect multiple groups, so each decrypts some necessary keys for the next group; but this gives each group a consecutive veto on release...) Or perhaps something could be devised based on [trusted timestamping](!Wikipedia) like [Crescenzo et al](http://www.cs.ucla.edu/~rafail/PUBLIC/42.pdf) 1999 or [Blake & Chan](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.2.3030&rep=rep1&type=ps) 2004; but then don't you need the [trusted third party](!Wikipedia) to survive on the network? Or [secure multi-party computation](!Wikipedia) (but don't *you* need to be on the network, or risk all the parties saying 'screw it, we're too impatient, let's just pool our secrets and decrypt the file *now*'?). Or you could exploit physics and use the speed of light to communicate with [a remote computer on a spacecraft](http://www.reddit.com/r/crypto/comments/r4q2q/reliable_timelock_crypto_step_one_obtain_a/ "Reliable time-lock crypto. Step one: obtain a deep-space probe...") (except now we're trusting the spacecraft as our third party, hoping no one stole its onboard private key & is able to decrypt our transmissions instantaneously)...
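To make the simplest of these trusted-party schemes concrete, here is a minimal sketch of the all-or-nothing XOR sharing mentioned above, where every keyholder's share is needed to rebuild the master key. The function names (`split_key`, `combine`) are hypothetical, and this is only an illustration of the idea, not the protocol from any of the cited papers:

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(master_key: bytes, n: int) -> list[bytes]:
    """Split master_key into n shares; all n are needed to rebuild it."""
    # The first n-1 shares are pure random noise.
    shares = [secrets.token_bytes(len(master_key)) for _ in range(n - 1)]
    # The last share is chosen so that the XOR of all n shares equals the key.
    shares.append(reduce(xor_bytes, shares, master_key))
    return shares

def combine(shares: list[bytes]) -> bytes:
    """XOR every share together to recover the master key."""
    return reduce(xor_bytes, shares)

master_key = secrets.token_bytes(32)   # eg. a 256-bit file key
shares = split_key(master_key, 5)      # one share per trusted supporter
assert combine(shares) == master_key
```

Any subset of fewer than _n_ shares is indistinguishable from random noise, which is exactly why a single lost or withheld share blocks release: the scheme removes the risk of one person leaking early, but not the need to trust all of them to show up.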
One approach is to focus on creating problems which can be solved with a large but precise amount of work, reasoning that if the problem can't be solved in less than a month, then you can use that as a way to guarantee the file can't be decrypted *within* a month's time. (This would be a [proof-of-work system](!Wikipedia).) This has its own problems[^Ebringer], but it at least delivers what it promises.
[^Ebringer]: Ebringer 2008, applying time-lock puzzles to enhancing the ability of computer viruses & trojans to defeat anti-virus scanners, describes Rivest's original successive-squaring solution somewhat sarcastically:
> Even in the original paper, the authors struggled to find a plausible use for it. To actually use the construction as a "time-lock" requires predicting the speed of CPUs in the future, resulting, at best, in a fuzzy release-date. This assumes that someone cares enough to want what is allegedly wrapped up in the puzzle to bother to compute the puzzle in the first place. It is not obvious that in the majority of situations, this would have a clear advantage over, say, leaving the information with a legal firm with instructions to release it on a particular date. Although this paper proposes a practical use for time-lock puzzles, the original authors would probably be dismayed that there is still not a widespread usage that appears to be of net benefit to humanity.
On the other hand, a similar criticism could and has been made about [Bitcoin](Bitcoin is Worse is Better) (supporters/users must expend massive computing power constantly just to keep it working, with no computational advantage over attackers), and that system has worked well in practice.
# Weak keys
One could encrypt the file against information that will be known in the future, like stock prices - except wait, how can you find out what stock prices will be a year from now? You can't use anything that is public knowledge *now* because that'd let the file be decrypted immediately, and by definition you don't have access to information currently unknown but which will be known in the future, and if you generate the information yourself planning to release it, now you have problems - you can't even trust yourself (what if you are abruptly assassinated like [Gerald Bull](!Wikipedia)?) much less your confederates.
The first approach that jumps to mind is to encrypt the file, but with a relatively short or weak key, one which will take years to bruteforce. Instead of a symmetric key of 256 bits, perhaps one of 50 bits or 70 bits?
This fails because we realize that we can guarantee on average how much work it will take to bruteforce the key, but we cannot guarantee how much time it will take. It may take a CPU years to bruteforce a chosen key-length, but take a cluster of CPUs just months. Or worse than that, without invoking clusters or supercomputers - devices can differ dramatically now even in the same computers; to take the example of [Bitcoin](!Wikipedia) mining, my laptop's 2GHz CPU can search for hashes at 4k/sec, or its single outdated GPU can search at 54*m*/second^[Actual numbers; the difference [really is](https://en.bitcoin.it/wiki/Why_a_GPU_mines_faster_than_a_CPU) that [large](https://en.bitcoin.it/wiki/Mining_hardware_comparison).]. This roughly 13,500x speedup is not because the GPU's clockspeed is 27,000GHz or anything like that, it is because the GPU has hundreds of small specialized processors which are able to compute hashes, and the particular application does not require the processors to coordinate or anything like that which might slow them down. Incidentally, this imbalance between CPUs and highly-parallel specialized chips has had the negative effect of centralizing Bitcoin mining power, reducing the security of the network.^[While there are [tens or hundreds of thousands of nodes](https://en.bitcoin.it/wiki/Bitcoin_Map) in the Bitcoin P2P network, only a few of them are actual miners because CPU mining has become useless - the big miners, who have large server farms of GPUs or ASICs, collectively control much of the hash power. This has not yet been a problem, but may become one. Using a (partially) memory-bound hash function is one of the selling points of a competing Bitcoin-derived currency, [Litecoin](https://github.com/litecoin-project/litecoin/wiki/Comparison-between-Bitcoin-and-Litecoin).]
Many [scientific applications](!Wikipedia "GPGPU#Applications") have moved to clusters of GPUs because they offer such great speedups; as have a number of cryptographic applications such as generating[^rainbow] [rainbow tables](!Wikipedia). Someone who tries to time-lock a file using a parallelizable form of work renders themselves vulnerable to any attackers like the NSA or botnets with large numbers of computers, but may also render themselves vulnerable to an ordinary computer-gamer with 2 new GPUs: it would not be very useful to have a time-lock which guarantees the file will be locked between a year and a millennium, depending on how many & what kind of people bother to attack it and whether Moore's law continues to increase the parallel-processing power available.
So anything which is parallel, like using short keys, is probably useless. We need an approach which is inherently *serial*.
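The arithmetic behind that conclusion can be sketched in a few lines. This is a rough estimate only: it assumes that testing one candidate key costs about as much as one of the hashes measured above (so the 4k/sec and 54m/sec figures can be reused as keys per second), that the work splits perfectly across machines, and that on average half the key space must be searched. The function name `expected_seconds` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

def expected_seconds(key_bits: int, keys_per_second: float, machines: int = 1) -> float:
    """Expected brute-force time: half the key space, divided evenly across machines."""
    expected_trials = 2 ** (key_bits - 1)
    return expected_trials / (keys_per_second * machines)

# Same 50-bit key, three different attackers (rates are assumptions, see above).
for label, rate, machines in [
    ("one CPU at 4k keys/sec", 4e3, 1),
    ("one GPU at 54m keys/sec", 54e6, 1),
    ("1000 such GPUs", 54e6, 1000),
]:
    days = expected_seconds(50, rate, machines) / SECONDS_PER_DAY
    print(f"{label}: about {days:,.2f} days")
```

Under those assumptions the same 50-bit key costs roughly 4,500 years on the CPU, about four months on the GPU, and about three hours on a thousand GPUs: the amount of work is fixed, but the wall-clock time is whatever the attacker's budget says it is.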
# Hashing
For example, one could take a [hash](!Wikipedia "Cryptographic hash function") like [bcrypt](!Wikipedia), give it a random input, and hash it for a month. Each hash depends on the previous hash, and there's no way to skip from the first hash to the trillionth hash. After a month, you use the final hash as the encryption key, and then release the encrypted file and the random input to all the world. The first person who wants to decrypt the file has no choice but to redo the trillion hashes in serial order to reach the same encryption key you used.
Nor can the general public (or the NSA) exploit the parallelism they have available, because each hash depends sensitively on the hash before it - the [avalanche effect](!Wikipedia) is a key property of cryptographic hashes.
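As a minimal sketch of the hash-chain idea just described: SHA-256 from Python's standard library stands in here for bcrypt (an assumption made only to keep the example dependency-free), the iteration count is a toy value, and the actual file encryption step is omitted. The name `hash_chain` is hypothetical.

```python
import hashlib
import secrets

def hash_chain(seed: bytes, iterations: int) -> bytes:
    """Feed SHA-256 its own output `iterations` times, strictly in order."""
    digest = seed
    for _ in range(iterations):
        # Each step needs the previous digest, so steps cannot be run side by side.
        digest = hashlib.sha256(digest).digest()
    return digest

# Creator: pick a random input, grind through the chain, use the result as the key.
seed = secrets.token_bytes(32)
iterations = 100_000                 # toy value; a real lock would run for weeks
key = hash_chain(seed, iterations)

# The creator publishes (seed, iterations) and the file encrypted under `key`.
# A solver has no shortcut: recovering the key means redoing the whole chain.
assert hash_chain(seed, iterations) == key
```

Note that the creator and the solver do exactly the same amount of work here; the only asymmetry is that the creator did it earlier.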
On the other hand, there seems to be a way that the original person running this algorithm *can* run it in parallel: one generates _n_ random inputs (for _n_ CPUs, presumably), and sets them hashing as before for say a month. Then, one sets up a chain between the _n_ results - the final hash of seed 1 is used to encrypt seed 2, the final hash of seed 2 is used to encrypt seed 3, and so on, with the final hash of the last seed serving as the key for the file itself. Then one releases the encrypted file, first seed, and the $n-1$ *encrypted* seeds. Now the public has to hash the first seed for a month, and only then can it decrypt the second seed, and start hashing *that* for a month, and so on. Karl Gluck suggests that like repeated squaring, it's even possible to add error-detection to the procedure[^Gluck-chain]. (A somewhat similar scheme, ["A Guided Tour Puzzle for Denial of Service Prevention"](http://www.cs.pitt.edu/~mehmud/docs/abliz09tourpuzzle.pdf "Abliz & Znati 2009"), uses network latency rather than hash outputs as the chained data - clients bounce from resource to resource - but this obviously requires an online server and is unsuitable for our purposes, among [other problems](http://www.reddit.com/r/compsci/comments/1nx0n5/timelock_encryption/ccne7bk).)
[^Gluck-chain]: On [Hacker News](https://news.ycombinator.com/item?id=6509688):
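The chained construction can be sketched as follows. Everything here is an illustrative assumption rather than a vetted design: SHA-256 is the hash, each seed is "encrypted" by XORing it with the previous chain's 32-byte final hash (used once, on a random 32-byte seed), and the names `build_puzzle` and `solve_puzzle` are hypothetical. The list comprehension that computes the final hashes is the step a creator with _n_ CPUs would farm out in parallel.

```python
import hashlib
import secrets

def hash_chain(seed: bytes, iterations: int) -> bytes:
    """Feed SHA-256 its own output `iterations` times, strictly in order."""
    digest = seed
    for _ in range(iterations):
        digest = hashlib.sha256(digest).digest()
    return digest

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def build_puzzle(n_links: int, iterations: int):
    """Creator: the n chains are independent of each other, so they can run in parallel."""
    seeds = [secrets.token_bytes(32) for _ in range(n_links)]
    finals = [hash_chain(s, iterations) for s in seeds]   # the parallelizable step
    # Seed i+1 is hidden under the final hash of seed i (XOR as a toy cipher).
    locked = [xor_bytes(seeds[i + 1], finals[i]) for i in range(n_links - 1)]
    return seeds[0], locked, finals[-1]                   # finals[-1] is the file key

def solve_puzzle(first_seed: bytes, locked: list, iterations: int) -> bytes:
    """Solver: each seed is only revealed once the previous chain has been finished."""
    final = hash_chain(first_seed, iterations)
    for blob in locked:
        next_seed = xor_bytes(blob, final)
        final = hash_chain(next_seed, iterations)
    return final

first_seed, locked_seeds, file_key = build_puzzle(n_links=4, iterations=10_000)
assert solve_puzzle(first_seed, locked_seeds, 10_000) == file_key
```

With _n_ machines the creator spends one chain's worth of wall-clock time, while the solver is forced through all _n_ chains back to back.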
> To add checkpoints, one could release both the original seed of the chain A, and a number of pairs of hashes (x0,y0) (x1,y1) ... Let's say you wanted to do 1-month chains. Hash the seed A for a week, then take the current value x0 such that H(B)=x0. You know the value of B, since you've been computing the chain. Pick another random value y0, and continue the chain with H(B^y0). Write (x0,y0) in the output, and hash for another week. Do the same for (x1,y1) (x2,y2) and (x3,y3). Each chain then has a seed value and 4 pairs of 'checkpoints'. When unlocking the crypto puzzle, these checkpoints can't be used to jump ahead in the computation, but they can tell you that you're on the right track. I think that you could even use a secondary hash chain for the y_n values, so y_n+1=H(y_n). If you also derived y0 from A (e.g. y0=H(A^const) ), you would just need to publish the seed value A and each checkpoint hash x_n in order to have a fully checkpointed crypto puzzle.
This is pretty clever. If one has a thousand CPUs handy, one can store up nearly 3 years of computation-resistance in just a day. This satisfies a number of needs. (Maybe; please do not try to use this for a real application before getting a proof from a cryptographer that chained hashing is secure.) But what about people who only have a normal computer? Fundamentally, this repeated hashing requires you to put in as much computation as you want your public to expend reproducing the computation, which is not enough. We want to force the public to expend more computation - potentially much more - than we put in. How can we do this?
It's hard to see. At least, I haven't thought of anything clever. [Homomorphic encryption](!Wikipedia) promises to let us encode arbitrary computations into an encrypted file, so one could imagine implementing the above hash chains *inside* the homomorphic computation, or perhaps just encoding a loop counting up to a large number. There are two problems with trying to apply homomorphic encryption:
1. I don't know how one would let the public decrypt the result of the homomorphic encryption without also letting them tamper with the loop.
Suppose one specifies that the final output is encrypted to a weak key, so one simply has to run the homomorphic system to completion and then put in a little effort to break the homomorphic encryption to get the key which unlocks a file; what stops someone from breaking the homomorphic encryption at the very start, manually examining the running program, and shortcutting to the end result? Of course, one could postulate that what was under the homomorphic encryption was something like a hash-chain where predicting the result is impossible - but then why bother with the homomorphic encryption? Just use that instead!
2. In any case, homomorphic encryption as of 2013 is a net computational loss: it takes as much or more time to create such a program as it would take to run the program, and so it is no good for our purposes.
## Vulnerability of one-way functions
As it turns out, ["Time-Lock Puzzles in the Random Oracle Model"](http://people.seas.harvard.edu/~salil/research/timelock.pdf) (Mahmoody, Moran, and Vadhan 2011; [slides](http://www.iacr.org/conferences/crypto2011/slides/01-3-Mahmoody.pdf)) directly & formally analyzes the general power of one-way functions used for time-lock puzzles assuming a [random oracle](!Wikipedia). Unfortunately, they find an opponent can exploit the oracle to gain speedups. Fortunately, the cruder scheme where one 'stores up' computation (repeatedly asking the oracle at inputs based on its previous output) still works under their assumptions:
> *A time-lock puzzle with a linear gap in parallel time.* Although our negative results rule out 'strong' time-lock puzzles, they still leave open the possibility for a weaker version: one that can be generated with _n_ parallel queries to the oracle but requires _n_ rounds of adaptive queries to solve. In a positive result, we show that such a puzzle can indeed be constructed...Although this work rules out black-box constructions (with a super-constant gap) from one-way permutations and collision-resistant hash functions, we have no reason to believe that time-lock puzzles based on other concrete problems (e.g., lattice-based problems) do not exist. Extending our approach to other general assumptions (e.g., trapdoor permutations) is also an interesting open problem.
That is, the puzzle constructor can construct the puzzle in parallel, and the solver has to solve it serially.
# Successive squaring
At this point, let's see what the crypto experts have to say. Googling to see what the existing literature was (after I'd thought of the above schemes), I found that the relevant term is "time-lock puzzles" (from analogy with the bank vault [time lock](!Wikipedia)). In particular, [Rivest](!Wikipedia "Ron Rivest")/[Shamir](!Wikipedia "Adi Shamir")/[Wagner](!Wikipedia "David A. Wagner") have published a 1996 paper on the topic, ["Time-lock puzzles and timed-release crypto"](http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.5709). Apparently the question was first raised by [Timothy C. May](!Wikipedia) on the [Cypherpunks mailing list](!Wikipedia) in an [email from 10 February 1993](http://cypherpunks.venona.com/date/1993/02/msg00129.html "Timed-Release Crypto"). May also discusses the topic briefly in his _[Cyphernomicon](!Wikipedia)_, [ch14.5](http://www.cypherpunks.to/faq/cyphernomicron/chapter14.html); unfortunately, May's solution (14.5.1) is essentially to punt to the legal system and rely on legal privilege and economic incentives to keep keys private. <!-- backup: http://cypherpunks.venona.com/date/1994/06/msg00481.html http://www.webcitation.org/6JhMIlgIF -->
Rivest et al agree with us that
> There are 2 natural approaches to implementing timed-release crypto:
>
> - Use 'time-lock puzzles' - computational problems that can not be solved without running a computer continuously for at least a certain amount of time.
> - Use trusted agents who promise not to reveal certain information until a specified date.
And that for time-lock puzzles:
> Our goal is thus to design time-lock puzzles that, to the great extent possible, are 'intrinsically sequential' in nature, and can not be solved substantially faster with large investments in hardware. In particular, we want our puzzles to have the property that putting computers to work together in parallel doesn't speed up finding the solution. (Solving the puzzle should be like having a baby: two women can't have a baby in 4.5 months.)
Rivest et al then point out that the most obvious approach - encrypt the file to a random short key, short enough that brute-forcing takes only a few months/years as opposed to eons - is flawed because brute-forcing a key is very [parallelizable](!Wikipedia "EFF DES cracker") and amenable to [special hardware](!Wikipedia "Custom hardware attack")[^hardware]. (And as well, the randomness of searching a key space means that the key might be found very early or very late; any estimate of how long it will take to brute force is just a guess.) One cute application of the same brute-forcing idea is [Merkle's Puzzles](!Wikipedia) where the time-lock puzzle is used to hide a key for a second-party to communicate with the first-party creator, but it has the same drawback: it has the creator make many time-lock puzzles (any of which could be used by the second-party) and raises the cost to the attacker (who might have to crack each puzzle), but can be defeated by a feasibly wealthy attacker, and offers only probabilistic guarantees (what if the attacker cracks the same puzzle the second-party happens to choose?).
[^hardware]: Colin Percival's ["Insecurity in the Jungle (disk)"](http://www.daemonology.net/blog/2011-06-03-insecurity-in-the-jungle.html) presents a table giving times for brute-forcing [MD5](!Wikipedia) hashes given various hardware; most dramatically, <$1m of custom [ASIC](!Wikipedia "Application-specific integrated circuit") hardware could bruteforce a random 10 character string in 2 hours. (Hardware reaps extreme performance gains [mostly when](http://www.yosefk.com/blog/its-done-in-hardware-so-its-cheap.html "`It's done in hardware so it's cheap`") few memory accesses are required, and a few fast operations are applied to small amounts of data; this is because flexibility imposes overhead, and when the overhead is incurred just to run fast instructions, the overhead dominates the entire operation. For example, graphics chips do just a relative handful of math to a frame, again and again, and so they gain orders of magnitude speedups by being specialized chips - as does any other program which is like that, which includes cryptographic hashes designed for speed like the ones Bitcoin uses.)
[Wikipedia](!Wikipedia "Key stretching#Strength and time") gives an older example using [FPGAs](!Wikipedia "Field-programmable gate array") (also [being](https://bitcointalk.org/index.php?topic=9047.0) [used](http://www.reddit.com/r/Bitcoin/comments/hhd4l/fpga_mining/) for Bitcoin hashing):
> An important consideration to be made is that CPU-bound hash functions are still vulnerable to hardware implementations. For example, the literature provides efficient hardware implementations of SHA-1 in as low as 5000 gates, and able to produce a result in less than 400 clock cycles^[2](http://rfidsec2013.iaik.tugraz.at/RFIDSec08/Papers/Publication/04%20-%20ONeill%20-%20Low%20Cost%20SHA-1%20-%20Paper.pdf "'Low-Cost SHA-1 Hash Function Architecture for RFID Tags', O'Neill")^. Since multi-million gate FPGAs can be purchased at less than $100 price points^[3](http://www.xilinx.com/prs_rls/silicon_spart/0333spartan3.htm)^, it follows that an attacker can build a fully unrolled hardware cracker for about $5000. Such a design, if clocked at 100MHz can try about 300,000 keys/second for the algorithm proposed above.
Rivest et al propose a scheme in which one encrypts the file with a very strong key as usual, but then one encrypts the key in such a way that one must calculate $\text{encryptedKey}^{2^t} \bmod n$, where _t_ is the adjustable difficulty factor. With the original numbers (the prime factors of _n_), one can easily avoid doing the [successive squarings](http://mathworld.wolfram.com/SuccessiveSquareMethod.html) by first reducing the exponent $2^t$ modulo $\phi(n)$. This has the nice property that the puzzle constructor invests only $O(\log t)$ modular multiplications, but the solver has to perform all _t_ squarings one after another. (By contrast, for key exchange built only on a random oracle, as in Merkle's Puzzles, [Barak & Mahmoody-Ghidary](http://www.cs.princeton.edu/~boaz/Papers/merkle.pdf) 2009 prove that a quadratic gap is the best you can do.)
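A toy sketch of the successive-squaring puzzle may make the asymmetry clearer. It follows the shape of the Rivest/Shamir/Wagner construction (mask the secret with $a^{2^t} \bmod n$), but the parameters are assumptions chosen for illustration: the two Mersenne primes are far too small for real use, the base is fixed at 2, and the function names are hypothetical.

```python
def make_puzzle(p: int, q: int, t: int, secret: int, base: int = 2):
    """Creator: knows p and q, so the exponent 2^t can be reduced mod phi(n) first."""
    n = p * q
    phi = (p - 1) * (q - 1)
    e = pow(2, t, phi)          # cheap: on the order of log(t) multiplications
    mask = pow(base, e, n)      # equals base^(2^t) mod n, by Euler's theorem
    return n, base, t, (secret + mask) % n

def solve_puzzle(n: int, base: int, t: int, locked: int) -> int:
    """Solver: without the factors of n, performs t squarings one after another."""
    w = base % n
    for _ in range(t):
        w = (w * w) % n         # each squaring needs the result of the previous one
    return (locked - w) % n

# Toy parameters: two known Mersenne primes (real puzzles use large random primes).
p, q = 2**31 - 1, 2**61 - 1
secret_key = 123456789          # stands in for the strong file key; must be < n
n, base, t, locked = make_puzzle(p, q, t=100_000, secret=secret_key)
assert solve_puzzle(n, base, t, locked) == secret_key
```

Doubling _t_ doubles the solver's loop but adds only about one more multiplication to the creator's `pow(2, t, phi)`, which is the gap the hash-chain schemes above could not provide.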
Rivest [has actually used](http://people.csail.mit.edu/rivest/lcs35-puzzle-description.txt) this scheme for a time capsule commemorating the [MIT Computer Science and Artificial Intelligence Laboratory](!Wikipedia); he expects his puzzle to take ~35 years. 14 years later, [minimal progress](http://crypto.stackexchange.com/questions/5831/what-is-the-progress-on-the-mit-lcs35-time-capsule-crypto-puzzle "What is the progress on the MIT LCS35 Time Capsule Crypto-Puzzle?") seems to have been made; if anything, the breakdown of CPU clockspeeds seems to imply that it will take far more than 35 years.[^Rivest-projection] Rivest offers some advice for anyone attempting to unlock this time-lock puzzle (may or may not be related to Mao's 2000 paper ["Time-Lock Puzzle with Examinable Evidence of Unlocking Time"](/docs/2000-mao.pdf)):
> An interesting question is how to protect such a computation from errors. If you have an error in year 3 that goes undetected, you may waste the next 32 years of computing. Adi Shamir has proposed a slick means of checking your computation as you go, as follows. Pick a small (50-bit) prime _c_, and perform the computation modulo _cn_ rather than just modulo _n_. You can check the result modulo _c_ whenever you like; this should be a extremely effective check on the computation modulo _n_ as well.
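One way to read Shamir's trick, sketched under stated assumptions: since _c_ is a small prime that does not divide the base, Fermat's little theorem lets the expected value modulo _c_ after _k_ squarings be computed almost instantly, so it can be compared against the long-running computation at any point. The function name, the choice of checking interval, and the use of the Mersenne prime $2^{61}-1$ in place of a 50-bit prime are all illustrative assumptions.

```python
def squarings_with_check(n: int, c: int, t: int, check_every: int, base: int = 2) -> int:
    """Compute base^(2^t) mod n, working mod c*n and verifying mod c along the way."""
    m = c * n
    w = base % m
    for k in range(1, t + 1):
        w = (w * w) % m
        if k % check_every == 0:
            # c is prime and does not divide base, so by Fermat's little theorem
            # the exponent 2^k can be reduced mod (c - 1) at almost no cost.
            expected = pow(base, pow(2, k, c - 1), c)
            if w % c != expected:
                raise ArithmeticError(f"computation corrupted at or before step {k}")
    return w % n                # reducing mod n recovers the value the puzzle needs

n = (2**31 - 1) * (2**19 - 1)   # toy modulus built from two known Mersenne primes
c = 2**61 - 1                   # known prime standing in for the 50-bit prime
result = squarings_with_check(n, c, t=10_000, check_every=1_000)
```

The check costs a slightly larger modulus on every squaring, in exchange for catching a hardware fault within `check_every` steps instead of decades later.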
[^Rivest-projection]: As of 2013, the highest-frequency consumer CPU available, the AMD FX-9000, tops out at 5GHz with few prospects for substantial increases in frequency; Rivest projected 10GHz and increasing:
> Based on the SEMATECH National Technology Roadmap for Semiconductors (1997 edition), we can expect internal chip speeds to increase by a factor of approximately 13 overall up to 2012, when the clock rates reach about 10GHz. After that improvements seem more difficult, but we estimate that another factor of five might be achievable by 2034. Thus, the overall rate of computation should go through approximately six doublings by 2034.
Moore's law, in the original formulation of transistors per dollar, may have continued post-1997, but the gains increasingly came by parallelism, which is not useful for repeated squaring.
## Constant factors
How well does this work? The complexity seems correct, but I worry about the constant factors. Back in 1996, computers were fairly homogeneous, and Rivest et al could reasonably write
> We know of no obvious way to parallelize it to any large degree. (A small amount of parallelization may be possible *within* each squaring.) The degree of variation in how long it might take to solve the puzzle depends on the variation in the speed of single computers and not on one's total budget. Since the speed of hardware available to individual consumers is within a small constant factor of what is available to large intelligence organizations, the difference in time to solution is reasonably controllable.
But I wonder how true this is. Successive squaring does not seem to be a very complex algorithm to implement in hardware. There are more exotic technologies than GPUs we might worry about, like [field-programmable gate array](!Wikipedia)s which may be specialized for successive squaring; if problems like the [n-body problem](!Wikipedia) can be [handled with custom chips](!Wikipedia "Gravity Pipe"), why not multiplication? Or, since squaring seems simple, is it relevant to forecasts of serial speed that there are [graphene](!Wikipedia) transistor prototypes going as high as [100GHz](http://www.pnas.org/content/109/29/11588.full "'High-frequency self-aligned graphene transistors with transferred gate stacks', Cheng et al 2012")? Offhand, I don't know of any compelling argument to the effect that there are no large constant-factor speedups possible for multiplication/successive-squaring. Indeed, the general approach of exponentiation and factoring has to worry about the fact that the complexity of factoring has never been proven (and could still be very fast) and that there are speedups with quantum techniques like [Shor's algorithm](!Wikipedia).
[^rainbow]: eg. the 2008 Graves thesis, ["High performance password cracking by implementing rainbow tables on nVidia graphics cards (IseCrack)"](https://www.iac.iastate.edu/mediawiki/images/b/b3/Cryptohaze.pdf), which claims a 100x speedup over CPU generation of rainbow tables, or the actively developed utility [RainbowCrack](http://www.project-rainbowcrack.com/) (from which you can even [buy](http://www.project-rainbowcrack.com/buy.php) the generated rainbow tables).
# Memory-bound hashes
Of course, one could ask the same question of my original proposal - what makes you think that hashing can't be sped up? You already supplied an example where cryptographic hashes were sped up astonishingly by a GPU, Bitcoin mining.
The difference is that hashing can be made to stress the weakest part of any modern computer system, the [memory hierarchy](!Wikipedia)'s terrible bandwidth and latency[^latency]; the hash can blow out the fast die-level caches (the [CPU](!Wikipedia "Processor register") & its [cache](!Wikipedia "CPU cache")) and force constant fetches from the main RAM. Such memory-bound functions were devised for anti-spam proof-of-work systems that wouldn't unfairly penalize cellphones & PDAs while still being costly on desktops & workstations (which rules out the usual functions like [Hashcash](!Wikipedia) that stress the CPU). For example, take the 2003 ["On Memory-Bound Functions for Fighting Spam"](http://research.microsoft.com/pubs/65154/crypto03.pdf); from the abstract:
> Burrows suggested that, since memory access speeds vary across machines much less than do CPU speeds, memory-bound functions may behave more equitably than CPU-bound functions; this approach was first explored by Abadi, Burrows, Manasse, and Wobber [8]. We further investigate this intriguing proposal. Specifically, we...
>
> 2\. Provide an abstract function and prove an asymptotically tight amortized lower bound on the number of memory accesses required to compute an acceptable proof of effort; specifically, we prove that, on average, the sender of a message must perform many unrelated accesses to memory, while the receiver, in order to verify the work, has to perform significantly fewer accesses;
> 3\. Propose a concrete instantiation of our abstract function, inspired by the RC4 stream cipher;
> 4\. Describe techniques to permit the receiver to verify the computation with no memory accesses;
> 5\. Give experimental results showing that our concrete memory-bound function is only about four times slower on a 233 MHz settop box than on a 3.06 GHz workstation, and that speedup of the function is limited even if an adversary knows the access sequence and uses optimal off-line cache replacement.
Abadi et al 2005, ["Moderately hard, memory-bound functions"](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.78.6879&rep=rep1&type=pdf), develop more memory-bound functions and benchmark them (partially replicated by [Das & Doshi](http://www.cs.jhu.edu/~rdas/finalreport.pdf) 2004):
> ...we give experimental results for five modern machines that were bought within a two-year period in 2000-2002, and which cover a range of performance characteristics. All of these machines are sometimes used to send e-mail - even the settop box, which is employed as a quiet machine in a home...None of the machines have huge caches - the largest was on the server machine, which has a 512KB cache. Although the clock speeds of the machines vary by a factor of 12, the memory read times vary by a factor of only 4.2. This measurement confirms our premise that memory read latencies vary much less than CPU speeds.
>
> ...At the high end, the server has lower performance than one might expect, because of a complex pipeline that penalizes branching code. In general, higher clock speeds correlate with higher performance, but the correlation is far from perfect...Second, the desktop machine is the most cost-effective one for both CPU-bound and memory-bound computations; in both cases, attackers are best served by buying the same type of machines as ordinary users. Finally, the memory-bound functions succeed in maintaining a performance ratio between the slowest and fastest machines that is not much greater than the ratio of memory read times.
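The access pattern these papers rely on can be sketched in a few lines of Python. This is not the construction from either paper: the table-filling scheme, the mixing step, and the names `build_table` and `walk` are assumptions chosen for brevity. In an interpreted language the interpreter overhead would swamp any memory latency, so the sketch shows only the data dependency, not the timing behavior.

```python
import hashlib


def build_table(seed: bytes, size: int) -> list:
    """Fill a table with pseudo-random 32-bit words derived from the seed."""
    table = []
    counter = 0
    while len(table) < size:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        # Each 32-byte digest yields eight 32-bit words.
        table.extend(int.from_bytes(block[i:i + 4], "big") for i in range(0, 32, 4))
        counter += 1
    return table[:size]


def walk(table: list, start: int, steps: int) -> int:
    """Chase data-dependent indices: the next address is unknown until the read returns."""
    size = len(table)
    idx = start % size
    acc = 0
    for _ in range(steps):
        word = table[idx]
        acc = ((acc * 31) ^ word) & 0xFFFFFFFF
        idx = (word + acc) % size  # next index depends on the word just read
    return acc


table = build_table(b"demo seed", 1 << 16)
print(walk(table, 0, 100_000))
```

Because each index is computed from the word just fetched, the reads cannot be issued in parallel or prefetched; once the table is much larger than the cache, the running time of a compiled version would be set by the number of cache misses rather than by clock speed.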
Colin Percival continues the general trend in the context of finding password schemes which are resistant to cheap brute-forcing, inventing [`scrypt`](http://www.tarsnap.com/scrypt.html) in the 2009 paper ["Stronger Key Derivation via Sequential Memory-Hard Functions"](http://www.tarsnap.com/scrypt/scrypt.pdf). Percival notes that designing a really good memory-bound function requires not overly relying on *latency* since his proofs do not incorporate latency, although in practice this might not be so bad:
> Existing widely used hash functions produce outputs of up to 512 bits (64 bytes), closely matching the cache line sizes of modern CPUs (typically 32-128 bytes), and the computing time required to hash even a very small amount of data (typically 200-2000 clock cycles on modern CPUs, depending on the hash used) is sufficient that the memory latency cost (typically 100-500 clock cycles) does not dominate the running time of ROMix.
>
> However, as semiconductor technology advances, it is likely that neither of these facts will remain true. Memory latencies, measured in comparison to CPU performance or memory bandwidth, have been steadily increasing for decades, and there is no reason to expect that this will cease — to the contrary, switching delays impose a lower bound of Ω(log N ) on the latency of accessing a word in an N-byte RAM, while the speed of light imposes a lower bound of Ω( √N ) for 2-dimensional circuits. Furthermore, since most applications exhibit significant locality of reference, it is reasonable to expect cache designers to continue to increase cache line sizes in an attempt to trade memory bandwidth for (avoided) memory latency.
>
> In order to avoid having ROMix become latency-limited in the future, it is necessary to apply it to larger hash functions. While we have only proved that ROMix is sequential memory-hard under the Random Oracle model, by considering the structure of the proof we note that the full strength of this model does not appear to be necessary.
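The ROMix construction discussed in the quote is short enough to sketch. The version below follows the two-loop structure from Percival's paper but makes two substitutions that are assumptions of this sketch: SHA-256 stands in for the paper's larger mixing function, and the input is hashed once up front so that the state is always 32 bytes.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Stand-in hash (assumption: SHA-256 instead of the mixing function scrypt uses)."""
    return hashlib.sha256(data).digest()


def romix(block: bytes, n: int) -> bytes:
    """Sketch of ROMix: fill n slots sequentially, then read them in a data-dependent order."""
    x = H(block)  # normalize the input to the 32-byte state size (sketch-only step)
    v = []
    # Phase 1: build the table V[0..n-1] by iterated hashing.
    for _ in range(n):
        v.append(x)
        x = H(x)
    # Phase 2: n lookups whose indices depend on the current state.
    for _ in range(n):
        j = int.from_bytes(x, "little") % n  # "Integerify" the state, reduce mod n
        x = H(bytes(a ^ b for a, b in zip(x, v[j])))
    return x


print(romix(b"password", 1024).hex())
```

An attacker who stores only part of `v` has to recompute missing entries by re-running the first loop from the nearest stored point, which is the time-memory tradeoff the paper's proof bounds. With a 32-byte state each lookup touches less than one typical cache line, which is the situation the quoted passage warns will become latency-limited.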
Percival constructs a password algorithm on his new hash function and then calculates costs using 2002 circuit prices:
> When used for interactive logins, it is 35 times more expensive than bcrypt and 260 times more expensive than PBKDF2; and when used for file encryption — where, unlike bcrypt and PBKDF2, scrypt uses not only more CPU time but also increases the die area required — scrypt increases its lead to a factor of 4000 over bcrypt and 20000 over PBKDF2.
That is quite a difference between the hashes, especially considering that bcrypt and PBKDF2 were already engineered to have adjustable difficulty for similar reasons to our time-lock crypto puzzles.
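As a rough illustration of how a memory-hard hash might slot into a serial hash chain for a time-lock, here is a sketch using `hashlib.scrypt` from the Python standard library (available when Python is built against OpenSSL 1.1 or later). The function name `memory_hard_chain`, the per-link salt, and the small cost parameters are assumptions for illustration, not a vetted design.

```python
import hashlib


def memory_hard_chain(seed: bytes, links: int,
                      n: int = 2**10, r: int = 8, p: int = 1) -> bytes:
    """Feed each scrypt output into the next call, so the links must be computed in order.

    n, r, p are scrypt's cost parameters; each call needs roughly 128 * n * r bytes
    (about 1 MiB with these defaults).
    """
    key = seed
    for i in range(links):
        # Assumption of this sketch: the link index serves as a per-link salt.
        key = hashlib.scrypt(key, salt=i.to_bytes(8, "big"), n=n, r=r, p=p, dklen=32)
    return key


final_key = memory_hard_chain(b"initial seed", 5)
print(final_key.hex())
```

Raising `links` stretches the serial time, while raising `n` raises the memory each link demands; note that a plain chain like this costs its creator exactly as much to compute as it costs the eventual solver.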
[^latency]: From Abadi 2005:
> Fast CPUs run much faster than slow CPUs - consider a 2.5GHz PC versus a 33MHz Palm PDA. Moreover, in addition to high clock rates, higher-end computer systems also have sophisticated pipelines and other advantageous features. If a computation takes a few seconds on a new PC, it may take a minute on an old PC, and several minutes on a PDA. That seems unfortunate for users of old PCs, and probably unacceptable for users of PDAs...we are concerned with finding moderately hard functions that most computer systems will evaluate at about the same speed. We envision that high-end systems might evaluate these functions somewhat faster than low-end systems, perhaps even 2-10 times faster (but not 10-100 faster, as CPU disparities might imply). Moreover, the best achievable price-performance should not be significantly better than that of a typical legitimate client...A memory-bound function is one whose computation time is dominated by the time spent accessing memory. The ratios of memory latencies of machines built in the last five years is typically no greater than two, and almost always less than four. (Memory throughput tends to be less uniform, so we focus on latency.) A memory-bound function should access locations in a large region of memory in an unpredictable way, in such a way that caches are ineffectual...Other possible applications include establishing shared secrets over insecure channels and the timed release of information, using memory-bound variants of Merkle puzzles [Merkle 1978] and of time-lock puzzles [May 1993; Rivest et al. 1996], respectively. We discuss these also in section 4.
And here I thought I was being original in suggesting memory-bound functions for time-lock puzzles! Truly, "there is nothing new under the sun".
# External links
- ["Time capsule cryptography?"](http://crypto.stackexchange.com/questions/606/time-capsule-cryptography) (Cryptography [StackExchange](!Wikipedia))
- [Hacker News discussion](https://news.ycombinator.com/item?id=6508179)
- [Reddit discussion](http://www.reddit.com/r/compsci/comments/1nx0n5/timelock_encryption/)
<!-- TODO
https://en.bitcoin.it/wiki/User:Gmaxwell/alt_ideas
POW which turns the distributed computation into ticking for timelock encryption
An infinite sequence of nothing-up-my-sleeve numbers is taken as an infinite sequence of ECC public keys. Searching the pow involves finding distinguished points along a Pollard's rho DLP solution trying to crack the key. When the key is cracked the problem is advanced to the next key.
People can then encrypt messages to these keys and sometime in the future the network will crack them, achieving a timelock.
Probably incompatible with merged mining and other POW schemes.
Making the difficulty adaptive either makes far in the future messages impossible (because the problem size wouldn't be known long in advance), or requires increasingly big headers as the difficulty would require working on multiple problems concurrently.
-->