D J Bernstein’s draft paper Understanding Brute Force argues that the way we currently measure the cost of cryptanalytic attacks is highly misleading. The paper is a good example of Bernstein’s unconventional style, and mixes quite informal writing with very formal and precise descriptions of cryptanalytic methods and costs. Though his conclusions are correct, I think he hasn’t quite put his finger on how people have come to be misled in the past, so I shall have a go here at arguing what I think is the same point in different words.
The traditional argument goes like this: an n-processor machine can never solve a problem more than n times faster than the same machine with only one processor, and sometimes it will be much less than n times faster because some problems don’t parallelize well. So we give the attacker the best advantage by measuring the cost of their attack on a uniprocessor machine, and that’s the best way forward.
The premise is (approximately) correct, but the conclusion is false. The problem is that this reasoning ignores memory cost. Specifically, if your memory costs are high you might be better off spending less on memory and more on processors. By making this error, you can overestimate the efficacy of a cryptanalytic attack.
To highlight this, imagine an attack on a block cipher with a 128-bit key which takes 2^90 time and requires 2^90 storage. By current standards, this is a publishable attack which would be considered to “break” the block cipher. But compare this attack with a straightforward parallel brute-force attack: for far less cost, one could build a machine with 2^64 dedicated processors, which would brute-force through the keyspace of the cipher in 2^64 steps. Clearly, the cipher is not really any weaker for the existence of this attack.
Or consider the attacks on triple DES presented in my colleague Stefan Lucks’ paper, Attacking Triple Encryption. The final attack presented requires only time equivalent to 2^90 DES encryptions, which may seem a huge step forward over the 2^168 steps required for a brute-force attack. However, it also requires 2^88 memory. Considered in the above light, this attack seems less impressive: if a dedicated Triple-DES key cracker can be built for the cost of 2^8 units of storage, then brute force will beat this attack on time and cost quite handily. As it is, the cost of such a unit is likely to be somewhat more, so this attack survives, but barely.
This brings us to our first conclusion: the true measure of the cost of an attack is not time, but the product of time and hardware cost. Only by this measure are our comparisons with brute force fair.
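Since all these costs are powers of two, the comparison is easiest in log2 arithmetic. Here is a minimal sketch of the time-times-hardware metric applied to the hypothetical 2^90/2^90 attack above (the helper name `log2_cost` is mine, not from Bernstein's paper):

```python
def log2_cost(log2_time, log2_hardware):
    """Attack cost as log2(time * hardware): the time-hardware-product metric."""
    return log2_time + log2_hardware

# Hypothetical attack on a 128-bit cipher: 2^90 time, 2^90 storage.
attack = log2_cost(90, 90)       # 180

# Parallel brute force: 2^64 processors searching a 2^128 keyspace in 2^64 steps.
brute_force = log2_cost(64, 64)  # 128

# By the product metric, brute force is far cheaper than the "attack".
assert brute_force < attack
```

On this measure the published attack costs 2^180 while plain parallel brute force costs only 2^128, which is exactly why the attack does not really weaken the cipher.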
Once we accept this measure, however, we open ourselves up to the reverse error; if we persist in considering only serial attacks, we will now underestimate the power available to attackers. Though I don’t understand it well enough to know for sure, it’s possible that the above Triple-DES attack might parallelize quite nicely. If 2^70 processors each with 2^18 memory can perform the attack with the full efficacy of parallelism, then only 2^20 steps are needed to perform the attack, and this is a far lower time and hardware cost than any brute force search. Suddenly this attack becomes far more tempting – but not all attacks are amenable to such parallelization, and the details have to be considered carefully.
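If the attack really did parallelize with full efficacy (which, as noted, is an assumption rather than an established fact), the arithmetic would run as follows; the function name `parallel_cost` is illustrative:

```python
def parallel_cost(log2_serial_time, log2_processors, log2_mem_per_proc):
    """Log2 time and log2 time*hardware cost, assuming perfect parallel speedup."""
    log2_time = log2_serial_time - log2_processors      # each processor does 1/n of the work
    log2_hardware = log2_processors + log2_mem_per_proc # processors plus their memory
    return log2_time, log2_time + log2_hardware

# Hypothetical parallelization of the 2^90-step Triple-DES attack:
# 2^70 processors, each with 2^18 memory (total memory still 2^88).
t, cost = parallel_cost(90, 70, 18)
assert t == 20      # 2^20 steps
assert cost == 108  # product cost 2^108, far below the 2^168 of brute force
```

Note that the total memory is unchanged at 2^88; what drops is the time, and with it the time-hardware product, from 2^178 for the serial version to 2^108.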
From this we draw our second conclusion: it is insufficient to consider serial attacks, and assume that parallel attacks are implicitly covered. Parallel attacks must be considered explicitly. The existence of an attack with hardware cost h and time cost t (total cost ht) does not imply either the existence of an attack with hardware cost h/2 and time cost 2t or one with hardware cost 2h and time cost t/2.
However, one should be careful in asserting that a parallel attack can be “more powerful” than a serial attack. This assertion can be misleading, because it seems to contradict our opening premise which was correct. Rather, this assertion should be taken as the combination of our two conclusions: the correct measure of an attack is not time but the product of time and hardware cost, and it is by this measure that we might find that against some cryptosystems there exist parallel attacks that outperform any serial attack.
Bernstein’s paper also highlights a third point, which touches only tangentially on these arguments about parallelism but also adds to our understanding of brute force: if our attacker has multiple cryptanalytic targets, their life becomes proportionally easier. If our attacker need only break any one of 2^20 keys to defeat the system, this can make the job up to 2^20 times easier, because each guess at a key has a 2^20 times greater chance of being right. This insight was first presented by Biham in 1996 (How to Forge DES-Encrypted Messages in 2^28 Steps), and is a key component of the attack presented in Extracting a 3DES key from an IBM 4758: the attackers find a way to break into the system if any of 2^14 DES keys can be discovered, and this makes the job of brute-forcing DES sufficiently easy (only 2^42 steps) that it can be done on an off-the-shelf FPGA evaluation board costing under $1000. Bernstein demonstrates that these advantages can be preserved in a highly parallel attack, and goes on to argue convincingly that the best countermeasure is a larger keyspace – echoing the conclusion of Biham’s original paper on the subject, which suggests that our keys should be twice as long as they need to be to exhaust the computational resources of our attacker.
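The multiple-target advantage is easy to see in a toy Monte Carlo simulation. This sketch uses a deliberately tiny 14-bit keyspace and 16 targets so it runs quickly; the parameters are illustrative, not drawn from any of the papers above:

```python
import random

def guesses_until_hit(log2_keyspace, n_targets, rng):
    """Count random key guesses until any one of n_targets keys is found."""
    keyspace = 1 << log2_keyspace
    targets = set(rng.sample(range(keyspace), n_targets))
    guesses = 0
    while True:
        guesses += 1
        if rng.randrange(keyspace) in targets:
            return guesses

rng = random.Random(42)
trials = 100
single = sum(guesses_until_hit(14, 1, rng) for _ in range(trials)) / trials
multi = sum(guesses_until_hit(14, 16, rng) for _ in range(trials)) / trials
# With 16 targets, the average number of guesses drops by roughly a factor of 16.
assert multi < single
```

Averaged over enough trials, the speedup tends toward the number of targets, which is exactly Biham’s observation scaled down to toy size.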
This leads to an interesting open problem in cryptography. The current effective working definition of what it means for a cipher to be “unbroken” is that there are no distinguishing attacks more effective than brute force. Bernstein suggests that we should modify this definition to allow a cipher to have a key length greater than its design strength; one could declare that a cipher with a 256-bit key has a design strength of only 128 bits, so an attack that required 2^160 resources would be considered irrelevant even though it greatly outperforms brute force. The purpose of this distinction would be to gain the resistance against multiple-target cryptanalysis that comes with larger keys without paying the performance cost associated with resisting the impossibly powerful attacker who can muster up to 2^256 computing resources.
However, this leaves a problem of defining exactly what security up to a design strength of 128 bits means. If one defines it in the straightforward way – that the cipher is indistinguishable from random by an attacker who can expend up to 2^128 resources – then one can achieve this by discarding half the key bits and using a strong 128-bit cipher, which would lend no extra resistance to multiple-target attacks and defeats the purpose of extending the key. I see no more elegant way to do it than directly considering an attacker with access to multiple targets, but I hope that further research will shed more light on this useful question.