SHA-1 broken


For those into cryptography, you are probably anxiously wondering what "broken" means in this context. For those who aren't, you're probably wondering what SHA-1 is and why you should care. If you're in the latter group, bear with me a moment, while I speak to the former. Broken means that instead of requiring a brute-force attack of 280 to create a duplicate hash, it can now be done in 269. This according to a blog entry from Bruce Schneier (author of Applied Cryptography, one of my favorite crypto books). It is now expected that a $38-40M machine could be built to break SHA-1 in 56 hours. Still a lot of money, but Moore is on our tail.

For those who are wondering what SHA-1 is, it is an algorithm that is used by computers to create a short digital summary of a longer chunk of data. In particular, it is what is called a hashing algorithm. By using some very interesting math, it creates a small (160-bit or 20-byte) digital fingerprint of a much larger chunk of data, such as a data file. The most interesting part about the math is that two pieces of data that are similar (like two versions of a book) are very likely to have radically different hash values. This makes them perfect for use in comparing files and data and also makes them well- suited for digital signatures.

For those unfamiliar with digital signatures, let's say you have a 10MB file and you want to verify that it is from the person you think it is from. One easy way to do this is to use encryption and a key that you share with the originator. In a public-private key system, the keys are different for encryption and decryption, so you can use the transmitter's public key (they used the private key for encryption) to decrypt the file and you are certain that the file is from them. Thus, the file has been digitally "signed" by using the private key over the entire file.

However, let's say you aren't really interested in encrypting the entire file... you want to put the file up on a web site and let anyone download it, but you would like to be able to verify (upon demand) that the file was created by person or organization you expect. In this case, you would rather have a digital signature. You're willing to take a small (1 in 280) chance that the signature is not correct in exchange for getting a much smaller signature (say, oh, 160 bits). In this case, you need a number that represents reasonably the data and for which any collision would be obvious. Thus, the need for a hash function. By using a hash, such as SHA-1, you can take a 160-bit fingerprint of the file, encrypt it using the private key of the author and supply an external digital signature that proves the authenticity of the data beyond a reasonable doubt.

So, with the problem now in front of us that there are ways to crack the SHA-1 algorithm, calls are going out for new hashing algorithms that will be resistant to those attacks. No immediate cause for alarm, since the cracking is expensive and time consuming, but as time goes on, it'll become more important to continue increasing the security of these functions.