CRYPTO-GRAM, March 15, 2005
Bruce Schneier
SHA-1 Broken
SHA-1 has been broken. Not a reduced-round version. Not a simplified
version. The real thing.
The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu
(mostly from Shandong University in China) have been quietly
circulating a paper describing their results:
* collisions in the full SHA-1 in 2^69 hash operations, much less than
  the brute-force attack of 2^80 operations based on the hash length.
* collisions in SHA-0 in 2^39 operations.
* collisions in 58-round SHA-1 in 2^33 operations.
This attack builds on previous attacks on SHA-0 and SHA-1, and is a
major, major cryptanalytic result: the first attack faster than
brute-force against SHA-1.
I wrote about SHA, and the need to replace it, last September. Aside
from the details of the new attack, everything I said then still
stands. I'll quote from that article, adding new material where
appropriate.
"One-way hash functions are a cryptographic construct used in many
applications. They are used in conjunction with public-key algorithms
for both encryption and digital signatures. They are used in integrity
checking. They are used in authentication. They have all sorts of
applications in a great many different protocols. Much more than
encryption algorithms, one-way hash functions are the workhorses of
modern cryptography.
"In 1990, Ron Rivest invented the hash function MD4. In 1992, he
improved on MD4 and developed another hash function: MD5. In 1993, the
National Security Agency published a hash function very similar to MD5,
called SHA (Secure Hash Algorithm). Then, in 1995, citing a newly
discovered weakness that it refused to elaborate on, the NSA made a
change to SHA. The new algorithm was called SHA-1. Today, the most
popular hash function is SHA-1, with MD5 still being used in older
applications.
"One-way hash functions are supposed to have two properties. One,
they're one way. This means that it is easy to take a message and
compute the hash value, but it's impossible to take a hash value and
recreate the original message. (By 'impossible' I mean 'can't be done
in any reasonable amount of time.') Two, they're collision free. This
means that it is impossible to find two messages that hash to the same
hash value. The cryptographic reasoning behind these two properties is
subtle, and I invite curious readers to learn more in my book Applied
Cryptography.
"Breaking a hash function means showing that either -- or both -- of
those properties are not true."
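As a small illustration of the "easy" direction described in the quoted passage, the sketch below computes SHA-1 digests with Python's standard hashlib module. The helper name sha1_hex and the sample messages are illustrative choices, not anything from the original essay.

```python
import hashlib


def sha1_hex(message: bytes) -> str:
    """Return the SHA-1 digest of `message` as 40 hex characters (160 bits)."""
    return hashlib.sha1(message).hexdigest()


# Two messages that differ in a single character.
first = sha1_hex(b"abc")
second = sha1_hex(b"abd")

print(first)
print(second)
print(len(first) * 4, "bits")  # each hex character encodes 4 bits
```

Computing the digest takes microseconds. Going the other way, from a digest back to a message, or finding two messages with the same digest, is what is supposed to be infeasible.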
Last month, three Chinese cryptographers showed that SHA-1 is not
collision-free. That is, they developed an algorithm for finding
collisions faster than brute force.
SHA-1 produces a 160-bit hash. That is, every message hashes down to a
160-bit number. Given that there are an infinite number of messages
that hash to each possible value, there are an infinite number of
possible collisions. But because the number of possible hashes is so
large, the odds of finding one by chance are negligibly small (one in
2^80, to be exact). If you hashed 2^80 random messages, you'd expect to
find one pair that hashed to the same value. That's the "brute force" way of
finding collisions, and it depends solely on the length of the hash
value. "Breaking" the hash function means being able to find collisions
faster than that. And that's what the Chinese did.
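To make the brute-force idea concrete, here is a minimal sketch that runs the same kind of search against a deliberately weakened hash: SHA-1 truncated to its top 32 bits. The truncation, the counter-based messages, and the function names are all assumptions made for the demonstration. A 32-bit hash should yield a collision after roughly 2^16 hashes, just as the full 160-bit hash needs roughly 2^80.

```python
import hashlib
from itertools import count


def truncated_sha1(message: bytes, bits: int) -> int:
    """Return the top `bits` bits of the SHA-1 digest as an integer."""
    digest = int.from_bytes(hashlib.sha1(message).digest(), "big")
    return digest >> (160 - bits)


def birthday_collision(bits: int):
    """Hash distinct messages until two of them share a truncated digest.

    Returns (first_message, second_message, number_of_hashes_computed).
    """
    seen = {}  # truncated digest -> message that produced it
    for i in count():
        message = str(i).encode()
        h = truncated_sha1(message, bits)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message


m1, m2, tries = birthday_collision(32)
print(m1, m2, tries)
```

The search knows nothing about the internals of SHA-1. Its cost depends only on the output length, which is why it serves as the baseline that a real cryptanalytic attack has to beat.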
They can find collisions in SHA-1 in 2^69 calculations, about 2,000
times faster than brute force. Right now, that is just on the far edge
of feasibility with current technology. Two comparable massive
computations illustrate that point.
In 1998, a group of cryptographers built a DES cracker. It was able to
perform 2^56 DES operations in 56 hours. The machine cost $250K to
build, although duplicates could be made in the $50K-$75K
range. Extrapolating that machine using Moore's Law, a similar machine
built today could perform 2^60 calculations in 56 hours, and 2^69
calculations in three and a quarter years. Or, a machine that cost
$25M-$38M could do 2^69 calculations in the same 56 hours.
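The arithmetic behind these hardware figures can be checked in a few lines. This is only a sketch that takes the essay's numbers as given (2^60 calculations in 56 hours for a present-day machine, $50K-$75K per duplicate); the variable names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365.25

# The attack's advantage over brute force: 2^80 / 2^69 = 2^11.
speedup_over_brute_force = 2**80 // 2**69

# How much more work 2^69 is than the 2^60 one machine does in 56 hours.
scale = 2**69 // 2**60

# Option 1: one machine, running 512 times as long.
years_single_machine = scale * 56 / HOURS_PER_YEAR

# Option 2: 512 machines, finishing in the same 56 hours.
cost_low = 50_000 * scale
cost_high = 75_000 * scale

print(speedup_over_brute_force)            # 2048
print(scale)                               # 512
print(round(years_single_machine, 2))      # 3.27
print(cost_low, cost_high)                 # 25600000 38400000
```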
On the software side, the main comparable is a 2^64 keysearch done by
distributed.net that finished in 2002. One article put it this way:
"Over the course of the competition, some 331,252 users participated by
allowing their unused processor cycles to be used for key discovery.
After 1,757 days (4.81 years), a participant in Japan discovered the
winning key." Moore's Law means that today the calculation would have
taken one quarter the time -- or have required one quarter the number
of computers -- so today a 2^69 computation would take eight times as
long, or require eight times the computers.
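The same kind of check works for the software estimate. The sketch below uses only the figures quoted above (1,757 days, 331,252 users, and a factor-of-four Moore's Law speedup); the variable names are illustrative.

```python
DAYS_PER_YEAR = 365.25

baseline_days = 1757       # length of the distributed.net 2^64 search
baseline_users = 331_252   # participants in that search
moore_speedup = 4          # the essay's "one quarter the time" estimate

# 2^69 is 32 times the work of 2^64.
work_ratio = 2**69 // 2**64

# Net factor after crediting faster hardware: 32 / 4 = 8.
net_factor = work_ratio // moore_speedup

days_with_same_users = baseline_days * net_factor
users_for_same_days = baseline_users * net_factor

print(net_factor)                                      # 8
print(days_with_same_users)                            # 14056
print(round(days_with_same_users / DAYS_PER_YEAR, 1))  # 38.5
print(users_for_same_days)                             # 2650016
```

In other words, the same volunteer pool would need about 38 years, or about 2.65 million participants would be needed to finish in the original 4.81 years.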
"The magnitude of these results depends on who you are. If you're a
cryptographer, this is a huge deal. While not revolutionary, these
results are substantial advances in the field. The techniques
described by the researchers are likely to have other applications, and
we'll be better able to design secure systems as a result. This is how
the science of cryptography advances: we learn how to design new
algorithms by breaking other algorithms. Additionally, algorithms from
the NSA are considered a sort of alien technology: they come from a
superior race with no explanations. Any successful cryptanalysis
against an NSA algorithm is an interesting data point in the eternal
question of how good they really are in there."
For the average Internet user, this news is not a cause for panic. No
one is going to be breaking digital signatures or reading encrypted
messages anytime soon. The electronic world is no less secure after
these announcements than it was before.
But there's an old saying inside the NSA: "Attacks always get better;
they never get worse." Just as this week's attack builds on other
papers describing attacks against simplified versions of SHA-1, SHA-0,
MD4, and MD5, other researchers will build on this result. The attack
against SHA-1 will continue to improve, as others read about it and
develop faster tricks, optimizations, etc. And Moore's Law will
continue to march forward, making even the existing attack faster and
more affordable.
Jon Callas, PGP's CTO, put it best: "It's time to walk, but not run, to
the fire exits. You don't see smoke, but the fire alarms have gone
off." That's basically what I said last August.
"It's time for us all to migrate away from SHA-1.
"Luckily, there are alternatives. The National Institute of Standards
and Technology already has standards for longer -- and harder to break
-- hash functions: SHA-224, SHA-256, SHA-384, and SHA-512. They're
already government standards, and can already be used. This is a good
stopgap, but I'd like to see more.
"I'd like to see NIST orchestrate a worldwide competition for a new
hash function, like they did for the new encryption algorithm, AES, to
replace DES. NIST should issue a call for algorithms, and conduct a
series of analysis rounds, where the community analyzes the various
proposals with the intent of establishing a new standard.
"Most of the hash functions we have, and all the ones in widespread
use, are based on the general principles of MD4. Clearly we've learned
a lot about hash functions in the past decade, and I think we can start
applying that knowledge to create something even more secure."
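For readers wondering what the NIST alternatives named in the quoted passage look like in practice, here is a minimal sketch using Python's standard hashlib module. The helper name digest_bits and the sample message are illustrative assumptions. The "generic collision search" figure is simply half the digest length, the brute-force bound discussed earlier, and says nothing about cryptanalytic attacks.

```python
import hashlib


def digest_bits(name: str, message: bytes = b"example message") -> int:
    """Return the digest length, in bits, of the named hash function."""
    return hashlib.new(name, message).digest_size * 8


for name in ("sha1", "sha224", "sha256", "sha384", "sha512"):
    bits = digest_bits(name)
    print(f"{name}: {bits}-bit digest, generic collision search ~2^{bits // 2}")
```

Where an application does not store or transmit the hash in a fixed-width field, switching is often as simple as changing the algorithm name. Protocols and file formats that hard-code a 160-bit field need more work.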
Hash functions are the least-well-understood cryptographic primitive,
and hashing techniques are much less developed than encryption
techniques. Regularly there are surprising cryptographic results in
hashing. I have a paper, written with John Kelsey, that describes an
algorithm to find second preimages with SHA-1 -- a technique that
generalizes to almost all other hash functions -- in 2^106
calculations: much less than the 2^160 calculations for brute
force. This attack is completely theoretical and not even remotely
practical, but it demonstrates that we still have a lot to learn about
hashing.
It is clear from rereading what I wrote last September that I expected
this to happen, but not nearly this quickly and not nearly this
impressively. The Chinese cryptographers deserve a lot of credit for
their work, and we need to get to work replacing SHA.
Summary of the paper (the full paper isn't generally available yet):
My original essay:
NIST standard for SHA-224, SHA-256, SHA-384, and SHA-512:
My second-preimages paper:
More hash function news:
Two X.509 certificates with identical MD5 hashes:
Faster MD5 collisions (eight hours on a 1.6 GHz computer):