The death of MD5 (and some SSL certificates)
Broken Chain of Trust
Researchers set out to compromise MD5 in an effort to convince people to stop using it. We explain how the attack worked and what this means for you.
Message Digest algorithm 5 (MD5 for short) is a one-way cryptographic hash function. Put in its simplest terms, it takes input, mangles it, and generates a 128-bit value (usually expressed as a 32-character hexadecimal number such as 76ffd163bd23504cfeb873a9c027b2ed). The same input always produces the same output; for example, the input password always hashes to 5f4dcc3b5aa765d61d8327deb882cf99. So why use MD5? When cryptographically signing data (such as email or SSL certificates), it is much more efficient to sign a hash of the data than the entire block of data itself (128 bits compared with a kilobyte or more for an SSL certificate).
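As a minimal sketch of the behavior described above, the following Python snippet uses the standard library's hashlib module to hash the example input. The helper name md5_hex is introduced here for illustration only and is not part of the original article.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hexadecimal string."""
    return hashlib.md5(data).hexdigest()


digest = md5_hex(b"password")
print(digest)           # 5f4dcc3b5aa765d61d8327deb882cf99
print(len(digest) * 4)  # 128 (each hex character encodes 4 bits)

# The same input always produces the same output.
print(md5_hex(b"password") == digest)  # True
```

However large the input, the digest is always 128 bits, which is why a signer works on the digest rather than on the data itself.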
MD5 is widely used. For example, many Linux distributions use it by default to hash password values in the /etc/shadow password file, numerous SSL certificate authorities support it, and many application vendors use it rather than stronger algorithms such as SHA-1 or SHA-256 (hashing algorithms similar in functionality to MD5).
As with any security issue, the choices generally lie on a continuum that ranges from "cheap, easy, insecure, and computationally inexpensive" to "expensive, difficult, secure, and computationally expensive." MD5 falls somewhere in the middle, not so much because of any conscious choice to cut corners, but largely because of its age (it was invented in 1991).
The largest flaw with MD5 is its limited hash size: At 128 bits, it is significantly smaller than many modern hashing algorithms such as SHA-1 (160 bits) or SHA-256 (256 bits). This limited hash size allows attackers to conduct what is known as a "birthday attack." In cryptographic terms, a birthday attack searches for a collision: two different inputs (e.g., two different but validly formed SSL certificate requests) that have the same output after being passed through a hashing function such as MD5. Such collisions must exist because MD5 has only 2^128 possible outputs, and there are obviously more than that many possible inputs (e.g., 100 standard ASCII characters at 8 bits each represent 2^800 possible inputs). Even something as simple as a date stamp and a serial number can easily represent over 2^128 potential inputs.
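To make the birthday effect concrete, here is a toy sketch that looks for two different inputs with the same hash. It is not the researchers' method: it simply truncates MD5 to 24 bits (an assumption made so the search finishes in well under a second) and remembers every value seen until one repeats. The function names and the "request-N" message format are hypothetical. For an n-bit hash, a search like this is expected to succeed after roughly 2^(n/2) attempts, so about 2^12 here and about 2^64 for the full 128 bits.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, bits: int) -> int:
    """Return the leading `bits` bits of the MD5 digest as an integer."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)


def find_collision(bits: int):
    """Birthday search: hash distinct messages until two share a value."""
    seen = {}  # truncated hash -> first message that produced it
    for i in count():
        msg = b"request-%d" % i
        h = truncated_md5(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


first, second, tries = find_collision(24)
print(first, second, tries)
print(truncated_md5(first, 24) == truncated_md5(second, 24))  # True
```

The number of attempts grows with the square root of the number of possible outputs, not with the number itself, which is why a short hash is a liability.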
Realistically, the only thing preventing someone from attacking MD5 is the amount of computational power needed and the resistance of the algorithm to various types of attacks.
Unfortunately, several weaknesses were found in MD5 (some as far back as 1993), and computational power got very cheap much faster than anyone expected, even taking Moore's law into account. Armed with these known weaknesses, a group of researchers set out to attack and compromise MD5 in a way that would finally demonstrate the problem conclusively (and, they hoped, convince people to stop using it).
One of the most public uses of MD5 is in SSL certificate signing; a small group of certificate authorities (such as Thawte, RapidSSL, RSA, and VeriSign Japan) still use MD5, making them vulnerable to this attack. Now all that was needed was for an attacker to create two certificate requests: one a standard and legitimate request for a secure website and the other a certificate with signing authority allowing one to use it to create signed certificates at will.
How the Attack Worked
In a nutshell, the researchers found a certificate authority that issued certificates in a way that allowed an attacker to predict, and thus effectively control, the data the certificate authority placed in the certificate. It's no good for you to create two certificates that have matching MD5 hashes if the certificate authority then adds a timestamp and a random serial number, thus changing the MD5 hash of the certificate. The vulnerable certificate authority used sequential serial numbers (for example, 1001, 1002, 1003) and timestamps that were exactly six seconds in the future from the time the user submitted the certificate request to their website.
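The following sketch illustrates why predictable serial numbers and timestamps matter. Everything in it is an assumption made for illustration: real certificates are ASN.1/DER structures, whereas the to_be_signed helper below uses a made-up flat text encoding, and the host name, serial numbers, and submission time are arbitrary. The point is only that the attacker must know every CA-supplied field in advance, because a single wrong guess changes the hash.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def to_be_signed(subject: str, serial: int, not_before: datetime) -> bytes:
    """Hypothetical flat encoding of the fields the CA hashes and signs."""
    return f"{subject}|{serial}|{not_before.isoformat()}".encode()


def predict_ca_fields(last_serial: int, submit_time: datetime):
    """Guess the CA's fields: next sequential serial, start time 6 s ahead."""
    return last_serial + 1, submit_time + timedelta(seconds=6)


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


# Arbitrary example values, not taken from the real attack.
submit = datetime(2008, 11, 1, 12, 0, 0, tzinfo=timezone.utc)
serial, not_before = predict_ca_fields(1002, submit)

predicted = md5_hex(to_be_signed("www.example.com", serial, not_before))
issued = md5_hex(to_be_signed("www.example.com", 1003, submit + timedelta(seconds=6)))
missed = md5_hex(to_be_signed("www.example.com", 1004, submit + timedelta(seconds=6)))

print(predicted == issued)  # True: the guess matches what the CA signs
print(predicted == missed)  # False: a serial number off by one changes the hash
```

A certificate authority that assigns random serial numbers removes this predictability, because the attacker can no longer prepare the colliding data ahead of time.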
Now all the researchers had to do was find sufficiently cheap computing hardware so that they could calculate a pair of certificates in a reasonable amount of time.
Fortunately, the PlayStation 3 contains a specialized chip called the "Cell" processor that is well suited to calculating a birthday attack, and with a mere 200 machines (about US$80,000 at retail prices) the researchers were able to calculate the initial data needed to find a matching set of certificates in 10 hours. Further computation was needed to generate the certificates, which was done on a quad-core system (in other words, not a very expensive machine).
Ultimately the researchers were able to carry out a successful attack that gave them a certificate that could be used to sign other certificates. Fortunately, because they are the good guys, they had the certificate dates set to 2004 so that it was expired and raised a warning when encountered.
What This Means for You
Although this attack requires a relatively modest budget (approximately $100,000 for hardware), the technological sophistication needed is quite high. Additionally, only a handful of certificate authorities were affected by this problem because the vast majority stopped using MD5 some years ago (when someone finds a theoretical weakness in a security system, a practical exploit is often not far behind).
Although this type of attack is the holy grail of bad guys abusing the web (using it, they can pretend to be your bank or an online store), it is unlikely you will see an attacker creating and using a signing certificate to impersonate websites. The main reason is that there are much easier ways to impersonate a secure website.