As discussed in the previous entry in this series, discovering the plaintext passphrase for a wireless network is done by making a series of educated guesses. The least desirable approach is a brute force attack, namely trying every single possible combination. That’s because brute forcing, while effective, may not be computationally feasible with available resources.
So the approach that many take is to work through a methodical set of guesses at the passphrase before falling back on brute force as a last resort. The attack typically starts with a dictionary, which supplies a set of words that might be used in the guesses. These words might be used directly out of the dictionary (say a word like pomegranate), or a set of rules may be applied to modify the words and try different combinations.
A typical set of rules would try capitalizing different parts of the word and seeing how that worked. For the example with pomegranate, one might also try “Pomegranate”, “pOmegranate”, “poMegranate”, and so on. Another set of rules might try adding various numbers to the passphrase, such as putting a single digit within the word, then a double digit, and so on.
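As a rough illustration of the rules just described, here is a minimal Python sketch. The function names (capitalize_each_position, insert_digits, candidates) are made up for this post and do not come from any particular cracking tool; real tools use their own rule languages. The sketch only generates candidate strings from one dictionary word.

```python
def capitalize_each_position(word):
    """Yield the word with exactly one letter upper-cased, one position at a time."""
    for i, ch in enumerate(word):
        if ch.isalpha():
            yield word[:i] + ch.upper() + word[i + 1:]


def insert_digits(word, width):
    """Yield the word with every zero-padded number of `width` digits
    inserted at every position (start, middle, and end)."""
    for n in range(10 ** width):
        digits = str(n).zfill(width)
        for pos in range(len(word) + 1):
            yield word[:pos] + digits + word[pos:]


def candidates(word):
    """Hypothetical rule set: the bare word, single-letter capitalizations,
    then single-digit and double-digit insertions."""
    yield word
    yield from capitalize_each_position(word)
    yield from insert_digits(word, 1)
    yield from insert_digits(word, 2)


guesses = list(candidates("pomegranate"))
print(len(guesses), guesses[:4])
```

Even these two simple rules turn one dictionary word into over a thousand guesses, which is why the choice and ordering of rules matters so much.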
Both the dictionaries and the rules used in the past were crude by today's standards. These lists might have reflected ways that people chose passphrases, but they were still not effective against passphrases that were not in the dictionary. Without a feedback loop linking the real world with the theoretical one, guessing remained imprecise, and while weak passwords were easily broken, strong passwords were not.
In recent years, there has been a renaissance in password cracking. Efforts to go after databases of usernames led to a large trove of unencrypted passwords. With a growing list of real world passphrases in hand, researchers now have data on how to build more accurate word lists and rules based on research rather than theory, thus improving the accuracy of the guesses and reducing the time spent on combinations that are unlikely to yield good results.
For instance, when asked to add numbers to a passphrase for complexity, people would often use the two-digit or four-digit representation of the year they were born. Thus, instead of having to try every single number from 0000 to 9999 in a four-digit guess, it's far more useful to focus first on the hundred or so years between 1900 and 2000. This single rule reduces the number of guesses roughly a hundredfold. Given enough computing power, one could try all combinations, but better rules make better use of the available resources by trying the best guesses first.
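The arithmetic behind the birth-year rule can be sketched in a few lines of Python. The year range (1900 through 1999) and the name four_digit_suffixes are assumptions chosen for illustration, not values taken from any real word list. The sketch puts the likely years at the front of the queue and leaves the remaining four-digit numbers for later.

```python
from itertools import chain

# Assumption for illustration: birth years fall in 1900..1999.
LIKELY_YEARS = range(1900, 2000)


def four_digit_suffixes():
    """Yield all four-digit suffixes, likely birth years first,
    then every remaining value from 0000 to 9999."""
    likely = [f"{year:04d}" for year in LIKELY_YEARS]
    seen = set(likely)
    rest = (f"{n:04d}" for n in range(10_000) if f"{n:04d}" not in seen)
    return chain(likely, rest)


suffixes = list(four_digit_suffixes())

# Ratio of the full four-digit space to the prioritized slice.
reduction = 10_000 / len(LIKELY_YEARS)
print(len(suffixes), suffixes[:3], reduction)
```

Nothing is thrown away here: the full space is still covered eventually, but the guesses most likely to succeed are tried first.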
With wireless networks, it turns out there are additional rules that make the guesses more accurate. Entering special characters is not very easy on a mobile device, and even entering capital letters can be trying. As a result, people tend to choose passphrases that stick to one keyboard mode. For instance, a wireless passphrase is more likely to be all lowercase than a mix of upper and lowercase. Since WPA/WPA2 passphrases must be a minimum of 8 characters, many people will use their phone number as a passphrase so that 8+ characters can be entered from a single keyboard mode.
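One way to picture these wireless-specific rules is as a filter and a sort over a candidate list, sketched below in Python. The helper names and the four-way split into keyboard modes are simplifying assumptions made for this post, and the sample candidates are made up. The 8-character minimum comes from the text above; the 63-character upper bound is the WPA/WPA2 passphrase limit.

```python
def keyboard_modes(passphrase):
    """Return the set of keyboard modes needed to type the passphrase
    (a simplified model: lower, upper, digit, symbol)."""
    modes = set()
    for ch in passphrase:
        if ch.islower():
            modes.add("lower")
        elif ch.isupper():
            modes.add("upper")
        elif ch.isdigit():
            modes.add("digit")
        else:
            modes.add("symbol")
    return modes


def is_valid_wpa_length(passphrase):
    """WPA/WPA2 passphrases are 8 to 63 characters long."""
    return 8 <= len(passphrase) <= 63


def prioritize(candidates):
    """Drop candidates of invalid length, then try those needing
    the fewest keyboard modes first (the sort is stable)."""
    valid = [c for c in candidates if is_valid_wpa_length(c)]
    return sorted(valid, key=lambda c: len(keyboard_modes(c)))


# Made-up sample candidates, including a phone-number-style string.
sample = ["Pomegranate1!", "pomegranate", "apple", "5551234567", "pomegranate99"]
ordered = prioritize(sample)
print(ordered)
```

The too-short word is discarded outright, and the all-lowercase and all-digit candidates move to the front of the queue.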
In addition to the improvements in passphrase word lists and rules, yet another set of developments led to massive improvements in passphrase recovery: improvements in technology. I’ll cover this topic in my next entry.
Meanwhile, if you're interested in the art of passphrase attacks, there is a lot of great coverage on Ars Technica. See the article, "Why Passwords Have Never Been Weaker, and Crackers Have Never Been Stronger" for more insights.