Passphrase FAQ: Practical questions
Table of contents
- How long should the passphrase be?
- What if I use all random letters?
- What if I use all random characters?
- What if I use another language?
- What if I use common phrases or quotes?
- What happens if I combine phrases and nonsense phrases?
- Does odd spelling, punctuation and capitalization help?
- What if I use random words?
- Can I use a small dictionary designed for passphrases?
The rule of thumb is to use one character per bit of key needed. In practice you get about 1.2 bits per character of English text, so a 128-bit key needs 128 / 1.2 = 106.667, or 107, letters of text. This assumes normal English structure and a passphrase made up only of lowercase letters and spaces; for the purposes of the calculation, the spaces are ignored. Few of us are willing to type out a line and a half of text every time we use PGP, though. This is where security fails and we use weak passphrases.
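The arithmetic above can be checked with a few lines of Python. This is only a sketch: the 1.2 bits-per-character figure is the estimate quoted above, and the function name is made up for illustration.

```python
import math

def english_chars_needed(key_bits, bits_per_char=1.2):
    """Characters of ordinary English text needed to carry key_bits of entropy.

    bits_per_char defaults to the 1.2 estimate quoted in the text; spaces are
    assumed not to count toward the total.
    """
    return math.ceil(key_bits / bits_per_char)

print(english_chars_needed(128))  # 107
```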
The standard alphabet has 26 letters in it. Doing the math again, we get log(2^128) / log(26) = 27.23 random letters needed. Rounding up means using 28 letters to make the passphrase harder to search than the IDEA key. Memorizing 28 random letters would be tough, but it isn't impossible, and they aren't too bad to type.
A variation on this idea is to use existing words but to remove the vowels or to replace them with characters like '3' for 'e' or '4' for 'a'. It's unclear how much advantage this variation would give you. This is a well-known idea and likely many of these variations are already present in the attackers' dictionaries.
If we use all the printable ASCII characters, we have 95 possible characters to work with. That gives log(2^128) / log(95) = 19.48 random characters for this method. Rounding up again, 20 random characters are needed to make the passphrase harder to search than the IDEA key. Memorizing 20 random characters is still a tough job, and they are rather hard to type.
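The letter and character counts above both come from the same formula, which can be sketched in Python as follows. The function names are illustrative only, and the use of the standard `secrets` module as the random source is an assumption of this sketch, not something the FAQ prescribes.

```python
import math
import secrets
import string

LOWERCASE = string.ascii_lowercase                    # 26 letters
PRINTABLE = "".join(chr(c) for c in range(32, 127))   # 95 printable ASCII characters, space included

def symbols_needed(key_bits, alphabet_size):
    """Random symbols needed so the search space is at least 2**key_bits."""
    return math.ceil(key_bits / math.log2(alphabet_size))

def random_string(alphabet, key_bits=128):
    """Draw just enough uniformly random symbols from alphabet."""
    count = symbols_needed(key_bits, len(alphabet))
    return "".join(secrets.choice(alphabet) for _ in range(count))

print(symbols_needed(128, len(LOWERCASE)))   # 28
print(symbols_needed(128, len(PRINTABLE)))   # 20
print(random_string(LOWERCASE))              # a different 28-letter string on every run
```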
Using your native language is probably an obvious choice. Throughout this FAQ, the data and statistics apply to English text. Using another language or combining languages will change the numbers somewhat. It will not, by itself, make your passphrase harder to guess. Attacking a different language, or even multiple languages, works the same way. The search space is roughly the size of the language's vocabulary, and each added language grows it by roughly the size of that language's average vocabulary. A dictionary attack in another language would run in the same manner as a dictionary attack in English.
Using words from two or more different languages may give you a slight advantage, as an attacker would have to combine all the available dictionaries, which increases the time needed to guess the correct passphrase. Of course, if the attacker knows that you are fluent in, e.g., Dutch and English, this increase in time would be negligible. So use dictionaries of languages no one would associate with you.
The short version on common phrases is: don't use them, ever. A book of quotes may contain 40,000 quotes. You could probably set an old PC XT in a corner and have common phrases checked in a relatively short amount of time without any special hardware. Simple phrases will be the first ones checked. If you are a Star Trek fan, "Beam me up, Scotty" is a bad phrase to use. If you can find the phrase in any published work, then don't use it. A simple background search will reveal what kind of music, books, TV shows, movies, games, hobbies, and everything else you might draw on. All the common phrases will be tried on the first pass of a key search. You can try 40,000 quotes using unmodified PGP in about 2.4 days. See
Combining phrases extends the phrase search some. Nonsense phrases will also slow down a brute force search. A smart attack would take advantage of normal phrase structure. Ordering nouns, verbs, adverbs, adjectives and all the other components of a sentence would be tried in a natural order. A good nonsense phrase begins to appear to be random as far as a brute force search goes, but it isn't really random.
A popular trick is to substitute digits for letters, or to randomly capitalize certain letters. Using this kind of "0dd sp3LLing5 and CaP!tal!ZaTiOn" will extend the search by about 1 million tries per word. Modifying the numbers for passphrases means you probably get more than 8 million (1 million per word) for a decent passphrase. Capitalization at random will cause word length dependent permutations. Adding a single digit 0-9 to a word multiplies the dictionary size by 10. This is a small gain but in some cases may be worth the trouble. Substituting 3 for E, 1 for I, 5 for S and 2 for Z adds the numbers to the possible alphabet. Adding the numbers 0-9 increases the alphabet to 36 characters. Switching letters, letter rotations, letter shifts, and other word scrambling won't help the randomness but they do slow the brute force search some. You can approach a random looking passphrase in this manner.
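To get a feel for how much these tricks add, the sketch below counts variants for a single word under one simple model: every letter is independently upper or lower case, and one digit 0-9 is appended. This model and the function names are assumptions made for illustration; it is not the calculation behind the 1 million tries per word figure above.

```python
import math

def capitalization_variants(word):
    """Number of upper/lower case patterns for a word: 2 per letter."""
    letters = sum(1 for ch in word if ch.isalpha())
    return 2 ** letters

def appended_digit_variants(variants):
    """Appending one digit 0-9 multiplies the number of candidates by 10."""
    return variants * 10

word = "passphrase"
caps = capitalization_variants(word)
total = appended_digit_variants(caps)
# Express the gain in bits so it can be compared with the 128-bit target.
print(word, caps, total, round(math.log2(total), 1))
```

Expressed in bits, the gain per word is modest next to a 128-bit key, which fits the remark above that this is a small gain.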
The Random House Dictionary (paperback, Ballantine Books, 1980) has around 74,000 words in it. With a 128-bit key size we then need log(2^128) / log(74,000) = 7.91 random words from our dictionary. Rounding up, you need 8 random words to make the passphrase harder to search than the IDEA key. A brute force dictionary attack will then take slightly longer than a brute force attack on the IDEA key. This is a decent way to generate a passphrase, except that it can be hard to remember. It is pretty easy to type, though.
This approach has been used in the Diceware method of passphrase generation.
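A minimal sketch of the random-word approach in Python follows. It uses the standard `secrets` module as a software stand-in for physical dice, which is an assumption of this sketch and not how Diceware itself is carried out. `DEMO_WORDS` is a hypothetical 16-word toy list used only so the example runs; a real list would hold thousands of words.

```python
import math
import secrets

# Hypothetical toy word list, for demonstration only.
DEMO_WORDS = [
    "apple", "brick", "cloud", "delta", "ember", "flint", "grape", "harbor",
    "ivory", "jolly", "kettle", "lemon", "maple", "north", "olive", "pearl",
]

def word_count_for(key_bits, dictionary_size):
    """Random words needed so the search space is at least 2**key_bits."""
    return math.ceil(key_bits / math.log2(dictionary_size))

def random_passphrase(wordlist, key_bits=128):
    """Pick words uniformly at random, with repetition allowed."""
    count = word_count_for(key_bits, len(wordlist))
    return " ".join(secrets.choice(wordlist) for _ in range(count))

print(word_count_for(128, 74_000))    # 8
print(random_passphrase(DEMO_WORDS))  # 32 words, since each adds only 4 bits
```

The words are drawn with repetition allowed, so every position contributes the full log2 of the list size.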
A smaller dictionary can be searched much faster. Just having one around is enough of a clue for an attacker to start with it instead of the normal searches. So you had better be sure your key generation system is really random. Programs can be compromised, poorly written, or simply monitored. Try Diceware for a good random passphrase generation system. It is irrelevant whether the dictionary has any tricks that make the construction of the words more random. In the end, the search space is all that counts. A random number source that is not truly random reduces the search further. For these reasons, you need to be sure your key generator is really random.
Here is the effect of different dictionary sizes. Using a 10,000 word dictionary (with the figure from section 2.7), log(3.16E13) / log(10,000) = 3.37, or about 4 words, are needed to last more than the average 6 months. Using the same dictionary to create an IDEA equivalent passphrase gives log(2^128) / log(10,000) = 9.63, or 10 words. A 25,000 word dictionary means log(2^128) / log(25,000) = 8.76, or 9 words. A 50,000 word dictionary needs log(2^128) / log(50,000) = 8.20, or 9 words.
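The figures above can be reproduced with a short Python sketch. The function name is illustrative only, and 3.16E13 is simply the search-space figure quoted above.

```python
import math

def words_for_search_space(search_space, dictionary_size):
    """Fractional number of random words whose combinations cover search_space."""
    return math.log(search_space) / math.log(dictionary_size)

# Six-month figure quoted in the text, with a 10,000 word dictionary.
six_month = words_for_search_space(3.16e13, 10_000)
print(f"{six_month:.2f} -> {math.ceil(six_month)} words")

# IDEA-equivalent (128-bit) passphrases for several dictionary sizes.
for size in (10_000, 25_000, 50_000, 74_000):
    exact = words_for_search_space(2**128, size)
    print(f"{size:>6} words in dictionary: {exact:.2f} -> {math.ceil(exact)} words")
```

Because the word count is rounded up, the 25,000 and 50,000 word dictionaries both end up needing 9 words.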