Passphrase FAQ: Getting started

Table of contents

How do I get random numbers?

Random numbers are very hard to generate. Quite often non-random events affect the randomness of devices or circuits. A suggestion would be to make 1 to N markers and place them in a very good mixer. You might want to try coin flipping, but if you have a person involved, a coin flip can be biased enough to skew the results over the long term. A ball method like several lotteries use is a good random source, but don't use the numbers from the lottery. Lottery numbers are a little to obvious and it is easy to try them.

A pool/billiards game uses a set of balls in a bottle that allows only one ball to be extracted at a time. This is a cheap and inexpensive source for random numbers. I leave it to the reader to figure out how to get the 16 balls translated to something useful for any particular application.

Those who are familiar with Dungeons and Dragons and the other role playing games may already have a set of dice numbered in a variety of sizes. The one caution with dice is that adding dice (e.g. 2 six sided dice) will change the output to a median number (odds of 7 are 1 in 6) and the extremes (odds of 2 and 12 are 1 in 36) are less likely to occur, losing some randomness. Be sure that you have random dice. The quality is sometimes not very good and may cause non-random results.

You might want to look at RFC 1750 (Randomness Recommendations For Security) for more information on generating random numbers for secure purposes.

What is MD5?

MD5 is what takes your passphrase and scrambles it into an IDEA key. In theory, MD5 should generate a different output for every possible bit combination as long as your key space is equal to or larger than 2128. Proving that MD5 will generate all 2128 outputs from a given key space equal to 2128 is practically impossible. This would be about the same as a brute force search on the IDEA key. An interesting problem is that theoretically you can produce an equivalent passphrase by searching any given key space that is 2128 or larger.

In light of the attack on MD5, wait and watch. While a weakness has been found, the jury is still out on using unmodified MD5. A move to SHA or other hash function may be in the future for PGP.

Should I use a pseudo-random number generator?

Using a pseudo-random number generator (PRNG) is in most cases a bad way to generate random numbers. The problem with PRNGs is the numbers are generated by a function. This includes the BASIC RND() function, the C rand() function or any other language that has a random function.

Programmers have used this simple and relatively fast method in programs and games for years. The reason for this is because of the way PRNGs work. A simple PRNG will use code something like R = (A * R1 + B) mod(C): R1 = R: R = R / C. Primes are usually used for constants A, B, and C. Most languages have provisions for placing a seed value in R1 before calling the PRNG but it isn't needed and some PRNGs may not bother with the additive constant B.

What makes a PRNG easy to break is that many only use 16 bits to store the values. That means we can brute force a 16 bit PRNG key space in 65536 * N attempts where N is the number of pseudo-random elements used. Almost anyone can probably search a standard PRNG key space in a day. A worst case search will probably last less than a week even on the average home computer.

If you are lucky and have a good PRNG, then the search space may be 232 which isn't a whole lot better. Note that 40 bit keys can be brute forced by an individual with access to enough computing power in about a week or less and places like the NSA don't mind 40 bit keys.

What about hardware random number generators?

Hardware random number generators can provide much better quality random numbers than PRNGs. However it may be difficult to find a good piece of hardware that can serve as a source of random numbers.

Hardware generators can be made using the noise from a variety of semiconductor PN junctions. A good example of this is simply amplified noise from a zener diode. Other noise sources are high value resistors and a number of commercial chips that use a variety techniques to make noise. A caution with hardware sources of random information is that they could be influenced by noise or other signals that are not random. Most places are saturated with 50 or 60 Hz noise from power, clock signals and other digital noise from computers, television and radio, and a variety of other types of electronic equipment.

For safety, you may want to encrypt or hash the output of a hardware source. A good hash function or encryption will hide any undiscovered patterns.

All parts of this FAQ