# The MD5 cryptographic hash function

The MD5 function is a cryptographic algorithm that takes an input of arbitrary length and produces a message digest that is 128 bits long. The digest is sometimes also called the "hash" or "fingerprint" of the input. MD5 is used in many situations where a potentially long message needs to be processed and/or compared quickly. The most common application is the creation and verification of digital signatures.

MD5 was designed by the well-known cryptographer Ronald Rivest in 1991. In 2004, serious flaws were found in MD5: researchers showed how to construct collisions, that is, two different inputs that produce the same digest. The complete implications of these flaws have yet to be determined.

## How MD5 works

### Preparing the input

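The MD5 algorithm first pads the input so that its length becomes a
multiple of 512 bits. A single '1' bit is appended, followed by as many
'0' bits as are needed to bring the length to 64 bits short of a
multiple of 512. The remaining 64 bits are used to record the length of
the original input, in bits. Padding is always applied, even if the
input length is already a multiple of 512 bits. The padded input is then
divided into **blocks** of 512 bits each.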
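The padding step can be sketched in Python as follows. This is a minimal illustration that assumes the padding rule of RFC 1321 (a single 1 bit, then 0 bits, then the 64-bit length stored low-order byte first) and a byte-aligned input; the function name `md5_pad` is made up for this example.

```python
import struct

def md5_pad(message: bytes) -> bytes:
    """Pad a message to a multiple of 64 bytes (512 bits)."""
    # The original length in bits, recorded modulo 2^64.
    bit_length = (len(message) * 8) % 2**64
    # 0x80 is a single 1 bit followed by seven 0 bits.
    padded = message + b"\x80"
    # Add 0 bytes until exactly 8 bytes (64 bits) remain free in the last block.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Append the length as a 64-bit little-endian integer.
    return padded + struct.pack("<Q", bit_length)
```

For instance, a 55-byte input still fits in one block after padding, while a 56-byte input needs a second block because the length field no longer fits in the first.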

Next, each **block** is divided into
16 **words** of 32 bits each. These
are denoted as M_{0} ...
M_{15}.
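As a small sketch, one block can be split into its 16 words with Python's `struct` module. It assumes the byte order of RFC 1321, where each word is read low-order byte first (little-endian); `split_into_words` is a hypothetical helper name.

```python
import struct

def split_into_words(block: bytes) -> list:
    """Split one 512-bit block into the 16 words M_0 ... M_15."""
    if len(block) != 64:
        raise ValueError("an MD5 block is exactly 64 bytes (512 bits)")
    # "<16I" means: 16 unsigned 32-bit integers, little-endian.
    return list(struct.unpack("<16I", block))
```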

### MD5 helper functions

#### The buffer

MD5 uses a buffer that is made up of four
**words** that are each
32 bits long. These words are called A, B, C and D. They are
initialized as follows (in hexadecimal, low-order byte first):

- word A: 01 23 45 67
- word B: 89 ab cd ef
- word C: fe dc ba 98
- word D: 76 54 32 10
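Because the bytes are listed low-order byte first, the same values look different when written as ordinary 32-bit integers. The sketch below makes the conversion explicit; the names `INITIAL_BYTES` and `buffer` are chosen for this example only.

```python
# The four initial values, as byte sequences with the low-order byte first.
INITIAL_BYTES = {
    "A": bytes.fromhex("01234567"),
    "B": bytes.fromhex("89abcdef"),
    "C": bytes.fromhex("fedcba98"),
    "D": bytes.fromhex("76543210"),
}

# Interpret each byte sequence as a little-endian 32-bit integer.
buffer = {name: int.from_bytes(raw, "little") for name, raw in INITIAL_BYTES.items()}

for name, value in buffer.items():
    print(f"word {name}: 0x{value:08x}")
# word A: 0x67452301
# word B: 0xefcdab89
# word C: 0x98badcfe
# word D: 0x10325476
```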

#### The table

MD5 further uses a table K that has 64 elements. Element number i
is indicated as K_{i}. The table is computed
beforehand to speed up the computations. The elements are computed
using the mathematical sin function:

K_{i} = floor(abs(sin(i + 1)) * 2^{32}), for i = 0 ... 63, with the argument of sin in radians
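The table can be reproduced in a single line of Python. This sketch assumes the definition in RFC 1321, where the product is truncated to its integer part and the argument of sin is in radians; `int()` performs the truncation here.

```python
import math

# K[i] is the integer part of abs(sin(i + 1)) * 2^32, for i = 0 ... 63.
K = [int(abs(math.sin(i + 1)) * 2**32) for i in range(64)]

print(f"K_0  = 0x{K[0]:08x}")   # K_0  = 0xd76aa478
print(f"K_63 = 0x{K[63]:08x}")  # K_63 = 0xeb86d391
```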

#### Four auxiliary functions

In addition, MD5 uses four auxiliary functions that each take as input three 32-bit words and produce as output one 32-bit word. They apply the logical operators and, or, not and xor to the input words bit by bit.

- F(X,Y,Z) = (X and Y) or (not(X) and Z)
- G(X,Y,Z) = (X and Z) or (Y and not(Z))
- H(X,Y,Z) = X xor Y xor Z
- I(X,Y,Z) = Y xor (X or not(Z))
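The four functions translate directly into bitwise operators. In the sketch below, the `MASK` constant is an addition for this example: Python integers are not limited to 32 bits, so the result of `~` has to be cut back to 32 bits explicitly.

```python
MASK = 0xFFFFFFFF  # keeps results within 32 bits

def F(x, y, z):
    # Bitwise "if x then y else z".
    return ((x & y) | (~x & z)) & MASK

def G(x, y, z):
    # Bitwise "if z then x else y".
    return ((x & z) | (y & ~z)) & MASK

def H(x, y, z):
    # Bitwise parity of the three inputs.
    return (x ^ y ^ z) & MASK

def I(x, y, z):
    return (y ^ (x | ~z)) & MASK
```

F and G act as bitwise selectors: wherever a bit of the controlling word is 1, the output bit is taken from one input, otherwise from the other.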

### Processing the blocks

The contents of the buffer (the four words A, B, C and D) are now mixed
with the words of the input, using the four auxiliary functions
(F, G, H and I). There are four *rounds*, each of which
involves 16 basic *operations*. One operation is
illustrated in the figure below.

The figure shows how the auxiliary function F is applied to the
buffer words B, C and D, and how its result is added to A together
with message word M_{i} and constant K_{i}. All additions are
performed modulo 2^{32}.
The item "<<<s" denotes a left rotation (circular shift) by
*s* bits.
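A single operation can be sketched as follows. The sketch assumes the step definition of RFC 1321, where "<<<" is a circular left shift and all additions are taken modulo 2^{32}; the names `rotate_left` and `md5_operation` are hypothetical, and the shift amount `s` has to be supplied by the caller.

```python
MASK = 0xFFFFFFFF  # additions are performed modulo 2^32

def rotate_left(x, s):
    """Rotate the 32-bit word x left by s bits (0 < s < 32)."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK

def F(x, y, z):
    return ((x & y) | (~x & z)) & MASK

def md5_operation(a, b, c, d, m, k, s, func=F):
    """Apply one basic operation and return the new (A, B, C, D)."""
    # Mix A with the auxiliary function, a message word and a table constant.
    mixed = (a + func(b, c, d) + m + k) & MASK
    # Rotate the sum, then add B.
    new_b = (b + rotate_left(mixed, s)) & MASK
    # The words change places: D moves to A, B to C, and C to D.
    return d, new_b, b, c
```

Passing a different auxiliary function as `func` gives the operations of the other three rounds.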

### The output

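After the four rounds of a block, the resulting values are added to the values that A, B, C and D held before the block was processed. After all blocks have been processed, the buffer words A, B, C and D contain the MD5 digest of the original input. The digest is written out beginning with the low-order byte of A and ending with the high-order byte of D.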
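Putting the steps together, the whole algorithm fits in a short Python sketch. It is an illustration, not a reference implementation: the shift amounts in `S` and the order in which message words are used in each round are taken from RFC 1321 and are not described in the text above, and the name `md5_sketch` is made up for this example.

```python
import math
import struct

MASK = 0xFFFFFFFF
K = [int(abs(math.sin(i + 1)) * 2**32) for i in range(64)]
# Shift amounts for the 64 operations (values from RFC 1321).
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4

def rotate_left(x, s):
    return ((x << s) | (x >> (32 - s))) & MASK

def md5_sketch(message: bytes) -> str:
    """Return the MD5 digest of message as a hexadecimal string."""
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    # Padding: a 1 bit, then 0 bits, then the 64-bit length in bits.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack("<Q", (len(message) * 8) % 2**64)
    for offset in range(0, len(padded), 64):
        m = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:    # round 1 uses F
                f, g = (b & c) | (~b & d), i
            elif i < 32:  # round 2 uses G
                f, g = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:  # round 3 uses H
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:         # round 4 uses I
                f, g = c ^ (b | ~d), (7 * i) % 16
            mixed = (a + f + K[i] + m[g]) & MASK
            a, b, c, d = d, (b + rotate_left(mixed, S[i])) & MASK, b, c
        # Add the result of this block to the previous buffer values.
        a0, b0, c0, d0 = [(x + y) & MASK for x, y in zip((a0, b0, c0, d0), (a, b, c, d))]
    # Output A, B, C, D with the low-order byte of each word first.
    return struct.pack("<4I", a0, b0, c0, d0).hex()

print(md5_sketch(b""))     # d41d8cd98f00b204e9800998ecf8427e
print(md5_sketch(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```

The two printed digests match the test suite listed in RFC 1321. For real use, a maintained library such as Python's `hashlib` is the better choice.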

## Reference implementation

A more complete discussion of MD5, and a reference implementation, can be found in RFC 1321.

## See also

- The MD5 message digest algorithm (Wikipedia)
- Cryptography: Algorithms: Message Digests (Open Directory)