AUTHOR:		Robert Connolly <robert at linuxfromscratch dot org> (ashes)

DATE:		2007-05-16

LICENSE:	Public Domain

SYNOPSIS:	Entropy and random number generators in Linux


The word "entropy" generally means "chaos", "disorder", or "uncertainty". In
this hint "entropy" is used to describe random computer data.

Many system components depend on entropy (random numbers) for various tasks.
One of the simplest examples would be the fortune(6) program, which gives a
random quote from a list when we log in. Another simple example is a solitaire
card game, or the shuffle option in a music player. Without random numbers
these programs would generate the same results every time they run. The above
examples are low security applications. It is not critical for them to use
high quality random numbers, and in applications like these the current system
time and date is usually an adequate source of entropy.

Examples of medium security uses for entropy would be applications like
mktemp(1), password salt, or the Stack Smashing Protector (SSP) GCC feature.
These applications need unpredictable entropy to function securely, but the
life span of these applications is generally short, so they do not need to use
the highest quality entropy available. Using the system time is unsafe for
these applications because it is predictable.

Cryptographic keys tend to have a very long life, often several years. Even
after the key is eventually replaced, everything it was used to encrypt remains
only as safe as the entropy used to generate the key. For cryptography we want
to use the best entropy possible, and conserve this high quality entropy
specifically for cryptography.

Generating true entropy in a computer is fairly difficult because nothing,
outside of quantum physics, is random. The Linux kernel uses keyboard, mouse,
network, and disc activities, with a cryptographic algorithm (SHA1), to
generate data for the /dev/random device. One of the problems with this is that
the input is not constant, so the kernel entropy pool can easily become empty.
The /dev/random device is called a "blocking device". This means if the entropy
pool is empty applications trying to use /dev/random will have to wait,
indefinitely, until something refills the pool. This is both a feature and a
nuisance, and can cause a denial of service depending on the application.
Another problem with using the keyboard, mouse, network, and disc activity is
that on idle, unmanned, and disc-less systems there is very little, or no, input
of this kind. It is also theoretically possible for an observer (keyboard or
network sniffer) to predict the entropy pool without having root level access.
The only real solution to these vulnerabilities is in using a hardware-based
random number generator. These hardware devices usually use electrical static
as a source of entropy, because there is currently no technology that can
reliably predict this. The best hardware random number generators use
radioactive decay as an entropy source.

The /dev/urandom device is referred to as a pseudo-random device, although
/dev/random is also pseudo-random, only to a lesser extent. /dev/urandom
uses small amounts of data from /dev/random to seed a secondary entropy
pool. This has the effect of stretching the real entropy so it can be
conserved. Heavy use of /dev/urandom can cause /dev/random's pool to
become empty, but if this happens /dev/urandom will not block; it simply
continues using the last available seed. This makes /dev/urandom
theoretically vulnerable to outputting repeating data, depending on the
limitations of the algorithm used, but this is extremely rare and to my
knowledge has never actually happened. /dev/urandom is widely considered
safe for all cryptographic purposes, except by the most paranoid.

This hint contains links to web sites and patches to help you get more entropy,
and use it more conservatively.

PREREQUISITES:	Glibc-2.5, for the arc4random patch.
		The entropy daemons have no prerequisites.




Linux random number generation (RNG) is often a source of confusion to
developers, but it is also an integral part of the security of the
system. It provides random data to generate cryptographic keys, TCP
sequence numbers, and the like, so unpredictability as well as very
strong random numbers are required.

The kernel gathers information from external sources to provide input to
its entropy pool. This pool contains bits that have extremely strong
random properties, so long as unpredictable events (inter-keypress
timings, mouse movements, disk interrupts, etc.) are sampled. It provides
direct access to this pool via the /dev/random device. Reading from that
device will provide the strongest random numbers that Linux can offer,
depleting the entropy pool in the process. When the entropy pool runs
low, reads from /dev/random block until there is sufficient entropy.

The alternative interface, the one that nearly all programs should use,
is /dev/urandom. Reading from that device will not block. If sufficient
entropy is available, it will provide random numbers just as strong as
/dev/random; if not, it uses the SHA cryptographic hash algorithm to
generate very strong random numbers. Developers often overestimate how
strong their random numbers need to be; they also overestimate how easy
“breaking” /dev/urandom would be, which leads to programs that,
unnecessarily, read /dev/random. Ted Ts’o, who wrote the kernel RNG, puts
it this way:

Past a certain point /dev/urandom will start returning results which are
cryptographically random. At that point, you are depending on the
strength of the SHA hash algorithm, and an attacker not just being able
to find hash collisions, but being able to trivially find all or most
possible pre-images for a particular SHA hash algorithm. If that were to
happen, it’s highly likely that all digital signatures and openssh would
be totally broken.

There is still a bit of a hole in all of this: how does a freshly
installed system, with little or no user interaction, at least yet, get
its initial entropy? When Alan Cox and Mike McGrath started describing
the smolt problem, the immediate reaction was to look closely at how the
entropy pool was being initialized. While that turned out not to be the
problem, it did lead Matt Mackall, maintainer of the kernel RNG, to start
thinking about better pool initialization. Various ideas about mixing in
data specific to the host, like the MAC address and PCI device
characteristics, were discussed.

As Ts’o points out, that will help prevent things like UUID collisions,
but it doesn’t solve the problem of predictability of the random numbers
that will be generated by these systems.

In order to do that we really do need to improve the amount of hardware
entropy we can mix into the system. This is a hard problem, but as more
people are relying on these facilities, it’s something we need to think
about quite a bit more!

Linux provides random numbers suitable for nearly any purpose via
/dev/urandom. For the truly paranoid, there is also /dev/random, but
developers would do well to forget that device exists for everything but
the most critical needs. If one is generating a large key pair, to use
for the next century, using some data from /dev/random is probably right.
Anything with lower requirements should seriously consider /dev/urandom.