AUTHOR: Robert Connolly <robert at linuxfromscratch dot org> (ashes)

DATE: 2007-05-16

LICENSE: Public Domain

SYNOPSIS: Entropy and random number generators in Linux

PRIMARY URL: http://www.linuxfromscratch.org/hints/

DESCRIPTION:
The word "entropy" generally means "chaos", "disorder", or "uncertainty". In this hint "entropy" is used to describe random computer data. Many system components depend on entropy (random numbers) for various tasks. One of the simplest examples would be the fortune(6) program, which gives a random quote from a list when we log in. Another simple example is a solitaire card game, or the shuffle option in a music player. Without random numbers these programs would generate the same results every time they run.

The above examples are low security applications. It is not critical for them to use high quality random numbers, and in applications like these the current system time and date is usually an adequate source of entropy.

Examples of medium security uses for entropy would be applications like mktemp(1), password salt, or the Stack Smashing Protector (SSP) GCC feature. These applications need unpredictable entropy to function securely, but the life span of these applications is generally short, so they do not need to use the highest quality entropy available. Using the system time is unsafe for these applications because it is predictable.

Cryptographic keys tend to have a very long life, often several years. Even after the key is eventually replaced, everything it was used to encrypt remains only as safe as the entropy used to generate the key. For cryptography we want to use the best entropy possible, and conserve this high quality entropy specifically for cryptography.

Generating true entropy in a computer is fairly difficult because nothing, outside of quantum physics, is random. The Linux kernel uses keyboard, mouse, network, and disc activities, with a cryptographic algorithm (SHA1), to generate data for the /dev/random device.
One of the problems with this is that the input is not constant, so the kernel entropy pool can easily become empty. The /dev/random device is called a "blocking device". This means if the entropy pool is empty applications trying to use /dev/random will have to wait, indefinitely, until something refills the pool. This is both a feature and a nuisance, and can cause a denial of service depending on the application.

Another problem with using the keyboard, mouse, network, and disc activity is that on idle, unmanned, and disc-less systems there is very little, or no, input of this kind. It is also theoretically possible for an observer (keyboard or network sniffer) to predict the entropy pool without having root level access.

The only real solution to these vulnerabilities is in using a hardware-based random number generator. These hardware devices usually use electrical static as a source of entropy, because there is currently no technology that can reliably predict this. The best hardware random number generators use radioactive decay as an entropy source.

The /dev/urandom device is referred to as a pseudo-random device (like-random), although /dev/random is also pseudo-random but to a lesser extent. /dev/urandom uses small amounts of data from /dev/random to seed a secondary entropy pool. This has the effect of inflating the real entropy so it can be conserved. Using /dev/urandom can cause /dev/random's pool to become empty, but if this happens /dev/urandom will not block, and it will continue using the last available seed. This makes /dev/urandom theoretically vulnerable to outputting repeating data, depending on the limitations of the algorithm used, but this is extremely rare and to my knowledge has never actually happened. /dev/urandom is widely considered safe for all cryptographic purposes, except by the most paranoid people.

This hint contains links to web sites and patches to help you get more entropy, and use it more conservatively.
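The point made earlier in this description, that the system time is a predictable source of entropy, can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names (token_from_time, token_from_kernel, guess_token) are made up for this example, and the 60 second guessing window is an arbitrary choice. A token built from a generator seeded with the time can be reproduced by anyone who roughly knows when it was made, while a token taken from the kernel's random source cannot be recovered this way.

```python
import random
import secrets
import time

HEX_DIGITS = "0123456789abcdef"


def token_from_time(timestamp: int, length: int = 16) -> str:
    """Build a token from a generator seeded only with a timestamp (unsafe)."""
    rng = random.Random(timestamp)  # the whole output is fixed by the seed
    return "".join(rng.choice(HEX_DIGITS) for _ in range(length))


def token_from_kernel(length: int = 16) -> str:
    """Build a token from the operating system's random source."""
    return secrets.token_hex(length // 2)


def guess_token(target: str, around: int, window: int = 60):
    """Try every second in a small window around a guessed creation time.

    Returns the seed that reproduces the token, or None if none matches.
    """
    for candidate in range(around - window, around + window + 1):
        if token_from_time(candidate, len(target)) == target:
            return candidate
    return None


if __name__ == "__main__":
    created = int(time.time())
    weak = token_from_time(created)
    print("time-seeded token:  ", weak)
    print("seed found by guess:", guess_token(weak, created + 5))
    print("kernel-seeded token:", token_from_kernel())
```

The search only needs about 120 attempts here, which is why the time is acceptable for fortune(6) but not for mktemp(1) or password salt.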
PREREQUISITES:
Glibc-2.5, for the arc4random patch. The entropy daemons have no prerequisites.

HINT:
Linux random number generation (RNG) is often a source of confusion to developers, but it is also an integral part of the security of the system. It provides the random data used to generate cryptographic keys, TCP sequence numbers, and the like, so its output must be both unpredictable and statistically strong.
The kernel gathers information from external sources to provide input to its entropy pool. This pool contains bits that have extremely strong random properties, so long as unpredictable events (inter-keypress timings, mouse movements, disk interrupts, etc.) are sampled. The kernel provides direct access to this pool via the /dev/random device. Reading from that device yields the strongest random numbers that Linux can offer, but it also depletes the entropy pool. When the pool runs low, reads from /dev/random block until sufficient entropy is available.
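The kernel exposes its own estimate of how much entropy the pool holds through the file /proc/sys/kernel/random/entropy_avail. The following Python sketch reads that estimate; the function name read_entropy_avail is invented for this example, and the meaning of the number depends on the kernel version (newer kernels may report a constant value), so treat it as a rough indicator rather than a precise measurement.

```python
from pathlib import Path

# Kernel-provided estimate of the bits of entropy in the pool (Linux only).
ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def read_entropy_avail(path: str = ENTROPY_AVAIL):
    """Return the kernel's entropy estimate in bits, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (not Linux, /proc not mounted) or unexpected content.
        return None


if __name__ == "__main__":
    bits = read_entropy_avail()
    if bits is None:
        print("entropy estimate not available on this system")
    else:
        print(f"kernel entropy estimate: {bits} bits")
```

Running this before and after a large read from /dev/random is one way to observe the depletion described above on kernels that still track it.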
The alternative interface, the one that nearly all programs should use, is /dev/urandom. Reading from that device will not block. If sufficient entropy is available, it will provide random numbers just as strong as those from /dev/random; if not, it uses the SHA cryptographic hash algorithm to generate very strong random numbers. Developers often overestimate how strong their random numbers need to be; they also overestimate how easy “breaking” /dev/urandom would be, which leads to programs that unnecessarily read /dev/random. Ted Ts’o, who wrote the kernel RNG, puts it this way:
There is still a bit of a hole in all of this: how does a freshly installed system, with little or no user interaction so far, get its initial entropy? When Alan Cox and Mike McGrath started describing the smolt problem, the immediate reaction was to look closely at how the entropy pool was being initialized. While that turned out not to be the problem, it did lead Matt Mackall, maintainer of the kernel RNG, to start thinking about better pool initialization. Various ideas about mixing in data specific to the host, such as the MAC address and PCI device characteristics, were discussed.
As Ts’o points out, that will help prevent things like UUID collisions, but it doesn’t solve the problem of predictability of the random numbers that will be generated by these systems.
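The idea of mixing host-specific data into the pool can be sketched from user space. This is not the kernel change that was discussed, only an illustration of the same principle under stated assumptions: the function names host_fingerprint and mix_into_pool are invented here, and the choice of MAC address plus hostname hashed with SHA-256 is arbitrary. Data written to /dev/urandom is mixed into the kernel pool without raising the kernel's entropy estimate, which matches the point above: such data makes machines differ from one another but adds no secrecy, since an observer can guess it.

```python
import hashlib
import socket
import uuid


def host_fingerprint() -> bytes:
    """Hash a few host-specific values into 32 bytes of seed material.

    These values differ between machines but are guessable, so they add
    uniqueness, not unpredictability.
    """
    # uuid.getnode() returns the MAC address as a 48-bit integer, or a
    # random 48-bit value if no hardware address can be determined.
    mac = uuid.getnode().to_bytes(6, "big")
    name = socket.gethostname().encode()
    return hashlib.sha256(mac + b"|" + name).digest()


def mix_into_pool(data: bytes, device: str = "/dev/urandom") -> int:
    """Write data to the random device and return the number of bytes written.

    The kernel mixes the data into its pool but does not credit any entropy.
    """
    with open(device, "wb", buffering=0) as dev:
        return dev.write(data)


if __name__ == "__main__":
    written = mix_into_pool(host_fingerprint())
    print(f"mixed {written} bytes of host-specific data into the pool")
```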
Linux provides random numbers suitable for nearly any purpose via /dev/urandom. For the truly paranoid, there is also /dev/random, but developers would do well to forget that device exists for everything but the most critical needs. If one is generating a large key pair, to use for the next century, using some data from /dev/random is probably right. Anything with lower requirements should seriously consider /dev/urandom.
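For a program that follows this advice, reading from /dev/urandom takes only a few lines. The Python sketch below shows two ways to do it, assuming a Linux system where the device exists; the function names read_device_bytes and random_bytes are made up for this example. The first opens the device directly, the second uses os.urandom, which asks the operating system for random bytes and is the more portable choice.

```python
import os


def read_device_bytes(count: int, device: str = "/dev/urandom") -> bytes:
    """Read exactly `count` bytes straight from a kernel random device."""
    # buffering=0 avoids pulling more bytes from the device than requested.
    with open(device, "rb", buffering=0) as dev:
        data = b""
        while len(data) < count:
            chunk = dev.read(count - len(data))
            if not chunk:
                raise OSError(f"short read from {device}")
            data += chunk
        return data


def random_bytes(count: int) -> bytes:
    """Portable equivalent: let the OS supply the random bytes."""
    return os.urandom(count)


if __name__ == "__main__":
    print("from device:    ", read_device_bytes(16).hex())
    print("from os.urandom:", random_bytes(16).hex())
```

Passing "/dev/random" as the device argument reads from the blocking device instead, which on the kernels described here may stall until the pool is refilled.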