Random numbers
Kent Borg
kentborg at borg.org
Wed Nov 3 15:07:02 UTC 2010
C de-Avillez wrote:
> It is common to see that on very busy HTTPS sites without a
> good PRNG.
>
Aren't the ones running Linux going to use /dev/urandom? If so--modulo
any bugs in /dev/urandom that I don't know about--it is a very good
random source.
> Verifying true randomess on PRNGs is considered a hard problem.
>
Verifying any purported randomness is difficult. Another definition of
randomness is "unpredictability". It depends on the skill and insight
of whomever is doing the predicting, proving something can't be
predicted is trying to prove a negative. Logical difficulties lie there.
20-some years ago there was a decent distinction between pseudo-random
number generators and "true" random number generators, but things have
evolved.
"True" random number generators had flaws. Consider a random number
generator that actually flipped a coin. Pretty solid, real randomness!
Except any real coin probably has a little bias, given enough output one
can start to make predictions that will be ~slightly~ better than 50-50,
its output should be cleaned up, run through an algorithm, probably
"whitened" a little.
Alternately, because PRNGs are all predicated on some starting secret
state and the longer they are run the more information about that
starting state will necessarily leak out, it was obvious that we can
make them better by mixing up that state in an ongoing-basis by looking
for some ongoing source of entropy to mix in. Even if the entropy
source is not perfectly unpredictable, any unpredictability can be made
to help.
So real random number generators get post-processing that looks like
PRNGs and PRNGs get their internal state updated with new random
data...and they start to converge.
The Linux random device is such a hybrid design. It looks for sources
of entropy to stir its entropy pool, and it uses cryptographic-quality
SHA-1 hashing as a PRNG to turn that pool state into a stream of random
bits.
A modern server that appears to not have any good entropy sources
actually does: It has a fast system clock, the clock is generated with
analog PLL circuitry from a lower frequency clock, and when interrupts
happen the exact number of system clock ticks at that moment is used as
entropy source to mix the pool. Sure, from the outside, one can measure
how long the computer has been running and one can note when the
ethernet packet comes in, but it is not easy for an outside observer to
know the least significant bit or two of that high speed counter at the
instant the CPU looks at the counter as part of servicing that
interrupt. And unless you can know every bit's value in the counter at
the moment the CPU reads it, at least one bit of entropy will slip past.
That greater-than-GHz clock, the counter that increments from that
clock, and the CPU's interrupt processing, are all inside one chip,
remote from external prying eyes. If every ethernet packet supplies at
least one bit of new entropy, and if SHA-1 is mostly good at hiding
input data and mixing up its output (and the system is generally
secure)...then we get dang good random numbers from the Linux kernel.
If https is using /dev/urandom, I don't see what is lacking there.
Note that not all uses of random data want cryptographic-quality random
data. Traditional-style PRNGs are great for Monte Carlo simulations.
Something like Mersenne Twister has an enormous number of internal
states (and so can run a very long time before repeating) yet by
resetting the internal state to a known value one can get a repeatable
sequence. (Change your algorithm and do another test run...was the
output different because of different random data or because of your
code changes? By being able to have the same random data for each run,
you can control what changed.)
Mersenne Twister does not go to great effort to hide its internal state,
and so is not suited to cryptographic uses where humans might be working
hard to deduce the internal state and win the lottery, but, say you are
doing a biological simulation, it is reasonable to assume that some
virus simulation isn't going to deduce the internal state.
Yes, proving randomness is hard, but if a random number generator is (a)
designed with thought, (b) carefully constructed, and (c) built out of
good parts (i.e., SHA-1), the result can be very good. The Linux random
number generator makes a good stab at all three of those points.
-kb
More information about the ubuntu-users
mailing list