Excel's PRNG

(Keywords: Excel, PRNG, FPRN, rand, rnd, randomize, randomise, reverse engineer, seed, rho, period)

Binary tunnel

This article concentrates on the earlier versions of Excel where the PRNG was rather problematic. See another article for more information on the generators in different versions.

"In Excel, FPRNs (Floating Point Random Numbers) are returned by the RAND function.However the algorithms are different. In Excel 2000, RAND is an FPRN. In Excel 2003,RAND is generated from three separate IRN(16)s, converted to FP values, and rescaledto the zero to one interval, giving a single FPRN output. The seed in this case is the threeseparate IRN(16)s which change on every RAND call. These three are hidden fromaccess and can’t be modified. This means that a given sequence can never be repeated."



In Excel 2003, however, Microsoft apparently tried to fix the weaknesses in the generator - but got it quite wrong. See
this article.

"If you have a spreadsheet that uses the RAND() function, and you’re using Excel 2003, you may be getting bad results – and you probably don’t even know it. Excel 2003’s RAND() has a big, big bug in it.In short, RAND() should produce a pseudo-random number between 0 and 1.  Microsoft reworked the function for Excel 2003 so that it would produce a better quality of randomness, especially when you ask for a large number of random numbers.Guess what?  If you do try to get a lot of random number, the function gets buggy, instead of numbers from 0 to 1 you start getting negative numbers.  If that weren't enough, the negative numbers aren't sufficiently distributed to be considered 'random'.And to top it all off, Microsoft has been told about the problem and has done nothing.  Like serious Excel problems in the past this new bug has been ignored."




Excel (at least pre 2003) has a rather interesting PRNG that it inherited from the earliest versions. Interesting because it is 'almost' linear - but not quite. As Excel deals with floating point numbers as a rule, it seems that the algorithm chosen tries to generate numbers using a floating point algorithm rather than an integer one. (see
Microsoft Knowledge Base article.)

First random number:
  random_number=fractional part of (9821 * r + 0.211327),    where r = .5

Successive random numbers:
   random_number=fractional part of (9821 * r + 0.211327),    where r = the previous random number

In fact, it almost appears as if the developer copied a Linear Congruential PRNG algorithm into floating point space and tried to make it fit. Even if this isn't the case, the fit isn't so special. It turns out that this quasi-linear implementation makes the algorithm much worse by introducing large numbers of loops within the stream.

These loops are called rho loops in the mathematical literature on account of the shape of the Greek letter rho. Basically, you start at a certain seeded number, iterate for some amount of time and then suddenly a number is hit that appeared before. When this happens the series enters a loop, never to exit. If the loop is small, then the numbers produced are extremely non random.

The algorithm with the default starting seed generates about a million unique numbers before repeating even though the floating point numbers have vastly more bit size. VBA's algorithm - weak by comparison to most PRNGs in use - generates about 16 million. A 32 bit generator will do about 2 billion.