VBA's Pseudo Random Number Generator

(Keywords: VB, VBA, PRNG, LCPRNG, rand, rnd, randomize, randomise, invert, modulo, reverse engineer)

There it is - hexadecimal code

[NOTE: This documentation refers to prior versions of Visual Basic. However, the concepts are useful for PRNGS in general.]

The pseudo-random number generator discussed is the one provided by Microsoft in its Visual Basic library, used by Visual Basic itself and its variant, VBA. Most generators that are supplied with 3GL and 4GL languages are very standard and are variations of the Linear Congruential Pseudo-Random Number Generators (LCPRNGs). These operate in the following way:

Each new number is the product of the previous number times a constant (a) plus another constant (c). The result is kept modulo a third number (m), which is invariably the width of its ‘long’ integer size (or fraction thereof). Standard examples are m=2^64, 2^48 or 2^32. Mathematically, this is represented as: r(i+1)=r(i)*a+c (mod m), where r(0) is the ‘seed’ value.

Although the algorithm in use can be obtained from analysing a string of outputs from the generator and deducing the values of a, c and M, Microsoft has published theirs on their web site (
Knowledge article) and it turns out that they use an LCPRNG with:

a = 16598013, c = 2820163 and m=2^24 = 16777216


These PRNGs operate on integer spaces, VB’s returning an integer number from 0 to 16777215 (M-1). VB provides one more step in this, converting the resulting number to a floating point representation. It does this by simply dividing the resulting integer by 2^24 and thus returning a floating point number in the range 0-1.

VB’s random number generator uses a standard LCPRNG model, with a,c & m as described above. This gives it a period of about 17 million before it cycles. (Most of the sensible PRNGs will use a modulus of, say, 2^64 and return the top 32 bits only, which makes it more random and harder (though still possible) to invert due to the much larger period. Certainly, more bits of output are required in order to crack these.)

One extra function that VB provides is the ‘randomize()’ function. This function takes a floating point number as input and reseeds the PRNG, so that r(0) is some function of this input. It is easy to determine that this function is just an exclusive or of the first and third bytes of the input with the second and fourth. This is a 'lossy' mapping, which should theoretically make it impractical to determine what possible values of f were used in the call from just the output alone. If a good hash function were used for randomize(), there would be around 2^42 alternative f's that would create the same R(0). As we will see, VB's randomize() function is not very one-way, using XORs on two input bytes to create two of the output bytes.

Now suppose we are given a few floating point numbers that have been produced by VB’s rnd() function. Firstly, we can multiply them by 2^24 to get the original in integer form. So we have a few integers in our hands that we know have been generated from the rnd() function. How easy is it for us to determine what the seed value was that started this sequence? Is it a difficult problem? Can we step back through the generator and determine all of the previous random numbers prior to those we had at hand? Can we determine what was used by the randomize() function to create the seed?

Reversing the Algorithm

One less well-known feature of LCPRNGS is the fact that they are invertible. That is, given a particular number r, its predecessor, r(-1) can be found and the sequence essentially run backwards. This is done quite simply. Firstly, the multiplicative inverse of a mod m is found. This is the number which, when multiplied by a modulo m equals 1.

Under modulo arithmetic, the inverse of a, ainv, acts like rational division in normal arithmetic. Thus a *ainv = a/a = 1. Given this, we can invert the LCPRNG equation, arriving at r(i)=ainv(r(i+1)-c) (mod m) , where it is now only required to determine ainv from a. Although this can be determined readily by
trial and error (because VB's rnd() space is so small), a simple algorithm for computing multiplicative inverses is also available and called the 'Extended Euclidean Algorithm'.

In our case, a=16598013 and ainv is found to be
602453. This can be checked by 16598013*602453 = 9999522725889 = 1 mod 2^24. So, given any number output from the VB random number generator, we can calculate the previous number in the sequence by applying the general rule r(i)=602453(r(i+1)-2820163) (mod 16777216)
It is quite common to want to find out what generator and seed values were used to produce a certain sequence of output characters, which are invariably number in a smaller range, e.g. 0-9 or A-Z (ascii). It is not difficult, given only a handful of these in many cases, to determine the seed.

Summary

In summary, we have shown that VB's pseudo random number generator, rnd(), is quite weak for anything but very simple needs:

  • It has too small a period, only 2^24 = 16 million values before it cycles.
  • The output stream is too predictable. For example, all the output numbers alternate between even and odd!
  • The randomize() function is not a strong hash function and enables much information to be determined about the initialiser value from the output stream.

So be warned if you were thinking of using VB's PRNG for any cryptographic purposes!

SW July, 2001