VBA's Pseudo Random Number Generator
[NOTE: This documentation refers to prior versions of
Visual Basic. However, the concepts are useful for
PRNGS in general.]
The pseudo-random number
generator discussed is the one provided by Microsoft
in its Visual Basic library, used by Visual Basic
itself and its variant, VBA. Most generators that are
supplied with 3GL and 4GL languages are very standard
and are variations of the Linear Congruential
Pseudo-Random Number Generators (LCPRNGs). These
operate in the following way:
Each new number is the product of the previous number
times a constant (a) plus another constant (c). The
result is kept modulo a third number (m), which is
invariably the width of its ‘long’ integer size (or
fraction thereof). Standard examples are m=2^64, 2^48
or 2^32. Mathematically, this is represented as:
r(i+1)=r(i)*a+c (mod m), where r(0) is the ‘seed’
value.
Although the algorithm in use can be obtained from
analysing a string of outputs from the generator and
deducing the values of a, c and M, Microsoft has
published theirs on their web site
(Knowledge
article) and it turns out that they
use an LCPRNG with:
a = 16598013, c = 2820163 and m=2^24 =
16777216
These PRNGs operate on integer spaces, VB’s returning
an integer number from 0 to 16777215 (M-1). VB
provides one more step in this, converting the
resulting number to a floating point representation.
It does this by simply dividing the resulting integer
by 2^24 and thus returning a floating point number in
the range 0-1.
VB’s random number generator uses a standard LCPRNG
model, with a,c & m as described above. This
gives it a period of about 17 million before it
cycles. (Most of the sensible PRNGs will use a
modulus of, say, 2^64 and return the top 32 bits
only, which makes it more random and harder (though
still possible) to invert due to the much larger
period. Certainly, more bits of output are required
in order to crack these.)
One extra function that VB provides is the
‘randomize()’ function. This function takes a
floating point number as input and reseeds the PRNG,
so that r(0) is some function of this input. It is
easy to determine that this function is just an
exclusive or of the first and third bytes of the
input with the second and fourth. This is a 'lossy'
mapping, which should theoretically make it
impractical to determine what possible values of f
were used in the call from just the output alone. If
a good hash function were used for randomize(), there
would be around 2^42 alternative f's that would
create the same R(0). As we will see, VB's
randomize() function is not very one-way, using XORs
on two input bytes to create two of the output bytes.
Now suppose we are given a few floating point numbers
that have been produced by VB’s rnd() function.
Firstly, we can multiply them by 2^24 to get the
original in integer form. So we have a few integers
in our hands that we know have been generated from
the rnd() function. How easy is it for us to
determine what the seed value was that started this
sequence? Is it a difficult problem? Can we step back
through the generator and determine all of the
previous random numbers prior to those we had at
hand? Can we determine what was used by the
randomize() function to create the seed?
Reversing the Algorithm
One less well-known feature of LCPRNGS is the fact
that they are invertible. That is, given a particular
number r, its predecessor, r(-1) can be found and the
sequence essentially run backwards. This is done
quite simply. Firstly, the multiplicative inverse of
a mod m is found. This is the number which, when
multiplied by a modulo m equals 1.
Under modulo arithmetic, the inverse of a, ainv, acts
like rational division in normal arithmetic. Thus a
*ainv = a/a = 1. Given this, we can invert the LCPRNG
equation, arriving at r(i)=ainv(r(i+1)-c) (mod m) ,
where it is now only required to determine ainv from
a. Although this can be determined readily by
trial and error
(because VB's
rnd() space is so small), a simple algorithm for
computing multiplicative inverses is also
available and called the 'Extended Euclidean
Algorithm'.
In our case, a=16598013 and ainv is found to
be
602453.
This can be checked by 16598013*602453 =
9999522725889 = 1 mod 2^24. So, given any number
output from the VB random number generator, we can
calculate the previous number in the sequence by
applying the general rule r(i)=602453(r(i+1)-2820163)
(mod 16777216)
It is quite common to want to find out what generator
and seed values were used to produce a certain
sequence of output characters, which are invariably
number in a smaller range, e.g. 0-9 or A-Z (ascii).
It is not difficult, given only a handful of these in
many cases, to determine the seed.
Summary
In summary, we have shown that VB's pseudo random
number generator, rnd(), is quite weak for anything
but very simple needs:
- It has too small a period, only 2^24 = 16 million values before it cycles.
- The output stream is too predictable. For example, all the output numbers alternate between even and odd!
- The randomize() function is not a strong hash function and enables much information to be determined about the initialiser value from the output stream.
So be warned if you were thinking of using VB's PRNG for any cryptographic purposes!
SW July, 2001





