Random Number Generation from Non-uniform Distributions

Most algorithms for generating pseudo-random numbers from other distributions depend on a good uniform pseudo-random number generator. However, there is a great variety in the types of algorithms which are efficient for many different distributions. We present several examples here.

The Probability Integral Transform

A fact learned in a standard first course in probability is that for any continuous random variable X with a cumulative distribution function F that has a unique inverse F^{-1}, F(X) is a uniform random variable on (0,1). From this it follows that X could be generated by F^{-1}(U). For any distribution for which it is easy to compute the inverse cdf, this provides an easy way to generate pseudo-random numbers.

The Exponential Distribution

As an example, the exponential distribution has density function
f(x) = a e^{-ax}  (x > 0)
for positive parameter a, called the rate. Standard calculus gives the cdf as
F(x) = 1 - e^{-ax}
Inverting this we find
F^{-1}(x) = - log(1 - x) / a
Because 1-U is uniform(0,1) if U is, -log(U) / a will have an exponential distribution with rate a.

Sums of Random Variables

Many standard probability distributions may be defined as the distribution resulting from summing independent and identically distributed random variables from other distributions. For example, the binomial distribution with parameters n and p is the sum of n independent Bernoulli(p) random variables, the negative binomial distribution with parameters r and p is the sum of r independent geometric(p) random variables, and the gamma distribution with parameters alpha and lambda is a sum of alpha independent exponential(lambda) random variables. As long as the number of terms in the sum is not too large, these distributions may be efficiently simulated summing several generated pseudo-random numbers with the appropriate distribution.

Transformations

Many of the standard probability distributions are related to one another through various transformations. If you can generate one random variable, the transformation gives a mechanism for generating another.

The Polar Method for Normal Random Variaibles

Consider this method for randomly picking a pair of points in the plane.

Method: Generate the polar coordinates of (X,Y) in this fashion. Let S = R^2 be an exponential(1/2) random variable and T be a uniform(0,2 Pi) random variable. Then X = R cos(T) and Y = R sin(T).

X and Y generated in this fashion will be independent standard normal random variables. (See Rice, Mathematical Statistics and Data Analysis, Second Edition, pages 96-97.) Because we can generate uniform and exponential pseudo-random variables, we may use this to generate normal pseudo-random variables. Specifically, letting R = sqrt( -2 log( U_1) ) and T = 2 Pi U_2,

X = sqrt( -2 log(U_1) ) cos( 2 Pi U_2 )
Y = sqrt( -2 log(U_1) ) sin( 2 Pi U_2 ) 
It takes two uniform pseudo-random numbers and some computation to generate two independent standard normal pseudo-random numbers by the polar method. If you only need one, you can use X and save Y for later.

There is another twist we can add which eliminates the need to use the trigonometric functions. Here is the idea. Let V_1 and V_2 be the (X,Y) coordinates of a point selected uniformly at random from the interior of the unit circle. Define S, the squared distance from the origin, and T, the angle between the line segment connecting the origin to the point (V_1,V_2) and the x-axis, by the equations

S = V_1^2 + V_2^2
T = tan^{-1} (V_2 / V_1)
Because of the circular symmetry of the range of (V_1,V_2) around the origin, S and T are independent, even though both are functions of the same two random variables V_1 and V_2. Also, S is a uniform random variable on (0,1). (Can you show this?) Furthermore, the sine and cosine of T may be computed without using trigonometric functions. Notice that
cos(T) = V_1 / sqrt(S)
sin(T) = V_2 / sqrt(S)
The above method depended on two independent variables, one for the distance from the origin and one for the angle. By using S and T in that capacity, we have this modified algorithm for the polar method using pseudo-random uniform variables U_1 and U_2.
  1. V_1 = 2 * U_1 - 1
  2. V_2 = 2 * U_2 - 1
  3. S = V_1^2 + V_2^2
  4. If S > 1, go back to step 1. (The pair is not in the unit circle.)
  5. X = V_1 sqrt( -2 log(S) / S)
  6. Y = V_2 sqrt( -2 log(S) / S)
X and Y are two independent standard normal pseudo-random variables.

This new method requires a random number of uniform pseudo-random numbers because it rejects a pair if S > 1. On average, it will take 4/Pi = 1.27 pairs to find an acceptable one. Even with this extra potential computation, avoiding calling the trig functions makes this faster to compute on average.


Last modified: April 2, 1997

Bret Larget, larget@mathcs.duq.edu