randomdist.lua

NAME

randomdist.lua - a few simple functions for generating random numbers.

SYNOPSIS

 local R = require 'randomdist'
 grand1 = R.new_grand(10,3)
 grand2 = R.new_grand(100,3)
 for i = 1,20 do print( grand1(), grand2() ) end

 gue_irand1 = R.new_gue_irand(4)
 gue_irand2 = R.new_gue_irand(20)
 for i = 1,20 do print( gue_irand1(), gue_irand2() ) end

 for i = 1,20 do print(R.rayleigh_rand(3.456)) end
 for i = 1,20 do print(R.rayleigh_irand(10)) end

 a = {'cold', 'cool', 'warm', 'hot'}
 for i = 1,20 do print( R.randomget(a) ) end
 for i = 1,20 do print(table.unpack( R.randomgetn(a, 2) )) end

 a = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', }
 zipf1 = R.new_zipf(a)
 for i = 1,20 do print(zipf1()) end
 zipf2 = R.new_zipf(8, 1.02)
 for i = 1,20 do print(zipf2()) end

 word2count = {
   the=983, ['and']=421, of=340, to=286, I=263, it=252, -- etc
 }
 s, stddev = R.wordcount2zipf(word2count)
 eo_words = {'la', 'kaj', 'de', 'al', 'mi', 'gxi'}
 random_word = R.new_zipf(eo_words, s)
 for i = 1,1000 do print(random_word()) end


DESCRIPTION

This module implements in Lua a few simple functions for generating random numbers according to various distributions.

randomdist.lua is based on the PostScript module random.ps.


FUNCTIONS

new_grand(), new_gue_irand(), rayleigh_rand(), rayleigh_irand(), randomget(), randomgetn(), new_zipf() and wordcount2zipf()

new_grand (mean, stddev)

This function returns a closure: a function that you can then call repeatedly, each call returning a random number drawn from a Gaussian (or Normal) distribution with the given mean and standard deviation.

It keeps some internal local state, but because it is a closure, you may run different Gaussian Random generators simultaneously, for example with different means and standard-deviations, without them interfering with each other.

It uses the algorithm given by Erik Carter which used to be at design.caltech.edu/erik/Misc/Gaussian.html

This algorithm generates results in pairs, but returns them one at a time, so after an odd number of calls the closure still holds one result generated before the reseed. If you use math.randomseed to reset the random-number generator to a known state, and you want your program to run reproducibly, call your closure (eg: grand1) with the argument 'reset' each time you call math.randomseed. Eg:
  grand1 = R.new_grand(10,3)
  ... grand1() ... etc ...
  math.randomseed(244823040) ; grand1('reset')
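
The pairing behaviour described above can be illustrated with a minimal sketch. It assumes the polar form of the Box-Muller transform, which may differ in detail from the module's actual code, and new_grand_sketch is a hypothetical name that is not part of randomdist.lua:

```lua
-- Sketch only: a Gaussian closure that generates values in pairs
-- and caches the second one. Not the module's own implementation.
local function new_grand_sketch(mean, stddev)
  local spare = nil              -- second value of the most recent pair
  return function (arg)
    if arg == 'reset' then       -- discard the cached value
      spare = nil
      return nil
    end
    if spare then                -- return the cached half of the pair
      local x = spare
      spare = nil
      return mean + stddev * x
    end
    local u, v, w
    repeat                       -- pick a point inside the unit circle
      u = 2 * math.random() - 1
      v = 2 * math.random() - 1
      w = u * u + v * v
    until w > 0 and w < 1
    local scale = math.sqrt(-2 * math.log(w) / w)
    spare = v * scale            -- keep one value for the next call
    return mean + stddev * u * scale
  end
end

math.randomseed(12345)
local g = new_grand_sketch(10, 3)
local first = g()                -- one call leaves a cached value behind
math.randomseed(12345) ; g('reset')
local again = g()
print(first == again)            -- the reset makes the run repeatable
```

Without the g('reset') call, the second g() would return the cached value from before the reseed, and the two runs would differ.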

new_gue_irand (average)

This function returns a closure: a function that you can then call to return a Gaussian-Unitary-Ensemble distribution of integers.

The Gaussian Unitary Ensemble models Hamiltonians lacking time-reversal symmetry. Consider a Hermitian matrix with Gaussian-random values; from the ordered sequence of its eigenvalues, one defines the normalized spacings
  s = (\lambda_{n+1}-\lambda_n) / <s>
where <s> is the mean spacing. The probability distribution of spacings is approximately given by
  p_2(s) = (32 / pi^2) * s^2 * e^((-4/pi) * s^2)
The numerical constants are such that p_2(s) is normalized and the mean spacing is 1:
  \int_0^\infty ds p_2(s) = 1   \int_0^\infty ds s p_2(s) = 1
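
The two integrals can be checked numerically. The following is a small sketch using the midpoint rule; the step count and the cut-off at s = 10 are arbitrary choices made for this illustration:

```lua
-- Numerical check that p_2(s) integrates to 1 and has mean 1.
local function p2(s)
  return (32 / math.pi^2) * s^2 * math.exp(-4 * s^2 / math.pi)
end

local n, upper = 10000, 10       -- number of steps, and cut-off for s
local h = upper / n              -- step size
local norm, mean = 0, 0
for i = 0, n - 1 do
  local s = (i + 0.5) * h        -- midpoint of the i-th interval
  norm = norm + p2(s) * h
  mean = mean + s * p2(s) * h
end
print(norm, mean)                -- both should be very close to 1
```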

Montgomery's pair correlation conjecture is a conjecture made by Hugh Montgomery (1973) that the pair correlation between pairs of zeros of the Riemann zeta function (normalized to have unit average spacing) is:
  1 - ( sin(pi u) / (pi u) )^2 + \delta(u)
which, as Freeman Dyson pointed out to him, is the same as the pair correlation function of random Hermitian matrices.

rayleigh_rand (sigma)

This function returns a random number according to the Rayleigh Distribution, which is a continuous probability distribution for positive-valued random variables. It occurs, for example, when the real and imaginary components of random complex numbers are independent Gaussian variables with equal variance and zero mean; the absolute value of the complex number is then Rayleigh-distributed.
  f(x; sigma) = x exp(-x^2 / (2*sigma^2)) / sigma^2 for x>=0
The algorithm contains no internal state, hence rayleigh_rand directly returns a number.
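
One stateless way to draw such numbers is inverse-transform sampling: the cumulative distribution is 1 - exp(-x^2 / (2*sigma^2)), which can be solved for x. The sketch below assumes that method, which may not be the one the module uses, and rayleigh_sketch is a hypothetical name:

```lua
-- Sketch only: Rayleigh-distributed numbers by inverting the CDF.
local function rayleigh_sketch(sigma)
  local u = math.random()        -- uniform in [0,1), so 1-u is in (0,1]
  return sigma * math.sqrt(-2 * math.log(1 - u))
end

math.randomseed(2020)
local sigma, n, sum = 3.456, 100000, 0
for i = 1, n do sum = sum + rayleigh_sketch(sigma) end
-- the sample mean should be close to sigma * sqrt(pi/2)
print(sum / n, sigma * math.sqrt(math.pi / 2))
```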

rayleigh_irand (sigma)

This function returns a random integer according to the Rayleigh Distribution, giving a probability distribution of positive-valued random integers. This suits, for example, MIDI parameters, or a number of people, etc.
The average return-value is about 1.2533*sigma (that is, sigma*sqrt(pi/2), the mean of the continuous Rayleigh Distribution).
The algorithm contains no internal state, hence rayleigh_irand directly returns an integer.

randomget (an_array)

This function returns a random element from the given array. For example, the following executes one of the four given functions at random:
  R.randomget( {bassclef, trebleclef, sharp, natural} ) ()

randomgetn (an_array, n)

This function returns an array containing n random elements, with distinct indices, from the given array. If n equals the array size, the result is the whole array in shuffled order.
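
Choosing n elements with distinct indices can be done with a partial Fisher-Yates shuffle of a copy of the array. The sketch below assumes that technique, which is not necessarily what the module does, and randomgetn_sketch is a hypothetical name:

```lua
-- Sketch only: pick n elements with distinct indices.
local function randomgetn_sketch(an_array, n)
  local copy = {}
  for i, v in ipairs(an_array) do copy[i] = v end   -- leave the input intact
  local size = #copy
  if n > size then n = size end
  for i = 1, n do
    local j = math.random(i, size)   -- pick from the not-yet-chosen tail
    copy[i], copy[j] = copy[j], copy[i]
  end
  local picked = {}
  for i = 1, n do picked[i] = copy[i] end
  return picked
end

local a = {'cold', 'cool', 'warm', 'hot'}
print(table.concat(randomgetn_sketch(a, 2), ' '))
print(table.concat(randomgetn_sketch(a, #a), ' '))  -- n == array size: a shuffle
```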

new_zipf (an_array, s)
new_zipf (n, s)

This function returns a closure: a function that you can then call to return array elements, or integers, according to a Zipf Distribution.
The first form takes an array argument and returns a function which will return one of the items in the array, the first item being returned most frequently.
The second form takes a number argument and returns a function which will return an integer from 1 to n, with 1 being the most frequent.
If s is not given it defaults to 1.0.
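
In a Zipf Distribution the k-th item has a weight proportional to 1/k^s. The sketch below shows one way to sample from it, by walking a table of cumulative weights; it covers only the numeric form, it is not the module's own code, and new_zipf_sketch is a hypothetical name:

```lua
-- Sketch only: integers 1..n with probability proportional to 1/k^s.
local function new_zipf_sketch(n, s)
  s = s or 1.0
  local cumulative, total = {}, 0
  for k = 1, n do
    total = total + 1 / k^s
    cumulative[k] = total        -- running sum of the weights
  end
  return function ()
    local r = math.random() * total
    for k = 1, n do
      if r < cumulative[k] then return k end
    end
    return n                     -- guard against rounding at the top end
  end
end

math.randomseed(711)
local zipf = new_zipf_sketch(8, 1.02)
local counts = {}
for k = 1, 8 do counts[k] = 0 end
for i = 1, 10000 do
  local k = zipf()
  counts[k] = counts[k] + 1
end
print(table.concat(counts, ' '))  -- counts fall off as k increases
```

The array form could be built on top of this by using the returned integer as an index into the array.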

s, stddev = wordcount2zipf (a_word_to_number_table)

This function can supply the s parameter used by new_zipf().
The argument is a table, for example:

 city2population = {
   Chongqing=30165500,
   Shanghai=24183300,
   Beijing=21707000,
   Lagos=16060303,
   Istanbul=15029231,
   Karachi=14910352,    -- etc , etc ...
 }
It returns two numbers: the Zipf-parameter s which best fits the data,
and the standard deviation stddev from which you can guess how reliable your parameter s is.
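
One common way to estimate s is a least-squares straight-line fit of log(count) against log(rank), whose slope is -s. The sketch below assumes that approach and reports the root-mean-square residual of the fit as its second value. Both choices are assumptions made for illustration and may not match what wordcount2zipf() actually computes; fit_zipf_sketch is a hypothetical name. It needs at least two entries, all with positive counts:

```lua
-- Sketch only: estimate the Zipf parameter from a word-to-count table.
local function fit_zipf_sketch(word2count)
  local counts = {}
  for _, c in pairs(word2count) do counts[#counts + 1] = c end
  table.sort(counts, function (a, b) return a > b end)  -- rank 1 = largest
  local n = #counts
  local sx, sy, sxx, sxy = 0, 0, 0, 0
  for rank, c in ipairs(counts) do
    local x, y = math.log(rank), math.log(c)
    sx, sy, sxx, sxy = sx + x, sy + y, sxx + x * x, sxy + x * y
  end
  local slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
  local intercept = (sy - slope * sx) / n
  local ss = 0                   -- sum of squared residuals
  for rank, c in ipairs(counts) do
    local r = math.log(c) - (intercept + slope * math.log(rank))
    ss = ss + r * r
  end
  return -slope, math.sqrt(ss / n)
end

local s, spread = fit_zipf_sketch({
  the=983, ['and']=421, of=340, to=286, I=263, it=252,
})
print(s, spread)
```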


DOWNLOAD

This module is available as a LuaRock in luarocks.org/modules/peterbillam so you should be able to install it with the command:

 $ su
 Password:
 # luarocks install randomdist

or:

 # luarocks install http://www.pjb.com.au/comp/lua/randomdist-1.6-0.rockspec

The test script used during development is www.pjb.com.au/comp/lua/test_randomdist.lua


CHANGES

 20200417 1.6 add rayleigh_irand()
 20180724 1.5 add wordcount2zipf
 20180711 1.4 add zipf distribution
 20171226 1.3 randomgetn allows n == array-size, meaning shuffle the array
 20170819 1.2 add randomgetn()
 20170707 1.1 grand('reset') more robust, and tested
 20170706 1.0 first released version

AUTHOR

Peter J Billam,   www.pjb.com.au/comp/contact.html


SEE ALSO

en.wikipedia.org/wiki/Normal_distribution
en.wikipedia.org/wiki/Random_matrix#Gaussian_ensembles
en.wikipedia.org/wiki/Random_matrix#Distribution_of_level_spacings
en.wikipedia.org/wiki/Montgomery%27s_pair_correlation_conjecture
en.wikipedia.org/wiki/Radial_distribution_function
en.wikipedia.org/wiki/Pair_distribution_function
en.wikipedia.org/wiki/Rayleigh_distribution
en.wikipedia.org/wiki/Zipf%27s_law
luarocks.org/modules/luarocks/lrandom
www.pjb.com.au/comp/random.html
www.pjb.com.au/comp/index.html

Montgomery, Hugh L. (1973), "The pair correlation of zeros of the zeta function", Analytic number theory, Proc. Sympos. Pure Math., XXIV, Providence, R.I.: American Mathematical Society, pp. 181-193, MR 0337821

Odlyzko, A. M. (1987), "On the distribution of spacings between zeros of the zeta function", Mathematics of Computation, American Mathematical Society, 48 (177): 273-308, ISSN 0025-5718, JSTOR 2007890, MR 866115, doi:10.2307/2007890

John Derbyshire, Prime Obsession, Joseph Henry Press, 2003, p.288