An Introduction to Random Number Generators and Monte Carlo Methods

Josh Gilkerson

Wei Li

David Owen

Random Number Generators

Uses for Random Numbers

Monte Carlo simulations

Generation of cryptographic keys

Evolutionary algorithms

Many combinatorial optimization algorithms

Two Types of Random Numbers

Pseudorandom numbers are numbers that appear random, but are obtained in a deterministic, repeatable, and predictable manner.

True random numbers are generated in non-deterministic ways. They are not predictable. They are not repeatable.

True Random Generators

Use one of several sources of randomness

– decay times of radioactive material

– electrical noise from a resistor or semiconductor

– radio channel or audible noise

– keyboard timings

Some sources are better than others

Usually slower than PRNGs

RNG and Random Machines

A computer is deterministic, so it cannot generate true random numbers by computation alone. However, it can generate numbers that are good enough, with properties close to those of true random numbers.

– In 1939, M. G. Kendall and B. Babington-Smith used a machine to produce a table of 100,000 random digits, the first machine used for this purpose.

– In 1955, the RAND Corporation released a table of a million random digits.

– ERNIE is a random number generator machine used to pick the winning numbers in the British Premium Bonds lottery.

Desirable Properties of PRNGs

Uniform

Long period

Serially uncorrelated

Fast

Problems With PRNGs

When problems with a random number generator arise, they are very difficult to pinpoint. Usually the programmer has to replace the whole generator with a better one.

Problems rarely show up in small test cases. They tend to become apparent only in large-scale runs that generate millions or even billions of numbers, which makes debugging difficult.

Large-scale computing problems may require a parallel algorithm. As a result, it is sometimes not possible to reproduce a simulation exactly.

Linear Congruential Generator (LCG)

Most common

Maximum period of $2^n$ for n-bit numbers

$X_{n+1} = (a X_n + c) \bmod m$

a, c, m are constants

$X_0$ is the seed
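As a minimal sketch of the LCG recurrence, the C fragment below uses m = 2^32, so the `mod m` comes for free from 32-bit unsigned wraparound. The names `lcg_t`, `lcg_seed`, `lcg_next`, and `lcg_uniform` are hypothetical, and the constants a = 1664525 and c = 1013904223 are a commonly published pair chosen for illustration, not values taken from these slides.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants (a commonly published 32-bit pair), not from the slides. */
#define LCG_A 1664525u
#define LCG_C 1013904223u

/* The generator only has to remember the last number. */
typedef struct { uint32_t x; } lcg_t;

static void lcg_seed(lcg_t *g, uint32_t seed) {
    assert(g != 0);
    g->x = seed;                      /* X_0 is the seed */
}

/* X_{n+1} = (a X_n + c) mod 2^32; uint32_t arithmetic wraps modulo 2^32. */
static uint32_t lcg_next(lcg_t *g) {
    g->x = LCG_A * g->x + LCG_C;
    return g->x;
}

/* Scale the 32-bit integer to a double in [0, 1). */
static double lcg_uniform(lcg_t *g) {
    return lcg_next(g) / 4294967296.0;
}
```

A caller would declare an `lcg_t`, seed it once with `lcg_seed`, and then call `lcg_next` or `lcg_uniform` repeatedly. The same seed always reproduces the same sequence.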

Advantages of LCG

Most common

Very easily implemented

Fast and small (only the last number has to be remembered)

Easily parallelized

– N processes, numbered 1 ... N.

– Process n uses the numbers $X_{n+iN}$ for i = 0, 1, 2, ...

– No more expensive than the serial version.
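The parallel scheme can be sketched as a "leapfrog" generator: N applications of the LCG step collapse into a single step of the same form, $X_{k+N} = (A X_k + C) \bmod m$ with $A = a^N$ and $C = c\,(a^{N-1} + \dots + a + 1)$, so each process still pays one multiply and one add per number. The code below is an illustration under assumptions: m = 2^32, the same illustrative constants as a typical 32-bit LCG, and hypothetical names `leap_t`, `leap_init`, and `leap_next`.

```c
#include <assert.h>
#include <stdint.h>

#define LCG_A 1664525u        /* assumed illustrative constants, m = 2^32 */
#define LCG_C 1013904223u

typedef struct { uint32_t x, a_n, c_n; } leap_t;

/* One step of the serial generator. */
static uint32_t lcg_step(uint32_t x) { return LCG_A * x + LCG_C; }

/* Set up process `proc` (1-based, as on the slide) out of `nproc`. */
static void leap_init(leap_t *g, uint32_t seed, unsigned proc, unsigned nproc) {
    assert(nproc > 0 && proc >= 1 && proc <= nproc);
    g->a_n = 1u;
    g->c_n = 0u;
    for (unsigned i = 0; i < nproc; i++) {   /* compose the step N times */
        g->c_n = LCG_A * g->c_n + LCG_C;
        g->a_n = LCG_A * g->a_n;
    }
    g->x = seed;
    for (unsigned i = 0; i < proc; i++)      /* advance to X_proc */
        g->x = lcg_step(g->x);
}

/* Returns X_proc, X_{proc+N}, X_{proc+2N}, ... on successive calls. */
static uint32_t leap_next(leap_t *g) {
    uint32_t out = g->x;
    g->x = g->a_n * g->x + g->c_n;           /* jump N serial steps at once */
    return out;
}
```

Because every process derives its stream from the same seed, the processes never have to communicate, and interleaving their outputs reproduces the serial sequence.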

Disadvantages of LCGs

Other generators have longer maximum periods.

Bad choices of m result in very bad sequences (primes work best; powers of 2 are fast, but not nearly as good).

The initial seed can affect the period.

Low-order bits are not random. When m is a power of 2, the lowest k bits repeat with a period of at most $2^k$.

Lagged Fibonacci Generators

Similar to the Fibonacci sequence

Increasingly popular

$X_n = (X_{n-l} + X_{n-k}) \bmod m$, with $l > k > 0$

l seeds are needed

m is usually a power of 2

Maximum period of $(2^l - 1) \times 2^{M-1}$ when $m = 2^M$
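A lagged Fibonacci generator can be sketched with a circular buffer holding the last l values. The lags l = 55 and k = 24 are a classic pair chosen here as an assumption, m = 2^32 comes from unsigned wraparound, and filling the seeds with a small helper LCG is only one possible seeding choice. The names `lfg_t`, `lfg_seed`, and `lfg_next` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LFG_L 55   /* long lag l (assumed) */
#define LFG_K 24   /* short lag k (assumed) */

typedef struct { uint32_t buf[LFG_L]; int pos; } lfg_t;

/* Fill the l seeds using a helper LCG (an illustrative seeding choice). */
static void lfg_seed(lfg_t *g, uint32_t seed) {
    assert(g != 0);
    for (int i = 0; i < LFG_L; i++) {
        seed = 1664525u * seed + 1013904223u;
        g->buf[i] = seed;
    }
    g->buf[0] |= 1u;   /* the maximum period needs at least one odd seed */
    g->pos = 0;        /* buf[pos] is the oldest value, X_{n-l} */
}

/* X_n = (X_{n-l} + X_{n-k}) mod 2^32 */
static uint32_t lfg_next(lfg_t *g) {
    int j = g->pos + (LFG_L - LFG_K);     /* slot holding X_{n-k} */
    if (j >= LFG_L) j -= LFG_L;
    uint32_t x = g->buf[g->pos] + g->buf[j];
    g->buf[g->pos] = x;                   /* overwrite the oldest value */
    g->pos = (g->pos + 1) % LFG_L;
    return x;
}
```

The state is l words rather than one, which is the price paid for the much longer period compared with an LCG.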

Add-with-carry & Subtract-with-borrow

Similar to LFG

AWC: $X_n = (X_{n-l} + X_{n-k} + \text{carry}) \bmod m$

SWB: $X_n = (X_{n-l} - X_{n-k} - \text{carry}) \bmod m$

The carry is 1 if the previous step overflowed (AWC: the sum reached m; SWB: the difference was negative) and 0 otherwise.

Multiply-with-carry Generators

Similar to LCG

$X_n = (a X_{n-1} + \text{carry}) \bmod m$

The carry is the quotient left over from the previous step's division by m.
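A multiply-with-carry step can be sketched in C with m = 2^32, where a 64-bit product holds both the new value (low 32 bits) and the new carry (high 32 bits). The multiplier 698769069 is an illustrative choice that appears in generators posted by George Marsaglia; treat it, and the names `mwc_t` and `mwc_next`, as assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MWC_A 698769069ULL   /* assumed illustrative multiplier */

typedef struct { uint32_t x, carry; } mwc_t;

/* X_n = (a X_{n-1} + carry) mod 2^32, new carry = floor((a X_{n-1} + carry) / 2^32) */
static uint32_t mwc_next(mwc_t *g) {
    assert(g != 0);
    uint64_t t = MWC_A * g->x + g->carry;   /* exact: fits in 64 bits */
    g->x = (uint32_t)t;                     /* low 32 bits: the new number */
    g->carry = (uint32_t)(t >> 32);         /* high 32 bits: the new carry */
    return g->x;
}
```

The generator needs two seed values, an initial x and an initial carry smaller than the multiplier.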

Inverse Congruential Generators

$X_n = (a \cdot \tilde{X}_{n-1} + b) \bmod m$

m should be prime

$\tilde{y}$ is the multiplicative inverse of y in the field over {0, 1, ..., m-1}, with $\tilde{0} = 0$ by convention.
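The inverse congruential step can be sketched as follows, assuming the prime modulus m = 2^31 - 1 and computing the multiplicative inverse with Fermat's little theorem ($y^{m-2} \bmod m$). The parameters a and b are left to the caller, and the names `pow_mod`, `inv_mod`, and `icg_next` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ICG_M 2147483647ULL   /* assumed prime modulus 2^31 - 1 */

/* base^exp mod ICG_M by repeated squaring; products stay below 2^62. */
static uint64_t pow_mod(uint64_t base, uint64_t exp) {
    uint64_t r = 1;
    base %= ICG_M;
    while (exp > 0) {
        if (exp & 1) r = r * base % ICG_M;
        base = base * base % ICG_M;
        exp >>= 1;
    }
    return r;
}

/* Multiplicative inverse of y modulo the prime m, with the inverse of 0 taken as 0. */
static uint64_t inv_mod(uint64_t y) {
    return y == 0 ? 0 : pow_mod(y, ICG_M - 2);
}

/* X_n = (a * inverse(X_{n-1}) + b) mod m */
static uint64_t icg_next(uint64_t x, uint64_t a, uint64_t b) {
    assert(x < ICG_M && a < ICG_M && b < ICG_M);
    return (a * inv_mod(x) + b) % ICG_M;
}
```

The modular inversion makes each step considerably more expensive than an LCG step, which is the usual trade-off of this family.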

PRNG Review

This is just a short review. There are many other PRNGs.

Linear Congruential Generator

Lagged Fibonacci Generator

Add-with-carry Generator

Subtract-with-borrow Generator

Multiply-with-carry Generator

Inverse Congruential Generator

Testing Randomness

Test for uniform distribution (of singletons, pairs, triples, etc.) of the sequence and all subsequences.

DIEHARD - http://stat.fsu.edu/pub/diehard/

NIST - http://csrs.nist.gov/rng
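As a small illustration of a uniformity test on singletons, the sketch below bins samples into equal cells and computes Pearson's chi-square statistic. This is not the DIEHARD or NIST suite, only a hand-rolled check. The helper names `chi_square` and `fill_counts` are hypothetical, and the C library `rand()` stands in for the generator under test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Pearson chi-square statistic of observed bin counts against a uniform expectation. */
static double chi_square(const unsigned long *counts, size_t bins) {
    unsigned long total = 0;
    for (size_t i = 0; i < bins; i++) total += counts[i];
    assert(bins > 0 && total > 0);
    double expected = (double)total / (double)bins;
    double chi2 = 0.0;
    for (size_t i = 0; i < bins; i++) {
        double d = (double)counts[i] - expected;
        chi2 += d * d / expected;
    }
    return chi2;
}

/* Draw n numbers from rand() and histogram them into `bins` equal cells of [0, 1). */
static void fill_counts(unsigned long *counts, size_t bins, unsigned long n) {
    for (size_t i = 0; i < bins; i++) counts[i] = 0;
    for (unsigned long k = 0; k < n; k++) {
        double u = rand() / (RAND_MAX + 1.0);   /* uniform in [0, 1) */
        counts[(size_t)(u * bins)]++;
    }
}
```

For a good generator the statistic should look like a draw from a chi-square distribution with bins - 1 degrees of freedom, so values far above or far below bins - 1 are both suspicious. Testing pairs or triples works the same way with cells in two or three dimensions.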


Monte Carlo Methods

Introduction to Monte Carlo

Monte Carlo methods have been used for centuries.

Their first real use, however, came during World War II, when the method was used to simulate the probabilistic issues of neutron diffusion.

Named after Monte Carlo, the district of Monaco that is one of the world’s centers for gambling, because of the similarity to games of chance.

What is Monte Carlo

Non-Monte Carlo methods typically involve ODEs or PDEs that describe the system.

Monte Carlo methods are stochastic techniques. They are based on the use of random numbers and probability statistics to simulate problems.

Something can be called a Monte Carlo method if it uses random numbers to examine the problem it is solving.

First, determine the probability density function (PDF). Then perform random sampling from the PDF. Keep a record of each simulation performed and tally the results.

Probability Density Function

A probability density function (or probability distribution function) is a function f defined on an interval (a, b) and having the following properties:

$f(x) \ge 0$ for every x in (a, b)

$\int_a^b f(x)\,dx = 1$

Why Use Monte Carlo

It allows us to examine complex systems, and it is usually easy to formulate (independent of the problem).

For example, solving the equations that describe the interaction of two atoms is doable without Monte Carlo methods, but solving the same equations for the interactions of thousands of atoms is impossible.

However, the solutions are imprecise, and the method can be very slow if higher precision is desired: the statistical error typically shrinks only as $1/\sqrt{N}$ in the number of samples N.

Components of Monte Carlo simulation

Probability distribution functions (pdf's) - the physical (or mathematical) system must be described by a set of pdf's.

Random number generator - a source of random numbers uniformly distributed on the unit interval must be available.

Sampling rule - a prescription for sampling from the specified pdf's, assuming the availability of random numbers on the unit interval, must be given.

Scoring (or tallying) - the outcomes must be accumulated into overall tallies or scores for the quantities of interest.

Components of Monte Carlo simulation (cont.)

Error estimation - an estimate of the statistical error (variance) as a function of the number of trials and other quantities must be determined.

Variance reduction techniques - methods for reducing the variance in the estimated solution, and so the computational time for the Monte Carlo simulation.

Parallelization and vectorization - algorithms to allow Monte Carlo methods to be implemented efficiently on advanced computer architectures.

Monte Carlo Example: Computing Pi

Monte Carlo Example (cont.)

So, we can compute pi by generating two random numbers for the x and y components of a simulated throw, then using the Pythagorean theorem to decide whether the throw lands inside or outside the circle. We count the hits, and after thousands of throws (or more) the fraction of hits gives an estimate of pi. With x and y in [0, 1] and radius 1, the throws cover a unit square containing a quarter circle of area pi/4, so pi is approximately 4 × hits / throws.

The accuracy of the estimate depends on the number of “throws”. Example code (assuming the radius is 1):

double x = (double)rand() / RAND_MAX; // random number in [0, 1] for x

double y = (double)rand() / RAND_MAX; // random number in [0, 1] for y

double dist = sqrt(x*x + y*y);        // distance from the origin

if (dist <= 1)

    hits++;
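The throw-and-count procedure can be wrapped into a complete function. The sketch below is one possible arrangement rather than the authors' code: the name `estimate_pi` is hypothetical, the C library `rand()` stands in for a better generator, and the squared distance is compared with 1 so that no square root is needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Estimate pi from `throws` random points in the unit square. */
static double estimate_pi(unsigned long throws, unsigned seed) {
    assert(throws > 0);
    srand(seed);
    unsigned long hits = 0;
    for (unsigned long i = 0; i < throws; i++) {
        double x = (double)rand() / RAND_MAX;   /* x in [0, 1] */
        double y = (double)rand() / RAND_MAX;   /* y in [0, 1] */
        if (x * x + y * y <= 1.0)               /* inside the quarter circle */
            hits++;
    }
    /* The quarter circle covers pi/4 of the square. */
    return 4.0 * (double)hits / (double)throws;
}
```

Calling `estimate_pi(1000000UL, 1u)` should land within a few thousandths of pi. Halving the error takes roughly four times as many throws.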

What MC Needs

Different MC methods may need different RNGs.

– For example, when simulating the outgoing direction of a launched particle and the interactions of the particle with the medium, the following properties are desirable:

The attributes of each particle should be independent of those of every other particle.

The attributes of all the particles should span the entire attribute space. That is, as the number of particles approaches infinity, the particles launched into a space should cover the space completely.

The next slide states the properties the RNG needs.

What MC Needs (cont.)

No subsequence of random numbers should be correlated with any other subsequence. For example, when simulating the launched particles, we should not generate geometrical patterns.

Random number repetition should occur only after a very large number of random numbers has been generated.

The random numbers generated should be uniform. This point and the first one are loosely related: to achieve more uniformity, some correlation between random numbers must be introduced.

The RNG should be efficient. It should be vectorizable with low overhead. The processors in a parallel system should not be required to communicate with each other.

Appropriate PRNGs

The following packages of RNGs are available at http://www.agner.org/random/.

Uniform RNGs in C++ and assembly language: Mersenne Twister, Mother-of-all, RANROT.

In C, we can use drand48(), which returns a random number of type double produced using 48-bit integer arithmetic.


An Application of the Monte Carlo Method

The Effect of Space Discretization on the Canonical Monte Carlo Simulation

Agenda

Introduction to the Monte Carlo (MC) molecular simulation

Canonical Ensemble

Importance Sampling

Simulation Process

Simulation of the equation of state of the Lennard-Jones Fluid

– Continuum Model

– Discretized Model

Comparison of the simulation results

Introduction

Why molecular simulation?

– Helps explain experimental observations

– Simulates critical or extreme conditions

– Guides real experiments

Purpose of this study

– Long-term goal: simulation of the self-assembly of surfactant solutions on a fine lattice

– The continuum model is not viable with current computing power

– The discretized model is at least 10 times faster than the continuum model

– This study examines the effect of space discretization on the simulation results

Canonical Ensemble

Fixed number of molecules N, fixed volume V, fixed temperature T

The canonical ensemble partition function from statistical mechanics:

$$Q = c \int dp^N\, dr^N \exp[-\beta H(r^N, p^N)]$$

Evaluation of an observable property A:

$$\langle A \rangle = \frac{\int dp^N\, dr^N\, A(r^N, p^N) \exp[-\beta H(r^N, p^N)]}{\int dp^N\, dr^N \exp[-\beta H(r^N, p^N)]}$$

Random sampling - brute force Monte Carlo:

$$I = \int_a^b f(x)\, dx = (b - a) \langle f(x) \rangle \approx \frac{b - a}{L} \sum_{i=1}^{L} f(x_i)$$

When estimating $\langle f(x) \rangle$ this way, most of the computing is wasted, because most sample points fall where the integrand is negligibly small.
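The brute-force estimate $I \approx \frac{b-a}{L}\sum_{i=1}^{L} f(x_i)$, with the $x_i$ drawn uniformly from [a, b], can be sketched as a small C function. The names `mc_integrate` and `square` are hypothetical, and `rand()` is used only for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Brute-force Monte Carlo: I ~= (b - a) / L * sum of f(x_i), x_i uniform on [a, b]. */
static double mc_integrate(double (*f)(double), double a, double b, unsigned long L) {
    assert(f != 0 && L > 0 && b > a);
    double sum = 0.0;
    for (unsigned long i = 0; i < L; i++) {
        double x = a + (b - a) * ((double)rand() / RAND_MAX);
        sum += f(x);
    }
    return (b - a) * sum / (double)L;
}

/* Example integrand: the integral of x^2 from 0 to 3 is exactly 9. */
static double square(double x) { return x * x; }
```

A call such as `mc_integrate(square, 0.0, 3.0, 1000000UL)` should return a value close to 9. For a sharply peaked integrand, such as a Boltzmann factor, almost all of the L evaluations would contribute next to nothing.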

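Importance Sampling

Change the integration variable from x to u, where $du = w(x)\, dx$ and w is a weight function normalized on [0, 1]:

$$I = \int_0^1 dx\, f(x) = \int_0^1 dx\, w(x) \frac{f(x)}{w(x)} = \int_0^1 du\, \frac{f(x(u))}{w(x(u))} \approx \frac{1}{L} \sum_{i=1}^{L} \frac{f(x(u_i))}{w(x(u_i))}$$

Standard deviation $\delta_I$ of the estimate:

$$\delta_I^2 = \frac{1}{L} \left[ \left\langle (f/w)^2 \right\rangle - \left\langle f/w \right\rangle^2 \right]$$

For multidimensional integrals, it is impossible to find the weight function w.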

The Metropolis Method

$$\langle A \rangle = \frac{\int dp^N\, dr^N\, A(r^N, p^N) \exp[-\beta H(r^N, p^N)]}{\int dp^N\, dr^N \exp[-\beta H(r^N, p^N)]}$$

Probability of finding the system in a configuration around $r^N$:

$$N(r^N) = \frac{\exp[-\beta H(r^N, p^N)]}{\int dp^N\, dr^N \exp[-\beta H(r^N, p^N)]}$$

$$\langle A \rangle = \sum_{i=1}^{L} N(r_i^N)\, A(r_i^N)$$

Randomly generate sampling points according to the probability distribution $N(r^N)$.

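The Detailed Balance

Generate sampling points according to the probability distribution by imposing detailed balance between the old state o and the new state n:

$$N(o) \times \alpha(o \to n) \times \mathrm{acc}(o \to n) = N(n) \times \alpha(n \to o) \times \mathrm{acc}(n \to o)$$

If α is a symmetric matrix:

$$\frac{\mathrm{acc}(o \to n)}{\mathrm{acc}(n \to o)} = \frac{N(n)}{N(o)} = \exp\{-\beta [u(n) - u(o)]\}$$

The Metropolis choice that satisfies this condition:

$$\mathrm{acc}(o \to n) = \exp\{-\beta [u(n) - u(o)]\} \text{ if } u(n) > u(o), \text{ and } 1 \text{ otherwise}$$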

Simulation Process

Initialize the system

– Put the system in a random state

Make a trial move

– Randomly make a trial move

Calculate the energy change

– Reevaluate the interactions of the moved particle with its neighbors and calculate the energy change

Accept the trial move with the Metropolis scheme

$$P = \begin{cases} \exp\left(-\dfrac{\Delta E}{k_b T}\right) & \Delta E > 0 \\ 1 & \Delta E \le 0 \end{cases}$$

Keep trying moves until the system approaches equilibrium

– Either monitor the total energy change, or monitor the structure formed in the simulation box

Sampling

– Sample a certain property over a certain number of configurations

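Continuum Model

Fixed N, V, T

Lennard-Jones potential

$$u(r) = 4\varepsilon \left[ \left( \frac{\delta}{r} \right)^{12} - \left( \frac{\delta}{r} \right)^{6} \right]$$

Intermolecular force

$$f(r) = -\frac{du}{dr} = \frac{48\varepsilon}{r} \left[ \left( \frac{\delta}{r} \right)^{12} - 0.5 \left( \frac{\delta}{r} \right)^{6} \right]$$

Virial of the system

$$\mathrm{vir} = \frac{1}{3} \sum_{i} \sum_{j>i} \vec{f}(r_{ij}) \cdot \vec{r}_{ij}$$

Pressure of the system

$$P = \frac{\rho}{\beta} + \frac{\mathrm{vir}}{V}$$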
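The continuum-model formulas can be sketched in C as functions of the squared distance, which avoids taking square roots inside the pair loop. The function names are hypothetical. `eps` and `delta` stand for ε and δ, and `lj_virial_pair` returns $f(r) \cdot r$ for one pair, before the factor 1/3 and the sum over pairs.

```c
#include <assert.h>

/* u(r) = 4 eps [ (delta/r)^12 - (delta/r)^6 ], evaluated from r^2. */
static double lj_energy(double r2, double eps, double delta) {
    assert(r2 > 0.0);
    double s2 = delta * delta / r2;      /* (delta/r)^2 */
    double s6 = s2 * s2 * s2;            /* (delta/r)^6 */
    return 4.0 * eps * (s6 * s6 - s6);
}

/* f(r) * r = 48 eps [ (delta/r)^12 - 0.5 (delta/r)^6 ], one pair's virial term. */
static double lj_virial_pair(double r2, double eps, double delta) {
    assert(r2 > 0.0);
    double s2 = delta * delta / r2;
    double s6 = s2 * s2 * s2;
    return 48.0 * eps * (s6 * s6 - 0.5 * s6);
}

/* P = rho / beta + vir / V, where vir already includes the factor 1/3. */
static double pressure(double rho, double beta, double vir, double volume) {
    assert(beta > 0.0 && volume > 0.0);
    return rho / beta + vir / volume;
}
```

Summing `lj_virial_pair` over all pairs with j > i and dividing by 3 gives the virial that `pressure` expects.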

Simulation Process

Flowchart of the main program: Start → Read simulation parameters → New simulation? (yes: initialize positions of all particles; no: read old configuration) → Monte Carlo loop → Stop

Flowchart of the Monte Carlo loop subroutine: Start → Trial move → Satisfy Metropolis rule? (yes: accept the trial move and update energy and virial) → Sample the pressure → End of simulation? (no: return to Trial move; yes: Stop)

Modeling Parameters

ε = 1.0: potential minimum between 2 particles

δ = 1.0: average distance between 2 particles

dr = 0.09: maximum displacement of a particle

N = 100: number of particles

T = 2.0: temperature

ρ = 0.8: density of the system

Simulation Results – Continuum Model

[Figure: Equation of state of L-J fluid, Continuum Model. Horizontal axis from 0.1 to 0.9, vertical axis from 0 to 12.]

[Figure: Energy vs. Density. Density from 0.1 to 0.9, energy from -500 to 0.]

Discretized Model

Space is discretized. Particles can only move on a 3D mesh (fine lattice).

The distance between particles takes values from a fixed set.

Complex functions of distance can be precalculated.

Depending on the form of the functions, the simulation can be accelerated 10 to 100 times.
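Because lattice sites are separated by integer offsets, every squared distance is an integer in lattice units and the potential can be tabulated once. The sketch below illustrates the idea for the Lennard-Jones potential. The names `build_lj_table` and `pair_energy` are hypothetical, and the lattice constant `spacing` is an assumption, since the slides do not say how the discretization level maps to a length.

```c
#include <assert.h>
#include <stdlib.h>

/* Precompute u(r) for every squared lattice distance d2 = dx^2 + dy^2 + dz^2
   up to max_d2 (in lattice units). The caller frees the returned table. */
static double *build_lj_table(int max_d2, double spacing, double eps, double delta) {
    assert(max_d2 >= 1 && spacing > 0.0);
    double *table = malloc((size_t)(max_d2 + 1) * sizeof *table);
    if (table == NULL) return NULL;
    table[0] = 0.0;                          /* d2 = 0 (same site) is never looked up */
    for (int d2 = 1; d2 <= max_d2; d2++) {
        double r2 = spacing * spacing * d2;  /* real squared distance */
        double s2 = delta * delta / r2;
        double s6 = s2 * s2 * s2;
        table[d2] = 4.0 * eps * (s6 * s6 - s6);
    }
    return table;
}

/* Pair energy for an integer lattice offset; dx^2 + dy^2 + dz^2 must not exceed max_d2. */
static double pair_energy(const double *table, int dx, int dy, int dz) {
    return table[dx * dx + dy * dy + dz * dz];
}
```

During the simulation, each pair interaction then costs three integer multiplications and one array read instead of a floating-point evaluation of the potential. The benefit grows with the cost of the function being tabulated.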

Simulation Results – Discretized Model

[Figure: two plots against a horizontal axis from 0.1 to 0.9 (vertical axes from 0 to 12 and from -500 to 0), each comparing the Continuum Model with Discretization = 20, Discretization = 10, and Discretization = 6.]

Conclusion

The equation of state of the L-J fluid obtained from the canonical MC simulation agrees with the values reported in the literature.

The Discretized Model can produce results comparable to those of the Continuum Model.

The Discretized Model makes simulations possible that are out of reach for the normal Continuum Model.

RNG Resources

True Random Numbers

– http://www.random.org/

– http://www.fourmilab.ch/hotbits/

– http://www.robertnz.net/hwrng.htm

– http://world.std.com/~reinhold/truenoise.html

Pseudo-random Number Generators

– http://random.mat.sbg.ac.at/

– http://www.math.utah.edu/~alfeld/Random/Random.html

– http://www.mathcom.com/corpdir/techinfo.mdir/scifaq/q210.html

– http://csep1.phy.ornl.gov/rn/rn.html

Others

– ftp://ftp.isi.edu/in-notes/rfc1750.txt

Monte Carlo Method Resources

– http://csep1.phy.ornl.gov/mc/mc.html

– http://www.chem.unl.edu/zeng/joy/mclab/mcintro.html

– http://csep1.phy.ornl.gov/rn/rn.html

– http://mathworld.wolfram.com/QuasirandomSequence.html

– http://www.agner.org/random/

– http://web.cz3.nus.edu.sg/~yzchen/teach/comphys/sec03.html
