An Introduction to Random Number Generators and Monte Carlo Methods
Josh Gilkerson
Wei Li
David Owen
Random Number Generators
Uses for Random Numbers
– Monte Carlo simulations
– Generation of cryptographic keys
– Evolutionary algorithms
– Many combinatorial optimization algorithms
Two Types of Random Numbers
Pseudorandom numbers are numbers that appear random, but are obtained in a deterministic, repeatable, and predictable manner.
True random numbers are generated in non-deterministic ways. They are not predictable. They are not repeatable.
True Random Generators
Use one of several sources of randomness
– decay times of radioactive material
– electrical noise from a resistor or semiconductor
– radio channel or audible noise
– keyboard timings
Some sources are better than others.
Usually slower than PRNGs.
RNG And Random Machines
A computer cannot generate true random numbers by computation alone, since it is deterministic. However, we can generate random numbers that are good enough, with properties close to those of true random numbers.
– M. G. Kendall and B. Babington-Smith built the first machine used to produce a table of 100,000 random digits, in 1939.
– RAND Corporation in 1955 released a table of a million random digits.
– ERNIE is a random number generator machine used to pick the winning numbers in the British Premium Bonds lottery.
Desirable Properties of PRNGs
– Uniform
– Lengthy period
– Serially uncorrelated
– Fast
Problems With PRNG
It is very difficult to pinpoint problems with a random number generator when they arise. Usually the programmer has to replace the whole generator with a better one.
With small test cases, problems rarely arise. They may become apparent only in large-scale runs that draw millions or even billions of numbers, which makes debugging difficult.
Large-scale computing problems may require a parallel algorithm, and then it is sometimes not possible to reproduce a simulation exactly.
Linear Congruential Generator (LCG)
– Most common
– Maximum period of $2^n$ for n-bit numbers
– $X_{n+1} = (a X_n + c) \bmod m$
– $a$, $c$, $m$ are constants
– $X_0$ is the seed
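The LCG recurrence can be sketched in C as follows. This is a minimal illustration rather than a recommended generator: the constants a = 1664525, c = 1013904223 and m = 2^32 are a commonly quoted choice and are an assumption here (the slides name none), and the function names lcg_next and lcg_uniform are hypothetical.

```c
#include <stdint.h>

/* Illustrative constants (an assumption, not taken from the slides):
   a = 1664525, c = 1013904223, m = 2^32. */
#define LCG_A 1664525u
#define LCG_C 1013904223u

/* Advance the state: X_{n+1} = (a * X_n + c) mod 2^32. */
uint32_t lcg_next(uint32_t *state) {
    uint64_t next = (uint64_t)LCG_A * (*state) + LCG_C;
    *state = (uint32_t)next;   /* keeping the low 32 bits is the "mod 2^32" */
    return *state;
}

/* Map the 32-bit value to a double in [0, 1). */
double lcg_uniform(uint32_t *state) {
    return lcg_next(state) / 4294967296.0;   /* divide by 2^32 */
}
```

Starting from the seed X0 = 0, the first two values returned by lcg_next are 1013904223 and 1196435762. Only the last value has to be remembered between calls.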
Advantages of LCG
– Most common
– Very easily implemented
– Fast and small (remember only last number)
– Easily parallelized
  – N processes 1 ... N.
  – numbers for process n are $X_{n+iN}$, $i = 0, 1, 2, \ldots$
  – no more expensive than serial version.
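One way to realize the parallel scheme above is "leapfrogging": N steps of an LCG are again an LCG, with multiplier A = a^N mod m and increment C = c(a^(N-1) + ... + a + 1) mod m, so each process can jump N steps with a single multiply-add. The sketch below assumes m = 2^32 and the illustrative constants a = 1664525, c = 1013904223; the type and function names are hypothetical.

```c
#include <stdint.h>

#define LCG_A 1664525u      /* illustrative constants, an assumption */
#define LCG_C 1013904223u

typedef struct {
    uint32_t x;     /* current value of this process's substream */
    uint32_t mul;   /* A = a^N mod 2^32 */
    uint32_t add;   /* C = c * (a^(N-1) + ... + a + 1) mod 2^32 */
} leapfrog_t;

/* (a * x + c) mod 2^32, computed without relying on integer promotion rules. */
uint32_t mul_add(uint32_t a, uint32_t x, uint32_t c) {
    return (uint32_t)((uint64_t)a * x + c);
}

/* Prepare the substream of process n (1..nprocs) for seed X_0 = seed. */
void leapfrog_init(leapfrog_t *g, uint32_t seed, unsigned n, unsigned nprocs) {
    uint32_t mul = 1u, add = 0u;
    for (unsigned i = 0; i < nprocs; i++) {   /* compose the serial step N times */
        mul = mul_add(LCG_A, mul, 0u);
        add = mul_add(LCG_A, add, LCG_C);
    }
    g->mul = mul;
    g->add = add;
    g->x = seed;
    for (unsigned i = 0; i < n; i++)          /* advance serially from X_0 to X_n */
        g->x = mul_add(LCG_A, g->x, LCG_C);
}

/* Return X_{n+iN} and jump N steps ahead with one multiply-add. */
uint32_t leapfrog_next(leapfrog_t *g) {
    uint32_t out = g->x;
    g->x = mul_add(g->mul, g->x, g->add);
    return out;
}
```

Each call costs the same as one step of the serial generator, and the processes never need to exchange state after initialization.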
Disadvantages of LCGs
Other generators have longer maximum periods.
Bad choices of m result in very bad sequences (primes work best; powers of 2 are fast, but not nearly as good).
The initial seed can affect the period.
Low-order bits are not random.
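The weakness of the low-order bits is easy to see when m is a power of 2. With odd a and odd c, the lowest bit of X_{n+1} is the lowest bit of X_n flipped, so it simply alternates 0, 1, 0, 1. The sketch below counts how often that bit fails to flip; the constants and the function name low_bit_repeats are assumptions for illustration.

```c
#include <stdint.h>

/* Count how often the lowest bit of consecutive LCG outputs stays the same.
   Uses X_{n+1} = (1664525 * X_n + 1013904223) mod 2^32 (illustrative constants). */
unsigned low_bit_repeats(uint32_t seed, unsigned steps) {
    uint32_t x = seed;
    unsigned repeats = 0;
    for (unsigned i = 0; i < steps; i++) {
        uint32_t next = (uint32_t)((uint64_t)1664525u * x + 1013904223u);
        if ((next & 1u) == (x & 1u))
            repeats++;            /* never happens when a and c are both odd */
        x = next;
    }
    return repeats;
}
```

For any seed the function returns 0, which is why the high-order bits should be used when only a few bits are needed from such a generator.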
Lagged Fibonacci Generators
– Similar to the Fibonacci sequence
– Increasingly popular
– $X_n = (X_{n-l} + X_{n-k}) \bmod m$, with $l > k > 0$
– $l$ seeds are needed
– $m$ is usually a power of 2
– Maximum period of $(2^l - 1) \times 2^{M-1}$ when $m = 2^M$
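An additive lagged Fibonacci generator can be sketched with a circular buffer holding the last l values. The lags l = 55, k = 24 and the modulus m = 2^32 are an illustrative assumption (a classic textbook pair, not given on the slides); filling the l seeds from a small LCG is also an assumption, as are all names.

```c
#include <stdint.h>

#define LFG_L 55   /* long lag l (illustrative choice) */
#define LFG_K 24   /* short lag k (illustrative choice) */

typedef struct {
    uint32_t state[LFG_L];  /* circular buffer of the last l values */
    int pos;                /* index of X_{n-l}, the oldest value */
} lfg_t;

/* Fill the l seeds from a simple LCG; at least one seed must be odd. */
void lfg_seed(lfg_t *g, uint32_t seed) {
    for (int i = 0; i < LFG_L; i++) {
        seed = (uint32_t)((uint64_t)1664525u * seed + 1013904223u);
        g->state[i] = seed;
    }
    g->state[0] |= 1u;
    g->pos = 0;
}

/* X_n = (X_{n-l} + X_{n-k}) mod 2^32 */
uint32_t lfg_next(lfg_t *g) {
    int lag_k = g->pos + (LFG_L - LFG_K);        /* index of X_{n-k} */
    if (lag_k >= LFG_L)
        lag_k -= LFG_L;
    uint32_t x = (uint32_t)((uint64_t)g->state[g->pos] + g->state[lag_k]);
    g->state[g->pos] = x;                        /* overwrite the oldest value */
    g->pos = (g->pos + 1) % LFG_L;
    return x;
}
```

Compared with an LCG, the state grows from one number to l numbers, which is the price of the longer period.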
Add-with-carry & Subtract-with-borrow
Similar to LFG
AWC: $X_n = (X_{n-l} + X_{n-k} + \mathrm{carry}) \bmod m$
SWB: $X_n = (X_{n-l} - X_{n-k} - \mathrm{carry}) \bmod m$
Multiply-with-carry Generators
Similar to LCG
$X_n = (a X_{n-1} + \mathrm{carry}) \bmod m$
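In a multiply-with-carry generator the carry is the part of a X_{n-1} + carry that overflows m, and it is fed into the next step. A minimal sketch, assuming m = 2^16 and the multiplier a = 36969 from Marsaglia's 16-bit MWC (both are assumptions, the slides give no constants); the names are hypothetical.

```c
#include <stdint.h>

#define MWC_A 36969u   /* illustrative multiplier for base m = 2^16 */

typedef struct {
    uint32_t x;      /* current value, 0 <= x < 2^16 */
    uint32_t carry;  /* current carry, must start below MWC_A */
} mwc_t;

/* t = a * X_{n-1} + carry;  X_n = t mod 2^16;  new carry = floor(t / 2^16). */
uint32_t mwc_next(mwc_t *g) {
    uint32_t t = MWC_A * g->x + g->carry;  /* stays below 2^32 for carry < MWC_A */
    g->x = t & 0xFFFFu;
    g->carry = t >> 16;
    return g->x;
}
```

Starting from x = 1 and carry = 0, the first value is 36969 (with carry 0) and the second is 19217 (with carry 20854).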
Inverse Congruential Generators
$X_n = (a \tilde{X}_{n-1} + b) \bmod m$
$m$ should be prime
$\tilde{y}$ is the multiplicative inverse of $y$ in the field over $\{0, 1, \ldots, m-1\}$.
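For a prime m the multiplicative inverse can be computed with Fermat's little theorem, y^(m-2) mod m, with the inverse of 0 taken to be 0 by convention. The sketch below assumes the prime m = 2^31 - 1; the constants a = 16807 and b = 1 are arbitrary illustrative values with no claim about the resulting period, and all names are hypothetical.

```c
#include <stdint.h>

#define ICG_M 2147483647u   /* prime modulus 2^31 - 1 (assumption) */
#define ICG_A 16807u        /* arbitrary illustrative multiplier */
#define ICG_B 1u            /* arbitrary illustrative increment */

/* base^exp mod ICG_M by repeated squaring. */
uint32_t pow_mod(uint32_t base, uint32_t exp) {
    uint64_t result = 1u, b = base % ICG_M;
    while (exp > 0u) {
        if (exp & 1u)
            result = (result * b) % ICG_M;
        b = (b * b) % ICG_M;
        exp >>= 1;
    }
    return (uint32_t)result;
}

/* Multiplicative inverse modulo the prime ICG_M; 0 maps to 0 by convention. */
uint32_t inv_mod(uint32_t y) {
    return (y % ICG_M == 0u) ? 0u : pow_mod(y, ICG_M - 2u);
}

/* X_n = (a * inverse(X_{n-1}) + b) mod m */
uint32_t icg_next(uint32_t x) {
    return (uint32_t)(((uint64_t)ICG_A * inv_mod(x) + ICG_B) % ICG_M);
}
```

The modular inverse makes each step noticeably more expensive than an LCG step.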
PRNG Review
This is just a short review. There are many other PRNGs.
– Linear Congruential Generator
– Lagged Fibonacci Generator
– Add-with-carry Generator
– Subtract-with-borrow Generator
– Multiply-with-carry Generator
– Inverse Congruential Generator
Testing Randomness
Test for uniform distribution (of singletons, pairs, triples, etc.) of the sequence and all subsequences.
DIEHARD - http://stat.fsu.edu/pub/diehard/
NIST - http://csrs.nist.gov/rng
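The simplest of these checks, uniformity of singletons, can be sketched as a chi-square statistic over equal-width bins. This is only an illustration of the idea and not a substitute for the suites above; the function name, the bin limit MAX_BINS and the error convention are assumptions.

```c
#include <stddef.h>

#define MAX_BINS 256

/* Chi-square statistic for n samples u[i] in [0, 1) sorted into equal bins.
   Returns -1.0 if the arguments are out of range. */
double chi_square_uniform(const double *u, size_t n, size_t bins) {
    size_t count[MAX_BINS] = {0};
    if (n == 0 || bins == 0 || bins > MAX_BINS)
        return -1.0;
    for (size_t i = 0; i < n; i++) {
        size_t b = (size_t)(u[i] * (double)bins);
        if (b >= bins)
            b = bins - 1;                 /* guard against u[i] == 1.0 */
        count[b]++;
    }
    double expected = (double)n / (double)bins;
    double chi2 = 0.0;
    for (size_t b = 0; b < bins; b++) {
        double d = (double)count[b] - expected;
        chi2 += d * d / expected;         /* sum of (observed - expected)^2 / expected */
    }
    return chi2;
}
```

For a uniform source the statistic should be close to bins - 1 on average; values far above that suggest non-uniformity. Pairs and triples can be tested the same way by binning points in the unit square or cube.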
Monte Carlo Methods
Introduction to Monte Carlo
Monte Carlo methods have been used for centuries.
However, their first real use came during World War II, when the method was used to simulate the probabilistic problems of neutron diffusion.
Named after Monte Carlo, the district of Monaco that is one of the world's centers for gambling, because of the similarity to games of chance.
What is Monte Carlo
Non-Monte Carlo methods typically involve ordinary or partial differential equations (ODEs/PDEs) that describe the system.
Monte Carlo methods are stochastic techniques. They are based on the use of random numbers and probability statistics to simulate problems.
Something can be called a Monte Carlo method if it uses random numbers to examine the problem it is solving.
First, we determine the probability density function (PDF). Then we perform random sampling from the PDF. We keep a record of each simulation performed and tally the results.
Probability Density Function
A probability density function (or probability distribution function) is a function f defined on an interval (a, b) and having the following properties:
– $f(x) \ge 0$ for every $x$ in $(a, b)$
– $\int_a^b f(x)\,dx = 1$
Why use Monte Carlo
It allows us to examine complex systems, and it is usually easy to formulate (independent of the problem).
For example, solving the equations that describe the interaction of two atoms is doable without the Monte Carlo method, but solving the same equations for the interactions of thousands of atoms is not feasible.
However, the solutions are imprecise, and the method can be very slow if higher precision is desired.
Components of Monte Carlo simulation
Probability distribution functions (pdf's) - the physical (or mathematical) system must be described by a set of pdf's.
Random number generator - a source of random numbers uniformly distributed on the unit interval must be available.
Sampling rule - a prescription for sampling from the specified pdf's, assuming the availability of random numbers on the unit interval, must be given.
Scoring (or tallying) - the outcomes must be accumulated into overall tallies or scores for the quantities of interest.
Components of Monte Carlo simulation (cont.)
Error estimation - an estimate of the statistical error (variance) as a function of the number of trials and other quantities must be determined.
Variance reduction techniques - methods for reducing the variance in the estimated solution to reduce the computational time for Monte Carlo simulation.
Parallelization and vectorization - algorithms to allow Monte Carlo methods to be implemented efficiently on advanced computer architectures.
Monte Carlo Example Computing Pi
Monte Carlo Example (cont.)
So, we can compute PI by generating two random numbers for the x and y components of a simulated throw. Then we use the Pythagorean theorem to decide whether the throw landed inside or outside the circle. We count the hits. For a quarter circle of radius 1 inside the unit square, the fraction of hits approaches the ratio of the two areas, PI/4, so after thousands of throws (or more), 4 times the fraction of hits gives an estimate of PI.
The accuracy of the estimate depends on the number of "throws". An example code would be (assuming we set the radius = 1):
double x = drand48();           // random number in [0, 1) for x
double y = drand48();           // random number in [0, 1) for y
double dist = sqrt(x*x + y*y);  // distance of the throw from the origin
if (dist <= 1.0)
    hits++;                     // the throw landed inside the circle
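A complete version of the throw loop might look like the sketch below. It uses the standard rand() as a stand-in uniform generator (an assumption made for portability; a better generator can be substituted), compares the squared distance to avoid the square root, and the name estimate_pi is hypothetical.

```c
#include <stdlib.h>

/* Estimate PI from `throws` random points in the unit square. */
double estimate_pi(long throws) {
    long hits = 0;
    for (long i = 0; i < throws; i++) {
        double x = rand() / (RAND_MAX + 1.0);   /* x in [0, 1) */
        double y = rand() / (RAND_MAX + 1.0);   /* y in [0, 1) */
        if (x * x + y * y <= 1.0)               /* inside the quarter circle */
            hits++;
    }
    return 4.0 * (double)hits / (double)throws; /* hit fraction approximates PI/4 */
}
```

The statistical error of the estimate shrinks like one over the square root of the number of throws, so a million throws typically gives only two or three correct decimal digits.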
What MC Needs
Different MC methods might need different RNGs.
– For example, when simulating the outgoing direction of a launched particle and the interactions of the particle with the medium, the following properties would be desirable:
  – The attributes of each particle should be independent of those of every other particle.
  – The attributes of all the particles should span the entire attribute space. I.e., as the number of particles approaches infinity, the particles launched into a space should cover the space completely.
The next slide states the properties needed of the RNG.
What MC Needs (cont.)
No subsequence of random numbers should be correlated with any other subsequence. For example, when simulating the launched particles, we should not generate geometrical patterns.
Random number repetition should occur only after a very large number of random numbers has been generated.
The random numbers generated should be uniform. This point and the first one are loosely related: to achieve more uniformity, some correlations between random numbers must be established.
The RNG should be efficient. It should be vectorizable with low overhead, and the processors in a parallel system should not need to communicate with each other.
Appropriate PRNGs
The following packages of RNGs are available (http://www.agner.org/random/).
– Uniform RNG in C++ & assembly language
  – Mersenne twister
  – Mother-of-all
  – RANROT
In C, we can use drand48() to generate a random number of type double, which is produced using 48-bit integer arithmetic.
An Application of the Monte Carlo Method
The Effect of Space Discretization on the Canonical Monte Carlo Simulation
Agenda
– Introduction to the Monte Carlo (MC) molecular simulation
– Canonical Ensemble
– Importance Sampling
– Simulation Process
– Simulation of the equation of state of the Lennard-Jones Fluid – Continuum Model
– Discretized Model
– Comparison of the simulation results
Introduction
Why molecular simulation?
– Helps explain experimental observations
– Simulates critical or extreme conditions
– Guides real experiments
Purpose of this study
– Long-term goal: simulation of self-assembly of surfactant solutions (fine lattice)
– The continuum model is not viable with current computing power
– The discretized model is at least 10 times faster than the continuum model
– The effect of space discretization on the simulation results
Canonical Ensemble
Fixed number of molecules N, fixed volume V, fixed temperature T
The canonical ensemble partition function from statistical mechanics
$$Q = c \int dp^N \, dr^N \exp[-\beta H(r^N, p^N)]$$
Evaluation of observable properties A
$$\langle A \rangle = \frac{\int dp^N \, dr^N \, A(p^N, r^N) \exp[-\beta H(r^N, p^N)]}{\int dp^N \, dr^N \exp[-\beta H(r^N, p^N)]}$$
Random sampling - brute force Monte Carlo
$$I = \int_a^b f(x)\,dx$$
$$I = (b - a) \langle f(x) \rangle \approx \frac{b - a}{L} \sum_{i=1}^{L} f(x_i)$$
When estimating $\langle f(x) \rangle$ this way, most of the computing is wasted on points that contribute little to the average
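The brute force estimate I = (b - a) times the average of f over L uniformly drawn points can be sketched as below. The integrand square and the use of rand() as the uniform source are assumptions made for the example, and the function names are hypothetical.

```c
#include <stdlib.h>

/* Brute force Monte Carlo: I is approximately (b - a) / L * sum of f(x_i),
   with the x_i drawn uniformly from [a, b). */
double mc_integrate(double (*f)(double), double a, double b, long samples) {
    double sum = 0.0;
    for (long i = 0; i < samples; i++) {
        double u = rand() / (RAND_MAX + 1.0);   /* uniform in [0, 1) */
        sum += f(a + (b - a) * u);
    }
    return (b - a) * sum / (double)samples;
}

/* Example integrand: the exact integral of x^2 over [0, 3] is 9. */
double square(double x) {
    return x * x;
}
```

Calling mc_integrate(square, 0.0, 3.0, 400000) should give a value close to 9. Every sample costs the same regardless of how much it contributes, which is the waste that importance sampling tries to reduce.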
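Importance Sampling
Change the integration variable from $x$ to $u(x)$, where $du/dx = w(x)$, $u(0) = 0$ and $u(1) = 1$
$$I = \int_0^1 f(x)\,dx = \int_0^1 w(x) \frac{f(x)}{w(x)}\,dx = \int_0^1 \frac{f(x(u))}{w(x(u))}\,du \approx \frac{1}{L} \sum_{i=1}^{L} \frac{f(x(u_i))}{w(x(u_i))}$$
Variance of the estimate (its square root is the standard deviation)
$$\delta_I^2 = \frac{1}{L} \left[ \left\langle (f/w)^2 \right\rangle - \left\langle f/w \right\rangle^2 \right]$$
It is generally not possible to find a suitable weight function w for multidimensional integrals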
The Metropolis Method
$$\langle A \rangle = \frac{\int dp^N \, dr^N \, A(p^N, r^N) \exp[-\beta H(r^N, p^N)]}{\int dp^N \, dr^N \exp[-\beta H(r^N, p^N)]}$$
Probability of finding the system in a configuration around $r^N$
$$N(r^N) = \frac{\exp[-\beta H(r^N, p^N)]}{\int dp^N \, dr^N \exp[-\beta H(r^N, p^N)]}$$
Evaluation of observable properties A
$$\langle A \rangle = \sum_{i=1}^{L} N(r_i^N) A(r_i^N)$$
Randomly generate sampling points according to the probability distribution $N(r^N)$
The Detailed Balance
Generate sampling points according to the probability distribution by imposing detailed balance
$$N(o) \times \alpha(o \to n) \times acc(o \to n) = N(n) \times \alpha(n \to o) \times acc(n \to o)$$
If α is a symmetric matrix
$$\frac{acc(o \to n)}{acc(n \to o)} = \frac{N(n)}{N(o)} = \exp\{-\beta[u(n) - u(o)]\}$$
$$acc(o \to n) = \exp\{-\beta[u(n) - u(o)]\} \quad \text{for } u(n) > u(o), \text{ and } 1 \text{ otherwise}$$
Simulation Process
Initialize the system
– Put the system in a random state
Make a trial move
– Randomly make a trial move
Calculate the energy change
– Reevaluate the interactions of the moved particle with its neighbors and calculate the energy change
Accept the trial move with the Metropolis scheme
$$P = \begin{cases} \exp\left(-\dfrac{\Delta E}{k_b T}\right) & \Delta E > 0 \\ 1 & \Delta E \le 0 \end{cases}$$
Keep trying moves until the system approaches equilibrium
– Either monitor the total energy change, or monitor the structure formed in the simulation box
Sampling
– Sample a certain property over a certain number of configurations
Continuum Model
Fixed N, V, T
Lennard-Jones potential
$$u(r) = 4\varepsilon\left[\left(\frac{\delta}{r}\right)^{12} - \left(\frac{\delta}{r}\right)^{6}\right]$$
Intermolecular force
$$f(r) = -\frac{du}{dr} = \frac{48\varepsilon}{r}\left[\left(\frac{\delta}{r}\right)^{12} - 0.5\left(\frac{\delta}{r}\right)^{6}\right]$$
Virial of the system
$$vir = \frac{1}{3}\sum_{i}\sum_{j>i} \vec{f}(r_{ij}) \cdot \vec{r}_{ij}$$
Pressure of the system
$$P = \frac{\rho}{\beta} + \frac{vir}{V}$$
Simulation Process
Main program
– Start
– Read simulation parameters
– New simulation? If yes, initialize positions of all particles; if no, read old configuration
– Monte Carlo loop
– Stop
Monte Carlo loop subroutine
– Start
– Trial move
– Satisfy Metropolis rule? If yes, accept the trial move and update energy and virial
– Sample the pressure
– End of simulation? If yes, stop; if no, make another trial move
Parameters Modeling
– $\varepsilon = 1.0$: potential minimum between 2 particles
– $\delta = 1.0$: average distance between 2 particles
– $dr = 0.09$: maximum displacement of a particle
– $N = 100$: number of particles
– $T = 2.0$: temperature
– $\rho = 0.8$: density of the system
Simulation Results – Continuum Model
[Figure: Equation of state of L-J fluid - Continuum Model, for densities from 0.1 to 0.9]
[Figure: Energy vs. Density, for densities from 0.1 to 0.9]
Discretized Model
Space is discretized. Particles can only move on a 3D mesh (fine lattice).
The distance between particles can take only a fixed set of values.
Complex functions of the distance can therefore be precalculated.
Depending on the form of the functions, the simulation can be accelerated 10-100 times.
Simulation Results – Discretized Model
[Figure: two plots for densities from 0.1 to 0.9, each comparing the Continuum Model with Discretization = 20, Discretization = 10, and Discretization = 6]
Conclusion
The equation of state of the L-J fluid from the canonical MC simulation agrees with what is reported in the literature.
The Discretized Model can produce results comparable to the Continuum Model.
The Discretized Model makes possible simulations that are out of reach for the normal Continuum Model.
RNG Resources
True Random Numbers
– http://www.random.org/
– http://www.fourmilab.ch/hotbits/
– http://www.robertnz.net/hwrng.htm
– http://world.std.com/~reinhold/truenoise.html
Pseudo-random Number Generators
– http://random.mat.sbg.ac.at/
– http://www.math.utah.edu/~alfeld/Random/Random.html
– http://www.mathcom.com/corpdir/techinfo.mdir/scifaq/q210.html
– http://csep1.phy.ornl.gov/rn/rn.html
Others
– ftp://ftp.isi.edu/in-notes/rfc1750.txt
Monte Carlo Method Resources
– http://csep1.phy.ornl.gov/mc/mc.html
– http://www.chem.unl.edu/zeng/joy/mclab/mcintro.html
– http://csep1.phy.ornl.gov/rn/rn.html
– http://mathworld.wolfram.com/QuasirandomSequence.html
– http://www.agner.org/random/
– http://web.cz3.nus.edu.sg/~yzchen/teach/comphys/sec03.html