why the irs cares about the riemann zeta function and number … · 2019. 9. 9. · intro general...
TRANSCRIPT
Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains
Why the IRS cares about the Riemann Zeta Function and Number Theory
(and why you should too!)
Steven J. Miller, [email protected]
http://web.williams.edu/Mathematics/sjmiller/public_html/
Stresa, Italy, July 11, 2019
Introduction
A. Berger and T. P. Hill, An Introduction to Benford’s Law, Princeton University Press, Princeton, 2015. See also http://www.benfordonline.net/.
A. E. Kossovsky, Benford’s Law: Theory, the General Law of Relative Quantities, and Forensic Fraud Detection Applications, WSPC, 2014.
S. J. Miller (editor), Theory and Applications of Benford’s Law, Princeton University Press, 2015.
M. Nigrini, Benford’s Law: Applications for Forensic Accounting, Auditing, and Fraud Detection, 1st Edition, Wiley, 2014.
Interesting Question
Motivating Question: For a nice data set, such as the Fibonacci numbers, stock prices, street addresses of college employees and students, ..., what percent of the leading digits are 1?
Natural guess: 10% (but immediately correct to 11%, since a leading digit cannot be 0, leaving nine choices!).
Answer: Benford’s law!
Examples with First Digit Bias
Fibonacci numbers
Most common iPhone passcodes
Twitter users by # followers
Distance of stars from Earth
Summary
Explain Benford’s Law.
Discuss examples and applications.
Sketch proofs.
Describe open problems.
Caveats!
A math test indicating fraud is not proof of fraud: unlikely events, alternate reasons.
Examples
recurrence relations
special functions (such as n!)
iterates of power, exponential, rational maps
products of random variables
L-functions, characteristic polynomials
iterates of the 3x + 1 map
differences of order statistics
hydrology and financial data
many hierarchical Bayesian models
Applications
Analyzing round-off errors.
Determining the optimal way to store numbers.
Detecting tax and image fraud, and assessing data integrity.
General Theory
Benford’s Law: Newcomb (1881), Benford (1938)
Statement: For many data sets, the probability of observing a first digit of d base B is $\log_B\left(\frac{d+1}{d}\right)$; base 10, about 30% are 1s.
Figure: Benford’s Law (probabilities).
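As a quick illustration of the statement, here is a minimal sketch that tabulates the first-digit probabilities $\log_B((d+1)/d)$. The function name `benford_probabilities` is a hypothetical helper for this example, not something from the talk.

```python
import math

def benford_probabilities(base=10):
    """Return {d: log_base((d + 1) / d)} for first digits d = 1, ..., base - 1."""
    return {d: math.log((d + 1) / d, base) for d in range(1, base)}

probs = benford_probabilities()
for digit, p in probs.items():
    # Base 10: digit 1 comes out near 30%, digit 9 near 4.6%.
    print(f"{digit}: {p:.3f}")
```

The probabilities telescope to $\log_B B = 1$, so they form a genuine distribution on the digits 1 through B − 1.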
Background Material
Modulo: a = b mod c if a − b is an integer times c; thus 17 = 5 mod 12, and 4.5 = .5 mod 1.
Significand: $x = S_{10}(x) \cdot 10^k$, k integer, $1 \le S_{10}(x) < 10$.
$S_{10}(x) = S_{10}(\tilde{x})$ if and only if $x$ and $\tilde{x}$ have the same leading digits. Note $\log_{10} x = \log_{10} S_{10}(x) + k$.
Key observation: $\log_{10}(x) = \log_{10}(\tilde{x})$ mod 1 if and only if $x$ and $\tilde{x}$ have the same leading digits.
Thus often study $y = \log_{10} x$ mod 1.
Advanced: $e^{2\pi i u} = e^{2\pi i (u \bmod 1)}$.
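The significand and the "log mod 1" viewpoint can be sketched in a few lines. This is only an illustration under the definitions above; `significand` and `leading_digit` are hypothetical helper names, and floating-point rounding can misplace values that sit exactly on a digit boundary (such as exact powers of 10).

```python
import math

def significand(x):
    """S_10(x) in [1, 10), computed as 10 ** (log10|x| mod 1)."""
    y = math.log10(abs(x)) % 1.0   # fractional part of log10 x
    return 10.0 ** y

def leading_digit(x):
    """First significant digit of x (floating point, so boundary cases may be off)."""
    return int(significand(x))

print(significand(314.15))     # close to 3.1415
print(leading_digit(0.00272))  # first significant digit of 0.00272
```

Two numbers with the same leading digits, such as 314.15 and 3.1415, have the same value of log10 mod 1, which is the key observation above.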
Equidistribution and Benford’s Law
Equidistribution: $\{y_n\}_{n=1}^{\infty}$ is equidistributed modulo 1 if the probability that $y_n \bmod 1 \in [a, b]$ tends to $b - a$:
$\frac{\#\{n \le N : y_n \bmod 1 \in [a, b]\}}{N} \to b - a.$
Thm: If $\beta \notin \mathbb{Q}$, then $n\beta$ is equidistributed mod 1.
Examples: $\log_{10} 2,\ \log_{10}\left(\frac{1+\sqrt{5}}{2}\right) \notin \mathbb{Q}$.
Proof: if rational: $2 = 10^{p/q}$. Thus $2^q = 10^p$ or $2^{q-p} = 5^p$, impossible.
Example of Equidistribution: n√π mod 1
Figures (four plots on [0, 1]): n√π mod 1 for n ≤ 10, n ≤ 100, n ≤ 1000 and n ≤ 10,000.
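The equidistribution of n√π mod 1 can be checked numerically. The sketch below counts how often n·β mod 1 lands in an interval [a, b]; the interval [0.2, 0.5] and the function name `fraction_in_interval` are arbitrary choices for illustration.

```python
import math

def fraction_in_interval(beta, n_max, a, b):
    """Fraction of n <= n_max with (n * beta mod 1) in [a, b]."""
    hits = sum(1 for n in range(1, n_max + 1) if a <= (n * beta) % 1.0 <= b)
    return hits / n_max

beta = math.sqrt(math.pi)
for n_max in (10, 100, 1000, 10000):
    # As n_max grows the fraction should approach b - a = 0.3.
    print(n_max, fraction_in_interval(beta, n_max, 0.2, 0.5))
```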
Logarithms and Benford’s Law
Fundamental Equivalence: Data set $\{x_i\}$ is Benford base B if $\{y_i\}$ is equidistributed mod 1, where $y_i = \log_B x_i$.
If $x = S_{10}(x) \cdot 10^k$ then
$\log_{10} x = \log_{10} S_{10}(x) + k = \log_{10} S_{10}(x) \bmod 1.$
$\text{Prob}(\text{leading digit } d) = \log_{10}(d+1) - \log_{10}(d) = \log_{10}\left(\frac{d+1}{d}\right) = \log_{10}\left(1 + \frac{1}{d}\right).$
Have Benford’s law ↔ mantissa of logarithms of data are uniformly distributed.
The Power of the Right Perspective
Examples
$2^n$ is Benford base 10 as $\log_{10} 2 \notin \mathbb{Q}$.
Fibonacci numbers are Benford base 10.
$a_{n+1} = a_n + a_{n-1}$.
Guess $a_n = r^n$: $r^{n+1} = r^n + r^{n-1}$ or $r^2 = r + 1$.
Roots $r = (1 \pm \sqrt{5})/2$.
General solution: $a_n = c_1 r_1^n + c_2 r_2^n$.
Binet: $a_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n$.
Most linear recurrence relations are Benford, but not all:
⋄ $a_{n+1} = 2a_n - a_{n-1}$
⋄ take $a_0 = a_1 = 1$ (constant sequence) or $a_0 = 0$, $a_1 = 1$ (so $a_n = n$).
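The Fibonacci claim is easy to probe empirically. The sketch below tallies first digits of the first 1000 Fibonacci numbers and prints them next to the Benford probabilities; the sample size and the helper names are assumptions made for this example.

```python
import math
from collections import Counter

def fibonacci(count):
    """First `count` Fibonacci numbers 1, 1, 2, 3, 5, ..."""
    a, b = 0, 1
    out = []
    for _ in range(count):
        a, b = b, a + b
        out.append(a)
    return out

def first_digit_frequencies(values):
    """Observed frequency of each first digit 1..9 among positive integers."""
    counts = Counter(int(str(v)[0]) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}

freq = first_digit_frequencies(fibonacci(1000))
for d in range(1, 10):
    print(d, round(freq[d], 3), round(math.log10(1 + 1 / d), 3))
```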
Digits of $2^n$
First 60 values of $2^n$ (only displaying 30): $2^{10} = 1024 \approx 10^3$.

1      1024     1048576
2      2048     2097152
4      4096     4194304
8      8192     8388608
16     16384    16777216
32     32768    33554432
64     65536    67108864
128    131072   134217728
256    262144   268435456
512    524288   536870912

digit  # Obs  Prob  Benf Prob
1      18     .300  .301
2      12     .200  .176
3      6      .100  .125
4      6      .100  .097
5      6      .100  .079
6      4      .067  .067
7      2      .033  .058
8      5      .083  .051
9      1      .017  .046
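The observed counts in the table can be reproduced directly. A minimal sketch, assuming "first 60 values" means $2^0$ through $2^{59}$:

```python
from collections import Counter

# First digits of 2^0, 2^1, ..., 2^59.
counts = Counter(int(str(2 ** n)[0]) for n in range(60))
observed = [counts[d] for d in range(1, 10)]
print(observed)  # [18, 12, 6, 6, 6, 4, 2, 5, 1]
```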
Logarithms and Benford’s Law
$\chi^2$ values for $\alpha^n$, $1 \le n \le N$ (5% critical value: 15.5).

N      χ²(γ)   χ²(e)   χ²(π)
100    0.72    0.30    46.65
200    0.24    0.30    8.58
400    0.14    0.10    10.55
500    0.08    0.07    2.69
700    0.19    0.04    0.05
800    0.04    0.03    6.19
900    0.09    0.09    1.71
1000   0.02    0.06    2.90
Logarithms and Benford’s Law: Base 10 (5%: log(χ²) ≈ 2.74)
Figure: log(χ²) vs N for $\pi^n$ (red) and $e^n$ (blue), $n \in \{1, \dots, N\}$. Note $\pi^{175} \approx 1.0028 \cdot 10^{87}$.
New Result: Linear Recurrence Relations of Degree 2
$a_{n+1} = f(n)a_n + g(n)a_{n-1}$ with non-constant coefficients f(n) and g(n).
Explore conditions on f and g such that the sequence generated obeys Benford’s Law for all initial values.
First find a closed form for the sequence $(a_n)$, then analyze its main term.
Main idea: reduce the degree of recurrence.
$a_{n+1} = (\lambda(n) + \mu(n))a_n - \mu(n)\lambda(n-1)a_{n-1}$, and compare the coefficients:
$f(n) = \lambda(n) + \mu(n)$
$g(n) = -\lambda(n-1)\mu(n)$.
We show that for any given pair of f and g, such λ and μ always exist.
Linear Recurrence Relations of Degree 2
Recurrence relations of degree 1:
$a_{n+1} = \lambda(n)a_n + b_n$
$b_n = \mu(n)b_{n-1}$.
$a_{n+1} = r(n)\left(1 + \sum_{k=3}^{n}\prod_{i=k}^{n}\frac{\lambda(i)}{\mu(i)} + \frac{a_2}{b_1}\prod_{i=2}^{n}\frac{\lambda(i)}{\mu(i)}\right)$,
where $r(n) := b_1\prod_{i=2}^{n}\mu(i)$.
Find conditions on μ, λ such that main term dominates; Benford if $\prod \mu(i)$ is.
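The closed form for the degree 1 system can be sanity-checked with exact rational arithmetic. The sketch below iterates $a_{n+1} = \lambda(n)a_n + b_n$, $b_n = \mu(n)b_{n-1}$ and compares against the formula with $r(n) = b_1\prod_{i=2}^n \mu(i)$. The particular choices of `lam` and `mu` and the starting values are hypothetical, picked only so the numbers are easy to inspect.

```python
from fractions import Fraction
from math import prod

def lam(n):
    return Fraction(n + 1)        # hypothetical choice of lambda(n)

def mu(n):
    return Fraction(2 * n + 1)    # hypothetical choice of mu(n)

def iterate(a2, b1, n_max):
    """Run a[n+1] = lam(n) a[n] + b[n], b[n] = mu(n) b[n-1] for n = 2..n_max."""
    a = {2: Fraction(a2)}
    b = {1: Fraction(b1)}
    for n in range(2, n_max + 1):
        b[n] = mu(n) * b[n - 1]
        a[n + 1] = lam(n) * a[n] + b[n]
    return a, b

def closed_form(a2, b1, n):
    """Closed form for a[n+1] in terms of r(n) = b1 * prod_{i=2}^n mu(i)."""
    ratio = lambda i: lam(i) / mu(i)
    r = b1 * prod(mu(i) for i in range(2, n + 1))
    inner = 1 + sum(prod(ratio(i) for i in range(k, n + 1)) for k in range(3, n + 1))
    inner += Fraction(a2, b1) * prod(ratio(i) for i in range(2, n + 1))
    return r * inner

a, b = iterate(3, 2, 8)
print(all(closed_form(3, 2, n) == a[n + 1] for n in range(2, 9)))
```

For n ≥ 3 the same sequence also satisfies the degree 2 relation $a_{n+1} = (\lambda(n)+\mu(n))a_n - \mu(n)\lambda(n-1)a_{n-1}$, since $b_{n-1} = a_n - \lambda(n-1)a_{n-1}$.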
Examples when f and g are functions
If $\mu(k) = k$, then $r(n) = n!$.
If $\mu(k) = k^{\alpha}$ where $\alpha \in \mathbb{R}$, then $r(n) = (n!)^{\alpha}$.
If $\mu(k) = \exp(\alpha h(k))$ where α is irrational and h(k) is a monic polynomial, then
$\log r(n) = \alpha\sum_{k=1}^{n} h(k)$.
Lemma: The sequence $\{\alpha p(n)\}$ is equidistributed mod 1 if $\alpha \notin \mathbb{Q}$ and p(n) is a monic polynomial.
Examples when f and g are random variables
Take $\mu(n) \sim h(n)U_n$ where the $U_n$’s are independent uniform random variables on [0, 1], and h(n) is a deterministic function in n such that $\prod_{i=1}^{n} h(i)$ is Benford.
Then $r(n) = \prod_{i=1}^{n} h(i)\prod_{i=1}^{n} U_i$ is Benford.
Take $\mu(n) \sim \exp(U_n)$ where the $U_n$’s are i.i.d. random variables. Then take logarithms and sum up $\log(\mu(n))$. Apply the Central Limit Theorem and get a Gaussian distribution with increasing variance.
Linear Recurrences of Higher Degree
Use a recurrence relation of degree 3 as an example. Similar main idea: reduce the degree.
Define the sequence $\{a_n\}_{n=1}^{\infty}$ by $a_{n+1} = f_1(n)a_n + f_2(n)a_{n-1} + f_3(n)a_{n-2}$.
Define an auxiliary sequence $(b_n)_{n=1}^{\infty}$ by $b_n = a_{n+1} - \lambda(n)a_n$. Then $(b_n)$ is degree 2.
Why Benford’s Law?
Streets
Not all data sets satisfy Benford’s Law.
Long street [1, L]: L = 199 versus L = 999.
The probability of first digit 1 oscillates between 1/9 and 5/9.
Figures: probability first digit 1 versus street length L, and versus log(street length L).
What if we have many streets of different lengths?
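The oscillation for a single street can be computed exactly by counting house numbers. A minimal sketch (the function name is a hypothetical helper):

```python
def prob_first_digit_one(street_length):
    """Fraction of house numbers 1..street_length whose first digit is 1."""
    ones = sum(1 for house in range(1, street_length + 1) if str(house)[0] == "1")
    return ones / street_length

print(prob_first_digit_one(199))  # 111/199, roughly 5/9
print(prob_first_digit_one(999))  # 111/999 = 1/9
```

Both streets contain the same 111 addresses starting with 1 (1, 10 to 19, 100 to 199); only the denominator changes, which is why the probability swings between the two extremes as L grows.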
Amalgamating Streets
All houses: 1000 streets, with four choices of street length:
⋄ each from 1 to 10000;
⋄ each from 1 to rand(10000);
⋄ each from 1 to rand(rand(10000));
⋄ each from 1 to rand(rand(rand(10000))).
Figures (for each choice): first digit and first two digits vs Benford.
Conclusion: More processes, closer to Benford.
Probability Review
Let X be a random variable with density p(x):
⋄ $p(x) \ge 0$; $\int_{-\infty}^{\infty} p(x)\,dx = 1$;
⋄ $\text{Prob}(a \le X \le b) = \int_a^b p(x)\,dx$.
Mean $\mu = \int_{-\infty}^{\infty} x\,p(x)\,dx$.
Variance $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 p(x)\,dx$.
Independence: knowledge of one random variable gives no knowledge of the other.
Central Limit Theorem
Normal $N(\mu, \sigma^2)$: $p(x) = e^{-(x-\mu)^2/2\sigma^2}/\sqrt{2\pi\sigma^2}$.
Theorem: If $X_1, X_2, \dots$ are independent, identically distributed random variables (mean μ, variance σ², finite moments) then
$S_N := \frac{X_1 + \cdots + X_N - N\mu}{\sigma\sqrt{N}}$ converges to N(0, 1).
Central Limit Theorem: Sums of Uniform Random Variables
$X_i \sim \text{Unif}(-1/2, 1/2)$ (adjusted to mean 0, variance 1)
Figures: density of $Y_N = (X_1 + \cdots + X_N)/\sigma_{X_1 + \cdots + X_N}$ versus N(0, 1), for N = 1, 2, 4, 8.
Density of $Y_4 = (X_1 + \cdots + X_4)/\sigma_{X_1 + \cdots + X_4}$: explicit formula shown on the slide.
(Don’t even think of asking to see Y8’s!)
Normal Distributions Mod 1
As $\sigma \to \infty$, $N(0, \sigma^2)$ mod 1 → Unif(0, 1).
Figures: density of $N(0, \sigma^2)$ mod 1 for variance .01, .1 and .5.
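The convergence of a spreading Gaussian mod 1 to the uniform distribution can be seen in a small simulation. This is a sketch only: the sample size, bin count and fixed seed are assumptions chosen to keep the run reproducible, and `mod1_histogram` is a hypothetical helper.

```python
import random

def mod1_histogram(sigma, samples=100_000, bins=10, seed=0):
    """Empirical distribution of N(0, sigma^2) mod 1 over equal-width bins on [0, 1)."""
    rng = random.Random(seed)
    counts = [0] * bins
    for _ in range(samples):
        y = rng.gauss(0.0, sigma) % 1.0
        # Guard against y rounding up to exactly 1.0 for tiny negative draws.
        counts[min(int(y * bins), bins - 1)] += 1
    return [c / samples for c in counts]

narrow = mod1_histogram(0.1)   # variance .01: mass piles up near 0 and 1
wide = mod1_histogram(5.0)     # large variance: close to uniform
print([round(v, 3) for v in narrow])
print([round(v, 3) for v in wide])
```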
Products and Benford’s Law
Pavlovian Response: See a product, take a logarithm.
$X_1, X_2, \dots$ nice, $W_N = X_1 \cdot X_2 \cdots X_N$.
$Y_i = \log_{10} X_i$, $V_N := \log_{10} W_N$.
$V_N = \log_{10}(X_1 \cdot X_2 \cdots X_N)$
$= \log_{10} X_1 + \log_{10} X_2 + \cdots + \log_{10} X_N$
$= Y_1 + Y_2 + \cdots + Y_N$.
Need distribution of $V_N$ mod 1, which by CLT becomes uniform, implying Benfordness!
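The product argument can be illustrated by simulation. The sketch below takes the $X_i$ to be uniform on (0, 1], which is an assumption made for this example (the slide only says "nice"), and works with $V_N = \sum \log_{10} X_i$ mod 1 to read off the leading digit.

```python
import math
import random

def product_first_digit_frequencies(num_factors, trials=20_000, seed=0):
    """First-digit frequencies (digits 1..9) of products of uniform (0, 1] variables."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(trials):
        # V_N = Y_1 + ... + Y_N with Y_i = log10 X_i; 1 - random() avoids log10(0).
        v = sum(math.log10(1.0 - rng.random()) for _ in range(num_factors))
        digit = int(10.0 ** (v % 1.0))   # significand's integer part
        counts[min(digit, 9)] += 1       # guard against rounding up to 10
    return [c / trials for c in counts[1:]]

single = product_first_digit_frequencies(1)    # one factor: digits roughly equally likely
many = product_first_digit_frequencies(20)     # twenty factors: close to Benford
print([round(v, 3) for v in single])
print([round(v, 3) for v in many])
```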
Applications
Applications for the IRS: Detecting Fraud
A Tale of Two Steve Millers....
Detecting Fraud
Bank Fraud: Audit of a bank revealed a huge spike of numbers starting with 48 and 49, most due to one person.
Write-off limit of $5,000. Officer had friends applying for credit cards, who ran up balances just under $5,000; then he would write the debts off.
Can you see the cat in the tree?
Transmitting Images
How to transmit an image?
Have an L × W grid with LW pixels.
Each pixel a triple: (Red, Green, Blue).
Often each value in $\{0, 1, 2, 3, \dots, 2^n - 1\}$.
n = 8 gives 256 choices for each, or 16,777,216 possibilities.
Steganography
Steganography: Concealing a message in another message: https://en.wikipedia.org/wiki/Steganography.
Take one of the colors, say red, a number from 0 to 255.
Write in binary: $r_7 2^7 + r_6 2^6 + \cdots + r_1 2 + r_0$.
If we change just the last or last two digits, there is only a very minor change to the image.
Can hide an image in another.
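The last-two-digits idea can be sketched on a single color channel. This is an illustrative scheme under stated assumptions, not the method used for the images in the talk: the cover and secret are lists of 8-bit values, and the top `bits` bits of each secret value overwrite the low `bits` bits of the cover value. The names `hide_bits` and `recover_bits` are hypothetical.

```python
def hide_bits(cover, secret, bits=2):
    """Overwrite the `bits` lowest bits of each cover value with the top bits of secret."""
    mask = (1 << bits) - 1
    return [(c & ~mask & 0xFF) | (s >> (8 - bits)) for c, s in zip(cover, secret)]

def recover_bits(stego, bits=2):
    """Read the hidden bits back and shift them to the top of an 8-bit value."""
    mask = (1 << bits) - 1
    return [(v & mask) << (8 - bits) for v in stego]

cover = [200, 13, 255, 64]     # e.g. red values of four pixels
secret = [255, 0, 128, 77]     # red values of the hidden image
stego = hide_bits(cover, secret)
print(stego)                   # [203, 12, 254, 65]
print(recover_bits(stego))     # [192, 0, 128, 64]
```

Each cover value moves by at most 3 out of 255, which is why the change is hard to see, while the recovered values keep only the coarse (top two bit) shading of the hidden image.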
Can you see the cat in the tree?
Benford Good Processes
A. Kontorovich and S. J. Miller, Benford’s Law, values of L-functions and the 3x + 1 problem, Acta Arithmetica 120 (2005), no. 3, 269–297.
Poisson Summation and Benford’s Law: Definitions
Feller, Pinkham (often exact processes)
Data $Y_{T,B} = \log_B \overrightarrow{X}_T$ (discrete/continuous):
$P(A) = \lim_{T\to\infty} \frac{\#\{n \in A : n \le T\}}{T}$
Poisson Summation Formula: f nice:
$\sum_{\ell=-\infty}^{\infty} f(\ell) = \sum_{\ell=-\infty}^{\infty} \hat{f}(\ell)$,
Fourier transform $\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)e^{-2\pi i x\xi}\,dx$.
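Poisson summation can be checked numerically for a Gaussian, where the transform is known in closed form. The sketch below assumes $f(x) = e^{-\pi x^2/t}$, whose Fourier transform (with the convention above) is $\hat{f}(\xi) = \sqrt{t}\,e^{-\pi t \xi^2}$; the truncation `cutoff` and the function name are choices made for this example.

```python
import math

def poisson_sides(t, cutoff=50):
    """Truncated sums of f(l) and f_hat(l) for f(x) = exp(-pi x^2 / t)."""
    ls = range(-cutoff, cutoff + 1)
    lhs = sum(math.exp(-math.pi * l * l / t) for l in ls)
    # f_hat(xi) = sqrt(t) * exp(-pi * t * xi^2) for this Gaussian.
    rhs = sum(math.sqrt(t) * math.exp(-math.pi * t * l * l) for l in ls)
    return lhs, rhs

for t in (0.5, 1.0, 4.0):
    lhs, rhs = poisson_sides(t)
    print(t, lhs, rhs)
```

For large t the function is spread out, so the left sum has many significant terms while the right sum is dominated by the single term $\hat{f}(0) = \sqrt{t}$; this trade is what the Benford argument exploits.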
Benford Good Process
$X_T$ is Benford Good if there is a nice f such that
$\text{CDF}_{\overrightarrow{Y}_{T,B}}(y) = \int_{-\infty}^{y} \frac{1}{T} f\left(\frac{t}{T}\right) dt + E_T(y) := G_T(y)$
and monotonically increasing h ($h(|T|) \to \infty$):
Small tails: $G_T(\infty) - G_T(Th(T)) = o(1)$, $G_T(-Th(T)) - G_T(-\infty) = o(1)$.
Decay of the Fourier Transform: $\sum_{\ell \ne 0} \left|\frac{\hat{f}(T\ell)}{\ell}\right| = o(1)$.
Small translated error: $E(a, b, T) = \sum_{|\ell| \le Th(T)} [E_T(b + \ell) - E_T(a + \ell)] = o(1)$.
Main Theorem
Theorem (Kontorovich and M–, 2005): $X_T$ converging to X as T → ∞ (think spreading Gaussian). If $X_T$ is Benford good, then X is Benford.
Examples
⋄ L-functions
⋄ characteristic polynomials (RMT)
⋄ 3x + 1 problem
⋄ geometric Brownian motion.
Sketch of the proof
Structure Theorem:
⋄ main term is something nice spreading out
⋄ apply Poisson summation
Control translated errors:
⋄ hardest step
⋄ techniques problem specific
Sketch of the proof (continued)
$\sum_{\ell=-\infty}^{\infty} P\left(a + \ell \le \overrightarrow{Y}_{T,B} \le b + \ell\right)$
$= \sum_{|\ell| \le Th(T)} [G_T(b + \ell) - G_T(a + \ell)] + o(1)$
$= \int_a^b \sum_{|\ell| \le Th(T)} \frac{1}{T} f\left(\frac{t + \ell}{T}\right) dt + E(a, b, T) + o(1)$
$= \hat{f}(0) \cdot (b - a) + \sum_{\ell \ne 0} \hat{f}(T\ell)\,\frac{e^{2\pi i b\ell} - e^{2\pi i a\ell}}{2\pi i \ell} + o(1).$
Riemann Zeta Function (for real part of s greater than 1)
$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \left(1 - \frac{1}{p^s}\right)^{-1}$, Re(s) > 1.
Geometric Series Formula: $(1 - x)^{-1} = 1 + x + x^2 + \cdots$.
Unique Factorization: $n = p_1^{r_1} \cdots p_m^{r_m}$.
$\prod_p \left(1 - \frac{1}{p^s}\right)^{-1} = \left[1 + \frac{1}{2^s} + \left(\frac{1}{2^s}\right)^2 + \cdots\right]\left[1 + \frac{1}{3^s} + \left(\frac{1}{3^s}\right)^2 + \cdots\right] \cdots = \sum_n \frac{1}{n^s}$.
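The agreement between the sum and the Euler product can be seen numerically at s = 2, where $\zeta(2) = \pi^2/6$. This is a sketch with truncated sum and product; the cutoffs and helper names are assumptions for illustration.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(range(p * p, limit + 1, p))
    return [n for n, is_prime in enumerate(sieve) if is_prime]

def zeta_partial_sum(s, terms):
    """Sum of 1 / n^s for n = 1..terms."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))

def zeta_partial_product(s, prime_limit):
    """Product of (1 - p^-s)^-1 over primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product

print(zeta_partial_sum(2, 10_000))
print(zeta_partial_product(2, 10_000))
print(math.pi ** 2 / 6)
```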
Riemann Zeta Function
Figure: $\left|\zeta\left(\frac{1}{2} + i\frac{k}{4}\right)\right|$, $k \in \{0, 1, \dots, 65535\}$.
3x + 1 Problem
Kakutani (conspiracy), Erdös (not ready).
x odd, $T(x) = \frac{3x+1}{2^k}$, $2^k \| 3x + 1$.
Conjecture: for some n = n(x), $T^n(x) = 1$.
$7 \to_1 11 \to_1 17 \to_2 13 \to_3 5 \to_4 1 \to_2 1$,
2-path (1, 1), 5-path (1, 1, 2, 3, 4). m-path: $(k_1, \dots, k_m)$.
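The map T and the m-path of exponents can be sketched directly from the definition. The function names `accelerated_step` and `path` are hypothetical helpers for this example.

```python
def accelerated_step(x):
    """For odd x, return (T(x), k) where T(x) = (3x + 1) / 2^k and 2^k exactly divides 3x + 1."""
    y = 3 * x + 1
    k = 0
    while y % 2 == 0:
        y //= 2
        k += 1
    return y, k

def path(x, steps):
    """List of (iterate, exponent k) pairs for the first `steps` applications of T."""
    out = []
    for _ in range(steps):
        x, k = accelerated_step(x)
        out.append((x, k))
    return out

print(path(7, 6))  # [(11, 1), (17, 1), (13, 2), (5, 3), (1, 4), (1, 2)]
```

Reading off the second entries gives the m-path, here (1, 1, 2, 3, 4) for the first five steps starting from 7.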
3x + 1 and Benford
Theorem (Kontorovich and M–, 2005): As m → ∞, $x_m/(3/4)^m x_0$ is Benford.
Theorem (Lagarias-Soundararajan, 2006): For $X \ge 2^N$, for all but at most $c(B)N^{-1/36}X$ initial seeds the distribution of the first N iterates of the 3x + 1 map are within $2N^{-1/36}$ of the Benford probabilities.
Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains
3x + 1 Data: random 10,000 digit number, 2k ||3x + 1
80,514 iterations ((4/3)n = a0 predicts 80,319);χ2 = 13.5 (5% 15.5).
Digit Number Observed Benford1 24251 0.301 0.3012 14156 0.176 0.1763 10227 0.127 0.1254 7931 0.099 0.0975 6359 0.079 0.0796 5372 0.067 0.0677 4476 0.056 0.0588 4092 0.051 0.0519 3650 0.045 0.046
3x + 1 Data: random 10,000 digit number, $2 \mid 3x + 1$
241,344 iterations, $\chi^2 = 11.4$ (5% level: 15.5).

Digit   Number   Observed   Benford
1       72924    0.302      0.301
2       42357    0.176      0.176
3       30201    0.125      0.125
4       23507    0.097      0.097
5       18928    0.078      0.079
6       16296    0.068      0.067
7       13702    0.057      0.058
8       12356    0.051      0.051
9       11073    0.046      0.046
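The $\chi^2$ values quoted with the two tables can be recomputed from the digit counts. The sketch below assumes the statistic is the usual Pearson chi-square against the Benford probabilities $\log_{10}(1 + 1/d)$; the talk does not spell out the formula, so treat that as an assumption. The variable names are illustrative only.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Observed first-digit counts copied from the two tables above.
COUNTS_2K = [24251, 14156, 10227, 7931, 6359, 5372, 4476, 4092, 3650]
COUNTS_2 = [72924, 42357, 30201, 23507, 18928, 16296, 13702, 12356, 11073]


def chi_square(counts):
    """Pearson chi-square statistic of digit counts against Benford's law."""
    total = sum(counts)
    return sum(
        (obs - total * p) ** 2 / (total * p)
        for obs, p in zip(counts, BENFORD)
    )


for name, counts in [("2^k || 3x+1", COUNTS_2K), ("2 | 3x+1", COUNTS_2)]:
    print(name, sum(counts), chi_square(counts))
```

With nine digit classes there are 8 degrees of freedom, which is where the 5% threshold of about 15.5 comes from.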
Stick Decomposition
T. Becker, D. Burt, T. C. Corcoran, A. Greaves-Tunnell, J. R. Iafrate, J. Jing, S. J. Miller, J. D. Porfilio, R. Ronan, J. Samranvedhya, F. W. Strauch and B. Talbut, Benford’s Law and Continuous Dependent Random Variables, Annals of Physics 388 (2018), 350–381.
J. Iafrate, S. J. Miller and F. W. Strauch, Equipartitions and a distribution for numbers: A statistical model for Benford’s law, Physical Review E 91 (2015), no. 6, 062138 (6 pages).
Fixed Proportion Decomposition Process
Decomposition Process
1 Consider a stick of length $L$.
2 Uniformly choose a proportion $p \in (0, 1)$.
3 Break the stick into two pieces of lengths $pL$ and $(1 - p)L$.
4 Repeat $N$ times (using the same proportion).
Fixed Proportion Conjecture (Joy Jing ’13)
Conjecture: The above decomposition process is Benford as $N \to \infty$ for any $p \in (0, 1)$, $p \ne \frac{1}{2}$.
Counterexample (SMALL REU ’13): $p = \frac{1}{11}$, $1 - p = \frac{10}{11}$.
Benford Analysis
At the $N$th level:
$2^N$ sticks.
$N + 1$ distinct lengths: write $p^{N-j}(1 - p)^j$ as $p^N \left(\frac{1-p}{p}\right)^j$, $j \in \{0, \dots, N\}$; this length occurs $\binom{N}{j}$ times.
(Weighted) geometric with ratio $\frac{1-p}{p} = 10^y$; behavior depends on irrationality of $y$!
Theorem: Benford if and only if $y$ irrational.
Benford Analysis (cont)
Say $\frac{1-p}{p} = 10^{r/q}$ for $r, q$ integers.
All terms with index $j \bmod q$ have the same leading digit; the probability that the index is $j \bmod q$ is
\[
\begin{aligned}
\frac{1}{2^N}\left[\binom{N}{j} + \binom{N}{j+q} + \binom{N}{j+2q} + \cdots\right]
&= \frac{1}{q}\sum_{s=0}^{q-1}\left(\cos\frac{\pi s}{q}\right)^N \cos\frac{\pi(N-2j)s}{q} \\
&= \frac{1}{q}\left[1 + \sum_{s=1}^{q-1}\left(\cos\frac{\pi s}{q}\right)^N \cos\frac{\pi(N-2j)s}{q}\right] \\
&= \frac{1}{q}\left(1 + \mathrm{Err}\left[(q-1)\left(\cos\frac{\pi}{q}\right)^N\right]\right),
\end{aligned}
\]
where $\mathrm{Err}[X]$ indicates an absolute error of size at most $X$.
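The binomial identity above is easy to check numerically. The following sketch compares the two sides for one illustrative choice of $N$ and $q$ (the values 60 and 5 are arbitrary, not from the talk) and confirms that each residue class has probability within the stated error of $1/q$.

```python
import math


def residue_class_probability(N, q, j):
    """Left side: (1/2^N) * sum of C(N, k) over k = j, j+q, j+2q, ... <= N."""
    return sum(math.comb(N, k) for k in range(j, N + 1, q)) / 2 ** N


def cosine_formula(N, q, j):
    """Right side: (1/q) * sum_{s=0}^{q-1} cos(pi s/q)^N cos(pi (N-2j) s/q)."""
    return sum(
        math.cos(math.pi * s / q) ** N
        * math.cos(math.pi * (N - 2 * j) * s / q)
        for s in range(q)
    ) / q


N, q = 60, 5  # arbitrary illustrative values
for j in range(q):
    lhs = residue_class_probability(N, q, j)
    rhs = cosine_formula(N, q, j)
    err_bound = (q - 1) * math.cos(math.pi / q) ** N / q
    print(j, lhs, rhs, abs(lhs - 1 / q) <= err_bound)
```

Because $|\cos(\pi s/q)| \le \cos(\pi/q) < 1$ for $1 \le s \le q - 1$, the error term decays exponentially in $N$, so each of the $q$ residue classes ends up with probability close to $1/q$.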
Examples
$p = 3/11$, 1000 levels; $y = \log_{10}(8/3) \notin \mathbb{Q}$ (irrational).
$p = 1/11$, 1000 levels; $y = 1 \in \mathbb{Q}$ (rational).
$p = 1/(1 + 10^{33/10})$, 1000 levels; $y = 33/10 \in \mathbb{Q}$ (rational).
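The three examples can be reproduced without breaking any sticks, since at level $N$ the length $p^{N-j}(1-p)^j$ occurs $\binom{N}{j}$ times. The sketch below is a minimal version under the assumption that the initial length is $L = 1$; the function name `stick_digit_distribution` is hypothetical.

```python
import math


def stick_digit_distribution(p, N):
    """First-digit distribution of the 2^N stick lengths at level N, taking L = 1.

    Length p^(N-j) (1-p)^j occurs C(N, j) times, so it gets weight C(N, j) / 2^N.
    """
    dist = [0.0] * 10  # index d holds the proportion with leading digit d
    log_p, log_q = math.log10(p), math.log10(1 - p)
    for j in range(N + 1):
        log_len = (N - j) * log_p + j * log_q
        mantissa = 10 ** (log_len - math.floor(log_len))  # in [1, 10)
        digit = min(int(mantissa), 9)  # guard against rounding up to 10
        dist[digit] += math.comb(N, j) / 2 ** N
    return dist[1:]


benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
for p in (3 / 11, 1 / 11, 1 / (1 + 10 ** 3.3)):
    dist = stick_digit_distribution(p, 1000)
    print(p, [round(x, 3) for x in dist])
```

For $p = 1/11$ the ratio $(1-p)/p$ is exactly 10, so every length has the same mantissa and all the mass sits on a single digit, which is the counterexample to the conjecture.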
Random Cuts
Level 0: $L$
Level 1: $LK_1$, $L(1-K_1)$
Level 2: $LK_1K_2$, $LK_1(1-K_2)$, $L(1-K_1)K_3$, $L(1-K_1)(1-K_3)$
Level 3: $LK_1K_2K_4$, $LK_1K_2(1-K_4)$, $LK_1(1-K_2)K_5$, $LK_1(1-K_2)(1-K_5)$, $L(1-K_1)K_3K_6$, $L(1-K_1)K_3(1-K_6)$, $L(1-K_1)(1-K_3)K_7$, $L(1-K_1)(1-K_3)(1-K_7)$
Figure: Unrestricted Decomposition: Breaking $L$ into pieces, $N = 3$.
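The unrestricted decomposition in the figure can be simulated directly. The sketch below assumes that each cut proportion $K_i$ is drawn independently and uniformly from $(0, 1)$; the slide does not state the distribution, so this is an assumption, and the function names and the seed are illustrative only.

```python
import random


def unrestricted_decomposition(length, levels, rng):
    """Break every piece in two at each level, with a fresh proportion per piece.

    Assumption: each K_i is drawn independently and uniformly from (0, 1).
    """
    pieces = [length]
    for _ in range(levels):
        next_pieces = []
        for piece in pieces:
            k = rng.random()
            next_pieces.extend([piece * k, piece * (1 - k)])
        pieces = next_pieces
    return pieces


def leading_digit(x):
    """First significant decimal digit of a positive float."""
    return int(f"{x:.17e}"[0])


rng = random.Random(2019)
pieces = [x for x in unrestricted_decomposition(1.0, 12, rng) if x > 0]
counts = [0] * 10
for x in pieces:
    counts[leading_digit(x)] += 1
print(len(pieces), [round(c / len(pieces), 3) for c in counts[1:]])
```

Each piece at level $N$ is $L$ times a product of $N$ factors of the form $K_i$ or $1 - K_i$, which is what ties this process to products of random variables.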
Conclusions and References
Conclusions and Future Investigations
Many different systems exhibit Benford behavior.
Ingredients of proofs: logarithms, equidistribution.
Applications to fraud detection / data integrity.
A. K. Adhikari, Some results on the distribution of the most significant digit, Sankhya: The Indian Journal of Statistics, Series B 31 (1969), 413–420.
A. K. Adhikari and B. P. Sarkar, Distribution of most significant digit in certain functions whose arguments are random variables, Sankhya: The Indian Journal of Statistics, Series B 30 (1968), 47–58.
R. N. Bhattacharya, Speed of convergence of the n-fold convolution of a probability measure on a compact group, Z. Wahrscheinlichkeitstheorie verw. Geb. 25 (1972), 1–10.
F. Benford, The law of anomalous numbers, Proceedings of the American Philosophical Society 78 (1938), 551–572. http://www.jstor.org/stable/984802?seq=1#page_scan_tab_contents.
A. Berger, L. A. Bunimovich and T. Hill, One-dimensional dynamical systems and Benford’s Law, Trans. AMS 357 (2005), no. 1, 197–219. http://www.ams.org/journals/tran/2005-357-01/S0002-9947-04-03455-5/.
A. Berger and T. Hill, Newton’s method obeys Benford’s law, The Amer. Math. Monthly 114 (2007), no. 7, 588–601. http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1058&context=rgp_rsr.
A. Berger and T. Hill, Benford on-line bibliography, http://www.benfordonline.net/.
J. Boyle, An application of Fourier series to the most significant digit problem, Amer. Math. Monthly 101 (1994), 879–886. http://www.jstor.org/stable/2975136?seq=1#page_scan_tab_contents.
J. Brown and R. Duncan, Modulo one uniform distribution of the sequence of logarithms of certain recursive sequences, Fibonacci Quarterly 8 (1970), 482–486.
P. Diaconis, The distribution of leading digits and uniform distribution mod 1, Ann. Probab. 5 (1977), 72–81. http://statweb.stanford.edu/~cgates/PERSI/papers/digits.pdf.
W. Feller, An Introduction to Probability Theory and its Applications, Vol. II, second edition, John Wiley & Sons, Inc., 1971.
R. W. Hamming, On the distribution of numbers, Bell Syst. Tech. J. 49 (1970), 1609–1625. https://archive.org/details/bstj49-8-1609.
T. Hill, The first-digit phenomenon, American Scientist 86 (1998), 358–363. http://www.americanscientist.org/issues/feature/1998/4/the-first-digit-phenomenon/99999.
T. Hill, A statistical derivation of the significant-digit law, Statistical Science 10 (1995), 354–363. https://projecteuclid.org/euclid.ss/1177009869.
P. J. Holewijn, On the uniform distribution of sequences of random variables, Z. Wahrscheinlichkeitstheorie verw. Geb. 14 (1969), 89–92.
W. Hurlimann, Benford’s Law from 1881 to 2006: a bibliography, http://arxiv.org/abs/math/0607168.
D. Jang, J. Kang, A. Kruckman, J. Kudo and S. J. Miller, Chains of distributions, hierarchical Bayesian models and Benford’s Law, Journal of Algebra, Number Theory: Advances and Applications, volume 1, number 1 (March 2009), 37–60. http://arxiv.org/abs/0805.4226.
E. Janvresse and T. de la Rue, From uniform distribution to Benford’s law, Journal of Applied Probability 41 (2004), no. 4, 1203–1210. http://www.jstor.org/stable/4141393?seq=1#page_scan_tab_contents.
A. Kontorovich and S. J. Miller, Benford’s Law, Values of L-functions and the 3x + 1 Problem, Acta Arith. 120 (2005), 269–297. http://arxiv.org/pdf/math/0412003.pdf.
D. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Addison-Wesley, third edition, 1997.
J. Lagarias and K. Soundararajan, Benford’s Law for the 3x + 1 Function, J. London Math. Soc. (2) 74 (2006), no. 2, 289–303. http://arxiv.org/pdf/math/0509175.pdf.
S. Lang, Undergraduate Analysis, 2nd edition, Springer-Verlag, New York, 1997.
P. Levy, L’addition des variables aleatoires definies sur une circonference, Bull. de la S. M. F. 67 (1939), 1–41.
E. Ley, On the peculiar distribution of the U.S. Stock Indices Digits, The American Statistician 50 (1996), no. 4, 311–313. http://www.jstor.org/stable/2684926?seq=1#page_scan_tab_contents.
R. M. Loynes, Some results in the probabilistic theory of asymptotic uniform distributions modulo 1, Z. Wahrscheinlichkeitstheorie verw. Geb. 26 (1973), 33–41.
S. J. Miller, Benford’s Law: Theory and Applications, Princeton University Press, 2015. http://web.williams.edu/Mathematics/sjmiller/public_html/benford/.
S. J. Miller and M. Nigrini, The Modulo 1 Central Limit Theorem and Benford’s Law for Products, International Journal of Algebra 2 (2008), no. 3, 119–130. http://arxiv.org/pdf/math/0607686v2.
S. J. Miller and M. Nigrini, Order Statistics and Benford’s law, International Journal of Mathematics and Mathematical Sciences, Volume 2008 (2008), Article ID 382948, 19 pages. http://arxiv.org/pdf/math/0601344v5.pdf.
S. J. Miller and R. Takloo-Bighash, An Invitation to Modern Number Theory, Princeton University Press, Princeton, NJ, 2006. http://web.williams.edu/Mathematics/sjmiller/public_html/book/index.html.
S. Newcomb, Note on the frequency of use of the different digits in natural numbers, Amer. J. Math. 4 (1881), 39–40. http://www.jstor.org/stable/2369148?seq=1#page_scan_tab_contents.
M. Nigrini, Digital Analysis and the Reduction of Auditor Litigation Risk. Pages 69–81 in Proceedings of the 1996 Deloitte & Touche / University of Kansas Symposium on Auditing Problems, ed. M. Ettredge, University of Kansas, Lawrence, KS, 1996.
M. Nigrini, The Use of Benford’s Law as an Aid in Analytical Procedures, Auditing: A Journal of Practice & Theory 16 (1997), no. 2, 52–67.
M. Nigrini and S. J. Miller, Benford’s Law applied to hydrology data – results and relevance to other geophysical data, Mathematical Geology 39 (2007), no. 5, 469–490. http://link.springer.com/article/10.1007%2Fs11004-007-9109-5?LI=true.
M. Nigrini and S. J. Miller, Data diagnostics using second order tests of Benford’s Law, Auditing: A Journal of Practice and Theory 28 (2009), no. 2, 305–324. http://accounting.uwaterloo.ca/uwcisa/symposiums/symposium_2007/AdvancedBenfordsLaw7.pdf.
R. Pinkham, On the Distribution of First Significant Digits, The Annals of Mathematical Statistics 32 (1961), no. 4, 1223–1230.
R. A. Raimi, The first digit problem, Amer. Math. Monthly 83 (1976), no. 7, 521–538.
H. Robbins, On the equidistribution of sums of independent random variables, Proc. Amer. Math. Soc. 4 (1953), 786–799. http://projecteuclid.org/download/pdf_1/euclid.aoms/1177704862.
H. Sakamoto, On the distributions of the product and the quotient of the independent and uniformly distributed random variables, Tohoku Math. J. 49 (1943), 243–260.
P. Schatte, On sums modulo 2π of independent random variables, Math. Nachr. 110 (1983), 243–261.
P. Schatte, On the asymptotic uniform distribution of sums reduced mod 1, Math. Nachr. 115 (1984), 275–281.
P. Schatte, On the asymptotic logarithmic distribution of the floating-point mantissas of sums, Math. Nachr. 127 (1986), 7–20.
E. Stein and R. Shakarchi, Fourier Analysis: An Introduction, Princeton University Press, 2003.
M. D. Springer and W. E. Thompson, The distribution of products of independent random variables, SIAM J. Appl. Math. 14 (1966), 511–526. http://www.jstor.org/stable/2946226?seq=1#page_scan_tab_contents.
K. Stromberg, Probabilities on a compact group, Trans. Amer. Math. Soc. 94 (1960), 295–309. http://www.jstor.org/stable/1993313?seq=1#page_scan_tab_contents.
P. R. Turner, The distribution of leading significant digits, IMA J. Numer. Anal. 2 (1982), no. 4, 407–412.
Products of Random Variables
S. J. Miller and M. Nigrini, The Modulo 1 Central Limit Theorem and Benford’s Law for Products, International Journal of Algebra 2 (2008), no. 3, 119–130.
Preliminaries
$X_1 \cdots X_n \Leftrightarrow Y_1 + \cdots + Y_n \bmod 1$, $Y_i = \log_B X_i$.
Density of $Y_i$ is $g_i$; density of $Y_i + Y_j$ is
\[
(g_i * g_j)(y) = \int_0^1 g_i(t)\, g_j(y - t)\, dt.
\]
$h_n = g_1 * \cdots * g_n$, $\widehat{h}_n(\xi) = \widehat{g}_1(\xi) \cdots \widehat{g}_n(\xi)$.
Modulo 1 Central Limit Theorem
Theorem (M– and Nigrini 2007): $\{Y_m\}$ independent continuous random variables on $[0, 1)$ (not necessarily i.i.d.), densities $\{g_m\}$. $Y_1 + \cdots + Y_M \bmod 1$ converges to the uniform distribution as $M \to \infty$ in $L^1([0, 1])$ if and only if for all $n \ne 0$, $\lim_{M \to \infty} \widehat{g}_1(n) \cdots \widehat{g}_M(n) = 0$.
⋄ Gives info on rate of convergence.
Generalizations
Levy proved for i.i.d.r.v. just one year after Benford’s paper.
Generalized to other compact groups, with estimates on the rate of convergence.
⋄ Stromberg: the n-fold convolution of a regular probability measure on a compact Hausdorff group $G$ converges to normalized Haar measure in the weak-star topology iff the support of the distribution is not contained in a coset of a proper normal closed subgroup of $G$.
Figure: Distribution of digits (base 10) of 1000 products $X_1 \cdots X_{1000}$, where $g_{10,m} = \phi_{11^m}$ and $\phi_m(x) = m$ if $|x - 1/8| \le 1/(2m)$ (0 otherwise).
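One way to read this example is through the criterion of the Modulo 1 Central Limit Theorem: the densities $\phi_{11^m}$ concentrate so quickly that the product of Fourier coefficients does not tend to 0. The sketch below assumes that reading. It uses the fact that $\phi_m$ is uniform on an interval of length $1/m$, so its $n$th Fourier coefficient has absolute value $|\sin(\pi n/m)/(\pi n/m)|$ (the centre $1/8$ only contributes a phase). The function names are hypothetical.

```python
import math


def phi_hat_abs(m, n):
    """|n-th Fourier coefficient| of phi_m, the uniform density of height m
    on an interval of length 1/m."""
    t = math.pi * n / m
    return abs(math.sin(t) / t)


def product_of_coefficients(n, M):
    """|g_1^(n) ... g_M^(n)| when the density of Y_m is phi_{11^m}."""
    prod = 1.0
    for m in range(1, M + 1):
        prod *= phi_hat_abs(11.0 ** m, n)
    return prod


for M in (1, 5, 50, 200):
    print(M, product_of_coefficients(1, M))
```

The product for $n = 1$ stabilises at a value close to 1 instead of decaying, so under this reading the sums modulo 1 do not converge to the uniform distribution. The loop stops at $M = 200$ only because $11^m$ overflows double precision for much larger $m$.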
Proof under stronger conditions
Use standard CLT to show $Y_1 + \cdots + Y_M$ tends to a Gaussian.
Use Poisson Summation to show the Gaussian tends to the uniform modulo 1.
Figure: Plot of normal (mean 0, stdev 1).
Figure: Plot of normal (mean 0, stdev .1) modulo 1.
Figure: Plot of normal (mean 0, stdev .5) modulo 1.
Inputs
Poisson Summation Formula: for $f$ nice,
\[
\sum_{\ell=-\infty}^{\infty} f(\ell) = \sum_{\ell=-\infty}^{\infty} \widehat{f}(\ell),
\]
with Fourier transform $\widehat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\, dx$.
Lemma:
\[
\frac{2}{\sqrt{2\pi\sigma^2}} \int_{\sigma^{1+\delta}}^{\infty} e^{-x^2/2\sigma^2}\, dx \ll e^{-\sigma^{2\delta}/2}.
\]
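Poisson summation gives two expressions for the density of a normal random variable reduced modulo 1: a sum over integer shifts, and a Fourier series whose coefficients are $e^{-2\pi^2\sigma^2 k^2}$. The sketch below compares the two numerically; the function names and the truncation at 50 terms are illustrative choices, not from the talk.

```python
import math


def wrapped_normal_direct(x, sigma, terms=50):
    """Density of N(0, sigma^2) reduced modulo 1: sum over integer shifts."""
    norm = 1.0 / math.sqrt(2 * math.pi * sigma ** 2)
    return norm * sum(
        math.exp(-((x + n) ** 2) / (2 * sigma ** 2))
        for n in range(-terms, terms + 1)
    )


def wrapped_normal_fourier(x, sigma, terms=50):
    """Same density after Poisson summation: a Fourier series in x."""
    return 1.0 + 2.0 * sum(
        math.exp(-2 * math.pi ** 2 * sigma ** 2 * k ** 2)
        * math.cos(2 * math.pi * k * x)
        for k in range(1, terms + 1)
    )


for sigma in (0.1, 0.5, 1.0):
    print(sigma, wrapped_normal_direct(0.0, sigma), wrapped_normal_fourier(0.0, sigma))
```

In the Fourier form every non-constant term carries a factor $e^{-2\pi^2\sigma^2 k^2}$, so the density flattens to 1 very quickly as $\sigma$ grows; already at $\sigma = 0.5$ it stays within about 0.015 of uniform.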
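Proof Under Weaker Conditions
Lemma: As $N \to \infty$, $p_N(x) = \frac{e^{-\pi x^2/N}}{\sqrt{N}}$ becomes equidistributed modulo 1.
\[
\int_{\substack{x=-\infty \\ x \bmod 1 \in [a,b]}}^{\infty} p_N(x)\, dx = \frac{1}{\sqrt{N}} \sum_{n \in \mathbb{Z}} \int_{x=a}^{b} e^{-\pi(x+n)^2/N}\, dx.
\]
\[
e^{-\pi(x+n)^2/N} = e^{-\pi n^2/N} + O\!\left(\frac{\max(1, |n|)}{N}\, e^{-n^2/N}\right).
\]
Can restrict sum to $|n| \le N^{5/4}$.
\[
\frac{1}{\sqrt{N}} \sum_{n \in \mathbb{Z}} e^{-\pi n^2/N} = \sum_{n \in \mathbb{Z}} e^{-\pi n^2 N}.
\]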
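\[
\begin{aligned}
\frac{1}{\sqrt{N}} \sum_{|n| \le N^{5/4}} \int_{x=a}^{b} e^{-\pi(x+n)^2/N}\, dx
&= \frac{1}{\sqrt{N}} \sum_{|n| \le N^{5/4}} \int_{x=a}^{b} \left[ e^{-\pi n^2/N} + O\!\left(\frac{\max(1, |n|)}{N}\, e^{-n^2/N}\right) \right] dx \\
&= \frac{b-a}{\sqrt{N}} \sum_{|n| \le N^{5/4}} e^{-\pi n^2/N} + O\!\left( \frac{1}{N} \sum_{n=0}^{N^{5/4}} \frac{n+1}{\sqrt{N}}\, e^{-\pi (n/\sqrt{N})^2} \right) \\
&= \frac{b-a}{\sqrt{N}} \sum_{|n| \le N^{5/4}} e^{-\pi n^2/N} + O\!\left( \frac{1}{N} \int_{w=0}^{N^{3/4}} (w+1)\, e^{-\pi w^2} \sqrt{N}\, dw \right) \\
&= \frac{b-a}{\sqrt{N}} \sum_{|n| \le N^{5/4}} e^{-\pi n^2/N} + O\!\left(N^{-1/2}\right).
\end{aligned}
\]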
Extend sums to $n \in \mathbb{Z}$, apply Poisson Summation:
\[
\frac{1}{\sqrt{N}} \sum_{n \in \mathbb{Z}} \int_{x=a}^{b} e^{-\pi(x+n)^2/N}\, dx \approx (b-a) \cdot \sum_{n \in \mathbb{Z}} e^{-\pi n^2 N}.
\]
For $n = 0$ the right hand side is $b - a$.
For all other $n$, we trivially estimate the sum:
\[
\sum_{n \ne 0} e^{-\pi n^2 N} \le 2 \sum_{n \ge 1} e^{-\pi n N} \le \frac{2 e^{-\pi N}}{1 - e^{-\pi N}},
\]
which is less than $4e^{-\pi N}$ for $N$ sufficiently large.
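Both the Poisson summation identity for the Gaussian sums and the tail estimate can be checked numerically for small $N$. The sketch below truncates the infinite sums; the truncation points and function names are illustrative assumptions, chosen so that the discarded terms are far below double precision.

```python
import math


def gaussian_sum_scaled(N, terms=2000):
    """(1/sqrt(N)) * sum over n of exp(-pi n^2 / N), truncated to |n| <= terms."""
    total = sum(math.exp(-math.pi * n * n / N) for n in range(-terms, terms + 1))
    return total / math.sqrt(N)


def dual_sum(N, terms=50):
    """Sum over n of exp(-pi n^2 N), truncated to |n| <= terms."""
    return sum(math.exp(-math.pi * n * n * N) for n in range(-terms, terms + 1))


for N in (1, 2, 5, 10):
    tail = dual_sum(N) - 1.0  # contribution of the n != 0 terms
    print(N, gaussian_sum_scaled(N), dual_sum(N), tail <= 4 * math.exp(-math.pi * N))
```

The $n = 0$ term of the dual sum is exactly 1, and the remaining terms are exponentially small in $N$, which is why the probability of landing in $[a, b]$ modulo 1 tends to $b - a$.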
Proof in General Case: Fourier input
Fejér kernel:
\[
F_N(x) = \sum_{n=-N}^{N} \left(1 - \frac{|n|}{N}\right) e^{2\pi i n x}.
\]
Fejér series $T_N f(x)$ equals
\[
(f * F_N)(x) = \sum_{n=-N}^{N} \left(1 - \frac{|n|}{N}\right) \widehat{f}(n)\, e^{2\pi i n x}.
\]
Lebesgue’s Theorem: $f \in L^1([0, 1])$. As $N \to \infty$, $T_N f$ converges to $f$ in $L^1([0, 1])$.
$T_N(f * g) = (T_N f) * g$: convolution is associative.
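As a sanity check on the kernel, the trigonometric sum can be compared with the standard closed form $\frac{1}{N}\left(\frac{\sin \pi N x}{\sin \pi x}\right)^2$, which is not stated on the slide but is a classical identity. It shows that $F_N$ is nonnegative, one reason Fejér sums behave well in $L^1$. Function names are illustrative.

```python
import cmath
import math


def fejer_kernel(N, x):
    """F_N(x) = sum_{n=-N}^{N} (1 - |n|/N) e^{2 pi i n x}, summed term by term."""
    total = sum(
        (1 - abs(n) / N) * cmath.exp(2j * math.pi * n * x)
        for n in range(-N, N + 1)
    )
    return total.real  # imaginary parts cancel by symmetry in n


def fejer_closed_form(N, x):
    """Closed form (1/N) (sin(pi N x) / sin(pi x))^2, valid when x is not an integer."""
    return (math.sin(math.pi * N * x) / math.sin(math.pi * x)) ** 2 / N


for x in (0.05, 0.2, 0.37, 0.5):
    print(x, fejer_kernel(10, x), fejer_closed_form(10, x))
```

Averaging `fejer_kernel(N, x)` over an evenly spaced grid of more than $N$ points in $[0, 1)$ returns 1, the $n = 0$ coefficient, so $F_N$ has total mass 1.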
Proof of Modulo 1 CLT
Density of sum is $h_\ell = g_1 * \cdots * g_\ell$.
Suffices to show $\forall \epsilon$: $\lim_{M \to \infty} \int_0^1 |h_M(x) - 1|\, dx < \epsilon$.
Lebesgue’s Theorem: $N$ large,
\[
\|h_1 - T_N h_1\|_1 = \int_0^1 |h_1(x) - T_N h_1(x)|\, dx < \frac{\epsilon}{2}.
\]
Claim: above holds for $h_M$ for all $M$.
Proof of Modulo 1 CLT: Proof of Claim
$T_N h_{M+1} = T_N(h_M * g_{M+1}) = (T_N h_M) * g_{M+1}$
\[
\begin{aligned}
\|h_{M+1} - T_N h_{M+1}\|_1 &= \int_0^1 |h_{M+1}(x) - T_N h_{M+1}(x)|\, dx \\
&= \int_0^1 |(h_M * g_{M+1})(x) - ((T_N h_M) * g_{M+1})(x)|\, dx \\
&= \int_0^1 \left| \int_0^1 (h_M(y) - T_N h_M(y))\, g_{M+1}(x - y)\, dy \right| dx \\
&\le \int_0^1 \int_0^1 |h_M(y) - T_N h_M(y)|\, g_{M+1}(x - y)\, dx\, dy \\
&= \int_0^1 |h_M(y) - T_N h_M(y)|\, dy \cdot 1 < \frac{\epsilon}{2}.
\end{aligned}
\]
Show $\lim_{M \to \infty} \|h_M - 1\|_1 = 0$.
Triangle inequality:
\[
\|h_M - 1\|_1 \le \|h_M - T_N h_M\|_1 + \|T_N h_M - 1\|_1.
\]
Choices of $N$ and $\epsilon$: $\|h_M - T_N h_M\|_1 < \epsilon/2$.
Show $\|T_N h_M - 1\|_1 < \epsilon/2$.
\[
\|T_N h_M - 1\|_1 = \int_0^1 \left| \sum_{\substack{n=-N \\ n \ne 0}}^{N} \left(1 - \frac{|n|}{N}\right) \widehat{h}_M(n)\, e^{2\pi i n x} \right| dx
\le \sum_{\substack{n=-N \\ n \ne 0}}^{N} \left(1 - \frac{|n|}{N}\right) |\widehat{h}_M(n)|
\]
$\widehat{h}_M(n) = \widehat{g}_1(n) \cdots \widehat{g}_M(n) \to 0$ as $M \to \infty$.
For fixed $N$ and $\epsilon$, choose $M$ large so that $|\widehat{h}_M(n)| < \epsilon/4N$ whenever $n \ne 0$ and $|n| \le N$.
Products and Chains of Random Variables
D. Jang, J. U. Kang, A. Kruckman, J. Kudo and S. J. Miller, Chains of distributions, hierarchical Bayesian models and Benford’s Law, Journal of Algebra, Number Theory: Advances and Applications, volume 1, number 1 (March 2009), 37–60.
Key Ingredients
Mellin transform and Fourier transform related by logarithmic change of variable.
Poisson summation from collapsing to modulo 1 random variables.
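Preliminaries
$\Xi_1, \dots, \Xi_n$ nice independent r.v.’s on $[0, \infty)$.
Density of $\Xi_1 \cdot \Xi_2$:
\[
\int_0^\infty f_2\!\left(\frac{x}{t}\right) f_1(t)\, \frac{dt}{t}
\]
⋄ Proof: $\mathrm{Prob}(\Xi_1 \cdot \Xi_2 \in [0, x])$ equals
\[
\int_{t=0}^{\infty} \mathrm{Prob}\!\left(\Xi_2 \in \left[0, \frac{x}{t}\right]\right) f_1(t)\, dt = \int_{t=0}^{\infty} F_2\!\left(\frac{x}{t}\right) f_1(t)\, dt,
\]
differentiate.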
Mellin Transform
\[
(Mf)(s) = \int_0^\infty f(x)\, x^s\, \frac{dx}{x}
\]
\[
(M^{-1}g)(x) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} g(s)\, x^{-s}\, ds
\]
$g(s) = (Mf)(s)$, $f(x) = (M^{-1}g)(x)$.
\[
(f_1 \star f_2)(x) = \int_0^\infty f_2\!\left(\frac{x}{t}\right) f_1(t)\, \frac{dt}{t}
\]
\[
(M(f_1 \star f_2))(s) = (Mf_1)(s) \cdot (Mf_2)(s).
\]
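The multiplicativity of the Mellin transform can be checked on a small worked case. Take $f_1 = f_2$ to be the uniform density on $[0, 1]$ (an illustrative choice, not from the slide): then $(f_1 \star f_2)(x) = \int_x^1 dt/t = -\log x$ on $(0, 1]$, and $(Mf_1)(s) = 1/s$. The sketch below uses a crude midpoint rule, written out by hand, to compare both sides at $s = 2$.

```python
import math


def mellin(f, s, upper=1.0, steps=200_000):
    """Midpoint-rule approximation of (Mf)(s) = int_0^upper f(x) x^(s-1) dx.

    Assumes f vanishes outside (0, upper]; a crude quadrature, not a library routine.
    """
    h = upper / steps
    return h * sum(
        f((i + 0.5) * h) * ((i + 0.5) * h) ** (s - 1) for i in range(steps)
    )


def uniform_density(x):
    """Density of a uniform random variable on [0, 1]."""
    return 1.0 if 0.0 < x <= 1.0 else 0.0


def product_density(x):
    """(f * f)(x) for f uniform on [0, 1]: int_x^1 dt / t = -log x on (0, 1]."""
    return -math.log(x) if 0.0 < x <= 1.0 else 0.0


s = 2.0
lhs = mellin(product_density, s)       # Mellin transform of the product density
rhs = mellin(uniform_density, s) ** 2  # product of the two Mellin transforms
print(lhs, rhs)  # both approximate 1 / s^2
```

This is the same mechanism that turns the density of a product of $n$ independent variables into a product of $n$ Mellin transforms.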
Mellin Transform Formulation: Products of Random Variables
Theorem: $X_i$’s independent, densities $f_i$. $\Xi_n = X_1 \cdots X_n$,
\[
h_n(x_n) = (f_1 \star \cdots \star f_n)(x_n), \qquad (Mh_n)(s) = \prod_{m=1}^{n} (Mf_m)(s).
\]
As $n \to \infty$, $\Xi_n$ becomes Benford: with $Y_n = \log_B \Xi_n$,
\[
|\mathrm{Prob}(Y_n \bmod 1 \in [a, b]) - (b - a)| \le (b - a) \cdot \left| \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{n} (Mf_m)\!\left(1 - \frac{2\pi i \ell}{\log B}\right) \right|.
\]
Proof of Kossovsky’s Chain Conjecture for certain densities
Conditions
$\{D_i(\theta)\}_{i \in I}$: one-parameter distributions, densities $f_{D_i(\theta)}$ on $[0, \infty)$.
$p : \mathbb{N} \to I$, $X_1 \sim D_{p(1)}(1)$, $X_m \sim D_{p(m)}(X_{m-1})$.
For $m \ge 2$,
\[
f_m(x_m) = \int_0^\infty f_{D_{p(m)}(1)}\!\left(\frac{x_m}{x_{m-1}}\right) f_{m-1}(x_{m-1})\, \frac{dx_{m-1}}{x_{m-1}}
\]
\[
\lim_{n \to \infty} \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{n} (Mf_{D_{p(m)}(1)})\!\left(1 - \frac{2\pi i \ell}{\log B}\right) = 0
\]
Chains of Random Variables
Return to street problem: chain of uniforms.
Let $D_{\mathrm{unif}}(\theta)$ be the density of a uniform random variable on $[0, \theta]$.
Let $X_1 \sim D_{\mathrm{unif}}(1)$ and $X_{n+1} \sim D_{\mathrm{unif}}(X_n)$.
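The chain of uniforms is easy to simulate, because a uniform variable on $[0, X_n]$ is $X_n$ times a uniform variable on $[0, 1]$. The sketch below is a Monte Carlo illustration only; the sample size, chain length, and seed are arbitrary choices and the function names are hypothetical.

```python
import math
import random


def chain_of_uniforms(n, rng):
    """Sample X_n where X_1 ~ Unif(0, 1) and X_{m+1} ~ Unif(0, X_m)."""
    x = 1.0
    for _ in range(n):
        x *= 1.0 - rng.random()  # uniform on (0, 1], avoids an exact zero
    return x


def first_digit(x):
    """Leading base-10 digit of a positive number, via the mantissa 10^(log10 x mod 1)."""
    frac = math.log10(x) % 1.0
    return min(int(10 ** frac), 9)  # guard against the mantissa rounding to 10


rng = random.Random(0)
samples, n = 20_000, 10
counts = [0] * 10
for _ in range(samples):
    counts[first_digit(chain_of_uniforms(n, rng))] += 1

for d in range(1, 10):
    print(d, counts[d] / samples, round(math.log10(1 + 1 / d), 3))
```

The second column is the empirical frequency of each leading digit and the third is the Benford probability $\log_{10}(1 + 1/d)$.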
Proof of Kossovsky’s Chain Conjecture for certain densities
Theorem (JKKKM): If conditions hold, as $n \to \infty$ the distribution of leading digits of $X_n$ tends to Benford’s law.
The error is a nice function of the Mellin transforms: if $Y_n = \log_B X_n$, then
\[
|\mathrm{Prob}(Y_n \bmod 1 \in [a, b]) - (b - a)| \le \left| (b - a) \cdot \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{n} (Mf_{D_{p(m)}(1)})\!\left(1 - \frac{2\pi i \ell}{\log B}\right) \right|
\]
Example: All $X_i \sim \mathrm{Exp}(1)$
$X_i \sim \mathrm{Exp}(1)$, $Y_n = \log_B \Xi_n$.
Needed ingredients:
⋄ $\int_0^\infty \exp(-x)\, x^{s-1}\, dx = \Gamma(s)$.
⋄ $|\Gamma(1 + ix)| = \sqrt{\pi x / \sinh(\pi x)}$, $x \in \mathbb{R}$.
\[
|P_n(s) - \log_{10}(s)| \le \log_B s \sum_{\ell=1}^{\infty} \left( \frac{2\pi^2 \ell / \log B}{\sinh(2\pi^2 \ell / \log B)} \right)^{n/2}.
\]
Bounds on the error: $|P_n(s) - \log_{10} s| \le$
⋄ $3.3 \cdot 10^{-3} \log_B s$ if $n = 2$,
⋄ $1.9 \cdot 10^{-4} \log_B s$ if $n = 3$,
⋄ $1.1 \cdot 10^{-5} \log_B s$ if $n = 5$, and
⋄ $3.6 \cdot 10^{-13} \log_B s$ if $n = 10$.
Error at most
\[
\log_{10} s \sum_{\ell=1}^{\infty} \left( \frac{17.148\,\ell}{\exp(8.5726\,\ell)} \right)^{n/2} \le .057^n \log_{10} s
\]
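The series in the $\mathrm{Exp}(1)$ error bound can be evaluated directly for $B = 10$. The sketch below assumes natural logarithms in $2\pi^2/\log B$, which gives the constant 8.5726 that appears in the slide, and truncates the series at 50 terms (later terms are far below double precision). The names `A` and `error_factor` are illustrative.

```python
import math

A = 2 * math.pi ** 2 / math.log(10)  # 2 pi^2 / log B with B = 10, about 8.5726


def error_factor(n, terms=50):
    """sum_{l >= 1} (A l / sinh(A l))^(n/2), the factor multiplying log_B s."""
    return sum(
        (A * l / math.sinh(A * l)) ** (n / 2) for l in range(1, terms + 1)
    )


for n in (2, 3, 5, 10):
    print(n, error_factor(n), 0.057 ** n)
```

The first column of output is the series and the second is the cruder estimate $.057^n$; the series is dominated by its $\ell = 1$ term, so the error shrinks geometrically in $n$. The figures quoted for each $n$ are upper bounds on this factor, so the computed value may sit below them.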