TRANSCRIPT
Source: math.mit.edu/~sheffield/2019600/Lecture30.pdf
18.600: Lecture 30
Central limit theorem
Scott Sheffield
MIT
Outline
Central limit theorem
Proving the central limit theorem
Recall: DeMoivre-Laplace limit theorem

- Let X_i be an i.i.d. sequence of random variables. Write S_n = ∑_{i=1}^n X_i.
- Suppose each X_i is 1 with probability p and 0 with probability q = 1 − p.
- DeMoivre-Laplace limit theorem:

  lim_{n→∞} P{ a ≤ (S_n − np)/√(npq) ≤ b } = Φ(b) − Φ(a).

- Here Φ(b) − Φ(a) = P{a ≤ Z ≤ b} when Z is a standard normal random variable.
- (S_n − np)/√(npq) describes the "number of standard deviations that S_n is above or below its mean".
- Question: Does a similar statement hold if the X_i are i.i.d. but have some other probability distribution?
- Central limit theorem: Yes, if they have finite variance.
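This binomial case is easy to check numerically. The sketch below (not part of the lecture; the choices n = 400, p = 0.3, and the trial count are arbitrary) estimates P{a ≤ (S_n − np)/√(npq) ≤ b} by Monte Carlo and compares it with Φ(b) − Φ(a), computing Φ from math.erf.

```python
import math
import random

def Phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def binomial_interval_prob(n, p, a, b, trials=10000, seed=0):
    """Monte Carlo estimate of P{a <= (S_n - np)/sqrt(npq) <= b}
    for S_n a sum of n independent Bernoulli(p) variables."""
    rng = random.Random(seed)
    sd = math.sqrt(n * p * (1 - p))
    hits = 0
    for _ in range(trials):
        s = sum(1 for _ in range(n) if rng.random() < p)
        if a <= (s - n * p) / sd <= b:
            hits += 1
    return hits / trials

n, p, a, b = 400, 0.3, -1.0, 1.0
print(binomial_interval_prob(n, p, a, b), Phi(b) - Phi(a))  # both near 0.683
```

The agreement is already close at n = 400; the residual gap is Monte Carlo noise plus the discreteness of the binomial.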
Example

- Say we roll 10^6 ordinary dice independently of each other.
- Let X_i be the number on the ith die. Let X = ∑_{i=1}^{10^6} X_i be the total of the numbers rolled.
- What is E[X]? 10^6 · (7/2)
- What is Var[X]? 10^6 · (35/12)
- How about SD[X] = √Var[X]? 1000 √(35/12)
- What is the probability that X is less than a standard deviations above its mean?
- Central limit theorem: should be about (1/√(2π)) ∫_{−∞}^{a} e^{−x²/2} dx, i.e., Φ(a).
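A quick sanity check (mine, not from the lecture): simulating 10^6 dice per trial is slow in pure Python, so this sketch uses 1000 dice per trial, an arbitrary scaled-down choice; the CLT prediction Φ(a) is the same either way.

```python
import math
import random

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def dice_below_z(n_dice, a, trials=3000, seed=0):
    """Fraction of trials in which the total of n_dice fair-die rolls
    is less than a standard deviations above its mean."""
    rng = random.Random(seed)
    mean = n_dice * 7 / 2
    sd = math.sqrt(n_dice * 35 / 12)
    count = 0
    for _ in range(trials):
        total = sum(rng.randint(1, 6) for _ in range(n_dice))
        if total < mean + a * sd:
            count += 1
    return count / trials

# The lecture uses 10**6 dice; 1000 keeps the pure-Python loop fast.
print(dice_below_z(1000, 1.0), Phi(1.0))  # both near 0.841
```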
Example

- Suppose earthquakes in some region are a Poisson point process with rate λ equal to 1 per year.
- Let X be the number of earthquakes that occur over a ten-thousand-year period. X should be a Poisson random variable with rate 10000.
- What is E[X]? 10000
- What is Var[X]? 10000
- How about SD[X]? 100
- What is the probability that X is less than a standard deviations above its mean?
- Central limit theorem: should be about (1/√(2π)) ∫_{−∞}^{a} e^{−x²/2} dx, i.e., Φ(a).
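Here the normal approximation can be checked without simulation, since the Poisson CDF can be summed exactly. The sketch below (my own illustration, not from the lecture) assembles each pmf term in log space via math.lgamma so that λ = 10000 does not overflow.

```python
import math

def Phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam): each term exp(j*log(lam) - lam - log(j!))
    is built in log space, so large lam causes no overflow."""
    log_lam = math.log(lam)
    return sum(math.exp(j * log_lam - lam - math.lgamma(j + 1))
               for j in range(k + 1))

lam, a = 10000, 1.0
exact = poisson_cdf(int(lam + a * 100), lam)  # P(X <= mean + a*SD)
print(exact, Phi(a))  # both near 0.84
```

The residual gap (roughly half the pmf at the cutoff) is the usual continuity-correction effect for an integer-valued variable.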
General statement

- Let X_i be an i.i.d. sequence of random variables with finite mean µ and variance σ².
- Write S_n = ∑_{i=1}^n X_i. So E[S_n] = nµ, Var[S_n] = nσ², and SD[S_n] = σ√n.
- Write B_n = (X_1 + X_2 + ... + X_n − nµ)/(σ√n). Then B_n is the difference between S_n and its expectation, measured in standard deviation units.
- Central limit theorem:

  lim_{n→∞} P{ a ≤ B_n ≤ b } = Φ(b) − Φ(a).
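The theorem requires no symmetry in the summands. As an illustration (mine, not the lecture's), the sketch below uses exponential(1) summands, which are quite skewed but have µ = σ = 1, and checks P{a ≤ B_n ≤ b} against Φ(b) − Φ(a); the sample sizes and trial count are arbitrary.

```python
import math
import random

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def clt_estimate(n, a, b, trials=4000, seed=0):
    """Monte Carlo estimate of P{a <= B_n <= b}, where
    B_n = (S_n - n*mu)/(sigma*sqrt(n)) for i.i.d. Exponential(1)
    summands (mu = sigma = 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Exponential(1) via inverse transform; 1 - random() lies in (0, 1].
        s = sum(-math.log(1.0 - rng.random()) for _ in range(n))
        if a <= (s - n) / math.sqrt(n) <= b:
            hits += 1
    return hits / trials

print(clt_estimate(500, -1.0, 1.0), Phi(1.0) - Phi(-1.0))  # both near 0.68
```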
Proving the central limit theorem
Recall: characteristic functions

- Let X be a random variable.
- The characteristic function of X is defined by φ(t) = φ_X(t) := E[e^{itX}]. Like M(t) except with i thrown in.
- Recall that by definition e^{it} = cos(t) + i sin(t).
- Characteristic functions are similar to moment generating functions in some ways.
- For example, φ_{X+Y} = φ_X φ_Y, just as M_{X+Y} = M_X M_Y, if X and Y are independent.
- And φ_{aX}(t) = φ_X(at), just as M_{aX}(t) = M_X(at).
- And if X has an mth moment, then E[X^m] = i^{−m} φ_X^{(m)}(0).
- Characteristic functions are well defined at all t for all random variables X (since |e^{itX}| = 1, the expectation always exists).
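These properties can be checked numerically for a distribution whose characteristic function is known in closed form. The sketch below (an illustration under my own choice of Bernoulli(p), not from the lecture) verifies the product rule for an independent sum and recovers E[X] from φ'(0) by a central difference.

```python
import cmath

def phi_bernoulli(t, p):
    """Characteristic function of Bernoulli(p): E[e^{itX}] = q + p*e^{it}."""
    return (1 - p) + p * cmath.exp(1j * t)

p, q, t = 0.3, 0.7, 0.8

# phi_{X+Y} = phi_X * phi_Y for independent X, Y: the sum of two independent
# Bernoulli(p) variables is Binomial(2, p), whose characteristic function
# can be written out term by term and compared with the product.
phi_sum = q * q + 2 * p * q * cmath.exp(1j * t) + p * p * cmath.exp(2j * t)
print(abs(phi_sum - phi_bernoulli(t, p) ** 2))  # essentially 0

# Moment identity E[X] = i^{-1} * phi'(0), with phi'(0) from a central difference.
h = 1e-6
deriv = (phi_bernoulli(h, p) - phi_bernoulli(-h, p)) / (2 * h)
print((deriv / 1j).real)  # close to E[X] = p = 0.3
```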
Rephrasing the theorem

- Let X be a random variable and Xn a sequence of random variables.
- Say Xn converge in distribution or converge in law to X if lim_{n→∞} F_{Xn}(x) = F_X(x) at all x ∈ R at which F_X is continuous.
- Recall: the weak law of large numbers can be rephrased as the statement that An = (X1 + X2 + ... + Xn)/n converges in law to µ (i.e., to the random variable that is equal to µ with probability one) as n → ∞.
- The central limit theorem can be rephrased as the statement that Bn = (X1 + X2 + ... + Xn − nµ)/(σ√n) converges in law to a standard normal random variable as n → ∞.
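This convergence in law can be watched empirically. A minimal simulation sketch (my own, not part of the slides; Exponential(1) is a convenient illustrative choice for the Xi because µ = σ = 1) compares the empirical CDF of Bn with Φ at a few points:

```python
import math
import random

def phi_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(1)
n, trials = 400, 10_000
mu = sigma = 1.0                     # Exponential(1): mean 1, standard deviation 1
bn = []
for _ in range(trials):
    s = sum(random.expovariate(1.0) for _ in range(n))
    bn.append((s - n * mu) / (sigma * math.sqrt(n)))

emp = {x: sum(b <= x for b in bn) / trials for x in (-1.0, 0.0, 1.0)}
for x, p in emp.items():
    print(x, p, phi_cdf(x))          # empirical CDF vs Phi(x)
```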
Continuity theorems

- Lévy's continuity theorem (see Wikipedia): if lim_{n→∞} φ_{Xn}(t) = φ_X(t) for all t, then Xn converge in law to X.
- By this theorem, we can prove the central limit theorem by showing lim_{n→∞} φ_{Bn}(t) = e^{−t²/2} for all t.
- Moment generating function continuity theorem: if moment generating functions M_{Xn}(t) are defined for all t and n, and lim_{n→∞} M_{Xn}(t) = M_X(t) for all t, then Xn converge in law to X.
- By this theorem, we can prove the central limit theorem by showing lim_{n→∞} M_{Bn}(t) = e^{t²/2} for all t.
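The target e^{t²/2} is indeed the MGF of a standard normal, which a quick numerical integration confirms (a sketch of mine; the integration limits and step count are ad hoc choices):

```python
import math

def normal_mgf(t, lo=-12.0, hi=12.0, steps=20_000):
    """E[e^{tZ}] for a standard normal Z, by the trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        w = 0.5 if k in (0, steps) else 1.0   # trapezoid endpoint weights
        total += w * math.exp(t * x) * math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    return total * h

for t in (0.5, 1.0, 2.0):
    print(t, normal_mgf(t), math.exp(t * t / 2))   # the two columns agree
```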
Proof of central limit theorem with moment generating functions

- Write Y = (X − µ)/σ. Then Y has mean zero and variance 1.
- Write M_Y(t) = E[e^{tY}] and g(t) = log M_Y(t). So M_Y(t) = e^{g(t)}.
- We know g(0) = 0. Also M'_Y(0) = E[Y] = 0 and M''_Y(0) = E[Y²] = Var[Y] = 1.
- Chain rule: M'_Y(0) = g'(0)e^{g(0)} = g'(0) = 0 and M''_Y(0) = g''(0)e^{g(0)} + g'(0)²e^{g(0)} = g''(0) = 1.
- So g is a nice function with g(0) = g'(0) = 0 and g''(0) = 1. Taylor expansion: g(t) = t²/2 + o(t²) for t near zero.
- Now Bn is 1/√n times the sum of n independent copies of Y.
- So M_{Bn}(t) = (M_Y(t/√n))^n = e^{n g(t/√n)}.
- But e^{n g(t/√n)} ≈ e^{n(t/√n)²/2} = e^{t²/2}, in the sense that the left-hand side tends to e^{t²/2} as n tends to infinity.
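The key limit, n·g(t/√n) → t²/2, can be watched numerically whenever g is available in closed form. For example (an illustration of mine, not from the lecture), if X ~ Exponential(1) then Y = X − 1 has M_Y(t) = e^{−t}/(1 − t) for t < 1, so g(t) = −t − log(1 − t):

```python
import math

def g(t):
    """g(t) = log M_Y(t) for Y = X - 1, X ~ Exponential(1): M_Y(t) = e^{-t} / (1 - t)."""
    return -t - math.log(1.0 - t)

t = 1.0
for n in (10, 100, 10_000, 1_000_000):
    print(n, n * g(t / math.sqrt(n)))   # approaches t**2 / 2 = 0.5
```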
Proof of central limit theorem with characteristic functions

- The moment generating function proof only applies if the moment generating function of X exists.
- But the proof can be repeated almost verbatim using characteristic functions instead of moment generating functions.
- Then it applies to any X with finite variance.
Almost verbatim: replace M_Y(t) with φ_Y(t)

- Write φ_Y(t) = E[e^{itY}] and g(t) = log φ_Y(t). So φ_Y(t) = e^{g(t)}.
- We know g(0) = 0. Also φ'_Y(0) = iE[Y] = 0 and φ''_Y(0) = i²E[Y²] = −Var[Y] = −1.
- Chain rule: φ'_Y(0) = g'(0)e^{g(0)} = g'(0) = 0 and φ''_Y(0) = g''(0)e^{g(0)} + g'(0)²e^{g(0)} = g''(0) = −1.
- So g is a nice function with g(0) = g'(0) = 0 and g''(0) = −1. Taylor expansion: g(t) = −t²/2 + o(t²) for t near zero.
- Now Bn is 1/√n times the sum of n independent copies of Y.
- So φ_{Bn}(t) = (φ_Y(t/√n))^n = e^{n g(t/√n)}.
- But e^{n g(t/√n)} ≈ e^{−n(t/√n)²/2} = e^{−t²/2}, in the sense that the left-hand side tends to e^{−t²/2} as n tends to infinity.
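To see the characteristic-function version at work on a distribution whose MGF does not exist, here is a Monte Carlo sketch (my own; the symmetrized Pareto(3.5) is an illustrative choice: mean zero, finite variance, heavy tails, no MGF) estimating φ_{Bn}(1) and comparing it with e^{−1/2}:

```python
import cmath
import math
import random

random.seed(3)
alpha = 3.5   # |X| ~ Pareto(3.5) with a random sign: finite variance, no MGF

def sample():
    x = random.paretovariate(alpha)
    return x if random.random() < 0.5 else -x

sigma = math.sqrt(alpha / (alpha - 2))   # sqrt(E[X^2]); E[X] = 0 by symmetry

n, trials, t = 300, 10_000, 1.0
acc = 0j
for _ in range(trials):
    bn = sum(sample() for _ in range(n)) / (sigma * math.sqrt(n))
    acc += cmath.exp(1j * t * bn)
print(abs(acc / trials), math.exp(-t * t / 2))   # empirical phi_{B_n}(1) vs e^{-1/2}
```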
Perspective

- The central limit theorem is actually fairly robust. Variants of the theorem still apply if you allow the Xi not to be identically distributed, or not to be completely independent.
- We won't formulate these variants precisely in this course.
- But, roughly speaking, if you have a lot of little random terms that are "mostly independent", and no single term contributes more than a "small fraction" of the total sum, then the total sum should be "approximately" normal.
- Example: if height is determined by lots of little mostly independent factors, then people's heights should be normally distributed.
- Not quite true: certain factors can by themselves cause a person to be a whole lot shorter or taller, and the individual factors are not really independent of each other.
- Kind of true for a homogeneous population, ignoring outliers.
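The robustness heuristic can be illustrated with a toy simulation (entirely my own construction; the particular mix of uniform, coin-flip, and triangular "factors" with varying scales is arbitrary): many independent but non-identically-distributed bounded terms, none dominating the sum, still add up to something approximately normal.

```python
import math
import random

random.seed(4)
# 200 independent "little factors", deliberately NOT identically distributed.
scales = [0.5 + (k % 7) / 7 for k in range(200)]

def one_sum():
    total = 0.0
    for k, a in enumerate(scales):
        if k % 3 == 0:
            total += random.uniform(-a, a)               # variance a^2 / 3
        elif k % 3 == 1:
            total += a if random.random() < 0.5 else -a  # variance a^2
        else:
            total += random.triangular(-a, a)            # variance a^2 / 6
    return total

var = sum(a * a / 3 if k % 3 == 0 else a * a if k % 3 == 1 else a * a / 6
          for k, a in enumerate(scales))
sd = math.sqrt(var)

trials = 10_000
vals = [one_sum() / sd for _ in range(trials)]
emp = sum(v <= 1.0 for v in vals) / trials
print(emp)   # close to Phi(1) = 0.8413...
```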