Students’ Misconceptions in Statistics
HK Mathematics Education Conference 2013
20-June-2013
NG Douglas, CHU Carlin, TSANG Kin Fun
• Common misconceptions
i. Probability/conditional probability
ii. Expectation and Variance
iii. Relationship between two variables
iv. Confidence Interval
v. Standardization and Normal distribution
vi. Relationship among distributions
Background
• Teacher:
– Share the statistical concepts in a systematic way
– Follow a logical sequence to introduce statistical
concepts
• Student:
– May not pay attention to all the pieces
– Learn fragmented parts
– Partial understanding of the subject matter
• Misconception:
– a wrong belief or opinion as a result of not fully understanding
something
• Cause:
– some statistical concepts are interrelated
– students use the portions they know to fill in the missing/unknown part
– similar terminologies
• Remedy:
– Identify the related components and reinforce them
i) Probability/conditional probability
• Find P(A and B), given P(A), P(B) and P(A or B)
Misconception (probability of compound event):
P(A and B) = P(A) P(B)
This holds only under the independence condition.
In general, P(A and B) = P(A) + P(B) - P(A or B).
i) Probability/conditional probability
• Fill the missing piece
– Consider both independent and dependent events
Independence condition
Independent event:
P(A and B) = P(A) P(B)
Dependent event:
P(A and B) = P(A|B) P(B) or P(B|A) P(A)
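A minimal sketch of the two rules, using two fair dice as an assumed example (the events A, B and C are hypothetical and not from the slides). It checks that the product rule works for an independent pair, fails for a dependent pair, and that the conditional and addition rules still give the right answer.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair dice, all 36 outcomes equally likely.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on (d1, d2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def A(o): return o[0] % 2 == 0        # first die is even
def B(o): return o[1] > 4             # second die shows 5 or 6
def C(o): return o[0] + o[1] >= 10    # the sum is at least 10

def both(e1, e2):
    return lambda o: e1(o) and e2(o)

def either(e1, e2):
    return lambda o: e1(o) or e2(o)

# Independent pair (A, B): P(A and B) equals P(A) P(B).
p_ab = prob(both(A, B))
product_ab = prob(A) * prob(B)

# Dependent pair (A, C): the product rule gives the wrong answer ...
p_ac = prob(both(A, C))
product_ac = prob(A) * prob(C)

# ... but P(A and C) = P(C|A) P(A) always holds,
p_c_given_a = p_ac / prob(A)

# and with P(A), P(C), P(A or C) given, the addition rule recovers P(A and C).
p_ac_from_union = prob(A) + prob(C) - prob(either(A, C))

print(p_ab, product_ab)
print(p_ac, product_ac, p_ac_from_union)
```

Exact fractions are used so that the equalities can be compared without rounding error.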
ii) Expectation and Variance
• Evaluate Var(2X)
Misconception: as E(2X) = 2E(X), so
Var(2X) = 2Var(X)
Confusion: Expectation and Variance
ii) Expectation and Variance
• Fill the missing piece
– Relationships between Expectation and Variance
– Related mathematical proofs https://en.wikipedia.org/wiki/Variance
Confusion:
Expectation and
Variance
Relationship:
Var(X) = E[(X - E(X))^2]
       = E[(X - µ)^2]
Proof:
Var(2X) = E[(2X - E(2X))^2]
        = E[(2X - 2µ)^2]
        = E[4X^2 + 4µ^2 - 8Xµ]
        = 4E[X^2 + µ^2 - 2Xµ]
        = 4E[(X - µ)^2]
        = 4Var(X)
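The result can be checked numerically. The sketch below assumes X is the score of one fair die (an example distribution chosen for illustration, not from the slides) and computes the expectation and variance exactly from the definition Var(X) = E[(X - µ)^2].

```python
from fractions import Fraction

# Assumed example: X is the score of one fair die.
values = range(1, 7)
p = Fraction(1, 6)

def expectation(f):
    """E[f(X)] for the fair-die distribution."""
    return sum(p * f(x) for x in values)

def variance(f):
    """Var(f(X)) = E[(f(X) - E[f(X)])^2]."""
    mean = expectation(f)
    return expectation(lambda x: (f(x) - mean) ** 2)

e_x = expectation(lambda x: x)
var_x = variance(lambda x: x)
e_2x = expectation(lambda x: 2 * x)
var_2x = variance(lambda x: 2 * x)

print("E(2X) / E(X)     =", e_2x / e_x)      # scaling factor for the mean
print("Var(2X) / Var(X) =", var_2x / var_x)  # scaling factor for the variance
```

The mean scales by the constant, while the variance scales by the square of the constant.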
ii) Expectation and Variance
• Evaluate Var(X+Y) and Var(X-Y) (Out of DSE syllabus)
– Some simple equations may …
E(X+Y) = E(X) + E(Y)
When X, Y are independent:
Var(X+Y) = Var(X) + Var(Y)
Var(X-Y) = Var(X) + Var(Y)
When X, Y are not independent:
Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y)
Var(X-Y) = Var(X) + Var(Y) - 2Cov(X,Y)
Expectation and Variance have different equations
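A small check of the covariance terms, assuming a hypothetical joint distribution for two dependent 0/1 variables (the probabilities below are made up for illustration). It compares the variance of X+Y and X-Y, computed directly, with the formulas for dependent variables.

```python
from fractions import Fraction

# Hypothetical joint distribution P(X=x, Y=y); X and Y are dependent.
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

def E(f):
    """E[f(X, Y)] under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

def var(f):
    """Var(f(X, Y)) from the definition E[(f - E[f])^2]."""
    m = E(f)
    return E(lambda x, y: (f(x, y) - m) ** 2)

def X(x, y): return x
def Y(x, y): return y

cov_xy = E(lambda x, y: x * y) - E(X) * E(Y)

var_sum = var(lambda x, y: x + y)     # Var(X+Y), computed directly
var_diff = var(lambda x, y: x - y)    # Var(X-Y), computed directly

print("Var(X)+Var(Y) =", var(X) + var(Y))
print("Var(X+Y)      =", var_sum)
print("Var(X-Y)      =", var_diff)
print("Cov(X,Y)      =", cov_xy)
```

Because Cov(X,Y) is not zero here, neither Var(X+Y) nor Var(X-Y) equals Var(X)+Var(Y), while E(X+Y) = E(X)+E(Y) still holds.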
iii) Relationship between two variables
• What is the conclusion of zero correlation?
Independence => zero correlation
Misconception: therefore,
zero correlation => independence
Zero correlation
iii) Relationship between two variables
• Fill the missing piece
– Mathematical example http://mathforum.org/library/drmath/view/64808.html
http://en.wikipedia.org/wiki/Correlation_and_dependence
http://www.purplemath.com/modules/scattreg2.htm
Zero correlation
does not imply
independence
Pictorial examples
Other example: Y = X^2, with X distributed symmetrically about 0
X and Y are clearly dependent
Their correlation is zero
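The Y = X^2 example can be verified exactly. The sketch assumes X is uniform on {-2, -1, 0, 1, 2} (an illustrative choice; any distribution symmetric about 0 would do) and shows a zero covariance together with a conditional probability that differs from the unconditional one.

```python
from fractions import Fraction

# Assumed example: X is uniform on a set symmetric about 0, and Y = X^2.
xs = [-2, -1, 0, 1, 2]
p = Fraction(1, 5)

e_x = sum(p * x for x in xs)
e_y = sum(p * x**2 for x in xs)
e_xy = sum(p * x * x**2 for x in xs)
cov_xy = e_xy - e_x * e_y          # zero covariance means zero correlation

# Dependence: knowing X changes the distribution of Y.
p_y4 = sum(p for x in xs if x**2 == 4)
p_x2 = sum(p for x in xs if x == 2)
p_x2_and_y4 = sum(p for x in xs if x == 2 and x**2 == 4)
p_y4_given_x2 = p_x2_and_y4 / p_x2

print("Cov(X, Y)      =", cov_xy)
print("P(Y=4)         =", p_y4)
print("P(Y=4 | X=2)   =", p_y4_given_x2)
```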
iv) Confidence Interval
“We are 95% confident that the true value of the parameter is in our confidence interval”
Confident => Probability?
Which position is a more reasonable guess of the true parameter value?
P1 P2
iv) Confidence Interval
• Fill the missing piece
– http://en.wikipedia.org/wiki/Confidence_interval
Concepts of
Confidence
interval
Meaning:
If we draw N samples and construct N different CIs, about 95% of those confidence intervals will contain the true value of the parameter.
After a sample is taken, the population parameter is either in the interval constructed or not; no chance is involved any more.
i.e. Prob(θ in the interval | sample observation) = 0 or 1
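The repeated-sampling meaning can be illustrated by simulation. This is a sketch under stated assumptions: a normal population with a known standard deviation, so that the interval is sample mean ± 1.96 σ/√n. The population parameters, sample size and number of repetitions are all chosen for illustration.

```python
import random
from statistics import mean

random.seed(2013)                   # fixed seed so the run is repeatable
TRUE_MEAN, SIGMA = 50.0, 10.0       # assumed population parameters
N_SAMPLES, SAMPLE_SIZE = 2000, 30   # assumed number of samples and sample size
Z_95 = 1.96                         # standard normal quantile for a 95% interval

half_width = Z_95 * SIGMA / SAMPLE_SIZE ** 0.5
hits = 0
for _ in range(N_SAMPLES):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(SAMPLE_SIZE)]
    centre = mean(sample)
    # Each individual interval either contains the true mean or it does not.
    if centre - half_width <= TRUE_MEAN <= centre + half_width:
        hits += 1

coverage = hits / N_SAMPLES
print(f"{coverage:.3f} of the {N_SAMPLES} intervals contain the true mean")
```

The 95% describes the procedure over many samples; for any single interval in the loop the answer is simply yes or no.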
iv) Confidence Interval (Advanced)
Misconception: if the 95% CI that’s constructed for one
sample partially overlaps the 95% CI
that’s constructed from a second
independent sample, the two sample
statistics are not significantly different
from each other at α = 0.05.
CI of first sample CI of second sample
Compare the
CI ?
iv) Confidence Interval (Advanced)
• Fill the missing piece
– Mathematical example http://www.statisticalmisconceptions.com/MiscAndInvite07b.html
http://www.cscu.cornell.edu/news/statnews/stnews73.pdf
http://www.measuringusability.com/blog/ci-10things.php
Concepts of
Confidence
interval
Applications:
Hypothesis testing
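A counterexample to the overlap rule can be sketched with made-up numbers. Assumptions: two independent sample means with known standard errors (the values 0, 3 and 1 are hypothetical), normal-based 95% intervals, and a two-sided z-test for the difference.

```python
from math import sqrt

Z_95 = 1.96                      # two-sided critical value at alpha = 0.05

# Hypothetical sample means and standard errors of two independent samples.
mean1, se1 = 0.0, 1.0
mean2, se2 = 3.0, 1.0

ci1 = (mean1 - Z_95 * se1, mean1 + Z_95 * se1)
ci2 = (mean2 - Z_95 * se2, mean2 + Z_95 * se2)
overlap = ci1[1] > ci2[0] and ci2[1] > ci1[0]

# The test for the difference uses the standard error of the difference,
# sqrt(se1^2 + se2^2), which is smaller than se1 + se2.
z = (mean2 - mean1) / sqrt(se1**2 + se2**2)
significant = abs(z) > Z_95

print("CI 1:", ci1, "CI 2:", ci2)
print("intervals overlap:", overlap)
print("z =", round(z, 2), "significant at 0.05:", significant)
```

Here the two intervals overlap, yet the difference between the two sample means is significant at α = 0.05.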
v) Standardization and Normal distribution
Is standardized data normally distributed?
Standardization is used together with the Normal distribution most of the time
v) Standardization and Normal distribution
Confusion
Standardization ensures only that
Mean = 0
SD = 1
It does not change the shape of the distribution.
W = (X - µ)/σ
Shifting: using µ
Scaling: using σ
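A quick check that W = (X - µ)/σ fixes the mean and SD but not the shape. The data set is a small right-skewed list made up for illustration, and skewness (the mean of the cubed standardized values) is used here as one simple measure of shape.

```python
from statistics import mean, pstdev

# Hypothetical right-skewed data, clearly not normally distributed.
data = [1, 1, 1, 2, 2, 3, 10]

def skewness(xs):
    """Population skewness: mean of the cubed standardized values."""
    m, s = mean(xs), pstdev(xs)
    return mean(((x - m) / s) ** 3 for x in xs)

mu, sigma = mean(data), pstdev(data)
w = [(x - mu) / sigma for x in data]    # shift by mu, then scale by sigma

print("mean of W:", round(mean(w), 6))
print("SD of W:  ", round(pstdev(w), 6))
print("skewness before:", round(skewness(data), 4))
print("skewness after: ", round(skewness(w), 4))
```

The standardized data are just as skewed as the original data, so they are no closer to a Normal distribution.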
vi) Relationship among distributions
[Diagram: relationships among the Normal, Poisson, Geometric, Bernoulli and Binomial distributions]
• Ref: http://math.wustl.edu/~jmding/math493/dist.pdf
Out of DSE syllabus
• Ref: http://curricular.providence.edu/~rgoldstein/statistics/BinPoissNorm.pdf
• Poisson or Binomial distribution
• If a mean or average probability of an event happening per unit time/per page/per mile cycled etc. is given, and you are asked to calculate the probability of n events happening in a given time/number of pages/number of miles cycled, then the Poisson Distribution is used.
• If, on the other hand, an exact probability of an event happening is given, or implied, in the question, and you are asked to calculate the probability of this event happening k times out of n, then the Binomial Distribution must be used.
• Ref: http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/se202/poiss_bin.html
• http://curricular.providence.edu/~rgoldstein/statistics/BinPoissNorm.pdf
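The two cases can be sketched side by side. The questions below are hypothetical examples (a mean of 2 misprints per page; 10 items each defective with probability 0.1), and the helper names poisson_pmf and binomial_pmf are introduced only for this illustration.

```python
from math import comb, exp, factorial

def poisson_pmf(n, rate):
    """P(exactly n events) when the mean number of events per unit is `rate`."""
    return exp(-rate) * rate ** n / factorial(n)

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with probability p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# A mean rate is given: 2 misprints per page on average.
# Probability of exactly 3 misprints on a page -> Poisson.
p_three_misprints = poisson_pmf(3, 2.0)

# An exact probability per trial is given: each of 10 items is defective
# with probability 0.1. Probability of exactly 2 defective -> Binomial.
p_two_defective = binomial_pmf(2, 10, 0.1)

print(round(p_three_misprints, 4))
print(round(p_two_defective, 4))
```

The choice follows the wording of the question: a rate per unit points to Poisson, while a fixed number of trials n with a per-trial probability points to Binomial.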
Reference
• http://www.statisticalmisconceptions.com/
• http://blog.minitab.com/blog/real-world-quality-improvement/3-common-and-dangerous-statistical-misconceptions
• http://statswithcats.wordpress.com/2011/02/20/six-misconceptions-about-statistics-you-may-get-from-stats-101/
• https://en.wikipedia.org/wiki/Variance
• http://mathforum.org/library/drmath/view/64808.html
• http://en.wikipedia.org/wiki/Correlation_and_dependence
• http://www.purplemath.com/modules/scattreg2.htm
• http://www.measuringusability.com/blog/ci-10things.php
• http://math.wustl.edu/~jmding/math493/dist.pdf
• http://personal.maths.surrey.ac.uk/st/J.Deane/Teach/se202/poiss_bin.html
• http://curricular.providence.edu/~rgoldstein/statistics/BinPoissNorm.pdf
iPhone Statistics App (free)
Probability/conditional probability
If someone is diagnosed as having a
very rare and fatal disease, and if the
procedure used to come up with this
diagnosis is 99 percent accurate, then
the person who’s been diagnosed has a
right to feel that “the end is near.”
• Fill the missing piece
– http://www.statisticalmisconceptions.com/MiscAndInvite05f.html
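The missing piece here is the base rate, which Bayes' theorem brings in. The sketch below rests on two assumptions that are not in the slide: the disease affects 1 person in 10,000, and "99 percent accurate" means both a 99% true-positive rate and a 99% true-negative rate.

```python
# ASSUMPTION: prevalence of the "very rare" disease (not given in the slide).
prevalence = 1 / 10_000
# ASSUMPTION: "99 percent accurate" read as sensitivity = specificity = 0.99.
sensitivity = 0.99      # P(positive | disease)
specificity = 0.99      # P(negative | no disease)

# Total probability of a positive diagnosis (true positives + false positives).
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: P(disease | positive).
p_disease_given_positive = sensitivity * prevalence / p_positive

print(f"P(disease | positive diagnosis) = {p_disease_given_positive:.4f}")
```

Under these assumptions most positive diagnoses are false positives, because healthy people vastly outnumber the people who have the disease.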