chahine hypothesis testing,
Hypothesis Testing
Foundations I
September 26, 2011
2011-2012

Objectives
• Differentiate three measures of central tendency, including their advantages and disadvantages
• Explain the rationale of hypothesis testing
• Define the null and alternative hypotheses
• Define and interpret: p-value, test statistic, type I and II error, alpha, beta, and statistical power
• Explain how statistical power and sample size are related and describe other factors influencing power

Levels of Measurement
• Categorical (nominal)
• Ordinal
• Interval
• Ratio

Categorical Data
• Non-ordered data
• Often represents different categories: sex, eye colour, genotypes, etc.
• An average would be meaningless
• More meaningful to talk about: different categories, proportions, percentages, or mode

Ordinal Data
• Ordered data
• The distance between the data points may vary
• E.g., placement in a race, perceived level of pain, or a depression scale
• 7 is greater than 5 and 5 is greater than 3, but the difference between 7 and 5 may not be the same as the difference between 5 and 3
• An average is not meaningful here; finding the middle number (the median) may be more meaningful and most consistent

Interval Data
• Very similar to ordinal data, but the differences are consistent
• E.g., temperature in Celsius or Fahrenheit
• The difference between 20 and 30 is the same as the difference between 40 and 50
• Very well-designed rating scales can gather interval data
• Important to note that 0 is arbitrary in interval data: 0 °C does not mean an absence of temperature
• An average (mean) is meaningful unless the data are skewed

Ratio Data
• Very similar to interval data, except that 0 is meaningful
• E.g., tracking growth of bacteria, or the height and weight of babies
• Someone can be twice as tall as another person; however, we cannot say something is twice as hot or cold unless it is measured in Kelvin (in Kelvin, a temperature of 0 is meaningful)
• The average is very useful, and many statistical procedures for ratio data are based on means; however, if the data are skewed, the median is more useful

Central Tendency
If you wanted to describe a population or a group of people using one or two numbers, you could say:
• On average, students in this class scored about 75% on the last exam….
• In this class, the most frequent eye colour is….
• In a small sub-sample of 10 students, the middle score on the exam was….

Mean, Median & Mode
• Depending on the type and quality of your data, the mean, median, or mode may be more suitable for describing the typical value of your data, or its central tendency
• Statistical analyses build on different summaries of the data: Analysis of Variance and t-tests compare means, while Chi-Square analysis works with the counts in each category
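
As a rough illustration of the three measures, the sketch below uses Python's standard `statistics` module. The exam scores and eye colours are made-up values chosen only to show how one extreme score pulls the mean away from the median; they are not data from these slides.

```python
import statistics

# Hypothetical exam scores (illustrative only); 98 is an outlier that skews the data.
exam_scores = [70, 72, 75, 75, 78, 80, 98]

# Hypothetical eye colours: categorical data, so only the mode is meaningful.
eye_colours = ["brown", "blue", "brown", "green", "brown"]

mean_score = statistics.mean(exam_scores)      # pulled upward by the outlier
median_score = statistics.median(exam_scores)  # middle value, robust to the outlier
mode_colour = statistics.mode(eye_colours)     # most frequent category

print("Mean score:  ", round(mean_score, 2))
print("Median score:", median_score)
print("Modal colour:", mode_colour)
```

With skewed data like this, the median stays at the middle score while the mean drifts toward the outlier, which is why the median is often preferred for skewed interval or ratio data.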

Descriptive vs. Inferential Statistics
• Descriptive statistics describe the sample or population, usually by providing values of range, maximum, minimum, central tendency, and variance (the average of the squared differences from the mean)
• Inferential statistics are often used when you do not have access to the entire population and want to make an inference about this population from a sample

A Conjecture…
• After doing a great deal of reading, the dean of a well-known US medical school believed that, in general, the students in medical programs have an average IQ of 135
• This is a conjecture about an entire population of undergraduate medical students

Hypothesis Testing: Step 1
We can test the dean’s conjecture…
Null Hypothesis - H0: µ = 135
Alternative Hypothesis - HA: µ ≠ 135
We test the conjecture or hypothesis by making it the null

Role of Software
• Computer programs such as SPSS, SAS, R, STATA, etc. have built-in algorithms to carry out what you might do by hand
• It is important to initially do this by hand to understand what it means to reject, or fail to reject, the null hypothesis

Hypothesis Testing: Step 2
• Because we are making a prediction about a population from a sample, we are not dealing with absolutes and the result is not exact.
• We need to select a criterion, or significance level, by which we can either reject or fail to reject the null hypothesis.
• Most often the criterion or significance level is set at .05. It is also referred to as α; the p-value calculated from the sample is compared against it.
• At what point is the difference between the sample mean and 135 too large to attribute to chance?

Hypothesis Testing: Step 3
- We sample 10 students (df = 10 - 1 = 9)
- Area of acceptance is 95% (two-tailed, α = .05)
- Look up critical values on a t-score table (±2.262)

Hypothesis Testing: Step 4
We need to randomly draw a sample of 10 students:
115, 140, 133, 125, 120, 126, 136, 124, 132, 129
Mean = 128

Hypothesis Testing: Step 5
We need to calculate the Standard Deviation (SD) & Standard Error (SE)
How many of you have heard of standard deviation before?
How many of you know what it means?
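
Before SD we need to understand variance
Variance: the sum of the squared deviations from the mean, divided by n - 1 for a sample
Standard Deviation: the square root of the variance; can be thought of as an average deviation from the mean
Standard Error: an estimate of the SD of the sample mean (SD divided by the square root of n), used in calculating the t-statistic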
IQ Score   Mean   Deviation Score   Deviation Score Squared
115        128     13               169
140        128    -12               144
133        128     -5                25
125        128      3                 9
120        128      8                64
126        128      2                 4
136        128     -8                64
124        128      4                16
132        128     -4                16
129        128     -1                 1
Sum                 0               512

Sample Variance (512 / 9)      56.88889
Standard Deviation             7.542472
Standard Error (SD / √10)      2.385139
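
The hand calculation in this table can be reproduced with a short script. This is a minimal sketch using only the Python standard library and the ten IQ scores from Step 4; the variable names are illustrative choices, not part of the original slides.

```python
import math

# The ten IQ scores drawn in Step 4.
iq_scores = [115, 140, 133, 125, 120, 126, 136, 124, 132, 129]
n = len(iq_scores)

sample_mean = sum(iq_scores) / n

# Deviation scores (mean minus score, as in the table) and their squares.
deviations = [sample_mean - score for score in iq_scores]
squared_deviations = [d ** 2 for d in deviations]
sum_of_squares = sum(squared_deviations)

# Sample variance divides by n - 1 because the mean was estimated from the sample.
sample_variance = sum_of_squares / (n - 1)
standard_deviation = math.sqrt(sample_variance)

# Standard error: the estimated standard deviation of the sample mean.
standard_error = standard_deviation / math.sqrt(n)

print("Mean:              ", sample_mean)
print("Sum of squares:    ", sum_of_squares)
print("Sample variance:   ", round(sample_variance, 5))
print("Standard deviation:", round(standard_deviation, 6))
print("Standard error:    ", round(standard_error, 6))
```

The deviation scores always sum to zero, which is why they are squared before being averaged.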

T-Test
• The t-statistic was introduced in 1908 by William Sealy Gosset
• A chemist working for the Guinness brewery in Dublin, Ireland ("Student" was his pen name)
• Gosset devised the t-test as a way to cheaply monitor the quality of stout
• Published the test in Biometrika in 1908

Hypothesis Testing: Steps 6 & 7
t-statistic = (sample mean - hypothesized mean) / standard error
t = (128 - 135) / 2.385
t = -2.935
Because |t| = 2.935 exceeds the critical value of 2.262, we reject the null hypothesis.
“The hypothesis that the mean IQ of the population is 135 was rejected, t = -2.935, df = 9, p < .05.”
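
Steps 1 through 7 can be strung together in one short script. The sketch below uses only the Python standard library; the critical value 2.262 is copied from the t-table lookup in Step 3 rather than computed, and the variable names are illustrative.

```python
import math
import statistics

iq_scores = [115, 140, 133, 125, 120, 126, 136, 124, 132, 129]
hypothesized_mean = 135   # H0: mu = 135 (the dean's conjecture)
critical_value = 2.262    # two-tailed, alpha = .05, df = 9 (from the t table)

n = len(iq_scores)
sample_mean = statistics.mean(iq_scores)

# statistics.stdev uses the n - 1 denominator (sample standard deviation).
standard_error = statistics.stdev(iq_scores) / math.sqrt(n)

t_statistic = (sample_mean - hypothesized_mean) / standard_error

# Two-tailed decision rule: reject H0 when |t| falls outside the critical values.
reject_null = abs(t_statistic) > critical_value

print("t =", round(t_statistic, 3), "with df =", n - 1)
print("Reject H0" if reject_null else "Fail to reject H0")
```

Statistical software reports the same t-statistic along with an exact p-value, so the table lookup becomes unnecessary, but the decision rule is the one shown here.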

Type I and II Error
• Remember in step 2, we asked how much of the difference between the means we will attribute to chance…
• Measurement is never exact; though some journals and papers vary, a significance level of .05 is typically used (meaning that we accept a 5% chance of rejecting the null hypothesis when it is actually true)
• When we have rejected the null and it is actually true, this is a type I error or “false positive”; its probability is α
• When we have not rejected the null and it is actually false, this is a type II error or “false negative”; its probability is β, and statistical power is 1 - β
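
A small simulation can make the two error types concrete. The sketch below is illustrative only: it assumes IQ scores are normally distributed with a population SD of 7.5, uses a hypothetical "true" mean of 130 for the false-null case, and fixes the random seed; none of these values come from the slides. The critical value 2.262 is the one from Step 3.

```python
import math
import random

HYPOTHESIZED_MEAN = 135   # H0: mu = 135
CRITICAL_VALUE = 2.262    # two-tailed, alpha = .05, df = 9
ASSUMED_SD = 7.5          # assumption: population SD, chosen for illustration

def t_statistic(sample, hypothesized_mean):
    """One-sample t-statistic: (mean - hypothesized mean) / standard error."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - hypothesized_mean) / math.sqrt(variance / n)

def rejection_rate(true_mean, trials=20000, n=10, seed=1):
    """Fraction of simulated samples of size n for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, ASSUMED_SD) for _ in range(n)]
        if abs(t_statistic(sample, HYPOTHESIZED_MEAN)) > CRITICAL_VALUE:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a type I error, so the rate should be near alpha.
type_i_rate = rejection_rate(true_mean=135)

# H0 is false (hypothetical true mean of 130): every non-rejection is a type II error.
power = rejection_rate(true_mean=130)
type_ii_rate = 1 - power

print("Type I error rate (alpha):", type_i_rate)
print("Type II error rate (beta):", type_ii_rate)
print("Power (1 - beta):         ", power)
```

Under these assumptions the type I rate should land close to .05 regardless of the assumed SD, while the type II rate depends heavily on how far the true mean is from 135.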

Power and Measures
• How much power does our prediction have? How much can we infer?
• It depends on sample size and the quality of the measure
• IQ, depression scale scores, and cognitive ability are unobservable
• Growth of bacteria and cellular effects of medication are observable: a ruler can be put to them
• The more directly we can observe what we measure, the smaller the sample we will need
• The more accurate our measure, the smaller the error in our inferences
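
The link between sample size and power can also be sketched by simulation. As before, this is an illustration under stated assumptions rather than a formal power analysis: scores are assumed normal with a population SD of 7.5, the true mean is a hypothetical 130 while H0 says 135, and the critical values are standard two-tailed t-table entries for α = .05.

```python
import math
import random

HYPOTHESIZED_MEAN = 135   # H0: mu = 135
TRUE_MEAN = 130           # assumption: hypothetical true population mean
ASSUMED_SD = 7.5          # assumption: population SD, chosen for illustration

# Two-tailed critical t values at alpha = .05, keyed by sample size (df = n - 1).
CRITICAL_VALUES = {5: 2.776, 10: 2.262, 30: 2.045}

def simulated_power(n, trials=5000, seed=7):
    """Estimate power: the share of samples of size n that reject a false H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(TRUE_MEAN, ASSUMED_SD) for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        t = (mean - HYPOTHESIZED_MEAN) / math.sqrt(variance / n)
        if abs(t) > CRITICAL_VALUES[n]:
            rejections += 1
    return rejections / trials

power_by_n = {n: simulated_power(n) for n in sorted(CRITICAL_VALUES)}

for n, power in power_by_n.items():
    print("n =", n, "estimated power =", power)
```

Larger samples shrink the standard error, so the same true difference produces a larger t-statistic more often. Lowering `ASSUMED_SD` in this sketch, which stands in for a more precise measure, raises power in the same way.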