Statistical Analysis
IB Topic 1
Why study statistics?
Scientists use the scientific method when designing experiments
Observations and experiments result in the collection of measurable data
Statistics is a branch of mathematics which allows us to sample small portions and draw conclusions about the larger population
Words to know and love …
Mean
  Average of data points
  Sum divided by the total
Range
  Measures the spread of data
  Difference between the largest and smallest
  Very large or small values are called outliers
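The two definitions above can be sketched in a few lines of Python. The plant heights below are made-up values for illustration, not data from the packet, and the function names are hypothetical.

```python
def mean(values):
    """Sum of the data points divided by how many there are."""
    return sum(values) / len(values)


def data_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)


# Hypothetical plant heights in cm (not from the packet).
# The last value, 52, is much larger than the rest: an outlier.
heights = [10, 12, 15, 11, 52]

print(mean(heights))        # 20.0
print(data_range(heights))  # 42
```

Notice how the single outlier pulls the mean up to 20 cm even though four of the five plants are shorter than 16 cm.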
More words …
Standard deviation (SD)
  A measure of how data are dispersed or spread around the mean
  Determined by a mathematical formula (which you do NOT need to know)
  Use your calculator or an online program
Error bars
  Graphical representation of the variability of data
  Error bars can show either the range of data OR the SD
  Look at Figures 1.1 and 1.2 in your packet
Standard Deviation
In a normal distribution, about 68% of all values lie within +/- 1 SD of the mean
This rises to about 95% for +/- 2 SD from the mean
The SD tells us how tightly the data points are clustered around the mean
  Clustered together = small SD
  Spread out = large SD
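If you would like to check the 68% and 95% figures yourself, here is a minimal sketch using Python's standard `statistics` module. The data are simulated, the mean of 100 cm and SD of 15 cm are assumptions chosen only for illustration, and `fraction_within` is a hypothetical helper name.

```python
import random
import statistics


def fraction_within(values, k):
    """Fraction of values lying within k sample SDs of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    inside = [v for v in values if abs(v - mean) <= k * sd]
    return len(inside) / len(values)


# Simulated, roughly normal "heights" (assumed mean 100 cm, SD 15 cm).
random.seed(1)
heights = [random.gauss(100, 15) for _ in range(10_000)]

print(round(fraction_within(heights, 1), 2))  # close to 0.68
print(round(fraction_within(heights, 2), 2))  # close to 0.95

# Two made-up data sets with the same mean (20) but different spread.
tight = [19, 20, 20, 21, 20]
spread = [5, 35, 20, 10, 30]
print(statistics.stdev(tight) < statistics.stdev(spread))  # True
```

`statistics.stdev` is the sample SD (it divides by n - 1); `statistics.pstdev` is the population SD (it divides by n). For a class data set either one makes the same point: clustered data give a small SD, spread-out data give a large SD.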
Graphical Interpretation
Why is this useful?
SD tells you how much of the data lies at the extremes, far from the mean
Questions:
What is the shape of the graph of a normal distribution of data points?
If there are 100 bean plants represented by the bell curve, how many will be within one standard deviation of the mean?
Comparing the means and spread of data between two or more samples
Open your packet to page 6 and look at the data table for the bean plants
First, calculate the mean
Look at the data: how would you describe the values for both sets of data?
How can we quantify your observations about the variability of the data?
Find the standard deviation
  Use your calculator
  Don’t worry about the equation (unless you want to)
Options …
TI 83, 84: http://www.saintmarys.edu/~cpeltier/calcforstat/StatTI-83.html
TI 86: http://www.saintmarys.edu/~cpeltier/calcforstat/StatTI-86.html
Online calculator: http://www.graphpad.com/quickcalcs/ttest1.cfm
Back to the bean plants
SD in sunlight = 17.68 cm
SD in shade = 47.02 cm
Looking at the means alone, it appears there is no difference between the two sets of data
However, the high SD of the plants grown in the shade tells us what?
How confident can we be in the data?
What conclusions can we draw from just looking at the mean?
Question…
If all the data values are equal, such as 7, 7, 7, 7, what is the standard deviation of this set of four data points?
Answer
0. If all values are the same, there’s no deviation from the mean
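The answer can be confirmed with a quick check, sketched here with Python's standard `statistics` module (any calculator or online program should give the same result).

```python
import statistics

values = [7, 7, 7, 7]

# Every value equals the mean, so every deviation from the mean is zero.
print(statistics.mean(values))    # 7
print(statistics.stdev(values))   # 0.0 (sample SD)
print(statistics.pstdev(values))  # 0.0 (population SD)
```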
Question
If the daily temperatures of city A range from 10 °C to 30 °C for one month, the mean temperature may be 20 °C. Another city B may also have a mean temperature of 20 °C for the same month. However, the range of city B is only 15 °C to 25 °C.
Which city has a temperature with a higher standard deviation?
Which city can give a more accurate prediction of weather and why?
Answer
City A has a higher standard deviation
City B, since it has a very narrow range of temperature, or a very low standard deviation
Significant difference and the t-test
The t-test is used to determine whether or not the difference between two sets of data is a significant (real) difference
We use a Table of t values (page 8)
You do not need to memorize this! This is a tool scientists use
How to navigate the table
Probability (p)
  Bottom of the table: the probability (p) that chance alone could make a difference
  0.50 = difference is due to chance 50% of the time
  This is not a significant difference
  Statisticians are never 100% certain, but like to be at least 95% certain
Degrees of freedom
  Sum of the sample sizes of the two groups, minus two
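The degrees-of-freedom rule is simple arithmetic. As a small sketch (the function name is hypothetical):

```python
def degrees_of_freedom(n1, n2):
    """Sum of the two sample sizes, minus two."""
    return n1 + n2 - 2


# Two groups of 50 measurements each.
print(degrees_of_freedom(50, 50))  # 98
```

This is the number you use to pick the row of the t table.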
Practice
Looking at the table …
If the degrees of freedom are 9 and the given value of t is 2.60, the table indicates that the t value is just greater than 2.26.
Looking at the bottom of the table, the probability that chance alone could produce the result is only 5%
This means there is a 95% chance that the difference is significant
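The table lookup above amounts to comparing your t value with the table value for your degrees of freedom. The sketch below uses the numbers from this practice slide (t = 2.60, table value 2.26 for 9 degrees of freedom at p = 0.05). It also shows one common way a t value is calculated from two samples, assuming the equal-variance (pooled) form of the t-test; the slides do not say which form the packet uses, and the two small samples are made up for illustration.

```python
import math
import statistics


def t_statistic(sample_a, sample_b):
    """Student's t for two independent samples, pooling their variances.

    Assumption: the equal-variance form of the test.
    """
    n1, n2 = len(sample_a), len(sample_b)
    var1 = statistics.variance(sample_a)
    var2 = statistics.variance(sample_b)
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    difference = statistics.mean(sample_a) - statistics.mean(sample_b)
    return difference / math.sqrt(pooled * (1 / n1 + 1 / n2))


def is_significant(t_value, table_value):
    """True when t is larger than the table value for the chosen p."""
    return abs(t_value) > table_value


# Numbers from the practice slide: 9 degrees of freedom, p = 0.05.
TABLE_VALUE_DF9_P05 = 2.26
print(is_significant(2.60, TABLE_VALUE_DF9_P05))  # True

# Made-up samples, only to show the calculation running.
print(round(t_statistic([2, 4, 6], [1, 3, 5]), 3))
```

In class you will be given the t value, so only the comparison step matters; the calculation is what your calculator or the online program does for you.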
Worked example 1.5
What are the degrees of freedom used to determine the probability that the differences between the two groups are due to chance?
Using the given t value of 2.00 with your calculated degrees of freedom, what is the probability that chance alone can produce a difference in the heights of these girls?
How confident are we that the British girls are taller than the US girls based on this sample size?
Answers
98 (50 + 50 - 2)
5%
95% confident
Correlation and Causation
Observing something can suggest correlation
Experiments provide a test which shows cause
Observations without an experiment can only show a correlation.
Question
For years we have known that there is a high positive correlation between smoking and lung cancer. Does this high positive correlation prove that smoking causes lung cancer?
How can the cause of lung cancer be determined?
Africanized Honey Bees (AHB)
Are there any volunteers who can summarize the relationship between correlation and causation using this example?
Cormorants
Example of using a mathematical correlation test
The value of r = correlation
  +1 (complete positive correlation) to 0 (no correlation) to -1 (complete negative correlation)
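To see where the +1 to -1 scale comes from, here is a minimal sketch of one common correlation measure, Pearson's r. The slides do not name the test used in the cormorant example, so treating r as Pearson's r is an assumption, and the data below are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


# Made-up data: y rises in step with x, then falls in step with x.
xs = [1, 2, 3, 4]
print(round(pearson_r(xs, [2, 4, 6, 8]), 2))  # 1.0, complete positive correlation
print(round(pearson_r(xs, [8, 6, 4, 2]), 2))  # -1.0, complete negative correlation
```

A value of r near +1 or -1 shows a strong correlation, but on its own it still does not show cause.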
Exit slip
Tear and share paper
Name, date, period in the upper right-hand corner
Title: Exit Slip
Exit Slip
1. What is standard deviation used for?
2. What is an error bar?