04 statistics presentation_notes
Statistics…blerg
09/08/09
What is statistics?
From the American Statistical Association:
• Statistics is the scientific application of mathematical principles to the collection, analysis, and presentation of numerical data. Statisticians contribute to scientific inquiry by applying their mathematical and statistical knowledge to the design of surveys and experiments; the collection, processing, and analysis of data; and the interpretation of the results.
Why do I need to know statistics?
Short answer:
• To complete your internal assessments correctly (the lab component of your IB score, 24%)
Why do I need to know statistics?
Life answers:
1. To be able to effectively conduct research
  • To make decisions based upon collected numerical data
2. To be able to read journals, to make meaning of results of studies, to understand sports!
3. To develop critical and analytic thinking skills
4. To be an informed consumer and not be misled by erroneous reports of statistical results
What parts of statistics do I need to know?
• Mean
• Standard deviation
• Error bars
• Significant difference with a t-test
• Correlation and causation
Mean
• The most commonly used measure of central tendency is the mean, or arithmetic average (sum of data points divided by the number of points)
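
The definition above can be sketched in a few lines of Python. The trial values are made up for illustration; they are not data from the presentation.

```python
from statistics import mean

# Hypothetical data: five trials for one level of the independent variable.
trials = [4.0, 5.0, 6.0, 5.0, 5.0]

# Arithmetic mean = sum of the data points / number of points.
manual_mean = sum(trials) / len(trials)

# The standard library computes the same quantity.
library_mean = mean(trials)

print(manual_mean, library_mean)
```

Both values should agree, which is a quick way to check a hand calculation.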
How to calculate mean on a TI-83
• Push the STAT button
• Choose EDIT
• Input your data in the LISTS
  • Each level of independent variable is a separate list with at least 5 trials for each
How to calculate mean on a TI-83
• Choose CALC
• Choose 1-VAR Stats
  • This will provide you with the following information on ONE of your LISTS (levels of independent variable)
    • Mean
    • Sum of the data points
    • Sum of the squares of the data points
    • Sample SD (this is the one you want)
    • Population SD
    • Number of data points (better be at least 5)
    • Minimum value
    • First quartile value
    • Median
    • Third quartile value
    • Maximum value
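
For readers without the calculator, here is a rough Python sketch that produces the same kinds of values for one list. The function name `one_var_stats` and the data are hypothetical. The quartiles use the median of the lower and upper halves, leaving out the middle value when the count is odd; that choice is an assumption and other tools may give slightly different quartiles.

```python
from statistics import mean, median, stdev, pstdev

def one_var_stats(data):
    """Return summary values like those listed above for one list (needs 2+ values)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the middle position
    upper = ordered[half + (n % 2):]  # values above the middle position
    return {
        "mean": mean(ordered),
        "sum": sum(ordered),
        "sum_of_squares": sum(x * x for x in ordered),
        "sample_sd": stdev(ordered),       # divides by n - 1
        "population_sd": pstdev(ordered),  # divides by n
        "n": n,
        "min": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "max": ordered[-1],
    }

# Hypothetical trials for one level of the independent variable.
stats = one_var_stats([2, 4, 4, 5, 10])
for name, value in stats.items():
    print(name, value)
```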
How to calculate mean in Excel
• Input your data into the cells
• Highlight all of the cells of the data you want the mean for
• Click the Σ drop down option
• Select Average
  • The average will appear below your highlighted cells
Standard Deviation
• Standard deviation is used to summarize the spread of values around the mean.
• About 68% of the values of a normal distribution fall within one standard deviation of the mean (±1 SD)
• About 95% of the values fall within 2 SD
• About 99.7% of the values fall within 3 SD
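
These percentages can be checked against the normal distribution itself. The sketch below uses Python's `statistics.NormalDist`; the helper name `fraction_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def fraction_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```

The fractions come out close to 68%, 95%, and 99.7%. They apply only when the data are roughly normally distributed.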
Standard Deviation
• Standard Deviation (SD) can be used to compare populations or sets of data.
• The closer the means and the SDs of two data sets are, the more likely the populations studied are the same or similar.
• A small SD indicates that the data is clustered closely around the mean value.
  • When completing your statistics you want to aim for a small SD
• A large SD indicates a wider spread around the mean.
  • This may mean that your collection techniques were flawed
• Smaller samples show more variation due to random factors, so small samples are unreliable.
  • Because of this, you will always aim for a 5x5 experiment
    • 5 levels of independent variable, 5 trials (or more) of each
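
As an illustration of small versus large SD, the sketch below computes the sample SD by hand (dividing by n - 1) for two made-up sets of five trials that share the same mean. The data and the function name `sample_sd` are assumptions for this example only.

```python
from math import sqrt
from statistics import mean, stdev

def sample_sd(data):
    """Sample standard deviation: sum of squared deviations divided by n - 1."""
    m = mean(data)
    squared_deviations = [(x - m) ** 2 for x in data]
    return sqrt(sum(squared_deviations) / (len(data) - 1))

# Hypothetical trial sets with the same mean (10) but different spread.
tight = [9, 10, 10, 10, 11]
wide = [2, 6, 10, 14, 18]

print("tight:", mean(tight), sample_sd(tight))
print("wide: ", mean(wide), sample_sd(wide))

# The hand calculation should match the library's sample SD.
print(stdev(tight), stdev(wide))
```

The two sets have identical means, so only the SD reveals that the second set is far more spread out.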
How to compute SD on a TI-83
THIS SHOULD HAVE BEEN COMPLETED PREVIOUSLY FOR MEAN
• Push the STAT button
• Choose EDIT
• Input your data in the LISTS
  • Each level of independent variable is a separate list with at least 5 trials for each
How to compute SD on a TI-83
• Choose CALC
• Choose 1-VAR Stats
  • This will provide you with the following information on ONE of your LISTS (levels of independent variable)
    • Mean
    • Sum of the data points
    • Sum of the squares of the data points
    • Sample SD (this is the one you want)
    • Population SD
    • Number of data points (better be at least 5)
    • Minimum value
    • First quartile value
    • Median
    • Third quartile value
    • Maximum value
How to compute SD on Excel
• Input your data into the cells
• Highlight all of the cells of the data you want the SD for
• Click the Σ drop down option
• Select More functions…
• Choose STDEV
  • The SD will appear as formula result in a box
Error Bars
• The simplest way to draw an error bar is to use the mean as the central point, and to use the distance of the measurement that is furthest from the average as the endpoints of the data bar
• Can also use standard deviation divided by the square root of the sample size to compute standard error and use that to make error bars
  • You want to have a LOW standard error
Drawing Error Bars
[Figure: error bar diagram labeled "Average value", "Value farthest from average", and "Calculated distance"]
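
The two ways of sizing an error bar described for error bars can be sketched as follows. The function names and the trial data are hypothetical; the standard error uses the sample SD divided by the square root of the sample size.

```python
from math import sqrt
from statistics import mean, stdev

def max_deviation_bar(data):
    """Bar half-length: distance from the mean to the value farthest from it."""
    m = mean(data)
    return max(abs(x - m) for x in data)

def standard_error(data):
    """Standard error = sample SD / square root of the sample size."""
    return stdev(data) / sqrt(len(data))

# Hypothetical trials for one level of the independent variable.
trials = [2, 4, 4, 5, 10]

center = mean(trials)
print("center of bar:", center)
print("bar from farthest value:", center - max_deviation_bar(trials), "to", center + max_deviation_bar(trials))
print("bar from standard error:", center - standard_error(trials), "to", center + standard_error(trials))
```

The farthest-value bar is driven by a single extreme trial, while the standard-error bar shrinks as more trials are added.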
Using error bars to explain data…
• If the bars show extensive overlap, it is likely that there is not a significant difference between those values
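
A small sketch of how overlap between two error bars might be measured. The function name `overlap_length` and the numbers are made up, and how much overlap counts as "extensive" remains a judgment call that this code does not settle.

```python
def overlap_length(mean_a, half_a, mean_b, half_b):
    """Length of the region shared by the bars mean_a ± half_a and mean_b ± half_b."""
    low_a, high_a = mean_a - half_a, mean_a + half_a
    low_b, high_b = mean_b - half_b, mean_b + half_b
    # A negative difference means the bars do not touch, so report 0.
    return max(0.0, min(high_a, high_b) - max(low_a, low_b))

# Hypothetical means and bar half-lengths for two levels of a variable.
close_levels = overlap_length(5.0, 1.0, 5.5, 1.0)  # bars 4.0-6.0 and 4.5-6.5
far_levels = overlap_length(5.0, 0.5, 8.0, 0.5)    # bars 4.5-5.5 and 7.5-8.5

print(close_levels, far_levels)
```

Overlap is only a visual guide; a t-test is the more formal check for a significant difference.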
Significant Difference (t-tests)
• A t-test compares the averages and standard deviations of two samples to see if there is a significant difference between them
• Find the critical value of t for the relevant number of degrees of freedom
  • Degrees of freedom = (n1 + n2) – 2
• If the calculated value is below the critical value there is no sig. diff. between the two sets of data
  • t < critical value = no sig. diff.
• If the calculated value is above the critical value there is a sig. diff. between the two sets of data
  • t > critical value = sig. diff.
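
The comparison above can be sketched in Python. This example assumes the pooled (equal-variance) form of the two-sample t statistic, chosen because it matches the degrees-of-freedom formula (n1 + n2) – 2; other forms of the test exist. The data and names are hypothetical, and 2.306 is the standard two-tailed critical value at p = 0.05 for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample_1, sample_2):
    """Return (|t|, degrees of freedom) for a pooled two-sample t-test."""
    n1, n2 = len(sample_1), len(sample_2)
    df = (n1 + n2) - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / df
    t = (mean(sample_1) - mean(sample_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return abs(t), df

# Hypothetical data: two levels of the independent variable, five trials each.
group_a = [1, 2, 3, 4, 5]
group_b = [4, 5, 6, 7, 8]

t_value, df = pooled_t(group_a, group_b)
CRITICAL_T_DF8 = 2.306  # two-tailed, p = 0.05, 8 degrees of freedom

significant = t_value > CRITICAL_T_DF8
print(t_value, df, "sig. diff." if significant else "no sig. diff.")
```

With different sample sizes the degrees of freedom change, so the critical value must be looked up again in a t table.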
Using a TI-83 to compute Sig. Diff.
THIS SHOULD HAVE BEEN COMPLETED PREVIOUSLY FOR MEAN AND SD
• Push the STAT button
• Choose EDIT
• Input your data in the LISTS
  • Each level of independent variable is a separate list with at least 5 trials for each
Using a TI-83 to compute Sig. Diff.
• Push STAT
• Arrow over to TESTS
  • Typically you are going to be comparing your sets of data against each other for significant difference
• Arrow down to 4:2-SampTTest
• Choose Data, press ENTER
• Choose the Lists you wish to compare; do not POOL your data
• Choose Calculate and hit ENTER
• T value will be listed (use absolute value)
• Compare to listed critical value
How to compute Sig. Diff. on Excel
• Input your data into the cells
• Highlight all of the cells of the data you want to compare
• Click the Σ drop down option
• Select More functions…
• Select the category Statistical
• Choose TTEST
• Array 1 = A1:A5 (the first cell:the last cell of the data)
• Array 2 = B1:B5
• Tails = 2
• Type = 1
  • In Excel, type 1 is a paired test; types 2 and 3 are two-sample tests with equal and unequal variance
• Formula Result will provide the answer
  • TTEST returns the probability (p-value) for the test, not the t value
Correlation and Causation
• An action can correlate with another (such as smoking is correlated with alcoholism)
  • A relation existing between statistical variables which tend to vary, be associated, or occur together in a way not expected on the basis of chance alone
• An action or occurrence can cause another (such as smoking causes lung cancer)
  • The act or agency which produces an effect
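
One common way to put a number on how strongly two variables vary together is the Pearson correlation coefficient r. The slides do not name a specific measure, so using r here is an assumption, and the data are made up. A value of r near +1 or -1 shows a strong association, but it says nothing about whether one variable causes the other.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired values xs and ys."""
    mx, my = mean(xs), mean(ys)
    co_deviation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

# Hypothetical paired measurements.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # increases with xs
falling = [10, 8, 6, 4, 2]  # decreases as xs increases

print(pearson_r(xs, rising), pearson_r(xs, falling))
```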
Correlation and Causation
• Typically, one can only establish correlation unless the effects are extremely notable and there is no reasonable explanation that challenges causality.
• Without clear reasons to accept causality, we should only accept correlation.