data and variation
![Page 1: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/1.jpg)
Data and Variation
![Page 2: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/2.jpg)
Ways to Represent Data…
There are quite a few!
Let’s look at a few that we have seen, along with some that we saw in previous years.
![Page 3: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/3.jpg)
• Pie Charts
![Page 4: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/4.jpg)
• Raw Data
• Here are all the first quiz scores for the 200 students enrolled in Algebra I.
How’d they do?
![Page 5: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/5.jpg)
• Put them in order.
How’d they do?
![Page 6: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/6.jpg)
• Stem-and-Leaf Plot
How’d they do?
![Page 7: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/7.jpg)
• Frequency Histogram
How’d they do?
![Page 8: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/8.jpg)
• Same data, different histogram
How’d they do?
![Page 9: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/9.jpg)
Measures of Central Tendency
• What is the “average” versus the average?
• Average can mean different things!
– MEAN: the sum of all the data values divided by the number of values
– MEDIAN: the data point in the middle when a data set is ordered from lowest to highest (with an even number of values, the mean of the two middle points)
– MODE: the most commonly occurring data value(s)
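The three measures can be compared side by side with a short sketch. It uses Python's standard `statistics` module; the `scores` list is made up for illustration and is not the quiz data from the slides.

```python
from statistics import mean, median, multimode

# Hypothetical quiz scores, invented for this example.
scores = [70, 85, 85, 90, 100]

average = mean(scores)      # sum of the values divided by how many there are
middle = median(scores)     # middle value once the data are sorted
common = multimode(scores)  # most frequent value(s), returned as a list

print(average, middle, common)
```

`multimode` returns a list because a data set can have more than one mode.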
![Page 10: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/10.jpg)
• Each one can be used in any situation, but any of them can be misleading or fail to give an accurate picture of the entire data set.
• Which would you use to find the average price to fill your tank with gas?
• Which would you use to find the average salary of graduates of your school?
• Which would you use to find the average number of pets in a family?
• Which would you use to find the average test score?
![Page 11: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/11.jpg)
Variation
• 2000 Batting Averages
• Highest was 0.372
• 1920 Batting Averages
• Highest was over 0.400 and 2 players were in the 0.380s
![Page 12: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/12.jpg)
What do you see?
• 2000 Batting Averages
• Not much variation in data
• 1920 Batting Averages
• More variation in data
![Page 13: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/13.jpg)
Measuring Variation
• Five-Number Summary
– Minimum Value
– Maximum Value
– Median Value of all data
– Median of Bottom Half of Data (1st quartile)
– Median of Top Half of Data (3rd quartile)
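The five numbers can be computed directly from the "median of each half" description. The sketch below is one possible reading of it: it assumes the middle value is left out of both halves when the count is odd (textbooks differ on this), and the function name and sample `scores` are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]              # bottom half (middle value excluded if the count is odd)
    upper = data[len(data) - half:]  # top half
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical scores, invented for this example.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95]
summary = five_number_summary(scores)
print(summary)
```

Statistical software often uses interpolation rules for quartiles, so its Q1 and Q3 may differ slightly from this hand method.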
![Page 14: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/14.jpg)
Box and Whisker Plots
• Here is a plot of the exam data from before.
• Dots are outliers (values that lie below Q1 or above Q3 by more than 1.5 times the distance from Q1 to Q3).
• How’d they do?
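The outlier rule for the dots can be sketched as a pair of "fences" placed 1.5 times the Q1-to-Q3 distance beyond each quartile. The helper name, the sample scores, and the quartile values below are assumptions for illustration, not the exam data from the plot.

```python
def find_outliers(values, q1, q3):
    """Return the values lying more than 1.5 * (Q3 - Q1) beyond either quartile."""
    spread = q3 - q1                 # distance from Q1 to Q3
    low_fence = q1 - 1.5 * spread
    high_fence = q3 + 1.5 * spread
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical scores; Q1 = 65 and Q3 = 86 are the medians of the bottom and top halves.
scores = [20, 62, 68, 70, 75, 80, 84, 88, 95]
outliers = find_outliers(scores, q1=65, q3=86)
print(outliers)
```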
![Page 15: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/15.jpg)
Accuracy in Measurement
• 100 people are given a fancy new laser that will measure a person’s height. Here are the results when the 100 people measured the same girl.
![Page 16: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/16.jpg)
Measuring Variation
• Calculate the Mean.
• Find out how far each value is from the mean.
• How far on average is each value from the mean?
• This is called the deviation from the mean.
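The steps above can be sketched in a few lines. The measurements are invented for illustration. Note that the signed distances always cancel to zero, so this sketch averages their absolute values; standard deviation handles the same problem by squaring instead.

```python
from statistics import mean

# Hypothetical height measurements in inches, invented for this example.
measurements = [62, 64, 65, 66, 68]

center = mean(measurements)                      # step 1: the mean
deviations = [x - center for x in measurements]  # step 2: how far each value is from the mean
average_distance = mean(abs(d) for d in deviations)  # step 3: typical distance from the mean

print(center, deviations, average_distance)
```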
![Page 17: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/17.jpg)
Standard Deviation
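One common way to write the standard deviation is shown below. This is the population form, which divides by n; it is an assumption that the slides use this form rather than the sample form, which divides by n - 1. Here the x_i are the data values and x-bar is their mean.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```

In words: square each deviation from the mean, average the squares, then take the square root.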
![Page 18: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/18.jpg)
Look back at our data…
• The standard deviation of the height of the girl was 0.2”.
• The standard deviation of 1920 batting averages is 0.050 and of 2000 batting averages is 0.038. Smaller standard deviation implies the data is more tightly grouped.
• The standard deviation of exam scores is 14.782. (Large due to outliers that affect the mean as well.)
![Page 19: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/19.jpg)
Shapes of Graphs
• Graphs can be skewed in one direction or the other.
• Graphs of batting averages and height were symmetrical around the central value.
• Exam scores were not symmetrical since most students scored higher. This is skewed to the left (where the tail is).
• A graph skewed to the right means the tail is on the right side of the graph.
![Page 20: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/20.jpg)
Salaries at Corporations
• They are skewed to the right.
• There are fewer people at the top of the ladder, who make the most money.
• Because the data is skewed to the right, the mean is HIGHER than the median.
• Median is best for describing the average employee salary, while Mean is best when doing payroll calculations and budgets.
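A small made-up payroll shows how a few large values pull the mean above the median in right-skewed data. The salary figures are hypothetical.

```python
from statistics import mean, median

# Hypothetical salaries: five typical employees and one executive.
salaries = [40_000, 45_000, 50_000, 55_000, 60_000, 400_000]

mean_salary = mean(salaries)      # pulled upward by the single large salary
median_salary = median(salaries)  # stays near the typical employee

print(round(mean_salary), median_salary)
```

The total payroll is the mean times the number of employees, which is why the mean suits budgeting while the median better describes a typical employee.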
![Page 21: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/21.jpg)
![Page 22: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/22.jpg)
Housing Prices
• Skewed to the right.
• Mean pulled in direction of skew relative to median.
• Mean is HIGHER than median.
![Page 23: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/23.jpg)
• Exam scores
• Data is skewed to the left.
• Mean is LOWER than median.
![Page 24: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/24.jpg)
![Page 25: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/25.jpg)
Example #3
• The following histogram shows the exam scores for 30 students in a freshman accounting class. Estimate the mean of these scores. Is the standard deviation of these scores likely to be closer to 12 or to 25?
![Page 26: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/26.jpg)
Answer to Example #3
• The mean score is approximately 70. The standard deviation is more likely to be closer to 12: about half of the scores are within 10 of 70, and the other half are more than 10 but less than 30 away, so the typical distance from the mean should come out closer to 12 than to 25.
![Page 27: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/27.jpg)
SAT Scores
• What do you see?
• Bimodal distribution: often seen in test scores. There are students who know what they are doing come exam time and students who do NOT.
![Page 28: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/28.jpg)
Uniform Distributions
• All are around 166 times.
• Theoretically, it should be 166 2/3 times, but that is impossible for real data.
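A simulation can illustrate this. The sketch below assumes the slide shows 1,000 rolls of a fair six-sided die, since 1000/6 = 166 2/3; that setup is an inference, not something stated on the slide. The seed is fixed only to make the run repeatable.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(1000)]
counts = Counter(rolls)

# Each face should come up close to, but not exactly, 1000 / 6 times.
for face in range(1, 7):
    print(face, counts[face])
```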
![Page 29: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/29.jpg)
The Bell Curve
• Most famous of the shapes is the bell-shaped curve, aka normal curve, aka normal distribution, aka Gaussian distribution.
• Appears often in nature and in mathematics.
• Lots of formulas to describe it and analyze it.
• Let’s look at some examples!
![Page 30: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/30.jpg)
![Page 31: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/31.jpg)
![Page 32: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/32.jpg)
Why should we expect bells?
• Around the mean, there should be an expected amount of variation above and below. The larger a deviation from the mean, the less likely it is. Thus we have a cluster in the middle and approximately the same amounts at the high and low ends.
![Page 33: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/33.jpg)
![Page 34: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/34.jpg)
Normal Curves and Standard Deviation
• About 68% of the data differ from the mean by less than one standard deviation.
• About 95% of the data differ from the mean by less than two standard deviations.
• About 99.7% of the data differ from the mean by less than three standard deviations.
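These percentages can be checked against an exact normal curve. The sketch uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)  # normal curve with mean 0, standard deviation 1

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

# Prints values close to the rounded 68%, 95%, and 99.7% figures.
for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```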
![Page 35: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/35.jpg)
![Page 36: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/36.jpg)
Example #1
• All freshmen entering NHS have their heads measured for the beanies they are required to wear. One year the head circumference data had a normal distribution with mean 55 cm and standard deviation 1.7 cm. What percentage of the students that year had a head circumference between 53.3 cm and 56.7 cm? What percentage had circumference above 58.4 cm?
![Page 37: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/37.jpg)
Answer to Example #1
• For data with a normal distribution, about 68% of the values differ from the mean by less than one standard deviation. The normally distributed head measurements have mean 55 cm and standard deviation 1.7 cm, so heads within one standard deviation of the mean will measure between 55 - 1.7 = 53.3 cm and 55 + 1.7 = 56.7 cm. Thus approximately 68% of the freshmen have head circumferences between 53.3 and 56.7 cm.
• For the second question, a head measuring more than 58.4 cm is more than 3.4 cm, or two standard deviations, above the mean. Recall that approximately 95% of the values in a normal distribution are within two standard deviations, so only 5% lie above or below those limits. Because the normal curve is symmetric, that 5% is split evenly between the two tails. Thus, in this case, roughly 5%/2 = 2.5% of the freshmen will have head circumferences measuring more than 58.4 cm.
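The rule-of-thumb answers can be compared with an exact normal curve. This sketch uses `NormalDist` from Python's standard `statistics` module with the mean and standard deviation given in the example; the variable names are made up.

```python
from statistics import NormalDist

heads = NormalDist(mu=55, sigma=1.7)  # head circumference in cm

between = heads.cdf(56.7) - heads.cdf(53.3)  # share within one standard deviation
above = 1 - heads.cdf(58.4)                  # share more than two standard deviations above

print(round(between, 3), round(above, 3))
```

The exact upper-tail share is a little under 2.5% because the 95% figure is itself rounded.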
![Page 38: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/38.jpg)
Example #2
• The average high temperature in Anchorage, Alaska, in January is 21ºF with a standard deviation of 10º. The average high temperature in Honolulu in January is 80ºF with a standard deviation of 8º. In which location would it be more unusual to have a day in January with a high of 57ºF?
![Page 39: Data and Variation](https://reader035.vdocuments.us/reader035/viewer/2022062305/568163be550346895dd4db6a/html5/thumbnails/39.jpg)
Answer to #2
• A January temperature of 57° would be more unusual in Anchorage. This temperature is within three standard deviations (3 * 8° = 24°) of the mean (80°) in Honolulu but is outside the range of three standard deviations (3 * 10° = 30°) of the mean (21°) in Anchorage.
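The comparison can be made more precise by counting how many standard deviations 57° is from each city's mean. The helper name `z_score` is made up for this sketch; the numbers come from the example.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

honolulu = z_score(57, mean=80, sd=8)    # 23 degrees below the mean
anchorage = z_score(57, mean=21, sd=10)  # 36 degrees above the mean

print(honolulu, anchorage)
```

The larger distance in standard deviations, here Anchorage's, marks the more unusual day.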