Transcript
  • Slide 1

Here is a back-to-back stemplot of the pulse rates of female and male students in one AP Statistics class. Write a few sentences comparing the two distributions.

Females Males 0 10 754319 0002 88642008 04688 886207 024578 7426 00234679 55 488 4 8

Here is a time plot from buzz.yahoo.com that shows the (illegal) downloading of music using the peer-to-peer software LimeWire during the period May 14 to August 6, 2006.
(a) Write a few sentences to describe what this plot reveals.
(b) There is a small peak in the middle of the plot that doesn't fit the overall pattern. Explain this blip.

  • Slide 2

  • Slide 3

1.2 Describing Distributions with Numbers

How much is a house worth? Manhattan, Kansas, is sometimes called the "Little Apple" to distinguish it from the other Manhattan. A few years ago, a house there appeared in the county appraiser's records at $200,059,000 (true value: $59,500). Before the error was discovered, the county, city, and school board had based their budgets on the total appraised value of real estate, which the one outlier jacked up by 6.5%.

  • Slide 4

  • Slide 5

Mean/Median (Centers)
Both measure center in different ways, but both are useful.
Use the median if you want a typical number.
Mean = Arithmetic Average Value.
The mean and median of a symmetric distribution are close together. If a distribution is exactly symmetric, mean = median.
In a skewed distribution, the mean is farther out in the long tail than the median.

  • Slide 6

Male/Female Surgeons (# of hysterectomies performed)
Put in ascending order (male doctors), odd number of observations:
20 25 25 27 28 31 33 34 36 37 44 50 59 85 86
Min, Q1, M, Q3, Max
Put in ascending order (female doctors), even number of observations:
5 7 10 14 18 19 25 29 31 33
Min, Q1, M = 18.5, Q3, Max

  • Slide 7

Measures of Spread
Range = Largest - Smallest observation in a list. What's the problem with this?
Better measure of spread: quartiles.
Range
Quartiles
5 # Summary
Variance
Standard Deviation

  • Slide 8

  • Slide 9

A modified boxplot plots outliers as isolated points.
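As a worked check on the surgeon data from Slide 6, the five-number summary can be sketched in a few lines of Python. This sketch assumes the convention in which Q1 and Q3 are the medians of the observations below and above the median's position; other software may use a different quartile rule and give slightly different values. The function name five_number_summary is illustrative and does not come from the slides.

```python
from statistics import mean, median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the middle observation when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # observations below the median's position
    upper = data[half + n % 2:]   # observations above the median's position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hysterectomy counts from Slide 6
male = [20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86]
female = [5, 7, 10, 14, 18, 19, 25, 29, 31, 33]

print("male  :", five_number_summary(male))    # (20, 27, 34, 50, 86)
print("female:", five_number_summary(female))  # (5, 10, 18.5, 29, 33)

# Compare the two measures of center for the male doctors.
print("male mean:", round(mean(male), 1), "male median:", median(male))
```

The male mean (about 41.3) sits above the male median (34), which is what Slide 5 says to expect when a distribution has a long right tail.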
  • Slide 10

Boxplots
You can instantly see that female doctors perform fewer hysterectomies than male doctors. Also, there is less variation among female doctors.

  • Slide 11

Notes on boxplots
Best used for side-by-side comparisons of more than one distribution.
Less detail than histograms or stemplots.
Always include the numerical scale.

  • Slide 12

Travel Times to Work #1
How long does it take you to get from home to school? Here are the travel times from home to work in minutes for 15 workers in North Carolina, chosen at random by the Census Bureau:
30 20 10 40 25 20 10 60 15 40 5 30 12 10 10

  • Slide 13

The distribution
Describe the distribution.
Is the longest travel time (60 minutes) an outlier?
How many of the travel times are larger than the mean?
If you leave out the large time, how does that change the mean?
The mean in this example is nonresistant because it is sensitive to the influence of extreme observations. The mean is the arithmetic average, but it may not be a typical number!

  • Slide 14

You do: Travel Times to Work #2
Travel times to work in New York State are (on the average) longer than in North Carolina. Here are the travel times in minutes of 20 randomly chosen New York workers:
10 30 5 25 40 20 10 15 30 20 15 20 85 15 65 15 60 60 40 45

  • Slide 15

Interquartile Range (IQR)
IQR = Q3 - Q1. Measures the spread of the middle half of the data.
An observation is an outlier if it is:
Less than Q1 - 1.5(IQR), or
Greater than Q3 + 1.5(IQR)

  • Slide 16

Looking at the spread
Quartiles show the spread of the middle of the data.
The spacing of the quartiles and extremes about the median gives an indication of the symmetry or skewness of the distribution.
Symmetric distributions: the 1st and 3rd quartiles are equally distant from the median.
Right-skewed distributions: the 3rd quartile will be farther above the median than the 1st quartile is below it.

  • Slide 17

Got friends?
Is there a difference between the number of programmed telephone numbers in girls' cell phones and the number of programmed numbers in boys' cell phones? Do you think there is a difference?
If so, in what direction?
1) Count the number of programmed telephone numbers in your cell phone and write the total on a piece of paper.
2) Make a back-to-back stemplot of this information, then draw boxplots. When you test for outliers, how many do you find for males and how many do you find for females using the 1.5 x IQR test?
3) Find the 5 # Summary for each group. Compare the two distributions (SOCS!).
4) It is important in any study that you have data integrity (the data are reported accurately and truthfully). Do you think this is the case here? Do you see any suspicious observations? Can you think of any reason someone may make up a response or stretch the truth? If you DO see a difference between the two groups, can you suggest a possible reason for this difference?
5) Do you think a study of cell phone programmed numbers for a sophomore algebra class would yield similar results? Why or why not?

  • Slide 18

Spring 09 Student Data
Girls: 5345724136222106237 7529615427570134
Boys: 298 65819535141247 6017633
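The same 1.5 x IQR test used in the activity can be sketched in Python on the North Carolina travel times from Slide 12, along with the mean with and without the 60-minute trip. This is a sketch under assumptions and not a prescribed method. Quartiles are taken as the medians of the lower and upper halves of the ordered data, and the helper name quartiles is illustrative.

```python
from statistics import mean, median

# North Carolina travel times in minutes (Slide 12)
nc_times = [30, 20, 10, 40, 25, 20, 10, 60, 15, 40, 5, 30, 12, 10, 10]

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    Assumption: the middle observation is left out of both halves
    when the number of observations is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    return median(data[:half]), median(data[half + n % 2:])

q1, q3 = quartiles(nc_times)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Observations strictly outside the fences are flagged as outliers.
flagged = [t for t in nc_times if t < low_fence or t > high_fence]

longest = max(nc_times)
mean_all = mean(nc_times)
mean_without_longest = mean([t for t in nc_times if t != longest])
above_mean = sum(1 for t in nc_times if t > mean_all)

print("Q1, Q3, IQR:", q1, q3, iqr)
print("fences:", low_fence, high_fence)
print("flagged outliers:", flagged)
print("mean:", round(mean_all, 1), "without longest:", round(mean_without_longest, 1))
print("times above the mean:", above_mean)
```

Under this quartile convention the upper fence is exactly 60 minutes, so the 60-minute trip sits on the fence and is not flagged. Dropping it still lowers the mean from about 22.5 to about 19.8 minutes, which is the sense in which the mean is nonresistant.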

