QMS 6351 Statistics and Research Methods
Chapter 2 Descriptive Statistics:
Tabular and Graphical Methods
Prof. Vera Adamchik
Chapter 2 Outline
• Summarizing Qualitative Data
• Summarizing Quantitative Data
Raw Data
When data are collected, the information obtained from each member of a population or a sample is recorded in the sequence in which it becomes available. This sequence of data recording is random and unranked. Such data, before they are grouped or ranked, are called raw data.
Example: Marada Inn
Guests staying at Marada Inn were asked to rate the quality of their accommodations as excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are shown below.
1. Below Average
2. Above Average
3. Average
4. Above Average
5. Above Average
6. Above Average
7. Above Average
8. Below Average
9. Below Average
10. Average
11. Poor
12. Poor
13. Above Average
14. Excellent
15. Above Average
16. Average
17. Above Average
18. Average
19. Above Average
20. Average
Summarizing Qualitative Data
Tabular Presentation
• Frequency Distribution
• Relative Frequency Distribution
• Percent Frequency Distribution
Graphical Presentation
• Bar Graph
• Pie Chart
Frequency Distribution
• A frequency distribution for qualitative data is a tabular summary of a set of data showing all categories and the number of elements that belong to each of the categories.
• The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original (raw) data.
Example: Marada Inn
• Frequency Distribution
Rating           Frequency
Poor             2
Below Average    3
Average          5
Above Average    9
Excellent        1
Total            20
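The tally behind this table can be reproduced in a few lines. The following is a minimal sketch in Python using only the standard library; the variable names (ratings, categories, frequency) are illustrative choices, not part of the original slides.

```python
from collections import Counter

# The 20 guest ratings from the Marada Inn sample, in recorded (raw) order.
ratings = [
    "Below Average", "Above Average", "Average", "Above Average",
    "Above Average", "Above Average", "Above Average", "Below Average",
    "Below Average", "Average", "Poor", "Poor", "Above Average",
    "Excellent", "Above Average", "Average", "Above Average",
    "Average", "Above Average", "Average",
]

# Fixed category order (worst to best) so the output reads like the table.
categories = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

counts = Counter(ratings)
frequency = {category: counts[category] for category in categories}

for category, f in frequency.items():
    print(f"{category:<15}{f:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")
```

Listing the categories explicitly, rather than relying on the order in which they first appear in the raw data, keeps the rows in their natural ranking.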
Relative Frequency Distribution
• The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.
• Relative frequency of a class = Frequency of that class/Sum of all frequencies
• A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.
Percent Frequency Distribution
• The percent frequency of a class is the
relative frequency multiplied by 100%.
• A percent frequency distribution is a
tabular summary of a set of data
showing the percent frequency for each
class.
Example: Marada Inn
• Relative Frequency and Percent Frequency Distributions
Rating           Relative Freq.    Percent Freq., %
Poor             2/20 = .10        10
Below Average    3/20 = .15        15
Average          5/20 = .25        25
Above Average    9/20 = .45        45
Excellent        1/20 = .05        5
Total            1.00              100
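As a rough illustration of the two definitions, the sketch below derives the relative and percent frequencies from the Marada Inn frequency counts. It is written in Python with illustrative variable names; the percent is computed as 100 × frequency / total, which equals the relative frequency multiplied by 100% while avoiding small floating-point rounding artifacts.

```python
# Frequency counts from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequency.values())  # sum of all frequencies (20)

# Relative frequency of a class = frequency of that class / sum of all frequencies.
relative = {c: f / total for c, f in frequency.items()}

# Percent frequency = relative frequency * 100%.
percent = {c: 100 * f / total for c, f in frequency.items()}

for c in frequency:
    print(f"{c:<15}{relative[c]:>6.2f}{percent[c]:>6.0f}")
```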
Bar Graph
A bar graph is a graphical device for depicting qualitative data that have been summarized in a frequency, relative frequency, or percent frequency distribution.
• On the horizontal axis we specify the labels used for each of the classes.
• A frequency, relative frequency, or percent frequency scale can be used for the vertical axis.
• Using a bar of fixed width drawn above each class label, we extend the height appropriately.
• The bars are separated to emphasize the fact that each class is a separate category.
Example: Marada Inn
Bar Graph
[Figure: bar graph of Frequency (vertical axis, 1 to 9) by Rating (horizontal axis: Poor, Below Average, Average, Above Average, Excellent)]
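A bar graph of these counts can be approximated in plain text. The sketch below is an assumption-laden simplification: it draws the bars horizontally rather than vertically, and text_bar_graph is a hypothetical helper name introduced only for illustration. Each class gets its own separate bar whose length equals its frequency.

```python
# Frequency counts from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}


def text_bar_graph(freq, symbol="#"):
    """Return one text line per class; bar length equals the class frequency."""
    label_width = max(len(label) for label in freq)
    lines = []
    for label, f in freq.items():
        # Pad labels to a common width so all bars start in the same column.
        lines.append(f"{label:<{label_width}} | {symbol * f} {f}")
    return "\n".join(lines)


print(text_bar_graph(frequency))
```

A relative or percent frequency scale could be used instead by passing a mapping of scaled, rounded values in place of the raw counts.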
Pie Chart
• The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.
• First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
• Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
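The sector calculation described above can be sketched as follows, using the Marada Inn relative frequencies. This is a minimal Python illustration with assumed variable names; it only computes the angles and does not draw the chart.

```python
# Relative frequencies from the Marada Inn relative frequency distribution.
relative = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

# Each class takes a share of the 360 degrees equal to its relative frequency.
angles = {c: r * 360 for c, r in relative.items()}

for c, angle in angles.items():
    print(f"{c:<15}{angle:>6.1f} degrees")
```

For example, the Average class, with a relative frequency of .25, receives .25(360) = 90 degrees, and the five sectors together account for the full circle.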
Example: Marada Inn
• Pie Chart
[Figure: pie chart of Ratings: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%]
Example: Hudson Auto Repair
The manager of Hudson would like to get a better picture of the distribution of costs for engine tune-up parts. A sample of 50 customer invoices has been taken and the costs of parts, rounded to the nearest dollar, are listed below.
 91  78  93  57  75  52  99  80  97  62
 71  69  72  89  66  75  79  75  72  76
104  74  62  68  97 105  77  65  80 109
 85  97  88  68  83  68  71  69  67  74
 62  82  98 101  79 105  79  69  62  73
Summarizing Quantitative Data
Tabular Presentation
• Frequency Distribution
• Relative Frequency Distribution
• Percent Frequency Distribution
• Cumulative Frequency Distribution
• Cumulative Relative Frequency Distribution
• Cumulative Percent Frequency Distribution
Graphical Presentation
• Histogram
• Ogive
Frequency Distribution
• For quantitative data, an interval that includes all the values that fall within two numbers, the lower and upper limits, is called a class.
• The classes are non-overlapping.
• Frequencies (f) give the number of values that belong to each class.
Guidelines
• Use between 5 and 20 classes.
• Larger (smaller) data sets usually require a larger (smaller) number of classes.
• Use classes of equal width.
Approximate class width = (Largest Data Value - Smallest Data Value)/Number of Classes
Example: Hudson Auto Repair
• Frequency Distribution
If we choose six classes, approximate class width = (109 - 52)/6 = 9.5 ≈ 10
Cost ($)    Frequency
50-59       2
60-69       13
70-79       16
80-89       7
90-99       7
100-109     5
Total       50
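The class-width guideline and the resulting tally can be checked with a short script. The sketch below is one possible Python rendering, not the method used to build the slides: rounding the width up with math.ceil and starting the first class at 50 are assumptions chosen to match the table above.

```python
import math

# Parts costs (in dollars) from the 50 Hudson Auto Repair invoices.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6

# Approximate class width = (largest value - smallest value) / number of classes.
approx_width = (max(costs) - min(costs)) / num_classes  # (109 - 52) / 6 = 9.5
width = math.ceil(approx_width)                         # rounded up to 10

# Assumption: the first class starts at 50, a convenient value at or below the minimum.
first_lower = 50

# Non-overlapping classes with inclusive limits: (50, 59), (60, 69), ...
classes = [
    (first_lower + i * width, first_lower + (i + 1) * width - 1)
    for i in range(num_classes)
]

# Count how many costs fall within each class's lower and upper limits.
frequency = {
    (lower, upper): sum(lower <= c <= upper for c in costs)
    for lower, upper in classes
}

for (lower, upper), f in frequency.items():
    print(f"{lower}-{upper:<6}{f:>3}")
print(f"{'Total':<10}{sum(frequency.values()):>3}")
```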
Relative and Percent Frequency
• Relative Frequency of a Class = Frequency of that class/Sum of all frequencies
• Percent Frequency = (Relative Frequency)*100%
Example: Hudson Auto Repair
• Relative Frequency and Percent Frequency Distributions
Cost ($)    Relative Freq.    Percent Freq., %
50-59       2/50 = .04        4
60-69       13/50 = .26       26
70-79       16/50 = .32       32
80-89       7/50 = .14        14
90-99       7/50 = .14        14
100-109     5/50 = .10        10
Total       1.00              100
Cumulative Distribution
The cumulative frequency (or cumulative
relative frequency or cumulative percent
frequency) distribution shows the
number of items (or the proportion of
items or the percentage of items) with
values less than or equal to the upper
limit of each class.
Example: Hudson Auto Repair
• Cumulative Distributions
Cost, $    Cum. Freq.    Cum. Rel. Freq.    Cum. Perc. Freq.
≤ 59       2             .04                4
≤ 69       15            .30                30
≤ 79       31            .62                62
≤ 89       38            .76                76
≤ 99       45            .90                90
≤ 109      50            1.00               100
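The cumulative columns are running totals of the class frequencies. The sketch below, in Python with illustrative variable names, rebuilds them from the Hudson Auto Repair frequency distribution using itertools.accumulate from the standard library.

```python
from itertools import accumulate

# Upper class limits and class frequencies from the Hudson Auto Repair table.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]

# Running total: number of items with values <= each upper class limit.
cum_freq = list(accumulate(frequency))
total = cum_freq[-1]

# Same running totals expressed as proportions and as percentages.
cum_rel_freq = [cf / total for cf in cum_freq]
cum_pct_freq = [100 * cf / total for cf in cum_freq]

for limit, cf, crf, cpf in zip(upper_limits, cum_freq, cum_rel_freq, cum_pct_freq):
    print(f"<= {limit:<5}{cf:>4}{crf:>7.2f}{cpf:>6.0f}")
```

The last entry of each cumulative column always equals the total (50 items, a proportion of 1.00, or 100 percent).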
Histogram
• The variable of interest is placed on the horizontal axis and the frequency, relative frequency, or percent frequency is placed on the vertical axis.
• A rectangle is drawn above each class interval with its height corresponding to the interval’s frequency, relative frequency, or percent frequency.
• Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.
Example: Hudson Auto Repair
Histogram
[Figure: histogram of Frequency (vertical axis, 2 to 18) against Cost ($) (horizontal axis, 50 to 110)]
Ogive
• An ogive is a graph of a cumulative frequency (or cumulative relative frequency or cumulative percent frequency) distribution.
• The data values (class limits) are shown on the horizontal axis.
• An ogive is drawn by marking a dot above the upper limit of each class at a height equal to its cumulative frequency and joining the dots with straight lines. The ogive starts at the lower limit of the first class and ends at the upper limit of the last class.
Example: Hudson Auto Repair
• Ogive
[Figure: ogive of Cumulative Frequency (vertical axis, 10 to 50) against Cost ($) (horizontal axis, 50 to 110)]
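The points that an ogive joins can be listed directly from the cumulative frequency distribution. The following Python sketch assumes, in line with the drawing rule above, that the curve starts at height zero at the lower limit of the first class (50); the variable names are illustrative, and the points are only computed, not plotted.

```python
from itertools import accumulate

# Hudson Auto Repair classes: lower limit of the first class, then upper limits.
first_lower_limit = 50
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]

cum_freq = list(accumulate(frequency))

# Assumption: the ogive starts at zero height at the lower limit of the first
# class, then passes through one dot above each upper class limit.
ogive_points = [(first_lower_limit, 0)] + list(zip(upper_limits, cum_freq))

for cost, cf in ogive_points:
    print(f"cost {cost:>3} -> cumulative frequency {cf:>2}")
```

Joining consecutive points with straight line segments, by hand or with any plotting tool, gives the ogive for the example.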