lect w2 measures_of_location_and_spread
![Page 1: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/1.jpg)
Lect 2: Central tendency and measures of dispersion
A. MEASURES OF LOCATION
&
B. MEASURES OF SPREAD
Prepared by: Prof. Dr. Ruhul Amin
![Page 2: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/2.jpg)
Measures of
• Location (central tendency)
• Spread (dispersion)
![Page 3: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/3.jpg)
A. Measures of Location (Central tendency)
Common measures of location are
1. Mean
2. Median
3. Mode
![Page 4: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/4.jpg)
1. Mean
The mean is of 3 types:
a. Arithmetic Mean/Average
b. Harmonic Mean
c. Geometric Mean
![Page 5: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/5.jpg)
Arithmetic Mean
The most widely utilized measure of central tendency is the arithmetic mean or average.
The population mean is the sum of the values of the variable under study divided by the total number of observations in the population. It is denoted by µ ('mu'). Each value is denoted algebraically by an X with a subscript i. For example, a small theoretical population whose objects had the values 1, 6, 4, 5, 6, 3, 8, 7 would be denoted $X_1 = 1$, $X_2 = 6$, $X_3 = 4$, ..., $X_8 = 7$.   (1.1)
![Page 6: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/6.jpg)
Mean….
We denote the population size with a capital N. In our theoretical population N = 8. The population mean µ would be

$$\mu = \frac{1+6+4+5+6+3+8+7}{8} = 5$$

Formula 1.1: The algebraic shorthand formula for a population mean is

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
![Page 7: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/7.jpg)
Mean…..
• The Greek letter $\Sigma$ (sigma) indicates summation, the subscript i = 1 means to start with the first observation, and the superscript N means to continue until and including the Nth observation. For the example above, $\sum_{i=2}^{5} X_i$ would indicate the sum of $X_2 + X_3 + X_4 + X_5$, or 6 + 4 + 5 + 6 = 21. To reduce clutter, if the summation sign is not indexed, for example $\sum X_i$, it is implied that the addition begins with the first observation and continues through the last observation in the population, that is, $\sum_{i=1}^{N} X_i = \sum X_i$.
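The population mean and the indexed sum can be checked with a few lines of Python. This is only a sketch using population 1.1 from the slides; the variable names are illustrative.

```python
# Population 1.1 from the slides; variable names are illustrative.
population = [1, 6, 4, 5, 6, 3, 8, 7]

N = len(population)               # population size
mu = sum(population) / N          # population mean: sum of all X_i divided by N

# Sum of X_2 through X_5. Python lists are 0-indexed, so X_2 is population[1].
partial_sum = sum(population[1:5])

print(N, mu, partial_sum)         # 8 5.0 21
```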
![Page 8: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/8.jpg)
Mean…
The sample mean is defined by

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

where n is the sample size. The sample mean is usually reported to one more decimal place than the data and always has appropriate units associated with it.
The symbol $\bar{X}$ (X bar) indicates that the observations of a subset of size n from a population have been averaged.
![Page 9: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/9.jpg)
Mean….
$\bar{X}$ is fundamentally different from µ because samples from a population can have different values for their sample mean; that is, $\bar{X}$ can vary from sample to sample within the population. The population mean, however, is constant for a given population.
![Page 10: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/10.jpg)
Mean…..
Again consider the small theoretical population 1, 6, 4, 5, 6, 3, 8, 7. A sample of size 3 may consist of 5, 3, 4 with $\bar{X}$ = 4 or 6, 8, 4 with $\bar{X}$ = 6.
There are 56 possible samples of size 3 that could be drawn from population 1.1. Only seven of them (counting the two 6s as separate objects) have a sample mean equal to the population mean, i.e. $\bar{X}$ = µ.
![Page 11: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/11.jpg)
Mean…
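Four samples with $\bar{X}$ = µ (the two 6 + 4 + 5 rows use the two different observations equal to 6):
Sample (sum)   $\bar{X}$
4 + 3 + 8      5
6 + 4 + 5      5
6 + 4 + 5      5
7 + 3 + 5      5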
![Page 12: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/12.jpg)
Mean…
Each sample mean is an unbiased estimate of µ, but its actual value depends on the values included in the sample. We would expect the average of all possible $\bar{X}$'s to be equal to the population parameter, µ. This is, in fact, the definition of an unbiased estimator of the population mean.
![Page 13: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/13.jpg)
Mean…
If you calculate the sample mean for each of the 56 possible samples with n = 3 and then average these sample means, the result is 5, that is, the population mean µ. Remember that most real populations are too large or too difficult to census completely, so we must rely on a single sample to estimate or approximate the population characteristics.
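The claim that the sample means average out to µ can be verified by brute force. The sketch below enumerates every sample of size 3 from population 1.1; it uses exact fractions to avoid rounding, and the names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 6, 4, 5, 6, 3, 8, 7]   # population 1.1

# combinations() selects by position, so the two 6s count as distinct objects.
samples = list(combinations(population, 3))
sample_means = [Fraction(sum(s), 3) for s in samples]

# Average of all possible sample means.
mean_of_means = sum(sample_means) / len(sample_means)

print(len(samples), mean_of_means)      # 56 5
```

Individual sample means range well above and below 5, yet their average is exactly the population mean, which is what unbiasedness means here.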
![Page 14: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/14.jpg)
Harmonic mean
![Page 15: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/15.jpg)
Geometric mean
n = no. of obs.; $X_1, X_2, X_3, \ldots, X_n$ are the individual obs.
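The formulas on the harmonic and geometric mean slides did not survive extraction. Assuming the standard textbook definitions were intended, the harmonic mean is $H = n / \sum (1/X_i)$ and the geometric mean is $G = (X_1 X_2 \cdots X_n)^{1/n}$. The sketch below computes both with Python's `statistics` module (`geometric_mean` needs Python 3.8 or later); the three data values are made up for illustration and are not from the lecture.

```python
import statistics

values = [2, 4, 8]   # illustrative data, not from the lecture

harmonic = statistics.harmonic_mean(values)    # n / sum(1 / x)
geometric = statistics.geometric_mean(values)  # (x1 * x2 * ... * xn) ** (1 / n)
arithmetic = statistics.mean(values)           # sum(x) / n

print(harmonic, geometric, arithmetic)
```

For positive data the three means are always ordered harmonic ≤ geometric ≤ arithmetic, with equality only when all observations are identical.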
![Page 16: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/16.jpg)
2. Median
The second measure of central tendency is the MEDIAN. The median is the middlemost value of an ordered list of observations. Though the idea is simple enough, it will prove useful to define it in terms of an even simpler notion. The depth of a value is its position relative to the nearest extreme (end) when the data are listed in order from smallest to largest.
![Page 17: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/17.jpg)
Median: Example 2.1
The table below gives the circumferences at chest height (CCH) in cm and their corresponding depths for 15 sugar maples measured in a forest in Ohio.
CCH (cm)  18  21  22  29  29  36  37  38  56  59  66  70  88  93  120
Depth      1   2   3   4   5   6   7   8   7   6   5   4   3   2    1
No. of obs. = 15 (odd)
The population median M is the observation whose depth is $d = \frac{N+1}{2}$, where N is the population size.
![Page 18: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/18.jpg)
Median…
A sample median M is the statistic used to approximate or estimate the population median. M is defined as the observation whose depth is $d = \frac{n+1}{2}$, where n is the sample size. In Example 2.1 the sample size is n = 15, so the depth of the sample median is $d = \frac{n+1}{2} = 8$. The sample median is M = $X_8$ = 38 cm.
![Page 19: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/19.jpg)
Median: Example 2.2
The table below gives CCH (cm) for 12 cypress pines measured near Brown Lake on North Stradbroke Island.
CCH (cm)  17  19  31  39  48  56  68  73  73  75  80  122
Depth      1   2   3   4   5   6   6   5   4   3   2    1
No. of observations = 12 (even)
Since n = 12, the depth of the median is $\frac{12+1}{2} = 6.5$. No observation has depth 6.5, so the median is interpreted as the average of the two observations whose depth is 6 in the list above. So $M = \frac{56+68}{2} = 62$ cm.
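The depth rule for odd and even n can be written as a short function. This is a minimal sketch of the rule as described in Examples 2.1 and 2.2; the function name `median_by_depth` is hypothetical.

```python
def median_by_depth(values):
    """Median located via the depth d = (n + 1) / 2 of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    depth = (n + 1) / 2
    if depth.is_integer():
        # Odd n: a single observation sits at depth d (1-based position).
        return ordered[int(depth) - 1]
    # Even n: average the two observations on either side of depth d.
    lower = ordered[int(depth) - 1]
    upper = ordered[int(depth)]
    return (lower + upper) / 2


maples = [18, 21, 22, 29, 29, 36, 37, 38, 56, 59, 66, 70, 88, 93, 120]
pines = [17, 19, 31, 39, 48, 56, 68, 73, 73, 75, 80, 122]

print(median_by_depth(maples))  # 38
print(median_by_depth(pines))   # 62.0
```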
![Page 20: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/20.jpg)
3. Mode
The mode is defined as the most frequently occurring value in a data set. The mode in example 2.2 would be 73 cm while example 2.1 would have a mode of 29 cm.
More than one mode in a data set is possible. For the data 2, 3, 4, 1, 1, 2, 3, 4, 5, 1, 4 the modes are 1 and 4, because each appears 3 times in the data set.
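Modes, including ties, can be found with `statistics.multimode` (Python 3.8 or later). The sketch below applies it to the three data sets mentioned on this slide; the results are sorted only to make the output order predictable.

```python
from statistics import multimode

data = [2, 3, 4, 1, 1, 2, 3, 4, 5, 1, 4]
maples = [18, 21, 22, 29, 29, 36, 37, 38, 56, 59, 66, 70, 88, 93, 120]  # Example 2.1
pines = [17, 19, 31, 39, 48, 56, 68, 73, 73, 75, 80, 122]               # Example 2.2

# multimode() returns every value that shares the highest frequency.
print(sorted(multimode(data)))    # [1, 4]
print(sorted(multimode(maples)))  # [29]
print(sorted(multimode(pines)))   # [73]
```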
![Page 21: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/21.jpg)
Mean, median and mode coincide
![Page 22: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/22.jpg)
Exercise
Hen egg sizes (ES, g) at 12 weeks of lay were randomly measured in a layer flock as follows. Determine the mean, median and mode of egg size.
Hen No.  01  02  03  04  05  06  07  08  09  10  11  12
ES (g)   44  41  47  50  49  44  46  41  39  38  45  40
![Page 23: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/23.jpg)
B. Measures of Spread (dispersion)
These measure the variability of data. Four measures are in common use:
1. Range
2. Variance
3. Standard Deviation (SD)
4. Standard Error (SE)
![Page 24: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/24.jpg)
Range
Range: The simplest measure of dispersion or spread of data is the RANGE.
Formula: The difference between the largest and smallest observations (the two extremes) in a group of data is called the RANGE.
With the observations ordered from smallest to largest: Sample range = $X_n - X_1$; Population range = $X_N - X_1$
The values $X_1$ and $X_n$ are called the 'sample range limits'.
![Page 25: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/25.jpg)
Range: Example
Marks in Biometry of 10 students are as follows (full marks 100)
Student ID   Marks obtained   Marks ordered (descending)
01           35               80
02           40               75
03           30               70
04           25               60
05           75               40
06           80               40
07           39               39
08           40               35
09           60               30
10           70               25
Here, Range = largest mark - smallest mark = 80 - 25 = 55
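In code the range needs no sorting at all, since only the two extremes matter. A minimal sketch using the marks from this example:

```python
marks = [35, 40, 30, 25, 75, 80, 39, 40, 60, 70]   # marks obtained, in student-ID order

# Range = largest observation minus smallest observation.
marks_range = max(marks) - min(marks)

print(marks_range)   # 55
```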
![Page 26: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/26.jpg)
Range…
The range is a crude estimator of dispersion because it uses only two of the data points and is somewhat dependent on sample size. As sample size increases, we expect the largest and smallest observations to become more extreme. Therefore, the sample range tends to increase with sample size even though the population range remains unchanged. It is unlikely that a sample will include both the largest and smallest values from the population, so the sample range usually underestimates the population range and is, therefore, a biased estimator.
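The bias of the sample range can be illustrated with the small population 1.1 used earlier for the mean. The sketch below averages the range over every possible sample of a given size; the helper name `mean_sample_range` is hypothetical, and the choice of sample sizes 3 and 5 is only for illustration.

```python
from itertools import combinations

population = [1, 6, 4, 5, 6, 3, 8, 7]   # population 1.1
population_range = max(population) - min(population)


def mean_sample_range(size):
    """Average range over every possible sample of the given size."""
    ranges = [max(s) - min(s) for s in combinations(population, size)]
    return sum(ranges) / len(ranges)


r3 = mean_sample_range(3)
r5 = mean_sample_range(5)

print(population_range, r3, r5)
```

The average sample range grows with sample size but stays below the population range of 7, which is the underestimation described above.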
![Page 27: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/27.jpg)
Variance
Suppose we express each observation as a distance from the mean, $x_i = X_i - \bar{X}$. These differences are called deviates and will be sometimes positive ($X_i$ is above the mean) and sometimes negative ($X_i$ is below the mean). If we try to average the deviates, we find that they always sum to zero. Because the mean is the central tendency or location, the negative deviates exactly cancel out the positive deviates.
![Page 28: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/28.jpg)
Variance… Example
X     Mean   Deviates
2     4      -2
3     4      -1
1     4      -3
8     4       4
6     4       2
Sum           0

$$\sum (X_i - \bar{X}) = 0$$
![Page 29: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/29.jpg)
Variance…
• Algebraically one can demonstrate the same result more generally:

$$\sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \bar{X}$$

Since $\bar{X}$ is a constant for any sample,

$$\sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - n\bar{X}$$
![Page 30: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/30.jpg)
Variance…
Since $\bar{X} = \frac{\sum X_i}{n}$, then $n\bar{X} = \sum X_i$, so

$$\sum_{i=1}^{n} (X_i - \bar{X}) = \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} X_i = 0$$
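The zero-sum property is easy to confirm numerically. The sketch below uses the five values of the deviates example (2, 3, 1, 8, 6, as read from the slide) and population 1.1; exact fractions are used so that floating-point rounding cannot hide or fake the result. The function name `deviates` is illustrative.

```python
from fractions import Fraction


def deviates(values):
    """Return each observation minus the mean of the data set."""
    mean = Fraction(sum(values), len(values))
    return [x - mean for x in values]


example = [2, 3, 1, 8, 6]                # values read from the deviates example
population = [1, 6, 4, 5, 6, 3, 8, 7]    # population 1.1

print(sum(deviates(example)), sum(deviates(population)))   # 0 0
```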
![Page 31: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/31.jpg)
Variance…
• To circumvent this unfortunate property, the widely used measure of dispersion called the sample variance utilizes the squares of the deviates. The quantity $\sum_{i=1}^{n} (X_i - \bar{X})^2$ is the sum of these squared deviates and is referred to as the corrected sum of squares (CSS). Each observation is corrected or adjusted for its distance from the mean.
![Page 32: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/32.jpg)
Variance…
• Formula: The CSS is utilized in the formula for the sample variance

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$

The sample variance is usually reported to two more decimal places than the data and has units that are the square of the measurement units.
![Page 33: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/33.jpg)
Variance…
Or, in computational form,

$$s^2 = \frac{\sum X_i^2 - (\sum X_i)^2 / n}{n - 1}$$

With a similar derivation, the population variance computational formula can be shown to be

$$\sigma^2 = \frac{\sum X_i^2 - (\sum X_i)^2 / N}{N}$$
![Page 34: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/34.jpg)
Variance… Example (unit kg)
• Data set: 3.1, 17.0, 9.9, 5.1, 18.0, 3.8, 10.0, 2.9, 21.2

n = 9, $\sum X_i = 91$, $\sum X_i^2 = 1318.92$

$$s^2 = \frac{1318.92 - (91)^2 / 9}{9 - 1} = \frac{1318.92 - 920.11}{8} = \frac{398.81}{8} = 49.851 \text{ kg}^2$$
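The worked example can be reproduced in Python with both forms of the sample variance formula, plus the standard library as a cross-check. This is a sketch with illustrative variable names; `statistics.variance` uses the n - 1 denominator, so it corresponds to $s^2$ rather than $\sigma^2$.

```python
import statistics

data = [3.1, 17.0, 9.9, 5.1, 18.0, 3.8, 10.0, 2.9, 21.2]   # kg
n = len(data)
mean = sum(data) / n

# Definitional form: corrected sum of squares (CSS) divided by n - 1.
css = sum((x - mean) ** 2 for x in data)
s2_definition = css / (n - 1)

# Computational form: (sum of squares - (sum)^2 / n) / (n - 1).
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
s2_computational = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Library cross-check.
s2_library = statistics.variance(data)

print(round(s2_definition, 3), round(s2_computational, 3), round(s2_library, 3))
```

All three values agree with the 49.851 kg² obtained by hand. The computational form avoids computing the mean first, but on data with a large mean and a small spread it can lose precision in floating point, so the definitional form is usually the safer choice in code.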
![Page 35: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/35.jpg)
Variance…
Remember, the numerator can never be negative because it is a sum of squared deviations.
The population variance formula is rarely used since most populations are too large to census directly.
![Page 36: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/36.jpg)
Standard deviation (SD)
• Standard deviation is the positive square root of the variance:

$$\sigma = \sqrt{\frac{\sum X_i^2 - (\sum X_i)^2 / N}{N}}$$

and

$$s = \sqrt{\frac{\sum X_i^2 - (\sum X_i)^2 / n}{n - 1}}$$
![Page 37: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/37.jpg)
Standard Error (SE)
$$SE = \frac{SD}{\sqrt{n}}$$

n = no. of observations
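As a sketch, the SD and SE of the nine-value data set from the variance example can be computed as follows. It assumes SD here means the sample standard deviation s (n - 1 denominator), which is what `statistics.stdev` returns.

```python
import math
import statistics

data = [3.1, 17.0, 9.9, 5.1, 18.0, 3.8, 10.0, 2.9, 21.2]   # kg, from the variance example

sd = statistics.stdev(data)          # sample SD: square root of the sample variance
se = sd / math.sqrt(len(data))       # standard error of the mean: SD / sqrt(n)

print(round(sd, 3), round(se, 3))
```

Unlike the variance, both SD and SE are in the original measurement units (kg here), which makes them easier to report alongside the mean.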
![Page 38: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/38.jpg)
Exercise 2
Daily milk yields (L) of 12 Jersey cows are tabulated below. Calculate the mean, median, mode, variance and standard error.
Cow no Milk yield Cow no Milk yield
1 23.7 7 21.5
2 12.8 8 25.2
3 28.9 9 21.4
4 21.4 10 25.2
5 14.5 11 19.5
6 28.3 12 19.6
![Page 39: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/39.jpg)
Problem 1
• Two herds of cows at separate locations in Malaysia gave the following amounts of milk per day (L). Compute the arithmetic mean, median, mode, range, variance, SD and SE of daily milk yield for the cows of the two herds. Comment on what the two sets of milk records reveal about the differences between the herds.
![Page 40: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/40.jpg)
Table: Daily milk yield (L)
Cow no.   Herd 1   Herd 2
1         18.25    7.50
2         12.60    6.95
3         15.25    4.20
4         16.10    5.10
5         18.25    4.50
6         15.25    6.15
7         12.80    6.90
8         15.65    7.50
9         14.20    7.80
10        10.20    10.20
11        10.90    6.30
12        12.60    7.50
13        -        5.75
14        -        4.75
![Page 41: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/41.jpg)
Problem 2
• Sex-adjusted weaning weights of lambs in two different breeds of sheep were recorded as follows. Compute the mean, median, range, variance and SE of weaning weight for the lambs in the two breed groups. Comment on the differences between the two groups.
![Page 42: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/42.jpg)
Weaning wt. (kg) of lambs
Breed 1: 7.5, 6.9, 8.1, 5.8, 5.9, 5.8, 6.2, 7.5, 9.1, 8.7, 8.1, 8.5
Breed 2: 5.6, 4.7, 9.8, 4.5, 6.1, 3.6, 5.7, 4.9, 5.1, 5.1, 5.9, 4.0, 9.8, 10.2
![Page 43: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/43.jpg)
Problem 3
In a retail market study, data on the price (RM) of 10 kg of rice were collected from 2 different markets in Malaysia. Using descriptive statistics, show the differences in the price of rice between the two markets.
Pasar 1: 20, 25, 22, 23, 22, 24, 23, 21, 25, 25, 23, 22, 25, 24, 24
Pasar 2: 25, 24, 26, 23, 26, 25, 25, 26, 24, 26, 24, 23, 22, 25, 26, 26, 24
![Page 44: Lect w2 measures_of_location_and_spread](https://reader033.vdocuments.us/reader033/viewer/2022052623/559cd4d01a28aba8538b456d/html5/thumbnails/44.jpg)
THANK YOU