94 Social Work Intervention with Communities and Institutions
21
Introduction to Statistical Techniques in Social Work
* D.K. Lal Das
Introduction
Knowledge of statistics helps the social worker in two ways. First, it allows the social worker to analyse data in line with his/her research objectives and to draw inferences, which in turn enable him/her not only to expand and improve the knowledge base of the profession but also to make his/her practice more effective. Second, as a consumer of research, it enables him/her to understand the analysis of data and appreciate the statistical procedures used in research reports.
In the previous Chapter we discussed the preparation of the code book, master chart and tables. In this Chapter, we will learn some of the statistical techniques which social workers need to apply in the course of data analysis.
Measures of Central Tendencies
It is often essential to represent a set of data by means of a single number which, in its way, is descriptive of the entire set. Obviously, the figure used to represent a whole series should be neither the lowest value in the series nor the highest, but a value somewhere between these two limits, possibly in the centre. Such values are called measures of central tendency.
Dr. D.K. Lal Das, R.M. College of Social Work, Hyderabad
Processing and Analysis of Data 95
The data collected for the purpose of a statistical inquiry are simple figures without any form or structure. Data obtained in this way are in a raw state, for they have not gone through any statistical treatment. This shapeless mass of data is known as ungrouped data or raw data. Consider the data presented in the Table below.
Table : Marks in Social Work Research Obtained by 40 Students of a College
Roll No.  Marks  Roll No.  Marks  Roll No.  Marks  Roll No.  Marks
1 40 11 60 21 43 31 58
2 37 12 38 22 41 32 24
3 61 13 39 23 25 33 71
4 67 14 40 24 42 34 55
5 59 15 51 25 38 35 33
6 70 16 37 26 40 36 65
7 39 17 40 27 50 37 55
8 46 18 72 28 30 38 66
9 68 19 39 29 33 39 40
10 41 20 50 30 54 40 62
Ungrouped data presented in this manner cannot be rapidly or easily interpreted. Only a vague impression may be obtained from their perusal.
To make the data more readily comprehensible, grouping is used to reduce their bulk. A first step in such grouping is to represent the repetitions of a particular mark by tallies. The number of tallies corresponding to any given mark is the frequency of that mark (usually denoted by the letter ‘f’).
Table : Tally Sheet Showing the Marks Obtained by 40 Students
Marks Tallies Marks Tallies Marks Tallies Marks Tallies
24 I 37 II 50 II 63 -
25 I 38 II 51 I 64 -
26 - 39 III 52 - 65 I
27 - 40 53 - 66 I
28 - 41 II 54 I 67 I
29 - 42 I 55 II 68 I
30 I 43 I 56 - 69 -
31 - 44 - 57 - 70 I
32 - 45 - 58 I 71 I
33 II 46 I 59 I 72 I
34 - 47 - 60 I
35 - 48 - 61 I
36 - 49 - 62 I
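The tallying step can be sketched in a few lines of Python. This is only an illustrative sketch: the list `marks` is copied from the marks table above in roll-number order, and the names `marks` and `frequency` are chosen here for illustration.

```python
from collections import Counter

# Marks of the 40 students, in roll-number order, copied from the table above.
marks = [
    40, 37, 61, 67, 59, 70, 39, 46, 68, 41,
    60, 38, 39, 40, 51, 37, 40, 72, 39, 50,
    43, 41, 25, 42, 38, 40, 50, 30, 33, 54,
    58, 24, 71, 55, 33, 65, 55, 66, 40, 62,
]

# Counter plays the role of the tally sheet: mark -> frequency (f).
frequency = Counter(marks)

for mark in sorted(frequency):
    # One "I" per occurrence, mimicking the tally strokes.
    print(f"{mark:>3}  {'I' * frequency[mark]:<6} f = {frequency[mark]}")
```

Marks that nobody obtained (shown as "-" in the tally sheet) simply do not appear in the printed list.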
Data may also be grouped into classes, for example by counting the number of students whose heights fall within a given range, such as between 155 and 165 centimeters. Data of this type are called continuous data (see the Table below).
Table : Frequency Distribution: Continuous Series
S.No. Height in Centimeters No. of Students
1 150-155 12
2 155-160 15
3 160-165 18
4 165-170 17
5 170-175 15
6 175-180 14
Total 91
Each group of five consecutive values of height, namely 150-155, 155-160, 160-165, etc., is called a class. Since each class includes five values, five is the magnitude or width of the class, commonly known as the class interval. The first figure of each class is called its lower limit, and the last figure of each class is known as its upper limit. Every class interval has a mid-point, which lies midway between the upper and lower limits. Figure 6.1 illustrates the lower limits, upper limits and midpoints.
Figure : Lower Limits, Midpoints and Upper Limits (for the class 150-155: lower limit 150, mid-point 152.5, upper limit 155, with 2.5 units on either side of the mid-point)
There are three important measures of central tendency used in social work research: the mean, the median and the mode.
The Mean
The mean is the most common of all the averages. It is relatively easy to calculate, simple to understand and widely used in social work research. The mean is obtained by adding the values of all the items and dividing the total by the number of items. An example will help us learn how to calculate the arithmetic mean. Suppose that eight students receive 54, 58, 60, 62, 70, 72, 75 and 77 marks respectively in an examination. The mean of the marks will be:
\[ \text{Mean} = \frac{54+58+60+62+70+72+75+77}{8} = \frac{528}{8} = 66 \]
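As a quick check, the same arithmetic can be sketched in Python. The variable names are illustrative only; the eight marks are those of the example above.

```python
# Marks of the eight students from the example above.
marks = [54, 58, 60, 62, 70, 72, 75, 77]

total = sum(marks)              # sum of the values of all items
count = len(marks)              # number of items
arithmetic_mean = total / count

print(f"total = {total}, items = {count}, mean = {arithmetic_mean}")
```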
In calculating the arithmetic mean of a continuous series, we take the mid-value of each class as representative of that class (it is presumed that the frequencies of the class are concentrated at its mid-point), multiply the mid-values by their corresponding frequencies, and divide the sum of the products by the sum of the frequencies.
Illustration
Table : Distribution of Rag Pickers by their Daily Income
S.No.   Daily Income (in Rs.)   Number of Rag-Pickers
1       110-130                 15
2       130-150                 30
3       150-170                 60
4       170-190                 95
5       190-210                 82
6       210-230                 75
7       230-250                 23
Total                           380
Solution:
Daily Income (in Rs.)   Mid-values (m)   Number of Rag-Pickers (f)   m x f
110-130 120* 15 1800
130-150 140 30 4200
150-170 160 60 9600
170-190 180 95 17100
190-210 200 82 16400
210-230 220 75 16500
230-250 240 23 5520
∑f = N = 380   ∑mf = 71120
Calculation:
* The mid-value of the first class is obtained as follows:
\[ \text{Mid-value} = \frac{\text{Lower limit} + \text{Upper limit}}{2} = \frac{110 + 130}{2} = \frac{240}{2} = 120 \]
\[ \bar{X} = \frac{\sum mf}{\sum f} = \frac{\sum mf}{N} = \frac{71120}{380} = 187.16 \]
Mean = Rs.187.16 (approximately)
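The mid-value method can be sketched in Python as follows. This is a minimal sketch using the class limits and frequencies of the rag-pickers table. The variable names are illustrative and not part of the original text.

```python
# Class limits and frequencies from the rag-pickers table above.
classes = [(110, 130), (130, 150), (150, 170), (170, 190),
           (190, 210), (210, 230), (230, 250)]
frequencies = [15, 30, 60, 95, 82, 75, 23]

# Mid-value of each class: (lower limit + upper limit) / 2.
mid_values = [(lower + upper) / 2 for lower, upper in classes]

# Sum of the products m x f, and the total frequency N.
sum_mf = sum(m * f for m, f in zip(mid_values, frequencies))
n = sum(frequencies)

grouped_mean = sum_mf / n
print(f"sum(mf) = {sum_mf:.0f}, N = {n}, mean = {grouped_mean:.2f}")
```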
Solution (step-deviation method, taking 180 as the assumed mean):
Table
Daily Income (in Rs.)   Mid-values (m)   Number of Rag-Pickers (f)   Deviation from assumed mean 180 (dx)   Step deviation (d)   Total deviation (fd)
110-130 120 15 –60 –3 –45
130-150 140 30 –40 –2 –60
150-170 160 60 –20 –1 –60
170-190 180 95 0 0 0
190-210 200 82 +20 +1 +82
210-230 220 75 +40 +2 +150
230-250 240 23 +60 +3 +69
N = 380   ∑fd = 136
\[ \text{Mean } (\bar{X}) = a + \frac{\sum fd}{N} \times i \]
Where ‘a’ stands for the assumed mean, ∑fd for the sum of total deviations, N for the total number of frequencies and ‘i’ for the class interval. Now substituting the values from the table in the formula we get:
\[ \bar{X} = 180 + \frac{136}{380} \times 20 = 180 + 7.16 = 187.16 \]
Mean = Rs.187.16 (approximately)
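The step-deviation calculation can be sketched in the same way. The sketch below assumes the mid-values and frequencies of the table above, with 180 as the assumed mean and 20 as the class interval. The variable names are illustrative.

```python
# Mid-values and frequencies from the step-deviation table above.
mid_values = [120, 140, 160, 180, 200, 220, 240]
frequencies = [15, 30, 60, 95, 82, 75, 23]

assumed_mean = 180    # a
class_interval = 20   # i

# Step deviation of each class: d = (m - a) / i.
step_deviations = [(m - assumed_mean) / class_interval for m in mid_values]

# Sum of the total deviations (fd) and total frequency N.
sum_fd = sum(f * d for f, d in zip(frequencies, step_deviations))
n = sum(frequencies)

# Mean = a + (sum(fd) / N) * i
mean_by_steps = assumed_mean + (sum_fd / n) * class_interval
print(f"sum(fd) = {sum_fd:.0f}, mean = {mean_by_steps:.2f}")
```

Both methods give the same mean, because the step deviations are only a rescaling of the mid-values around the assumed mean.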
The Median
The median is another simple measure of central tendency. We sometimes want to locate the position of the middle item when data have been arranged. This measure is also known as a positional average. We define the median as the size of the middle item when the items are arrayed in ascending or descending order of magnitude. This means that the median divides the series in such a manner that there are as many items above or larger than the middle one as there are below or smaller than it.
In a continuous series we do not know every observation. Instead, we have a record of the frequencies with which the observations appear in each of the class intervals, as in the following Table. Nevertheless, we can compute the median by determining which class interval contains the median.
Table : Daily Income of Rag-Pickers
Daily Income in Rs.   Number of Rag-pickers (f)   Cumulative frequencies (CF)
110 – 130 15 15
130 – 150 30 45
150 – 170 60 105
170 – 190 95 200
190 – 210 82 282
210 – 230 75 357
230 – 250 23 380
N = 380
In the case of the data given in the Table above, the median is that value on either side of which N/2, or 380/2, or 190 items lie. Now the problem is to find the class interval containing the 190th item. The cumulative frequency for the first three classes is only 105. But when we move to the fourth class interval, 95 items are added to 105 for a total of 200. Therefore, the 190th item must be located in this fourth class interval (the interval from Rs. 170 to Rs. 190).
The median class (Rs.170 – Rs.190) for the series contains 95 items. For the purpose of determining the point which has 190 items on each side, we assume that these 95 items are evenly spaced over the entire class interval 170–190. Therefore, we can interpolate and find the value of the 190th item. First, we determine that the 190th item is the 85th item in the median class: 190 – 105 = 85. Then we can calculate the width of the 95 equal steps from Rs.170 to Rs.190 as follows:
\[ \frac{190 - 170}{95} = 0.21053 \text{ (approximately)} \]
The value of the 85th item is 0.2105 x 85 = 17.89. If this (17.89) is added to the lower limit of the median class, we get 170 + 17.89 = 187.89. This is the median of the series.
This can be put in the form of a formula:
\[ X = L + \frac{N/2 - C}{f} \times i \]
Where
X = median,
L = lower limit of the class in which the median lies,
N = total number of items,
C = cumulative frequency of the class prior to the median class,
f = frequency of the median class,
i = class interval of the median class.
\[
\begin{aligned}
X &= 170 + \frac{380/2 - 105}{95} \times (190 - 170) \\
  &= 170 + \frac{190 - 105}{95} \times (190 - 170) \\
  &= 170 + \frac{85}{95} \times (190 - 170) \\
  &= 170 + (0.8947 \times 20) \\
  &= 187.89 \text{ (approximately)}
\end{aligned}
\]
Median Income = Rs.187.89 (approximately)
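The interpolation can be sketched in Python as below. This is a minimal sketch under the same assumption as the text, namely that items are evenly spaced within the median class. The variable names are illustrative.

```python
from itertools import accumulate

# Lower limits and frequencies from the rag-pickers table above.
lower_limits = [110, 130, 150, 170, 190, 210, 230]
frequencies = [15, 30, 60, 95, 82, 75, 23]
class_interval = 20  # i

n = sum(frequencies)
cumulative = list(accumulate(frequencies))  # cumulative frequencies (CF)

half = n / 2  # position of the median item, N/2

# Median class: the first class whose cumulative frequency reaches N/2.
median_index = next(k for k, cf in enumerate(cumulative) if cf >= half)

L = lower_limits[median_index]                              # lower limit
C = cumulative[median_index - 1] if median_index > 0 else 0  # CF of prior class
f = frequencies[median_index]                               # frequency of median class

median = L + (half - C) / f * class_interval
print(f"median class starts at {L}, median = {median:.2f}")
```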
The Mode
Another measure which is sometimes used to describe the central tendency of a set of data is the mode. It is defined as the value that is repeated most often in the data set. In the following series of values: 71, 73, 74, 75, 75, 75, 78, 78, 80 and 82, the mode is 75, because 75 occurs more often than any other value (three times). In grouped data the mode is located in the class where the frequency is greatest. The mode is more useful when there are a larger number of cases and when data have been grouped.
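For ungrouped values such as the series above, the mode can be found by simple counting. The following is an illustrative sketch; the variable names are chosen here and are not part of the original text.

```python
from collections import Counter

# The series of values from the example above.
values = [71, 73, 74, 75, 75, 75, 78, 78, 80, 82]

# Count how often each value occurs and pick the most frequent one.
counts = Counter(values)
mode_value, mode_count = counts.most_common(1)[0]

print(f"mode = {mode_value} (occurs {mode_count} times)")
```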
Calculation of Mode
The first step in the calculation of the mode is to find the point of maximum concentration with the help of the grouping method. The procedure of grouping is as follows:
i) First the frequencies are added in two’s in two ways: (a) by adding frequencies of item numbers 1 and 2; 3 and 4; 5 and 6; and so on, and (b) by adding frequencies of item numbers 2 and 3; 4 and 5; 6 and 7; and so on.
ii) Then the frequencies are added in three’s. This can be done in three ways: (a) by adding frequencies of item numbers 1, 2 and 3; 4, 5 and 6; 7, 8 and 9; and so on, (b) by adding frequencies of item numbers 2, 3 and 4; 5, 6 and 7; 8, 9 and 10; and so on, and (c) by adding frequencies of item numbers 3, 4 and 5; 6, 7 and 8; 9, 10 and 11; and so on.
If necessary, grouping of frequencies can be done in four’s and five’s also. After grouping, the size of items containing the maximum frequencies is circled. The item value which contains the maximum frequency the largest number of times is the mode of the series. This is shown in the Tables given below.
After the process of grouping locates the class of maximum concentration, the value of the mode is interpolated by the use of the following formula.
\[ \text{Mode } (X) = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i \]
Where X stands for the mode, L is the lower limit of the modal class, f0 stands for the frequency of the preceding class, f1 stands for the frequency of the modal class, f2 for the frequency of the succeeding class and i stands for the class interval of the modal class.
Illustration:
Table : Weekly Family Income (in Rs.)
Weekly Income Number of families
100 – 200 5
200 – 300 6 = f0
300 – 400 15 = f1 (modal class; lower limit L = 300)
400 – 500 10 = f2
500 – 600 5
600 – 700 4
700 – 800 3
800 – 900 2
Total N = 50
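Table : Location of Modal Class by Grouping
Weekly Income   f (1)   (2)   (3)   (4)   (5)   (6)
100 – 200         5      11    -     26    -     -
200 – 300         6      -     21    -     31    -
300 – 400        15      25    -     -     -     30
400 – 500        10      -     15    19    -     -
500 – 600         5       9    -     -     12    -
600 – 700         4      -      7    -     -      9
700 – 800         3       5    -     -     -     -
800 – 900         2      -     -     -     -     -
Note: each entry in columns (2) to (6) is the combined frequency of the group of two or three consecutive classes that begins at that row. Columns (2) and (3) group the frequencies in two’s, columns (4) to (6) in three’s.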
Table : Analysis Table
Column   Class Containing Maximum Frequency
         100-200  200-300  300-400  400-500  500-600  600-700  700-800  800-900
1        -        -        1        -        -        -        -        -
2        -        -        1        1        -        -        -        -
3        -        1        1        -        -        -        -        -
4        1        1        1        -        -        -        -        -
5        -        1        1        1        -        -        -        -
6        -        -        1        1        1        -        -        -
No. of times a class occurs
         1        3        6        3        1        -        -        -
Therefore the 300-400 group is the modal group. Using the formula of interpolation, viz.,
\[ X = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i \]
\[
\begin{aligned}
X &= 300 + \frac{15 - 6}{2 \times 15 - 6 - 10} \times 100 \\
  &= 300 + \frac{9}{14} \times 100 \\
  &= 300 + 64.29 \\
  &= 364.29 \text{ (approximately)}
\end{aligned}
\]
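The grouping method and the interpolation can be sketched together in Python. This is only a sketch of the procedure described above: it groups the frequencies singly, in two’s and in three’s, counts how often each class falls in a group with the maximum total, and then applies the interpolation formula. The names used (`votes`, `modal_index`, and so on) are illustrative, and the treatment of a modal class at either end of the table (taking the missing neighbour's frequency as 0) is an assumption not discussed in the text.

```python
# Lower limits and frequencies from the weekly family income table above.
lower_limits = [100, 200, 300, 400, 500, 600, 700, 800]
frequencies = [5, 6, 15, 10, 5, 4, 3, 2]
class_interval = 100  # i

# votes[k] = number of columns in which class k belongs to a maximum group.
votes = [0] * len(frequencies)

for size in (1, 2, 3):           # single classes, two's, three's
    for start in range(size):    # the different starting points for this size
        # Complete groups of `size` consecutive classes beginning at `start`.
        groups = [range(k, k + size)
                  for k in range(start, len(frequencies) - size + 1, size)]
        sums = [sum(frequencies[j] for j in group) for group in groups]
        best = max(sums)
        for group, total in zip(groups, sums):
            if total == best:
                for j in group:
                    votes[j] += 1

# The modal class is the one that occurs most often in the analysis table.
modal_index = votes.index(max(votes))

L = lower_limits[modal_index]
f1 = frequencies[modal_index]
# Assumption: a missing neighbour at either end is treated as frequency 0.
f0 = frequencies[modal_index - 1] if modal_index > 0 else 0
f2 = frequencies[modal_index + 1] if modal_index < len(frequencies) - 1 else 0

mode = L + (f1 - f0) / (2 * f1 - f0 - f2) * class_interval
print(f"votes = {votes}")
print(f"modal class starts at {L}, mode = {mode:.2f}")
```

The `votes` list corresponds to the last row of the analysis table.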
Measures of Dispersion
In social work research, we often wish to know the extent of homogeneity and heterogeneity among respondents with respect to a given characteristic. Any set of social data is characterized by the heterogeneity of its values. In fact, the extent to which the values are heterogeneous, or vary among themselves, is of basic importance in statistics. Measures of central tendency describe one important characteristic of a set of data, its typical value, but they do not tell us anything about this other basic characteristic. Consequently, we need ways of measuring heterogeneity, that is, the extent to which data are dispersed. The measures which provide this description are called measures of dispersion or variability.
Range
The range is defined as the difference between the highest and lowest values. Mathematically,
\[ R \text{ (Range)} = m_h - m_l \]
Where m_h and m_l stand for the highest and the lowest value. Thus, for the data set 10, 22, 20, 14 and 14, the range would be the difference between 22 and 10, i.e., 12. In the case of grouped data, we take the range as the difference between the midpoints of the extreme classes. Thus, if the midpoint of the lowest interval is 150 and that of the highest is 850, the range will be 700.
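Both cases can be sketched in Python. The class limits 100-200 and 800-900 used below are an assumption chosen only because they give the midpoints 150 and 850 mentioned in the text. The variable names are illustrative.

```python
# Ungrouped data from the example above.
values = [10, 22, 20, 14, 14]
raw_range = max(values) - min(values)  # highest value minus lowest value

# Grouped data: difference between the midpoints of the extreme classes.
# Assumed extreme classes, chosen to give the midpoints 150 and 850.
lowest_class = (100, 200)
highest_class = (800, 900)

lowest_mid = sum(lowest_class) / 2
highest_mid = sum(highest_class) / 2
grouped_range = highest_mid - lowest_mid

print(f"range of raw data = {raw_range}")
print(f"range of grouped data = {grouped_range:.0f}")
```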
Semi-Inter-Quartile Range or Quartile Deviation
Another measure of dispersion is the semi-inter-quartile range, commonly known as quartile deviation. Quartiles are the points which divide the array or series of values into four equal parts, each of which contains 25 per cent of the items in the distribution. The first, second and third quartiles are thus the highest values in the first three of these parts. The inter-quartile range is the difference between the values of the first and the third quartiles.
Thus, where Q1 and Q3 stand for the first and the third quartiles, the semi-interquartile range or quartile deviation is given by
\[ \text{Q.D.} = \frac{Q_3 - Q_1}{2} \]
Calculation of Quartile Deviation (QD)
Table : Weekly Family Income (in Rs.)
Weekly Income (in Rs.) Number of Families
100 – 200 5
200 – 300 6
300 – 400 15
400 – 500 10
500 – 600 5
600 – 700 4
700 – 800 3
800 – 900 2
Total N = 50
Table
S.No. (1)   Weekly Income (in Rs.) (2)   Number of Families (3)   Cumulative Frequency (CF) (4)
1           100 – 200                    5                        5
2           200 – 300                    6                        11 = C (for Q1)
3           300 – 400 (Q1 class)         15 = f                   26
4           400 – 500                    10                       36 = C (for Q3)
5           500 – 600 (Q3 class)         5 = f                    41
6           600 – 700                    4                        45
7           700 – 800                    3                        48
8           800 – 900                    2                        50
Total                                    N = 50
\[
\begin{aligned}
Q_1 &= L_1 + \frac{N/4 - C}{f} \times i \\
    &= 300 + \frac{12.5 - 11}{15} \times 100 \\
    &= 300 + \frac{1.5}{15} \times 100 \\
    &= 300 + (0.1 \times 100) \\
    &= 300 + 10 = 310
\end{aligned}
\]
\[
\begin{aligned}
Q_3 &= L_1 + \frac{3N/4 - C}{f} \times i \\
    &= 500 + \frac{37.5 - 36}{5} \times 100 \\
    &= 500 + \frac{1.5}{5} \times 100 \\
    &= 500 + (0.3 \times 100) \\
    &= 500 + 30 = 530
\end{aligned}
\]
Inter-quartile range = Q3 – Q1 = 530 – 310 = 220
\[ \text{QD} = \frac{Q_3 - Q_1}{2} = \frac{220}{2} = 110 \]
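The quartile calculation follows the same interpolation idea as the median and can be sketched in Python. The helper `grouped_quantile` is a hypothetical name introduced here for illustration; it is not part of the original text.

```python
from itertools import accumulate

# Lower limits and frequencies from the weekly family income table above.
lower_limits = [100, 200, 300, 400, 500, 600, 700, 800]
frequencies = [5, 6, 15, 10, 5, 4, 3, 2]
class_interval = 100  # i


def grouped_quantile(position):
    """Interpolate the value at a cumulative position such as N/4 or 3N/4."""
    cumulative = list(accumulate(frequencies))
    # Class containing the position: first class whose CF reaches it.
    index = next(k for k, cf in enumerate(cumulative) if cf >= position)
    c = cumulative[index - 1] if index > 0 else 0  # CF of the preceding class
    f = frequencies[index]                         # frequency of the class
    return lower_limits[index] + (position - c) / f * class_interval


n = sum(frequencies)
q1 = grouped_quantile(n / 4)       # first quartile
q3 = grouped_quantile(3 * n / 4)   # third quartile

interquartile_range = q3 - q1
quartile_deviation = interquartile_range / 2

print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}")
print(f"inter-quartile range = {interquartile_range:.2f}, QD = {quartile_deviation:.2f}")
```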
Quartile Deviation is an absolute measure of dispersion. If quartile deviation is to be used for comparing the dispersion of series, it is necessary to convert the absolute measure to a coefficient of quartile deviation.
Symbolically,
\[ \text{Coefficient of Q.D.} = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
Applying this to the preceding illustration we get,
\[ \text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{530 - 310}{530 + 310} = \frac{220}{840} = 0.26 \text{ (approximately)} \]
Mean Deviation
Quartile deviation suffers from a serious drawback: it is calculated by taking into consideration only two values of a series. As a result, the composition of the series is entirely ignored. To avoid this defect, dispersion is calculated taking into consideration all the observations of the series in relation to a central value. This method of calculating dispersion is called Mean Deviation.
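A minimal Python sketch of the mean deviation about the median for grouped data is given below. It takes the class frequencies as they appear in the worked solution table of this section (5, 5, 15, 10, 5, 4, 3, 3), first interpolates the median and then averages the absolute deviations of the class mid-points from it. The variable names are illustrative.

```python
from itertools import accumulate

# Lower limits of the weekly income classes and the frequencies used in
# the worked solution table of this section.
lower_limits = [100, 200, 300, 400, 500, 600, 700, 800]
frequencies = [5, 5, 15, 10, 5, 4, 3, 3]
class_interval = 100  # i

n = sum(frequencies)
cumulative = list(accumulate(frequencies))

# Step 1: median by interpolation, X = L + (N/2 - C) / f * i.
half = n / 2
index = next(k for k, cf in enumerate(cumulative) if cf >= half)
c = cumulative[index - 1] if index > 0 else 0
median = lower_limits[index] + (half - c) / frequencies[index] * class_interval

# Step 2: mid-point of each class.
mid_points = [lower + class_interval / 2 for lower in lower_limits]

# Steps 3 to 5: absolute deviations |d| from the median, weighted by f.
abs_deviations = [abs(m - median) for m in mid_points]
sum_f_abs_d = sum(f * d for f, d in zip(frequencies, abs_deviations))

# Step 6: mean deviation = sum(f|d|) / N.
mean_deviation = sum_f_abs_d / n
print(f"median = {median:.0f}, sum(f|d|) = {sum_f_abs_d:.0f}, "
      f"mean deviation = {mean_deviation:.0f}")
```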
Illustration:
Table : Weekly Family Income (in Rs.)
Weekly Income (in Rs.) Number of Families
100 – 200 5
200 – 300 5
300 – 400 15
400 – 500 10
500 – 600 5
600 – 700 4
700 – 800 3
800 – 900 3
Total N = 50
Solution :
Table : Weekly Family Income (in Rs.)
Weekly Income   Mid Value   No. of families (f)   Cumulative frequency   Deviation from median 400 |d|   f|d|
100-200         150          5                     5                     250                             1250
200-300         250          5                    10 = C                 150                              750
300-400         350         15 = f                25                      50                              750   (Median Group)
400-500         450         10                    35                      50                              500
500-600         550          5                    40                     150                              750
600-700         650          4                    44                     250                             1000
700-800         750          3                    47                     350                             1050
800-900         850          3                    50                     450                             1350
N = 50   ∑f|d| = 7400
Step   Procedure and its application to the Table above
1   Calculate the median of the distribution:
\[ X = L + \frac{N/2 - C}{f} \times i = 300 + \frac{50/2 - 10}{15} \times 100 = 300 + \frac{25 - 10}{15} \times 100 = 300 + \frac{15}{15} \times 100 = 300 + (1 \times 100) = 300 + 100 = 400 \]
2   Find the mid-point of each class: (100 + 200)/2 = 300/2 = 150, ...
3   Find the absolute deviation |d| of each mid-point from the median (400): |150 – 400| = |–250| = 250, ...
4   Find the total absolute deviation by multiplying the frequency of each class by the deviation of its mid-point from the median (f|d|): 5 x 250 = 1250, ...
5   Find the sum of the products of frequency and deviations (∑f|d|): ∑f|d| = 7400
6   Compute the Mean Deviation:
\[ \text{Mean Deviation} = \frac{\sum f|d|}{N} = \frac{7400}{50} = 148 \]
Standard Deviation
The most useful and frequently used measure of dispersion is the standard deviation or root-mean-square deviation about the mean. The standard deviation is defined as the square root of the arithmetic mean of the squares of the deviations about the mean. Symbolically,
\[ \sigma = \sqrt{\frac{\sum d^2}{N}} \]
Where σ (Greek letter sigma) stands for the standard deviation, ∑d² for the sum of the squares of the deviations measured from the mean and N for the number of items.
Calculation of Standard Deviation
In a continuous series the class intervals are represented by their midpoints. Usually the class intervals are of equal size, and thus the deviations from the assumed average are expressed in class-interval units. That is, the step deviation is found by dividing the deviations by the magnitude of the class interval. Thus, the formula for computing standard deviation is written as follows:
\[ \sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i \]
Where ‘i’ stands for the common factor or the magnitude of the class interval, d for the step deviation, f for the frequency of a class and N for the total number of items. The following example illustrates this formula:
Table : Weekly Family Income (in Rs.)
S.No. Weekly Income (in Rs.) Number of Families (f)
1 100 – 200 5
2 200 – 300 6
3 300 – 400 15
4 400 – 500 10
5 500 – 600 5
6 600 – 700 4
7 700 – 800 3
8 800 – 900 2
N = 50
Table
S.No.   Weekly Income   Mid values (m)   Number of families (f)   Step deviation from assumed average 450 (d)   fd   d²   fd²
1 100 – 200 150 5 –3 –15 9 45
2 200 – 300 250 6 –2 –12 4 24
3 300 – 400 350 15 –1 –15 1 15
4 400 – 500 450 10 0 0 0 0
5 500 – 600 550 5 +1 5 1 5
6 600 – 700 650 4 +2 8 4 16
7 700 – 800 750 3 +3 9 9 27
8 800 – 900 850 2 +4 8 16 32
N = 50   ∑fd = –12   ∑fd² = 164
Table
1   Find the mid-points of the various classes: (100 + 200)/2 = 300/2 = 150, ...
2   Assume a mid-point as the average, preferably at the centre: 450 = assumed average
3   Take the difference of each mid-point from the assumed average (450) and divide it by the magnitude of the class interval to get the step deviation (d): (150 – 450)/100 = –300/100 = –3, ...
4   Multiply the step deviations by the frequency of each class (fd): (–3)(5) = –15, (–2)(6) = –12, ...
5   Find the aggregate of the products of step 4 (∑fd): ∑fd = –12
6   Square the deviations (d²): (–3)(–3) = 9, ...
7   Multiply the squared deviations by the respective frequencies (fd²): 9 x 5 = 45, ...
8   Find the aggregate of the products of step 7 (∑fd²): ∑fd² = 164
9   Compute the standard deviation with the help of the formula
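The final step can be sketched in Python. The sketch below assumes the step-deviation form of the formula, i.e. the square root of ∑fd²/N minus the square of ∑fd/N, multiplied by the class interval, and applies it to the mid-values and frequencies of the table above. The variable names are illustrative.

```python
import math

# Mid-values and frequencies from the standard deviation table above.
mid_values = [150, 250, 350, 450, 550, 650, 750, 850]
frequencies = [5, 6, 15, 10, 5, 4, 3, 2]

assumed_average = 450
class_interval = 100  # i

# Step 3: step deviation d = (m - assumed average) / i.
step_deviations = [(m - assumed_average) / class_interval for m in mid_values]

n = sum(frequencies)
sum_fd = sum(f * d for f, d in zip(frequencies, step_deviations))       # steps 4-5
sum_fd2 = sum(f * d * d for f, d in zip(frequencies, step_deviations))  # steps 6-8

# Step 9: sigma = sqrt(sum(fd^2)/N - (sum(fd)/N)^2) * i.
sigma = math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * class_interval

print(f"sum(fd) = {sum_fd:.0f}, sum(fd^2) = {sum_fd2:.0f}, sigma = {sigma:.2f}")
```

With ∑fd = –12 and ∑fd² = 164, this gives a standard deviation of roughly Rs. 179.5.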
Conclusion
Knowledge of statistics helps the social worker in two ways. First, it allows the social worker to analyse data and draw inferences. Second, as a consumer of research, it enables him/her to understand the analysis of data used in research reports.
Ungrouped data cannot be rapidly or easily interpreted. To make data more readily comprehensible, they can be grouped.
Mean, median and mode are the three measures of central tendency. The mean is the arithmetic average of a distribution. It is computed by dividing the sum of all values of observations by the total number of values. The median is a point in an array which divides a data set into two equal halves in such a way that all the values in one half are greater than the median value and all the values in the other half are smaller than it. The mode is the most frequently occurring value in a distribution.
The range and the standard deviation are the most commonly used measures of variability. The range is the difference between the two extreme values. The square root of the average of the squared deviations of the values from their mean is known as the standard deviation.