
94 Social Work Intervention with Communities and Institutions

21

Introduction to Statistical Techniques in Social Work

* D.K. Lal Das

Introduction

Knowledge of statistics helps the social worker in two ways. First, it enables the social worker to analyse data in line with his/her research objectives and to draw inferences, which in turn help to expand and improve the knowledge base of the profession and to make practice more effective. Second, as a consumer of research, the social worker is able to understand the analysis of data and appreciate the statistical procedures used in research reports.

In the previous Chapter we discussed the preparation of the code book, master chart and tables. In this Chapter, we will learn some of the statistical techniques which social workers need to apply in the course of data analysis.

Measures of Central Tendencies

It is often essential to represent a set of data by means of a single number which, in its way, is descriptive of the entire set. Obviously, the figure used to represent a whole series should be neither the lowest value in the series nor the highest, but a value somewhere between these two limits, possibly in the centre. Such values are called measures of central tendency.

* Dr. D.K. Lal Das, R.M. College of Social Work, Hyderabad

The data collected for the purpose of a statistical inquiry are simple figures without any form or structure. Data obtained in this way are in a raw state, for they have not gone through any statistical treatment. This shapeless mass of data is known as ungrouped data or raw data. Consider the data presented in the Table below.

Table: Marks in Social Work Research Obtained by 40 Students of a College

Roll No. Marks Roll No. Marks Roll No. Marks Roll No. Marks

1 40 11 60 21 43 31 58

2 37 12 38 22 41 32 24

3 61 13 39 23 25 33 71

4 67 14 40 24 42 34 55

5 59 15 51 25 38 35 33

6 70 16 37 26 40 36 65

7 39 17 40 27 50 37 55

8 46 18 72 28 30 38 66

9 68 19 39 29 33 39 40

10 41 20 50 30 54 40 62

Ungrouped data presented in this manner cannot be rapidly or easily interpreted. Only a vague impression may be obtained from perusing them.

To make the data more readily comprehensible, they are grouped, which also reduces their bulk. A first step in such grouping is to represent the repetitions of a particular mark by tallies. The number of tallies corresponding to any given mark is the frequency of that mark (usually denoted by the letter ‘f’).


Table: Tally Sheet Showing the Marks Obtained by 40 Students

Marks Tallies Marks Tallies Marks Tallies Marks Tallies

24 I 37 II 50 II 63 -

25 I 38 II 51 I 64 -

26 - 39 III 52 - 65 I

40 IIIII 53 - 66 I

28 - 41 II 54 I 67 I

29 - 42 I 55 II 68 I

30 I 43 I 56 - 69 -

31 - 44 - 57 - 70 I

32 - 45 - 58 I 71 I

33 II 46 I 59 I 72 I

34 - 47 - 60 I

35 - 48 - 61 I

36 - 49 - 62 I
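The tally sheet can be reproduced with a short Python sketch. It assumes the 40 marks are typed in roll-number order exactly as they appear in the table of raw marks; the variable names are illustrative only.

```python
from collections import Counter

# Marks of the 40 students in roll-number order, copied from the raw-data table.
marks = [
    40, 37, 61, 67, 59, 70, 39, 46, 68, 41,
    60, 38, 39, 40, 51, 37, 40, 72, 39, 50,
    43, 41, 25, 42, 38, 40, 50, 30, 33, 54,
    58, 24, 71, 55, 33, 65, 55, 66, 40, 62,
]

# Counter maps each distinct mark to the number of times it occurs (its frequency f).
frequency = Counter(marks)

# Print one row per mark that occurs: the mark, one stroke per occurrence, and f.
for mark in sorted(frequency):
    f = frequency[mark]
    print(f"{mark:>3}  {'I' * f:<5}  f = {f}")
```

Marks that nobody obtained (shown as "-" in the tally sheet) simply do not appear in the output; looking them up in `frequency` returns 0.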

Data may also be grouped into classes, for example by counting the number of students whose heights fall within successive ranges such as 150 to 155 centimetres, 155 to 160 centimetres and so on. Data of this type are called continuous data (see the Table below).

Table : Frequency Distribution: Continuous Series

S.No. Height in Centimeters No. of Students

1 150-155 12

2 155-160 15

3 160-165 18

4 165-170 17

5 170-175 15

6 175-180 14

Total 91


Each group of five consecutive values of height, namely 150-155, 155-160, 160-165 etc., is called a class. Since each class spans five values, five is the magnitude or width of the class, commonly known as the class interval. The first figure of each class is called its lower limit, and the last figure is known as its upper limit. Every class interval has a mid-point, which lies midway between the upper and lower limits. The Figure below illustrates the lower limit, mid-point and upper limit of the class 150-155.

Figure: Lower Limits, Midpoints and Upper Limits (for the class 150-155: lower limit 150, mid-point 152.5, upper limit 155; the mid-point lies 2.5 units from each limit)

There are three important measures of central tendency used in social work research: the mean, the median and the mode.

The Mean

The mean is the most common of all the averages. It is relatively easy to calculate, simple to understand and widely used in social work research. The mean is obtained by adding the values of all the items and dividing the total by the number of items. An example will help us learn how to calculate the arithmetic mean. Let us suppose that eight students receive 54, 58, 60, 62, 70, 72, 75 and 77 marks respectively in an examination. The mean of the marks will be:

\[
\text{Mean} = \frac{54 + 58 + 60 + 62 + 70 + 72 + 75 + 77}{8} = \frac{528}{8} = 66
\]
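The same arithmetic can be checked with a minimal Python sketch, assuming the eight marks are entered as a plain list (the variable names are illustrative):

```python
# Marks of the eight students from the example.
marks = [54, 58, 60, 62, 70, 72, 75, 77]

# Arithmetic mean: sum of all items divided by the number of items.
total = sum(marks)
mean = total / len(marks)

print(total, mean)  # 528 66.0
```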


In calculating the arithmetic mean of a continuous series, we take the mid-value of each class as representative of that class (it is presumed that the frequencies of the class are concentrated at its mid-point), multiply the mid-values by their corresponding frequencies, and divide the sum of the products by the sum of the frequencies.

Illustration

Table: Distribution of Rag-Pickers by their Daily Income

S.No.   Daily Income (in Rs.)   Number of Rag-Pickers
1       110-130                 15
2       130-150                 30
3       150-170                 60
4       170-190                 95
5       190-210                 82
6       210-230                 75
7       230-250                 23
Total                           380

Solution:

Daily Income (in Rs.)   Mid-values (m)   Number of Rag-Pickers (f)   m x f
110-130                 120*             15                          1800
130-150                 140              30                          4200
150-170                 160              60                          9600
170-190                 180              95                          17100
190-210                 200              82                          16400
210-230                 220              75                          16500
230-250                 240              23                          5520
                                         ∑f = N = 380                ∑mf = 71120


Calculation:

* The mid-value of the first class is

\[
\text{Mid-value} = \frac{\text{Lower limit} + \text{Upper limit}}{2} = \frac{110 + 130}{2} = \frac{240}{2} = 120
\]

\[
\bar{X} = \frac{\sum mf}{\sum f} = \frac{\sum mf}{N} = \frac{71120}{380} = 187.16
\]

Mean = Rs.187.16 (approximately)
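The direct method can be sketched in Python as follows. The sketch assumes each class is stored as a (lower limit, upper limit, frequency) tuple; the function name `grouped_mean` is an illustrative choice, not part of any library.

```python
# Daily income classes of the rag-pickers: (lower limit, upper limit, frequency).
classes = [
    (110, 130, 15), (130, 150, 30), (150, 170, 60), (170, 190, 95),
    (190, 210, 82), (210, 230, 75), (230, 250, 23),
]


def grouped_mean(classes):
    """Mean of a continuous series: sum(m * f) / sum(f), with m the class mid-value."""
    total_f = sum(f for _, _, f in classes)
    total_mf = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return total_mf / total_f


mean_income = grouped_mean(classes)
print(round(mean_income, 2))  # 187.16
```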

The same mean can also be obtained by the step-deviation method, taking Rs.180 as the assumed mean and the class interval of 20 as the step.

Solution:

Table

Daily Income   Mid-values   No. of Rag-   Deviation from assumed   Step deviation   Total deviation
(in Rs.)       (m)          Pickers (f)   mean 180 (dx)            (d)              (fd)
110-130        120          15            –60                      –3               –45
130-150        140          30            –40                      –2               –60
150-170        160          60            –20                      –1               –60
170-190        180          95            0                        0                0
190-210        200          82            +20                      +1               +82
210-230        220          75            +40                      +2               +150
230-250        240          23            +60                      +3               +69
                            N = 380                                                 ∑fd = 136


\[
\text{Mean } (\bar{X}) = a + \frac{\sum fd}{N} \times i
\]

Where ‘a’ stands for the assumed mean, ∑fd for the sum of total deviations, N for the total number of frequencies and ‘i’ for the class interval. Substituting the values from the table into the formula, we get:

\[
\bar{X} = 180 + \frac{136}{380} \times 20 = 180 + 7.16 = 187.16
\]

Mean = Rs.187.16 (approximately)
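A minimal Python sketch of the step-deviation method, assuming the same (lower limit, upper limit, frequency) representation of the classes; the names `assumed_mean`, `width` and `sum_fd` are illustrative:

```python
# Daily income classes of the rag-pickers: (lower limit, upper limit, frequency).
classes = [
    (110, 130, 15), (130, 150, 30), (150, 170, 60), (170, 190, 95),
    (190, 210, 82), (210, 230, 75), (230, 250, 23),
]
assumed_mean = 180  # 'a' in the formula
width = 20          # class interval 'i'

n = sum(f for _, _, f in classes)

# Step deviation d = (mid-value - assumed mean) / class interval; accumulate f * d.
sum_fd = 0.0
for lower, upper, f in classes:
    d = ((lower + upper) / 2 - assumed_mean) / width
    sum_fd += f * d

mean_income = assumed_mean + sum_fd / n * width
print(sum_fd, round(mean_income, 2))  # 136.0 187.16
```

The result agrees with the direct method, since the step-deviation formula only rescales and shifts the mid-values before averaging.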

The Median

The median is another simple measure of central tendency. We sometimes want to locate the position of the middle item when data have been arranged in order. This measure is also known as a positional average. We define the median as the size of the middle item when the items are arrayed in ascending or descending order of magnitude. This means that the median divides the series in such a manner that there are as many items above or larger than the middle one as there are below or smaller than it.

In a continuous series we do not know every observation. Instead, we have a record of the frequencies with which the observations appear in each of the class intervals, as in the following Table. Nevertheless, we can compute the median by determining which class interval contains the median.


Table: Daily Income of Rag-Pickers

Daily Income (in Rs.)   Number of Rag-pickers (f)   Cumulative frequencies (CF)

110 – 130 15 15

130 – 150 30 45

150 – 170 60 105

170 – 190 95 200

190 – 210 82 282

210 – 230 75 357

230 – 250 23 380

N = 380

In the case of the data given in the Table above, the median is that value on either side of which N/2, i.e. 380/2 or 190, items lie. Now the problem is to find the class interval containing the 190th item. The cumulative frequency for the first three classes is only 105. But when we move to the fourth class interval, 95 items are added to 105 for a total of 200. Therefore, the 190th item must be located in this fourth class interval (the interval from Rs.170 to Rs.190).

The median class (Rs.170 – Rs.190) for the series contains 95 items. For the purpose of determining the point which has 190 items on each side, we assume that these 95 items are evenly spaced over the entire class interval 170–190. Therefore, we can interpolate and find the value of the 190th item. First, we determine that the 190th item is the 85th item in the median class: 190 – 105 = 85. Then we calculate the width of the 95 equal steps from Rs.170 to Rs.190 as follows:

\[
\frac{190 - 170}{95} = 0.21053 \text{ (approximately)}
\]

The value of the 85th item is 0.21053 x 85 = 17.89. If this (17.89) is added to the lower limit of the median class, we get 170 + 17.89 = 187.89. This is the median of the series.

This can be put in the form of a formula:

\[
X = L + \frac{N/2 - C}{f} \times i
\]

Where

X = median,

L = lower limit of the class in which the median lies,

N = total number of items,

C = cumulative frequency of the class prior to the median class,

f = frequency of the median class,

i = class interval of the median class.

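Substituting the values from the Table:

\[
\begin{aligned}
X &= 170 + \frac{380/2 - 105}{95} \times (190 - 170) \\
  &= 170 + \frac{190 - 105}{95} \times (190 - 170) \\
  &= 170 + \frac{85}{95} \times (190 - 170) \\
  &= 170 + (0.8947 \times 20) \\
  &= 187.89 \text{ (approximately)}
\end{aligned}
\]

Median Income = Rs.187.89 (approximately)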
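The interpolation can be sketched in Python as below. The function `grouped_median` is a hypothetical helper written for this illustration; it assumes classes are given in ascending order as (lower limit, upper limit, frequency) tuples and that items are evenly spread within the median class, as the text assumes.

```python
# Daily income classes of the rag-pickers: (lower limit, upper limit, frequency).
classes = [
    (110, 130, 15), (130, 150, 30), (150, 170, 60), (170, 190, 95),
    (190, 210, 82), (210, 230, 75), (230, 250, 23),
]


def grouped_median(classes):
    """Median = L + (N/2 - C) / f * i, located by walking the cumulative frequencies."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cumulative = 0  # C: cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        # The first class whose cumulative frequency reaches N/2 is the median class.
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("the distribution has no observations")


median_income = grouped_median(classes)
print(round(median_income, 2))  # 187.89
```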

The Mode

Another measure, which is sometimes used to describe the central tendency of a set of data, is the mode. It is defined as the value that is repeated most often in the data set. In the following series of values: 71, 73, 74, 75, 75, 75, 78, 78, 80 and 82, the mode is 75, because 75 occurs more often than any other value (three times). In grouped data the mode is located in the class where the frequency is greatest. The mode is more useful when there are a larger number of cases and when data have been grouped.
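For ungrouped values, the mode can be found by counting occurrences. A minimal Python sketch for the series quoted above (variable names are illustrative):

```python
from collections import Counter

values = [71, 73, 74, 75, 75, 75, 78, 78, 80, 82]

# most_common(1) returns the single (value, count) pair with the highest count.
mode_value, mode_count = Counter(values).most_common(1)[0]

print(mode_value, mode_count)  # 75 3
```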

Calculation of Mode

The first step in the calculation of the mode is to find the point of maximum concentration with the help of the grouping method. The procedure of grouping is as follows:

i) First the frequencies are added in two’s in two ways: (a) by adding the frequencies of item numbers 1 and 2; 3 and 4; 5 and 6; and so on, and (b) by adding the frequencies of item numbers 2 and 3; 4 and 5; 6 and 7; and so on.

ii) Then the frequencies are added in three’s. This can be done in three ways: (a) by adding the frequencies of item numbers 1, 2 and 3; 4, 5 and 6; 7, 8 and 9; and so on, (b) by adding the frequencies of item numbers 2, 3 and 4; 5, 6 and 7; 8, 9 and 10; and so on, and (c) by adding the frequencies of item numbers 3, 4 and 5; 6, 7 and 8; 9, 10 and 11; and so on.

If necessary, grouping of frequencies can be done in four’s and five’s also. After grouping, the maximum frequency in each column is circled. The item value which contains the maximum frequency the largest number of times is the mode of the series. This is shown in the Tables given below.

After the process of grouping locates the class of maximum concentration, the value of the mode is interpolated by the use of the following formula:

\[
\text{Mode } (X) = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i
\]

Where X stands for the mode, L is the lower limit of the modal class, f_0 stands for the frequency of the preceding class, f_1 for the frequency of the modal class, f_2 for the frequency of the succeeding class and i for the class interval of the modal class.

Illustration:

Table: Weekly Family Income (in Rs.)

Weekly Income        Number of families
100 – 200            5
200 – 300            6 = f0
300 – 400 (L = 300)  15 = f1
400 – 500            10 = f2
500 – 600            5
600 – 700            4
700 – 800            3
800 – 900            2
Total                N = 50


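Table: Location of Modal Class by Grouping

Each grouped total is entered against the first class of its group; "-" marks an empty cell.

Weekly Income   f (1)   (2)   (3)   (4)   (5)   (6)
100 – 200       5       11    -     26    -     -
200 – 300       6       -     21    -     31    -
300 – 400       15      25    -     -     -     30
400 – 500       10      -     15    19    -     -
500 – 600       5       9     -     -     12    -
600 – 700       4       -     7     -     -     9
700 – 800       3       5     -     -     -     -
800 – 900       2       -     -     -     -     -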

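Table: Analysis Table

Class Containing Maximum Frequency ("-" marks an empty cell)

Column                      100-200  200-300  300-400  400-500  500-600  600-700  700-800  800-900
1                           -        -        1        -        -        -        -        -
2                           -        -        1        1        -        -        -        -
3                           -        1        1        -        -        -        -        -
4                           1        1        1        -        -        -        -        -
5                           -        1        1        1        -        -        -        -
6                           -        -        1        1        1        -        -        -
No. of times a class occurs 1        3        6        3        1        -        -        -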

Therefore the 300-400 group is the modal group. Using the formula of interpolation, viz.,

\[
X = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i
\]

we get

\[
\begin{aligned}
X &= 300 + \frac{15 - 6}{2 \times 15 - 6 - 10} \times 100 \\
  &= 300 + \frac{9}{14} \times 100 \\
  &= 300 + 64.29 \\
  &= 364.29 \text{ (approximately)}
\end{aligned}
\]
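The grouping method and the interpolation formula can be sketched together in Python. This is an illustration under stated assumptions rather than a standard routine: the helper names (`grouping_columns`, `analysis_counts`, `interpolated_mode`) are invented here, grouping is done only in two's and three's as in the tables above, and a missing neighbouring class is treated as having frequency 0.

```python
from collections import Counter

lower_limits = [100, 200, 300, 400, 500, 600, 700, 800]
frequencies = [5, 6, 15, 10, 5, 4, 3, 2]
width = 100  # common class interval i


def grouping_columns(freqs):
    """Build columns (1) to (6) of the grouping table.

    Each column is a list of (class_indices, summed_frequency) pairs.
    """
    # Column (1): the frequencies themselves.
    columns = [[((i,), f) for i, f in enumerate(freqs)]]
    # Columns (2)-(3): groups of two starting at the 1st and 2nd class.
    # Columns (4)-(6): groups of three starting at the 1st, 2nd and 3rd class.
    for size in (2, 3):
        for start in range(size):
            column = []
            for i in range(start, len(freqs) - size + 1, size):
                members = tuple(range(i, i + size))
                column.append((members, sum(freqs[j] for j in members)))
            columns.append(column)
    return columns


def analysis_counts(freqs):
    """Count how often each class belongs to the maximum group of a column."""
    counts = Counter()
    for column in grouping_columns(freqs):
        best = max(total for _, total in column)
        for members, total in column:
            if total == best:
                counts.update(members)
    return counts


def interpolated_mode(lower, f0, f1, f2, i):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * i."""
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * i


counts = analysis_counts(frequencies)
modal_index = counts.most_common(1)[0][0]

# Neighbouring frequencies; a missing neighbour is taken as 0 (an assumption that
# only matters when the modal class is the first or the last class).
f0 = frequencies[modal_index - 1] if modal_index > 0 else 0
f1 = frequencies[modal_index]
f2 = frequencies[modal_index + 1] if modal_index + 1 < len(frequencies) else 0

mode = interpolated_mode(lower_limits[modal_index], f0, f1, f2, width)
print(f"Modal class starts at {lower_limits[modal_index]}, mode = {mode:.2f}")
```

For the weekly income data the counts per class come out as 1, 3, 6, 3 and 1 for the first five classes, matching the last row of the analysis table.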

Measures of Dispersion

In social work research, we often wish to know the extent of homogeneity and heterogeneity among respondents with respect to a given characteristic. Any set of social data is characterized by the heterogeneity of its values. In fact, the extent to which the values are heterogeneous, or vary among themselves, is of basic importance in statistics. Measures of central tendency describe one important characteristic of a set of data, its typical value, but they do not tell us anything about this other basic characteristic. Consequently, we need ways of measuring heterogeneity, that is, the extent to which data are dispersed. The measures which provide this description are called measures of dispersion or variability.


Range

The range is defined as the difference between the highest and lowest values. Mathematically,

\[
R \text{ (Range)} = m_h - m_L
\]

Where m_h and m_L stand for the highest and the lowest value. Thus, for the data set 10, 22, 20, 14 and 14, the range would be the difference between 22 and 10, i.e., 12. In the case of grouped data, we take the range as the difference between the midpoints of the extreme classes. Thus, if the midpoint of the lowest interval is 150 and that of the highest is 850, the range will be 700.
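Both cases can be checked with a few lines of Python. The grouped case below assumes classes running from 100–200 up to 800–900, which is one set of classes whose extreme midpoints are 150 and 850; the names are illustrative.

```python
# Ungrouped data: range = highest value - lowest value.
values = [10, 22, 20, 14, 14]
value_range = max(values) - min(values)

# Grouped data: range = midpoint of the highest class - midpoint of the lowest class.
lowest_class = (100, 200)   # assumed class limits with midpoint 150
highest_class = (800, 900)  # assumed class limits with midpoint 850
grouped_range = sum(highest_class) / 2 - sum(lowest_class) / 2

print(value_range, grouped_range)  # 12 700.0
```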

Semi-Inter-Quartile Range or Quartile Deviation

Another measure of dispersion is the semi-inter-quartile range, commonly known as quartile deviation. Quartiles are the points which divide the array or series of values into four equal parts, each of which contains 25 per cent of the items in the distribution. The quartiles are then the highest values in each of the first three of these parts. The inter-quartile range is the difference between the values of the first and the third quartiles.

Thus, where Q1 and Q3 stand for the first and the third quartiles, the semi-inter-quartile range or quartile deviation is

\[
\text{Q.D.} = \frac{Q_3 - Q_1}{2}
\]

Calculation of Quartile Deviation (QD)


Table: Weekly Family Income (in Rs.)

Weekly Income (in Rs.) Number of Families

100 – 200 5

200 – 300 6

300 – 400 15

400 – 500 10

500 – 600 5

600 – 700 4

700 – 800 3

800 – 900 2

Total N = 50

Table

S.No.   Weekly Income (in Rs.)   Number of Families   Cumulative Frequency (CF)
(1)     (2)                      (3)                  (4)
1       100 – 200                5                    5
2       200 – 300                6                    11 (= C for Q1)
3       300 – 400 (Q1 class)     15 (= f)             26
4       400 – 500                10                   36 (= C for Q3)
5       500 – 600 (Q3 class)     5 (= f)              41
6       600 – 700                4                    45
7       700 – 800                3                    48
8       800 – 900                2                    50
Total                            N = 50


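With L the lower limit of the quartile class, C the cumulative frequency of the class before it, f its frequency and i its class interval:

\[
\begin{aligned}
Q_1 &= L + \frac{N/4 - C}{f} \times i \\
    &= 300 + \frac{12.5 - 11}{15} \times 100 \\
    &= 300 + \frac{1.5}{15} \times 100 \\
    &= 300 + (0.1 \times 100) \\
    &= 300 + 10 \\
    &= 310
\end{aligned}
\]

\[
\begin{aligned}
Q_3 &= L + \frac{3N/4 - C}{f} \times i \\
    &= 500 + \frac{37.5 - 36}{5} \times 100 \\
    &= 500 + \frac{1.5}{5} \times 100 \\
    &= 500 + (0.3 \times 100) \\
    &= 500 + 30 \\
    &= 530
\end{aligned}
\]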

Inter-quartile range = Q3 – Q1 = 530 – 310 = 220

\[
\text{QD} = \frac{Q_3 - Q_1}{2} = \frac{220}{2} = 110
\]
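The quartile calculation follows the same interpolation as the median, with N/4 and 3N/4 in place of N/2. A minimal Python sketch, in which `grouped_quantile` is a hypothetical helper written for this illustration and classes are (lower limit, upper limit, frequency) tuples in ascending order:

```python
# Weekly family income classes: (lower limit, upper limit, number of families).
classes = [
    (100, 200, 5), (200, 300, 6), (300, 400, 15), (400, 500, 10),
    (500, 600, 5), (600, 700, 4), (700, 800, 3), (800, 900, 2),
]


def grouped_quantile(classes, fraction):
    """Value below which `fraction` of the items lie: L + (fraction*N - C) / f * i."""
    n = sum(f for _, _, f in classes)
    target = n * fraction
    cumulative = 0  # C: cumulative frequency before the current class
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("fraction must lie between 0 and 1")


q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
quartile_deviation = (q3 - q1) / 2

print(round(q1, 2), round(q3, 2), round(quartile_deviation, 2))
```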

Quartile deviation is an absolute measure of dispersion. If quartile deviation is to be used for comparing the dispersion of different series, the absolute measure must be converted to a relative one, the coefficient of quartile deviation. Symbolically,

\[
\text{Coefficient of Q.D.} = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1}
\]

Applying this to the preceding illustration we get,

\[
\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{530 - 310}{530 + 310} = \frac{220}{840} = 0.26 \text{ (approximately)}
\]
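As a quick check of this ratio, a short Python sketch using the quartiles Q1 = 310 and Q3 = 530 from the preceding illustration (the variable names are illustrative):

```python
q1, q3 = 310, 530  # quartiles from the weekly family income illustration

# The halves in numerator and denominator cancel, leaving (Q3 - Q1) / (Q3 + Q1).
coefficient_qd = (q3 - q1) / (q3 + q1)

print(round(coefficient_qd, 2))  # 0.26
```

Because the coefficient is a ratio, it has no unit and can be compared across series measured on different scales.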

Mean Deviation

Quartile deviation suffers from a serious drawback: it is calculated by taking into consideration only two values of a series. As a result, the composition of the rest of the series is ignored. To avoid this defect, dispersion can be calculated taking into consideration all the observations of the series in relation to a central value. This method of calculating dispersion is called the mean deviation.

Illustration:

Weekly Income (in Rs.)   Number of Families
100 – 200                5
200 – 300                5
300 – 400                15
400 – 500                10
500 – 600                5
600 – 700                4
700 – 800                3
800 – 900                3
Total                    N = 50

Solution:
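Before the step-by-step hand calculation, the whole computation can be sketched in Python. The sketch assumes the frequencies used in the worked solution (5, 5, 15, 10, 5, 4, 3, 3) and measures deviations from the median, as the solution does; the helper names are invented for this illustration.

```python
# Weekly income classes as used in the worked solution:
# (lower limit, upper limit, number of families).
classes = [
    (100, 200, 5), (200, 300, 5), (300, 400, 15), (400, 500, 10),
    (500, 600, 5), (600, 700, 4), (700, 800, 3), (800, 900, 3),
]


def grouped_median(classes):
    """Median = L + (N/2 - C) / f * i."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cumulative = 0
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("the distribution has no observations")


def mean_deviation_about_median(classes):
    """Sum of f * |mid-point - median| divided by N."""
    n = sum(f for _, _, f in classes)
    median = grouped_median(classes)
    total = sum(f * abs((lower + upper) / 2 - median) for lower, upper, f in classes)
    return total / n


median = grouped_median(classes)
mean_deviation = mean_deviation_about_median(classes)
print(round(median, 2), round(mean_deviation, 2))
```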

Weekly Income            Mid Value   No. of families (f)   Cumulative frequency   Deviation from median 400 |d|   f|d|
100-200                  150         5                     5                      250                             1250
200-300                  250         5                     10 = C                 150                             750
300-400 (Median Group)   350         15 = f                25                     50                              750
400-500                  450         10                    35                     50                              500
500-600                  550         5                     40                     150                             750
600-700                  650         4                     44                     250                             1000
700-800                  750         3                     47                     350                             1050
800-900                  850         3                     50                     450                             1350
                                     N = 50                                                                       ∑f|d| = 7400

Step, procedure and application to the table above:

1 Calculate the median of the distribution:

\[
\begin{aligned}
X &= L + \frac{N/2 - C}{f} \times i \\
  &= 300 + \frac{50/2 - 10}{15} \times 100 \\
  &= 300 + \frac{25 - 10}{15} \times 100 \\
  &= 300 + \frac{15}{15} \times 100 \\
  &= 300 + (1 \times 100) = 300 + 100 = 400
\end{aligned}
\]

2 Find the mid-point of each class: (100 + 200)/2 = 300/2 = 150, ...

3 Find the absolute deviation |d| of each mid-point from the median (400): |150 – 400| = |–250| = 250, ...

4 Find the total absolute deviation by multiplying the frequency of each class by the deviation of its mid-point from the median (f|d|): 5 x 250 = 1250, ...

5 Find the sum of the products of frequencies and deviations: ∑f|d| = 7400

6 Compute the mean deviation:

\[
\text{Mean Deviation} = \frac{\sum f|d|}{N} = \frac{7400}{50} = 148
\]

Standard Deviation

The most useful and frequently used measure of dispersion is the standard deviation, or root-mean-square deviation about the mean. The standard deviation is defined as the square root of the arithmetic mean of the squares of the deviations about the mean. Symbolically,

\[
\sigma = \sqrt{\frac{\sum d^2}{N}}
\]

Where σ (the Greek letter sigma) stands for the standard deviation, ∑d² for the sum of the squares of the deviations measured from the mean and N for the number of items.
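The definition of the standard deviation can be illustrated on a small ungrouped data set. The sketch below reuses the five values from the range example (10, 22, 20, 14 and 14) purely for illustration; the cross-check uses `statistics.pstdev` from the Python standard library, which computes the same population standard deviation.

```python
import math
import statistics

values = [10, 22, 20, 14, 14]  # reused from the range example, for illustration only
n = len(values)

mean = sum(values) / n                        # arithmetic mean of the items
sum_d2 = sum((x - mean) ** 2 for x in values)  # sum of squared deviations from the mean
sigma = math.sqrt(sum_d2 / n)                 # square root of the mean squared deviation

# Cross-check against the standard library's population standard deviation.
assert math.isclose(sigma, statistics.pstdev(values))

print(mean, sum_d2, round(sigma, 2))  # 16.0 96.0 4.38
```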

Calculation of Standard Deviation

In a continuous series the class intervals are represented by their midpoints. Usually the class intervals are of equal size, so the deviations from the assumed average can be expressed in class-interval units. In other words, the step deviation is found by dividing each deviation by the magnitude of the class interval. Thus, the formula for computing standard deviation is written as follows:

\[
\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i
\]

Where ‘i’ stands for the common factor or the magnitude of the class interval, d for the step deviation of each mid-point from the assumed average and f for the frequency of each class. The following example illustrates this formula:



No. Weekly Income (in Rs.) Number of Families (f)

1 100 – 200 5

2 200 – 300 6

3 300 – 400 15

4 400 – 500 10

5 500 – 600 5

6 600 – 700 4

7 700 – 800 3

8 800 – 900 2

N = 50

Table

S.No.   Weekly Income   Mid values (m)   Number of families (f)   Step deviation from assumed average 450 (d)   fd    d2   fd2
1       100 – 200       150              5                        –3                                            –15   9    45
2       200 – 300       250              6                        –2                                            –12   4    24
3       300 – 400       350              15                       –1                                            –15   1    15
4       400 – 500       450              10                       0                                             0     0    0
5       500 – 600       550              5                        +1                                            5     1    5
6       600 – 700       650              4                        +2                                            8     4    16
7       700 – 800       750              3                        +3                                            9     9    27
8       800 – 900       850              2                        +4                                            8     16   32
                                         N = 50                                                                 ∑fd = –12   ∑fd2 = 164


Table

Step, procedure and application:

1 Find the mid-points of the various classes: (100 + 200)/2 = 300/2 = 150, ...

2 Assume one of the mid-points as the average, preferably one at the centre: 450 = assumed average

3 Take the difference of each mid-point from the assumed average (450) and divide it by the magnitude of the class interval to get the step deviation (d): (150 – 450)/100 = –300/100 = –3, ...

4 Multiply the deviations by the frequency of each class (fd): (–3) (5) = –15, (–2) (6) = –12, ...

5 Find the aggregate of the products of step 4 (∑fd): ∑fd = –12

6 Square the deviations (d2): (–3) (–3) = 9, ...

7 Multiply the squared deviations by the respective frequencies (fd2): 9 x 5 = 45, ...

8 Find the aggregate of the products of step 7 (∑fd2): ∑fd2 = 164

9 Compute the standard deviation with the help of the formula:

\[
\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i
       = \sqrt{\frac{164}{50} - \left(\frac{-12}{50}\right)^2} \times 100
       = \sqrt{3.28 - 0.0576} \times 100
       = 179.51 \text{ (approximately)}
\]
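The nine steps can be sketched in Python as follows. The sketch assumes the weekly income classes and the assumed average of 450 from the tables above; variable names such as `sum_fd` and `sum_fd2` are illustrative.

```python
import math

# Weekly income classes: (lower limit, upper limit, number of families).
classes = [
    (100, 200, 5), (200, 300, 6), (300, 400, 15), (400, 500, 10),
    (500, 600, 5), (600, 700, 4), (700, 800, 3), (800, 900, 2),
]
assumed_average = 450
width = 100  # class interval 'i'

n = sum(f for _, _, f in classes)
sum_fd = 0.0   # aggregate of f * d    (step 5)
sum_fd2 = 0.0  # aggregate of f * d^2  (step 8)

for lower, upper, f in classes:
    mid = (lower + upper) / 2            # step 1: mid-point of the class
    d = (mid - assumed_average) / width  # step 3: step deviation
    sum_fd += f * d                      # step 4
    sum_fd2 += f * d * d                 # steps 6 and 7

# Step 9: sigma = sqrt(sum(f d^2)/N - (sum(f d)/N)^2) * i
sigma = math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width

print(sum_fd, sum_fd2, round(sigma, 2))  # -12.0 164.0 179.51
```

Subtracting the squared mean step deviation corrects for the fact that 450 is only an assumed average, not the true mean of the distribution.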

Conclusion

Knowledge of statistics helps the social worker in two ways. First, it enables the social worker to analyse data and draw inferences. Second, as a consumer of research, it enables him/her to understand the analysis of data used in research reports.

Ungrouped data cannot be rapidly or easily interpreted. To make them more readily comprehensible, data can be grouped.

Mean, median and mode are the three measures of central tendency. The mean is the arithmetic average of a distribution. It is computed by dividing the sum of all values of observations by the total number of values. The median is a point in an array which divides a data set into two equal halves in such a way that all the values in one half are greater than the median value and all the values in the other half are smaller than it. The mode is the most frequently occurring value in a distribution.

The range and the standard deviation are the most commonly used measures of variability. The range is the difference between the two extreme values. The square root of the average of the squared deviations of the values from their mean is known as the standard deviation.

