
Computer evaluation of anthropological census data

Item Type text; Thesis-Reproduction (electronic)

Authors Hutchens, Rex Richard, 1942-

Publisher The University of Arizona.

Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.

Download date 12/06/2018 23:38:19

Link to Item http://hdl.handle.net/10150/318606


COMPUTER EVALUATION OF ANTHROPOLOGICAL CENSUS DATA

by Rex Richard Hutchens

A Thesis Submitted to the Faculty of the
DEPARTMENT OF ORIENTAL STUDIES

In Partial Fulfillment of the Requirements
For the Degree of
MASTER OF ARTS

In the Graduate College
THE UNIVERSITY OF ARIZONA

1972

STATEMENT BY AUTHOR

This thesis has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.

Brief quotations from this thesis are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.

SIGNED:

APPROVAL BY THESIS DIRECTOR

This thesis has been approved on the date shown below:

J. M. MAHAR                Date
Professor of Oriental Studies

ACKNOWLEDGMENTS

Many hands have helped produce this work. Thanks are due Dr. J. M. Mahar whose generosity in providing access to the fruits of his years of research and field work made the project possible. His incisive criticism and demanding standards are everywhere evident. Thanks are also due Mr. Allen Ferber whose patient cooperation and late-night efforts in writing the necessary computer programs saw the project through to completion. The drudgery of coding and card punching was accomplished in a large measure by my wife Cheryl, who is also responsible for considerable editorial assistance and the final typing. A special word of thanks is also due Mr. James A. Younkins and family of Pittsburgh, Pennsylvania, without whose timely financial assistance my life would doubtless have taken a very different turn. My contribution is solely the idea, the organization and the errors.

TABLE OF CONTENTS

LIST OF ILLUSTRATIONS
LIST OF TABLES
ABSTRACT

I. INTRODUCTION
    Computer Applications
    The Problem Areas

II. THE SETTING

III. DEVELOPMENT OF THE CODE
    Examination of Data
    Selection of Categories
    Category Coding
        Name
        Caste
        Khandan
        Generation
        Father's Birth Sequence
        Individual's Birth Sequence
        Sex/Family Position
        Age
        Education Level
        Residence/Job Locale
        Occupation
        Villages
        Land

IV. SYSTEM DEVELOPMENT AND DESIGN
    System Flow: Explanation
        Preparation of Data
        Input of Data to Update and List Program
        Program Output
        Statistical Analysis Programs
    System Flow: Input Requirements
    Program Explanation

V. STATISTICAL TESTING AND RELATED CONSIDERATIONS
    Hypothesis Development
        Axiom, Theory and Hypothesis
    The Null Hypothesis
    Sampling
    Random Error
    Some Statistical Tests
        Pearson Product-Moment Correlation Coefficient (r)
        The Phi-Coefficient
        The Chi-Square Test
        Data Matrices
    Statistical Subroutines

VI. THE METHODOLOGY OF DATA ASSESSMENT
    Data Shortage and Data Evaluation
    Selection of Variable Sets
    Determination of Associations Among Variables
        Age-Grade Variables
        Dichotomous Variables
        Single Variable Analysis
        Marriage Reciprocity
    The Formation of Generalizations
    The Meaning of Model

VII. SOME FINAL CONSIDERATIONS
    General Data Problems
    Specific Data Problems
    Suggestions for Further Analysis
    Implications for Culture Change Quantification
    Conclusions

APPENDIX A: INTERVIEW GUIDE FOR KHANDAN STUDY
APPENDIX B: DUPLICATE EXAMPLE OF ORIGINAL CENSUS DATA
APPENDIX C: CARD FORMAT AND CODES
    Card Format
    Miscellaneous Codes
    Caste Code
    Occupation Code
APPENDIX D: MARRIAGE RECIPROCITY MATRIX
SELECTED BIBLIOGRAPHY

LIST OF ILLUSTRATIONS

Figure
2-1. India
2-2. Saharanpur and Muzaffarnagar Districts
4-1. Rankhandi Data Base System Flow Diagram
4-2. Basic Program Flow Chart
6-1. Flow Chart of Data Assessment Method
6-2. Age-Grade Distribution of Brahmin School Children
6-3. Age-Grade Distribution of Brahmin Females Enrolled in School

LIST OF TABLES

Table
2-1. The Population of Rankhandi, 1954
4-1. Example of Data Shortage Output
4-2. Example of Basic Data Output

ABSTRACT

Statistical analysis, as well as computer use, is becoming increasingly popular in anthropology. This study provides a series of suggestions for dealing with essentially unordered data of the nature found in a census through the use of statistics and the computer. The resultant mathematical relationships may be analyzed and integrated by the ethnographer in the construction of statistical models to explain social situations and effectively predict future situations. Suggestions for coding data are given and illustrated with census data from Rankhandi, a North Indian village. In addition, a discussion of the computer system development and design used in this study provides a basic understanding of the programs developed for ordering, analyzing and storing data. A review of some statistical tests furnishes the basic groundwork for an understanding of the framework of data assessment employed in this study. Due to the inability to correct the data shortage and inaccuracy uncovered by the computer, the actual model building could not be undertaken; however, the study illustrated quite clearly that the computer is of inestimable worth in evaluating large quantities of unordered data.


CHAPTER I

INTRODUCTION

The representation of anthropological data and theory by mathematical symbolism seems to be enjoying increasing popularity. Even the more advanced areas of mathematics, such as topology, are beginning to be viewed as areas capable of giving insight to some of the problems of anthropology. One of the most persistent and enthusiastic proponents of the use of mathematics in anthropology is Edmund Leach:

. . . we can learn a lot by starting to think about society in a mathematical way. Considered mathematically a society is not an assemblage of things but an assemblage of variables (1961:7).

My problem is simple. How can a modern social anthropologist . . . embark upon generalization with any hope of arriving at a satisfying conclusion? My answer is quite simple too: it is this: By thinking of the organizational ideas that are present in any society as constituting a mathematical pattern (1961:2).

And further, Sherif and Sherif, implying a level of abstraction not at all adverse to the "mathematical pattern" of Leach, state that the "sociocultural level of analysis takes as its units the human group, institutions, kinship systems, systems of production and distribution and cultural values" and that "it is not imperative that the behavior of particular individuals involved in them be included in the analysis" (1969:10).

Additional support comes from Herskovits who wrote:

There is little doubt that culture can be studied without taking human beings into account. Most of the older ethnographies, descriptions of the ways of life of given peoples, are written solely in terms of institutions. Most diffusion studies--those that give the geographic spread of a given element in culture--are presented without any mention of the individuals who use the objects, or observe given customs. It would be difficult even for the most psychologically oriented student of human behavior to deny the value of such research. It is essential that the structure of culture be understood, first of all, if the reasons why a people behave as they do are to be grasped; unless the structure of custom is taken fully into account, behavior will be meaningless (1967:21).

The above points are well taken, but a major problem in eliciting "the structure of culture" is doing so from data as nearly error free as possible. Such data can often be found in census records. The division of mathematics that would seem appropriate for the evaluation of census data is statistics. Although Harold Driver has pointed out the general limitations of the statistical approach (1961:310-311), it would seem that these limitations do not pose a problem for census data evaluation as the data tends to be highly quantitative.

To be sure, statistics is not the only mathematical approach to anthropology. This has been aptly demonstrated by such works as Fredrik Barth's "Segmentary Opposition and the Theory of Games: A Study of Pathan Organization" and Harrison White's Anatomy of Kinship: Mathematical Models for Structures of Cumulated Roles, which use game theory and matrix algebra respectively.

The nature of census data is that it tends to be troublesome at the point of evaluation due to the sheer bulk of facts. As Simmons (1967:245) and others have pointed out, the statistical approach can prove to be an effective instrument for the integration of masses of facts into mosaics of apparent order and sense.

At this point it might be well to emphasize that the function of the application of statistics to census data is not to derive the 'truth' but to assist in the development of a model which reflects the actual frequency of events. The need for such models is clearly expressed by Levi-Strauss (1963:283). Marvin Harris has succinctly summed up this position as follows:

Insofar as quantification and science have displayed a conspicuously beneficial partnership in all the sister sciences, it is a mystification to urge the superiority [as Hugo Nutini does (1968:375-379)] of nonstatistically-based models for anthropology. This is not to say that science and quantification are synonymous, but rather that if we proceed in ethnography to formulate models which lack statistical authority, we do so out of humility, hoping some day to possess the research facilities for correcting our lack of data (1968:499).

Computer Applications

It is important to note the distinction between the 'contemplative' and 'calculative' aspects of mathematical propositions. Although the latter may be regarded as a function of the computer, the former is certainly the duty of the ethnographer. Computer processing may establish the existence of a relationship between two variables to any degree of precision required, but it cannot judge whether or not the relationship is specious or what the cause may be. The function of the computer is, therefore, primarily one of examination rather than evaluation.

The application of the unique properties of computers to anthropological data is, of course, not new. A review of the literature shows a very definite trend in this direction. Computers are used as well in archeology, particularly in the statistical computations involved in factor and cluster analysis. Also, the use of the computer by the Human Relations Area Files in the field of cross-cultural studies is near legendary.

The Problem Areas

The intent of this study is to provide a series of suggestions for dealing with essentially unordered data of the nature found in a census. To this end, specific areas of inquiry were formulated. Could modern data processing equipment be used to effectively assist the ethnographer in ordering his data such that mathematical generalizations may be forthcoming? And then, could these resultant mathematical perceptions provide support for theoretical assumptions and/or check for accuracy and completeness of data? And further, even though census data is obtained by interviewers using standardized questionnaires, this data may be as subject to the problem of distortion as is other ethnographic data. If this is the case, would it be possible, by translating this basic data into mathematical symbolism, to perceive any emic responses and correct for them?

Much of the emphasis in the literature pertaining to this topic is in the direction of advanced methods which assume a sophistication in higher mathematics and programming, knowledge that few anthropologists possess. There is a definite need for demonstrating that such extensive specialized knowledge is unnecessary for certain types of computer evaluation. Pre-packaged subroutines that are "clean" (free of error) are readily available and minimize the need for a professional programmer. Programming is a technical skill which is acquired much more by actual practice than intensive study.

Given that a set of census data is available, that a mathematical treatment has been deemed appropriate and that computer facilities are available, the question then becomes one of exactly what to do and how to do it.

An important point to remember is that the translation of data into a numerical code must be based upon prior decisions as to exactly what is desired. It should be noted that some data may seem far too mundane to assist in the understanding of complex social interaction, but this is not necessarily the case. Even simple coded variables will often exhibit relationships that provide an understanding of the workings of the society from which they came.

The statistical methodology requires that samples be considered representative of the population from which they are taken. The point of this assertion is that it allows the construction of statistical models which reflect the nature of the population without the expense of arriving at a sample size equal to the population.

Since the goal of the ethnographer is to use census data to reach an understanding of the basic descriptive parameters of the given population, it is necessary to precisely define as many factors that could influence that decision as possible.

The actual conceptualization of a model building process is often difficult, and it is an even greater problem to effectively apply the model to gain insight into actual situations and to predict future situations. If the model construct is to have legitimate applicability to social situations, it necessarily should be based on clearly stated mathematical generalizations formulated from clearly demonstrated mathematical relationships. No attempt is made here to actually construct a statistical model; the aim is rather to demonstrate a method for formulating empirically based mathematical generalizations from an anthropological census via data processing.

CHAPTER II

THE SETTING

The source of the census data used for this study is Rankhandi, a North Indian village, situated 86 miles north of New Delhi in Uttar Pradesh, the most populous state of India (Figs. 2-1, 2-2). It lies on a flat alluvial plain midway between the Jumna and Ganges rivers.

A 1921 District Gazetteer of the United Provinces provides a description that is still largely adequate:

This very large village lies on the S borders of the pargana, in 29° 32' N and 77° 40' E at a distance of 6 miles S from Deoband and 26 miles from the district headquarters. There is no regular road, but rough tracks lead to Deoband and to Ambahta Sheikha, the latter crossing the canal about two miles to the west. The main site is built on the right bank of the small stream called the Imlia, which carries much of the Deoband drainage and is apt to damage the lower lands in its vicinity. Along its bank the land is poor, but elsewhere it is well cultivated and possesses an ample supply of irrigation from the canal and wells, a small branch from the former transversing the centre of the village. The total area is 2,376 acres including 54 acres under groves, the revenue is Rs 4,000; the proprietors are a bhaiyachara community of Pundir Rajputs. The population at the last census was 3,710 of whom 317 were Musalmans; the total including that of the hamlet Kheri Jhunka to the north of the principal site. Rankhandi possesses a large upper primary school, but nothing else of interest except the tomb of a Musalman saint Shah Asim-ud-din, to the west of the village (Nevill:308).

[Fig. 2-1. India. Map showing New Delhi, Rankhandi and state boundaries.]

[Fig. 2-2. Saharanpur and Muzaffarnagar Districts. Map showing district boundaries, roads and railroads, with Meerut labeled and a scale in miles.]

Transportation facilities, by Indian standards, are adequate. The area is serviced by three railroads, and within two miles of Rankhandi is an all-weather, hard-surface road on which bus service is available.

According to the 1951 census, the area of Saharanpur District was 2,132 square miles with a population of 1,353,636, a population density of 635 per square mile. There are four cities with a population of more than 100,000 within ninety miles of Rankhandi. All may be reached in less than four hours.

There is no heavy industry in either Saharanpur District or in Muzaffarnagar District immediately to the south. Rankhandi lies nearly on the border of the two districts, and this is reflected in the data, as many wives are received and given between Rankhandi and villages in Muzaffarnagar District.

A subdivision of the district, known as a tahsil, is the smallest administrative unit above the village. Rankhandi is located in Deoband tahsil, which encompasses more than a hundred villages in the south-central part of Saharanpur District. The headquarters and police station for the tahsil are situated in the town of Deoband, a major market for the village, six miles north of Rankhandi.

Being just fifty miles northeast of Panipat, a famous battleground for successive invasions of India, Rankhandi is located in an area with a long and stormy history; its settlement, then, by the Rajput warrior caste in the 16th century is understandable. Even today Rajputs constitute 42% of the population and own 90% of the village land. The sheer size of this caste prevented its inclusion in the data, and it is this fact which accounts for the low frequency of land holdings in the data outputs.

The presence of Muslim castes in the data evaluation procedure requires some explanation. Approximately one third of the population of Saharanpur District is Muslim. Although this percentage is not reflected in Rankhandi, which is only about 8% Muslim, the influence of Islam on the architecture, dress and medicine practiced is very much in evidence. The largest and oldest religious structure in Rankhandi is the tomb of a Muslim Pir, whose supernatural assistance is sought by Muslims and Hindus alike. Many Hindus in Rankhandi, especially wealthier members of the upper castes, follow the Muslim practice of the seclusion of women.

Within Rankhandi itself more than 5,000 people live in an area approximately one-half by three-quarters of a mile. The caste composition of the village is relatively stable over time. Documents from 1867 indicate approximately the same caste composition as more recent studies. Mahar states that the

. . . large size of Rankhandi does not appear to have affected its essentially village nature. Agriculture is still the basis of the economy, and the village does not fit into the transportation nets or marketing systems characteristic of urban centers. The size of the village is especially useful for the purposes of this study as it is large enough to support a full complement of the various artisan and service castes found in the area (1966:60).

A population distribution of the castes in Rankhandi is shown in Table 2-1, complete with an explanation of each caste's primary and secondary occupations.

TABLE 2-1

THE POPULATION OF RANKHANDI, 1954

Caste Name    Occupation (Primary)    Occupation (Other)    Population

1 Bhaat

2. BhaT .3. Bhangii4. BhaRbujjaa5. Brahmin

6. Chamar

*7. Darzii8. Dhiimaan

brahmin9. Dhoobii10. GaRariiaa

11. Goosaaii12. JaTiiaa

chamar

13. Jhiiuaar14. Joogii

15. Julaahaa16. Julaahaa

(Chamar)

Share-croppers

No informationSweepersGrain-parchersAgriculturalists , mostly sharecroppers

Laborers, sharecroppers, house constructors

TailorsCarpenters or blacksmiths

WashermenShare-croppers

Share-croppersShoe-makers

Water-carriersAgriculturalists mostly sharecroppers

AgriculturalistsWeavers

Ceremonialattendant

LaborersShare-croppersOfficiate at ceremonies, shop- keepers

Agriculturalists

Masons

Cattle-herders, laborers

Receiver of almsAgricultural laborers , especially at harvest

LaborersReceiver of alms

Share-croppers, laborers

121259

282

620

36186

34196

3399

15851

5094


17. Julaahaa(Koli)

18. Khatrii

19. Kumhaar

*20. Miraassii

21, Naaii

*22. Niilgar23. Puj aaglr

24. Raj put

*25. Rajput *26. Saaii

27, SiaaNii *28. Seekh

29. Soonaar

*30. Teelii

Laborers, sharecroppersGirl’s school teacher (1)Potters

Singers, musiciansBarbers

Share-croppersLaborers

Agriculturalists, landlords

AgriculturalistsAgriculturalists

Share-croppersTeacher in local high schoolGoldsmiths

Share-croppers

Share-croppers, laborers at harvest timeLaborers, tailors

Share-croppers, laborers

LaborersShare-croppers, two are village doctors

Shopkeepersteachers

Caretakers of the Piir1s tomb

Laborers

One teacher in boy's schoolOil pressers, laborers at harvest time

44

8

74

41

78

3725

2164

958

574

70

144

31. Vaish    Business, shopkeepers    Agriculturalists    154

*Muslim Castes

CHAPTER III

DEVELOPMENT OF THE CODE

As every field worker knows, a fundamental problem lies in data gathering as well as in data evaluation; perhaps surprisingly, the computer may be fruitfully used in this area as well. Certain modular components associated with computers may provide an ethnographer with an opportunity to methodically sort and list data according to whatever variables are deemed appropriate. The facilitation of this simply requires the development of an internally consistent code that may readily be used to store the basic data on punched cards of the IBM variety. This is no doubt the area requiring the greatest care, since a poorly developed code can hardly fail to adversely affect the data manipulations.

Computer evaluation of census data requires that the data be coded in a manner amenable to punching on 80-column cards for machine manipulation. This coding may be alphabetic, numeric or both. Four columns using only numeric designations have 10,000 possible entries. By converting one of those columns to alphabetic characters the number of possible entries increases to 26,000, and by converting two columns to alphabetic designations the total is increased to 67,600. It can easily be seen that the use of alphabetic characters greatly increases the possible entries in any series of columns.
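The column arithmetic above can be checked with a short sketch. It assumes that a numeric column holds one of ten digits and an alphabetic column one of twenty-six letters; the function name is illustrative and is not taken from the thesis programs.

```python
# Count the distinct entries that a run of card columns can hold.
# Assumption: 10 symbols per numeric column, 26 per alphabetic column.

def possible_entries(numeric_columns: int, alphabetic_columns: int = 0) -> int:
    """Return the number of distinct codes for the given mix of columns."""
    if numeric_columns < 0 or alphabetic_columns < 0:
        raise ValueError("column counts must be non-negative")
    return (10 ** numeric_columns) * (26 ** alphabetic_columns)


# Four columns in total, with zero, one and two of them alphabetic.
for letters in (0, 1, 2):
    print(letters, possible_entries(4 - letters, letters))
```

Each column converted from digits to letters multiplies the capacity of the field by 2.6.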

If carefully prepared, a code may allow one column or set of columns to serve more than one purpose, as in the case of the village codes which give both a unique code for each village and its mileage from Rankhandi and geographical location as well. Whenever possible a code should be open-ended; that is, allow for the addition of new information without voiding previously recorded data. For example, the decision to use the Indian Census Bureau's occupation codes was based upon its internal expansive properties, which will be dealt with in more detail later in this chapter. It is important to remember that coding different things or people inevitably limits some of the information you have about them. Coding is similar to converting a color photograph into black and white; the detail may not be lost but some of the perspective might be. A crucial factor in developing a code that is to be punched for machine manipulation is that the card itself constitutes a fixed field; therefore, a general idea of the range of any particular variable is necessary to prevent either using too little or too much space on the card.
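As a rough illustration of sizing a fixed field from the range of a variable, the sketch below derives a column width from the largest value expected and checks that a set of widths fits on an 80-column card. The helper names and the sample widths are assumptions made for illustration; the actual card layout is the one given in Appendix C.

```python
# Size numeric fields from their expected range and check the card total.
CARD_COLUMNS = 80  # width of the punched card described in the text


def columns_needed(largest_value: int) -> int:
    """Columns required to punch any whole number from 0 to largest_value."""
    if largest_value < 0:
        raise ValueError("largest_value must be non-negative")
    return len(str(largest_value))


def fits_on_card(widths) -> bool:
    """True when the combined field widths do not exceed one card."""
    return sum(widths) <= CARD_COLUMNS


# Illustrative fields only: caste numbers run to 31, generations to 9.
print(columns_needed(31), columns_needed(9))
print(fits_on_card([16, columns_needed(31), 1, columns_needed(9)]))
```

Estimating the range first avoids both truncated values and wasted columns.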

Examination of Data

Prior to the actual examination of the raw data, it is important to consider the reason for its collection in the first place. This will provide an opportunity to detect sample bias, of which we may consider two kinds: one, bias which affected the data that was retained in order to conform to an already existing hypothesis, while other data without "goodness of fit" to the hypothesis was deleted; and two, bias that is the result of the investigator's major concern with but one aspect of the problem area. If either of these types of bias is discovered, the data must be rejected or techniques must be used to minimize the resultant distortion. This may include further data gathering in the field or cross-checking other field work done in the same geographic area or theoretical orientation.

A questionnaire was administered for the collection of the basic census data that was used for this evaluation (see Appendix A). The data available was collected in both 1954 and 1969, providing an opportunity for the examination of variables over time. An exact duplicate of Khandan (lineage) G of the Jatiya Chamar caste as listed in 1954 and 1969 is shown in Appendix B.

A review of the data revealed several categories that would be readily amenable to coding. The lineage lists are already ordered in an approximately chronological manner with the oldest known lineage member listed first. Each individual born in the lineage was assigned a number and wives of lineage members were designated alphabetically. Also the age of each individual, when known, was given. If the village of origin of a wife was known it was given also, as well as the husband's village of females born into a Rankhandi lineage. In a section separate from the lineage list additional data was given, such as occupation, education level and change of residence.

It should be noted that since the data was collected for a different purpose than this evaluation, there was a considerable difference in the amount of attention paid to particular questions; question #8 (occupation) received much attention, whereas question #15 (education) was commonly not even asked. As a result, there existed a high degree of variation in availability of data. As well, the 1954 data was considerably more detailed than the 1969 data. This made legitimate comparisons difficult.

Selection of Categories

Categories were selected on the basis of two factors: ease of coding and availability. It was found that each individual could be given an identification number composed of coded categories elicited directly from the lineage lists. This procedure allowed for a duplication of the lineage lists directly from the identification number, which was created from the following categories: caste, khandan (lineage), generation, father's birth sequence and the individual's birth sequence. The final two digits in the identification number served to distinguish sex, as well as whether, in the case of females, the individual was born into the lineage or married into it. Each of these cases was assigned a code number and in the case of males the male birth order was used. Additional categories based on availability were: age, education, occupation, villages of affinal interaction, land ownership and residence.

Category Coding

Name

Sixteen columns were designated for the name, which was typed as shown in the data. In the event of variation between the 1954 and 1969 sources, 1969 spellings were used.

Caste

Castes were listed alphabetically and then assigned numbers from 01 to 31.

Khandan

The lineages had already been given alphabetic designations in the data; these were retained. X, Y and Z were used to denote entire lineages moving into Rankhandi as a group.

Generation

Numbers one through nine were used to designate appropriate generations. These were shown as indentations in the original data (refer to Appendix B).

Father's Birth Sequence

Numbers one through nine were used to show the birth order of the individual's father; a zero indicates that the individual is the first listed in the lineage. One purpose of this category was to minimize the chance that collateral relatives of the same sex would be coded with the same identification number.

Individual's Birth Sequence

Two columns were allowed to designate the birth order of the individual being coded.

Sex/Family Position

Two columns were used to denote both the male birth order and whether a female was married into the lineage or born into it.
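The identification number described under Selection of Categories can be sketched as a simple concatenation of the fields defined so far. The sketch below is not the thesis program: the class and function names are hypothetical, the field order follows the order in which the text lists the categories, and the sex/family position value is a placeholder because the actual codes appear only in Appendix C.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageMember:
    caste: int          # 1-31, castes numbered in alphabetical order
    khandan: str        # single lineage letter as given in the data
    generation: int     # 1-9
    father_seq: int     # 0-9; 0 marks the first person listed in the lineage
    birth_seq: int      # individual's birth order, two columns
    sex_position: int   # two columns; real values are defined in Appendix C


def identification_number(m: LineageMember) -> str:
    """Join the coded categories in the order the text lists them."""
    checks = [
        1 <= m.caste <= 31,
        len(m.khandan) == 1 and m.khandan.isalpha(),
        1 <= m.generation <= 9,
        0 <= m.father_seq <= 9,
        0 <= m.birth_seq <= 99,
        0 <= m.sex_position <= 99,
    ]
    if not all(checks):
        raise ValueError(f"field out of range: {m}")
    return (
        f"{m.caste:02d}{m.khandan.upper()}{m.generation}"
        f"{m.father_seq}{m.birth_seq:02d}{m.sex_position:02d}"
    )


# Hypothetical member of khandan G in caste 12; the trailing 01 is a placeholder.
example = LineageMember(caste=12, khandan="G", generation=3,
                        father_seq=2, birth_seq=1, sex_position=1)
print(identification_number(example))
```

Because every field has a fixed width, the nine-character result can be split back into its categories by position, which is what allows the lineage lists to be regenerated from the identification numbers alone.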

Age

This was coded for both 1954 and 1969 as shown in the data without regard for obvious discrepancies.

Education Level

This also was coded as stated in the data without regard for discrepancies between the 1954 and 1969 data. No distinctions were made between the various types of religious instruction and all such information was coded the same.

Residence/Job Locale

This was formulated as a one-column, dual-purpose code, allowing the code to reflect both residence and location of employment.

Occupation

This category was listed as six columns so that both 1954 and 1969 data could be coded using the Indian Census Bureau's system. This code has internal expansive properties that make it particularly useful in such cases as the present one, where the codes were being developed without a thorough examination of all the data that might be coded. The expansive properties of this code are the result of distinguishing general divisions, major categories and minor categories. The code is three columns with the left column indicating a general division, such as agriculture (100); the second digit indicates a major category such as field crops (110); and the third digit is used for specific jobs, such as grain crops (111). Few additions were necessary for the utilization of this code, the most notable being military service and students.
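The three-level structure of the occupation code can be illustrated with a small sketch that recovers the general division and major category from a specific three-digit code. The function name is an assumption, and the only code values used are the three quoted in the text.

```python
from typing import Tuple


def occupation_levels(code: str) -> Tuple[str, str, str]:
    """Return (general division, major category, specific job) for a code."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError("occupation code must be exactly three digits")
    division = code[0] + "00"   # left column only, e.g. 100 agriculture
    major = code[:2] + "0"      # first two columns, e.g. 110 field crops
    return division, major, code


# Grain crops (111) belongs to field crops (110) within agriculture (100).
print(occupation_levels("111"))
```

Since the broader levels are prefixes of the specific code, a new job can be added under an existing category without disturbing codes already punched, which is the open-ended property the text relies on.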

Villages

Village codes were devised by assigning an alphabetic designation to compass directions within a tolerance of 22 1/2° and a numeric value corresponding to the distance from Rankhandi along that compass direction in miles. No finer tolerance was used in regard to compass direction because it was unnecessary for the purpose of finding the village on existing maps. In those cases where two villages shared the same alphabetic and numeric codes, the numeric codes were differentiated by either subtracting 1 or adding 1 to the given mileage; for example, Babasi, Saharanpur and Berakheri, Saharanpur were listed in the data with exactly the same compass direction and mileage; the former was coded A017 and the latter A018. Further, in those cases where a cluster of villages caused this procedure to be insufficient, the first numeric digit in the village code was assigned a 9, since no villages exceeded 100 miles in distance from Rankhandi; for example, Talheerii, Saharanpur and Amboli, Saharanpur are both 10 miles from Rankhandi in a northerly direction; the former was coded E010 and the latter E910. Neither of these procedures hindered the secondary function of the code, which was to provide a guide for location on maps.
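The collision rules for village codes can be expressed as a short routine. The sketch below is an assumption-laden illustration, not the procedure actually used in the study: the function name `assign_village_code` is hypothetical, the direction letter is taken as given (the letter assignments themselves are in Appendix C), and the order in which "add 1" and "subtract 1" are tried is a guess.

```python
def assign_village_code(direction: str, miles: int, used: set) -> str:
    """Return an unused code: direction letter plus three digits."""
    if not 0 <= miles < 100:
        raise ValueError("mileage is assumed to be below 100")
    # Try the stated mileage first, then one mile more, then one mile less.
    for candidate in (miles, miles + 1, miles - 1):
        if 0 <= candidate < 100:
            code = f"{direction}{candidate:03d}"
            if code not in used:
                used.add(code)
                return code
    # Cluster case: mark the first numeric digit with a 9.
    code = f"{direction}9{miles:02d}"
    if code in used:
        raise ValueError(f"no free code left for {direction} at {miles} miles")
    used.add(code)
    return code


used_codes = set()
first = assign_village_code("A", 17, used_codes)
second = assign_village_code("A", 17, used_codes)
clustered = assign_village_code("E", 10, {"E009", "E010", "E011"})
print(first, second, clustered)
```

The leading 9 is safe as a flag only because no village lies 100 miles or more from Rankhandi, so a genuine mileage can never begin with 9 in the first of three digits.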

Land

This category was coded numerically as listed in the data in kaccha bighas. Although few cases of joint tenancy were encountered in the data selected for evaluation, a system to deal with this problem was developed only after the coding was completed.

Details regarding the alphabetic and numeric codes used in each category may be found in Appendix C.

CHAPTER IV

SYSTEM DEVELOPMENT AND DESIGN

Whenever a computerized system of manipulating information for purposes of storing, analyzing and reporting data is incorporated, an overall description of the way the system is developed and utilized is necessary in order that individuals not directly involved may understand the functioning of the system.

There are usually three types of documentation used toward this end: one, a flow diagram with a written description of the system as a whole; two, an analysis of the data used; and three, flow diagrams and descriptions of the computer programs written for the system.

A flow diagram, in the purest sense, is a pictorial description of the steps involved in a system or program, using symbols to show the type of step--that is, preparation, keypunching, decisions, listing and/or the medium used, such as keypunched card or magnetic tape storage--and connecting arrows to indicate the direction or "flow" from one step to the next. These diagrams are intended to give the reader only a quick, general view of the system or program and involve little detail. The detail is given in a verbal description of the "flow" diagram, which goes into the specifics of each step comprising the diagram.

The analysis of the data includes how the data or source information was coded for computer use (input) and the final form of the printed sheets (output).

Information of this order is provided in this chapter. Included is a description of the flow and functioning of the system, coding of the input data, format and uses of the output reports, and generalized flow descriptions of the computer programs. These programs are used to develop, store and analyze the data used.

System Flow: Explanation

Preparation of Data

The information is obtained directly from the original source documents. It is then translated into predefined codes for computer use (see Appendix C). The coded data is transferred to 80-column punched cards.

Input of Data to Update and List Program

The punched cards will be used as direct input to the main data base program. This program reads and edits the cards and then adds the data from the cards to the master tape file.

Program Output

Output from the basic program includes the updated master tape, which will later be the source of input for the statistical analysis programs, a listing of the data and a data shortage listing.

Statistical Analysis Programs

Various statistical manipulations are performed on the data from the tape and printed results are obtained. See Figure 4-1 for a flow chart illustrating the system design used to arrive at the above mentioned outputs, as well as the statistical results.

System Flow: Input Requirements

Input requirements for the system include all the codes and their locations on the punched cards and/or magnetic tape. Refer to Appendix C for detailed tables of the codes.

[Flow diagram. Recoverable box labels: Census Data Source Information; Coded Data; Keypunched Data Cards; Master Tape; 'RANKDAB' Program; Updated Master; Data Listing; Data Shortage Report; Desired Statistical Programs; Result Reports]

Fig. 4-1. Rankhandi Data Base System Flow Diagram

Program Explanation

The basic program written for this study, identified as 'RANKDAB' (RANKHANDI DATA BASE), determines the form the data will assume on the printed pages; there is no analysis whatsoever involved in this stage. The data is simply being organized, which expedites testing for certain types of errors. For example, columns 27 and 28 should contain the 1954 age data code for every individual. If the keypunch operator accidentally punches this information in columns 26 and 27, such an error would be detected by this program. It will not, however, uncover such errors as a 27-28 column punch of '68' instead of '65'. These types of errors must be detected by visual checks.
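The difference between a positional error and a value error can be made concrete with a small check on a card image. This is a hedged sketch, not the RANKDAB edit logic (which is not reproduced in the text): `make_card` and `age_1954_is_plausible` are hypothetical helpers, and the sketch assumes the rest of the card is blank so that a shifted punch leaves column 28 empty.

```python
CARD_WIDTH = 80


def make_card(start_column: int, value: str) -> str:
    """Build a blank 80-column card image with `value` punched from `start_column`."""
    card = [" "] * CARD_WIDTH
    first = start_column - 1            # card columns are numbered from 1
    card[first:first + len(value)] = value
    return "".join(card)


def age_1954_is_plausible(card: str) -> bool:
    """True when columns 27 and 28 both contain a digit."""
    field = card.ljust(CARD_WIDTH)[26:28]
    return field.isdigit()


correct = make_card(27, "65")       # age punched where it belongs
shifted = make_card(26, "65")       # punched one column too far left
wrong_value = make_card(27, "68")   # right columns, wrong age

print(age_1954_is_plausible(correct))      # the field is well formed
print(age_1954_is_plausible(shifted))      # column 28 is blank, so the check fails
print(age_1954_is_plausible(wrong_value))  # well formed, so the error passes unnoticed
```

A format check of this kind can only say whether the field is well formed; it has no way of knowing that '68' should have been '65', which is why the text relies on visual checks for such cases.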

This program directs the printer to begin printing a new page whenever a lineage is completed. In the case of the data shortage outputs the program allows the entire page to be built in the core memory unit of the computer before it is printed. Without this possibility, the format of the shortage output sheets would be impossible, since the printer unit prints an entire line from left to right at once and after printing it never returns to add additional information (see Table 4-1).

This basic program also allows information to be coded and punched in certain columns on the punched cards and printed out elsewhere on the page. For example, the village location codes for women are punched in columns 43-46 inclusive and overlap the range defined for 1954 and 1969 occupations (columns 41-46), but they are printed in a section of the print-out sheet devoted exclusively to village listings. Repositioned village codes on the output sheets are shown in Table 4-2. Clearly then, the basic program serves aesthetic and organizational functions rather than analytical ones. See Figure 4-2 for a flow chart of this program.

TABLE 4-1

EXAMPLE OF DATA SHORTAGE OUTPUT

Dhoobii Khandan A

Name           ID         Seq  Data missing
Bhagwati       09A410155  026  Age 1954; Education Level 1954; Education Level 1969
Munni          09A510222  028  Education Level 1969; Occupation 1969
Chandhra Kala  09A410222  029  Age 1954; Education Level 1954; Education Level 1969
Barruu         09A410302  030  Education Level 1969; Occupation 1954
Rameshon       09A410355  031  Age 1954; Education Level 1954; Education Level 1969
Janeeshwar     09A410403  034  Education Level 1969
Chandra Vati   09A410455  035  Age 1954; Education Level 1954; Education Level 1969
Uutri          09A540122  036  Education Level 1969; Occupation 1969
Rameshwar      09A410504  037  Education Level 1954; Occupation 1954
Ramkali        09A410555  038  Age 1954; Education Level 1954; Education Level 1969
Rajkumari      09A410622  039  Education Level 1954; Education Level 1969; Residence; Occupation 1954; Occupation 1969
Mangal         09A200404  040  Education Level 1969

TABLE 4-2

EXAMPLE OF BASIC DATA OUTPUT

Brahmin Khandan X

Seq  Name             ID         Age 54  Age 69  Ed 54  Ed 69  R  Occ 54  Occ 69  WV     Land  Remarks
001  Moola Br of 043  05X100000  00      99      00     00     8  888     888
002  Manbhi           05X100055  00      50      00     00     4                  J004
003  Musaddi          05X200101  99      99      99     99     8  888     888
004  Bishambari       05X200155  00      99      00     00     8                  P026
005  Radhey Shiam     05X310101  00      28      00     01     4  000     110
006  Saroj            05X310155  00      25      00     00     4                  IIOIO
007  Anil             05X410101  99      04      99     99     1  888     888
008  Anita            05X410222  99      02      99     99     1  888     888
009  Raj Bal          05X310202  00      25      00     08     5  723     712
010  Babu             05X200202  00      36      00     04     1  110     110
011  Chaman Dei       05X200255  00      99      00     00     8                  N013
012  Koshalya         05X200255  00      35      00     00     1                  N013

Heading Abbreviations: Seq = Sequence Number; ID = Identification Number; Age 54 = Age in 1954; Age 69 = Age in 1969; Ed 54 = Education in 1954; Ed 69 = Education in 1969; R = Residence/Job Locale; Occ 54 = Occupation in 1954; Occ 69 = Occupation in 1969; WV = Villages of Affinal Interaction
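The way one range of card columns can serve two purposes may be easier to see in code. The following is a minimal sketch under stated assumptions: the split of columns 41-46 into two three-column occupation codes (41-43 for 1954, 44-46 for 1969) is inferred from the three-column occupation code, the `is_woman` flag stands in for whatever test the program actually applied, and both function names are hypothetical.

```python
def read_columns(card: str, first: int, last: int) -> str:
    """Return card columns first..last, numbered from 1 and inclusive."""
    return card[first - 1:last]


def unpack_columns_41_46(card: str, is_woman: bool) -> dict:
    """Interpret the shared column range as a village code or as two occupations."""
    if is_woman:
        # Village code: one letter plus three digits in columns 43-46.
        return {"village": read_columns(card, 43, 46)}
    return {
        "occupation_1954": read_columns(card, 41, 43),
        "occupation_1969": read_columns(card, 44, 46),
    }


man_card = " " * 40 + "110111" + " " * 34     # hypothetical card for a man
woman_card = " " * 42 + "J004" + " " * 34     # hypothetical card for a woman

print(unpack_columns_41_46(man_card, is_woman=False))
print(unpack_columns_41_46(woman_card, is_woman=True))
```

Since where a field is read from and where it is printed are independent decisions, the village code can be printed in its own section of the output sheet even though it shares card columns with the occupation codes.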

[Flow chart. Recoverable step labels: Accept Procedure Codes From Console; Read/Write Tape or Both; Open Card; Read Card; Write Tape From Card; Close Card & Tape Files; Shortage Pages; Read Tape; Write Page Numbers & New Lineage Headers; Print; Print Lines of Data Shortage, Each Field; Write Headers]

Fig. 4-2. Basic Program Flow Chart

CHAPTER V

STATISTICAL TESTING AND RELATED CONSIDERATIONS

A statistical test is a method for testing a given hypothesis about the distribution configuration of a population from which a sample is assumed to have been drawn. A statistical statement is an expectation of the above procedure. The use of statistical statements in the social sciences is mostly of the type known as conditional probability statements (Simon 1969:366). All hypothesis-testing statistics are conditional probabilities. The typical conditional probability question is, "What is the probability of obtaining this sample S if the sample were taken randomly from universe A?" Implicit in the statement of any general relationship reached through the examination of a number of cases by whatever method is the contention that the relationship observed did not occur by chance. The use of statistical procedures does nothing more than make this contention explicit and in itself introduces no additional problems or assumptions provided appropriate techniques are employed.

Hypothesis Development

A hypothesis is a single statement that attempts to explain or predict a single phenomenon. It is an empirically testable statement arrived at through the process of deduction.

The following paradigm is a general scheme for hypothesis development:

[Diagram: Problem and/or Facts -> Assumptions or Axioms -> Deduction -> Hypothesis; Theory]

Axiom, Theory and Hypothesis

Conventional representation of social science data in mathematical formulation normally requires the development of axioms, that is, statements that are not directly provable but are assumed true. In order for these axioms to be generally acceptable, they should be propositions that involve variables that are taken to be directly linked causally. Axioms are not self-evident truths but are rather arbitrary rules subject only to the requirement of consistency. It is assumed that the axioms are reflective of the investigator's perception of the population.

Theory is probably one of the most misused words in research; few statements called a "theory" actually are so. If there are well established assumptions in a field, and if there is an apparatus that permits deduction, then there exists a body of theory. The theory must cover a substantial portion of the material in a field; and it must be systematically organized. Another requirement for a theory is that essentially the same assumption must underlie many of the problems in a field. A theory is an entire system of thought that refers to many phenomena and whose parts can be related to one another in deductive logical form. That is, as Simon states, "One could work out a set of assumptions [or axioms] from which one could deduce any single hypothesis. But such a set of assumptions would not be enough; a set of assumptions must underpin not just the one hypothesis but many other hypotheses also" (Simon, 1968:37).

In the development of a hypothesis, shown in the accompanying paradigm, the assumption, or axiom, is deduced from the problem and/or fact and is assumed to be true though not directly provable. A deduction is arrived at through a logical process from the assumption and concerns itself with a relationship between the problem and/or fact and the axiom. A deduction is translated into a hypothesis which is empirically testable. A deduction, too, is testable in the framework of the theory when a body of theory is available. In addition to deriving hypotheses from deductions, hypotheses may also be deduced from theory; as well, hypotheses may also be derived directly from observation and unformalized intuition.

Following the testing of a hypothesis, data is often found to be in error when it disagrees with the hypothesis. If the data is rechecked when it disagrees with the hypothesis and not rechecked when it does agree, a situation is created in favor of getting data that agrees with the hypothesis. The only solution possible is to always recheck the data.

The Null Hypothesis

The concept of the null hypothesis is difficult to grasp. Fisher (1935:18-19) is a useful reference and provides an extensive discussion of this concept.

The logical structure of this research derives its character from the classical system of analysis which sets up as null hypotheses conclusions that the investigator hopes to disprove. Mathematically, it is predicated on the assumption that statistics describe random variables in a sample as equivalent to corresponding parameters describing random variables in the population from which the sample was taken. When such a situation occurs, the null hypothesis is accepted. When it does not and the divergence is too large to be caused by random error in sampling, the null hypothesis is rejected.

It is possible to state mathematical formulations in behavioral terms, although few mathematicians would consent. For example, the above hypothesis could be stated thus: there exists an equivalence of behavior within a universe that is reflected in the behavior of random samples taken from that universe. Equivalence of behavior means that predictions about the population may be made from the sample.

Hopefully, the "fair coin" example will demonstrate this assumption. The null hypothesis is that two coins come from the same universe (the universe of "fair coins") and that therefore, they will behave the same. If the two coins are tossed one hundred times each and one of them comes up heads fifty-five times and the other comes up heads one hundred times, we reject the null hypothesis and assert that one of the coins does not come from the universe of "fair coins." For purposes of consistency the null hypothesis is expressed as an assumption that the universe is the same for whatever variables are under consideration.
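The coin example can be put in numbers by asking how likely each result would be if the coin really were fair. The sketch below uses the binomial distribution, which is a standard choice but is not named in the text; the function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(heads: int, tosses: int) -> float:
    """Probability of `heads` or more heads in `tosses` tosses of a fair coin."""
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2 ** tosses


p_55 = prob_at_least(55, 100)    # the coin that came up heads 55 times
p_100 = prob_at_least(100, 100)  # the coin that came up heads 100 times

print(f"55 or more heads:  {p_55:.4f}")
print(f"100 heads:         {p_100:.2e}")
```

Fifty-five or more heads is an unremarkable outcome for a fair coin, while one hundred heads in a row is so improbable under the null hypothesis that rejecting it is the only reasonable conclusion.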

In the analysis of systems (including social systems) the clear delineation of the universes and their behavior patterns is fundamental. In the present study the village of Rankhandi could be considered a universe, and certainly all samples under study have been drawn from it. But a series of statistical tests relating the null hypothesis to particular variables shows that, in terms of behavior, predictability is not possible. Castes were found to be much more satisfactory categories for prediction as were others such as sex. This is because the variables under consideration are not normally distributed at the village level. However, they are normally distributed at the caste level.

Sampling

The term sample derives its significance from other terms which support it operationally, such as random and universe or population.

Sample and universe are parallel to the concepts of subset and set in mathematics. This means that the former is some portion of the latter, perhaps even all of it, although such a complete sample is relatively rare. The reason for sampling is that generalizations about the universe may be made by an examination of the sample which is presumed to be the universe in microcosm. For this purpose, it is assumed that the most reliable model of the universe can be constructed from the sample if, and only if, the selection is random (by chance alone), sufficiently large (obviously a factor of the size of the universe) and the variables under consideration are normally distributed.

In the instance of the present study the sample is approximately 30% of the universe. Samples half that size are generally considered far more than necessary to insure accuracy of the proposed statistical model. The first requirement (that the sample be random) is not so easily satisfied. It was felt that since the entire village could not be coded for this study, it was necessary to utilize a selection method that would provide an opportunity to code and evaluate discrete social units. For this reason all available data on all castes selected for coding was utilized. No particular method was used for selecting castes. The third requirement that the variables under consideration be normally distributed was satisfied at the caste level but not at the village level.

Random Error

Random errors are those errors which are not made consistently, that is, the mean of the errors is equal to zero. These errors most often arise from clerical and primary data source errors and do not necessarily affect the final evaluation. If particular factors which would cause the errors to cluster (such as a keypunch operator who consistently punches 6's as 9's) are not uncovered, then any remaining errors can be viewed as random errors.

If the true correlation between variables is high, random errors will most likely tend to lower the correlation in the observations. This is true because if two variables are highly correlated, then there are many more favorable cases than unfavorable. Consequently, random errors are more likely to change a favorable case to an unfavorable one than vice versa. On the other hand, if two variables are unrelated there would be an equal number of favorable and unfavorable cases. In such a situation random error would select for distortion equally and the net result would be that the ratio of the distribution of events would not be changed.
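The argument about favorable and unfavorable cases can be checked with simple expected counts. The sketch below assumes a deliberately crude error model that is not stated in the text: every case, favorable or not, has the same probability `error_rate` of being flipped to the other kind. The function name and the counts are hypothetical.

```python
def expected_after_errors(favorable: float, unfavorable: float, error_rate: float):
    """Expected counts after each case is flipped with probability `error_rate`."""
    still_favorable = favorable * (1 - error_rate)   # favorable cases left untouched
    newly_favorable = unfavorable * error_rate       # unfavorable cases flipped
    new_favorable = still_favorable + newly_favorable
    new_unfavorable = favorable + unfavorable - new_favorable
    return new_favorable, new_unfavorable


# Strong relationship: 90 favorable cases against 10 unfavorable.
strong = expected_after_errors(90, 10, 0.10)

# No relationship: favorable and unfavorable cases are equally common.
unrelated = expected_after_errors(50, 50, 0.10)

print(strong)     # the favorable share shrinks
print(unrelated)  # the 50/50 split is unchanged
```

With a 10 percent error rate the strong relationship loses nine favorable cases and gains only one, while the unrelated pair loses and gains five each, which is the asymmetry the paragraph describes.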

In summary then, statistically significant correlations are observed in spite of random errors, not because of them. The effect on a hypothesis under investigation would be that if the data tends to indicate the existence of a relationship between two variables, there is a great likelihood that the true relationship is even stronger! Under such a circumstance, the more random error that could be demonstrated, the greater support for the validity of the hypothesis. In those cases where neither random nor non-random error is responsible for some observable difference between an expectation of the configuration of the data and the data itself, then some adjustments must be made. As was previously stated, data are often found to be in error when in disagreement with a hypothesis.

Some Statistical Tests

None of the statistical manipulations shown here require the use of the computer. In fact, it is seldom the mathematical procedures alone which require data processing equipment; rather, it is the amount of data under consideration that often dictates the use of machine manipulation.

The following summary of some basic statistical procedures is included to provide a foundation in the logical processes underlying each manipulation. The use of statistical evaluations via computer programming requires such a foundation. The intent is not mastery, but rather familiarization for the purpose of selecting appropriate methods. Statistical techniques can only be of value when the researcher understands which are appropriate for the data under consideration. As well, these procedures are basic prerequisites for the more sophisticated techniques. Since the actual computer subroutines are readily available, this logical and procedural survey is sufficient. For clarification of any unfamiliar terminology, the reader is referred to the indices of any of the basic statistics texts listed in the Selected Bibliography.

Pearson Correlation Coefficient (r)

Correlation is an interrelation between two or more variables. The nature of this interrelation produces two directions of correlation: one, a positive correlation, where two variables vary directly with each other, that is, when Variable A increases by a fixed amount, Variable B increases by a corresponding amount; and two, a negative correlation, where two variables vary inversely to each other, that is, when Variable A increases by a fixed amount, Variable B decreases by a corresponding amount.

The Pearson product-moment correlation coefficient (r) is a quantification of correlation and ranges from -1.00 through 0 to +1.00. It is this restricted range of r which permits a direct comparison of r's obtained from different sets of data without a correction for the size of the original values of the variables. The algebraic sign indicates only direction of correlation. Degree of relationship is shown by the absolute size of the coefficient. Thus, a correlation coefficient of +.50 shows the same degree of relationship as does one of -.50. A correlation of zero denotes the absence of any detectable relationship between variables. It should be noted that a correlation coefficient exactly equal to zero rarely occurs; in a sufficiently large sample, chance alone generally acts to create apparent relationships, such that a correlation coefficient of +.30 or higher may occur even though there is no intrinsic relationship between the variables; as well, r is not a causal indicator. This fact is well illustrated by Willoughby (1940:485) who attacked an attempt to show that a high positive correlation between vocabulary and college grades meant that an improvement in vocabulary would produce an improvement in grades. Willoughby stated that by the same reasoning a high positive correlation between the height of boys and the length of their trousers would mean that lengthening trousers would produce taller boys.

If the Pearson correlation coefficient cannot establish causality between variables, the question then becomes "Of what use is this statistical test to the researcher?" The Pearson r measures a specific type of relationship which is called the linear regression between variables. A scatter plot is a graphic representation of plotted variables within a Cartesian coordinate system; and it is the degree to which a straight line relating X and Y summarizes the trend in a scatter plot which is the measure of linear regression (Fig. 6-2, page 65).

When r is used as a measure of relationship with data that is non-linear, the r calculated is always an under-estimate of the true relationship between variables. Many times, in fact, there may be a strong curvilinear relationship, such as a U-shaped curve, which is so removed from a straight line that the linear coefficient of correlation will approach zero. Since it is often difficult to tell whether or not data is linear by examination, a scatter plot is often used to clarify the exact nature of the data. If deviation from linearity should merely be suggested by the scatter plot, an alternative statistical test, the eta coefficient, should be used (Downie and Heath, 1965:203).
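The behavior of r on linear and on U-shaped data can be illustrated directly. The sketch below uses the standard product-moment definition of r (covariance divided by the product of the standard deviations); the function name `pearson_r` and the small data sets are hypothetical and chosen only to make the three cases easy to see.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(variation_x * variation_y)


xs = list(range(-5, 6))            # -5, -4, ..., 5
direct = [2 * x + 1 for x in xs]   # rises in a straight line
inverse = [-x for x in xs]         # falls in a straight line
u_shaped = [x * x for x in xs]     # a perfect U-shaped curve

r_direct = pearson_r(xs, direct)
r_inverse = pearson_r(xs, inverse)
r_u = pearson_r(xs, u_shaped)

print(r_direct, r_inverse, r_u)
```

In the U-shaped case Y is completely determined by X, yet r comes out at zero because the falling and rising halves of the curve cancel; this is the situation in which a scatter plot, and then a measure such as eta, is needed.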

The testing of a value r for significance involves one of two possible statements of the null hypothesis. The significance of a difference between the correlation coefficients computed from two samples may be tested by posing a form of the null hypothesis, that is, the assumption that the population r's differ by zero. The second is that a test of the significance of a sample correlation coefficient is often made by restating the null hypothesis as a statement that the universe correlation coefficient is zero.

The Phi-Coefficient

The phi-coefficient is a form of the Pearson product-moment coefficient of correlation and offers specific advantages for certain data. It is the simplest correlational technique providing an index of association between two dichotomous variables, such as literate and non-literate males and females.

The phi value, like the Pearson r, can vary from -1 to +1. The sign indicates only the direction of correlation, direct or inverse, and the absolute size of the coefficient shows the degree of relationship. Similarly, the phi-coefficient is not a causal indicator.

Computation of the phi-coefficient involves development of a 2 x 2 contingency table. In this case, the two dichotomous variables (male and female dichotomized as literate and non-literate) are summarized in the contingency table below:

          Literate   Non-Literate
Male         A            B
Female       C            D

A, B, C and D are frequencies used in the computation. A high positive correlation would reflect the tendency that both males and females will become literate.

To test the significance of phi, the phi value is converted to a Chi-square ($\chi^2$) value through the formula $\chi^2 = N\phi^2$, where N is the total number of cases (A + B + C + D), and tested at a desired level of significance by reference to a Chi-square table (Selby, 1970:612). If it is significant, then the obtained phi-coefficient is also significant.
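The computation of phi from the four cell frequencies, and its conversion to Chi-square, can be sketched as follows. The cell formula used here is the standard one for a 2 x 2 table and is not quoted from the text; the function names and the counts are hypothetical, and the critical value 3.841 is the usual tabled Chi-square value for one degree of freedom at the .05 level.

```python
from math import sqrt


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi for a 2 x 2 table with rows (a, b) and (c, d)."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_from_phi(phi: float, n: int) -> float:
    """Convert phi to Chi-square: chi2 = N * phi squared."""
    return n * phi ** 2


# Hypothetical frequencies: A, B = literate / non-literate males,
#                           C, D = literate / non-literate females.
A, B, C, D = 30, 20, 10, 40
N = A + B + C + D

phi = phi_coefficient(A, B, C, D)
chi2 = chi_square_from_phi(phi, N)

CRITICAL_05_DF1 = 3.841   # tabled value, 1 degree of freedom, .05 level
significant = chi2 > CRITICAL_05_DF1

print(round(phi, 4), round(chi2, 4), significant)
```

For these made-up counts the converted value agrees with the Chi-square obtained directly from observed and expected cell frequencies, which is a useful check on the arithmetic.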

The Chi-Square Test

Chi-square is a test of frequencies unlike the Pearson r which is a test of measurements. Chi-square may be used to test any a priori or assumed hypothesis about a population, that is, whether the frequencies observed in the sample deviate significantly from some theoretical or expected population frequencies. This is accomplished by testing the hypothesis for no significant difference between or among groups (the null hypothesis). The function of the assumed hypothesis would be to account for any significant difference.

As in any statistical method, there are certain limitations on the use of Chi-square: first, it can only be used with data that actually reflects the frequency of events; second, the individual events must be independent of each other; third, no frequency should be less than five; and finally, since the sum of the expected and the sum of the observed frequencies must be the same, then both frequency of occurrence and non-occurrence must be included in the data.

Evaluation of a Chi-square value involves determination of the degrees of freedom. Degrees of freedom is essentially a correction factor and is obtained by taking the total number of categories less one. For example, in evaluating data which categorizes children by school grade, the degrees of freedom would be obtained by subtracting one from the total number of grades represented. Also the significance level for the rejection or acceptance of the null hypothesis must be determined and reference to a Chi-square table made. The Chi-square test of an a priori or assumed hypothesis about a population can tell us simply that the difference between the expected and the observed frequencies is or is not significant. If the difference is significant at a certain level of probability, then a non-chance factor was operating. And if the difference is not significant at a certain level of probability, then the discrepancy between the observed and expected frequencies is due to chance alone. The Chi-square test is not a causal determiner, but only an indicator of variance between observed and expected frequencies. As such, it can only provide the researcher with a quantifiable statement of the nature of variance and whether the variance is due to chance alone.
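The steps of a Chi-square evaluation (compute the statistic, count the degrees of freedom, compare with a tabled value) can be walked through with a small example. The counts below are invented for illustration, the function name `chi_square` is hypothetical, and the critical values are the usual tabled Chi-square values at the .05 level.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected) squared over expected, across all categories."""
    if abs(sum(observed) - sum(expected)) > 1e-9:
        raise ValueError("observed and expected totals must be the same")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of children in four school grades, tested against
# the expectation that the grades are equally full.
observed = [30, 20, 25, 25]
expected = [25, 25, 25, 25]

statistic = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1     # number of categories less one

# Tabled critical values at the .05 level.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}
significant = statistic > CRITICAL_05[degrees_of_freedom]

print(statistic, degrees_of_freedom, significant)
```

Here the statistic falls below the tabled value, so the departure from equal grade sizes would be attributed to chance at the .05 level; note also that every expected frequency is at least five, as the limitations above require.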

Data Matrices

In essence, a data matrix is a graphic representation of the data. It is a common element of many statistical tests ranging from the simpler calculations, such as Chi-square and the phi-coefficient, to the more complex such as cluster analysis. For this reason, the researcher should be familiar with the data matrix.

The analysis of data by matrix-oriented computations is facilitated if distinctions are made between variables, which are concepts imposed upon phenomena by the anthropologist in accordance with some model of human behavior, objects which include culturally defined portionings of the universe, such as land, and actors which include individuals, lineages and other culturally relevant categories of social groupings. According to the logic of combinations, six kinds of matrices are possible:

    ROWS        COLUMNS
1.  Variables   Variables
2.  Objects     Objects
3.  Actors      Actors
4.  Actors      Variables
5.  Actors      Objects
6.  Objects     Variables
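The count of six follows from pairing three kinds of entity where a kind may be paired with itself and the order of the pair does not matter. A brief sketch of that enumeration, with hypothetical variable names:

```python
from itertools import combinations_with_replacement

kinds = ["Variables", "Objects", "Actors"]

# Unordered pairs, allowing a kind to be paired with itself.
matrix_types = list(combinations_with_replacement(kinds, 2))

for first, second in matrix_types:
    print(f"{first} x {second}")
print(len(matrix_types))
```

The enumeration lists each pair once, so "Variables x Actors" here corresponds to the Actors-by-Variables matrix in the list above; which kind is placed on the rows is a matter of layout.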

To generate meaningful conclusions from matrix-oriented calculations variables must be relevant and the sample size should be conveniently large. Theoretically, there are no restrictions on the number of variables or the sample size. Despite this laxity, it is admissible to exclude insignificant variables and to group minor ones into general categories. In the case of the present study, this was done at the outset when the code itself was being designed. To decrease bias, individuals with a large portion of missing data should be excluded from the matrix. For example, in analyzing data based on such a variable as male/female ratio, all patriarchs should be deleted as there is no data on their wives or sisters.

The statistical tests which have been explained provide the necessary logical groundwork to understand more sophisticated methods. In any case, the utilization of these statistical operations on all variables coded should provide insight into the validity of any hypothesis for which the data was gathered. For instance, although this was not done, it would be possible to evaluate the association between all possible two variable combinations.

In a group of data with N variables, there are exactly $\frac{N!}{(N-2)!\,2!}$ such possible pairs. N! (read "N factorial") is the multiplicative product $N \times (N-1) \times \cdots \times 3 \times 2 \times 1$. Given three variables A, B and C, there are $\frac{3!}{1!\,2!} = 3$ relationships to test: AB, AC and BC. Then, using the 14 variables available in this investigation taken two at a time, there are $\frac{14!}{(14-2)!\,2!} = 91$ testable relationships.
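The pair counts quoted above are easy to confirm, and the pairs themselves can be listed mechanically. A minimal sketch; the function name `pair_count` is hypothetical and the variables are simply labeled A, B and C as in the text.

```python
from itertools import combinations
from math import factorial


def pair_count(n: int) -> int:
    """Number of two-variable combinations among n variables: n! / ((n-2)! 2!)."""
    return factorial(n) // (factorial(n - 2) * factorial(2))


three_variable_pairs = ["".join(pair) for pair in combinations("ABC", 2)]

print(three_variable_pairs, pair_count(3))
print(pair_count(14))
```

The same `combinations` call applied to the fourteen coded variables would produce the list of relationships that could each be submitted to one of the tests described in this chapter.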

Statistical Subroutines

A wide variety of statistical operations are available for evaluating large quantities of data. The operations pertaining to random sampling were used since the determination of castes for coding was non-selective. Ten castes from a total population of thirty-one were coded. Each caste selected was coded in its entirety in order that legitimate cross-caste variables could be adequately analyzed. From a total village population of approximately six thousand, two thousand one hundred eleven individuals comprise the sample. These constitute about 35.2% of the total universe. No attempt was made to winnow out extreme cases of any variable under consideration. All individuals listed in every lineage for any selected caste were coded regardless of the paucity of data on them. However, no individual was included in the analysis of a given variable if no information was available for that variable. Each person was listed in the original coding in order to provide complete data shortage outputs that could be used for further refinement of the village population description file.

For the operations used in the handling of this data, two kinds of programs were used. The basic program for the initial tabling and data shortage output runs was written exclusively for this project. For the statistical runs in this project, selected programs from IBM Corporation's System/360 Scientific Subroutine Package, Version III (manual number GH20-0205-4) were used as models. Modifications were made to suit this particular research design. Descriptions of the subroutines were also obtained from this manual.

This subroutine system was selected for the processing of the census data used in this investigation because of the following characteristics:

1. All subroutines are free of input/output statements. This allows maximum flexibility in designing for a particular problem.

2. The subroutines are written in FORTRAN. This allows for ready meshing with other programs since this language is in common use.

3. All the subroutines are documented uniformly. This allows unusual clarity in going step-by-step through any given set of programs.

Other subroutines for the evaluation of social science data in general are becoming available. Two such are OSIRIS and SPSS. OSIRIS was developed by the staff programmers of the Institute for Social Science Research and the Inter-University Consortium for Political Research of the University of Michigan; and SPSS (Statistical Package for the Social Sciences) was developed at the Stanford University Computer Center (Burton, 1970:38).

CHAPTER VI

THE METHODOLOGY OF DATA ASSESSMENT

Succeeding the initial procedures in the computer evaluation of census data, as discussed in previous chapters, is the actual assessment of the data. A method for the evaluation of the data and some of the implications of this method will be discussed and demonstrated with the Rankhandi data; in addition, a flow chart of the method (Fig. 6-1) is included. The application of this method to the census under study, however, has some fundamental problems. These problems arise out of an inability to: one, secure quantitative data uniformity, and two, correct known data error. Both require additional field study.

For the purpose of demonstrating the method, variable sets may be selected and statistical manipulations performed on them to demonstrate how associations may be uncovered. If the method herein described is to be applied to social science data, then the resultant model should be based on mathematical generalizations which are in turn based on mathematical associations within variable sets. Such associations assist in the acquisition of insight which would include an understanding of the basic descriptive boundaries of the population under study. Mathematical generalizations are developed from the associations and as such are simply statements of probability. The method employed must be stochastic, that is, it would be a process of selecting from among a group of theoretically possible alternatives those elements or factors whose combination most closely approximates a desired result. In this case, as has been previously stated, the desired result is predictability.

Fig. 6-1. Flow Chart of Data Assessment Method. Box labels recoverable from the chart: Coded Data; Update; Data Shortage Output; Quantitative Data Uniformity; Correct Data; Accurate; Data Error Output; Variable Sets Listed; Statistical Manipulation of Variable Sets; Associations Derived; Reject Variables; STOP; Characterizing the Forms of the Associations; Formation of Mathematical Generalizations; Combining Mathematical Generalizations into a Predictive Model; Other Relevant Model; Predictive Model Tested for Accuracy; Modification; Application of the Predictive Model; STOP.

Data Shortage and Data Evaluation

The coded input is assumed to have been developed in conformance with the suggestions outlined in Chapter III. Since a provision was made for coding variables with an absence of data, and this absent data could seriously affect the end product, it is necessary to correct this data shortage at the outset. As was outlined in Chapter IV, the total data shortage may be generated by caste and lineage, listing each individual and the particular variables for which data is not available. Preferably at this point in the assessment, the data shortage would be corrected. Any researcher dealing with census data may sample the population or include the entire population in his analysis. Regardless of the approach, data shortage should be corrected before the analysis begins; complete data uniformity is most crucial, however, when a sampling procedure is utilized because it is assumed that the sample accurately represents the entire population.
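A data shortage listing of the kind described above might be sketched as follows. This is an illustration only, not the program written for the project: the record layout, the field names and the use of None as the marker for absent data are all assumptions.

```python
from collections import defaultdict

def data_shortage(records, variables):
    """Group individuals with missing values by (caste, lineage).

    A value of None (or an absent key) is assumed to mean "no data";
    the actual coding scheme of the study may use a different marker.
    """
    report = defaultdict(list)
    for rec in records:
        absent = [v for v in variables if rec.get(v) is None]
        if absent:
            report[(rec["caste"], rec["lineage"])].append((rec["name"], absent))
    return dict(report)

# Hypothetical records, invented for illustration.
records = [
    {"name": "A", "caste": "01", "lineage": "01", "age": 34, "education": None},
    {"name": "B", "caste": "01", "lineage": "01", "age": 12, "education": 5},
    {"name": "C", "caste": "02", "lineage": "03", "age": None, "education": None},
]

shortage = data_shortage(records, ["age", "education"])
for (caste, lineage), people in sorted(shortage.items()):
    for name, absent in people:
        print(caste, lineage, name, "missing:", ", ".join(absent))
```

Listing the output by caste and lineage keeps each lineage's gaps together, which is the form most convenient for correction in the field.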

If quantitative uniformity of data for all variables is reached, then it would be possible to continue the evaluation. The uniform data would be checked for obvious inaccuracies. For example, simple programming procedures would reveal inconsistent patterning. Such patterning could be reflected in a decrease in stated age from 1954 to 1969, or a decrease in education level over the same time period. This is not to say that after the completion of this procedure, the data is free from error. Certain types of non-random error will escape this correction procedure, but the majority of this error will be discovered when basic statistical tests are performed on the data.
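The two checks named above, a decrease in stated age and a decrease in education level between 1954 and 1969, could be programmed roughly as below. The field names are hypothetical and the sample records are invented; only the two rules themselves come from the text.

```python
def find_inconsistencies(record):
    """List obvious contradictions between the 1954 and 1969 entries."""
    problems = []
    for field, label in (("age", "stated age"), ("education", "education level")):
        earlier = record.get(f"{field}_1954")
        later = record.get(f"{field}_1969")
        # Skip the comparison when either year is missing (data shortage).
        if earlier is None or later is None:
            continue
        if later < earlier:
            problems.append(f"{label} decreases from 1954 to 1969")
    return problems

# Hypothetical records, invented for illustration.
people = [
    {"id": "001", "age_1954": 30, "age_1969": 45, "education_1954": 4, "education_1969": 4},
    {"id": "002", "age_1954": 40, "age_1969": 35, "education_1954": 8, "education_1969": 6},
    {"id": "003", "age_1954": None, "age_1969": 20, "education_1954": 2, "education_1969": 5},
]

flagged = {}
for person in people:
    problems = find_inconsistencies(person)
    if problems:
        flagged[person["id"]] = problems

print(flagged)
```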

Selection of Variable Sets

Given the correction of the more obvious errors, sets of variables pertinent to a particular area of inquiry can be selected for analysis. In the case of anthropological census data, variables have already been chosen by the ethnographer, and from these variables sets are made. These variable sets may contain anywhere from one to the total number of variables available.

Specific variable sets may be made in accordance with a particular goal. In this study the following variable sets were chosen to illustrate the procedure: age-grade variables; male/female distribution; literacy-sex variables; and marriage reciprocity.

Determination of Associations Among Variables

Associations among variables are determined by the use of statistical procedures appropriate for the type and number of variables in the variable sets under consideration. In the absence of a demonstrable association, a check is made for availability of other variables. These variables are also subjected to appropriate statistical procedures. When all variables have been evaluated for association, the forms of the associations are characterized. This is most commonly accomplished by graphing (Fig. 6-2).

Age-Grade Variables

The age-grade variables were subjected to the Pearson product-moment correlation test by caste, so that ten correlation coefficients were obtained. Through programming, only those individuals enrolled in school in 1969 (coded as 723 in the 1969 occupation columns) were included in the statistical test for association.
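The selection and correlation steps just described might be sketched as follows. This is not the modified Scientific Subroutine Package routine used in the study; the record layout and the sample values are assumptions, and only the occupation code 723 is taken from the text.

```python
from collections import defaultdict
from math import sqrt

SCHOOL_CODE = "723"  # 1969 occupation code for school enrollment, as cited in the text

def pearson_r(xs, ys):
    """Pearson product-moment correlation; None when it cannot be computed."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

def age_grade_by_caste(records):
    """Return {caste: (N, r)} using only individuals enrolled in school."""
    groups = defaultdict(lambda: ([], []))
    for rec in records:
        if rec["occupation_1969"] == SCHOOL_CODE:
            ages, grades = groups[rec["caste"]]
            ages.append(rec["age"])
            grades.append(rec["grade"])
    return {caste: (len(a), pearson_r(a, g)) for caste, (a, g) in groups.items()}

# Hypothetical records, invented for illustration.
records = [
    {"caste": "X", "age": 6, "grade": 1, "occupation_1969": "723"},
    {"caste": "X", "age": 7, "grade": 2, "occupation_1969": "723"},
    {"caste": "X", "age": 9, "grade": 3, "occupation_1969": "723"},
    {"caste": "X", "age": 40, "grade": 0, "occupation_1969": "101"},
]
result = age_grade_by_caste(records)
print(result)
```

In this sketch a caste with fewer than two enrolled individuals yields no coefficient, since a correlation cannot be computed from a single case.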

Fig. 6-2. Age-Grade Distribution of Brahmin School Children (scatterplot; vertical axis: Grade, horizontal axis: Age)

Correlations varying from .591 to 1.0 were uncovered with the N for each caste varying from 1 to 58. There exists an association between grade in school and age with caste functioning as a determinant of sample size.

An example of the characterization of the form of the relationship was developed from the Brahmin caste data. This graph (Fig. 6-2) clearly demonstrates the range of variation in grade by age, as well as the direct relationship between grade and age. The graph also shows the tendency to cluster about the reference line which has been drawn to illustrate how many cases approach the ideal, that is, how many begin school at age six and progress one grade level each year.

Dichotomous Variables

If variable pairs are selected which are readily dichotomous, such as literacy and sex dichotomized as literate and non-literate and male and female, then the phi test for strength of association is applicable. It should be noted that such variables as literacy are operationally defined for the particular needs of a research design. For the present example, literacy was defined as having at least a sixth grade education. Other definitions can be found in the literature, but such definitions of literacy as "being able to read a letter from a friend and write an answer to it" are not readily quantifiable. Persons for whom no data was available were excluded, as well as those individuals under twelve years of age; such exclusion was necessary because updating of data shortage was not possible. The matrix used for this phi test may be referred to on page 51. The values of 270 literate males, 115 illiterate males, 29 literate females and 354 illiterate females produced a phi value of .6414 which, when tested for significance by conversion to a Chi-square value, yielded a value of 315.9552. A Chi-square value of 315.9552 at one degree of freedom indicates that such a distribution has only the remotest probability of occurring by chance.
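The phi computation can be retraced from the four counts given above. The sketch uses the standard fourfold-table formula and the conversion of phi to Chi-square as N times phi squared; the function name is an assumption, and the small difference from the reported figures depends on how phi is rounded before the conversion.

```python
from math import sqrt

def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Counts reported in the text.
literate_males, illiterate_males = 270, 115
literate_females, illiterate_females = 29, 354

n = literate_males + illiterate_males + literate_females + illiterate_females
phi = phi_coefficient(literate_males, illiterate_males,
                      literate_females, illiterate_females)

# Conversion to Chi-square with one degree of freedom.
chi_square = n * phi ** 2

print(n, round(phi, 4), round(chi_square, 2))
```

Run on these counts, the sketch gives a phi of roughly .64 and a Chi-square of roughly 316, in line with the values reported.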

Single Variable Analysis

Single variable analysis may also be done, but in such cases a common objective is to test for data accuracy. The variable chosen to illustrate this point was sex distribution within the trial sample, which included all available males and females, living or dead, child or adult. The resultant distribution showed 1216 males and 894 females. A Chi-square test of significance was performed to ascertain the probability of such a distribution occurring by chance alone. No Chi-square table readily available listed such a high Chi-square value at one degree of freedom. The highest significance level usually given is .001, which means an event at that level of significance could be expected to occur by chance only once in 1000 times. In the case of a Chi-square value of 49.14, the probability of a chance distribution is far smaller than that.
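The Chi-square value for the sex distribution can be retraced under the assumption, stated in the text, that equal numbers of males and females are expected. The function name is hypothetical; the critical value of 10.828 is the standard tabled figure for the .001 level at one degree of freedom.

```python
def chi_square_equal_split(observed):
    """Chi-square goodness of fit against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Counts reported in the text: 1216 males and 894 females.
chi_sq = chi_square_equal_split([1216, 894])

# Tabled critical value for p = .001 with one degree of freedom.
CRITICAL_001_DF1 = 10.828

print(round(chi_sq, 2))           # 49.14
print(chi_sq > CRITICAL_001_DF1)  # True
```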

Since an equal number of males and females would be expected in the data and the Chi-square test of significance showed that the variation between the number of males and females did not occur by chance, it is possible to use the method as herein explained to determine in which groups, such as age, caste or lineage, females are not represented. The ages and education levels for those Brahmin females enrolled in school in 1969 were obtained through computer programming. A scatterplot (Fig. 6-3) of this information for the females revealed that no females under the age of eight were enrolled in school and that no females were enrolled in grade levels one through three; but ages eight through 18 were represented, as well as grade levels four through 12. A correlation coefficient of .93, in conjunction with the scatterplot which shows the actual distribution of individual age and grade level values, further indicated that there is a strong tendency for students to progress one grade level each year. It would be expected then that females between the ages of five and eight would be enrolled in the first through third grade levels. This is not the case, however; Brahmin females in this age-student group are not represented. This method, therefore, enables specific problems, such as an unequal male/female ratio, to be localized in one group of the total population.

Fig. 6-3. Age-Grade Distribution of Brahmin Females Enrolled in School (scatterplot; vertical axis: Grade, horizontal axis: Age; points marked as 1 case or 2 cases)
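The localization of an unequal male/female ratio to particular groups might be sketched as a simple tabulation. The field names, the age bands and the sample records are assumptions made for illustration and do not reproduce the Rankhandi coding.

```python
from collections import Counter

def counts_by_group(records, group_key):
    """Tabulate individuals by (group, sex)."""
    counts = Counter()
    for rec in records:
        counts[(rec[group_key], rec["sex"])] += 1
    return counts

def groups_missing_sex(records, group_key, sex="F"):
    """Return the groups in which no individual of the given sex appears."""
    counts = counts_by_group(records, group_key)
    groups = {group for group, _ in counts}
    return sorted(g for g in groups if counts[(g, sex)] == 0)

# Hypothetical enrollment records, invented for illustration.
enrolled = [
    {"sex": "M", "age_group": "5-8"},
    {"sex": "M", "age_group": "9-12"},
    {"sex": "F", "age_group": "9-12"},
    {"sex": "F", "age_group": "13-18"},
    {"sex": "M", "age_group": "13-18"},
]

missing = groups_missing_sex(enrolled, "age_group")
print(missing)  # ['5-8']
```

The same function could be run with caste or lineage as the grouping key, which corresponds to the groups named in the text.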

Marriage Reciprocity

In accordance with the description of data matrices in Chapter V, a matrix of marriage reciprocity by caste and village was made (see Appendix D). Since it is known that the basic pattern of marriage reciprocity in North India is caste endogamy and village exogamy, it is possible to determine the degree of reciprocity by caste between any particular village and Rankhandi. This is an actor-object matrix in which the actors are the castes and the objects are the villages. The data within the matrix are daughter/wife ratios; that is, the ratio between the number of women leaving Rankhandi for a particular village and the number of women entering Rankhandi from that same village.

A rather striking feature of the statistical configuration of inter-village social interaction is the remarkably low incidence of women exchange. Across the ten castes analyzed, a total of 191 villages are represented as having either sent a woman to Rankhandi or having received a woman from Rankhandi. Of this total only 31 villages registered as having both sent a woman to Rankhandi and received a woman from Rankhandi. Other support that this tendency toward low marriage reciprocity is not uncommon can be found in the literature. For example, Oscar Lewis comments:

Our study showed that the 266 married women living in the village came from about 200 separate villages at a distance of up to 40 miles. . . . If we now examine the other side of the picture, that is, the daughters who married out of the village, we find that over 220 daughters of Rampur married out into about 200 villages. Thus, this relatively small village of 150 households becomes the focus of affinal kinship ties with over 400 other villages (1965:320).

For the purpose of establishing the overall ratio of women leaving Rankhandi to those moving to the village, the individual ratios must be summed. This procedure reveals an overall reciprocity of 124/306. This is computed after excluding those individuals for whom the location of the village was not known and those individuals whose village name was unknown; however, if we were to include in the matrix total those individuals previously excluded, the ratio would be 261/450 and would still reflect a shortage of women marrying out of Rankhandi. This statistic may be the result of data error or it may actually reflect the frequency of the event and distribution of marriage interaction in this village.
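The construction of the actor-object matrix and the summing of its daughter/wife ratios might be sketched as follows. The record format, the direction labels and the sample marriages are assumptions invented for illustration; the actual matrix is given in Appendix D.

```python
from collections import defaultdict

def reciprocity_matrix(marriages):
    """Build {(caste, village): [daughters_out, wives_in]}.

    "out" marks a woman leaving Rankhandi for the village,
    "in" marks a woman entering Rankhandi from the village.
    """
    matrix = defaultdict(lambda: [0, 0])
    for caste, village, direction in marriages:
        matrix[(caste, village)][0 if direction == "out" else 1] += 1
    return dict(matrix)

def overall_ratio(matrix):
    """Sum the cells to obtain the overall (daughters_out, wives_in) totals."""
    out_total = sum(cell[0] for cell in matrix.values())
    in_total = sum(cell[1] for cell in matrix.values())
    return out_total, in_total

def reciprocal_villages(matrix):
    """Villages that both sent a woman to and received a woman from Rankhandi."""
    sent, received = set(), set()
    for (_, village), (daughters_out, wives_in) in matrix.items():
        if daughters_out:
            received.add(village)
        if wives_in:
            sent.add(village)
    return sent & received

# Hypothetical marriages, invented for illustration.
marriages = [
    ("C1", "V1", "out"), ("C1", "V1", "in"), ("C1", "V2", "in"),
    ("C2", "V3", "in"), ("C2", "V1", "in"),
]
matrix = reciprocity_matrix(marriages)
print(overall_ratio(matrix))        # (1, 4)
print(reciprocal_villages(matrix))  # {'V1'}
```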

The Formation of Generalizations

Mathematical generalizations are formed from associations when the associations appear to be applicable to the entire body of data. If associations are determined from a sample of the population, then generalizations are made about the entire population. And if the researcher chooses to include the entire population, not a sample, in his analysis, then the generalizations about the population will be more accurate.

From the associations determined between the variables of age and education level for those attending school by caste, the following generalization can be made. In Rankhandi individuals tend to progress at a uniform rate through school, but this rate varies significantly between castes (Fig. 6-2).

The association between sex and literacy can be raised to the level of a mathematical generalization when it can be shown that it is not limited by some other variable such as caste. In the present study associations have been established between sex and literacy for ten of 31 castes. These ten castes span the entire socioeconomic and ritual hierarchy of the village under study. Therefore, the generalization can be made that in Rankhandi sex and literacy are related in that the number of literate men is significantly greater than the number of literate women regardless of caste.

The Chi-square test performed on the variable of sex distribution showed that such a distribution probably did not occur by chance. The generalization to be made is that such a distribution could occur only if a non-chance factor is present.

From the data matrix analysis the following generalization may be formed. The degree of reciprocity between Rankhandi and other villages ranges from zero to one-to-one; however, the overall marriage reciprocity is very slight.

The Meaning of Model

Although the formation of models is not dealt with in this research design, it is assumed that model formation and utilization would succeed the procedure outlined here. For this reason it is appropriate to outline the concept of model upon which the research design is based and to provide suggestions for model building. This is all the more important because the term 'model' has gained such wide and varied use within the social sciences that it has become necessary to operationally redefine the term virtually every time it is used.

The term model is used in modern anthropology to refer principally to two conceptual activities: one, the construction of "theoretical entities" by the anthropologist to explain a body of social phenomena; and two, the study and possible use of the constructs invented by the subjects of an investigation themselves in order to give an account of how they view their own social universe.

In the formulation of this research method the model construct as explained by Hugo Nutini (1968:371-383) was used. It is briefly summarized here, but the reader is referred to the original source for a more detailed explanation. In his analysis of the model concept Nutini distinguishes between paradigms and models. The use of the term 'paradigm' by Nutini approximately corresponds to the term 'generalization' used in this study. Models are never part of the data that they are designed to explain. For this reason, data analysis itself will never yield a model. Generalizations are extracted from the data and models are superimposed on the data. The totality of mathematical generalizations constitutes only a summary of the data, and the model acts to assign meaning to the generalizations.

Meaning is brought to the data by ethnographic support. In Rankhandi, for example, it would be possible to quantify a percentage difference between Brahmins following traditional Brahmin occupations and those in non-traditional occupations. This difference may be capable of being expressed in terms of another variable, such as education. If it could be shown that adherence to traditional occupations varied with education, then the problem would be to determine the direction and extent of the relationship, as well as what the implications are for the village. To do this, other relevant data such as ethnographic material and/or a particular theoretical framework would have to be taken into consideration.

This model must be tested for accuracy; such a test would be based on the validity and reliability of predictions derived from the model. If the sample is the total population, then the predictions to be tested must deal with future events; however, if the sample constitutes only a fraction of the population, then the predictions may be about future events or about another sample drawn from the same population. In the event the predictions are not sufficiently close, then some modifications must be made in the model. This lack of congruence between the model and reality results in the need for a cybernetic or 'feedback' loop which has as its function the continuous adjustment of the model until the required congruence, and therefore predictability, is reached. Several factors may operate to cause the perceived differences: non-random error such as prevarication, random error such as erroneous entries in the census sheets, an erroneous hypothesis, or evaluation error such as too small a sample or falsely assuming the variables to be normally distributed.

CHAPTER VII

SOME FINAL CONSIDERATIONS

General Data Problems

The obstacles created by the factor of human error cannot be eliminated; therefore, they must become a part of the research and evaluation design itself. Even the most fundamental human errors arising out of imperfect mental and physical faculties must be predicted and accounted for, and the data and hypotheses adjusted accordingly. What is being sought is predictability and anything that could affect that goal must be accounted for in the final evaluation. An aspect of this problem is what might be termed judgement dispersion: not bias, which is the systematic tendency to deviate from the 'true' value in a given direction, but the inability of a given observer to repeat an observation again and again in exactly the same way. Often in the case of census data, particularly when dealing with such a large population as that of Rankhandi, many of the correction factors normally used to minimize errors of this nature, such as the repetition of observations by several different researchers, are not possible from a practical standpoint.

Particularly in communities or other relatively self-contained orders which come under the pervasive scrutiny of social research, purposefully given misinformation is also a common occurrence. Given the typical source of error or misinformation, the problem facing the analyst is almost overwhelming. The data gathered at any particular time and by any particular instrument such as a census questionnaire may be highly uneven and contradictory. Ralph Beals has commented that among the Indians of Mexico, where he has worked, the greater a people's belief in witchcraft, the more likely they are to deny its existence entirely (Naroll, 1967:28). Clyde Kluckhohn has noted this problem of concealment and the non-Western respondent's anxiousness to fulfill interviewer expectations, and has noted the effect which these can have on data gathered (Kluckhohn, 1949:87). In another account Passin dealt directly with the problem of lying (1942:235-247). In such cases checks of the data and collection procedures must be made to reveal inconsistencies and obvious fallacies; this is facilitated with large quantities of data by use of the computer. If possible, inconsistencies should be corrected in the field; if not, they must become a part of the research design.

The opportunity for this type of error was reduced in the field study under consideration here by the questioning technique. The sequence in which the questions were asked was used as a means of control. The interviews began with relatively neutral topics of a genealogical nature. This then provided a referent to ask more sensitive personal questions about an individual whose existence had already been admitted by the informant. This technique was further supported by asking a single question about each lineage member rather than focusing all the questions on the lineage members one at a time.

Specific Data Problems

In Chapter V it was noted that the basic census listed fourteen variables which allow for the creation of ninety-one variable pairs. Since the sample was only about one-third of the total population, not all variables can be expected to demonstrate equal significance. For example, land is not normally distributed in the statistical sense and the very paucity of individuals in the sample who own land renders this topic unsuitable for analysis.

The patterning of the data shortage indicated that a few lineage members were supplying the information for the entire lineage. Occasionally the name of the wife of a living lineage member was not given. This would indicate that, at least in some instances, the data was not secured from the lineage member himself. For example, SABBIR (identification and sequence number: 30B530303 034) is coded with a 1 in the residence category and a 000 in the occupation category and was obviously not interviewed in person, or the information may not have been properly recorded. This example demonstrates that the method herein outlined also functions to clarify any data gathering techniques used in the field that might have weakened the data in terms of validity and/or reliability.

It is unfortunate that the Rajput caste was not included in the study. This caste is the largest in the village, in fact larger than the entire sample that was eventually used, and it must be regarded as negative research procedure to eliminate it from consideration. This is more pertinent when it is realized that this caste holds about 90% of the land around the village. This shows that the variable of land is not normally distributed and as such a random sample cannot be expected to reflect the population.

The residence category could provide an opportunity to see temporal change or stability. At present the code is inadequate in that it does not distinguish between people who have real residence stability and those who in 1954 and 1969 lived in Rankhandi but in intervening years had resided elsewhere. This category was purposefully left narrow for this study. Its function was not to provide a key for residence patterns but to provide a sorting variable for data accuracy checks and cross residency data.

Computer facilities could not be secured in time for a preliminary analysis of the ninety-one variable pairs available for comparative study. Adequate computer time, however, was secured for the basic testing of the procedure as well as that needed for the development of the format sheets (Table 4-2) and, most importantly, the data shortage outputs. The data shortage outputs (Table 4-1) are of inestimable worth in providing a relatively compact set of notes detailing all individuals for whom data is missing, as well as the specific variables in need of completion. It may be assumed that the actual analysis of the pertinent variables will await further work that will correct the data shortage. The data shortage sheets are useful for this purpose in that they provide a format that allows for relatively efficient data correction.

One of the major problems that prevented a more complete analysis of the data was an early lack of preparation in relating the objectives of the study to the technical limitations of programming facilities. There was clearly not sufficient planning of the system needs and software capabilities. Fortunately, this did not significantly affect the development of the system. Due to a lack of funds a mass storage direct-access input/output system was not available and the system utilized was too small for the project. Because of this the COBOL and FORTRAN compilers proved inadequate for obtaining sufficiently complete statistical and list outputs in the short period of manpower and computer time available. FORTRAN SSP in conjunction with full FORTRAN IV capabilities would have made a significant difference both in the scope of the research and the time needed to accomplish the desired results. The use of user-oriented report generating programs would have resulted in faster system development, and programmer support would have been unnecessary.

In spite of the aforementioned problems, this data analysis method has definite utility. Given the time and funds to make revisions in the coding and programming, the rapid sorting, tabling and statistical evaluation of large quantities of anthropological census data could become routine. This is so because the nature of this method is such that the quantity of data does not seriously affect the amount of time necessary for evaluation.

Suggestions for Further Analysis

It is clear that all the possible analytical procedures have not been performed even on the coded sample. Only a minimal analysis was done to establish the efficacy of a particular procedure. This limitation was a result of the inability to correct data error and shortage. Given such information, the framework developed here would make possible the application of appropriate statistical procedures on all the data variables.

More sophisticated analysis requires that variable sets be used in conjunction with each other. For example, the utilization of the codes in the residency/job locale category in conjunction with the generation and age categories could reveal relationships pertaining to such questions as whether younger men are leaving the village in significant numbers. Similarly, the categories of generation and occupation within castes could be tested with reference to the castes' traditional occupations to determine if younger men are deviating from traditional occupations.

Also, the use of the Indian Census Bureau's coding system for occupations makes possible a machine search to uncover possible patterns which may exist resulting from the tendency to remain in certain related job areas. For example, an examination of the first digit in the occupation code of a given individual for 1954 and 1969 may show a 1 for both time periods, which would indicate that although there was some occupational change, he remained in the general job description of Field Crops. These patterns could also be related to caste and generation. This procedure may prove useful in quantifying the range of occupations performed by a particular caste in this village. The ethnographic literature for this area has usually asserted that a very narrow range of occupational options is available for any particular caste, and any deviations from the ethnographer's understanding of reality were explained as culture change. It may be that the range of legitimate occupations for a particular caste is much broader than has been previously thought.
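The machine search on the first digit of the occupation code might be sketched as follows. The field names and the sample codes are hypothetical and are not taken from the census coding manual; only the rule of comparing first digits across 1954 and 1969 comes from the text.

```python
from collections import Counter

def classify_change(code_1954, code_1969):
    """Compare two three-digit occupation codes held as strings."""
    if code_1954 == code_1969:
        return "no change"
    if code_1954[0] == code_1969[0]:
        return "change within major group"
    return "change of major group"

def tabulate_changes(records):
    """Count each kind of occupational change by caste."""
    tally = Counter()
    for rec in records:
        kind = classify_change(rec["occ_1954"], rec["occ_1969"])
        tally[(rec["caste"], kind)] += 1
    return tally

# Hypothetical records and codes, invented for illustration.
records = [
    {"caste": "X", "occ_1954": "110", "occ_1969": "110"},
    {"caste": "X", "occ_1954": "110", "occ_1969": "140"},
    {"caste": "Y", "occ_1954": "110", "occ_1969": "520"},
]
tally = tabulate_changes(records)
for key, count in sorted(tally.items()):
    print(key, count)
```

Adding generation to the tally key would allow the same tabulation to be related to generation as well as caste.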

Education was the variable with the greatest amount of data shortage. This is no doubt a reflection of the original purpose of the initial research project for which the data was gathered rather than data gathering problems. This is a logical assumption since often the field notes for entire lineages state that the question was not even asked. In any case education is an obviously rich category for cross-variable analysis.

Implications for Culture Change Quantification

Given two sets of data which have been gathered in two different time periods from essentially the same population, it may be possible to detect quantitative factors in culture change. Since culture change tends to be relatively slow it would be important to focus on populations that are sufficiently small to allow for slight changes. The definition of castes as populations, for example, would allow predictability based upon the null hypothesis that 'occupation' and 'caste membership' belong to the same population. This fact is, of course, common knowledge among students of Indic studies, but the point is that by virtue of the precision of the techniques such prior knowledge is unnecessary. From this example, it should be clear that populations (such as caste membership) can be assumed to be static, and the processes of culture change can themselves be defined in terms of a shift in the congruence of behavior. This shift in congruence in turn results in a lowering of predictability.

As certain culture factors change, assumptions made prior to the change often do not hold. This possibility should be kept in mind when dealing with data such as in this study where two different time periods are present. If predictions are accurate for one time period and not the other, then it may not be as much a question of data accuracy as culture change.

Conclusions

The research design as herein proposed requires little in the way of expertise in either programming or statistics. Texts are available for assistance in learning the basic statistical and programming skills necessary for the development of a system for a particular research need. However, the analysis of the final coded results would require that computer facilities be available.

Given the premise that the advantage of the computer lies in the processing and tabulation of large quantities of data, it may be assumed that the utilization of this method would be based on the availability of a data source of sufficient size as to merit such an effort.

APPENDIX A

INTERVIEW GUIDE FOR KHANDAN STUDY

Identity of informant:
a. name
b. age
c. father's name
d. caste
e. neighborhood (paTTii)
f. lineage code number

1. List the name, age and marital status of all members of the informant's lineage in the following manner:

Identity No.
*(1) Baaruu, w.d.
*(2) Bhawaanii Sing, w.d.
*(3) Harsukh, m.-n.s., w.d.
*(4) Bansii, w.d.
*(5) Makhan, w.d.
(6) Moolhii, (60), w.a.
(7) Kailaashii, (30), w.a.
(8) Janeesar, (16), n.m.
*(9) Sabiiraa, w.d.
*(10) Ram Diyaa, w.d.
(11) Jagan, (55), w.d.
(12) Shayaamuu, (25), w.a.

Symbols used indicate:* - individualxis dead

(16) - age w.d. - wife dead w.a. - wife alive

m.-n.s. - married, but no sons n.m. - not married

Indentation indicates difference in generation-siblings are listed in order of age -sons are listed under their father with appropriate indentation
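The notation above is regular enough to be read mechanically before coding. The following is a minimal sketch, assuming each entry sits on its own line in the form shown in the example list; the function name `parse_entry` and the record layout are illustrative only and are not part of the original study.

```python
import re

# Abbreviations taken from the "Symbols used" list above.
STATUS = {
    "w.d.": "wife dead",
    "w.a.": "wife alive",
    "m.-n.s.": "married, but no sons",
    "n.m.": "not married",
}

# Optional "*" (dead), identity number in parentheses, name, then the rest.
ENTRY = re.compile(
    r"^(?P<dead>\*)?\s*\((?P<id>\d+)\)\s*(?P<name>[^,]+)(?P<rest>.*)$"
)


def parse_entry(line):
    """Turn one lineage-list entry into a dictionary (hypothetical layout)."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    record = {
        "id": int(match.group("id")),
        "name": match.group("name").strip(),
        "dead": match.group("dead") is not None,
        "age": None,
        "status": [],
    }
    for token in (part.strip() for part in match.group("rest").split(",")):
        if not token:
            continue
        if token.startswith("(") and token.endswith(")"):
            record["age"] = int(token[1:-1])  # e.g. "(16)" is a stated age
        elif token in STATUS:
            record["status"].append(STATUS[token])
        else:
            raise ValueError(f"unknown symbol {token!r} in {line!r}")
    return record


print(parse_entry("*(3) Harsukh, m.-n.s., w.d."))
print(parse_entry("(8) Janeesar, (16), n.m."))
```

Generation depth, which the guide conveys by indentation, is not handled here; it would have to be read from the leading whitespace of each line.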


Prepare a genealogical diagram for the lineage using identity numbers cited in the preceding list to represent individuals in the diagram.
a. indicate on diagram the place where each individual resides; if he lives in Rankhandi, note paTTii (division of village)
b. encircle those individuals residing in a common compound
c. encircle in red those individuals sharing a common chuulhaa (hearth)
d. note those sharing a common gheer, chaupaaR, bagaaD

Obtain the following information for all members of the lineage residing outside of Rankhandi:
a. place of residence
b. duration of residence
c. reasons for moving
d. relationship to others in new residence

Obtain a history of the lineage including:
a. length of residence in Rankhandi
b. previous locations
c. name of ancestor who moved to Rankhandi
d. reasons for moving from one site to another

What ties, if any, are maintained with branches of this lineage residing in other parts of Rankhandi and outside the village?
a. What members of the lineage are invited to weddings, funeral feasts and similar events?
b. When and why did the last visit occur between those residing in Rankhandi and elsewhere?

a. Who is considered to be the head of the lineage? What is the basis of his authority?
b. Who acts as the mediator in the division of property following a death or the division of a joint-family?
c. Who arbitrates disputes between lineage members?
d. Who represents the lineage in village and caste panchayats (councils)?

6. Are there any other lineages with whom this lineage considers itself to be related even though the exact nature of the tie is unknown?
a. Where are the other lineages located?
b. What kinds of contacts are maintained between these lineages?

7. Are there any "outsiders" (non-lineage mates) residing with this lineage?
a. What is the relationship of these individuals to the lineage?
b. When and why did they move to Rankhandi?

8. List all the occupations followed by each member of the lineage. In those cases where an individual has worked outside Rankhandi, or engaged in a non-traditional occupation in the village, note the following:
a. nature and duration of occupation
b. place of occupation
c. from whom did he obtain training?
d. who helped him get the job?
Note the relationship of individuals farming together as tenants or joint-landholders, and other instances of cooperative enterprise.

9. Review the lineage list for cases of men who never married and enquire about the reasons for such behavior.

10. Review the lineage list and note all instances in which a man was married more than once. Enquire about:
a. reasons for remarriage
b. relationship between wives
c. type of wedding ceremony performed
d. caste rules regarding sororate
e. the names of sons born to each wife

11. Review the lineage list for cases of widows. Enquire about:
a. whether or not these widows remarried
(1) if so, the relationship between their husbands
(2) if not, with whom do (did) they reside?

b. caste rules on marriage of widows and the practice of the levirate.

12. Have any cases of desertion by husband or wife occurred in this lineage?
a. If so, probe for details.
b. Who obtained custody of children?

13. Are there any cases of orphans in this lineage?
a. If so, with whom do (did) they reside?
b. Where orphans reside outside Rankhandi, what rights (inheritance of house site, etc.) do they retain in Rankhandi?

14. Are there any cases of adoption in this lineage?
a. What, if any, formalities are observed in such cases?
b. Are adopted males considered to be members of the lineage?

15. Review the lineage list and note the extent of formal education obtained by members of the lineage.
a. Are any members of the lineage literate?
b. If so, specify the language known.

16. Enquire as to the amount and type of land held by each member of the lineage.
a. How was it acquired?
b. What form of land tenure is involved?
c. Indicate joint-holdings on genealogical diagram.

APPENDIX B

DUPLICATE EXAMPLE OF ORIGINAL CENSUS DATA

613.1 Khandan Study
Rankhandi
28 January 1956

JaTiiaa (Chamar) - Khandan "G"
*1. Unknown
*2. Nathuuwaa
*3. Naanak (d. from old age) w.d.
4. Naklii (45)(46) w.a., b. in R.
5. Rahtuu (7) (6)
6. Santaa (36)(25) m.-n.s., w.a. [no ch. '54]
*7. Mohanaa
*8. Kirpaa (d. at 36) w.a.(?), m.-n.s.
*9. Muaassii (d. at 40) w.d.
10. Nihaalaa (28)(26) w.a. [no ch. '54]
11. Geendaa (21)(18) w.a. [no ch. '54]

[Genealogical diagram of individuals 1-11]

The members of this khandan reside in the JaTiiaa basti in Mundia patti.

Question 3
Mohanaa (#7) once lived in Raajpur, Saharanpur District (parental village?). When his first wife died, he married the widow of his elder brother, Nathuuwaa (#2). This woman was the sister of a Rankhandi JaTiiaa named Disaundii (JaTiiaa Khandan "B", #8). She persuaded Mohanaa to move to Rankhandi. Mohanaa (and his step-son, Naanak, #3 ?) established himself in Rankhandi, and his descendants (and those of his elder brother) have continued to live here.

Question 4
All the members of this khandan reside in Rankhandi.

Question 5
We did not enquire about who is the head of this khandan.

Question 6
Although Nihaalaa (#10) knew that his grandfather had come from outside Rankhandi (see question 3), he did not mention any relatives who might be living in this other place or the contacts which might be maintained by such related groups. (We did not ask specific questions on this topic.)

Question 7
There are two "outsiders" currently living with a member of this khandan. They are the "maamaas" (mother's brothers) of Nihaalaa (#10). Their names are Peemaa and Mangat. They may not be "real" maamaas in that the original field note identifies them as "maamaas in relation" to Nihaalaa.

The members of this khandan are themselves "outsiders" - see question 3.

Question 8
Naklii (#4) provided the following information about the occupations of the following members of his branch of the khandan:

1. Unknown

2. Nathuuwaa: His occupation was not known by Naklii.

3. Naanak was always a shoemaker.

4. Naklii became an apprentice cobbler when he was sixteen years old. He worked as an apprentice for ten months under the tutelage of Baaruu, son of Khuddaa; Baaruu was Naklii's phuuphaa (father's sister's husband). Naklii left this apprenticeship after ten months due to some quarrel which arose between his father and Baaruu. Naklii completed his training as a cobbler under his father's direction. Since then he has followed the occupation of a shoemaker. He also tans leather, a technique which he learned by observing Uddaa (JaTiiaa khandan "F", #10) do this work.

5. Rahtuu is seven years old.

6. Santaa is a shoemaker.

Nihaalaa (#10) provided the following information about the occupations of the following members of his branch of the khandan:

7. Mohanaa always worked as a shoe-repairman.

8. Kirpaa joined the army in 1914. He enlisted in Muzaffarnagar City. Two months later, Kirpaa died in Meerut due to some kind of sickness. Before he became a soldier, Kirpaa worked as a casual laborer.

9. Muaassii learned how to make shoes from his father and always made shoes for his livelihood.

10. Nihaalaa, beginning at the age of twelve, spent three years learning how to make shoes under his father's direction. For about the past twelve years, he has worked independently making shoes in his own shop. At present there are two other men working in his shop. Their names are Peemaa and Mangat. (They are maamaas "in relation" to Nihaalaa.) Nihaalaa allows them to work in his shop free of charge. Nihaalaa said that he does so because he likes to have their company while working, and they engender a spirit of competition (in shoemaking) which he enjoys. ...

Nihaalaa said that he generally makes one pair of shoes in a day. He sells these shoes for three rupees and four annas a pair. Since the raw materials for one pair of shoes cost him two rupees, he obtains one rupee and four annas for his labor on each pair of shoes.

11. Geendaa became an apprentice cobbler under his father's direction at the age of twelve. After two years of instruction, his father died. He was then taught cobbling by his elder brother (Nihaalaa, #10) for about two years. For the past five years, he has been working as a shoemaker in partnership with his brother, Nihaalaa. He and his brother live as a joint-family. They pool the income from their shoemaking shop.

At present, Geendaa's brother (Nihaalaa) is looking for a skilled cobbler who will serve as a guru (teacher - in cobbling) for Geendaa (so that Geendaa can complete his training as a cobbler). Nihaalaa refused to name anyone when we asked if he had anyone in mind as a possible guru for Geendaa.

Question 9
All the members of this khandan have been married.

Question 10
Naklii (#4) told us that his father, Naanak (#3), had been married two times. After the death of his first wife, he married a karaav (widow) from the biraadarii. Naklii was born to the first wife, Santaa to the second.

Nihaalaa (#10) said that Mohanaa had a regular marriage, and after this wife died, he married the widow of his elder brother, Nathuuwaa (#2). Mohanaa's children are from his first wife.

Question 11
We did not learn of any instances of desertion in this khandan.

Question 12
There are no widows alive at present in this khandan. However, the widow of Nathuuwaa (#2) remarried his younger brother, Mohanaa (#7). Nihaalaa also told us that the widow of Kirpaa (#8) was remarried to some member of their caste in Saharanpur City.

Question 13
See questions three, ten and twelve.

Questions 14 & 15
We did not learn of any cases of orphans in this khandan, although Geendaa appears to have been taken care of by his elder brother after their father died. We did not find any cases of adoption, although Mohanaa probably took Naanak (#3) as a step-son when he married the widow of his elder brother (#2).

Question 16
Naklii (#4) said that all the members of his branch of the khandan (Nos. 2-6) are illiterate.

Nihaalaa (#10) said that he went to school for one year before he was twelve years old. As a result of this education, he knows enough Hindi to be able to maintain his accounts (in his shoemaking shop). He is the only member of this khandan with any formal education.

Question 17
No one in this khandan owns any land.

Question 18
There appears to be little solidarity, outside the nuclear family, among the members of this khandan. This may be due to Naklii (#4) and Santaa (#6) having had different mothers, although they are sons of the same father. A similar division occurred in the grandparental generation, where Mohanaa (#7) married his elder brother's widow after his first wife had died leaving him with two sons. The son of his elder brother and his sons by his first wife then had different mothers and the same "father" (step-father for Naanak).

Khandan Study Rankhandi: 1969

Khan "G" Jatiya Chamar
*1. Nathuwaa
*2. Nanak
3. Nakli (60)
A. Buddho (55), Chhapar, Muz.
*4. S/O No. 1
5. Santa (49)
A. Ram Kali (40), Bara Bans, Sah.
*6. Mohna B/O No. 1
*7. Muvasi
8. Nihala (45)
A. Chhajji (40), Deoband, Sah.
9. Sumitra (17), Kurdi, Sah.
10. Sushila (F-7)
11. Yogendra (3)
12. Genda (39)
A. Parsandi (38)
13. Kunwar Pal (13)
14. Dharam Pal (10)
15. Naresh (5)
16. Norti (F-2 1/2)
17. Salochna (10 Months)

S.No.  Name        Education  Occupation
8.     Nihala      V          He prepares shoes. He learnt shoemaking in Deoband.
10.    Sushila     I          She reads in Kanya Pathshala, Rankhandi.
14.    Dharam Pal  II         He reads in Primary School No. 2 at Rankhandi.
3.     Nakli       Nil        Prepares shoes at home.
5.     Santa       Nil        Works as an agricultural labourer.
12.    Genda       Nil        Prepares shoes at home. Also drove Tonga in 66-67.

APPENDIX C

CARD FORMAT AND CODES

Card Format

Card Column/Tape Position    Field
01-15                        Name
16                           Blank
17-25                        I.D. Number
  17-18                        Caste
  19                           Khandan
  20                           Generation
  21                           Father's Birth Sequence
  22-23                        Ind.'s Birth Sequence
  24-25                        Sex/Family Position
26                           Blank
27-28                        Stated Age 1954
29                           Blank
30-31                        Stated Age 1969
32                           Blank
33-34                        Education Level 1954
35                           Blank
36-37                        Education Level 1969
38                           Blank
39                           Residence/Job Locale
40                           Blank
41-46                        Occupation
  41-43                        Primary Occupation
  44-46                        Secondary Occupation
47                           Blank
48-50                        Land Owned
51-53                        Sequence Number

Redefinition of a previous field if individual is a wife:
41-42                        Blank
43-46                        Wife's Village Location
  43                           Compass Direction
  44-46                        Miles from Rankhandi
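A fixed-column layout of this kind can be read by slicing each record at the listed positions. The sketch below is illustrative only and is not the program written for the study: the field names, the 80-column card width, and the sample values are all assumptions made for the example, while the column ranges follow the card format above.

```python
# (start, end) card columns, 1-indexed and inclusive, from the card format above.
# The dictionary keys are hypothetical names chosen for this example.
FIELDS = {
    "name": (1, 15),
    "caste": (17, 18),
    "khandan": (19, 19),
    "generation": (20, 20),
    "father_birth_seq": (21, 21),
    "birth_seq": (22, 23),
    "sex_family_position": (24, 25),
    "age_1954": (27, 28),
    "age_1969": (30, 31),
    "education_1954": (33, 34),
    "education_1969": (36, 37),
    "locale": (39, 39),
    "primary_occupation": (41, 43),
    "secondary_occupation": (44, 46),
    "land_owned": (48, 50),
    "sequence": (51, 53),
}

CARD_WIDTH = 80  # assumed standard punched-card width


def parse_card(card):
    """Slice one card image into named fields, stripping blank padding."""
    padded = card.ljust(CARD_WIDTH)
    return {
        name: padded[start - 1:end].strip()
        for name, (start, end) in FIELDS.items()
    }


def build_card(values):
    """Inverse of parse_card: place each value at its starting column."""
    card = [" "] * CARD_WIDTH
    for name, text in values.items():
        start, end = FIELDS[name]
        width = end - start + 1
        card[start - 1:end] = list(text.ljust(width)[:width])
    return "".join(card)


# Hypothetical record; the values are invented for illustration.
sample = {
    "name": "NIHAALAA", "caste": "12", "khandan": "G", "generation": "3",
    "father_birth_seq": "2", "birth_seq": "01", "sex_family_position": "01",
    "age_1954": "28", "age_1969": "45", "education_1954": "01",
    "education_1969": "01", "locale": "1", "primary_occupation": "242",
    "secondary_occupation": "000", "land_owned": "000", "sequence": "010",
}

card = build_card(sample)
print(repr(card))
print(parse_card(card))
```

For a wife's card, columns 43-46 would be read as the village location instead of the occupation fields, as noted in the redefinition above.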

Miscellaneous Codes

Khandan (cc 19)
Arbitrary: A-Z, except W, X, Y, Z = outsiders

Generation (cc 20)
1 = 1st generation
2 = 2nd generation
etc. through 9

Father's Birth Sequence (cc 21)
0 = patriarch
1 = 1st male child
2 = 2nd male child
etc. through 9

Individual's Birth Sequence (cc 22-23)
00 = patriarch
01 = 1st child
02 = 2nd child
etc.

Sex/Male Birth Sequence/Family Position (cc 24-25)
00 = patriarch
01 = 1st male child
02 = 2nd male child
etc.
22 = female
55 = wife

Stated Age 1954 (cc 27-28)
99 = deceased/not born, coded as age
00 = no age stated

Stated Age 1969 (cc 30-31)
Same as 1954

Education Level 1954 (cc 33-34)
00 = no data
01 = 1st grade
etc. through 12
13 = B.A.
14 = M.A.
15 = Ph.D.
20 = literate but no formal education
21 = religious training
99 = no education

Education Level 1969 (cc 36-37)
Same as 1954

Residence/Job Locale (cc 39)
0 = no data
1 = lives and works in Rankhandi
2 = lives in Rankhandi, works elsewhere
3 = lives and works away from Rankhandi
4 = residence shift toward the village
5 = residence shift away from village
8 = not applicable
9 = married away from village

Women Only

Compass Direction (cc 43)
0 = no data
A = west
B = west by northwest
C = northwest
D = north by northwest
E = north
F = north by northeast
G = northeast
H = east by northeast
J = east
K = east by southeast
L = southeast
M = south by southeast
N = south
P = south by southwest
Q = southwest
R = west by southwest
Z = Rankhandi

Miles from Rankhandi (cc 44-46)
000 = no data
900 = village location not known, coded as distance

Caste Code (cc 17-18)
01 bhaat
02 bhat
03 bhangii
04 bharbujjaa
05 brahmin
06 chamar
07 darzii (Muslim)
08 dhiimaan brahmin
09 dhoobii
10 garariiaa
11 goosaaii
12 jatiiaa chamar
13 jhiiuaar
14 jooggii
15 julaahaa (Kabir Panthi)
16 julaahaa (Chamar)
17 julaahaa (Koli)
18 khatrii
19 kumhaar
20 miraassii (Muslim)
21 naaii
22 niilgar (Muslim)
23 pujaagir (Muslim)
24 rajput (Hindu)
25 rajput (Muslim)
26 saaii (Muslim)
27 siaanii
28 seekh (Muslim)
29 soonaar
30 teelii (Muslim)
31 vaish
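The wife's village location (cc 43-46) combines a compass letter with a mileage, so a four-character value can be decoded with a simple lookup. The following is a sketch under assumptions: the function name and return shape are hypothetical, and only the two special mileage values documented above (000 and 900) are handled. The village codes in Appendix D, such as N042 for Meerut, appear to follow the same letter-plus-miles pattern, though the appendix does not say so explicitly.

```python
# Compass letters as listed under "Compass Direction (cc 43)".
COMPASS = {
    "0": "no data",
    "A": "west", "B": "west by northwest", "C": "northwest",
    "D": "north by northwest", "E": "north", "F": "north by northeast",
    "G": "northeast", "H": "east by northeast", "J": "east",
    "K": "east by southeast", "L": "southeast", "M": "south by southeast",
    "N": "south", "P": "south by southwest", "Q": "southwest",
    "R": "west by southwest", "Z": "Rankhandi",
}


def decode_location(code):
    """Decode a 4-character location such as "N042" into (direction, miles).

    miles is None when the mileage field is 000 (no data) or
    900 (village location not known).
    """
    if len(code) != 4 or not code[1:].isdigit():
        raise ValueError(f"expected one letter and three digits: {code!r}")
    letter, miles_field = code[0], code[1:]
    if letter not in COMPASS:
        raise ValueError(f"unknown compass code {letter!r}")
    miles = None if miles_field in ("000", "900") else int(miles_field)
    return COMPASS[letter], miles


print(decode_location("N042"))
print(decode_location("F005"))
print(decode_location("0000"))
```

Mileage values above 900 other than 900 itself are passed through as plain numbers here, since their meaning is not documented in the code list.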

Occupation Code

Division I    100  Agriculture, livestock, forestry, fishing and hunting.
Major         110  Field crops; general farming.
Minor         111  Grain crops and plantation crops.
              112  Other crops (include vegetables).
              113  General agricultural laborer.
              114  Agricultural overseer.
              115  Agricultural supervisor.
              116  Sharecropping.
Major         120  Livestock.
Minor         121  Production and rearing of large livestock mainly for milk and animal power. Cattle, buffalo, and goats.
              122  Rearing of sheep and production of wool.
              123  Rearing animals for slaughter.
              124  Poultry production.
              125  Tending and herding livestock.

Division II   200  Manufacturing.
Major         210  Manufacturing of food items.
Minor         211  Production of rice, oil, flour etc. by milling, dehusking and processing of crops and foodgrains.
              212  Production of sugar and syrup from sugarcane in mills.
              213  Production of indigenous sugar, gur from sugarcane and production of candy.
              214  Production of butter, ghee, cheese and other dairy products.
              215  Production of other food products, sweetmeats, condiments etc.
Major         220  Textile (Cotton).
Minor         221  Cotton ginning, cleaning, pressing and baling.
              222  Cotton weaving on handlooms.
Major         230  Manufacture of wood and wooden products.
Minor         231  Sawing and planing of wood.
              232  Manufacture of wooden furniture and fixtures.
              233  Manufacture of structural wooden goods such as beams, posts, doors and windows.
              234  Manufacture of other wooden products such as utensils, toys and artwares.
              235  Working in paper mill.

Major         240  Leather and leather products.
Minor         241  Tanning and finishing of hides and skins and preparation of finished leather.
              242  Manufacture of shoes and other footwear.
              243  Repair of shoes and other leather footwear.
Major         250  Non-metallic mineral products other than petroleum and coal.
Minor         251  Manufacture of earthenware and earthen pottery.
              252  Manufacture of structural clay products such as bricks and tiles.
Major         260  Basic metals and their products except machinery and transport equipment.
Minor         261  Manufacture of iron and steel furniture.
              262  Manufacture of metal products (other than iron, brass, bell metal and aluminum) such as tin cans.
Major         270  Transport equipment.
Minor         271  Repairing and servicing of motor vehicles.
              272  Repairing of bicycles.
              273  Manufacture of other transport equipment such as animal drawn and hand drawn vehicles.

Division III  300  Construction.
Major         310  Construction.
Minor         311  Construction and maintenance of buildings, including erection, flooring, decorative constructions, electrical and sanitary installations.
              312  Construction and maintenance of roads, railways, bridges and tunnels.
              313  Construction and maintenance of waterways and water reservoirs such as bunds, embankments, dams, canals, tanks, tubewells and wells.

Division IV   400  Electricity, Gas, Water and Sanitary Services.
Major         410  Electricity and gas.
Minor         411  Generation and transmission of electric energy.
              412  Distribution of electric energy.
Major         420  Water supply and sanitary services.
Minor         421  Collection, purification and distribution of water to domestic and industrial consumers.
              422  Garbage and sewage disposal, operation of drainage system and all other types of work connected with public health and sanitation.

Division V    500  Trade and Commerce.
              501  Operation of pawn shop.
Major         510  Wholesale trading.
Minor         511  Wholesale trading in cereals and pulses.

Major         520  Retail trade.
Minor         521  Retail trading in cereals, pulses, vegetables, fruits, sugar, spices, oil, fish, dairy products, eggs and poultry.
              522  Retail trading in beverages, tea (leaf), coffee (seed and powder) and aerated water.
              523  Retail trading in tobacco, bidi, cigarettes and other tobacco products.
              524  Retail trading in foodstuffs such as sweetmeat, condiments, cakes, biscuits etc.
              525  Retail trading in animals.
              526  Retail trading in fibers, yarns, dhoti, saree, readymade garments of cotton, wool, silk and other textiles and hosiery products. This includes retail trading in piece-goods of cotton, wool, silk and other textiles.
              527  Retail trading in footwear, head-gear, such as hats, umbrellas, shoes and chappals.
              528  Retail trading in earthenware and earthen toys.
Major         530  Trade and Commerce Miscellaneous.
Minor         531  Money lending (indigenous).

Division VI   600  Transport, Storage and Communication.
Major         610  Transport.

Minor         611  Transporting by road by means of hackney carriage, bullock cart or motor rickshaw.
              612  Transporting by animals such as horses, elephants, mules and camels.
              613  Transporting by man, such as carrying of luggage, hand-cart driving, rickshaw pulling, and cycle rickshaw driving.
Major         620  Communication.
Minor         621  Postal, telegraphic, wireless and signal communications.

Division VII  700  Services.
Major         710  Public services.
Minor         711  Public service in police.
              712  Administrative departments and offices of the central government.
              713  Public service in administrative departments of quasi-government organizations such as municipalities, local boards etc.
              714  Administrative departments of state governments.
              715  Military service.
              716  Overseer for military.
              717  Village watchmen.
Major         720  Education and scientific services.
Minor         721  Educational services such as those rendered by technical colleges, technical schools and similar vocational institutions.

              722  Educational services such as those rendered by colleges, schools and similar other institutions of non-technical types.
              723  Students.
              724  Research institutes.
Major         730  Medical and health services.
Minor         731  Public health and medical services rendered by organizations and individuals such as by hospitals, nursing homes, maternity and child welfare clinics and also by hakim, unani, ayurvedic, allopathic and homeopathic practitioners.
              732  Veterinary services rendered by organizations and individuals.
Major         740  Religious and welfare services.
Minor         741  Religious services rendered by religious organizations and their establishments maintained for worship or promotion of religious activities; this includes missions, ashrams and other allied organizations.
              742  Religious and allied services rendered by pandit, priest, preceptor, fakir and monk.
Major         750  Legal services.
Minor         751  Legal services rendered by barrister, advocate, solicitor, mukhter, pleader, mukuris and munshi.
Major         760  Business services.

Minor         761  Business services rendered by accountants, auditors, bookkeepers or similar individuals.
Major         770  Recreation services.
Minor         771  Recreation services rendered by cinema houses by exhibition of motion pictures.
              772  Recreation services rendered by organizations and individuals such as those of theatres, dancing parties, musicians, exhibitions, circus and carnivals.
Major         780  Personal services.
Minor         781  Services rendered to households such as those by domestic servants and cooks.
              782  Services rendered to households such as those by governesses, tutors and private secretaries.
              783  Services rendered by hotels, boarding houses, eating houses, cafes, restaurants and similar organizations providing lodging and boarding facilities.
              784  Laundry services rendered by organizations and individuals; this includes all types of cleaning, dyeing, bleaching and dry cleaning services.
              785  Hair dressing and other services rendered by organizations and individuals such as those by barbers, hairdressing saloons and beauty shops.
              786  Tailoring.
Major         790  Services not elsewhere classified.
Minor         791  Services rendered by organizations or individuals not elsewhere classified.

Division VIII 800  Activities not adequately described.
Major         810  Activities unspecified and not adequately described.
Minor         811  Activities unspecified and not adequately described, including activities of such individuals who fail to provide sufficient information about their industrial affiliation to enable them to be classified.
              812  Beggar.
Major         888  Not applicable.
Major         890  Old and retired.
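Because the occupation code is hierarchical, the division and major group of any entry can be read from its digits: the hundreds digit gives the division and the tens digit the major group. The sketch below illustrates this under stated assumptions; the function name and the returned dictionary are hypothetical, and the digit rule is inferred from the layout of the list rather than stated in it. A few entries (501, 888, 890) do not fit the pattern neatly and are classified here purely by their digits.

```python
def split_occupation_code(code):
    """Split a three-digit occupation code into its hierarchical parts.

    Returns the division (e.g. 200), the major group (e.g. 240) and the
    level of the code itself, judged only from its trailing zeros.
    """
    if not isinstance(code, int) or not 100 <= code <= 999:
        raise ValueError(f"occupation codes are three-digit integers: {code!r}")
    division = code // 100 * 100
    major = code // 10 * 10
    if code % 100 == 0:
        level = "division"
    elif code % 10 == 0:
        level = "major"
    else:
        level = "minor"
    return {"code": code, "division": division, "major": major, "level": level}


# 242 = manufacture of shoes and other footwear (Division II, Major 240).
print(split_occupation_code(242))
# 113 = general agricultural laborer (Division I, Major 110).
print(split_occupation_code(113))
```

Grouping coded individuals by the `division` or `major` value gives tabulations at a coarser level than the minor code without any recoding of the cards.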

APPENDIX D

MARRIAGE RECIPROCITY MATRIX

Code Village ^District 01 05 07_Castes

' 02 10 12 . 20 21 30 ' 31 TotalAO 05 Barbas Sah, — 0/1 WWW* 0/1A006 Jaipur Sah, — 0/2 0/2A008 Balwakheri Sah, -— 0/1 0/1A009 Jaroda Sah, — 0/1 — 0/1 — — - 0/2A011 Lohari Muz, 0/2 - 0/2A014 Thana Bhawan Muz, ■ — — 2/7 — 0/1 — 2/0 4/8A016 Kuaakhera Sah, 1— 0/2 0/2A017 Babasi Sah, — 0/2 0/2A018 Berakheri Sah, —— 0/1 0/1A019 Titron Sah, 1/0 — — 1/0A024 Uun Pindora Muz, 0/4 0/4A036 Dakrola. Kar, — 0/2 —— 0/2B002 Mudhakra Sah, — -— 0/1 0/1BOOS Jhakwala Sah, — 0/1 0/1BOOS Nanharaa Sah, 0/1 - 0/1

*District Abbreviations: Sah. = Saharanpur District; Muz. = Muzaffarnagar District; Kar. = Karnal District; Mee. = Meerut District; Bul. = Bulandshahr District

CastesCode Village District 01 • 05 07 09 ■ 10 12 20 • 21 30 31 . TotalBOOB Muskiipar Sah, 2/0 2/0B007 Rattanheri Sah. 0/1 — - 0/1B009 Jaroda Panda Sah» — ■ 3/0 3/0B010 More Majra Sah, -T- 0/1 0/1 0/2B013 Subri Sah, 0/1 . 0/1BG21 Balu Majra Sah, ■- 0/1 --\ —— — -- — — 0/1B024 Khandlana Sah, "-■ 2/0 2/0COOS Ambahta Sheikha Sah, 0/2 — - 0/2C004 Badheri Sah, 0/1 -- - 0/1C006 Chiroun Sah, 0/3 —— 0/3 -- 0/6C007 Shamlana Sah, — 0/2 0/2C009 Sabarpur Sah. o/i; 2/0 I/O 3/1C010 Badgaon Sah, — 2/0 2/0C014 Korali Muz, 0/1 0/1C015 Sona Arjunpar Sah, 1/0 1/0 --- —— — • . “ ” -r -- 2/00017 Rampur Town Sah. — ' 0/1 — 0/1 0/2C021 Dehri Sah, r— 1/0 1/0C025 Ambehta Chand Sah, - 1/0 — ' - 0/1 0/1 - 1/2C026 Mohibddinpur Sah, - 1/0 1/0C028 Shalhapur Sah, 1/0 1/0C029 Nakur Sah, 1/0 1/0 2/0COSO Juddha Sah, 2/0 2/0

' Castes ■Code Village District 01 05 07 09 10 12 20 21 30 31 Total

C915 Khudana Sah, — — — — — — 0/2 — — 0/2D003 Kulsat . Sah, — — — — — — — — 1/1 — 1/1D008 Nonabari' Sah, — — — 0/1 — — — — — — 0/1D010 Chandena Koli ' Sah, — 1/1 — — — — — — 0/2 1/0 2/3D011 Dalheri Sah. — 1/0 — — . — — — — — 1/0D012 Mirzapur Sah, — 1/0 — — — — — — — — 1/0D018 Nandi Sah, — 0/1 — 0/1 — — — — ' 0/1 — • 0/3D023 Saharanpur City Sah. — — — — — 3/2 0/4 — — 1/1 4/7D028 Shrana Sah, — 1/0 — — — — — — — — 1/0D029 Dabki Sah, — — — 0/1 ™ — — — — ' — 0/1E001 Amlia Sah, — — 1/0 — — — — — — — 1/0E002 Imlia Sah. — — — ' 1/0 — — -r- — — 1/0E009 No j alii Sah, — — — — — — — 0/1 — : 0/1E010 Talheerii Sah. — 1/2 1/0 2/0 — — — — — — 4/2E012 Kota Sah. — 1/0 — — — — — — — 1/0 2/0E013 Bakera Sah. — — — — — — — — — 2/0 2/0E014 Salempur Sah, — — — — — — — — — 0/1 0/1E015 Sadhanpur Sah. — 1/0 — — -- — — — . — — ■ 1/0E016 Nagal Sah. — 1/0 — — — — '0/1 — — — 1/1E017 Khajurwaalaa Sah,. — 1/0 — 0/1 — — — — — 0/1 1/2E031 Mandla Sah. — 1/0 — — — — — — — — 1/0

Code Village District 01 05 07Castes

09 10 12 20 ' 21 30 31 TotalE034

-----s—

Randavli • Sah, 0/1 *** 0/1E036 . Dayalpur- Sah, - 2/0 - —— . 2/0E038 Behat Sah, 0/1 1/0 — — 0/1 — . 1/2E039 Jasmor Sah, 0/1 0/2 — — — 0/3E040 ' Shahpur Sah, '— 1/0 1/0E041 Sadhavli Qadin Sah, — 1/0 1/0E043 Babal Nagala Sah, 0/1 —m —— 0/1E910 Amboli Sah, 0/1 - — - 0/1F005 Deoband Town Sah, — 1/2 . 0/1 — 5/5 0/1 - 0/2 — — 6/11F007 Karanjali Sah, — 0/1 0/1F014 Phaloda Sah, —— 0/1 0/1F016 Sherpur . Sah, — — 0/1 0/1F018 Balaawa Sah, —— 2/0 . ■- — — — 2/0F024 Bhagwanpur Sah, — 0/1 —e~ 0/1 — - — 0/2G003 Salempur Sah, 0/1 0/1C-007 Chandpur Sah, 0/1 — — 0/1G009 Sadharanpur Sah. —— 1/0 1/0G011 Tikolaa Sah, — 1/0 1/0G012 Chandaheri Sah, 0/1 0/1GO 13 Mundyaki Sah. — 0/1 0/1G020 Landhora Sah. 0/1 0/1

CastesCode Village District 01 05 07 09 10 12 20 21 30 31 Total

G021 Roorkee Sah, 1/1 1/0 — 0/1 — — * — — 2/2G026 Amli Sah, — — 0/4 — — — — — ' — 0/4G033 Sarai Sah, 0/1 — ' — — — “ — — — . 0/1G034 Bahadurpur Sah, — 1/0 — — — — — — — — 1/0G035 Jawalpur Sah, — — 0/2 — — — 0/2H004 Jaroda Jat Sah, — 0/1 0/1 — — 1/1 — — — — 1/3H010 Ransura Sah, — 1/1 — — — ;— — "— — — 1/1H012 Kapur i Sah, — — 0/1 — — — — — — 0/1H014 Mandavli Sah, — — — — — ” . . “ ™ " 0/1 0/1' J004 Badhaii Muz, — 0/1 ~ — 1/0 — — — — 1/1J005 Kainoki Sah, — 0/1 — — — — — — — ' 0/1J007 Kutbpur Muz, — — — 1/0 1/3 — -— — . *— 1/0 3/3J008 Barlee .. Muz, — • 3/0 1/0 0/1 — —- — . — — 0/1 4/2J012 Khaikheri Muz, 0/1 0/4 — <— 0/2 0/7K003 Rohana Muz, — — 1/1 — 1/0 0/1 - 0/1 -— 0/1 2/4KQ05 Kampur Muz, — — — — — ^ 0/1 — '— 0/1K008 Chapar Muz, — — 0/1 — ■— 0/4 — — 0/2 — 0/7K015 Sikri Abdulpur Muz, — 0/1 — — -- — — — -— 0/1K016 Rahmatpur Muz, . — 0/1 — — — — ~ — — - •— 0/1K019 Bhukarheri Muz, — — • — — 1/2. — — — — 1/2K020 Mornaa Muz, — — — — — 0/2 0/2

CastesCode ' Village District 01 05 07 09 10 12 20 21 30 31 Total

LOOS Badheri Muz, ■—- 0/1 — - — 0/1 ——- — 0/2 —— 0/4L009 Datyana • Muz, ^ ^ — — 1/3 1/31010 Mustaabaad Muz, — — •. — 0/1 — 0/1LOU Khari Pachendra Muz, — 0/2 — — 3/0 — — — — — 3/2L015 Rahkara Muz, — — > 0/1 — —- — — 0/11022 Kakrauli Muz, ■ — — 1/0 — — — — — — — 1/0L024 Tandhuru Muz, — 0/2 . — — 0/1 -- --- — — — 0/3L027 Churiala Muz. — 1/0 — — — .-- — — — — 1/0M004 Barkali • Muz. -— 1/0 — — — — — — — — 1/0• M007 Ban Nagar Muz, . — 0/1 — — — — — — — — 0/1M008 Sharipur Muz, — — — — — — — — 0/1 — 0/1M011 Muzaffarnagar Town Muz, — 1/1 — - — — — ' — — — 2/1 3/2M014 Bahadurpur Muz. — — — — — >— 0/1 — — 0/1M015 Nivara Muz, — — — — -- — — -— — 0/1 0/1M016 Bhiki Muz. — 0/2 — — — — — — — — 0/2M017 Sikh era Muz, — 0/1 — — ■— — — — — — 0/1MO 18 Joharaa Muz, — — — — — — — — 0/1 0/1M021 Kaval Muz. •' — — — — — — — 0/1 0/1M027 Tiloora Muz, — — — — — ■ — ■ — — . — 0/2 0/2M02 9 Sohajni Muz, — 0/1 — — — . ' — — — — 0/1M034 .. Sakhauti Muz. — . — 1/1 — — — ■ — — — 1/1


CastesCode VillageM038 AturaM042 NavangabadM049 MaukhasM050 MohaM051 NaglaN009 Kheri DudaheriNO 11 Pina KanouniN012 KanauniN013 BarwalaN014 TailayN015 NarmanaN016 BasdharaN017 PuraN018 Pur BalianN918 ChaandpurN020 Jiivana.. , ,N021 NavlaN022 AtwaraN023 KhatoliN024 AncholiN924 Mundavli

DistrictMee,Mee,Mee,Mee,Muz,Muz.

. Muz. Muz. Muz.

■ Muz, Muz, Muz, Muz, Muz. Muz, Muz, Muz, Muz. Muz. Muz, Muz,

01 05 07 09 10 12 20 21 30 31- - - 0/1 - - ~ - — -— 0/1

1/00/1

0/10/12/02/00/4

0/1— — — 0/1 — — — — — . —

0/1 1/1 — ■— 0/1 — 0/2 .— — —

— 0/1 — — — — — — — . —

— 0 / 1 — — — — — — — - T

— ■ 0 / 1 — - — ■ — — — — 0 / 1 — —

— — 0/2 — — — — — — •— —

— — 0/2 — — — - — —

— 1/3 — — . — — — — — 0/1— — — ' — — ~ ~ — — 0/2

— 0/2 — 1/0

0/10/3

0/1

Total0/10/10/10/11/50/10/10/20/20/21/40/21/00/31/00/20/42/02/00/40/1


CastesCode VillageN025 RardhanaN026 SalawaN027 FaridpurNO 2 8 PaliNO 2 9 KapsodN030 JarathpurN035 TitaraiN040 SikheraN042 MeerutN066 GhaziabadN095 SurajpurP013 JalalpurP015 DhandavliP016 HarsaliP017 KakraaP019 PalraP020 ShikarpurP022 MuhemmadpurP023 KalianpurP026 BudhanaP027 . Rasulpur

District 01 05 07 09 10 12 20 21 30 31 TotalMee, 0/1 — — — — ■— — . — — 0/1Mee, — 0/2 — — — — — — — — 0/2Mee, ■ — 0/2 — — — — -- — — — 0/2Mee, — 1/2 — — — — — — — — 1/2Mee, — 0/2 — — — — — — — — 0/2Mee, — 0/1 — — — — — — — — 0/1Muz, — 0/1 •"* — — •— — — — - 0/1Mee, ' — 0/1 — — — — — — — — 0/1Mee, — 0/2 — -— ’— *— 0/1 —~ 0/3Mee, — — ■ 0/2 — — ' -- — 0/2Bui. — ~ — — — — — 0/1 — — 0/1

.Muz, — — — — — — — — 1/0 1/0Muz. ’ — — — — — — -- _ — ' — 0/1 0/1Muz, ■— ™ — * — 0/1 0/1Muz. — — — — 1/0 — — — — 0/1 1/1Muz. — 0/1 — — — — — — -- — 0/1Muz, — . — -- — 1/2 — — — — 0/1 1/3Muz. — 0/1 — — — — — — — — 0/1Muz, — — — 1/0 — — — — — — 1/0Muz. — 0/1 — -- — . — — — — 0/1 0/2Muz/ — 0/1 — — — — — — — — 0/1


CastesCode ' Village ' District 01 05 07 09 10 12 20 21 30 31 ■ TotalP032 Gotka Mee, 0/4 — — 0/3 0/7P072 Delhi U.P. — 1/0 1/0P090 Battaparsol Bui. — 0/1 0/1Q006 Charthawal Muz, 1/0 0/1 2/2 3/3Q011 Dhauri Muz, 0/2 ■ — --1 0/2Q012 Jasoii Nagla Muz. 1-• 0/1 0/1 0/2Q018 Sisauli Muz. - 0/1 0/1Q021 Kudana Muz, —— 0/1 -. . . — ■ 0/1Q022 Udampur Muz, —— 0/1 0/1Q023 Kharar Muz, - 0/1 0/1Q028 Disala Muz, —— 0/1 0/1Q029 Chova Muz, - 0/1 0/1Q030 Unchaagon Muz, - 2/0 — 1/0 3/0Q031 Khandhla Muz, - 0/1 0/1R003 Kutesra Muz, — o/l 0/2 " ' 0/3R008 Kanaheri Muz, 0/2 -. 0/2R009 Baralsi Muz. —— 0/1 0/1 , —— 0/2R010 Kathergad Muz. 0/1 0/1R012 Pipalhera Muz. 0/4 0/4.R013 Gokama Muz, — 0/1 0/1ROM Hiranwara Muz. ' —— 1/0 — — —e-. — — — —- — — . . i/o


' Castes .■Code Village District 01 05 07 09 10 12 20 21 30 31 Total

R015 Kairi Muz, — 0/7 — — — — — 0/1 — — 0/8R016 Baabri Muz, — — — — — — — — — 3/3 3/3R022 Garhi Muz, — 1/0 — ~ 2/0 — — — ' — 1/0 4/0R023 Shamlii Muz, — 0/2 — — — — ■ — — 2/5 2/7R025 Leloinkheri ■ . Muz, — 0/1 — — — — — — — 0/1R026 Jhingaanaa Muz, — <— — — — — — — 0/1 0/1R031 Kairana Tehsil Muz, — — — — 0/1 — — — — 1/2 1/3
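Each row of the matrix ends in a Total that is the column-wise sum of its paired cell entries, which gives a simple consistency check on the tabulation. The following is a minimal sketch, assuming each cell is either a pair written as `n/m` or an empty marker; the appendix does not define what the two numbers stand for, so the code treats them only as a pair of counts. The function name and the use of "-" for empty cells are choices made for this example.

```python
def row_total(cells):
    """Sum the paired counts in one matrix row and return them as "n/m".

    Cells without a "/" (for example "-") are treated as empty.
    """
    first_sum = 0
    second_sum = 0
    for cell in cells:
        if "/" not in cell:
            continue
        first, second = cell.split("/")
        first_sum += int(first)
        second_sum += int(second)
    return f"{first_sum}/{second_sum}"


# Cell values transcribed from two rows of the matrix; the positions of the
# empty cells are not significant for the check.
thana_bhawan = ["-", "-", "2/7", "-", "0/1", "-", "2/0"]        # listed Total: 4/8
deoband_town = ["-", "1/2", "0/1", "-", "5/5", "0/1", "-", "0/2", "-", "-"]  # listed Total: 6/11

print(row_total(thana_bhawan))
print(row_total(deoband_town))
```

Running the same check over every row would flag any village whose listed Total disagrees with its cells, whether from a coding error or from damage to the table in reproduction.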

SELECTED BIBLIOGRAPHY

ALLUISI, EARL A,1967 ' Basic Fortran' for Statistical Analysis.

Homewood, 111.: Dorsey Press.BARTH, FREDERIK

1959 uSegmentary Opposition and the Theory of Games: A Study of Pathan Organization." ' Political Leadership Among' Swat 'Pathans . London School of Economics, Monographs in Social Anthropology,

-.No, 19, New York: Humanities Press, 52-64.BLALOCK, HUBERT M.

1969 ' Theory Construction. Englewood Cliffs, N.J.:. Prentice-Hall,

BURTON, MICHAEL1970 "Computer Applications in Cultural Anthropology^

Computers and the Humanities, 5:37-46.DOWNIE, N. M. and R. W. HEATH

1965 Basic Statistical Methods, 2nd ed. New York: Uarper and Row.-

DRIVER, HAROLD E.
1961 "Introduction to Statistics for Comparative Research." Frank W. Moore, ed., Readings in Cross-Cultural Methodology. New Haven: Human Relations Area Files Press, 310-338.

1965 The Use of Computers in Anthropology. The Hague: Mouton Press.

FISHER, R. A.
1935 The Design of Experiments. Edinburgh, Scotland: Oliver and Boyd.

FORD, CLELLAN S. (ed.)
1967 Cross-Cultural Approaches. New Haven: Human Relations Area Files Press.

HARRIS, MARVIN
1968 The Rise of Anthropological Theory. New York: Crowell Co.

HERSKOVITS, MELVILLE J.
1967 Cultural Dynamics. New York: Alfred A. Knopf.

IBM CORPORATION
1970 System 360 Scientific Subroutine Package, Version III, Manual Number GH20-0205-4, 5th ed. Poughkeepsie, N.Y.

KAPLAN, ABRAHAM
1964 The Conduct of Inquiry: Methodology for Behavioral Science. Scranton, Pa.: Chandler Publishing Co.

KLUCKHOHN, CLYDE
1949 Ideological Differences and World Order. New Haven: Yale University Press.

LEACH, EDMUND
1961 Rethinking Anthropology. London: Athlone Press.

LEVI-STRAUSS, CLAUDE
1963 Structural Anthropology. New York: Basic Books.

LEWIS, OSCAR
1965 Village Life in Northern India. New York: Random House.

MAHAR, J. M.
1966 "Marriage Networks of the Northern Gangetic Plain." Unpublished Ph.D. dissertation, Cornell University, Ithaca, N.Y.

MOORE, FRANK W. (ed.)
1961 Readings in Cross-Cultural Methodology. New Haven: Human Relations Area Files Press.

NAROLL, RAOUL
1967 Data Quality Control: A New Research Technique. Glencoe, Ill.: The Free Press.

NEVILL, H. R.
1921 "Rankhandi." District Gazetteer of the United Provinces of Agra and Oudh. United Provinces: Government Press, 308-309.

NUTINI, HUGO
1968 San Bernardino Contla. Pittsburgh: University of Pittsburgh Press.

PASSIN, H.
1942 "Tarahumara Prevarication: A Problem in Field Method." American Anthropologist, 44:235-247.

ROTH, DANIEL H.
1970 "Cluster Analysis for the Biological and Social Sciences." Smithsonian Institution Information Systems Innovations, Vol. II, No. 2:1-19.

SELBY, SAMUEL M.
1970 Standard Mathematical Tables, 18th ed. Cleveland: Chemical Rubber Co.

SHERIF, MUZAFER and CAROLYN W. SHERIF
1969 Interdisciplinary Relationships in the Social Sciences. Chicago: Aldine.

SIMMONS, LEO W.
1967 "Statistical Correlations in the Science of Society." Frank W. Moore, ed., Readings in Cross-Cultural Methodology. New Haven: Human Relations Area Files Press, 221-245.

SIMON, JULIAN L.
1968 Basic Research Method in Social Science. New York: Random House.

SUSZYNSKI, NICHOLAS J.
1969 "Recent Advances in Source Data Automation." Smithsonian Institution Information Systems Innovations, Vol. II, No. 3:1-17.

TEXTOR, ROBERT B. (ed.)
1967 A Cross-Cultural Summary. New Haven: Human Relations Area Files Press.

WHITE, HARRISON C.
1963 Anatomy of Kinship: Mathematical Models for Structures of Cumulated Roles. Englewood Cliffs: Prentice-Hall.

WILLOUGHBY, R. R.
1940 "Cum Hoc Ergo Propter Hoc." School and Society, 51:485.
