CHAPTER – I
RESEARCH METHODOLOGY
RESEARCH: MEANING
The word research is derived from the French word 'rechercher', which means
to seek again. Research is an activity directed at 'the systematic search for pertinent
information on a topic'. Most early research was directed at revealing the
mysteries of nature. This process is still going on with greater vigor and emphasis.
Thus, the urge to re-examine and re-understand the world around us may rightly be
called research. Research is also defined as 'the systematic and objective
analysis and recording of controlled observations that may lead to the development
of generalizations, principles or theories, resulting in prediction and possible
ultimate control of events'.
DEFINITION
1. "Scientific research is systematic, controlled, empirical and critical
investigation of hypothetical propositions about the presumed relations among
natural phenomena" — Kerlinger, Fred. N.
2. "A social research is the systematic method of discovering new facts or
verifying old facts, their sequences, interrelationships, causal explanations, and the
natural laws which govern them". — P. V. Young.
CLASSIFICATION
There are two broad classifications of research, viz. research in the physical or
natural sciences and research in the social sciences.
Social research attempts to discover facts concealed in a social phenomenon
or the law governing it. Social research is a tool used by social scientists to
understand the social processes, social values, ideas, reality, their interrelationship
and linkage with natural laws. Social behaviour is affected by a large
number of environmental and biological factors besides the value system of the
society. Social research therefore attempts to reveal the cause and effect
relationships existing in various social phenomena and the various factors contributing
to change in society.
CHARACTERISTICS OF SOCIAL RESEARCH
Following are the characteristic features of social research:
1) Social research deals with the social phenomena.
2) Social research is carried on both for discovering new facts and verification
of the old ones.
3) Social research tries to establish causal connections between various
human activities.
4) Social research aims at discovering new facts.
5) Social research assists in the understanding of evolution of new theories.
6) Social research is dynamic in nature.
7) Social research in any field is interrelated.
8) In social research, establishing the interrelationships between the variables under
study is a must.
9) Research in social science is complementary to research in physical sciences;
the two branches of knowledge help each other, and this is the way to
progress.
SCOPE OF RESEARCH
The goal of research is to improve the level of living in society. The word
research carries an aura of respect.
Some of the questions asked most frequently by students in introductory
courses on social research are,
a) What do social researchers concern themselves with?
b) On what general assumptions do they base their researches?
c) Research for what?
If a question is added as to what methods and techniques researchers use, the
whole field of social research has to be unfolded. The field is broad and complex;
the unfolding is slow, gradual and at times uncertain.
The scope of social research may be confined to any of the following fields, viz.
a) Explorations which provide new insights into organized society and social
structures.
b) Researches which develop new horizons in scientific exploration, advance
and test new principles or procedures, and suggest new concepts.
c) Studies which attempt to test or challenge existing theories, and revise them
in the light of new evidence.
d) Studies which aim at collection and analysis of data more or less within the
existing frame of scientific theory and established techniques of exploration.
e) Studies of an experimental nature in which the systematic study of social life
is carried on under conditions of control and experiment.
Thus, the field of social research is virtually unlimited, and the materials of
research endless. Every group of social phenomena, every phase of social life,
every stage of past and present development is material for a social scientist.
SIGNIFICANCE OF RESEARCH
1) Research has an important role in guiding social planning. Knowledge of
society and of the cultural behaviour of people is required for proper planning and
for their development, because knowledge and the cultural behaviour of
human beings are interdependent. Reliable, factual knowledge is
needed to take planning decisions. This is possible by means of
research.
2) Knowledge is a kind of power with which one can foresee the implications of
a particular phenomenon. It also dispels the "rust" of old settings, superstitions
etc. by throwing light on them for the sake of welfare and development. Thus, social research
may have the effect of promoting better understanding and social cohesion.
3) Research is charged with the responsibility of establishing facts reliably.
Thus, it affords a considerably sound basis for prediction. Without it, programmes
are bound to fail, which may have a serious impact on society, e.g. the
Bhopal gas tragedy. Thus, prediction gives better control over
phenomena and helps successful planning. This leads to cherished goals.
4) It is the role of the researcher to effect constant improvement in the
techniques of his trade, i.e. research. He works in spatial-temporal contexts.
Each challenge he faces forces him to perfect his techniques. In other words, the
technique of research has to attain greater perfection, and so big strides in
research lead to big technological changes. Examples are electronics,
industrialization etc.
TYPES OF RESEARCH
Social research has a wide area of application. Important types of social
research are: fundamental research, applied research, quasi-social research, action
research and policy research. Each of these types is briefly described here.
a) Fundamental Research
It aims at generating theoretical knowledge and understanding of the subject
matter. It pursues knowledge for the sake of knowledge. More often than not it is a
unique intellectual exercise aimed at developing theoretical knowledge or
providing correct explanation of social phenomena. Its object may also be to verify
an existing theory or to establish a new theory or social law. It may also aim at
throwing additional light on an existing theory or at revealing some of the missing links in
existing knowledge about some social aspect or experience.
b) Applied Research
Whereas fundamental research is theoretical, applied research is empirical
and practice based. It incorporates some value judgment and is based on utilitarian
approach. In applied research, the research findings are utilized for the solution of
some felt need or some immediate problem or to form a basis for laying down
certain guidelines for future operations. Applied research is based on fundamental
research. It uses the theoretical concepts and tests them in real-life situations.
Fundamental research forms the building block for applied research. However, the
two cannot be put into tight divisions, as they are complementary and one forms
the basis of the other.
c) Policy Research
Research having policy implications and application value is known as
policy research. Policy research is directed at identification of factors having
implications in organizational policy or for making changes in an existing policy.
d) Action Research
The purpose of action research is to acquire new skills or new approaches to
solve certain problems. For example, action research may be conducted to develop a
demonstration programme for testing a management technique. This type of
research is practice based and directly relevant to real life situations. This provides
a framework and a basis for problem solving and new developments. Action
research is concerned with results, and innovations through controlled
experimentation.
e) Quasi-social Research
Some problems in social sciences cannot be classified as belonging to an
exclusive area of knowledge. Rather, these are interdisciplinary in nature and
require close collaboration among researchers from more than one discipline. This
may include sometimes even physical sciences. This type of research is known as
quasi-social research. The research methodology in such cases is designed to suit
its multidisciplinary nature. Theories and laws of different disciplines are used for
interactive reasoning and drawing conclusions.
RESEARCH METHODS
Research methods are basically concerned with observation of reality,
defining the problem and its dimensions, a planned approach towards analysis of
the problem, interpretation of information and drawing conclusions. There are
many systems for classifying the methods of research; the following are some
important methods followed in research.
1) Historical Method
It describes 'what was'. "The process involves investigating,
recording, analyzing and interpreting the events of the past for the purpose of
discovering generalizations that are helpful in understanding the past and the
present and, to a limited extent, in anticipating the future".
Because of the difficulty in obtaining dependable data, historical research is
the most difficult type of research. The main purpose of the historical research is to
arrive at an accurate account of the past so as to gain a clearer perspective of the
present. Historical research is not based upon experimentation, but upon reports of
observation which cannot be repeated. The data is collected from past events in the
form of various types of documents, relics, records and artifacts having a direct or
indirect impact on the event under study.
2) Descriptive Method
It describes 'what is'. It is concerned with describing, recording, analyzing
and interpreting the existing conditions. It involves some type of comparison or
contrast and attempts to discover relationships between existing non-manipulated
variables. It is also referred to as the normative survey, or 'status' or 'trend' research.
Survey studies are designed to
determine the present status of a given phenomenon. They are often of
considerable immediate value. They are, however, of relatively limited scientific
sophistication.
Descriptive studies are more than just a collection of data; they involve
measurement, classification, analysis, comparison and interpretation.
Descriptive studies investigate phenomena in their natural setting. Their
purpose is both immediate and long range.
3) Experimental Research Method
It describes "what will be" when certain variables are carefully controlled or
manipulated. Here the focus is on variable relationships. In experimental
research there is deliberate manipulation. Experimentation as a scientifically
sophisticated technique is capable of providing precise answers to precise
problems. It is the most exacting and difficult of all methods and also the most
important from the scientific point of view. The experimental method provides much
control and, therefore, establishes a systematic and logical association between
manipulated factors and observed effects. Control, manipulation, observation and
replication are the four essential characteristics of experimental research.
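The four characteristics named above can be illustrated with a small simulation. What follows is only a minimal Python sketch under stated assumptions, not a procedure taken from the text: the function names run_experiment and simulated_outcome, the baseline score of 50, the noise level and the treatment effect of 5 are all hypothetical values chosen for illustration.

```python
import random
from statistics import mean


def run_experiment(subject_ids, outcome_fn, seed):
    """Run one simulated experiment and return the observed effect.

    Control:      subjects are assigned to groups at random, so other
                  differences between subjects are balanced on average.
    Manipulation: only the 'treated' flag differs between the groups.
    Observation:  the outcome is measured for every subject.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    treatment_group, control_group = shuffled[:half], shuffled[half:]

    treated_mean = mean(outcome_fn(s, True, rng) for s in treatment_group)
    control_mean = mean(outcome_fn(s, False, rng) for s in control_group)
    return treated_mean - control_mean


def simulated_outcome(subject_id, treated, rng):
    """Hypothetical outcome: baseline 50 plus noise, plus 5 if treated."""
    return 50 + rng.gauss(0, 2) + (5 if treated else 0)


# Replication: the same design is repeated with different random seeds.
effects = [run_experiment(range(200), simulated_outcome, seed) for seed in range(5)]
for seed, effect in enumerate(effects):
    print(f"replication {seed}: observed effect = {effect:.2f}")
```

If the observed effect stays close to the same value across replications, the association between the manipulated factor and the observed outcome is unlikely to be a coincidence of one particular sample.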
4) Field Study Method
This method of research aims at discovering the relationships existing
among social institutions. A field study attempts to reveal attitudes, values,
perception and behaviour of individuals and groups under different situations.
There are two broad types of field study, viz. exploratory and hypothesis
testing. An exploratory study attempts to observe what exists, whereas
in hypothesis testing certain beliefs are examined for truth.
These may have any one of the following three purposes.
a) To discover significant variables in a field situation
b) To discover relationships among variables
c) To lay down a basis for more systematic and rigorous testing of the hypothesis.
Field study method, also known as survey method, is popular in social science. It
provides a scientific approach for obtaining reliable results, maintaining objectivity
to a large extent.
5) Case Study Method
In this method, the investigator makes intensive investigation of a social
unit, which may be a person, a family, a group, an institution, a community or even
an entire culture.
The investigator gathers pertinent information about present status, past
experiences and environmental forces which contribute to the individuality and
behaviour of the unit. The case study is an intensive, integrated and insightful
method of studying a social phenomenon. It can also be used to illustrate the theory
by providing an example. However, it has certain limitations such as,
• It is more expensive, being exploratory in nature.
• The generalizations based on a single case cannot be applied to the entire
population.
• There is strong possibility for subjectivity in results.
However, with quantitative data, an analytical framework and in-depth evaluation of the
situation, it may provide valuable insight into the subject of study.
RESEARCH PROCESS
In social research a deductive approach is commonly used.
In a research project there are various scientific activities in which a
researcher engages himself in order to produce knowledge. Although each research
project is unique in some ways, all projects involve, by and large, some common
activities. Each of these activities is interdependent. The research process is thus
the system of these interrelated activities. The various activities are conveniently
grouped as follows.
a) Identifying a problem
Identifying a problem is the most difficult step in the research process. The
researcher must discover and define not only a problem area but also a specific
problem within that area that he chooses to study, from the field of his interest. The
problem should be such that it can be clearly and precisely stated. The statement of
the problem must be complete.
b) Formulation of a Hypothesis
A hypothesis is a hunch, an assumption or an intelligent guess. Once a
problem has been identified, the researcher often employs the logical processes of
deduction and induction to formulate an expectation of the outcome of the study. In
other words, he conjectures or hypothesizes about the relationship between the
concepts identified in the problem. The hypothesis should be formulated
carefully.
c) Identifying and labeling variables
After formulating a hypothesis, the researcher must identify and label the
variables both in the hypothesis and elsewhere in the study. There may be different
types of variables such as dependent, independent, moderator, control and
intervening.
d) Constructing operational definitions
The abstract or conceptual forms of variables are to be converted into
operational forms. Here the variables are stated in observable and measurable
forms, so that they are made available for manipulation, control and examination.
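As an illustration of steps (c) and (d), the sketch below labels two variables and converts one abstract concept into a measurable form. It is a hypothetical Python example, not taken from the text: the concept 'job satisfaction', the 1 to 5 rating scale and all names (Variable, job_satisfaction_score) are assumptions made only for the illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Variable:
    """A labelled study variable with its operational definition."""
    name: str
    role: str  # "independent", "dependent", "moderator", "control" or "intervening"
    operational_definition: str


def job_satisfaction_score(responses):
    """Operational form of the abstract concept 'job satisfaction'.

    Assumption: the concept is measured as the mean of questionnaire
    items, each rated on a scale from 1 (low) to 5 (high).
    """
    if not responses or any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be a rating between 1 and 5")
    return mean(responses)


variables = [
    Variable("training hours", "independent",
             "hours of training attended in the last 12 months"),
    Variable("job satisfaction", "dependent",
             "mean of four questionnaire items rated 1 to 5"),
]

score = job_satisfaction_score([4, 5, 3, 4])
print(score)
```

Once a variable is stated this way, two researchers measuring the same respondent should arrive at the same value, which is what makes the variable available for control and examination.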
e) Manipulating and controlling variables
To study the relationship between variables, the researchers undertake both
manipulation and control. The concept of internal and external validity is basic to
this undertaking.
g) Identifying and constructing devices for observation and measurement
Once the researcher has operationally defined the variables in a study and
chosen the design, he must adopt or construct devices for measuring selected
variables.
h) The method to be used
The researcher has to select the research methods to be used. They are
generally classified into three categories: i) Historical, ii) Descriptive, iii)
Experimental. The method to be used is determined by the nature of the problem
and the type of data required for the purpose.
i) Data collection
This step is concerned with the procedures and techniques to be adopted for
data collection. It depends on the nature of the sample chosen for the study. It
refers to selection and development of data gathering devices such as tests,
questionnaires, rating scales, interviews, observation, checklists and the like. Since
many studies in education and in allied fields rely on questionnaires and interviews
as their main source of data, this step includes the techniques for constructing and using
these measurement devices.
j) Analysis and Interpretation of data
At this step the data is to be analysed and interpreted very carefully keeping
in mind the objectives of the study. For this purpose appropriate statistical and
other techniques are to be applied for the processing of the data.
Another characteristic feature of the research process is 'self-correction'. When
the results do not support, or only partially support, the hypothesis
and the researcher has sufficient reasons to believe that the hypothesis is adequate,
he may decide that the failure to confirm the hypothesis is due to an error in
selecting a sample design, in measurement of the key concepts or in analysis of
data. In these situations, the researcher may decide to repeat the study beginning
with the faulty stage after rectifying the faults. Finally, all the stages of the
research process make the study potentially replicable. The researcher designs his
study in such a way that either he or somebody else can replicate it.
The replication of study substantiates the fact further that the findings are not due
to mere coincidence.
PROBLEMS IN RESEARCH
Research requires following an unknown path. The entire process is full of
problems and challenges. Success cannot be taken for granted; a researcher
therefore has to pursue research inspired by higher objectives. Obviously, he faces a
number of problems. Some of these problems are briefly discussed here.
1) A balance between theory and application
Theory and application are two important aspects of social sciences research.
It is observed that while in some cases there is greater emphasis upon theory
without much regard to its utility, in some others there is greater emphasis upon
application without building the theory. In both cases, the quality of research
suffers. The research in social sciences should, therefore, make an optimum
combination of theory and application. As a practical course if the research starts
with theory, it should end with application and if it starts with application, it should
conclude with the formulation of theory.
2) Overemphasis upon quantitative and statistical analysis
A cursory glance at research papers produced by scholars in social sciences
shows the tendency to fill pages with statistical tables and analysis of data.
Quantitative analysis of data is undoubtedly useful for reliable and valid results.
However, in the process of generalization based on quantitative analysis,
certain glaring differences and contradictions are sometimes ignored or are attributed
to sampling variations. The researchers should account for and explain as many
variations as possible. Non-quantifiable variables should also be given weightage
while drawing conclusions.
3) Methodology of Research
Quantification, statistical analysis and use of computers cannot be a
substitute for a strong methodological base. Conceptualization of the problem,
formulation of research objectives, methodology and research design should not be
sacrificed. All these aspects should be given due weightage as per the
requirements of the situation.
4) Interdisciplinary approach
Social phenomena, particularly those related to human behaviour can be
explained better on a broader theoretical basis evolved by integrating the
approaches of different disciplines. In social sciences research, therefore, the
researcher should develop not only a sound understanding of the subject of inquiry
but also of the methods, techniques and approaches in other disciplines and decide
his research approach considering all these aspects.
5) Scientific objectivity
Objectivity is the willingness and ability to examine the evidence
dispassionately. Unlike in the natural sciences, in social research perception, experience,
socio-economic roots, approach to life etc. tend to influence the research pursuit. However, through
scientific objectivity, the influence of these external factors can be minimized.
SIGNIFICANCE OF RESEARCH IN SOCIAL SCIENCE
Social research attempts to discover facts concealed in a social phenomenon
or the law governing it. Social research is a tool used by social scientists to
understand the social processes, social values, ideas, reality, their interrelationship
and linkage with natural laws.
Social research focuses attention on social problems and social events. It
reveals the mysteries of social life. By discarding superstitions, orthodox beliefs and
ignorance, it brings people face to face with reality. This helps in the scientific
judgment of social phenomena. Social research thus has great relevance and
practical utility and contributes towards social welfare and development.
The significance of social research may be judged in terms of the following:
1) CONTROL OVER SOCIAL PHENOMENA
Knowledge is power. Social research generates knowledge about society and
its institutions. This can be used for exercising control over social phenomena.
Social research leads to growth of social systems on sound lines.
2) HELPS IN SOCIAL PLANNING
Planning is based on knowledge of the problem, the objectives and the
resources available. Social research, by generating essential information, provides
a basis for formulation of radical social plans.
3) USEFUL IN SOCIAL PREDICTION
Social research affords a sound basis for predictions in a large number of
situations. Although social predictions cannot be perfect in all cases due to a
variety of constraints and assumptions, they prove useful in social planning
and control over social institutions.
4) CREATES SOCIAL UNDERSTANDING
Social research by revealing the truth, dispelling the superstitions and
ignorance, projects interdependence among different social groups. By bringing
about unity in diversity, giving weightage to independent and reasoned opinion, it
promotes good will and understanding and strengthens social cohesion.
5) LEADS TO SOCIAL GROWTH
Social institutions determine the direction of social change. Social research,
by promoting better understanding about these institutions, better social planning
and control leads to wholesome social growth.
6) CONTRIBUTES TO HUMAN WELFARE
The ultimate objective of all social activities is human welfare. Social
research, by identifying social evils, their magnitude, causes and consequences,
provides a basis for appropriate measures and reforms leading to social welfare.
7) REFINEMENT IN SOCIAL RESEARCH METHODOLOGY
Social research helps in the refinement of research methodology, including
the tools and techniques of research. This leads to improvement in the quality of
research.
8) SATISFACTION OF INTELLECTUAL CURIOSITY
A strong desire to seek the truth is always at the root of research activity.
Social research, through better understanding of the environment, satisfies the urge
to know and understand human beings and the society.
CHAPTER – II
RESEARCH PROBLEM AND HYPOTHESIS
IDENTIFICATION, SELECTION AND FORMULATION OF A PROBLEM
Some unresolved problem can be a topic of research. A researcher is
expected to identify a problem suitable for research, define it, submit a plan of
research to the sponsoring agency/higher authority and obtain its approval before
proceeding further in this direction.
A research problem in general refers to some theoretical or practical
situation requiring solution. However, every situation may not be suitable for
research. A research problem should offer one or more possible courses of action
and possibly one or more outcomes, which may be pursued to achieve the desired
objective. The researcher is expected to evaluate the results of different courses of
action and identify the most suitable one, which may ensure optimum use of
resources in the given environment and achieve the desired objective.
A RESEARCH PROBLEM: MEANING
A research problem should be selected very carefully. The subject matter of
research should be of interest to the researcher, his educational background and
work experience being the main considerations. In the search for a suitable topic,
the literature available on the subject may be surveyed; suggestions and guidance
may be sought from scholars, scientists, co-workers and research supervisor.
However, ultimately the researcher is expected to make a final decision in this
matter.
The topic selected should be neither too narrow nor too broad. It should be
proper, considering the utility as well as the resources. Controversial and irrelevant
topics should be avoided. Personal considerations like the researcher's capability and
resources required to complete the project and social considerations like its
contribution to wealth and welfare of the society should also be examined. In brief,
the topic selected should be useful and feasible. In case any difficulty is visualized
a preliminary feasibility study may be undertaken to understand the nature and
dimensions of the challenge involved.
DEFINING THE ‘RESEARCH PROBLEM’
A research problem should be defined in clear and unambiguous terms. This
is essential to delimit the scope of the research and also to discriminate between the
relevant and irrelevant. Proper care for details is required at this stage, so that a
suitable research design may be prepared and the work may be carried out without
much difficulty.
Defining a research problem involves laying down the limits within which
the research work will be carried out keeping in view the determined objectives.
Defining the research problem requires knowledge of the field of
investigation and experience. A young researcher may find some difficulty in
defining the research problem. He should seek the guidance of seniors and the
research supervisor in this respect. It should be done in an unhurried and
systematic manner: first stating the problem in a general way, surveying the
relevant literature, developing proper understanding through discussions and
consultations and finally defining the problem in an appropriate manner.
The following points are to be kept in mind while defining a research problem:
1. The right question must be addressed if research is to aid decision makers. A
correct answer to the wrong question leads either to poor advice or to no advice.
2. Very often in defining research problems we have a tendency to rationalize and
defend our actions once we have embarked upon a particular research plan. The
best time to review and consider alternative approaches is in the planning stage. If
this is done, the needless cost of making a false start and redoing work can be
avoided.
3. A good starting point in problem definition is to ask what the decision maker
would like to know if the requested information could be obtained without error
and without cost.
4. Another good rule to follow is "never settle on a particular approach without
developing and considering at least one alternative."
5. The problem definition step of research is the determination and structuring of
the decision maker's question. It must be the decision maker's question and the
researcher's question.
6. What decision do you face? If you do not have a decision to make, there is no
research problem.
7. What are your alternatives? If there are no alternatives to choose, again there is
no research problem.
8. What are your criteria to choose the best alternative? If you do not have criteria
for evaluation, again there is no research problem.
9. The researcher must avoid the acceptance of the superficial and the obvious.
MODE OF SELECTION OF A PROBLEM
The selection of a problem is the first step in research. The nature of the
problem to be selected depends upon the level at which the research is done. A
problem appropriate for the students of graduation level will be a modest one. Here
the emphasis is on the learning process. A problem to be selected at M.Phil / PhD
level is a major problem requiring comprehensive treatment. In this case, the
emphasis is upon both skill development and contribution to knowledge. On the
other hand, a problem to be selected by an experienced researcher must be a
complex problem meant for making a significant contribution to, or refinement of,
theory or to policy making.
What should be the mode of selection in the case of academic research?
Should a problem be suggested by the guide or be selected by the researcher
himself?
A beginner in research may, of course, prefer the problem to be suggested by
his guide.
However, such suggestion by the guide means an imposition. It destroys
spontaneity and the personal interest of the researcher. Therefore, it is better to
choose the problem oneself. Of course, the guide can help the candidate to help
himself.
A researcher with a critical, curious and imaginative mind and who is
sensitive to practical problems can easily identify problems for study.
SOURCES OF ‘PROBLEMS’
The sources from which one may be able to identify research problems or
develop problem awareness are
1) Reading of study books, articles, research reports etc., and academic
experiences, viz. lectures, seminars, discussions, experience
sharing etc.
2) Day to day life experiences of the researcher.
3) Exposure to the field through field visits, internship training, extension work etc.
4) Consultation with experts, researchers, administrators, business executives etc.
5) Research in one area may suggest problems for further research.
6) As the reflective mind is a spring of knowledge, intuition of the researcher may
give new ideas for research.
CRITERIA OF SELECTION
The selection of one appropriate researchable problem out of the identified
problems requires evaluation of those alternatives against certain criteria. These
may be grouped into a) internal criteria or factors, b) external criteria or factors.
Internal criteria consist of the researcher's interest and competence, and resources,
i.e. finance and time, etc.
External criteria include researchability of the problem, its importance and
urgency, novelty of the problem, feasibility, facilities, its social relevance and
usefulness and research personnel.
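One simple way to evaluate candidate problems against these criteria is a weighted rating sheet. The Python sketch below is only an illustration under stated assumptions, not a method prescribed by the text: the 1 to 5 rating scale, the equal default weights, the function name score_problem, the two candidate topics and their ratings are all hypothetical, and only a subset of the criteria listed above is used.

```python
# Criteria taken from the two groups described above (a subset, for brevity).
INTERNAL_CRITERIA = ("interest", "competence", "resources")
EXTERNAL_CRITERIA = ("researchability", "importance", "novelty",
                     "feasibility", "social_relevance")


def score_problem(ratings, weights=None):
    """Return the weighted mean rating of one candidate problem.

    ratings: dict mapping each criterion to a rating from 1 (poor) to 5 (good).
    weights: optional dict of criterion weights; equal weights are assumed
             when none are given.
    """
    criteria = INTERNAL_CRITERIA + EXTERNAL_CRITERIA
    missing = [c for c in criteria if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    if weights is None:
        weights = {c: 1.0 for c in criteria}
    total_weight = sum(weights[c] for c in criteria)
    return sum(ratings[c] * weights[c] for c in criteria) / total_weight


# Hypothetical candidate problems with illustrative ratings.
candidates = {
    "Topic A": {"interest": 5, "competence": 4, "resources": 3,
                "researchability": 4, "importance": 4, "novelty": 3,
                "feasibility": 4, "social_relevance": 5},
    "Topic B": {"interest": 3, "competence": 3, "resources": 4,
                "researchability": 5, "importance": 3, "novelty": 4,
                "feasibility": 3, "social_relevance": 3},
}

ranked = sorted(candidates, key=lambda name: score_problem(candidates[name]),
                reverse=True)
for name in ranked:
    print(name, score_problem(candidates[name]))
```

The ratings themselves remain a matter of the researcher's judgment; the sheet only makes the comparison between alternatives explicit.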
FORMULATION OF A SELECTED PROBLEM
Formulation means translating and transforming the selected research
problem into a scientifically researchable question. It is concerned with specifying
exactly what the research problem is and why it is studied. The formulation should
include both the 'what' and the 'why' aspects. Following are the three principal
components in the progressive formulation of a problem for research.
1. the originating question (what does one want to know?)
2. the rationale (why aspect)
3. the specifying question (possible answer to the originating question)
IMPORTANCE OF FORMULATION
'A problem well put is half-solved.' This saying highlights the importance of
proper formulation of the selected problem. The primary task of the research is
collection of relevant data and the analysis of data for finding answers to the
research questions. The proper performance of this task depends upon the
identification of exact data and information required for the study. The formulation
serves this purpose. Once the exact data requirement is known, the researcher can
plan and execute the other steps without any waste of time and energy. Thus
formulation gives a direction and a specific focus to the research effort. It helps to
delimit the field of enquiry by singling out the pertinent facts from a vast ocean of
facts. The determination of exact information needs through the formulation
process prevents a blind search and indiscriminate gathering of data. The data
(information) needs, and their sources determine the methods to be adopted for
sampling and collection of data. The hypotheses to be tested determine the
appropriate statistical techniques to be adopted for analysis. Thus all the major
tasks (sampling, choosing an appropriate method of data collection, constructing tools
for data collection, designing the plan of analysis) can be planned exactly without any
waste of effort. Hence the saying, 'a problem well put is half-solved', is true.
REVIEW OF LITERATURE
A researcher who is not truly conversant with what has gone on before has
little chance of making a worthwhile contribution. Therefore a researcher has to
'survey' or 'review' the available literature relating to his field of study. He must
keep himself 'updated' in his field and related areas.
LITERATURE IN RESEARCH CONTEXT CONSISTS OF
A) Books
1. Encyclopedias
- General, e.g.: Encyclopedia Britannica
- Specific, e.g.: Encyclopedia on social sciences
2. Year Books, e.g.: those published as supplements to Encyclopedias
3. Text books
4. Reference Books
B) Journals: published monthly, quarterly, half-yearly or annually.
C) Reports: 1) Reports of committees/commissions appointed by Government and
public institutions.
2) Seminar reports and conference proceedings.
D) Research Dissertation and theses.
E) Newspapers and Magazines
F) Micro Forms: Audio and video tapes; Microfilm and Micro cards.
WHAT TO REVIEW AND FOR WHAT PURPOSE?
The review of literature is not mere reading for reading sake. It is also not a
casual reading like reading of a story or novel. It 'focuses' and is 'directed' towards
a specific purpose. It is also 'selective'. A researcher has to select the kinds of
literature to be reviewed and 'determine' the 'purpose' for which he has to study
them. The literature review starts with the selection of a problem for research,
continues through the various stages of the research process and ends with report
writing. Following are the purposes of the review of related literature.
1) To gain a background knowledge of the research topic.
2) To identify the concepts relating to it, potential relationships between
them and to formulate researchable hypotheses.
3) To identify appropriate methodology, research design, methods of
measuring concepts and techniques of analysis.
4) To identify data sources used by other researchers, and
5) To learn how others structured their reports.
LITERATURE SEARCH PROCEDURE
How can the researcher 'identify' the 'related' materials? This search
procedure involves a series of steps. The exact sequence will vary depending upon
the subject and the knowledge of the researchers. A general approach is suggested
below.
1) Request learned professors, librarians or others familiar with the field to
suggest relevant references.
2) Find out whether any bibliography already prepared on the subject is
available in libraries. See the Bibliographic index, if any, maintained in the library.
3) Consult bibliographies in the theses on the topic and related topics.
4) See the card catalogues, both general based and subject based.
5) Examine periodicals, monographs, reports and conference proceedings and
other materials including microfilms available in the library.
6) Consult references cited in the books and articles already located. Each book
or article will be a means to locate additional references.
7) Consult the abstract journal on the subject or 'Abstracts' section in the
journals relating to the subject.
8) See the 'Book Review' pages in the leading daily newspapers and in the
journals.
The important point to be noted here is to prepare a source or bibliography
card for each item of reference located. This helps to incorporate them in the
appropriate part of the report in the future.
Thus one can identify many references, relating to the selected topic.
RESEARCH HYPOTHESES
The formulation of hypotheses or propositions as to the possible answers to
the research questions is an important step in the process of research. Keen
observation, creative thinking, hunch, wit, imagination, vision, insight and sound
judgment are of great importance in setting up reasonable hypotheses. A thorough
knowledge about the phenomenon and related fields is of great value in the
process. The formulation of hypotheses plays an important part in the growth of
knowledge in every science.
MEANING OF HYPOTHESIS
Hypothesis is a tentative proposition formulated for empirical testing. It is a
declarative statement combining concepts. It is a tentative answer to a research
question. It is tentative, because its veracity can be evaluated only after it has been
tested empirically.
Lundberg defines hypothesis as, "a tentative generalization, the validity of
which remains to be tested". Goode and Hatt define it as "a proposition which can
be put to a test to determine its validity".
Hypothesis-necessary or not
Is the formulation of useful hypotheses always necessary and possible? It is
true that hypotheses are useful and they guide the research process in the proper
direction. But can hypotheses be set up in all cases? In mere fact-finding
investigations, no problems may be raised and the need for formulating hypotheses
may not arise. Similarly, in exploratory studies, initially it may not be possible to
set up any worthwhile hypotheses. In fact, the very purpose of such exploratory
studies may be to formulate meaningful hypotheses for further formal studies. But
strictly speaking, the mere fact-finding and the exploratory studies cannot be
considered to be typical research studies. In all analytical and experimental studies,
hypotheses should be set up in order to give a proper direction to them.
SOURCES OF HYPOTHESES
Hypotheses can be derived from various sources
1. Theory: this is one of the main sources of hypotheses. It gives direction to
research by stating what is known. Logical deduction from theory leads to new
hypotheses. For example, profit/wealth maximization is considered as the goal of
private enterprises. From this assumption, various hypotheses are derived: "the
rate of return on capital employed is an index of business success"; "the optimum
capital structure is that combination of debt and equity which leads to the maximum
value of the firm"; "the higher the earnings per share, the more favorable is the financial
leverage".
2. Observation: Hypotheses can be derived from observation. From the
observation of price behaviour in a market, e.g., the relationship between the
price and demand for an article, a hypothesis can be framed.
3. Analogies: These are another source of useful hypotheses. Julian Huxley has
pointed out that casual observations in nature or in the framework of another
science may be a fertile source of hypotheses. For example, the hypothesis that
"similar human types or activities may be found in similar geophysical regions"
came from plant ecology.
4. Intuition and personal experiences: They may also contribute to the
formulation of a hypothesis. Personal life and experiences of persons determine
their perceptions and conceptions. These may, in turn, direct a person to a certain
hypothesis more quickly. The story of Newton and the falling apple and the flash of
wisdom to Buddha under the Bodhi tree illustrate this individual experiential
process.
5. Findings of studies: Hypotheses may be developed out of the findings of
other studies in order to replicate and test.
6. State of knowledge: An important source of hypotheses is the state of
knowledge in any particular science. Where formal theories exist, hypotheses can
be deduced. If the hypothesis is rejected, theories would be modified. Where
theories are scarce, hypotheses are generated from conceptual frameworks. In
either case, the hypotheses are related to the conceptual-theoretical level.
7. Culture: Another source of hypotheses is the culture in which the researcher
was nurtured. Western culture has induced the emergence of sociology as an
academic discipline. Over the past decade, a large part of the hypotheses on
American society examined by researchers were connected with violence. This
interest is related to the considerable increase in the level of violence in America.
In Indian socio-economic and leadership studies, hypotheses based on caste and
economic status are common, because Indian society is caste-ridden, hierarchical
and segmental, and the Indian economic system is riddled with inequalities and
undue privileges.
8. Continuity of research: The continuity of research in a field itself constitutes
an important source of hypotheses. The rejection of some hypotheses leads to the
formulation of new ones capable of explaining the phenomena in subsequent
researches on the same subject.
TYPES OF HYPOTHESES
Hypotheses are classified in several ways. With reference to their function,
hypotheses are of two types: (a) descriptive hypotheses and (b) relational
hypotheses. Another approach is to classify them into: (c) working hypotheses, (d)
null hypotheses, (e) statistical hypotheses, (f) common-sense hypotheses, (g)
complex hypotheses and (h) analytical hypotheses.
a) Descriptive hypotheses: these are propositions that describe the characteristics
(such as size, form or distribution) of a variable. The variable may be an object,
person, organization, situation or event. Some examples are:
"The rate of unemployment among arts graduates is higher than that of commerce
graduates".
"Public enterprises are more amenable to centralized planning".
"The educational system is not oriented to the human resource needs of a country".
b) Relational hypotheses: these are propositions which describe the relationship
between two variables. The relationship suggested may be positive or negative
correlation or causal relationship. Some examples are
“Families with higher incomes spend more for recreation”.
“Participative management promotes motivation among executives”.
“The lower the rate of job turnover in a work group, the higher the work
productivity”.
“Upper-class people have fewer children than lower-class people”.
“Labour productivity decreases as working duration increases”.
Causal hypotheses state that the existence of, or a change in, one variable causes or
leads to an effect on another variable. The first variable is called the independent
variable, and the latter the dependent variable. When dealing with causal
relationships between variables the researcher must consider the direction in which
such relationships flow, i.e., which is the cause and which is the effect.
c) Working hypotheses: while planning the study of a problem, hypotheses are
formed. Initially they may not be very specific. In such cases, they are referred to
as 'working hypotheses' which are subject to modification as the investigation
proceeds.
d) Null hypotheses: these are hypothetical statements denying what is explicitly
indicated in working hypotheses. They do not exist in reality, nor were they ever
intended to. They state that no difference exists between the parameter and the statistic
being compared to it. For example, even though there is a relationship between a
family's income and expenditure on recreation, a null hypothesis may state: "there
is no relationship between families' income level and expenditure on recreation".
Null hypotheses are formulated for testing statistical significance, since this form is
a convenient approach to statistical analysis. As the test would nullify the null
hypotheses, they are so called.
There is some justification for using null hypotheses. They conform to the qualities
of detachment and objectivity to be possessed by a researcher. If he attempts to test
a hypothesis which he assumes to be true, it would appear as if he is not behaving
objectively. This problem does not arise when he uses null hypotheses.
Moreover, null hypotheses are more exact. It is easier to reject the contrary of a
hypothesis than to confirm it with complete certainty. Hence the concept of null
hypotheses is found to be very useful.
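As an illustration of how a null hypothesis is put to a test, the following minimal sketch applies a permutation test to the null hypothesis quoted above, "there is no relationship between families' income level and expenditure on recreation". The income and expenditure figures are invented for illustration, the function names are hypothetical, and the permutation test is only one of several possible procedures; it is not prescribed by the text.

```python
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Share of random re-pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        # Under the null hypothesis any pairing of x with y is equally likely.
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical data: family income and recreation expenditure (same units).
income = [20, 25, 30, 35, 40, 50, 60, 80]
recreation = [1.0, 1.4, 1.5, 2.1, 2.4, 3.2, 3.5, 5.0]

r = pearson_r(income, recreation)
p = permutation_p_value(income, recreation)
reject_null = p < 0.05  # 0.05 is a conventional, assumed significance level
print(f"r = {r:.3f}, p = {p:.4f}, reject null: {reject_null}")
```

A small p-value means that a relationship as strong as the observed one would rarely arise if the null hypothesis were true, so the null hypothesis is rejected rather than the working hypothesis being "proved".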
e) Statistical hypotheses: these are statements about a statistical population. They
specify in a better way the relations between variables. This association is
measured by the co-efficient of correlation, e.g., if the co-efficient of correlation
between bonus and productivity is +1.0, then there is a perfect positive correlation
between the bonus and productivity.
As Abraham Kaplan has pointed out, "all inductive inference is based on
samples... all hypotheses might be said to be statistical hypotheses in a broad
sense; statistics has the task of assessing the weight of evidence for a particular
hypothesis contained in a given set of data".
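The coefficient of correlation mentioned above can be written out explicitly. The sketch below uses the usual Pearson formula; the symbols x (bonus) and y (productivity) and the constants a and b are assumptions introduced here for illustration. It shows why an exactly linear, increasing relationship gives a coefficient of +1.0.

```latex
% Pearson coefficient of correlation for n paired observations (x_i, y_i)
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
          \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}

% If productivity is an exact linear function of bonus, y_i = a + b x_i with b > 0,
% then y_i - \bar{y} = b (x_i - \bar{x}), and therefore
r = \frac{b \sum_{i=1}^{n} (x_i - \bar{x})^2}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;
          b \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = +1
```

With b < 0 the same steps give r = -1, a perfect negative correlation.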
f) Common sense hypotheses: these represent the common sense ideas. They state
the existence of empirical uniformities perceived through day to day observations.
Much empirical uniformity may be observed in business establishments, the social
background of workers, and the behaviour patterns of specific groups like students,
e.g., "shop-assistants in small shops lack motivation";
"soldiers from the upper class are less adjusted in the army than lower-class men";
"fresh students conform to the conventions set by seniors".
Common sense statements are often a confused mixture of cliches and moral
judgments. Scientists have a large-scale job in transforming and testing them. This
requires three tasks: "first, the removal of value judgments; second, the
clarification of terms; and third, the application of validity tests". 'What everybody
knows' is not known until it has been tested. The common sense hypotheses that
seek empirical generalization play an important role in the growth of a science.
g) Complex hypotheses: these aim at testing the existence of logically derived
relationships between empirical uniformities. For example, in the early stages
human ecology described empirical uniformities in the distribution of land values,
industrial concentration, types of business and other phenomena. Further study and
logical analysis of these and other related findings led to the formulation of
complex hypotheses such as "the concentric growth circles characterize a city",
"members of minority groups suffer from oppression psychosis", etc. Such
hypotheses are purposeful distortions of empirical exactness. Because of their
removal from empirical reality, these constructs are termed 'ideal types'. The
function of such hypotheses is "to create tools and problems for further research in
otherwise very complex areas of investigation".
h) Analytical hypotheses: these are concerned with the relationship of analytic
variables. These occur at the highest level of abstraction. These specify the
relationship between changes in one property and changes in another. For example,
fertility may show empirical regularities by wealth, education, region and religion.
If these were raised to the level of ideal type formulation, one result might be the
hypothesis: "there are two high-fertility population segments in India, viz.,
low-income urban Muslims and low-income rural low caste Hindus".
At a still higher level of abstraction, the effects of region, education and religion on
fertility might be held constant. This would allow a better measurement of the
relation between the variables of wealth and fertility.
This level of hypothesizing is the most sophisticated mode of formulation and
contributes to the development of 'brilliant' abstract theories.
CHARACTERISTICS OF A GOOD HYPOTHESIS
What is a good hypothesis? What are the criteria for judging? An acceptable
hypothesis should fulfill certain conditions.
1. Conceptual clarity: a hypothesis should be conceptually clear. It should consist
of clearly defined and understandable concepts. Clarity is obtained by means of
defining operationally the concepts in the hypothesis.
2. Specificity: A hypothesis should be specific and explain the expected relations
between the variables and the conditions under which these relations will hold, e.g.
“when there is dissatisfaction and no care is taken, deprivation will engender
violence”.
3. Testability: A hypothesis should be testable and should not be a moral
judgement. It should be possible to collect empirical evidence to test the
hypothesis. Statements like "capitalists exploit their workers" and "bad parents
produce bad children" are commonplace generalizations and cannot be tested, as
they merely express sentiments and their concepts are vague.
4. Availability of techniques: A hypothesis should be related to available techniques.
Otherwise it will not be researchable. The researcher must therefore make sure
that methods are available for testing the proposed hypothesis.
5. Theoretical relevance: A hypothesis should be related to a body of theory. A
science can be cumulative only by building on an existing body of facts and
theories. It cannot develop if each study is an isolated investigation. When research
is systematically based upon a body of existing theory, a genuine contribution to
knowledge is more likely to result. Therefore a hypothesis should possess a
theoretical relevance.
6. Consistency: A hypothesis should be logically consistent. Two or more
propositions logically derived from the same theory must not be mutually
contradictory.
7. Objectivity: A scientific hypothesis should be free from value judgement. In principle, the
researcher's system of values has no place in scientific method. However, a social
phenomenon is affected by the milieu in which it takes place. Hence the researcher
must be aware of these values and state them explicitly.
8. Simplicity: A hypothesis should be a simple one requiring fewer conditions or
assumptions. But 'simple' doesn't mean obvious. Simplicity demands insight. The
more insight the researcher has into a problem, the simpler will be the hypothesis.
CHAPTER – III
RESEARCH DESIGN
RESEARCH DESIGN: MEANING
A research design is a logical and systematic plan prepared for directing a
research study. It specifies the objectives of the study, the methodology and
techniques to be adopted for achieving the objectives. It "constitutes the blueprint
for the collection, measurement and analysis of data". "It is the plan, structure and
strategy of investigation conceived to obtain answers to research questions. The
plan is the overall scheme or program of research". "A research design is the
program that guides the investigator in the process of collecting, analyzing and
interpreting observations". It provides a systematic plan of procedure for the
researcher to follow.
NATURE OF RESEARCH DESIGN
A Research design is indispensable for a research project. However it is not
a precise and specific plan like a building plan to be followed without deviations
but like a series of guide posts to keep one going in the right direction. It is a
tentative plan which undergoes modifications as circumstances demand, when the
study progresses, and when new aspects, new conditions and new relationships
come to light and insight into the study deepens.
Besides, a research study cannot be as extensive and intensive as the
researchers may like it to be. It has to be geared to the availability of data and the
co-operation of the informants. It has also to be kept within the manageable limits
i.e. within the researcher's mental ability to grasp the implications, his competence
and the amount of time and other resources available for the purpose. Thus a
research design "represents a compromise dictated by many practical
considerations".
CLASSIFICATIONS OF RESEARCH DESIGN
There are a number of crucial research choices. Various writers have advanced
different classifications, some of which are:
1. Exploratory, descriptive and causal designs (Selltiz, Jahoda, Deutsch and
Cook)
2. Experimental, historical and inferential designs (American Marketing
Association)
3. Historical method, case and clinical studies (Good and Scates)
4. Sample survey, field studies, experiments in field settings and laboratory
experiments (Festinger and Katz)
5. Exploratory, descriptive and experimental studies (Boyd and Westfall)
6. Exploratory, descriptive and causal (Green and Tull)
7. Experimental and quasi-experimental designs (Nachmias and Nachmias)
8. True experimental, quasi-experimental and non-experimental (Smith)
9. Experimental, pre-experimental, quasi-experimental designs and survey
research (Kidder and Judd)
FORMULATION OF THE RESEARCH DESIGN
Planning involves deciding things in advance. Accordingly, the preparation of
a research plan involves a careful consideration of the following questions and
making appropriate decisions on them.
1. What is the study about?
2. Why is the study made?
3. What is its scope?
4. What are the objectives of the study?
5. What are the propositions to be tested?
6. What are the major concepts to be defined operationally?
7. What criteria or measurements are to be used?
8. When or in what place will the study be conducted?
9. What is the reference period of the study?
10. What is the typology of the design?
11. What kinds of data are needed?
12. What are the sources of data?
13. What is the universe from which the sample has to be drawn?
14. What is the sample size?
15. What sampling technique can be used?
16. What methods are to be used for collecting data?
17. What tools are to be used for collecting data?
18. How is the data to be processed?
19. What techniques of analysis are to be adopted?
20. What is the significance of the study?
21. For what target audience are the findings meant?
22. What is the type of knowledge to be updated?
23. What is the time limit within which the whole work should be completed?
24. What is the cost involved?
These questions should be considered with reference to the researcher's
interest, competence, time and other resources, and the requirements of the
sponsoring agency, if any. Thus the considerations which enter into making
decisions regarding what, where, when, how much, by what means constitute a
plan of study or a study design.
FEATURES OF A GOOD RESEARCH DESIGN
1. It is a plan that specifies the objectives of the study and the hypotheses to be
tested.
2. It is an outline that specifies the sources and types of information relevant to the
research questions.
3. It is a blueprint specifying the methods to be adopted for gathering and
analyzing the data.
4. It is a scheme defining the domain in general and whether the obtained
information can be generalized to a larger population.
FACTORS AFFECTING RESEARCH DESIGN
A research design is commonly affected by the following factors.
1. The degree of formulation of the problem.
2. The topical scope (breadth and depth) of the study.
3. The research environment
4. The time dimension
5. The mode of data collection
6. The manipulation of variables under study
7. The nature of relationship among variables.
After making decisions on the above questions, a formal research plan is
drafted, incorporating those decisions. The format may vary depending on the
purpose for which the study is undertaken.
EVALUATION OF RESEARCH DESIGN
A research design needs great care. A researcher should decide what
research design would be more appropriate for the study. There can be several
research designs, each suitable for a different research study. It may be difficult to
state what constitutes an appropriate research design; however, the following
aspects may be taken into consideration at the time of deciding about the research
design.
1. Minimum of error: Errors in the design of experimentation, as well as personal bias,
should be minimized to maximize the accuracy and reliability of results.
2. Yield Maximum Information: The design must be capable of yielding
maximum information at minimum cost, effort and time.
3. Flexibility: The research design should be flexible to permit consideration of
a large number of different aspects or phenomena etc.
Much depends upon the nature of the problem to be studied and the object of
inquiry. A design suitable for one research study may or may not be useful for
another. While deciding the research design, therefore, the nature of universe,
objectives of the study, sampling frame, desired standard of accuracy, etc. should
be considered. Further, while considering alternative research designs both the
strengths and limitations should be examined. Comments may be invited from
experts and a design appropriate in the given condition should be selected.
CHAPTER – IV
SAMPLING DESIGN
INTRODUCTION
Empirical field studies require collection of first-hand information or data
pertaining to the units of study from the field. The units of study may include
geographical areas like districts, taluks, cities or villages which are covered by the
study or institutions or households about which information is required or persons
from whom information is available. The aggregate of all the units pertaining to a
study is called the “population” or the “universe”. Population is the target group to
be studied. It is the total collection of elements about which we wish to make
inferences. A member of the population is "an element". It is the subject on which
measurement is taken. It is the unit of study. A part of the population is known as a
sample. The process of drawing a sample from a larger population is called
sampling. The list of sampling units from which sample is taken is called the
sampling frame e.g., a map, a telephone directory, a list of industrial undertakings,
a list of car licensees etc.
CENSUS VS SAMPLING METHOD
The process of designing a field study among other things involves a
decision to use sampling or not. The researcher must decide whether he should
cover all the units or a sample of units. When all the units are studied such a
complete coverage is called a census survey. When only a sample of the universe
is studied the study is called sample survey.
In making this decision of census or sampling the following factors are considered.
1. The size of the population
If the population to be studied is relatively small, say 50 institutions or 200
employees or 150 households, the investigator may decide to study the entire
population. The task is easily manageable and sampling may not be required.
But if the population to be studied is quite large, sampling is warranted. However,
size is a relative matter. Whether a population is large or small depends upon the nature of
the study, the purpose for which it is undertaken, and the time and other resources
available for it.
2. Amount of fund and budget for the study
The decision regarding census or sampling depends upon the budget of the study.
Sampling is opted when the amount of money budgeted is smaller than the
anticipated cost of census survey.
3. Facilities
The extent of facilities available - staff, access to computer and accessibility to
population elements - is another factor to be considered in deciding to sample or
not. When the availability of these facilities is extensive, census survey may be
manageable. Otherwise sampling is preferable.
4. Time
The time limit within which the study should be completed is another important
factor to be considered in deciding the question of census or sample survey. This in
fact, is a primary reason for using sampling by academic and other researchers.
PRINCIPLES OF SAMPLING
Sampling is based on two premises. They are,
1. There is so much similarity among the elements in a population that a few of
these elements will adequately represent the characteristics of the total population.
2. While the sample value of some sample units may be more than the
population value, the sample value of other sample units may be less than the
population value. When the sample is drawn properly, these differences tend to
counteract each other. As a result, a sample value is generally close to the
population value.
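The second premise can be checked with a small simulation. The sketch below is illustrative only: the population (the numbers 1 to 1000), the sample size of 50, the 500 repetitions and the seed are all assumptions chosen for the example. It draws repeated random samples and counts how many sample means fall above and below the population mean.

```python
import random
import statistics

rng = random.Random(1)                 # fixed seed so the run is repeatable
population = list(range(1, 1001))      # hypothetical population values
population_mean = statistics.mean(population)

# Draw 500 random samples of 50 units each and record each sample mean.
sample_means = [statistics.mean(rng.sample(population, 50)) for _ in range(500)]

# Some sample means overshoot the population mean and others undershoot it.
above = sum(1 for m in sample_means if m > population_mean)
below = sum(1 for m in sample_means if m < population_mean)

# Averaged over many samples, the overshoots and undershoots largely cancel.
average_of_sample_means = statistics.mean(sample_means)
print(population_mean, above, below, round(average_of_sample_means, 1))
```

Individual samples miss the population mean in both directions, but the average of the sample means lies close to it, which is the counteracting effect described in the second premise.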
SAMPLING TECHNIQUES OR METHODS
Sampling techniques or methods may be classified into two generic types.
a) Probability or random sampling and
b) Non-probability or Non-random sampling
a) Probability sampling is of the following types
a) Simple random sampling
b) Stratified random sampling
c) Systematic random sampling
d) Cluster sampling
e) Area sampling
f) Multi-stage and sub- sampling
g) Random sampling with probability proportional to size,
h) Double sampling and multiphase sampling
i) Replicated or interpenetrating sampling.
b) Non-probability sampling methods are classified as
a) Convenience or accidental sampling
b) Purposive sampling
c) Quota sampling
d) Snow-ball sampling
PROBABILITY SAMPLING METHODS
1. SIMPLE RANDOM SAMPLING
This sampling gives each element an equal and independent chance of being
selected. Equal chance means equal probability of being selected, e.g., in a
population of 300 each element theoretically has a 1/300 chance of being selected.
Equal probability selection method is described as EPSEM sampling. An
independent chance means that the draw of one element will not affect the chances
of other elements being selected.
Where some elements are purposely excluded from the sample the resulting
sample is not a random one. Hence all elements should be included in the
sampling frame to draw a random sample.
Procedure
The procedure of drawing a simple random sample consists of
(i) Enumeration of all elements in the population.
(ii) Preparation of a list of all elements, giving them numbers in serial order like
1, 2, 3... and so on.
(iii) Drawing sample numbers by using (a) lottery method (b) a table of random
numbers or (c) a computer.
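The three steps above can be sketched with a computer doing the drawing, i.e., option (c). This is a minimal illustration, not a prescribed implementation: the function name, the household labels and the sample size of 30 are hypothetical, and the draw is made without replacement.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement."""
    # Steps (i) and (ii): enumerate the elements and number them serially from 1.
    numbered = {number: element for number, element in enumerate(population, start=1)}
    # Step (iii): let the computer draw distinct serial numbers at random.
    rng = random.Random(seed)
    drawn_numbers = rng.sample(sorted(numbered), sample_size)
    return [numbered[number] for number in drawn_numbers]

# Hypothetical population of 300 households, as in the 1/300 example.
households = [f"household-{i:03d}" for i in range(1, 301)]
sample = simple_random_sample(households, 30, seed=42)
print(sample[:5])
```

Passing a seed makes the draw repeatable for checking; leaving it as None gives a fresh draw each time.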
Suitability
The simple random sampling is suitable only for a small homogeneous
population. It may yield a representative sample under the following conditions.
(1) Where the population is a homogeneous group with reference to the specified
characteristics (e.g. students studying in fifth standard in a boys' school form a
homogeneous group as regards level of education and age group)
(2) Where the population is relatively small; and
(3) Where a complete list of all elements is available or can be prepared.
The simple random sampling is not suitable for drawing a sample from a large
heterogeneous population as it may not yield a representative sample of such a population.
ADVANTAGES
Some advantages of simple random sampling are
1. All elements in the population have an equal chance of being selected.
2. Of all the probability sampling techniques, simple random sampling is the
easiest to apply.
3. It is the simplest type of probability sampling to understand.
4. It does not require prior knowledge of the true composition of the population.
5. The amount of sampling error associated with any sample drawn can easily
be computed.
DISADVANTAGES
The simple random sampling technique suffers from certain drawbacks.
1. The use of simple random sampling may be wasteful because we fail to use
all of the known information about the population.
2. This technique does not ensure proportionate representation to various groups
constituting the population.
3. Sampling error under simple random sampling is often greater than that in other probability
samples of the same size (for example, well-stratified samples), because it makes no use of
known information about the population.
4. The size of the sample required to ensure its representativeness is usually larger
under this type of sampling than under other random sampling techniques.
5. A simple random design may be expensive in terms of time and money.
These problems have led to the development of alternative superior random
sampling designs like stratified random sampling, systematic sampling etc.
2. STRATIFIED RANDOM SAMPLING
This is an improved type of random or probability sampling. In this method,
the population is sub-divided into homogeneous groups or strata, and from each
stratum, a random sample is drawn. E.g., university students may be divided on the
basis of discipline and each discipline group may again be divided into juniors and
seniors; the employees of a business undertaking may be divided into managers
and non-managers and each of those two groups may be subdivided into salary-
grade-wise strata.
NEED FOR STRATIFICATION
Stratification is necessary for,
1. Increasing a sample's statistical efficiency
2. Providing adequate data for analyzing the various sub-populations and
3. Applying different methods for different strata.
Stratification ensures representation to all relevant sub-groups of the
population. It is thus more efficient statistically than simple random sampling.
Stratification is essential, when the researcher wants to study the characteristics of
population subgroups, e.g., male and female employees of an organization.
Stratification is also useful when different methods of data collection are
used for different parts of the population e.g., interviewing for workers and self-
administered questionnaire for executives.
Thus the stratified random sampling method is appropriate for a large
heterogeneous population.
STRATIFICATION PROCESS
This involves three major decisions
1. The stratification base or bases to be used should be decided. The ideal base
would be the principal variable under study. For example, if the size of firms is a
primary variable, the firms may be stratified on the basis of the total capital
employed.
2. The number of strata. What should be the number of strata?
There is no precise answer to this question. The larger the number of strata, the greater
may be the degree of representativeness of the sample. The decision may be based
on the number of sub-population groups to be studied and the cost of stratification.
Cochran suggests that there is little to be gained in estimating overall population
values when the number of strata exceeds six.
3. Strata sample sizes
There are two alternatives. First, the strata sample sizes may be proportionate to
the strata's shares in the total population. Second, they may be disproportionate to
the strata's shares.
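The proportionate alternative can be illustrated with a minimal Python sketch. The function name, the strata (100 managers and 900 non-managers) and the total sample size of 50 are hypothetical values chosen for illustration; they do not come from the text.

```python
import random

def proportionate_stratified_sample(strata, total_n, rng=None):
    """Draw a simple random sample from each stratum, sized in proportion
    to the stratum's share of the population (hypothetical helper)."""
    rng = rng or random.Random()
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Rounding may make the realised total differ slightly from total_n.
        n_h = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n_h)
    return sample

# Assumed population: 100 managers and 900 non-managers.
employees = {
    "managers": [f"M{i}" for i in range(100)],
    "non_managers": [f"N{i}" for i in range(900)],
}
picked = proportionate_stratified_sample(employees, total_n=50, rng=random.Random(0))
print({name: len(units) for name, units in picked.items()})
```

With these assumed figures the managers, who form 10% of the population, supply 10% of the sample. A disproportionate design would simply replace the computed stratum size with a size fixed by the researcher.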
3. SYSTEMATIC SAMPLING
This method of sampling is an alternative to random selection. It consists of
taking every Kth item in the population after a random start with an item from 1 to
K.
As the interval between sample units is fixed, this method is also known as the
fixed interval method.
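The selection rule can be sketched in a few lines of Python. The function name and the street of 100 houses are assumptions made only for this illustration.

```python
import random

def systematic_sample(population, k, rng=None):
    """Take every k-th item after a random start among the first k items."""
    rng = rng or random.Random()
    start = rng.randrange(k)  # 0-based equivalent of a random start from 1 to K
    return population[start::k]

houses = list(range(1, 101))  # hypothetical street with houses numbered 1..100
sample = systematic_sample(houses, k=10, rng=random.Random(1))
print(sample)
```

Only the starting point is random; every later unit follows at the fixed interval K.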
APPLICATIONS
Systematic selection can be applied to various populations such as students
in a class, houses in a street, telephone directory, customers of a bank, assembly
line output in a factory, members of an association and so on.
ADVANTAGES
The major advantages of systematic sampling are
1. It is much simpler than random sampling. It is easy to use.
2. It is easy to instruct the field investigators who use this method.
3. This method may require less time. A researcher operating on a limited time
schedule will prefer this method.
4. This method is cheaper than simple random sampling.
5. It is easy to check whether every Kth item has been included in the sample.
6. Sample is spread evenly over the population.
DISADVANTAGES
1. This method ignores all elements between two Kth elements selected. Further,
except the first element, other selected elements are not chosen at random.
Hence this sampling cannot be considered to be a probability sampling in the
strict sense of the term.
2. As each element does not have an equal chance of being selected, the resulting
sample is not a random one. For studies aiming at estimation or generalization,
this disadvantage would be a serious one.
3. This method may sometimes give a biased sample. If by chance several of the Kth
elements chosen represent a particular group, that group would be
over-represented in the sample.
4. CLUSTER SAMPLING
Cluster sampling means random selection of sampling units from "clusters"
consisting of population elements. Each such sampling unit is a cluster of
population elements. Then, from each selected sampling unit, a sample of population
elements is drawn by either simple random selection or stratified random selection.
FEATURES
What makes a desirable cluster depends on the survey's situation and
resources. The individual elements are determined by the survey objectives. For
example, for an opinion poll, the individual person is a population element, but for
a socio economic survey of households or a consumer behavior survey, a
household may be the population element or unit of study.
CLUSTER SAMPLING PROCESS
The process of cluster sampling involves the following steps
1. IDENTIFY CLUSTERS
What can be appropriate clusters for a population? This depends on the nature of
the study and the distribution of the population relating to it. The appropriate
clusters may be area units or organizations/organizational units.
2. EXAMINE THE NATURE OF CLUSTERS
How homogeneous is the cluster? Clusters should not be homogeneous in internal
characteristics. A sample drawn from such clusters cannot fully represent the
overall population. Hence clusters should be constructed in such a way as to increase
intra-cluster variance. For example, villages/city blocks that contain different
income/social groups may be combined into one cluster.
3. DETERMINE THE NUMBER OF STAGES
Shall we use single-stage or multistage cluster? This depends primarily on
the geographical area of the study, the scale of the study, the size of the population
and the consideration of costs. Depending on these factors, the following
alternatives are possible.
(a) Single stage sampling
Select clusters on a random basis and study all elements in each of the sample
clusters.
(b) Two stage sampling
Select clusters and then select elements from each selected cluster.
(c) Multi stage sampling
Sampling is carried out in two or more stages. The population is regarded as being
composed of a number of first stage sampling units. Each of them is made up of a
number of second stage units and so forth. That is, at each stage, a sampling unit is
a cluster of sampling units of the subsequent stage.
5. AREA SAMPLING
This is an important form of cluster sampling. In larger field surveys,
clusters consisting of specific geographical areas like districts, taluks, villages, or
blocks in a city are randomly drawn and selected for
sampling. It is not a separate method of sampling, but forms part of cluster
sampling. Area sampling invariably involves multi-stage sampling and
sub-sampling.
6. MULTI-STAGE SAMPLING AND SUB SAMPLING
Multi-stage sampling is carried out in two or more stages. The
population is regarded as being composed of a number of first stage sampling
units. Each of them is made up of a number of second stage units and so forth. A
sampling unit is a cluster of the sampling units of a subsequent stage. First, a sample
of the first stage sampling units is drawn; then, from each of the selected first stage
sampling units, a sample of second stage sampling units is drawn. The procedure
continues down to the final sampling units or population elements. An appropriate
random sampling method is adopted at each stage.
USAGE
Multi-stage sampling is appropriate where the population is scattered over a
wider geographical area and no frame or list is available for sampling. It is also
useful when a survey has to be made within a limited time and cost budget.
ADVANTAGES
The crucial advantages of multi-stage sampling are
1. It results in concentration of field work in compact, small areas and
consequently in a saving of time, labour and money.
2. It is more convenient, efficient and flexible than single-stage sampling.
3. It obviates the necessity of having a sampling frame covering the entire
population.
DISADVANTAGES
The major disadvantage of multi-stage sampling is that the procedure
for estimating sampling error and cost is complicated. It is difficult for
non-statisticians to follow this estimation procedure.
SUB SAMPLING
Sub-sampling is part of a multi stage sampling process. In multi stage
sampling, the sampling in second and subsequent stage frames is called
sub-sampling. Suppose that from a population of 40,000 households in 800 streets of a
city, we want to select a sample of about 400 households. We can select a sample of
400 individual households (elements) that would be scattered over the city, whereas a
cluster sample would be confined to 8 streets. Clustering reduces survey costs, but
increases the sampling error. Sub-sampling balances those two conflicting effects
of clustering. In the above case, first a sample of, say, 80 streets may be drawn.
From each of the selected streets a 10% sub-sample of households may be drawn. In
each of the above stages, an appropriate probability sampling method (simple
random, stratified random or systematic random sampling) may be
adopted.
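The street example can be worked through as a rough Python sketch. It assumes, purely for illustration, that the 40,000 households are spread evenly at 50 per street and that simple random sampling is used at both stages; the variable names are hypothetical.

```python
import random

rng = random.Random(42)

# Assumed frame: 800 streets with 50 households each (40,000 in all).
streets = {s: [(s, h) for h in range(50)] for s in range(800)}

# Stage 1: draw 80 streets at random.
selected_streets = rng.sample(sorted(streets), 80)

# Stage 2 (sub-sampling): draw a 10% sample of households in each selected street.
sample = []
for street in selected_streets:
    households = streets[street]
    sub_n = len(households) // 10
    sample.extend(rng.sample(households, sub_n))

print(len(selected_streets), "streets,", len(sample), "households")
```

The final sample has the intended 400 households, but they are spread over 80 streets rather than confined to 8, while still being far less scattered than 400 households drawn individually across the city.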
7. SAMPLING WITH PROBABILITY PROPORTIONAL TO SIZE (PPS)
The procedure of selecting clusters with probability proportional to size
(PPS) is widely used. If one primary cluster has twice as large a population as
another, it is given twice the chance of being selected. If the same number of
persons is then selected from each of the selected clusters, the overall probability
of selection of any person will be the same. Thus PPS is a better method for securing a
representative sample of population elements in multi-stage cluster sampling.
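Why the overall probability comes out equal can be checked with a small Python sketch. The four cluster sizes and the take of 20 persons per cluster are made-up figures, and the sketch considers a single PPS draw of one cluster for simplicity.

```python
from fractions import Fraction

def overall_selection_probability(cluster_size, total_population, persons_per_cluster):
    """Chance that one given person is sampled when a single cluster is drawn
    with probability proportional to size and a fixed number of persons is
    then taken from it (illustrative helper, not from the text)."""
    p_cluster = Fraction(cluster_size, total_population)    # PPS draw of the cluster
    p_within = Fraction(persons_per_cluster, cluster_size)  # equal take inside the cluster
    return p_cluster * p_within

# Hypothetical primary clusters of unequal size.
cluster_sizes = {"A": 2000, "B": 1000, "C": 500, "D": 500}
total = sum(cluster_sizes.values())

probabilities = {
    name: overall_selection_probability(size, total, persons_per_cluster=20)
    for name, size in cluster_sizes.items()
}
print(probabilities)
```

The cluster size cancels out: a person in a large cluster is more likely to have the cluster chosen but less likely to be picked within it, so every person ends up with the same chance.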
ADVANTAGES
The major advantages of 'PPS' are,
1. Clusters of various sizes get proportionate representation.
2. PPS leads to greater precision than would a simple random sample of clusters
and a constant sampling fraction at the second stage.
3. Equal-sized samples from each selected primary cluster are convenient for
field work. If one interviewer is assigned to each cluster, the interviewers have
equal workloads.
APPLICATION
Since in practice primary sampling units generally vary considerably in size,
sampling with PPS is used in all multistage sampling.
LIMITATION
PPS cannot be used if the sizes of the primary sampling clusters are not
known.
8. DOUBLE SAMPLING AND MULTI-PHASE SAMPLING
Double (or two phase) sampling "refers to the subselection of the final sample
from a preselected larger sample that provided information for improving the final
selection". When this procedure is extended to more than two phases of selection, it is
then called multi-phase sampling. This is also known as sequential sampling, as
sub-sampling is done from a main sample in phases. Additional information from
sub-samples of the full sample may be collected at the same time or later.
USAGE
Double or multi phase sampling is a compromise solution for a dilemma
posed by undesirable extremes. "The statistics based on the sample can be
improved by using ancillary information from a wide base. But this is too costly to
obtain from the entire population of N elements; instead, information is obtained
from a larger preliminary sample n1 which includes the final sample n".
Multiphase sampling is appropriate when it is more convenient and
economical to collect certain items of general information on the whole of the units
of a sample and other items of special information from a sub-sample of cases
possessing a given set of characteristics.
9. REPLICATED OR INTERPENETRATING SAMPLING
Replicated sampling can be used with any basic sampling technique, simple
or stratified, single or multi-stage or single or multiphase sampling.
ADVANTAGES
The major advantages of replicated sampling are
1. It provides a simple means of calculating the sampling error.
2. It is practical if the size of the total sample is too large. To get the result
ready in time, one or more of the replicates can be used to get advance
results.
3. The replicated samples can throw light on variable non-sampling errors.
If a different set of interviewers are used, an estimation of inter-
interviewer variation for each of the sub-samples can be obtained.
DISADVANTAGES
A disadvantage of replicated sampling is that it limits the amount of
stratification that can be employed. This limitation is a real drawback to the use of
replicated sampling in a multi-stage sampling plan.
NON PROBABILITY SAMPLING METHODS
1. Convenience Sampling
2. Purposive Sampling
3. Quota Sampling
4. Accidental Sampling
5. Snow-ball Sampling
1. CONVENIENCE SAMPLING
This is non-probability sampling. It means selecting sample units in a
hit-and-miss fashion, e.g., interviewing people whom we happen to meet. This
sampling also means selecting whatever sampling units are conveniently available,
e.g., a teacher may select students in his class. This method is also known as
accidental sampling because the respondents whom the researcher meets
accidentally are included in the sample.
USEFULNESS
Though convenience sampling has no status, it may be used for simple
purposes such as testing ideas or gaining ideas or rough impressions about a
subject of interest. It lays the groundwork for a subsequent probability sampling.
Sometimes it may have to be necessarily used.
ADVANTAGES
1. Convenience sampling is the cheapest and simplest.
2. It does not require a list of population.
3. It does not require any statistical expertise.
DISADVANTAGES
a) Convenience sampling is highly biased because of the researcher's
subjectivity, and so it does not yield a representative sample.
b) It is the least reliable sampling method. There is no way of estimating the
representativeness of the sample.
c) The findings cannot be generalized.
2. PURPOSIVE or JUDGEMENT SAMPLING
This method means deliberate selection of sample units that conform to
some pre-determined criteria. This is also known as judgement sampling. This
involves selection of cases which we judge as the most appropriate ones for the
given study. It is based on the judgement of the researcher or some expert. It does not
aim at securing a cross section of a population.
APPLICATION
The method is appropriate when what is important is the typicality and
specific relevance of the sampling units to the study and not their overall
representativeness of the population.
ADVANTAGES
The advantages of purposive or judgement sampling are
1. It is less costly and more convenient.
2. It guarantees inclusion of relevant elements in the sample. Probability
sampling plans cannot give such a guarantee.
DISADVANTAGES
The demerits of judgement sampling are
1. This does not ensure the representativeness of the sample.
2. This is less efficient for generalizing when compared with random sampling.
3. This method requires more extensive prior information about the population
one plans to study.
4. This method does not lend itself to the use of inferential statistics because
this sampling does not satisfy the underlying assumption of randomness.
3. QUOTA SAMPLING
This is a form of convenience sampling involving the selection of quota
groups of accessible sampling units by traits such as sex, age, social class, etc.
When the population is known, it is easy to choose various categories like sex, age,
religion, social class, etc., in specific proportions. Each investigator may be given
an assignment of quota groups pre-determined in the area assigned to him.
APPLICATION
Quota sampling is used in studies like marketing surveys, opinion polls and
readership surveys, which do not aim at precision but at getting crude
results quickly.
MERITS
The major advantages of quota sampling are
1. It is considerably less costly than probability sampling
2. It takes less time
3. There is no need for a list of population
4. Field work can easily be organized. Strict supervision is not required.
SHORTCOMINGS
1. It may not yield a precise or representative sample, and it is impossible
to estimate the sampling error. The finding, therefore is not generalizable
to any significant extent.
2. Interviewers may tend to choose the most accessible persons; they may
ignore slums or areas difficult to reach. Thus they may fail to secure
representative samples within the quota groups.
3. Strict control of field work is difficult
4. It is difficult to sample more than three variable dimensions.
5. Quota sampling is subject to a higher degree of classification error,
because the investigators are likely to base their classification of a respondent's
social status or economic status mostly on their impressions of him.
4. SNOWBALL SAMPLING
This is the colorful name for techniques of building up a list or a sample of a
special population by using an initial set of its members as informants. For
example, if a researcher wants to study the problems faced by Indians in another
country, he may identify an initial group of Indians through some source like
the Indian embassy. Then he can ask each one of them to supply names of other
Indians known to them, and continue this procedure until he gets an exhaustive list
from which he can draw a sample or make a census survey.
ADVANTAGES
The advantages of snowball sampling are
1. It is very useful in studying social groups, informal groups in formal
organizations, and diffusion of information among professionals of various
kinds.
2. It is useful for smaller populations for which no frames are readily available.
DISADVANTAGES
1. The major disadvantage of snowball sampling is that it does not allow the use
of probability statistical methods. Elements included are dependent on the
subjective choice of the original selected respondents.
2. It is difficult to apply this method when the population is large.
3. It does not ensure the inclusion of all elements in the list.
SAMPLING AND NON-SAMPLING ERRORS
A survey aims at estimating or inferring selected population characteristics
or parameters by studying either the entire population or a sample of the
population. The research results may differ from the true values of the parameters
under study. Such differences are known as errors and biases.
The errors of a survey may be classified into (a) sampling errors, (b) sampling
biases (c) non-sampling errors and (d) non-sampling biases.
SAMPLING ERRORS
The errors which arise because of studying only a part of the total population
are called sampling errors. These may arise due to non-representativeness of the
samples and the inadequacy of sample size. When several samples are drawn from a
population, the results would not be identical. The degree of variation of sample
results is measured by the standard deviation, and it is known as the standard error of
the concerned statistic. As sample size increases, the magnitude of the sampling error
decreases. Thus sample size and sampling error are negatively correlated.
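The relation between sample size and standard error can be illustrated with a small simulation in Python. The population (10,000 values with mean 50 and standard deviation 10), the sample sizes of 25 and 400, and the 200 repeated draws are all assumptions chosen for the sketch.

```python
import random
import statistics

def empirical_standard_error(population, n, replicates, rng):
    """Standard deviation of the sample means over repeated samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(replicates)]
    return statistics.stdev(means)

rng = random.Random(0)
# Hypothetical population of 10,000 measurements.
population = [rng.gauss(50, 10) for _ in range(10_000)]

se_small = empirical_standard_error(population, n=25, replicates=200, rng=rng)
se_large = empirical_standard_error(population, n=400, replicates=200, rng=rng)
print(f"n=25: {se_small:.2f}   n=400: {se_large:.2f}")
```

The spread of the sample means is visibly smaller for the larger samples, which is the negative relation between sample size and sampling error described above.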
SAMPLING BIAS
The average of the estimates of a population parameter obtained from an
infinite number of samples is called the expected value of the estimator. The
difference between this value and the true value of the parameter is the bias. Bias
may arise
1. If the sampling is done by a non-random method,
2. If the sampling frame is incomplete or inaccurate and
3. If some sections of the population are not available or refuse to co-operate.
Any of these factors will cause non-compensating errors, which cannot be reduced
by an increase in sample size. The only sure way of avoiding bias arising through
the sampling method is to use a random method. Randomness is an essential part
of the protection against sampling bias.
NON SAMPLING ERRORS
These are errors, which arise from sources other than sampling. They
include errors of observation, errors of measurements and errors of responses. Data
are collected through the methods of observation or interviewing. The physical
procedure of observation or interviewing is subject to imperfections, which cause
errors.
Measurement error consists of errors in processing and analysis. Errors of
response include incorrect responses of the respondents or mistakes in noting their
responses, etc.
NON SAMPLING BIAS
These biases pose problems for scientific measurement. They affect both the
population and sample values and account for the difference between the population
value and the true value. They consist of biases of observation and
non-observation, response biases and processing biases. Obtaining and recording
observations incorrectly causes biases of observation. Non-observation biases arise
from failure to obtain observations on some segments of the population due to
either non-coverage or non-response. The latter may be due to refusals,
not-at-homes, lost forms, etc. Response biases consist of biases arising from
imperfections in field observations or interviewing. Processing biases are produced
during coding, tabulating and computing.
TOTAL ERROR
In sampling theory, a popular model combines sampling and non-sampling
errors and biases into a total error. This total error is the square root of
the sum of the squares of the variable errors and of the bias.
\[ \text{Total error} = \sqrt{VE^{2} + \text{Bias}^{2}} \]
The variable errors are caused only by sampling errors, and VE equals the
standard error of sampling. Bias is mostly caused by measurement biases (i.e.
non-sampling errors).
[Figure: right-angled triangle whose two legs are the sampling error and the biases, with the total error as the hypotenuse]
The total error depends on the length of both the legs. The sampling error/
standard error leg can be shortened by improving the sample design and by
increasing the sample size. The length of the biases leg may be reduced by improving
the tools of data collection, the precision of methods of data collection, field work,
coding, processing and analysis.
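As a numerical illustration of the two legs, the short Python sketch below treats the total error as the hypotenuse of a right-angled triangle. The error values of 3.0 and 4.0 are invented for the example.

```python
import math

def total_error(variable_error, bias):
    """Square root of the sum of the squared variable error and squared bias."""
    return math.hypot(variable_error, bias)

baseline = total_error(3.0, 4.0)            # 5.0
halved_sampling = total_error(1.5, 4.0)     # sampling-error leg halved
halved_bias = total_error(3.0, 2.0)         # bias leg halved
print(baseline, round(halved_sampling, 2), round(halved_bias, 2))
```

Under these assumed figures, halving the larger leg (here the bias) lowers the total error more than halving the sampling error, which suggests why a bigger sample alone may not help much when bias dominates.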
The measurement of sampling errors does not pose much problem, but the
measurement of non sampling errors requires special procedures and it is costly.
Hence the reduction of non-sampling errors is a challenge to the researchers.
SELECTION OF A SAMPLE
A sample design is a definite plan for obtaining a sample from a given
population. It refers to the technique or the procedure the researcher would adopt
in selecting items for the sample. Sample design may as well lay down the number
of items to be included in the sample i.e., the size of the sample.
(A) STEPS IN SAMPLING DESIGN
The following steps have to be followed while developing a sampling design.
(1) Outlining the universe
The first step is clearly outlining the set of objects to be studied. It may be
infinite or finite. In a finite universe, the number of items is certain, for example the
number of workers in a factory, the number of dealers of a company, etc.
But in the case of an infinite universe the number of items is infinite, for example the
number of stars in the sky, the number of TV viewers, etc.
(2) Defining sampling unit
The second step is to identify the sampling unit of the universe. For example, it may be
defined on a geographical basis such as state, district or village.
(3) Sampling frame
Once the sampling unit is defined, the third step is preparing the source list to cover
all the sampling units. This list contains the names of all the items of a universe. If the list
is not ready, the researcher has to prepare it.
(4) Size of sample
The fourth step is fixing the number of items to be selected from the universe to
constitute a sample. It may be 10% or 20% or any percentage of the universe. The
size of sample should neither be excessively large, nor too small. It should be
optimum. An optimum sample is one which fulfils the requirements of efficiency,
representativeness, reliability and flexibility.
Budgetary constraints must invariably be taken into consideration when we decide
the sample size.
(B) CHARACTERISTICS OF A GOOD SAMPLE DESIGN
The following are the characteristics of a good sample design
a) Sample design must result in a truly representative sample.
b) Sample design must be one which results in a small sampling error.
c) Sample design must be viable in the context of funds available for the
research study.
d) Sample design must be such that the systematic bias can be controlled in a
better way.
e) Samples should be such that results of the sample study can be
applied, in general, for the universe with reasonable confidence.
(C) CRITERIA FOR EVALUATING AN IDEAL SAMPLE DESIGN
Three criteria should be borne in mind in the construction of a sample design.
1. It is important that measurable or known probability sampling techniques
should be used. Sampling procedures which do not provide a good basis for the
estimation of a sampling error should be avoided.
2. Due care should be exercised to use simple, straightforward, workable
methods suitably adapted to available techniques and personnel.
3. Every effort should be made to achieve maximum reliability of results for the
expenditure incurred as a result of the application of the sampling procedure.
SELECTING SAMPLE SIZE BY TRADITIONAL METHODS
Generally there are four traditional approaches to determining sample size.
First, the analyst can simply select a size either arbitrarily or on the basis of some
judgmentally based criterion. Similarly, there may be instances where the size of
sample represents all that were available at the time, e.g., when a sample is composed
of members of some organization and data collection occurs during a meeting of
the organization. Second, analysis considerations may enter and the sample size is
determined from the minimum cell size needed. For example, if the critical aspect
of the analysis requires a breakdown of three variables which creates 12 cells, and
it is felt that there should be at least 30 observations in a cell, then the absolute
minimum sample size needed would be 360. Third, the budget may determine the
sample size. If, for example, the research design for a survey calls for personal
interviews, the cost of each interview is estimated to be Rs.50, and the
budget allotted to data collection is Rs.10,000, then the sample size would be
200.
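The arithmetic behind the second and third approaches can be written out as a minimal Python sketch. The function names are hypothetical; the figures are those of the two examples above.

```python
def minimum_sample_for_cells(number_of_cells, minimum_per_cell):
    """Smallest sample that gives every analysis cell its minimum count."""
    return number_of_cells * minimum_per_cell

def affordable_sample(budget, cost_per_interview):
    """Largest number of whole interviews the data collection budget can pay for."""
    return budget // cost_per_interview

cells_based = minimum_sample_for_cells(12, 30)   # 360
budget_based = affordable_sample(10_000, 50)     # 200
print(cells_based, budget_based)
```

When both considerations apply to the same study, the two figures can conflict, as they would here: the budget covers 200 interviews while the analysis calls for at least 360.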
It may appear that these methods are for non-probability samples. While this
certainly is true, these methods are also applicable to probability samples and have
occasionally been used for such samples. For probability samples, the precision
must be determined after data collection.
The fourth approach to sample size determination is based on specifying the
precision of estimation desired in advance and then applying the appropriate
standard error formula to calculate the sample size. This is the approach of
traditional inference. Two major classes of procedures are available for estimating
sample sizes within the context of traditional inference. The first and better known,
of these is based on the idea of constructing confidence intervals around sample
means or proportions. This can be called the confidence-interval approach.
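A rough Python sketch of the confidence-interval approach follows. The text does not give the formulas, so the sketch assumes the usual large-sample expressions n = (z·σ/e)² for a mean and n = z²·p(1−p)/e² for a proportion, where e is the desired margin of error; the function names and the example inputs are illustrative only.

```python
import math
from statistics import NormalDist

def z_value(confidence):
    """Two-sided standard normal critical value for the given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Assumed formula: n = (z * sigma / margin)^2, rounded up."""
    z = z_value(confidence)
    return math.ceil((z * sigma / margin) ** 2)

def sample_size_for_proportion(p, margin, confidence=0.95):
    """Assumed formula: n = z^2 * p * (1 - p) / margin^2, rounded up."""
    z = z_value(confidence)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

# Hypothetical inputs: standard deviation 10 with margin 2; proportion 0.5 with margin 0.05.
n_mean = sample_size_for_mean(sigma=10, margin=2)
n_prop = sample_size_for_proportion(p=0.5, margin=0.05)
print(n_mean, n_prop)
```

Both functions need a prior guess of the population variability (σ or p), which in practice would come from a pilot study or earlier surveys.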
The second approach makes use of both type I (rejecting a true null
hypothesis) and type II (accepting a false null hypothesis) error risks and can be
called the hypothesis-testing approach.
However, two points must be made. First as with the other approaches, the analyst
must still calculate the standard error after data collection in order to know what it
is for the actual sample that provided data. Second, the size of sample that results
from traditional inference refers to the obtained (or resultant) sample.
Depending on the data collection method used, the original sample may have to be
much larger. For example, suppose that the size of the desired sample is 582. A
mail survey is used for data collection and past experience has shown that the
response rate would be around 25%. Then the original sample size in this case would
have to be 2,328 in order that 582 responses would be obtained.
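The adjustment for non-response is a single division, sketched below in Python with a hypothetical function name and the figures from the example above.

```python
import math

def original_sample_size(desired_responses, expected_response_rate):
    """Number of units to approach so that the expected returns reach the target."""
    return math.ceil(desired_responses / expected_response_rate)

mailout = original_sample_size(582, 0.25)   # 2328
print(mailout)
```

The result rests on the response rate actually turning out near the assumed 25%; a lower rate would leave the obtained sample short of 582.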
CRITERIA OF A GOOD SAMPLE
Whether the result obtained from a sample survey would be accurate or not
depends upon the quality of the sample. The characteristics of a good sample are
described below.
1. REPRESENTATIVENESS
A sample must be representative of the population. Probability sampling
techniques yield representative samples. In measurement terms, the sample must
be valid. The validity of a sample depends upon its accuracy and precision.
2. ACCURACY
Accuracy is defined as the degree to which bias is absent from the sample. An
accurate sample is one, which exactly represents the population. It is free from any
influence that causes any difference between sample value and population value.
3. PRECISION
The sample must yield precise estimates. Precision is measured by the
standard error or standard deviation of the sample estimate. The smaller the
standard error of estimate, the higher is the precision of the sample.
4. SIZE
A good sample must be adequate in size in order to be reliable. The sample
should be of such size that the inferences drawn from the sample are accurate to a
given level of confidence.
CHAPTER – V
SCALING TECHNIQUES
SCALING TECHNIQUES: MEANING
Scaling denotes the theory and practice of associating numbers with objects
by ranking the qualitative aspects such as character, ability, personality, etc. This
makes rigorous statistical analysis possible. Scaling can be done either by the
subject himself or by the interviewer. Ranking of candidates after a viva voce
examination and scaling the items/objects/individuals in terms of certain attributes are
examples of scaling.
Broadly, techniques for registering differences in degree are of two types. In
the first type one makes a judgement about some characteristics of an individual
and places him directly on a scale defined in terms of that characteristic.
A scale is a continuum consisting of the highest point and the lowest point
there being several intermediate points between these two poles.
These scale point positions are so related to each other that the second point
indicates a higher degree in terms of a given characteristic compared to a third
point and the third point indicates a higher degree compared to the fourth and so
on.
The second type of technique for registering difference in degree consists of
a questionnaire constructed in such a way that the score of an individual's responses
assigns him a place on a scale. Both the rating scales and attitude scales have the object
of assigning individuals to numerical positions to make possible the distinctions of
degree.
TYPES OF SCALE
Scales may be classified in many ways: in terms of,
a) subject matter,
b) scaling methods or techniques,
c) scale function
d) level of measurement, and
e) number of dimensions
However, there is no widely accepted system of classification.
a) Subject matter: Scales are designed to measure 1) attitude, 2) social distance,
3) socio-economic status, and 4) other variables.
b) Scaling techniques: In terms of techniques, scales may be classified into 1)
arbitrary scales, 2) judgement scales, 3) 'item analysis' scales, 4) rating scales, 5)
ranking scales, 6) cumulative scales, and 7) factorial scales.
c) Scale function: Scales may be classified by their predictive power or reproducibility.
Scales which have the power to predict an external criterion are 'predictive scales'.
Aptitude tests and prediction of marital adjustment are examples of this type of
scale. Reproducible scales are designed to arrange sets of data so that if the researcher
is given a single score, he can reproduce all items on the scale. Louis Guttman has
designed scales of this type.
d) Levels of Measurement: In terms of this property, scales may be classified as
1) nominal, 2) ordinal, 3) Interval or ratio scales.
e) Number of Dimensions: Scales are either 1) unidimensional or 2)
multidimensional.
A unidimensional scale measures only one attribute of the respondent or
object, e.g. attitude, opinion, job satisfaction, and so on. Most of the scales used in
research are of this type. A multidimensional scale measures several dimensions of
an object.
Following are the types of scales used in social science research.
(1) Graphic rating scale
This is perhaps the most widely used rating scale. In this type the rater
indicates his rating by simply making a mark at the appropriate point on a line that
runs from one extreme of the attribute or characteristic in question to the other
extreme. Scale points with brief descriptions may be indicated along the line, their
function being to help the rater in localizing his rating.
One of the major advantages of these scales is that they are relatively easy to use
and provide opportunity for fine discriminations of degree.
(2) Itemized rating scale
These are also known as numerical scales. In this type the rater selects one
of a limited number of categories that are ordered in terms of their scale position.
Scales with five or seven categories have generally been employed but some have
used even as many as eleven points. Barker, Dembo and Lewin, in their study of the
effects of frustration on constructiveness of play of young children, constructed a
seven point scale for rating constructiveness. They drew specific illustrations of the
activities that were judged to fall at various points on the scale. The first point on
the scale, indicating least constructiveness in the above study, was "the toys are
examined superficially"; the fourth point, indicating moderate constructiveness, was
"definitely more complicated and elaborate manipulation of the toys"; and the
seventh point, indicating the highest degree of constructiveness, was "play showing
more than usual originality".
(3) Comparative Rating Scales
In this category of rating scales, the positions on the ratings scale are
expressly defined in terms of a given population, a group or in terms of people
with known characteristics.
The rater, for example, may be called upon to indicate whether an individual's
problem-solving skill or some other attribute most closely resembles that of Mr. X
or of Mr. Y or of Mr. Z etc., all of whom are known to him (the rater) in respect of
the skill or attribute. Or again, a rater may be asked to estimate the ability of an
individual to do a certain kind of work as compared to the total group of persons
engaged in that kind of work whom the rater has known. The rater then may indicate
whether the individual is more capable than 10% of them or 20% of them, etc.
(4) Rank Order Scale
Here the rater is required to rank subjects/persons specifically in relation to
one another. He indicates which person rates highest in terms of the
characteristic being measured, which person is next highest, and so on.
In rating scales the rater may himself be the subject to be rated. This is called
self-rating, and it has certain advantages. The individual (the rater himself) is
often in a better position to observe and report his feelings, opinions etc. than
anyone else. But if the individual is not aware, as is not unusual, of his beliefs
or feelings, or does not wish to express them for certain reasons (such as fear),
then the self-rating procedure is of little value.
(5) Attitude Scales
In this approach, the individual does not directly describe himself in
terms of his position on the dimension in question; rather he expresses
his agreement or disagreement with a number of statements relevant to the issue.
On the basis of his responses he is assigned a score. The score a person gets
indicates his position on the dimension. Since this technique has generally been
used in the measurement of attitudes, these scales are called attitude scales.
It has been said that attitude scales are made up of various statements or
items relevant to an issue (like nationalization of banks, co-education, inter-caste
marriages, etc). The individual subjects respond in a particular manner to these
statements. To these modes of response, particular scores are assigned. The way in
which a scale discriminates among individuals depends on the way in which the
scale is formulated and on the method of scoring. In some scales, the
statements/items form a gradation of such a nature that the individual agrees with
only one or two of these and disagrees with the remaining statements on either side
of those agreed to. Such scales, in which a person's response fixes his position,
are called 'differential scales'. In other scales, the individual indicates his
agreement or disagreement with all statements and his total score is computed by
adding the scores assigned to his responses to all separate statements. Such scales
may be called 'summated scales'. Yet another type of scale is set up in such a way
that the statements or items form a cumulative series; in principle, an individual
whose attitude is at a certain point on the dimension being measured will answer
favourably all the items on one side of that point and answer unfavourably all
those on the other side of that point. These are called 'cumulative scales'. Let us
discuss each of these types of scales hereunder.
a) Differential scales
These scales for the measurement of attitudes are closely associated with the
name of L.L. Thurstone, hence they are often called Thurstone-type scales. Such a
scale consists of a number of statements whose position on the scale has been
determined by judges, the judges being persons whose judgements as to the
relative rank of different statements along the dimension can be relied on. The
method of equal-appearing intervals is most commonly used in the construction of
this scale. This method is described below.
In selecting the statements for the scale and assigning scores to them, the
following procedure is used:
(a) The researcher gathers a large number of statements conceived as related to
the attitude being investigated.
(b) A large number of judges (up to 300) working independently are requested to
classify these statements into eleven groups or piles. The judges are requested to
place in the first pile the statements which they think are most favourable to the
issue (or most progressive or permissive, depending upon the dimension along which
the statements are to be placed), in the second pile those they consider next most
favourable and so on, and in the eleventh pile the statements they consider most
unfavourable. The sixth position is defined as the point at which the attitude is
neutral.
(c) The scale value of a statement is computed as the mean or median position to
which it is assigned by the group of judges. Statements that have too broad a
scatter, that is, whose evaluation by different judges varies very widely, are
discarded as ambiguous or irrelevant.
(d) A final selection is made, taking evaluated items that spread out evenly along
the scale from one extreme position to the other (scale values like 10.3, 9.4, 8.4,
5.5, 4.7, 3.3, 2.6 and 1.7).
The resulting scale is thus a series of statements, usually about twenty, the
position of each statement on the scale having been determined by the judges'
classification. During the administration of the questionnaire the subjects are
asked to check the statement or statements with which they agree, or to check the
two or three statements that are closest to their position.
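The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the judges' data are invented, the interquartile range is used as the measure of scatter, the cut-off of 2.0 piles is an arbitrary assumption, and scoring a respondent by the median scale value of the statements he checks is a common convention that the text itself does not spell out. All function names are hypothetical.

```python
from statistics import median, quantiles

def scale_value_and_scatter(pile_numbers):
    """Return (median pile, interquartile range) for one statement."""
    q1, _, q3 = quantiles(pile_numbers, n=4, method="inclusive")
    return median(pile_numbers), q3 - q1

def select_statements(judgements, max_scatter=2.0):
    """Keep statements whose scatter is small; return them ordered by scale value.

    judgements maps a statement label to the pile numbers (1 to 11) given by
    the judges. max_scatter is an assumed cut-off, not taken from the text.
    """
    kept = {}
    for statement, piles in judgements.items():
        value, scatter = scale_value_and_scatter(piles)
        if scatter <= max_scatter:
            kept[statement] = value
    return dict(sorted(kept.items(), key=lambda pair: pair[1]))

def respondent_score(agreed_statements, scale_values):
    """Score a respondent as the median scale value of the statements checked."""
    return median(scale_values[s] for s in agreed_statements)

# Invented example: four statements, five judges each.
example_judgements = {
    "A": [1, 2, 2, 1, 2],
    "B": [6, 6, 5, 7, 6],
    "C": [1, 11, 6, 3, 9],    # judges disagree widely, so it is discarded
    "D": [10, 11, 10, 9, 11],
}
example_values = select_statements(example_judgements)
```

A respondent who checks statements "A" and "B" would then be scored with `respondent_score(["A", "B"], example_values)`.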
b) Summated scales
A summated scale, like the differential scale just discussed, consists of a
series of statements to which the subject is asked to react. The main difference
between the two is that, unlike the differential scales, only statements that seem
to be either definitely favourable or definitely unfavourable are used in the scale
(the intermediate shades being excluded). The respondent/subject indicates his
agreement or disagreement, and the degree thereof, with each item. Each response is
given a numerical score indicating its favourableness or unfavourableness. The
summation of the scores of his responses to all the separate statements gives
his total score. This score represents his position on the continuum of
favourableness-unfavourableness towards an issue.
The type of summated scale most frequently used in the study of social
attitudes follows the pattern devised by Likert. Therefore, it is referred to as the
Likert-type scale. In this scale, the subjects are asked to respond to each of the
statements in terms of several degrees of agreement or disagreement.
The procedure for constructing a Likert-type scale is as follows:
(a) The investigator assembles a large number of statements considered relevant to
the attitude being investigated that are either clearly favourable or clearly
unfavourable.
(b) The statements are administered to a small sample of subjects representative
of those on whom the questionnaire is to be finally administered. The subjects then
rate their response to each item by checking one of the categories of approval or
disapproval on the scale below each statement.
(c) The responses to the various items are scored in such a way that a response
indicative of the most favourable attitude is given the highest score. It is
important that the responses are scored consistently in terms of the attitudinal
direction they indicate. Whether 'approve' or 'disapprove' is a favourable response
depends on the content and wording of the statement.
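Step (c) is where scoring errors most often creep in, so a small sketch may help. It assumes five response categories scored 5 to 1 and a keying table that records whether agreement with each statement is the favourable response; the category labels, item names and function names are all illustrative assumptions, not part of Likert's procedure as described here.

```python
# Assumed five-point response categories; the labels are illustrative.
RESPONSE_SCORES = {
    "strongly agree": 5,
    "agree": 4,
    "undecided": 3,
    "disagree": 2,
    "strongly disagree": 1,
}

def item_score(response, agreement_is_favourable):
    """Score one response so that the most favourable attitude gets 5."""
    raw = RESPONSE_SCORES[response]
    # For unfavourably worded statements, disagreement is the favourable
    # response, so the raw score is reversed (5 becomes 1, 4 becomes 2, ...).
    return raw if agreement_is_favourable else 6 - raw

def total_score(responses, keying):
    """Sum the item scores of one respondent over all keyed statements."""
    return sum(item_score(responses[item], keying[item]) for item in keying)

# Hypothetical two-item scale: q1 is worded favourably, q2 unfavourably.
keying = {"q1": True, "q2": False}
first = {"q1": "agree", "q2": "strongly disagree"}
second = {"q1": "strongly agree", "q2": "disagree"}
```

Here `total_score(first, keying)` and `total_score(second, keying)` both come to 9, which also shows that different patterns of response can produce the same total.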
The Likert-type scale has several advantages over the Thurstone scale.
1. It permits the use of items that are not manifestly related to the attitude being
studied. This is so because in the Likert method any item that is found empirically
consistent with the total score can be included. Unlike the Thurstone-type scale,
there is no need for agreement among judges, a requirement which restricts the
items (statements) to content that is obviously related to the attitude being
studied. As was indicated earlier, it is a great advantage to be able to use items
that do not, on the face of it, appear to have a direct relationship to the attitude
being studied.
2. The Likert-type scale is generally considered simpler to construct. At least the
procedure of construction is less cumbersome.
3. It is likely to be more reliable than a Thurstone-type scale with the same
number of items. The Likert-type scale permits the expression of several degrees
(usually five) of agreement-disagreement, whereas the Thurstone-type scale allows a
choice between only two alternative responses, i.e., acceptance or rejection.
4. The range of responses permitted to a statement in the Likert-type scale
provides more precise information about the individual's opinion on the issue.
One disadvantage of the Likert scale is that the total score of an individual
often has little clear meaning, since many patterns of response to the various
statements may produce the same score.
c) Cumulative scales
Cumulative scales, like the earlier scales, are made up of a series of items
with which the respondent indicates agreement or disagreement. The distinctive
feature of a cumulative scale is that the items are ordered or related to one
another in such a way that an individual who replies favourably to item No. 3 also
replies favourably to items No. 2 and No. 1, and one who replies favourably to item
No. 4 also replies favourably to items No. 3, 2 and 1, and so on. Thus, all
individuals who answer a given item favourably have higher scores on the total
scale than the individuals who answer that item unfavourably.
The individual's score is calculated by counting the number of items he
answers favourably. This score places him on the scale of favourable-unfavourable
attitude provided by the relationship of the items to one another. One of the earlier
scales for the measurement of attitudes, the Bogardus social distance scale, was
intended to be of a cumulative type. The social distance scale, which has become a
classic technique in the measuring of attitudes toward ethnic or racial groups,
lists a number of relationships to which members of a given ethnic group might be
admitted. The respondent is asked to indicate the relationships to which he would
be willing to admit members of each of these groups. His attitude is measured by
the closeness of relationship that he is willing to accept or the distance that he
would wish to maintain.
The technique developed by Guttman, known as 'scale analysis' or the
'scalogram method', has as its main purpose to ascertain whether the attitude or
characteristic being studied actually involves only a single dimension. In the
Guttman procedure, a universe of content (the attitude or characteristic under
study) is considered to be unidimensional only if it yields a perfect or almost
perfect cumulative scale.
Samuel Stouffer points out the characteristic feature of the Guttman
technique thus: "it must be possible to order the items such that, persons who
answer a given question favourably all have higher ranks than persons who answer
the same question unfavourably... the response to any item provides a definition of
the respondent's attitude."
The Guttman technique may rightly be regarded as a method of
determining whether a set of statements forms a unidimensional scale. It affords
no guidance for selecting statements that are likely to form a unidimensional
scale. The scale discrimination technique developed by Edwards and Kilpatrick is
a method of selecting a set of items likely to form a unidimensional scale.
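How close a set of responses comes to a perfect cumulative scale is usually summarised in scalogram analysis by a coefficient of reproducibility, which the text does not name. The sketch below is one simple way of computing it, under stated assumptions: responses are coded 1 (favourable) or 0 (unfavourable), items are ordered from most to least often endorsed, and an error is any response that differs from the perfect pattern implied by the respondent's score. Other error-counting conventions exist, and the function name is hypothetical.

```python
def reproducibility(responses):
    """Coefficient of reproducibility for a table of 0/1 responses.

    responses is a list of rows, one per respondent, each row holding one
    0/1 entry per item. Returns 1 - errors / (respondents * items).
    """
    n_items = len(responses[0])
    # Order the items from the most frequently to the least frequently endorsed.
    endorsements = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -endorsements[j])

    errors = 0
    for row in responses:
        score = sum(row)  # number of items answered favourably
        for rank, j in enumerate(order):
            # In a perfect cumulative scale a respondent with this score
            # endorses exactly the `score` most popular items.
            predicted = 1 if rank < score else 0
            errors += int(row[j] != predicted)
    return 1 - errors / (len(responses) * n_items)

# Invented data: a perfect scale, and the same scale with one deviant respondent.
perfect = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
imperfect = perfect + [[0, 0, 1]]
```

For the perfect pattern the coefficient is 1.0; the deviant respondent introduces two errors in fifteen responses and lowers it accordingly.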
The procedure suggested is as under:
1. A large number of statements dealing with the issue of study are collected.
Items that are ambiguous, irrelevant, neutral or too extreme are eliminated by
inspection.
2. As in the Thurstone method, a large number of judges are requested to place
the remaining statements in eleven piles, according to their degree of
favourableness-unfavourableness toward the issue. The unreliable items are
discarded and each of the remaining items is assigned a scale value (median
position).
3. These statements are then transformed into a Likert-type scale by providing
for the expression of various degrees of agreement-disagreement in response to each
item. This scale is administered to a large group of subjects and their responses
are analysed to determine which of the items discriminate most clearly between the
high scorers and the low scorers on the total scale. The items with the highest
discriminatory coefficients in their scale interval are selected, twice as many as
are actually wanted for use in the final scale. For each scale interval, an equal
number of items is selected.
4. The statements or items in the resulting list are arranged in order of their
scale value. The list is sub-divided into two counterpart questionnaires.
One-dimensional scales do, however, suffer from certain limitations which we
would do well to note.
First, one-dimensional scales hardly constitute a reliable basis for assessing
the attitudes of persons towards complex objects or phenomena, or for predicting the
behavioural responses of individuals towards such objects or phenomena. For
instance, 'war' is a complex concept; hence a one-dimensional scale does not quite
help us to measure the attitudes of men toward war. It is, of course, possible to
construct and use one-dimensional scales for the implications of war for the
economy, health, morality etc., each in an independent study. Although these
dimensions of war do affect the final shape of a person's attitudes towards war,
independent enquiries basing their assessment on unidimensional scales hardly
afford us a total perspective on a person's attitude toward 'war' in all its
many-sided connotations.
Secondly, a scale may be unidimensional for some persons but not so for
others. In our discussion of cumulative scales, we have shown how the items on
the scale do not constitute a cumulative series for one and all. Differences in
educational levels and experience are reflected in a person's subjective evaluation
of the scale items. Harding and Hogrefe have demonstrated in their study how a
single scale did not in effect function as a unidimensional scale for three
different categories of workers.
SCALE CONSTRUCTION TECHNIQUES
Following are the five main techniques by which scales can be developed.
1. Arbitrary approach
It is an approach where the scale is developed on an ad hoc basis. This is the
most widely used approach. It is presumed that such scales measure the concepts for
which they have been designed, though there is little evidence to support such an
assumption.
2. Consensus approach
Here a panel of judges evaluates the items chosen for inclusion in the
instrument in terms of whether they are relevant to the topic area and unambiguous
in implication.
3. Item analysis
Under this approach a number of individual items are developed into a test which
is given to a group of respondents. After administering the test, the total scores
are calculated for everyone. Individual items are then analysed to determine which
items discriminate between persons or objects with high total scores and those with
low scores.
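One simple way of carrying out such an item analysis is to compare the average score on an item in the highest-scoring and lowest-scoring groups of respondents. The sketch below makes that concrete; the choice of the top and bottom 25% as comparison groups, the data and the function names are illustrative assumptions rather than a prescribed procedure.

```python
def group_mean(group, item):
    """Average score on one item within a group of respondents."""
    return sum(respondent[item] for respondent in group) / len(group)

def discrimination_index(scores, item, fraction=0.25):
    """Difference in mean item score between high and low total scorers.

    scores is a list of dicts, one per respondent, mapping item name to
    item score. fraction is the assumed share of respondents placed in
    each comparison group.
    """
    ranked = sorted(scores, key=lambda r: sum(r.values()), reverse=True)
    k = max(1, int(len(ranked) * fraction))
    high, low = ranked[:k], ranked[-k:]
    return group_mean(high, item) - group_mean(low, item)

# Invented scores for four respondents on three items.
example_scores = [
    {"a": 5, "b": 5, "c": 3},
    {"a": 4, "b": 4, "c": 3},
    {"a": 2, "b": 3, "c": 3},
    {"a": 1, "b": 1, "c": 3},
]
```

In this invented data item "a" separates high from low scorers sharply, while item "c" does not discriminate at all and would be a candidate for removal.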
4. Cumulative scales
The items of cumulative scales are chosen on the basis of their conforming to
the ranking of items with ascending and descending discriminating power. For
instance, in such a scale the endorsement of an item representing an extreme
position should also result in the endorsement of all items indicating a less extreme
position.
5. Factor scales
These scales may be constructed on the basis of the intercorrelations of items
which indicate that a common factor accounts for the relationship between items.
This relationship is typically measured through factor analysis.
CHAPTER – VI
DATA COLLECTION
INTRODUCTION
Once the researcher has decided on the research design, the next job is data
collection. For data to be useful, our observations need to be organized so that
one can find patterns and come to logical conclusions.
Statistical investigation requires systematic collection of data, so that all
relevant groups are represented in the data.
DATA - Meaning
"Data are facts, figures and other relevant materials, past and present,
serving as bases for study and analysis."
To determine the potential market for a new product, for example, the
researcher might study 500 consumers in a certain geographical area. These people
are described by variables such as income level, race, education and neighborhood.
The quality of the data will greatly affect the conclusions; hence utmost
importance must be given to this process, and every possible precaution should be
taken to ensure accuracy while gathering data.
TYPES OF DATA
Depending upon the sources utilized, viz. whether the data have come from
actual observations or from records that are kept for normal purposes, statistical
data can be classified into two categories, primary and secondary. However, the
data needed for social science research may be broadly classified into
a) data pertaining to human beings
b) data relating to organizations, and
c) data pertaining to territorial areas
Personal data, or data related to human beings, consist of:
1. Demographic and socio-economic characteristics of individuals: age, sex,
social class, religion, marital status, education, occupation, income, family
size, location of the household, life style etc.
2. Behavioural variables: attitudes, opinions, awareness, knowledge, practice,
intentions, etc.
Organisational data consist of data relating to an organisation's origin,
ownership, objectives, resources, functions, performance, and growth.
Territorial data relate to the geophysical characteristics, resource
endowment, population, occupational pattern, infrastructure, economic structure,
degree of development, etc. of spatial divisions like villages, cities, taluks,
districts, states and the nation.
SOURCES OF DATA
The two main sources of data in social science research are 'people' and
'paper'. The responses to questions put to people constitute the major source of
data in social research. This source is labeled the primary source of data. A large
amount of data is already available in the form of 'paper' sources.
These include documents, historical records, diaries, biographies, statistical
records, and the like. The paper sources are commonly known as secondary
sources of data or 'available data sources'.
When a researcher decides to collect data through the primary source, he has
three options, namely observation, interview and questionnaire; in case he opts
for the secondary source of data, he uses the methods of content analysis.
PRIMARY DATA
Primary data are data collected by the investigator himself for the
purpose of a specific inquiry or study. Such data are original in character and are
generated by surveys. They are collected for the first time and have not been
collected earlier. Primary data are first-hand information collected through various
methods such as observation, interview, mailing etc.
SECONDARY DATA
When an investigator uses data which have already been collected by
others, such data are called secondary data. The data are primary data for the
agency that collects them and become secondary data for someone else who uses them
for his own purposes. Secondary data can be obtained from journals, reports,
government publications, publications of professional and research organizations
and so on. For example, if a researcher desires to analyze the weather conditions of
different regions, he can get the required information or data from the records of
the meteorology department.
DATA COLLECTION METHODS
OBSERVATION
Observation is the basic method of obtaining information about social
phenomena under investigation. Observation becomes a method of data collection
when it is planned in accordance with the purpose of research and recorded
systematically, keeping in mind the validity and reliability of the observed data.
Information is collected by observing the process at work.
The method can be used to study sales techniques, customer movements,
customer response, etc. However, the customers' state of mind, their
buying motives and their inclinations are not revealed. Their income and education
are also not known. It also takes time for the investigator to wait for particular
situations to take place.
SURVEY
According to Hutton (1990), survey research is the method of collecting
information by asking a set of pre-formulated questions in a predetermined
sequence in a structured questionnaire to a sample of individuals drawn so as to be
representative of a defined population.
Questionnaires and interviews are both types of survey. Surveys are probably
the most common form of research method. The primary function of surveys is to
collect information which can then be analyzed to produce conclusions. In order to
evaluate different survey methods it is necessary first to look at the purpose to
which the information will be put. The first purpose of surveys is to describe what
is going on: to obtain all the relevant facts about something and to state those
facts as quantifiable data. More sophisticated descriptive surveys try to identify
areas where problems occur or where changes are required; others may seek to
measure the extent and nature of known problems.
In addition to explaining past change, surveys can be used to predict future
changes. The changes may be beyond the control of the surveyor, in which case the
survey seeks to isolate the magnitude, nature and timing of the change.
Alternatively, the surveyor may be in control of the change, in which case the
survey is more likely to be concerned with predicting the outcomes of the change
or examining the merits of different policy options. Once the changes have been
made, the surveyor may be called upon to evaluate the results of the change,
perhaps even suggesting further changes which might be necessary. Generally,
surveys are very flexible and widely used research methods. Given a small amount
of common sense, they can be used by relatively inexperienced researchers to
produce useful results.
QUESTIONNAIRE
By questionnaire is meant a set of questions developed in an organized and
ordered manner for obtaining information from people in relation to a given
problem. The questionnaire is used more frequently in mail survey research than
any other tool of data collection. Mail survey research is that branch of
scientific investigation which studies the universe by selecting a sample from it
and uses the mail for collecting data.
In survey research one attempts to discover the relative incidence, distribution
and interrelation of sociological and psychological variables. Though the approach
and technique of survey research can be used on any set of objects that can be well
defined, survey research mainly focuses on people, the vital facts of people, and
their beliefs, opinions, attitudes, motivations, memory, behaviors, actions,
interactions and even future plans. The questionnaire is the best tool for
collecting information in all the above areas, though from time to time other
methods, like the interview method based upon a schedule, or projective techniques,
have also been used alongside it.
INTERVIEW SCHEDULE
Questionnaires and schedules are increasingly used for the collection of varied
and diverse data in survey research. Their popularity and utility are increasing day
by day. The reason for the growing popularity is the recognition by social science
researchers of the importance of quantitative measurement of qualitative social
behavior. The features of the interview schedule are given below.
1. The schedule is generally filled out by the research worker or the enumerator,
who can interpret questions where necessary.
2. Collection of data through schedules is relatively expensive due to the
appointment of enumerators and the training imparted to them.
3. In the schedule method, the response rate is very high because the schedules
are filled in by enumerators who are able to get answers to all questions.
4. In the case of a schedule the identity of respondents is known.
5. In the case of a schedule the information is collected well in time, as the
schedules are filled in by enumerators.
6. Direct personal contact is established with respondents in the schedule method.
7. The information can be gathered even when the respondents happen to be
illiterate.
8. With schedules there will be difficulty in sending enumerators over
relatively wider areas.
9. The information collected is generally complete and accurate, as enumerators
can remove the difficulties faced by respondents in correctly understanding the
questions.
10. In the case of a schedule much depends upon the honesty and competence of
enumerators.
11. The difficulty with schedules is that they are filled in by enumerators and
not by respondents, and the responses recorded may not be the respondents' own.
12. The observation method can also be used along with schedules.
The schedule is especially useful and economical in situations where the
respondents' geographical dispersal is wide. The schedule also isolates the
respondent from external influence. The respondent is totally free to express his
views according to his knowledge and attitudes in an unbiased manner. Data
obtained without external influence are more valid and reliable.
EFFECTIVE INTERVIEW TECHNIQUES
Despite the fact that the interview schedules have a lot of advantages, they
may be made more effective by adopting the following techniques.
1. Before asking questions, introduce yourself or have a mutual friend introduce
you.
2. Briefly explain the purpose of the survey and cite the relevant authorities who
have approved the survey. Do not deviate from the study objectives which were
developed for the survey. Offer proof of interviewer status or a letter of
authorization if necessary.
3. Be confident, not afraid or wary. You are a trained professional on an
assignment for an important purpose. If you convey insecurity or hesitancy to
the respondent, he will immediately suspect your motives.
4. Use an informal manner of speaking, natural to you, aimed at putting the
respondents at ease. Know the questions so well that you never sound as though
you're reading them formally.
5. Never get involved in a lengthy explanation of the study. Use the standard
responses that have been provided.
6. Follow the instructions for interviews explicitly. Do not change the sequence
of the questions or deviate from the skip and filter instructions.
7. Ask the questions exactly as they are written on the questionnaire (or with any
minor wording changes that were agreed on in training). You must ask the
questions exactly as all the other interviewers do so that results can be
combined and interpreted meaningfully.
8. Ask the questions in a respectful manner, and do not imply that some answers
are "better" than others. A respondent may deny that she would like to have
more children if she believes that the interviewer associates large families with
poor family welfare.
9. Do not by word or action indicate surprise, pleasure, or disapproval at any
answer. Even a slight change in expression will tell a respondent that you have
reacted to his answer.
10. When an answer is unclear, repeat the question exactly as written, or ask it in a
slightly different way (if agreed during training), being careful not to change the
meaning or lead the respondent to a particular response. If a respondent seems
confused by a question, even after repetition, record whatever answer is given.
11. For personal or cultural reasons, people may not want to talk about
sensitive issues like contraceptive use or the death of a child. In such
cases you will need to be particularly sensitive and respectful in the
way you ask questions.
12. Record answers to open-ended questions verbatim. Do not paraphrase or
interpret. The exact words people use to describe opinions and attitudes are
important.
13. Always take the blame for faulty communication. If the respondent stumbles,
say that perhaps you did not read the question clearly or read it too rapidly.
Never let the respondent feel the questions are too difficult for her.
14. Check over the questionnaire before leaving the interview to make sure all the
questions were asked and the answers recorded legibly.
LIMITATIONS OF INTERVIEW METHOD
In the interview method the information obtained is often difficult to
analyze. It may be possible to use quantitative methods for some of it, but the
method will probably have been chosen to collect information which, by its very
nature, is not susceptible to this form of analysis.
The main problem is that of ensuring a high degree of consistency in the
presentation of the interview. Trained interviewers can overcome many of the
problems associated with bias, with becoming too personally involved in the
interview, or simply with the tiring experience of coming into direct contact with a
large number of strangers every day.
CHAPTER – VII
TOOLS OF DATA COLLECTION
CONSTRUCTING QUESTIONNAIRE
A questionnaire is a proforma containing a sequence of questions to elicit
information from the interviewees. The questionnaire is used for personal
interviews. The questionnaire may also be mailed to individuals who are
requested to write the answers against each question and to return the completed
proforma by post. A questionnaire must be clear, and this is especially true of a
mail questionnaire, which essentially has to speak for itself. If it is not clear,
not only may the replies be vague and of little value, but many potential
respondents may not bother returning the questionnaire at all.
A questionnaire may be said to possess three main aspects
1. the general form
2. the question sequence
3. the question wording
1. The general form
The form of a questionnaire will depend partly on the type of data being
sought and partly on the data collection method to be used. The choice lies
between two extremes. On the one hand, there is the highly structured
questionnaire in which all questions and answers are specified and comments in
the respondents' own words are held to a minimum. At the other end is the
unstructured questionnaire, in which the interviewer is provided with a general
brief on the sort of information to be obtained but the exact wording of the
questions is largely his own responsibility.
Unstructured questionnaires are useful in carrying out in-depth
interviews where the aim is to probe for attitudes and reasons. They may also be
effectively employed in pre-testing, the results of which can be used as a basis for
constructing a structured questionnaire at a later stage. Thus, in order to ascertain
the expectations of television viewers about a programme, interviews may be
conducted with an unstructured questionnaire. The resulting range of answers can
then be used to prepare a structured questionnaire for use in the main part of the
study.
The main disadvantage with an unstructured questionnaire is that it requires
a personal interview. It cannot be used in the mailed questionnaire method of data
collection.
A structured questionnaire usually has fixed alternative answers to each
question. Such questionnaires are simple to administer and relatively inexpensive
to analyse. They have, however, their own limitations. It is not possible to record
the responses made by the respondents in their own words. They are considered
inappropriate in investigations where the aim happens to be to probe for attitudes
and feelings.
2. The question sequence
The introduction to the questionnaire should be as short and simple as
possible. The introductory letter accompanying the mailed questionnaire should
also be made very brief. The introduction lays the foundation for establishing
rapport with the respondent in addition to making the interview possible.
Once the rapport is established the questions will generally seek substantive
information of value to the study. As a general rule, questions that put too great a
strain on the memory or the intellect should be reserved till later. Likewise,
questions relating to personal wealth and personal character should be avoided in
the beginning.
Following the opening phase should come the questions that are really vital
to the interview. Even here, substantive questions should be surrounded by more
interesting ones in order that the attention does not slip. Awkward questions, which
create the risk of the respondent discontinuing the interview, are usually relegated
toward the end. Even if the interview is terminated at that point, some information
is already available with the interviewer.
Ideally, the question sequence should conform to the respondents' way of
thinking, and this is where unstructured interviews are highly advantageous. The
interviewer can rearrange the order of the questions to fit the discussion in each
particular case. With a structured questionnaire the best that can be done is to
determine with pre-testing the question sequence which is likely to produce good
rapport with most people.
3. The question wording
It has been stated that the question wording and formulation are more of an
art than a science. Science does enter, however, in testing the stability and the
adequacy of replies for business and management decisions. The wording of the
questions should be impartial so as not to give a biased picture of the true state of
affairs. Colourful adjectives and undue descriptive phrases should be avoided. In
general the questions should be worded such that a) they are easily understood b)
they are simple c) they are concrete and conform to the respondents' way of
thinking.
Multiple choice questions constitute the basis of a structured questionnaire,
particularly in a mailed questionnaire method. But in addition to these questions
various open ended questions are generally inserted to provide a more complete
picture of the respondent's feelings and attitudes.
Thus construction of a questionnaire is of great importance in social
research. The following points are to be kept in mind while constructing a
questionnaire.
• The most important aspect of questionnaire design is to define the
research problem to be tackled and to frame the questions as per the
requirement. Lengthy and rambling questionnaires are demoralizing for
the interviewer and the respondent. The questionnaire should not be
longer than is absolutely necessary for the purpose.
• The questionnaire must introduce each topic in such a way that it appeals
to the perceptions of the respondent. It must meet the respondent's
expectation of reasonableness and logic.
• The questions should be practicable. No attempt should be made to
ask a person's opinion about something he does not understand or about matters
which are personal or emotional.
• Unrealistic assumptions should not be made. Questionnaires should not
assume that the respondent necessarily possesses any knowledge or an opinion on
the survey subject.
• The vocabulary and syntax of the questionnaire should offer the respondent
maximum opportunity for complete and accurate communication of ideas between
interviewer and respondent. The language of the questionnaire must be close to
that of the respondents.
• Questions should consist of simple words, which will convey the exact
meaning. Ambiguous and vague words should be avoided.
• It is not desirable to use leading questions. Questions should be phrased in such
a way that they do not contain any suggestions as to the most appropriate response.
• Respondents should not be compelled by such questions to give socially
unacceptable responses.
• As far as possible each question should be concerned with a single issue or a
single reference.
• Questions should be so arranged that they make the most sense to the
respondents. The sequence of questions may be determined by the "funnel
method". The funnel method is an approach of asking the most general, broad and
unrestricted questions first and following them with successively narrower or more
restricted questions.
• Some respondents like to remain unidentified and unrecognized. The
respondents feel free to answer if anonymity / confidentiality is maintained.
• A questionnaire designer has to ensure that all the necessary items are duly
incorporated in the questionnaire. The researcher may take the help of a standard
check list to see that all the required items are included in the questionnaire. A
check list can also be prepared by the researcher himself.
FORMAT OF A GOOD QUESTIONNAIRE
The following specimen questionnaire incorporates most of the qualities of a
good questionnaire. The first questionnaire relates to Spencer Plaza. It is aimed at
evolving better ways of providing shopping facilities to the consumers.
ADVANTAGES OF A QUESTIONNAIRE
One cannot know by observation why a buyer makes particular purchases or
what his opinion about a product is. Compared with either direct observation or
experimentation, surveys yield a broader range of information. They are effective
for producing information on socio-economic characteristics, attitudes, opinions,
motives etc., and for gathering information for planning product features,
advertising copy, advertising media, sales promotions, channels of distribution and
other marketing variables. Questioning is also much cheaper than observation.
DISADVANTAGES
1. Unwillingness of respondents to provide information. This requires
salesmanship on the part of the interviewer. The interviewer may assure that the
information will be kept secret. Motivating respondents with some token gifts
often yields results.
2. Inability of the respondents to provide information. This may be due to:
i. Lack of knowledge
ii. Lapse of memory
iii. Inability to identify their motives and provide 'reasons why' for their actions.
3. Human biases of the respondent: i.e., Ego etc.
4. Semantic difficulties: It is difficult, if not impossible to state a given question
in such a way that it will mean exactly the same thing to every respondent.
Similarly, two different wordings of the same questions will frequently generate
quite different results. These limitations can be controlled to some extent by:
i. Careful phrasing of questions.
ii. Careful control of data gathering by employing specially trained
investigators who will observe carefully and report on the subtle reactions
of the person interviewed.
iii. Cautious interpretation, with a clear acceptance of the limitations of the data
and an understanding of what exactly the data represent. This is especially
true of responses to questions like:
"What price would you be willing to pay for this product?"
iv. Looking at facts in relative rather than absolute terms. A survey showed that
60% of families in the middle income group used toothpaste. Taken by
itself, in the absolute sense, the result of the survey is in some doubt
because the question asked encountered an obvious bias. But if this 60% is
looked at on a relative basis, i.e. against the corresponding figure of 60% for
upper income group families, a more meaningful and significant interpretation
can be made, even though the individual figure for each group may be slightly
inflated.
ADVANTAGES OF SCHEDULES
Both the questionnaire and the schedule are popularly used methods of collecting
data in research surveys. They have much resemblance, but from the technical point
of view there is some difference between the two. The following are some of
the advantages of a schedule.
Higher Responses
In case of a schedule, since a research worker is present and he can explain
and persuade the respondent, response rate is high. In case of any mistake in the
schedule the researcher can rectify it.
Saving Of Time
While filling the schedule the researcher may use abbreviations or short terms
for answers. He may also generate a template. All these steps help in saving
time in data collection.
Personal Contact
In the schedule method there is personal contact between the respondent
and the field worker. The behaviour and character of the respondent obviously
facilitate the research work.
Human Touch
Sometimes reading something does not impress as much as when the same is
heard or spoken by experts as they are able to lay the right emphasis. This greatly
improves the response.
Deeper probe
Through this method it is possible to probe deeper into the personality,
living conditions, values, etc, of the respondents.
Defects in sampling are detected
If there are some defects in the schedule during sampling, they easily come to
the notice of the researcher and can be rectified.
Removal of doubts
Presence of the enumerator removes the doubts in the minds of respondents
on the one hand and helps avoid artificial replies from the respondent owing to fear
of cross checking on the other hand.
Human elements make the study more reliable and dependable.
The presence of human elements makes the situation more attractive and
interesting which helps in making the interview useful and reliable.
LIMITATIONS OF A SCHEDULE
In spite of the advantages, schedules have the following limitations.
1. Costly and Time Consuming
This method is costly and time consuming because of its basic requirements
of interviewing all the respondents. This becomes a serious limitation when
respondents are not found in a particular region but are scattered over a wide area.
2. Need for Trained Field Workers
The schedule method requires involvement of well trained and experienced
field workers. This involves great cost and sometimes workers are not easily
available forcing engagement of inexperienced hands. This defeats the purpose of
the research.
3. Adverse Effect of Personal Presence
Sometimes the personal presence of enumerators becomes an inhibiting
factor. Many people despite knowing certain facts may not say them in the
presence of other persons.
4. Organisational Difficulties
If the field of research is dispersed, it becomes difficult to organize it.
Getting trained manpower, assigning them duties and then administering the
research is a very difficult task.
PILOT STUDY
It is difficult to plan a major study or project without adequate knowledge of
its subject matter, the population it is to cover, their level of knowledge and
understanding and the like. What are the issues involved? What are the concepts
associated with the subject matter? How can they be operationalised? What method
of study is appropriate? How long will the study take? How much money will it
cost? These and other related questions call for a good deal of knowledge on the
subject matter of the study and its dimensions. In order to gain such pre-knowledge
of the subject matter of an extensive study, a preliminary investigation is
conducted. This is called a 'pilot study'.
A pilot study is a "small scale replica" of the main study. It is the rehearsal
of the main study. It covers the entire process of research viz. preparation of a
broad plan of the study, construction of tools, collection of data, processing and
analysis of data and reporting.
FUNCTIONS OF A PILOT STUDY
A pilot study fulfills one or more of the following purposes
1. It provides a better knowledge of the problem under study and its
dimensions.
2. It provides guidance on conceptualization.
3. It assists in discovering the nature of relationship between variables and in
formulating hypotheses.
4. It shows the nature of the population to be surveyed and the variability
within it.
5. A pilot study shows whether the available sampling frame from which
sampling is to be drawn is adequate, complete, accurate, up to date and
convenient.
6. It shows the adequacy of the tool for data collection.
7. It helps the researcher to develop an appropriate plan of analysis.
8. It provides data on the relative suitability of alternative methods of collection
of data, their relative cost, accuracy and response rates to make a sensible choice.
9. It helps the researcher to identify field problems to be encountered.
10. It provides information for estimating the probable cost and duration of the
main study and of its various stages.
11. Above all, it helps the researcher to determine whether or not a more
substantial study is warranted.
SIZE AND DESIGN OF PILOT STUDY
The size, scope and design of the pilot study are a matter of convenience,
time and money. It should be large enough to fulfill the above functions and the
sample should be of a comparable structure to that of the main study. It should be
designed so as to ensure a testing of alternative methods of data collection,
ordering of the questions, wording and the like. It should succeed in disclosing the
significant difficulties to be guarded against.
In the light of the outcome of the pilot study, if it is found that the main
study is worth undertaking, then it is adequately designed on the basis of the
results of the pilot study and the lessons drawn from its experience.
PRE-TESTING
While a pilot study is a full-fledged miniature study of a problem, a pre-test is
a trial test of a specific aspect of the study, such as the method of data collection or
the data collection instrument (interview schedule, mailed questionnaire or
measurement scale).
Before the final form of the questionnaire is adopted it is desirable to carry
out a preliminary experiment on a sample basis. When questionnaires are to be
distributed on a large scale, it is absolutely essential to pre-test them. There are
many advantages of pre-testing the questionnaire, such as:
1. The investigator can find out what the drawbacks of the questionnaire are,
i.e., which questions ought to be deleted and which ought to be added.
2. An idea can be formed about the extent of non-response likely to happen.
3. Greater co-operation of the informants can be secured. Even persons most
allergic to writing can, with proper inducement, be prevailed upon to answer the
questionnaire. It is the surveyor's job to find out what these inducements are.
While pre-testing the questionnaire, it is important always to cover a cross-section
of the population eventually to be surveyed. When the sample is drawn, it should
be broken down into various sub-samples by taking, for instance, every tenth or
every hundredth case from the entire list.
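The idea of taking every tenth case from the list can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample`, the respondent identifiers and the list size of 200 are assumptions made for the example and do not come from any actual survey.

```python
def systematic_sample(cases, step, start=0):
    """Return every `step`-th case from the list, beginning at index `start`."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    if not 0 <= start < step:
        raise ValueError("start must lie between 0 and step - 1")
    return cases[start::step]


# Hypothetical sampling list of 200 respondent identifiers (illustration only).
sampling_list = [f"R{n:03d}" for n in range(1, 201)]

# Every tenth case forms the pre-test group.
pretest_group = systematic_sample(sampling_list, step=10)
print(len(pretest_group), pretest_group[:3])
```

Changing `start` gives a different sub-sample from the same list, which is one way of drawing a fresh group for a re-test.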
The work of pre-testing the questionnaire must be done with utmost care and
caution, otherwise unnecessary and unwanted changes may be introduced. Proper
testing, revising and re-testing the questionnaire would yield high dividends.
If time and budget permit, a second pilot study should be undertaken on a
fresh sample of respondents to further improve the final document.
CHAPTER - VIII
ANALYSIS AND PROCESSING OF DATA
MEANING
After the collection of research data, an analysis of the data and the
interpretation of the results are necessary. Analysis of data comes prior to
interpretation. But these two operations are so closely linked that they cannot be
regarded as two separate operations. There is something more crucial than the facts
and figures in research. The purpose of analysis is to build up a set of empirical
models where the relationships involved are carefully brought out so that some
meaningful inferences can be drawn. Analysis of data is to be made with reference
to the purpose / objectives of the study and its possible bearing on the scientific
discovery. An analysis is made with reference to the research problem at hand or
the hypothesis. Some authors consider processing a necessary prerequisite for
analysis. But many maintain that analysis of data involves processing. In other
words, these two operations can be simultaneously done.
DATA ANALYSIS
The first step in the analysis of data is the critical examination of the
data gathered during the study. The analysis is made with a view to finding some
significance for a systematic theory and some basis for a broader generalization.
For analyzing data, a context type analysis process is helpful. Analysis involves the
verification of the hypothesis or the problem. Without proper analysis, data
remain a meaningless heap of material. Analysis involves the representation of
data, which can be done by sequential tabulation. It also requires logical
organization of data; otherwise, logical results cannot be achieved.
EDITING
The first step in processing of data is editing of completed schedules /
questionnaires. Editing is a process of checking to detect and correct errors and
omissions. Editing is done at two stages: first at the field work level and second at
the office level.
Field Editing
During the process of interviewing, the interviewer cannot always record
responses completely and legibly. Therefore after each interview is over, he should
review the schedule to complete abbreviated responses, rewrite illegible responses
and correct omissions.
Office Editing
All completed schedules / questionnaires should be thoroughly checked in
the office for completeness, accuracy and uniformity. The first aspect to check is to
see whether there is an answer to every question. If there is an omission, the editor
can deduce the proper answer from other related data on the instrument. Apart
from checking for omissions, the accuracy of each recorded answer should be
checked. Arithmetic errors can be easily corrected. Clear inconsistencies should be
rectified. In editing another keen lookout is for any lack of uniformity in
interpretation of questions and instructions by the interviewers. Errors of this type
should be carefully examined and rectified. In order to examine the relationship
between answers to different questions and to detect inconsistencies, it is better to
edit the schedule / questionnaire as a whole. In making the corrections, original
entries should not be erased but just crossed out with a single line. Corrections may
be made in some distinctive colour and should be initialled.
As the editing involves a careful scrutiny of the questionnaires / schedules,
they should be checked for 1) completeness, 2) legibility, 3) comprehensibility, 4)
consistency, 5) uniformity, 6) relevance.
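Where the schedules have been keyed into a computer, part of the office editing can be sketched as a small Python routine. The field names, the age limits and the consistency rule below are hypothetical and serve only to illustrate checks for completeness, accuracy and consistency. Legibility, comprehensibility and relevance still call for the editor's own judgement.

```python
# Hypothetical items that every completed schedule is expected to contain.
REQUIRED_FIELDS = ("sex", "age", "has_children", "number_of_children")


def edit_schedule(record):
    """Return a list of problems found in one completed schedule."""
    problems = []

    # Completeness: is there an answer to every question?
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"omission: {field}")

    # Accuracy: is the recorded answer within a plausible range?
    age = record.get("age")
    if isinstance(age, int) and not 0 < age < 120:
        problems.append("accuracy: age out of range")

    # Consistency: do answers to related questions agree with each other?
    children = record.get("number_of_children")
    if record.get("has_children") == "no" and isinstance(children, int) and children > 0:
        problems.append("consistency: children reported but has_children is 'no'")

    return problems


clean = {"sex": "F", "age": 34, "has_children": "yes", "number_of_children": 2}
faulty = {"sex": "", "age": 340, "has_children": "no", "number_of_children": 3}

print(edit_schedule(clean))
print(edit_schedule(faulty))
```

The routine only reports problems. In keeping with the rule that original entries are not erased, the correction itself is left to the editor.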
Categorisation and classification
The edited data are classified and coded. The responses are classified into
meaningful categories so as to bring out their essential pattern. By this method,
several hundred responses are reduced to five or six appropriate categories
containing critical information needed for analysis. The collected data are not
amenable to analysis, unless an appropriate scheme of classification is used. The
set of categories is referred to as a 'coding frame'.
Classification can be done at any phase prior to the tabulation. Certain items
like sex, age, type of house and the like are structured and pre-classified in the data
collection form itself. The responses to the open ended questions are classified at
the processing stage. It is preferable to include many categories rather than a few,
since reducing the number later is easier than splitting an already classified group
of responses. However, the number of categories is limited by the number of cases
and the anticipated statistical analysis.
CODING
Coding means assigning numerals or other symbols to the categories or
responses. The purpose of symbols is to translate raw data into information that
may be counted and tabulated. The task of a coder is to give proper codes to the
responses. Firstly, the respondent himself is a coder when he denotes his position in
a particular class or category (e.g., unemployed or employed). Secondly, coding can
be done by the observer or the interviewer at the time of data collection. Thirdly, the
officially appointed coder may perform the job of coding. There may be many
difficulties in coding due to the inadequacies of data, inefficiency of the coder and
lack of editing or scrutiny of the available data. Editing can be very helpful for
coding and for the improvement of the quality of data collection.
The categories with their assigned symbols, together with specific
coding instructions, may be assembled in a code book. The code book will identify
specific items of a variable / observation and the code number assigned to each
category of that item. If the data are to be transferred to machine punch-cards, the
code book will also identify the column in which each item is to be entered.
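A code book of this kind can be sketched as a simple lookup table in Python. The items, the code numbers, the column positions and the "no response" code 9 are all assumptions chosen for illustration and are not a prescribed coding scheme.

```python
# Hypothetical code book: each item has a column position and a code per category.
CODE_BOOK = {
    "sex": {"column": 1, "codes": {"male": 1, "female": 2}},
    "employment": {"column": 2, "codes": {"employed": 1, "unemployed": 2}},
}
NO_RESPONSE = 9  # assumed code for a missing or unclassifiable answer


def code_response(item, answer):
    """Translate one raw answer into its numeric code."""
    return CODE_BOOK[item]["codes"].get(answer.strip().lower(), NO_RESPONSE)


def code_schedule(schedule):
    """Return the codes of one schedule, ordered by their column in the code book."""
    items = sorted(CODE_BOOK, key=lambda item: CODE_BOOK[item]["column"])
    return [code_response(item, schedule.get(item, "")) for item in items]


coded = code_schedule({"sex": "Female", "employment": "employed"})
print(coded)
```

Keeping the categories and their symbols in one place means that every coder applies the same codes, which reduces the unreliability of coders' judgements.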
There are many things that may operate to make the judgements of coders
unreliable. Some of the factors may arise from the data to be categorized, some
from the nature of the categories that are to be applied and still others, from the
coders themselves. Many of the difficulties that occur in coding result from the
inadequacies of data. Frequently, the data do not supply enough relevant
information for reliable coding. This could be due to inadequate data collection
procedures. These difficulties, however, can generally be overcome by careful
editing of data.
TRANSCRIPTION
When only a few schedules are processed and hand-tabulated, tabulation can
directly be made from the schedules. On the other hand, direct tabulation from the
edited schedules/ questionnaires is difficult if the number of the schedules and the
number of responses in them are large. Suppose an interview schedule contains 180
responses requiring tabulation and 210 simple and cross tables are to be
constructed, each schedule has to be handled at least 210 times for tabulation. This
will result in mutilation of the schedule, and omissions and commissions may
easily occur in tabulation. In order to avoid these drawbacks, data contained in
schedules/ questionnaires are transferred to another material for the purpose of
tabulation. This intermediary process is called 'transcription'.
The materials to be used for transcription depend on the method of
tabulation-manual or mechanical. Long work sheets, sorting cards or sorting strips
are used for transcription when tabulation is done manually, and punch cards or
magnetic tape (or discs) are used in a system of machine sorting and tabulation.
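The purpose of transcription can be illustrated with a short Python sketch: the coded responses are copied once into a work sheet, and every table is then tallied from the work sheet instead of from the schedules themselves. The column names and the four coded schedules are hypothetical.

```python
from collections import Counter

# Hypothetical coded schedules (illustration only).
schedules = [
    {"id": 1, "sex": 1, "area": 1, "participation": 2},
    {"id": 2, "sex": 2, "area": 1, "participation": 1},
    {"id": 3, "sex": 1, "area": 2, "participation": 2},
    {"id": 4, "sex": 1, "area": 1, "participation": 3},
]
COLUMNS = ("id", "sex", "area", "participation")


def transcribe(schedules):
    """Copy each schedule once into a work sheet with one row per schedule."""
    return [tuple(schedule[column] for column in COLUMNS) for schedule in schedules]


def tally(work_sheet, column):
    """Count the codes in one column of the work sheet."""
    index = COLUMNS.index(column)
    return Counter(row[index] for row in work_sheet)


work_sheet = transcribe(schedules)
print(tally(work_sheet, "sex"))
print(tally(work_sheet, "participation"))
```

However many tables are required, each schedule is handled only once, at the time of transcription.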
DIAGRAMS
Diagrams are generally more attractive, fascinating and impressive than
any set of numerical data. They are more appealing to the eye and leave a much
more lasting impression on the mind than dry and uninteresting statistical
figures. Even a layman who has no statistical background can understand them
easily. They are more catchy and as such are extensively used to present statistical
figures and facts in most exhibitions, trade or industrial fairs, public
functions, statistical reports, etc. The human mind has a natural craving and love
for beautiful pictures, and this psychology of the human mind has been thoroughly
exploited by modern advertising agencies, who bring out their advertisements in
the shape of attractive and beautiful pictures.
Accordingly diagrams and graphs have universal applicability. They register
a meaningful impression on the mind almost before we think. They also save a
good deal of time, as very little effort is required to grasp them and draw
meaningful inferences from them. An individual may not like to go through a set of
numerical figures but may pause for a while to have a glance at the diagrams or
pictures. It is for this reason that diagrams, graphs and charts find a place almost
daily in financial / business columns of the newspapers, economic and business
journals, annual reports of business houses etc. Graphs reveal the trends present in
the data more vividly than the tabulated numerical figures and also exhibit the way
in which the trends change. Although this information is also contained in a table,
it may be quite difficult and time consuming (and sometimes impossible) to
determine the existence and nature of trends from the tabulation of data.
TABULATION
After the transcription of data is over, data are summarized and arranged in a
compact form for further analysis. This process is called 'tabulation'. Tabulation is
a process of orderly arrangement of data into a series of rows and columns where
they can be read in two dimensions. Tabulation is essential to represent a particular
result of an enquiry or investigation. As has already been mentioned, a mass of data
termed as 'raw materials' can express no meaning without being tabulated. In
reality, tabulation is a process of representation of data.
Sometimes a question is asked: why is tabulation done? The reasons are manifold:
i. it simplifies the comparison of data,
ii. it makes the study complete and accurate in every respect,
iii. it makes the comparison easier together with saving a lot of time and energy,
iv. it makes statistical treatment a possibility and
v. finally, tabulation makes the 'affairs' easily intelligible; in the absence
of tabulation, they become not only difficult to understand but almost impossible
to give a practical shape.
Construction of Tables
After the data have been tabulated, they are arranged in statistical tables in
vertical columns and horizontal rows according to some classification. Tables
provide a "shorthand" summary of data. The importance of presenting statistical
data in tabular form needs no emphasis. Tables facilitate comprehending masses of
data at a glance; they give a visual picture of relationships between variables and
categories; they facilitate summation of items and detection of errors and
omissions; and they provide a basis for computations.
It is important to make a distinction between the general purpose tables and
the special purpose tables. The general purpose tables are primary or reference
tables designed to include a large amount of source data in convenient and
accessible form. The special purpose tables are analytical or derivative ones which
demonstrate significant relationships in data or the results of statistical analysis.
Tables in government reports on population, vital statistics, agriculture, industries
etc. are of the general purpose type. They represent extensive repositories of
statistical information. Special purpose tables are found in monographs, research
reports and articles, and are used as instruments of analysis. In research we are
primarily concerned with special purpose tables.
Components of Tables
The major components of a table are:
A. Heading
1. Table Number
2. Title of the table
3. Designation of units
B. Body
1. Stub-head - headings of all rows or blocks of stub items
2. Box-head - headings of all columns or main captions and their sub-captions
3. Field or body - the cells in rows and columns
C. Notations
1. Footnotes, if necessary
2. Source
Principle of Table construction
There are certain generally accepted principles or rules relating to the
construction of tables. They are:
1. Every table should have a title.
2. Every table should be identified by a number to facilitate easy reference.
3. The captions should be clear and brief.
4. The units of measurement should always be indicated.
5. Any explanatory footnote concerning the table itself should be placed
beneath the table.
6. The specific source of the data should be indicated just below the table.
7. The columns may be numbered to facilitate reference.
8. All columns should be properly aligned.
9. The arrangement of the categories in a table may be chronological,
geographical, alphabetical or according to magnitude. Abbreviations should be
avoided.
10. The table should be made as logical, clear, accurate and simple as possible.
Types of Tables
Normally two types of tables are used to represent the data in research viz.
one-way tables and Two-way tables.
One way tables
Frequency tables present the distribution of cases on only a single dimension
or variable. For example, distribution of respondents by sex, religion, socio-
economic status and the like are shown in one way tables. Following is an
illustration of a one way table.
Distribution of respondents by sex
Sex        No. of respondents     %
Males             130             65
Females            70             35
Total             200            100
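The percentages in a one-way table follow directly from the counts. The following Python sketch reproduces the figures of the table above; the function name `one_way_table` is an assumption made for the example.

```python
def one_way_table(counts):
    """Return (category, number, percentage) rows followed by a total row."""
    total = sum(counts.values())
    rows = [(label, number, 100 * number / total) for label, number in counts.items()]
    rows.append(("Total", total, 100.0))
    return rows


# Counts taken from the table "Distribution of respondents by sex".
table = one_way_table({"Males": 130, "Females": 70})
for label, number, percent in table:
    print(f"{label:<8}{number:>5}{percent:>7.0f}")
```

Each percentage is the number of respondents in the category divided by the total of 200 and multiplied by 100.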
Two way tables
Distribution in terms of two or more variables and the relationship between
two variables are shown in two-way tables. The categories of the one variable are
presented, one below another, on the left margin of the table and those of another
variable at the upper part of the table, one by the side of another. The cells
represent particular combinations of both variables. To compare the distribution of
cases raw numbers are converted into percentages based on the number of cases in
each category as shown below.
Socio-economic status of the respondents

                        Extent of participation
Categories     Low              Medium           High             Total
               No. of     %     No. of     %     No. of     %
               respondents      respondents      respondents
Rural area       65     41.9      88     56.8      2       1.3     155
Urban area        4     10.3      33     84.6      2       5.1      39
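The conversion of raw numbers into percentages based on the number of cases in each category can be sketched in Python as follows. The rural counts are taken here as 65, 88 and 2, so that they add up to the stated row total of 155 and agree with the stated percentages; the function name `row_percentages` is an assumption made for the example.

```python
# Counts by area and extent of participation.
counts = {
    "Rural area": {"Low": 65, "Medium": 88, "High": 2},
    "Urban area": {"Low": 4, "Medium": 33, "High": 2},
}


def row_percentages(row):
    """Express each cell as a percentage of the cases in its own category."""
    total = sum(row.values())
    return {level: round(100 * number / total, 1) for level, number in row.items()}


for area, row in counts.items():
    print(area, sum(row.values()), row_percentages(row))
```

Because each row is converted on its own total, the rural and urban distributions can be compared even though the two groups are of very different sizes.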
TYPES AND GENERAL RULES
The most commonly used graphic forms may be grouped into the following
categories.
1. Line graphs and charts
2. Bar charts
3. Segmental representation
4. Pictographs.
The general rules to be followed in graphic representation are
1. The chart should have a title placed directly above the chart.
2. The title should be clear, concise, and simple and should describe the nature
of the data.
3. Numerical data upon which the chart is based should be presented in an
accompanying table.
4. The horizontal line measures time or independent variable and the vertical
line the measured variable.
5. Measurements proceed from the left to right on the horizontal line and from
bottom to top on the vertical line.
6. Each curve or bar on the chart should be labeled. In case of more than one
such curve /bar, they should be clearly differentiated from one another by a distinct
pattern or colour.
7. The zero point should be represented and the scale intervals should be equal.
8. Graphic forms should be used sparingly. Too many forms detract from the
representation.
9. Graphic forms should follow, not precede, the related textual discussion.
LINE GRAPHS
The line graph is useful for showing changes in data relationships over a
period of time. In this graph, figures are plotted in relation to two intersecting lines
or axes. The horizontal line is called the 'abscissa' or X axis and the vertical, the
'ordinate' or Y axis. The point at which the two axes intersect is zero for both 'X'
and 'Y'. This point 'O' is the origin of coordinates. The two lines divide the region
of the plane into four sections known as quadrants, which are numbered
anticlockwise. Measurements to the right of and above 'O' are positive (plus), and
measurements to the left of and below 'O' are negative (minus). The independent
variable is represented by the 'X' axis and the dependent variable by the 'Y' axis.
The following illustration shows this type of graph.
[Figure: coordinate axes showing the origin O, the abscissa (X axis), the ordinate (Y axis) and a plotted point P]
Histogram
This is another form of the line chart used for presenting a frequency
distribution.
It is constructed by erecting vertical lines on the limits of the class intervals
marked on the base line. The vertical lines form a series of contiguous
rectangles or columns. The width of each rectangle represents its class interval, and
the height represents class frequency. The following is an illustration of a
Histogram.
[Figure: histogram with the vertical axis scaled from 0 to 70 and class limits from 90 to 170 marked on the base line]
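The construction of a histogram starts from a frequency distribution over contiguous class intervals. The Python sketch below uses class limits from 90 to 170 in steps of 10, as on the base line of the illustration; the observations themselves are hypothetical and the bars are drawn as rows of '#' characters purely for illustration.

```python
def class_frequencies(values, lower, width, n_classes):
    """Count the values falling in each class [lower + k*width, lower + (k+1)*width)."""
    frequencies = [0] * n_classes
    for value in values:
        k = int((value - lower) // width)
        if 0 <= k < n_classes:
            frequencies[k] += 1
    return frequencies


# Hypothetical observations (illustration only).
observations = [92, 97, 101, 104, 108, 113, 115, 118, 119, 124, 131, 136, 142, 155, 168]
frequencies = class_frequencies(observations, lower=90, width=10, n_classes=8)

# One bar per class: the label gives the class interval, the bar length the frequency.
for k, frequency in enumerate(frequencies):
    start = 90 + 10 * k
    print(f"{start:3d}-{start + 10:3d} | {'#' * frequency}")
```

Every class has the same width here, so the length of each bar alone represents the class frequency.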
FREQUENCY POLYGONS
It is often more convenient to draw a frequency polygon instead of drawing
a histogram of a distribution. In laying out a frequency polygon the frequency of
each class is located at the midpoint of the interval and the plotted points are then
connected by straight lines. If two or more series are shown on the same graph,
the curves can be made with different kinds of ruling. If the total numbers of cases
in the two series differ, the frequencies are often reduced to
percentages. The frequency polygon is particularly appropriate for portraying
continuous series.
It is sometimes desirable to portray the data by a smoothed curve. The chart
is then called a frequency curve.
A frequency polygon gives an instant picture of a frequency distribution and
shows whether the distribution is approximately normal or otherwise. For example,
if the peak of the frequencies occurs in the middle and the frequencies fall away
evenly on both sides, the distribution is close to a normal one. If the peak occurs
towards one end or the other, the distribution is skewed.
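The layout of a frequency polygon described above can be sketched in a few lines of Python. This is only an illustration: the function name `polygon_points`, the class limits and the frequencies are hypothetical and are not taken from the text. It locates each class frequency at the midpoint of its interval and also reduces the frequencies to percentages, which is what is needed when series of different sizes share one graph.

```python
def polygon_points(class_limits, frequencies):
    """Return (midpoint, frequency, percentage) for each class interval."""
    if len(class_limits) != len(frequencies) + 1:
        raise ValueError("need one more class limit than frequencies")
    total = sum(frequencies)
    points = []
    for lower, upper, freq in zip(class_limits, class_limits[1:], frequencies):
        midpoint = (lower + upper) / 2          # plotting position on the X axis
        points.append((midpoint, freq, 100 * freq / total))
    return points


# Hypothetical class limits and class frequencies, for illustration only.
limits = [90, 100, 110, 120, 130]
freqs = [5, 20, 15, 10]

for midpoint, freq, pct in polygon_points(limits, freqs):
    print(midpoint, freq, pct)
```

Joining the printed points with straight lines gives the polygon; in this made-up series the peak lies towards the lower end, so the distribution would be read as skewed.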
OGIVE
The Ogive is a line chart plotted on arithmetic graph paper from a
cumulative frequency distribution which may be cumulated downward or upward.
It is useful in representing population, per capita income, per capita earnings etc.
Two or more distributions may be compared by converting the data of the
distributions to percentages of the total, then cumulating the percentages and
plotting the ogives on the same grid. The differences in steepness and shape of the
ogives facilitate comparative observations as shown below.
LORENZ CURVE
The Lorenz curve is a line chart used to compare the proportionality of two
quantitative variables. It is commonly used to show the degree by which the
distribution of income per family departs from the distribution of the number of
families. It shows that a disproportionate share of the income goes to a few families.
The curve is plotted from cumulative percentages of the total of each of the
variables. In the illustration, the independent variable, viz. the cumulative
percentages of the total number of employees, is marked along the base line and the
cumulative percentages of total income (i.e. the dependent variable) are plotted on
the vertical line. The "line of equal distribution" is the line which would have been
obtained if the two distributions had been proportional. The line designated 'actual
distribution' reveals that the distribution of income is unequal compared to the
distribution of the number of employees. The more the 'actual distribution' line
departs from the 'line of equal distribution', the more unequal is the Y variable in
terms of the X variable and the more disproportionate is the relationship.
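The plotting points of a Lorenz curve can be worked out as below. This is a minimal sketch under stated assumptions: the helper names `cumulative_percentages` and `lorenz_points` are hypothetical, the figures are invented, and the groups are assumed to be arranged from the lowest-income group to the highest. The same cumulating step also yields the points of a percentage ogive.

```python
def cumulative_percentages(values):
    """Cumulate the values and express each running total as % of the total."""
    total = sum(values)
    running = 0
    result = []
    for value in values:
        running += value
        result.append(100 * running / total)
    return result


def lorenz_points(group_sizes, group_incomes):
    """Pair cumulative % of employees (X) with cumulative % of income (Y)."""
    x = cumulative_percentages(group_sizes)
    y = cumulative_percentages(group_incomes)
    return list(zip(x, y))


# Hypothetical data: employees and total income in four income groups,
# ordered from the lowest-paid group to the highest-paid group.
employees = [40, 30, 20, 10]
income = [10, 20, 30, 40]

for x_pct, y_pct in lorenz_points(employees, income):
    print(x_pct, y_pct, "gap from line of equal distribution:", x_pct - y_pct)
```

On the line of equal distribution every gap would be zero; the larger the gaps, the more unequal the distribution.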
BAR CHARTS
These charts consist of either vertical or horizontal bars to represent
variables. The length of the bars varies in proportion to the values of the
variable. Bar charts are a most effective pictorial device for comparing data. The
bars may be depicted as solid blocks or in patterns of dots, dashes etc. They may
be of different forms: 1) linear or one-dimensional, 2) areal or two-dimensional,
and 3) cubic or three-dimensional. The actual numerical values may be shown on
the X-axis or Y-axis, as the case may be, or at the immediate ends of the bars.
Vertical Bar charts: They consist of vertical bars or columns erected on the
horizontal line and the values of the bars are shown on the Y-axis as illustrated
below. The vertical charts are commonly used for presenting time series data.
Horizontal Bar charts are commonly used for presenting qualitative and
geographical distributions as shown below.
Component Bar charts: These are employed to show comparisons involving two or
more variables on a single chart. They may consist of either horizontal bars or
vertical bars. This type of chart shows not only variations in total values, but also
the components of the respective totals.
[Figure: component bar chart showing East, West and North values for the 1st to 4th quarters]
Principles in designing Bar charts: In constructing bar charts the following
principles should be adopted.
1. The bars should be arranged in some systematic order; in chronological
order in a presentation of time series; according to magnitude, starting
with the largest, etc.
2. The bars should be of uniform width and properly adapted to the overall
size, proportion, and other features of the chart.
3. A scale should be included in every bar chart. The number of intervals on the
scale should be adequate for measuring distances but not so numerous as to cause
confusion. The intervals should be indicated in round numbers.
4. The stubs or designations for the various categories of a bar chart should be
clearly indicated to the left of the vertical base line.
Pie or circle charts
The circle or pie chart is a component parts chart. The component parts form
the segments of the circle. The circle chart is usually a percentage chart. The data
are converted to percentage of the total; and the proportional segments, therefore,
give a clear picture of the relationship among the component parts.
PICTOGRAMS
A pictogram is a variation of the bar chart. In it the values are represented by
identical symbols or pictures, each of which stands for a fixed amount of the variable.
The symbols used should be appropriate to the type of data. For example, pictures of
human beings can be used for depicting population data. The pictogram is used for
qualitative distributions and for time series distributions as well.
Whatever the graphic devices we use to represent our data, we must draw them
accurately; otherwise we may allow the readers to infer unintended meanings.
CHAPTER – IX
TEST OF SIGNIFICANCE
INTRODUCTION
Before using the results of a study, an investigator is interested in ascertaining
to what extent the results are reliable and may be used for drawing conclusions in a
given situation. For this, either the results may be compared with some parameters,
or they may be compared with the results of some other sample study covering a
similar field of investigation done under similar conditions. If the two results are
identical, or there is very little difference between the two, this may be taken as an
indication of their reliability. However, this cannot be taken as conclusive
proof of their reliability. Tests of significance are used for testing the reliability of
the results under such conditions.
For the purpose of testing the significance of sample results certain tests are
used. These tests, depending upon the nature of data, are broadly put into two
categories - tests for the variables and tests for the attributes.
TESTS OF SIGNIFICANCE OF VARIABLES
Variables are features or items which can be measured and expressed in
quantities, for instance the production of a crop in a month, the height of a person,
the income of an institution, the sales revenue of a company, etc. For testing the
significance of the sample results in the case of variables, the tests are classified
as large sample tests and small sample tests.
TESTING THE SIGNIFICANCE
After obtaining the results on the basis of analysis of data or observations, it
becomes necessary to test their significance. No valid conclusion can be drawn if
the results are not significant. The sequence of steps followed in the process of
testing the significance of the results is
• Formulation of the null hypothesis and the alternative hypothesis
• Specification of the level of significance
• Selection of the test statistic
• Establishment of the decision criteria
• Computation of the test statistic from the sample data
• Decision about the results, i.e. accepting or rejecting the hypothesis.
For testing the significance of results, two types of methods are available
viz. Parametric methods and Non-parametric methods.
ASSUMPTIONS ABOUT PARAMETRIC AND NON-PARAMETRIC
TESTS
Parametric tests: The classical statistical methods were based on the
conditions of normal distribution. Parametric methods were thus developed on the
assumption that the underlying distribution was normal, exponential etc. Important
parametric tests used for testing the significance are the "t" test, "F" test, "Z" test etc.
With these tests the observed values are compared with their expected distribution,
and conclusions are drawn on the basis of the nature and extent of the difference
between the two.
Non-parametric tests: Non-parametric methods, also known as distribution free
methods, have no assumption about the underlying distribution. These methods can
be used regardless of the shape of underlying distribution. For small sized samples,
non-parametric tests may be the only alternative. Non-parametric tests can be
applied even in case of nominal scale and ordinal scaled data. These tests are also
easy to apply.
Important non-parametric tests used for testing the significance are:
• Median test
• Wald-Wolfowitz Runs test
• Sign test
• Wilcoxon matched-pairs test (signed rank test)
• Chi-square test
• Mann-Whitney U test
• Kolmogorov-Smirnov test (KS test)
• Kruskal-Wallis test, etc.
PARAMETRIC TESTS
‘t’ Test : Large sample
In case of large samples the significance of a single measure or of two
measures can be tested. This can be illustrated by taking the example of the mean or
any other statistic. In case of a single measure a comparison can be made between the
actual and the expected value, or between the sample measure and the parameter,
whereas in case of two measures a comparison can be made between the two. The
measure of significance, known as the ratio of significance, is determined in the case
of the mean as under.
$$t = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$
Where t = ratio of significance
$\bar{x}$ = sample mean
$\mu$ = mean of universe
$\sigma_{\bar{x}}$ = standard error of mean.
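The ratio of significance for a large-sample mean may be computed as in the sketch below. The function name and the figures are hypothetical, and the standard error of the mean is assumed to be the standard deviation divided by the square root of the number of observations, which the text has not yet stated at this point.

```python
import math


def ratio_of_significance(sample_mean, universe_mean, sd, n):
    """Large-sample ratio: (sample mean - universe mean) / standard error."""
    standard_error = sd / math.sqrt(n)   # assumed form of the standard error
    return (sample_mean - universe_mean) / standard_error


# Hypothetical figures: a sample of 100 with mean 52 and standard deviation 10,
# tested against a universe mean of 50.
t = ratio_of_significance(52, 50, 10, 100)
print(t)
```

With these made-up figures the standard error is 1 and the ratio is 2, which is slightly more than 1.96, the multiple conventionally used at the 5% level.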
Small samples
In case of small samples also, tests of significance are applied to test the
significance of sample measures. The ratio of significance is used as a measure of
significance in case of small samples also. The ratio of significance ('t') in case of
small samples is determined as
$$t = \frac{\bar{x} - \mu}{s}\sqrt{n}$$
Where, t = ratio of significance,
$\bar{x}$ = mean of sample
$\mu$ = mean of universe
s = standard deviation of the sample
n = number of observations
Standard Error
The standard deviation of a sampling distribution is termed the standard
error. It is a measure of variability in the sample results due to chance. To
understand standard error we should understand the sampling distribution. If a
large number of random samples are selected from a large population of definite
size, some statistic (mean, standard deviation, correlation, etc.) is calculated for
each sample, and these values are put in the form of a frequency distribution, the
distribution so formed is known as the sampling distribution.
Uses of standard error
The standard error is used for testing the significance of the results in the
following manner.
1. To Test Given Hypothesis
The hypothesis is generally tested at 5% or at 1 % level of significance. If
the difference between the observed and the expected results is more than 1.96 or
2.58 times the standard error, then the difference is not accepted as arising due to
chance at 5% or at 1 % level of significance. The difference is taken as significant
or due to reasons other than chance, putting a question mark on the reliability of
the data.
2. Measures of Reliability
The standard error is used as a measure of reliability of the sample. The greater
the standard error, the lower the reliability of the sample. Since the standard error
varies inversely with the square root of the number of observations, the number of
observations has to be increased 4 times to double the reliability.
3. Determination of limits of variation
The standard error is used to determine the limits within which the
parameter values are expected to lie. Thus $\bar{x} \pm 3\sigma_{\bar{x}}$ gives the two limits within
which 99.73% of the cases are expected to lie. Likewise, $\bar{x} \pm 1.96\sigma_{\bar{x}}$ gives the limits
of variation within which 95% of the cases are expected to lie.
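The last two uses can be checked numerically. The sketch below assumes the usual form of the standard error of the mean, the standard deviation divided by the square root of n; the function names and figures are hypothetical.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean, assumed to be sd / sqrt(n)."""
    return sd / math.sqrt(n)


def limits_of_variation(mean, se, multiple):
    """Lower and upper limits: mean minus/plus a multiple of the standard error."""
    return (mean - multiple * se, mean + multiple * se)


# Hypothetical sample: mean 50, standard deviation 10.
se_100 = standard_error(10, 100)   # 100 observations
se_400 = standard_error(10, 400)   # four times as many observations

print(se_100, se_400)                          # the standard error is halved
print(limits_of_variation(50, se_100, 1.96))   # limits covering 95% of cases
print(limits_of_variation(50, se_100, 3))      # limits covering 99.73% of cases
```

Quadrupling the number of observations halves the standard error, which is the sense in which reliability is doubled.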
CRITICAL VALUES
A critical value is the multiple of the standard error which, in a normal
distribution, is expected to cover a certain area of the universe, for example 1.96
for 95% and 2.58 for 99% of the area.
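Standard Error of Difference between Two Sample Means
When the two samples are not correlated:
1. When the parameter standard deviation ($\sigma_p$) is given
$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sigma_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
2. When $\sigma_p$ is not given
In case the parameter standard deviation is not given, either its value may be
estimated from the given information or the combined standard deviation may be
taken as the best estimate of the parameter standard deviation. Thus,
Estimated value of
(i) Combined standard deviation
$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
3. When the difference between the average of the two sample means is
to be compared with the mean of any one sample
$$\sigma_{(\bar{x}_1 - \bar{x}_{12})} = \sqrt{\sigma_p^2 \, \frac{n_2}{n_1 (n_1 + n_2)}} \quad \text{or} \quad \sigma_{(\bar{x}_2 - \bar{x}_{12})} = \sqrt{\sigma_p^2 \, \frac{n_1}{n_2 (n_1 + n_2)}}$$
When $\sigma_p$ is estimated from the samples,
$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Where s is the combined standard deviation of the two samples, based on the pooled
sum of squares $\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2$.
4. In case the variables are correlated
$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} - 2r \, \frac{\sigma_1 \sigma_2}{\sqrt{n_1 n_2}}}$$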
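STANDARD ERROR OF DIFFERENCE BETWEEN TWO STANDARD
DEVIATIONS
1. When $\sigma_p$ is given
$$\sigma_{(\sigma_1 - \sigma_2)} = \sqrt{\frac{\sigma_p^2}{2} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \quad \text{or} \quad \sigma_p \sqrt{\frac{1}{2} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
2. When $\sigma_p$ is not given
$$\sigma_{(\sigma_1 - \sigma_2)} = \sqrt{\frac{\sigma_1^2}{2 n_1} + \frac{\sigma_2^2}{2 n_2}}$$
3. When the standard deviation of one sample is to be compared with the
combined standard deviation of the two samples.
$$\sigma_{(\sigma_1 - \sigma_{12})} = \sqrt{\frac{\sigma_p^2}{2} \cdot \frac{n_2}{n_1 (n_1 + n_2)}} \quad \text{or} \quad \sigma_{(\sigma_2 - \sigma_{12})} = \sqrt{\frac{\sigma_p^2}{2} \cdot \frac{n_1}{n_2 (n_1 + n_2)}}$$
The ratio of significance can also be used for testing the significance of the
difference between two standard deviations in the above cases.
$$\text{Ratio of significance} = \frac{\sigma_1 - \sigma_2}{\sigma_{(\sigma_1 - \sigma_2)}}$$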
Small Samples
In case of large samples the values of the normal distribution may be
directly used, but these cannot be used for testing the significance for small samples.
To meet this situation W. S. Gosset, writing under the pen name "Student",
derived in a paper published in 1908 a theoretical distribution which came to
be known as Student's "t" distribution. It approximates the normal distribution as the
size of the sample increases.
The "t" distribution can be used for testing the significance of results in case
of small samples. The test is applied in the same way as in large samples.
The equation is stated as
$$t = \frac{\bar{x} - \mu}{s} \sqrt{n}$$
where, $\bar{x}$ = sample mean
$\mu$ = mean of universe
s = standard deviation of the sample, where
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
n = number of observations.
The distribution curve of the statistic "t" is
$$F(t) = \frac{y_0}{\left( 1 + \frac{t^2}{\nu} \right)^{\frac{\nu + 1}{2}}}$$
Where, $\nu$ = degrees of freedom, and $y_0$ is such that the total area is unity.
As $\nu$ tends to infinity, $\left( 1 + t^2 / \nu \right)^{-\frac{\nu + 1}{2}}$ tends to $e^{-t^2/2}$, and hence the "t"
distribution approximates the normal distribution.
Properties of "t" distribution
The following are the important properties of the "t" distribution.
The "t" distribution ranges from −∞ to +∞, just as does a normal distribution.
The "t" distribution is symmetrical, with mean, median and mode = 0.
The "t" distribution has greater dispersion than the normal distribution. As n
becomes large, it approaches the normal distribution. Where n is 30 or larger,
the difference is very small, as shown in the following figure.
[Figure: "t" distribution in case of different sizes of sample]
There are two types of t-tests: one is called the t-test for independent samples,
and the other is called the paired t-test. The first test is used when the scores of
one group are independent of the scores of the other group. That means there is no
logical relationship between the scores that have been obtained for one group when
compared with the other group. The paired t-test is used when the two sets of scores
come from the same subjects. However, both the tests are used to assess the
significance of a difference.
THE PAIRED T-TEST
The data for this example are taken from school social workers. The school
social workers want to test the effects of an intervention to improve the self-esteem of
children. A group of 10 children was randomly selected from a class. Rating scales
administered before the intervention assessed the self-esteem of the children. The scale
had a scoring range from 5 to 15. After the intervention, the school social worker
administered the same rating scales. The pre-intervention and post-intervention
scores are shown below in the given table.
S.No.   Pre-test scores (x)   Post-test scores (y)   D = (y - x)   D²
1       9                     12                     3             9
2       8                     10                     2             4
3       15                    15                     0             0
4       12                    14                     2             4
5       8                     14                     6             36
6       4                     11                     7             49
7       6                     10                     4             16
8       3                     8                      5             25
9       3                     8                      5             25
10      2                     8                      6             36
N = 10                                               ΣD = 40       ΣD² = 204
The social work researcher established a null hypothesis that there is no
difference between pre-intervention and post-intervention scores, to test whether
there is a statistically significant difference. The research hypothesis is that
post-intervention scores will show improvement in self-esteem over pre-intervention
scores.
Step   Procedure                                    Application
1      Find the difference (D) between the          $D = Y - X$
       post-intervention and pre-intervention
       scores.
2      Compute the mean difference                  $\bar{D} = \frac{\sum D}{n}$
3      Compute the square of each difference        $D^2$
4      Find the sum of squares of differences       $\sum D^2$
5      Compute the sum of squares (SS)              $SS = \sum D^2 - \frac{(\sum D)^2}{n}$
6      Find the degrees of freedom (df)             $df = n - 1$
7      Compute the variance ($s^2$)                 $s^2 = \frac{SS}{df}$
8      Compute 't' by using the formula             $t = \frac{\bar{D}}{\sqrt{s^2 / n}}$
Now that the value of the paired t-test has been calculated, we have to see if the
null hypothesis can be rejected. It is assumed that the intervention is likely to
improve the post-intervention scores. That means there is directionality in the data.
Hence, we use a one-tailed test.
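The eight steps can be traced with the pre-test and post-test scores of the ten children given in the table. The following is a minimal Python sketch; the function name `paired_t` and the dictionary keys are illustrative choices, not part of the original procedure.

```python
import math

# Scores of the 10 children, as given in the table.
pre = [9, 8, 15, 12, 8, 4, 6, 3, 3, 2]
post = [12, 10, 15, 14, 14, 11, 10, 8, 8, 8]


def paired_t(before, after):
    """Follow steps 1 to 8 of the paired t-test procedure."""
    diffs = [y - x for x, y in zip(before, after)]    # step 1: D = Y - X
    n = len(diffs)
    sum_d = sum(diffs)
    mean_d = sum_d / n                                # step 2: mean difference
    sum_d2 = sum(d * d for d in diffs)                # steps 3 and 4: sum of D squared
    ss = sum_d2 - sum_d ** 2 / n                      # step 5: sum of squares
    df = n - 1                                        # step 6: degrees of freedom
    variance = ss / df                                # step 7: variance
    t = mean_d / math.sqrt(variance / n)              # step 8: t
    return {"sum_d": sum_d, "sum_d2": sum_d2, "mean_d": mean_d,
            "ss": ss, "df": df, "variance": variance, "t": t}


result = paired_t(pre, post)
for name, value in result.items():
    print(name, round(value, 4))
```

The sketch reproduces ΣD = 40 and ΣD² = 204 from the table and gives SS = 44, df = 9 and t of about 5.72. This is well above 1.833, the one-tailed table value of t at the 5% level for 9 degrees of freedom, so the null hypothesis would be rejected.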
The T-test for independent samples
The t-test for two independent samples examines the difference between
their means to see how close or far apart they are. Let us consider the data given in
the table below.
The data compare scores obtained by two groups of students, say, for
example, a group of master's degree students and a group of bachelor's degree
students, on a scale designed to measure attitude towards AIDS patients. A social
work teacher is interested in studying whether there is a significant difference between
the two groups of students with regard to their attitude towards AIDS patients. The
social work teacher selects two groups of students randomly and administers the
test. The scores, at the interval level of measurement, are presented in the table.
S.No.   Scores obtained by master's   x²     Scores obtained by bachelor's   y²
        degree students (x)                  degree students (y)
1       8                             64     12                              144
2       11                            121    9                               81
3       9                             81     6                               36
4       12                            144    5                               25
5       16                            256    8                               64
6       10                            100    12                              144
7       7                             49     11                              121
8       16                            256    10                              100
9       6                             36     10                              100
10      5                             25     7                               49
N = 10  Σx = 100                      Σx² = 1132   Σy = 90                   Σy² = 864
The first step in calculating the t-test for two independent samples is to square
the scores of the X and Y columns to get the x² and y² values.
The next step is to sum up the columns to find Σx, Σy, Σx² and Σy². This is followed
by calculation of the means of the x and y columns, that is $\bar{x}$ and $\bar{y}$.
Before we take the decision we have to check whether the scores of the two groups
show any directionality. There is no indication that either set of scores
supports a directional hypothesis. Thus, we will have to look up the critical value of
"t" for a two-tailed test at the .05 level of significance for 18 degrees of freedom
(n1 + n2 - 2 = 10 + 10 - 2).
Difference Test
In case of paired data where the same sample is studied under two different
situations, before and after a treatment to know the effect of the treatment, the
difference test is applied. Here, there is only one sample and its two situations are
observed. Some typical examples of this type are, performance of students before
and after coaching etc. In such cases, first the two observations are recorded and
their differences are determined. Taking these differences as the basic data, their mean
and standard deviation are determined. Then, taking the hypothesis of no difference,
the "t" test is applied.
SIGNIFICANCE OF DIFFERENCE BETWEEN TWO SAMPLE MEANS
The "t" test can also be used for testing the significance of the difference
between the means of two independent random samples. In this case, the
hypothesis is taken that the samples are drawn from populations identical both as to
their means and their standard deviations. It should be noted that this approach is
to be adopted only in cases where two independent samples are drawn at random
from the same population. In case the same sample is divided into two parts, the
earlier approach should be adopted.
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s \sqrt{1/n_1 + 1/n_2}}$$
Where, t = ratio of significance.
$\bar{x}_1$ = mean of first sample
$\bar{x}_2$ = mean of second sample
n1 = number of items in first sample
n2 = number of items in second sample.
S = standard deviation of difference between two sample results.
$$S = \sqrt{\frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}}$$
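As an illustration, the formula can be applied to the scores of the master's and bachelor's degree students tabulated earlier in this chapter. The following is a minimal Python sketch; the function name `independent_t` is a hypothetical choice.

```python
import math

# Scores from the table of master's (x) and bachelor's (y) degree students.
masters = [8, 11, 9, 12, 16, 10, 7, 16, 6, 5]
bachelors = [12, 9, 6, 5, 8, 12, 11, 10, 10, 7]


def independent_t(sample1, sample2):
    """Return (t, degrees of freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample1)   # sum of squared deviations, sample 1
    ss2 = sum((x - mean2) ** 2 for x in sample2)   # sum of squared deviations, sample 2
    s = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))     # combined standard deviation S
    t = (mean1 - mean2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


t, df = independent_t(masters, bachelors)
print(round(t, 4), df)
```

The means are 10 and 9, S² works out to 186/18 and t to about 0.70 with 18 degrees of freedom. This is below 2.101, the two-tailed table value of t at the .05 level for 18 degrees of freedom, so on these data the null hypothesis of no difference would not be rejected.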
LIMITATIONS OF TESTS OF SIGNIFICANCE
In testing statistical significance the following points must be noted
i) They should not be used mechanically
Tests of significance are simply the raw materials from which to make decisions,
not decisions in themselves. There may be situations where real differences exist
but do not produce evidence that they are statistically significant or the other way
round. In each case it is absolutely necessary to exercise great care before taking
decisions.
ii) Conclusions are to be given in terms of possibilities and not certainties
When a test shows that a difference is statistically significant, it suggests that the
observed difference is probably not due to chance. Thus statements are not made
with certainty but with a knowledge of probability. "Unusual" events do happen once
in a while.
iii) They do not tell us "why" the difference exists
Though tests can indicate that a difference has statistical significance, they do not
tell us why the difference exists. However, they do suggest the need for further
investigation in order to reach definite answers.
iv) Serious violations of assumptions
Many hypothesis tests are performed even though the assumptions that would
validate the procedure used are not met. For example, an intended simple random
sample can in fact turn out to be a non-probability sample, or the populations being
sampled may not be normally distributed, or they may not have equal variances.
v) Non-publication of non-significant results
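Results of hypothesis tests that are statistically non-significant are unlikely to be
published. This situation may have serious consequences.
Consider a null hypothesis that is in fact true and that is tested independently
many times at the 5% level of significance. About one test in twenty will reject it by
chance alone, and if only those significant results are published, the published
evidence will wrongly suggest that the hypothesis is false.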
‘F’TEST
The F-Test or the Variance Ratio Test
The F-test is named in honour of the great statistician R.A. Fisher. The object
of the F-test is to find out whether the two independent estimates of population
variance differ significantly, or whether the two samples may be regarded as drawn
from normal populations having the same variance. For carrying out the test of
significance, we calculate the ratio F. F is defined as
$$F = \frac{S_1^2}{S_2^2}$$
where
$$S_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1}$$
and
$$S_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1}$$
It should be noted that $S_1^2$ is always the larger estimate of variance, i.e. $S_1^2 > S_2^2$.
F = Larger estimate of variance / Smaller estimate of variance
V1 = n1 - 1 and V2 = n2 - 1
V1 = degrees of freedom for the sample having the larger variance.
V2 = degrees of freedom for the sample having the smaller variance.
The calculated value of F is compared with the table value for V1 and V2 at the
5% or 1% level of significance. If the calculated value of F is greater than the table
value, the F ratio is considered significant and the null hypothesis is rejected.
On the other hand, if the calculated value of F is less than the table value, the null
hypothesis is accepted and it is inferred that both the samples have come from
populations having the same variance.
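A short Python sketch of the variance ratio follows. It reuses, purely as an illustration, the scores of the two groups of students from the t-test example earlier in the chapter; the function names are hypothetical. The larger estimate of variance is always placed in the numerator, as the text requires.

```python
# Illustrative data: the two groups of scores from the t-test example.
group_a = [8, 11, 9, 12, 16, 10, 7, 16, 6, 5]
group_b = [12, 9, 6, 5, 8, 12, 11, 10, 10, 7]


def variance_estimate(sample):
    """Estimate of population variance: sum of squared deviations / (n - 1)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - 1)


def f_ratio(sample_a, sample_b):
    """Return (F, V1, V2) with the larger variance estimate in the numerator."""
    var_a = variance_estimate(sample_a)
    var_b = variance_estimate(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


f, v1, v2 = f_ratio(group_a, group_b)
print(round(f, 4), v1, v2)
```

Here the two estimates are 132/9 and 54/9, so F is about 2.44 with 9 and 9 degrees of freedom; this value would then be compared with the table value of F for those degrees of freedom.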
Since the F-test is based on the ratio of two variances, it is also known as the
variance ratio test. The ratio of two variances follows a distribution called the "F"
distribution, named after the famous statistician R.A. Fisher.
Assumptions in the F-Test: the F-test is based on the following assumptions.
1. Normality, i.e. the values in each group are normally distributed.
2. Homogeneity, i.e. the variance within each group should be equal for all
groups. This assumption is needed in order to combine or pool the variances within
the groups into a single "within groups" source of variation.
3. Independence of error, i.e. the error (variation of each value
around its own group mean) should be independent for each value.
Z-TEST
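Professor Ronald Fisher has given a method for testing the significance of the
correlation coefficient in small samples. In this method, the sample 'r' is transformed
into Z. That is why the test is called the Z test. This test may be applied in two
situations.
i. To test the significance of difference between the observed and hypothetical
value of r.
$$\text{Co-efficient of significance: } t = \frac{Z_s - Z_p}{\sigma_z}$$
Where,
$$Z = \frac{1}{2} \log_e \frac{1 + r}{1 - r}, \qquad \sigma_z = \frac{1}{\sqrt{n - 3}}$$
and $Z_s$ and $Z_p$ are the transformed values of the sample r and the hypothetical r.
ii. To test the significance of difference between the values of r of two samples.
$$t = \frac{Z_1 - Z_2}{\sigma_{(z_1 - z_2)}}$$
Where,
$$Z_1 = \frac{1}{2} \log_e \frac{1 + r_1}{1 - r_1}, \qquad Z_2 = \frac{1}{2} \log_e \frac{1 + r_2}{1 - r_2}$$
$$\sigma_{(z_1 - z_2)} = \sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}$$
Note: $\log_e$ is the natural logarithm, and $\frac{1}{2} \log_e$ is taken as equal to 1.1513 $\log_{10}$.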
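Fisher's transformation and the two ratios can be sketched in Python as below. The function names and the values of r and n are hypothetical; the standard errors are assumed to be 1/√(n − 3) for one sample and √(1/(n1 − 3) + 1/(n2 − 3)) for two samples.

```python
import math


def fisher_z(r):
    """Fisher's transformation: Z = 1/2 * log_e((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))


def ratio_one_sample(r_sample, r_hypothetical, n):
    """Case (i): observed r against a hypothetical r."""
    se = 1 / math.sqrt(n - 3)
    return (fisher_z(r_sample) - fisher_z(r_hypothetical)) / se


def ratio_two_samples(r1, n1, r2, n2):
    """Case (ii): difference between the r values of two samples."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se


# Hypothetical figures: r = 0.5 in a sample of 28 pairs, tested against r = 0.
print(round(fisher_z(0.5), 4))
print(round(ratio_one_sample(0.5, 0.0, 28), 4))
# The shortcut in the note: 1/2 log_e(x) is about 1.1513 * log10(x).
print(round(1.1513 * math.log10((1 + 0.5) / (1 - 0.5)), 4))
```

With n = 28 the standard error is 1/√25 = 0.2, so the ratio is simply the transformed value divided by 0.2.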
"Z" test is used to measure the difference between any variable value (x) and the
mean of all variable values or "x" values, which is indicated by "m", divided by the
standard deviation (s). It is based on the normal probability distribution. The Z-test
is used in the following cases.
1. To judge the significance of statistical measures, particularly the mean. This
is done by comparing the observed value (test statistic) with the probable value
(table value) at a specified level of significance.
2. It is used to compare the mean of a sample with some hypothesized mean of
the population.
3. It is also used to judge the significance of the difference between the means of two
independent samples.
4. It can also be used for judging the significance of the difference between a sample
proportion and a population proportion, or between the proportions of two independent
samples.
5. Finally, this test can also be used for measuring the significance of the median,
mode, co-efficient of correlation and other measures.
$$Z = \frac{\text{Observed value of statistic } (x) - \text{expected value } (\mu)}{\text{Standard error of the estimate}}$$
NON-PARAMETRIC TESTS
INTRODUCTION
Sometimes, the rigid requirement of parametric tests is not satisfied by the
data. This is particularly so in case of nominal scaled or ordinal scaled data. Under
such a situation non-parametric tests, also known as distribution free methods, can
be used for testing the significance of the results obtained from the statistical
analysis of such data. The non-parametric tests are useful under the following
conditions
1. When the nature of the population distribution from which the samples were
drawn is not known or is known to be not normal.
2. The variables are expressed in
• nominal form, i.e., classified in categories and represented by frequency
counts, or
• ordinal form, i.e., ranked in order and expressed as 1, 2, 3, etc.
Use of non-parametric methods
There are situations where the basic assumptions underlying the parametric
tests are not valid or one does not have the knowledge of the distribution of the
population parameters being tested. In studies involving human behaviour such as
in psychological or market research studies, such a problem is encountered. Non-
parametric tests can handle these problems. Some of the typical situations for using
non-parametric tests are
1. Non-normal distribution
In surveys, the responses are often not distributed normally. Instead of
parametric tests based on the assumption of normal distribution, non-parametric
tests may prove useful for testing the hypothesis.
2. Nominal scaled data
Sometimes, the responses are not numerically expressed but are given in
terms of yes or no, or as mere names which cannot be treated as numbers.
3. Incomplete or partial data
Sometimes, in the case of mailed questionnaires, not all the information is
filled in. Non-parametric tests can extract maximum information even from such
an incomplete data set by making the necessary adjustments.
4. Very small sample
Size of sample is significant in case of parametric tests. However,
non-parametric tests can provide reasonably good results even for very small samples.
The non-parametric tests may be grouped as follows.
a. One sample tests
• Runs test for randomness
• Sign test
• Chi-square test
• Kolmogorov-Smirnov one sample test
b. Two sample tests
• Median test
• Mann-Whitney U test
• Wald-Wolfowitz runs test
c. Two matched pairs tests
• Sign test
• Wilcoxon matched pairs signed ranks test
d. K sample tests
• Median test
• Kruskal-Wallis test
Advantages of Non-parametric tests
1. Non-parametric tests are distribution free: i.e., they do not require any
assumption to be made about population following normal or any other
distribution.
2. Generally they are simple to understand and easy to apply when the sample
sizes are small.
3. Most non-parametric tests do not require lengthy and laborious computations
and hence are less time-consuming. If a significant result is obtained, no further
work is necessary.
4. Non-parametric tests are applicable to all types of data: qualitative data, data in
rank form, as well as data that have been measured more precisely.
5. Many non-parametric tests make it possible to work with very small samples.
This is particularly helpful to the researcher collecting pilot study data or to
medical researcher working with a rare disease.
6. Non-parametric methods make fewer and less stringent assumptions than do
the classical procedures.
CHI-SQUARE TEST
Defined as: The χ² test (pronounced as chi-square test) is one of the
simplest and most widely used non-parametric tests. The symbol χ is the Greek
letter chi. The χ² test was first used by Karl Pearson in the year 1900. The quantity
χ² describes the magnitude of the discrepancy between theory and observation. It is
defined as
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where, O refers to the observed frequencies and E refers to the expected
frequencies.
The χ² test has the following steps
i) State the null hypothesis and calculate the numbers in each category.
ii) Determine the level of significance the researcher is prepared to take.
iii) Calculate χ², as follows
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
iv) Find the critical value of χ² against the number of degrees of freedom for the
specified level of significance.
v) Compare the calculated value of χ² with the tabulated value and determine the
region of rejection.
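The calculation in step (iii) can be sketched in Python as follows. The data are hypothetical: 60 throws of a die, with the null hypothesis that all six faces are equally likely, so that the expected frequency of each face is 10. The function name `chi_square` is likewise an illustrative choice.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed counts for faces 1 to 6 in 60 throws of a die.
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6

value = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1
print(value, degrees_of_freedom)
```

The calculated value is 4.2 with 5 degrees of freedom. The table value of χ² at the 5% level for 5 degrees of freedom is 11.07, so the calculated value does not fall in the region of rejection and the null hypothesis would be retained for these made-up data.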
A chi-square test can be used when the data satisfy four conditions
i) There must be two observed sets of data or one observed set of data and one
expected set of data.
ii) The two sets of data must be based on the same sample size.
iii) Each cell in the data contains an observed or expected count of five or larger.
iv) The different cells in a row or column must have categorical variables (e.g. gender:
male, female; or age: younger than 20 years of age, 25 years of age, older than 25 years
of age etc.)
APPLICATION AREAS
When tests are undertaken to examine whether the sample data support a
hypothetical distribution, such problems are called "tests of goodness of fit". The
chi-square test is applied:
• To test whether the differences among various sample proportions are
significant or can be attributed to chance.
• To test the independence of two variables in a contingency table.
• To use it as a test of goodness of fit.
The chi-square distribution typically looks like a normal distribution which is
skewed to the right, with a long tail to the right. It is a continuous distribution with
only positive values.
CONDITIONS FOR APPLYING χ² TESTS
The following conditions should be satisfied before applying the χ² test
1. In the first place N must be reasonably large to ensure the similarity between
the theoretically correct distribution and our sampling distribution of χ², the
chi-square statistic. It is difficult to say what constitutes largeness, but as a general
rule the χ² test should not be used when N is less than 50, however few the cells.
2. No theoretical cell frequency should be small. When the expected frequencies
are too small, the value of χ² will be overestimated and will result in too many
rejections of the null hypothesis. To avoid making incorrect inferences, a general
rule is followed that an expected frequency of less than 5 in one cell of a contingency
table is too small to use.
3. The constraints on the cell frequencies, if any, should be linear, i.e., they
should not involve squares and higher powers of the frequencies.
YATES'S CORRECTION
One of the conditions for the application of the χ² test is that no cell frequency
should be less than 5 in any case, though 10 is better. This requirement is to avoid
inflated chi-square values caused by dividing the squared differences by a small
expected frequency. When the theoretical frequencies are less than 10, and
especially less than 5, the ordinary table values of χ² are less reliable. This is
especially true for one degree of freedom; it is true to a lesser extent for two or
three degrees of freedom. However, the error is negligible for more than three
degrees of freedom.
Yates's correction, also called Yates's correction for continuity, is introduced
because the theoretical chi-square distribution, on which the tabulated values are
based, is continuous, whereas the χ² statistic computed from frequencies is
discrete. The correction has the effect of reducing the calculated value of χ²
compared to the corresponding value without correction.
In the special case of a 2x2 contingency table, the approximation may be
improved, and the bias arising out of the use of small theoretical frequencies may
be reduced, by means of a correction proposed by F. Yates.
The correction involves reducing the deviation of the observed from the
theoretical frequencies, which of course reduces the value of χ². The working rule
for the application of the correction is as follows.
Adjust the observed frequency in each cell of the 2x2 table in such a way as
to reduce the absolute deviation of the observed from the theoretical frequency for
that cell by 1/2; the adjustments for all the cells are to be made without changing
the marginal totals. This operation will increase the observed frequency by 1/2 in
each of two cells, and will reduce it by 1/2 in each of the other two cells.
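As an illustration of the working rule, the following Python sketch computes the chi-square value of a 2x2 table with and without Yates's correction. The function name chi_square_2x2 and the cell counts are hypothetical and are not taken from the text; the guard that keeps a corrected deviation from falling below zero is an added assumption.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Theoretical frequency from the marginal totals.
            expected = row_totals[i] * col_totals[j] / n
            deviation = abs(observed[i][j] - expected)
            if yates:
                # Reduce each absolute deviation by 1/2 (assumed floor at zero).
                deviation = max(deviation - 0.5, 0.0)
            chi2 += deviation ** 2 / expected
    return chi2


# Hypothetical cell counts, for illustration only.
plain = chi_square_2x2(12, 8, 6, 14)
corrected = chi_square_2x2(12, 8, 6, 14, yates=True)
print(round(plain, 3), round(corrected, 3))
```

The corrected value is smaller than the uncorrected one, which is the effect the correction is meant to have.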
USES OF χ² TESTS
The χ² test is one of the most popular statistical inference procedures today.
It is applicable to a very large number of problems in practice, which can be
summed up under the following heads.
i) χ² test as a test of independence
With the help of the χ² test we can find out whether two or more attributes are
associated or not. Suppose we have N observations classified according to some
attributes; we may ask whether the attributes are related or independent. Thus, we
can find out whether quinine is effective in controlling fever or not, or whether
there is any association between marriage and failure. In order to test whether or
not the attributes are associated, we take the null hypothesis that there is no
association between the attributes under study, or in other words, that the two
attributes are independent. It should be noted that χ² is not a measure of the
degree or form of relationship; it only tells us whether two principles of
classification are or are not significantly related, without reference to any
assumptions concerning the form of relationship.
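A minimal Python sketch of the test of independence is given below. It derives the expected frequencies from the marginal totals and uses (r-1)(c-1) degrees of freedom, which is the usual rule for a contingency table. The function name, the counts and the treatment/fever labels are hypothetical; 3.841 is the tabulated χ² value for 1 degree of freedom at the 5% level.

```python
def chi_square_independence(table):
    """Return (chi2, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two attributes were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows = treated / not treated, columns = fever / no fever.
table = [[20, 80], [40, 60]]
chi2, dof = chi_square_independence(table)
CRITICAL_5PCT_DF1 = 3.841  # table value of chi-square, 1 d.f., 5% level
decision = "reject H0" if chi2 > CRITICAL_5PCT_DF1 else "accept H0"
print(round(chi2, 3), dof, decision)
```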
ii) χ² test as a test of goodness of fit
The χ² test is very popularly known as a test of goodness of fit because it
enables us to ascertain how well theoretical distributions such as the
Binomial, Poisson, Normal, etc., fit empirical distributions, i.e., those obtained
from sample data. A test of the concordance of the two can be made just by
inspection, but such a test is obviously inadequate. Precision can be secured by
applying the χ² test.
The following are the steps in testing the goodness of fit.
1. A null hypothesis is established and a significance level is selected for
rejection of the null hypothesis.
2. A random sample of observations is drawn from a relevant statistical
population.
3. A set of expected or theoretical frequencies is derived under the assumption
that the null hypothesis is true.
4. The observed frequencies are compared with the expected or theoretical
frequencies.
5. If the calculated value of x2 is less than the table value at a certain level of
significance and for certain degrees of freedom the fit is considered to be good.
The divergence between the actual and expected frequencies is attributed to
fluctuations of simple sampling.
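The steps above can be sketched in Python as follows. The example assumes a hypothetical experiment of 120 throws of a die with the null hypothesis that the die is fair; the data and names are illustrative only, and 11.070 is the tabulated χ² value for 5 degrees of freedom at the 5% level.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    if abs(sum(observed) - sum(expected)) > 1e-9:
        # The observed and theoretical totals must be equal.
        raise ValueError("observed and expected totals must be equal")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [22, 17, 20, 26, 22, 13]  # hypothetical counts for faces 1 to 6
expected = [20] * 6                  # fair die: 120 / 6 for each face
chi2 = chi_square_goodness_of_fit(observed, expected)
dof = len(observed) - 1
CRITICAL_5PCT_DF5 = 11.070           # table value, 5 d.f., 5% level
fit_is_good = chi2 < CRITICAL_5PCT_DF5
print(chi2, dof, fit_is_good)
```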
(iii) χ² test as a test of homogeneity
The χ² test of homogeneity is an extension of the chi-square test of
independence. Tests of homogeneity are designed to determine whether two or
more independent random samples are drawn from the same population or from
different populations. Instead of one sample, as we use with the independence
problem, we shall now have two or more samples.
It should be noted that both types of tests, i.e., the tests of independence and
homogeneity, are concerned with cross-classified data.
ADDITIVE PROPERTY OF χ²
One of the merits of the χ² test as an instrument of research is that it is
possible to combine the independently derived values of χ² relating to samples of
similar data by the simple process of addition. This enables a better test than
could be made using the data of any sample by itself. The sum of the χ² values thus
combined will itself have a χ² distribution with degrees of freedom equal to the
sum of the degrees of freedom of the separate χ² values. However, while adding χ²
values two points must be remembered:
1. The combined result in a single inclusive test is appropriate when the samples
are independent; and
2. When χ² values are to be added, Yates's correction should not be applied,
because the addition theorem holds only for uncorrected constituent items.
Chi-square test for specified value of population variance
When we want to test whether a random sample x1, x2, ..., xn has been drawn from
a normal population with mean μ and a specified variance σ0², the statistic
$$\chi^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma_0^2} = \frac{nS^2}{\sigma_0^2}$$
where S is the standard deviation of the sample (computed with divisor n), follows
the chi-square distribution with (n-1) degrees of freedom.
By comparing the calculated value of χ² with the tabulated value for (n-1) degrees
of freedom at a certain level of significance, we may accept or reject the null
hypothesis.
It should be noted that this test can be applied only if the population is normal.
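A short Python sketch of this test follows, assuming the statistic is the sum of squared deviations from the sample mean divided by the specified variance. The sample values and the specified variance of 4 are hypothetical; 16.919 is the tabulated upper 5% value of χ² for 9 degrees of freedom.

```python
def chi_square_variance(sample, sigma0_squared):
    """Return (chi2, degrees_of_freedom) for a test of a specified variance."""
    n = len(sample)
    mean = sum(sample) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in sample)
    return sum_sq_dev / sigma0_squared, n - 1


sample = [10, 12, 9, 11, 13, 8, 10, 12, 11, 14]  # hypothetical observations
chi2, dof = chi_square_variance(sample, sigma0_squared=4.0)
CRITICAL_5PCT_DF9 = 16.919  # table value, 9 d.f., upper 5% point
reject_h0 = chi2 > CRITICAL_5PCT_DF9
print(chi2, dof, reject_h0)
```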
MISUSE OF CHI-SQUARE TEST
The most common mistake in the application of the chi-square statistic and
yet the most critical for its correct application is the violation of the independence
between measures or events. This assumption of independence is not to be
confused with the chi-square as a test of independence.
The assumption of independence refers to the individual observations or
frequencies and means that the occurrence of one event has no effect upon the
occurrence of any other event. Some other sources of error in the application of
the χ² test, as revealed in a survey of research papers published in the Journal of
Experimental Psychology, are
i. Small theoretical frequencies.
ii. Neglect of frequencies of non-occurrence.
iii. Failure to equalize the sum of observed frequencies and the sum of the
theoretical frequencies.
iv. Incorrect or questionable categorizing.
v. Use of non-frequency data.
vi. Incorrect determination of the number of degrees of freedom.
vii. Incorrect computation.
LIMITATIONS OF THE USE OF χ² TEST
The χ² test is very widely used in practice. However, in order to avoid
misapplication of the test, the following limitations should be kept in mind.
1. Frequencies of non-occurrence should not be omitted for binomial or
multinomial events.
2. The formula presented for the χ² statistic is in terms of frequencies. Hence no
attempt should be made to compute it on the basis of proportions or other derived
measures.
3. The formula presented here is not appropriate for cases in which repeated
measurements on the same or matched groups are represented in one table, as when
data from questionnaires and similar devices are analysed.
MANN-WHITNEY U TEST
The Mann-Whitney U test (1947) was designed to test the significance of the
difference between the results of two samples drawn at random from the same
population but administered different treatments. It is a more powerful test than
the median test. The observations must be expressed on at least an ordinal scale,
i.e., on an ordinal, interval or ratio scale, and are then reduced to ranks. It can be
used as an alternative to the t test when parametric assumptions are not satisfied.
a. Small samples (n1 + n2 ≤ 20)
The Mann-Whitney U test for small samples (n1 + n2 ≤ 20) follows the sequence
below.
1. Hypothesis
H0 - there is no difference between the results of the two samples.
H1 - there is a difference between the results of the two samples.
2. Ranking: All values of both the samples are taken together and ranked
from the lowest to the highest. In case two or more values are equal,
average of rank values is given to all. The rank assigned to the values of the
two groups is summed up separately.
3. Computation of U value: the U value is computed separately for each
sample (U1 and U2) by the relationships
i. $U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum R_1$
ii. $U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - \sum R_2$
Where, n1 = number of items in first group.
n2 = number of items in second group.
ΣR1 = sum of ranks in first group.
ΣR2 = sum of ranks in second group.
Alternatively, the value of any one U (U1 or U2) may be computed and then the other
one may be determined by the relationship U1 = (n1 n2) - U2.
4. Test of significance: to test the significance, the smaller of the two values of
U is compared with the critical value of U for the Mann-Whitney test. If the
computed value is equal to or less than the critical value, H0 is rejected. But if it
is greater than the critical value, H0 is accepted.
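The ranking and computation steps can be sketched in Python as below. The sketch assumes the usual relationship U1 = n1 n2 + n1(n1+1)/2 - ΣR1 (and similarly for U2), gives tied values the average of the ranks they occupy, and uses hypothetical sample values; the function names are illustrative only.

```python
def average_ranks(values):
    """Rank values from lowest to highest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (U1, U2) computed from the pooled ranking of both samples."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])  # sum of ranks in first group
    r2 = sum(ranks[n1:])  # sum of ranks in second group
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


# Hypothetical observations for two treatments.
u1, u2 = mann_whitney_u([12, 15, 11, 18], [14, 20, 22, 19, 15])
print(u1, u2, min(u1, u2))
```

As a check on the arithmetic, U1 + U2 always equals n1 n2, which is the relationship used to obtain one value from the other.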
b. Large samples (n>20)
In case the size of the samples is greater than 20, the U distribution can be
approximated by the standard normal distribution. In such cases the decision has
two steps.
1. Determination of Z value: taking the smaller of the two values of U, the value
of Z is determined by the formula
$$Z = \frac{U - E(U)}{\sigma_U}$$
where $E(U) = \frac{n_1 n_2}{2}$ and $\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$. Thus,
$$Z = \frac{U - n_1 n_2 / 2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1) / 12}}$$
2. Test of significance: the observed value of Z is compared with the critical
value at the 5% or 1% level of significance. If the absolute value of the observed
Z is equal to or less than the critical value, the null hypothesis is accepted and the
difference between the two sample results is considered not significant; otherwise,
the null hypothesis is rejected and the alternative hypothesis is accepted.
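For large samples the normal approximation can be sketched as follows, with E(U) = n1 n2 / 2 and the standard deviation of U taken as the square root of n1 n2 (n1 + n2 + 1) / 12. The value of U and the sample sizes are hypothetical; 1.96 is the two-tailed critical value of Z at the 5% level.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation to the Mann-Whitney U statistic."""
    expected_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - expected_u) / sigma_u


# Hypothetical smaller U value from two samples of 22 and 25 observations.
z = mann_whitney_z(u=180, n1=22, n2=25)
significant = abs(z) > 1.96  # two-tailed test at the 5% level
print(round(z, 3), significant)
```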
KRUSKAL-WALLIS TEST
If several independent samples are involved, analysis of variance is the usual
procedure. Failure to meet the assumptions needed for analysis of variance makes
its value doubtful. An alternative technique, called the Kruskal-Wallis one-way
analysis of variance or the H test, was developed for such cases. This test helps in
testing the null hypothesis that k independent random samples come from identical
populations against the alternative hypothesis that the means of these samples are
not all equal.
The test follows the steps below.
• Hypothesis
One of the following two hypotheses is taken:
H0 = the difference between the observations of different samples is not significant.
H1 = the difference between the observations of different samples is significant.
• Ranking of observations
All the elements (observations) of all the samples are pooled together as if they
were a single sample and ranked, the lowest getting rank 1 and so on. In case of a
tie, the tied values share the average of the ranks they occupy.
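• Test statistic
The statistic is given by
$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$$
Where, n = total number of elements of the k samples, n_i = number of elements in
the i-th sample, and R_i = sum of the ranks of the i-th sample.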
• Test of significance
The Kruskal-Wallis test uses the χ² distribution to test the hypothesis. The value
of the χ² distribution for (k-1) degrees of freedom is determined. If the value of H
is equal to or less than χ², the null hypothesis is accepted. In case the value of H
is greater than χ², the null hypothesis is rejected and it is concluded that the test
results of different samples are not similar and the difference is significant.
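The H statistic can be computed with a few lines of Python. The sketch below pools the samples, assigns average ranks to ties, and applies H = 12/(n(n+1)) Σ R_i²/n_i - 3(n+1). The three samples are hypothetical and deliberately simple; 5.991 is the tabulated χ² value for 2 degrees of freedom at the 5% level.

```python
def average_ranks(values):
    """Give each value the average of the 1-based positions it occupies when sorted."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def kruskal_wallis_h(samples):
    """Kruskal-Wallis H for a list of independent samples."""
    pooled = [x for s in samples for x in s]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for s in samples:
        r_i = sum(ranks[start:start + len(s)])  # rank sum of this sample
        total += r_i ** 2 / len(s)
        start += len(s)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical observations
h = kruskal_wallis_h(samples)
CRITICAL_5PCT_DF2 = 5.991  # table value of chi-square, k - 1 = 2 d.f., 5% level
reject_h0 = h > CRITICAL_5PCT_DF2
print(round(h, 3), reject_h0)
```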
CHAPTER – X
STATISTICAL ANALYSIS
INTRODUCTION TO ANOVA
The analysis of variance, frequently referred to by the contraction ANOVA, is
a statistical technique specially designed to determine whether the means of more
than two quantitative populations are equal. The analysis of variance technique,
developed by R. A. Fisher in the 1920s, is capable of fruitful application to a
diversity of practical problems. Basically, it consists of classifying and
cross-classifying statistical results and testing whether the means of a specified
classification differ significantly. In this way it is determined whether the given
classification is important in affecting the results. By cross-classification, for
example, it could be determined whether the mean qualities of the outputs of
various machines differed significantly. Analysis of variance thus enables us to
analyse the total variation of our data into components which may be attributed to
various "sources" or "causes" of variation.
Technique of analysis of variance
For the sake of clarity the technique of analysis of variance has been
discussed separately for (a) one-way classification, and (b) two - way
classification.
ONE-WAY CLASSIFICATION
In one-way classification the data are classified according to only one criterion.
The null hypothesis is that the arithmetic means of the populations from which the
k samples were randomly drawn are equal to one another; the alternative
hypothesis is that all the means are not equal. The steps in carrying out the
analysis are
1. Calculate variance between the samples
The variance between the samples measures the differences between the
sample mean of each group and the overall mean, weighted by the number of
observations in each group. The variance between samples takes into account the
random variations from observation to observation. It also measures the difference
from one group to another. The sum of squares between samples is denoted by
SSC. For calculating the variance between the samples, we take the total of the
squares of the deviations of the means of the various samples from the grand
average and divide this total by the degrees of freedom. Thus the steps in
calculating the variance between samples will be
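a) Calculate the mean of each sample, i.e., $\bar{x}_1, \bar{x}_2$, etc.
b) Calculate the grand average $\bar{\bar{x}}$, pronounced "x double bar". Its value is
obtained as follows
$$\bar{\bar{x}} = \frac{\sum x_1 + \sum x_2 + \sum x_3 + \dots}{n_1 + n_2 + n_3 + \dots}$$
where $\sum x_j$ is the sum of the observations in the j-th sample and $n_j$ is its size.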
c) Take the difference between the means of the various samples and the
grand average.
d) Square these deviations, weight each by the number of observations in its
sample, and obtain the total, which will give the sum of squares between the
samples; and
e) Divide the total obtained in step (d) by the degrees of freedom. The degrees
of freedom will be one less than the number of samples.
2. Calculate variance within the samples
The variance within samples measures those differences within the samples that
are due to chance only. The sum of squares within samples is denoted by SSE. The
variance within samples measures variability around the mean of each group.
Since the variability is not affected by group differences, it can be considered a
measure of the random variation of values within a group. For calculating the
variance within the samples, we take the total of the sum of squares of the
deviations of the various items from the mean values of the respective samples and
divide this total by the degrees of freedom. Thus the steps in calculating the
variance within samples will be
a) Calculate the mean value of each sample.
b) Take the deviations of the various items in a sample from the mean values of
the respective samples.
c) Square these deviations and obtain the total, which gives the sum of squares
within the samples; and
d) Divide the total obtained in step (c) by the degrees of freedom. The degrees of
freedom are obtained by deducting the number of samples from the total number of
items, i.e., (v = N - K), where K refers to the number of samples and N refers to
the total number of all the observations.
3. Calculate the ratio F as follows
$$F = \frac{\text{Between-column variance}}{\text{Within-column variance}}$$
The F distribution measures the ratio of the variance between groups to the
variance within groups. The variance between the sample means is the
numerator and the variance within the samples is the denominator. If
there is no real difference from group to group, any sample difference will be
explained by random variation, and the variance between groups should be
close to the variance within groups. However, if there is a real difference between
the groups, the variance between groups will be significantly larger than the
variance within groups.
4. Compare the calculated value of F with the table value of F for the degrees
of freedom at a certain level of significance. If the calculated value of F is greater
than the table value, it is concluded that the difference in sample means is
significant or, in other words, the samples do not come from the same population.
On the other hand, if the calculated value of F is less than the table value, the
difference is not significant and has arisen due to fluctuations of sampling.
It is customary to summarise the calculations for sums of squares, together with
their numbers of degrees of freedom and mean squares, in a table called the
analysis of variance table, generally abbreviated as the ANOVA table. A specimen
ANOVA table is given below.
Source of         SS (Sum of   v (Degrees of   MS (Mean square)    Variance ratio F
variation         squares)     freedom)
Between samples   SSC          v1 = c-1        MSC = SSC/(c-1)     F = MSC/MSE
Within samples    SSE          v2 = n-c        MSE = SSE/(n-c)
Total             SST          n-1
SST = total sum of squares of variation.
SSC = sum of squares between samples (columns)
SSE = sum of squares with in samples (rows)
MSC = mean sum of squares between samples.
MSE = mean sum of squares within samples.
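The one-way calculations can be sketched in Python as below, using the SSC, SSE, MSC and MSE notation of the table. The function name and the three samples are hypothetical; 5.14 is the tabulated value of F for (2, 6) degrees of freedom at the 5% level.

```python
def one_way_anova(samples):
    """Return the sums of squares, mean squares and F ratio for k samples."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n
    means = [sum(s) / len(s) for s in samples]
    # Between samples: squared deviation of each sample mean from the grand
    # mean, weighted by the number of observations in that sample.
    ssc = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within samples: squared deviation of each item from its own sample mean.
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    msc = ssc / (k - 1)
    mse = sse / (n - k)
    return {"SSC": ssc, "SSE": sse, "MSC": msc, "MSE": mse, "F": msc / mse}


# Hypothetical observations for three samples.
result = one_way_anova([[8, 10, 12], [14, 16, 18], [11, 13, 15]])
CRITICAL_F_5PCT = 5.14  # table value of F for v1 = 2, v2 = 6 at the 5% level
significant = result["F"] > CRITICAL_F_5PCT
print(result, significant)
```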
Coding of Data
While making out an analysis of variance, it should be noted that the final
quantity tested is a ratio and so dimensionless. This means that the original
measurement can be coded to simplify calculations without the need for any
subsequent adjustments of the results.
TWO-WAY CLASSIFICATIONS MODEL
When it is believed that two independent factors might have an effect on the
response variable of interest, it is possible to design the test so that an analysis of
variance can be used to test for the effects of the two factors simultaneously. Such
a test is called a two-factor analysis of variance. With the two-factor analysis of
variance, we can test two sets of hypotheses with the same data at the same time.
In a two-way classification, the data are classified according to two different
factors. The procedure for analysis of variance is somewhat different than the one
followed while dealing with problems of one-way classifications. In a two-way
classification the analysis of variance table takes the following form
Source of          SS (Sum of   Degrees of     MS (Mean square)          Variance ratio F
variation          squares)     freedom
Between columns    SSC          (c-1)          MSC = SSC/(c-1)           MSC/MSE
Between rows       SSR          (r-1)          MSR = SSR/(r-1)           MSR/MSE
Residual or error  SSE          (c-1)(r-1)     MSE = SSE/((r-1)(c-1))
Total              SST          n-1
SSC = sum of squares between columns
SSR = sum of Squares between rows.
SSE = sum of squares due to error.
SST = total sum of squares.
The sum of squares for the sources "Residual" is obtained by subtracting
from the total, sum of squares between columns and rows,
i.e., SSE=SST-[SSC+SSR].
The total number of degrees of freedom = n-1 or cr-1, where c refers to the number
of columns and r refers to the number of rows.
Number of degrees of freedom between columns = (c-1)
Number of degrees of freedom between rows = (r-1)
Number of degrees of freedom for residual = (c-1)(r-1)
The total sum of squares, sum of squares "between columns" and sum of squares
for "between rows" are obtained in the same way as before.
Residual or error sum of square= total sum of squares- sum of squares between
columns- sum of squares between rows.
The F values are calculated as follows
$$F(v_1, v_2) = \frac{MSC}{MSE}$$
Where v1 = (c-1) and v2 = (c-1)(r-1)
$$F(v_1, v_2) = \frac{MSR}{MSE}$$
Where v1 = (r-1) and v2 = (c-1)(r-1)
It should be carefully noted that v1 may not be the same in both cases: in one case
v1 = (c-1) and in the other case v1 = (r-1).
The calculated values of F are compared with the table values. If the calculated
value of F is greater than the table value at the pre-assigned level of significance,
the null hypothesis is rejected; otherwise it is accepted.
It would be clear from above that in problems involving two-way classification,
"Residual", is the measuring rod for testing significance. It represents the
magnitude of variation due to forces called "chance".
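A minimal Python sketch of the two-way calculation, with one observation per cell, is given below. It obtains the residual sum of squares by subtraction, SSE = SST - (SSC + SSR), and forms the two F ratios against MSE. The function name and the 3x3 table of observations are hypothetical; 6.94 is the tabulated value of F for (2, 4) degrees of freedom at the 5% level.

```python
def two_way_anova(table):
    """table[i][j] is the single observation for row i and column j."""
    r, c = len(table), len(table[0])
    n = r * c
    grand_mean = sum(sum(row) for row in table) / n
    row_means = [sum(row) / c for row in table]
    col_means = [sum(col) / r for col in zip(*table)]
    sst = sum((x - grand_mean) ** 2 for row in table for x in row)
    ssr = c * sum((m - grand_mean) ** 2 for m in row_means)
    ssc = r * sum((m - grand_mean) ** 2 for m in col_means)
    sse = sst - (ssc + ssr)  # residual obtained by subtraction
    mse = sse / ((r - 1) * (c - 1))
    return {
        "SSC": ssc, "SSR": ssr, "SSE": sse, "SST": sst,
        "F_columns": (ssc / (c - 1)) / mse,  # v1 = c-1, v2 = (c-1)(r-1)
        "F_rows": (ssr / (r - 1)) / mse,     # v1 = r-1, v2 = (c-1)(r-1)
    }


# Hypothetical observations: 3 rows by 3 columns.
result = two_way_anova([[10, 12, 14], [11, 14, 17], [15, 16, 20]])
CRITICAL_F_5PCT = 6.94  # table value of F for v1 = 2, v2 = 4 at the 5% level
print(result, result["F_rows"] > CRITICAL_F_5PCT)
```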
MULTIVARIATE ANALYSIS OF DATA
INTRODUCTION
Variables can be analyzed in three different ways, as given below.
Univariate: when a single variable is analysed alone, e.g., a sample statistic
such as the "mean", which might refer to the average consumption of a particular
kind of food or the age of a certain group of students or people, it is known as
univariate analysis.
Bivariate: when some association is measured between two variables
simultaneously, e.g., cross-classification of age group and cross-consumption of
two products, it is known as bivariate analysis.
Multivariate analysis: it is a logical extension of the univariate analysis.
Two or more independent variables form the basis for estimating the values of
dependent variables. The statistical technique used in multivariate analysis is
called a multivariate technique.
VARIOUS MULTIVARIATE TECHNIQUES
FUNCTIONAL MULTIVARIATE TECHNIQUES
These techniques help the researcher to develop predictive methods, where
one tries to relate two data sets consisting of dependent and independent
variables. On the basis of the specific means used in measuring the variables, the
different dependence methods can be further classified as follows.
CORRELATION AND REGRESSION ANALYSIS
Correlation analysis tries to measure the magnitude and direction of
relationship between two variables. Multiple and partial correlation analysis
extends the same notion between a single variable and a set of variables.
Regression analysis is used to determine the functional relationship between a
dependent variable and a host of predictors.
DISCRIMINANT ANALYSIS
It involves a situation where the relationship between a categorical
dependent variable and a number of independent variables is derived.
ANALYSIS OF VARIANCE
This is used to analyse experimental data when there is a metric
dependent measure and a set of experimental independent variables which may be
measured even on nominal or ordinal scales.
All the above-mentioned analyses are extended to the multivariate level when
there is more than one dependent variable. The statistical formulae for estimation
of parameters will, however, change.
STRUCTURAL MULTIVARIATE TECHNIQUES
Techniques which help in reducing large and complex data into meaningful groups
are known as structural multivariate techniques.
In some cases there is no prior basis for distinguishing between criterion and
predictor variables. But the researcher may be interested in analyzing their
interdependence or underlying relationship. Techniques belonging to
interdependence methods applicable in marketing situations are as follows
Factor analysis: here a few basic dimensions are extracted by examining the
correlations among the variables. The information from a large number of
interrelated variables is summarized into a few factors.
Cluster analysis: it tries to group a set of objects into smaller sets based on
the similarities of their profiles.
Conjoint analysis: it is a set of techniques used to derive a set of weights
for some discrete levels of variables which are selected beforehand to study
consumers' choice decisions.
Characteristic features of multivariate analysis
With the availability of a wide range of useful data and knowledge of statistical
techniques and their applications, multivariate techniques have become popular.
Basic characteristics of these techniques are
1. More than two variables: in multivariate analysis of data the observations are
related to more than two variables, where one or more of these are dependent
variables and the rest are explanatory or independent variables.
2. Simultaneous consideration of relationships existing: the relationship or
interdependence existing among all the variables under observation is considered
at the same time, rather than separately for each set of variables.
3. Composite view: multivariate analysis transforms the mass of observations
into a small number of composite scores and presents a composite view of the
mass of data. This facilitates their interpretation and use in decision making.
4. Requires a wide range of observations: for obtaining objective results in
multivariate analysis, a wide range of observations is required.
5. Useful for long-range planning: the results of multivariate analysis are useful
for long-range planning, as they provide a composite view of the situation, rather
than a compartmentalized view, since a large number of variables are considered.
Limitations of multivariate techniques
Multivariate techniques have great potential in research. Analysis of data is
possible with their use in greater depth. Sometimes, the results are highly revealing
and provide right direction to the research. However, in spite of all the beneficial
and positive aspects of multivariate techniques, their use is still not wide-spread
due to their following limitations.
• Laborious computations are involved in multivariate analysis. The advent
of high-speed computers and software packages has greatly facilitated their
use. By enlarging the scope of their application considerably, it has made it
possible to use multivariate data in research. However, this still remains out
of reach of most researchers in developing countries.
• Cost of analysis is another limitation. This technique may not be cost
effective in all cases. An individual researcher has to depend upon his own
resources.
• All these techniques are data-specific and result-specific. Without expert
knowledge, the validity of results cannot be ensured. The utility of the results
may be much less in many cases as compared to the effort needed.
CORRELATION
INTRODUCTION
Correlation measures the nature and extent of proximity in the relationship
of two or more variables in a field of inquiry, where a cause and effect relationship
exists among them. This provides a very useful basis for statistical analysis of
data. Correlation analysis is very useful in research. It provides a useful basis in
case of multivariate analysis.
For correlation analysis, the variables considered are classified into two
broad groups, viz., dependent variables and independent variables. The variables
whose value depends on some other variable are known as dependent variables,
and the variables influencing the behaviour of other variables are known as
regressors, explanatory variables or independent variables.
On the basis of the number of variables considered, correlation analysis can
be termed as simple or multiple.
Simple correlation
An analysis wherein the relationship existing between two variables, one
dependent and the other independent, is examined is known as simple correlation
analysis. The statistical measure of simple correlation is known as the "co-efficient
of linear correlation", with symbol 'r'.
The co-efficient of correlation can be either a positive or a negative value,
depending upon the direction of change in one variable with the change in the
other variable. If the direction of change is similar, it signifies positive correlation.
But if it is in the reverse direction, it signifies negative correlation.
The value of the co-efficient of correlation ranges from minus one
to plus one. Simple correlation measures the strength and type of relationship
between two variables on the assumption that no other variable comes into play
and, as such, none is to be taken into account. It is therefore also called the 'zero
order correlation co-efficient'.
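The text does not spell out how r is computed. The sketch below assumes the usual product-moment formula, in which the sum of the products of the deviations of x and y from their means is divided by the square root of the product of their sums of squared deviations. The function name and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Co-efficient of linear correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]  # hypothetical independent variable
y = [2, 4, 5, 4, 5]  # hypothetical dependent variable
r = pearson_r(x, y)
r_reverse = pearson_r(x, [-v for v in x])  # change in exactly the reverse direction
print(round(r, 4), round(r_reverse, 4))
```

The first value is positive because the two series tend to move in the same direction; the second series is constructed to move in exactly the reverse direction and so gives the lower limit of the range.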
Linear and non-linear correlation
The distinction between linear and non-linear correlation is based on the
ratio of change between variables. If the degree of change in one variable tends to
bear a constant ratio to the degree of change in the other variable, then the
correlation is said to be linear. However, in case the degree of change in one
variable does not bear a constant ratio to the degree of change in the other variable,
it is called non-linear correlation.
Assumption of correlation analysis
• Cause and effect relationship exists between the variables, such that one is an
independent (cause) variable, and the other is a dependent (effect) variable.
• The relationship between the variables is linear.
• The variables under observation are so influenced in a large number of
cases that they form a normal distribution.
Testing the significance of correlation
The co-efficient of simple correlation, though a measure of great value, provides
only a broad indication of the nature and extent of the relationship existing between
two variables. Sometimes, it may give an illusory and false indication. The size of
the sample also influences this.
For testing the significance of simple correlation or zero order correlation, the
standard error is determined. The standard error of simple correlation is
$$S.E._r = \frac{1 - r^2}{\sqrt{n}}$$
Where, n indicates the number of observations in the sample.
PARTIAL CORRELATION
In case it is suspected that the relationship between two variables, say x and
y, is being distorted due to their relationship with some other variables, then partial
correlation is determined between these two variables after the effect of one or
more other distracting variables, if any, has been eliminated.
In multivariate analysis, partial correlation analysis proves useful in
eliminating the distracting influence of variables and in judging the depth of
relationship between each set of variables. Partial correlation provides the basis for
determining the multiple correlation.
The measure of partial correlation is known as co-efficient of partial
correlation of the first, second and third order, and so on, depending upon how
many disturbing variables have been taken into account.
Determination of partial correlation is essential to understand the cause and
effect relationship between the variables under observation.
PARTIAL CORRELATION IN CASE OF THREE VARIABLES
In a study of three variables, the correlation between any two variables,
eliminating the effect of the third variable, can be measured as
a) between y and x2 keeping x1 constant
$$r_{yx_2.x_1} = \frac{r_{yx_2} - (r_{yx_1})(r_{x_1x_2})}{\sqrt{(1 - r_{yx_1}^2)(1 - r_{x_1x_2}^2)}}$$
Where, ryx1 = simple correlation between y and x1
ryx2 = simple correlation between y and x2
rx1x2 = simple correlation between x1 and x2
It may be noted in the above that the two variables under observation are identified
by the subscripts before the dot, and the variable held constant is identified by the
subscript after the dot.
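The first order partial correlation between y and x2, keeping x1 constant, is the difference between r(y, x2) and the product r(y, x1) r(x1, x2), divided by the square root of (1 - r(y, x1)²)(1 - r(x1, x2)²). A small Python sketch of that calculation follows; the function name and the three simple correlation values are hypothetical.

```python
import math


def partial_r(r_y2, r_y1, r_12):
    """First order partial correlation between y and x2, holding x1 constant."""
    numerator = r_y2 - r_y1 * r_12
    denominator = math.sqrt((1 - r_y1 ** 2) * (1 - r_12 ** 2))
    return numerator / denominator


# Hypothetical simple (zero order) correlations.
r = partial_r(r_y2=0.6, r_y1=0.5, r_12=0.4)
print(round(r, 4))
```

When x1 is uncorrelated with both of the other variables, the partial co-efficient reduces to the simple co-efficient, which gives a quick check on the function.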
PARTIAL CORRELATION IN CASE OF FOUR VARIABLES
In case of four variables, one dependent variable and three independent
variables, there can be twelve possible first order co-efficients of partial
correlation. These will be determined as shown hereunder
i) $$r_{yx_1.x_2} = \frac{r_{yx_1} - (r_{yx_2})(r_{x_2x_1})}{\sqrt{(1 - r_{yx_2}^2)(1 - r_{x_2x_1}^2)}}$$
ii) $$r_{yx_3.x_1} = \frac{r_{yx_3} - (r_{yx_1})(r_{x_1x_3})}{\sqrt{(1 - r_{yx_1}^2)(1 - r_{x_1x_3}^2)}}$$
Second order co-efficients of partial correlation may be determined from the
first order co-efficients as under
(i) Between y and x1 eliminating x2 and x3
$$r_{yx_1.x_2x_3} = \frac{r_{yx_1.x_2} - (r_{yx_3.x_2})(r_{x_1x_3.x_2})}{\sqrt{(1 - r_{yx_3.x_2}^2)(1 - r_{x_1x_3.x_2}^2)}}$$
(ii) Between y and x2 eliminating x1 and x3
$$r_{yx_2.x_1x_3} = \frac{r_{yx_2.x_3} - (r_{yx_1.x_3})(r_{x_2x_1.x_3})}{\sqrt{(1 - r_{yx_1.x_3}^2)(1 - r_{x_2x_1.x_3}^2)}}$$
(iii) Between y and x3 eliminating x1 and x2
$$r_{yx_3.x_1x_2} = \frac{r_{yx_3.x_2} - (r_{yx_1.x_2})(r_{x_3x_1.x_2})}{\sqrt{(1 - r_{yx_1.x_2}^2)(1 - r_{x_3x_1.x_2}^2)}}$$
The main object of determining the partial co-efficient is to eliminate the
disturbance of other variables while determining the correlation between two
variables. However, whether removing the effect of other variables will strengthen
or weaken the correlation between two variables will depend upon the nature and
extent of correlation with the disturbing variables and cannot be foretold by a
general rule.
Significance test for partial correlation
Standard error of co-efficient of partial correlation is used to test the
significance of partial correlation. The standard error of co-efficient of partial
correlation is determined as under.
$$\text{S.E.} = \frac{1 - r^2}{\sqrt{n - (3 + \text{NVE})}}$$
Where n = number of pairs of observations of the variables considered.
NVE = number of variables eliminated.
MULTIPLE CORRELATION
Co-efficient of multiple correlation measures the nature and extent of
proximity in the relationship between one dependent variable and two or more
independent or explanatory variables. The statistical measure of such a relationship
is known as co-efficient of multiple correlation, with symbol R.
Co-efficient of multiple correlation as a measure has a large number of uses.
Important ones among them are
• It provides a measure of correlation between one dependent variable and a large
number of independent variables.
• The co-efficient of determination provides a measure of the change in the dependent
variable explained by the change in the independent variables.
• In multi-variable analysis, the co-efficient of multiple correlation is a basic
measure useful in the identification of variables for the model.
Where y is the dependent variable and x1 and x2 are regressors, the co-efficient of multiple
correlation, indicated as $R_{y \cdot x_1x_2}$ (or $R_{y \cdot 12}$), is calculated as
$$R_{y \cdot 12} = \sqrt{\frac{r_{y1}^2 + r_{y2}^2 - 2\, r_{y1}\, r_{y2}\, r_{12}}{1 - r_{12}^2}}$$
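As a rough numerical check of the multiple correlation formula for two regressors, the sketch below evaluates it in Python. The function name `multiple_corr` and the input correlations are assumptions made for illustration only.

```python
import math

def multiple_corr(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of y on x1 and x2, from the three simple correlations."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

# Hypothetical simple correlations (assumed values).
R_y12 = multiple_corr(r_y1=0.8, r_y2=0.6, r_12=0.5)
print(round(R_y12, 4))
```

For these values R is slightly larger than the larger of the two simple correlations with y, which is what one would expect when a second regressor is added.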
Determination of correlation by least square method
Correlation among variables can alternatively be determined by the least square
method. Under the least square method, the co-efficient of correlation is
determined as under
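$$R^2 = \frac{a\sum y + b\sum xy - n\bar{y}^2}{\sum y^2 - n\bar{y}^2}$$
or
$$R^2 = \frac{a\sum x + b\sum xy - n\bar{x}^2}{\sum x^2 - n\bar{x}^2}$$
where a and b are the constants of the fitted regression line (of y on x in the first form and of x on y in the second).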
The co-efficient of correlation can also be determined by first determining the
unexplained and total variance of regression. In this case the co-efficient of
correlation will be determined as
$$R^2 = 1 - \frac{\text{unexplained variance of regression}}{\text{total variance of regression}}$$
The co-efficient of correlation can also be determined as
$$r = \sqrt{1 - \frac{\sum (y - y_c)^2 / n}{\sum (y - \bar{y})^2 / n}}$$
or
$$r = \sqrt{1 - \frac{\sum (y - y_c)^2}{\sum (y - \bar{y})^2}}$$
Where, $y_c$ = expected value of Y for the given values of X.
$\bar{y}_c$ = mean of expected values of Y (equal to $\bar{y}$ for a line fitted by least squares).
$\bar{y}$ = mean of Y
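The least square forms above should agree with one another. The sketch below, written in Python on a small set of hypothetical observations, fits the line Y = a + bX and then computes R squared twice: once from a, b and the raw sums, and once as one minus the ratio of unexplained to total variation. All variable names and data are illustrative assumptions.

```python
# Hypothetical paired observations, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)
mean_y = sum_y / n

# Constants of the least squares line Y = a + bX.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = mean_y - b * (sum_x / n)

# Form 1: R^2 from a, b and the raw sums.
r2_sums = (a * sum_y + b * sum_xy - n * mean_y ** 2) / (sum_y2 - n * mean_y ** 2)

# Form 2: R^2 = 1 - unexplained variation / total variation.
Yc = [a + b * x for x in X]  # expected values of Y
unexplained = sum((y - yc) ** 2 for y, yc in zip(Y, Yc))
total = sum((y - mean_y) ** 2 for y in Y)
r2_var = 1 - unexplained / total

print(round(r2_sums, 4), round(r2_var, 4))
```

Dividing both variations by n, as in the first of the two square-root forms, leaves the ratio unchanged, which is why the two forms give the same value.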
KARL PEARSON'S COEFFICIENT OF CORRELATION
The Karl Pearson's method is popularly known as Pearson's co-efficient of
correlation. The Pearson coefficient of correlation is denoted by the symbol 'r'. The
formula for computing Pearsonian 'r' is
$$r = \frac{\sum xy}{N \sigma_x \sigma_y}$$
Where, $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$
$\sigma_x$ = S.D. of series x
$\sigma_y$ = S.D. of series y
N = number of pairs of observation.
This method is applied only where deviations of items are taken from actual mean
and not from assumed mean.
Direct method of finding out correlation co-efficient
Correlation co-efficient can also be calculated without taking deviations of
items either from actual mean or assumed mean. The formula is
$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{N\sum X^2 - (\sum X)^2}\,\sqrt{N\sum Y^2 - (\sum Y)^2}}$$
When deviations are taken from an assumed mean
When actual means are in fractions, the calculation of correlation by the above
methods would take a lot of time. In such a case we make use of the assumed mean
method for finding out correlation. When deviations are taken from an assumed
mean the following formula is applicable
$$r = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{\sqrt{N\sum d_x^2 - (\sum d_x)^2}\,\sqrt{N\sum d_y^2 - (\sum d_y)^2}}$$
Where,
$d_x$ refers to deviations of x series from an assumed mean.
$d_y$ refers to deviations of y series from an assumed mean.
$\sum d_x d_y$ = sum of the products of the deviations of x and y series from their assumed means.
$\sum d_x^2$ = sum of the squares of the deviations of x series from an assumed mean.
$\sum d_y^2$ = sum of the squares of the deviations of y series from an assumed mean.
$\sum d_x$ = sum of deviations of x series from an assumed mean.
$\sum d_y$ = sum of deviations of y series from an assumed mean.
Correlation of grouped data
When the number of observations is large, the data are often classified into
two-way frequency distribution called a correlation table.
The class intervals for y are listed in the captions or column headings and
those for x are listed in the stubs at the left of the table. The order can also be reversed.
The frequencies for each cell of the table are determined by either tallying or card
sorting just as in the case of frequency distribution of a single variable.
The formula is
$$r = \frac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{N\sum f d_x^2 - (\sum f d_x)^2}\,\sqrt{N\sum f d_y^2 - (\sum f d_y)^2}}$$
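For ungrouped data, the three ways of computing Pearson's r described in this section (deviations from actual means, the direct method, and deviations from assumed means) should give the same value. The Python sketch below checks this on hypothetical data; the function names and the assumed means (4 for X and 2 for Y) are arbitrary choices made for illustration.

```python
import math

# Hypothetical paired observations, for illustration only.
X = [2, 4, 6, 8, 10]
Y = [1, 3, 2, 5, 4]

def r_actual_mean(X, Y):
    """r = sum(xy) / (N * sd_x * sd_y), with deviations taken from actual means."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    x = [v - mx for v in X]
    y = [v - my for v in Y]
    sd_x = math.sqrt(sum(v * v for v in x) / n)
    sd_y = math.sqrt(sum(v * v for v in y) / n)
    return sum(p * q for p, q in zip(x, y)) / (n * sd_x * sd_y)

def r_direct(X, Y):
    """Direct method: no deviations are taken at all."""
    n = len(X)
    sx, sy = sum(X), sum(Y)
    sxy = sum(p * q for p, q in zip(X, Y))
    sx2 = sum(p * p for p in X)
    sy2 = sum(q * q for q in Y)
    return (n * sxy - sx * sy) / (
        math.sqrt(n * sx2 - sx ** 2) * math.sqrt(n * sy2 - sy ** 2)
    )

def r_assumed_mean(X, Y, ax, ay):
    """Assumed mean method: the direct formula applied to the deviations dx, dy."""
    dx = [v - ax for v in X]
    dy = [v - ay for v in Y]
    return r_direct(dx, dy)

r1 = r_actual_mean(X, Y)
r2 = r_direct(X, Y)
r3 = r_assumed_mean(X, Y, ax=4, ay=2)
print(round(r1, 4), round(r2, 4), round(r3, 4))
```

That the assumed mean version can reuse the direct formula unchanged reflects the fact that r is independent of a change of origin.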
Assumption of the Pearson's co-efficient
Karl Pearson's co-efficient of correlation is based on the following assumptions
(1) There is a linear relationship between the variables.
(2) The two variables under study are affected by a large number of independent
causes so as to form a normal distribution.
(3) There is a cause and effect relationship between the forces affecting
the distribution of the items in the two series.
Merits and limitations of the Pearsonian coefficient
Amongst the mathematical methods used for measuring the degree of relationship,
Karl Pearson's method is most popular. The correlation co-efficient summarizes in
one figure not only the degree of correlation but also the direction.
The limitations are
• The correlation co-efficient always assumes a linear relationship regardless of the
fact whether that assumption is correct or not.
• Great care must be exercised in interpreting the value of this co-efficient as very often
the co-efficient is misinterpreted. The value of the co-efficient is unduly
affected by the extreme items.
• As compared with other methods this method takes more time to compute
the value of the correlation co-efficient.
Properties of the co-efficient of correlation
The following are the important properties of the correlation co-efficient, r
1. The co-efficient of correlation lies between -1 and +1; symbolically, $-1 \le r \le +1$ or $|r| \le 1$.
2. The co-efficient of correlation is independent of change of scale and origin of
the variables X and Y.
3. The co-efficient of correlation is the geometric mean of the two regression co-efficients.
4. The degree of relationship between the two variables is symmetric.
RANK CORRELATION CO-EFFICIENT
It is possible to avoid making any assumptions about the populations being
studied by ranking the observations according to size and basing the calculations
on the ranks rather than upon the original observations. It does not matter which
way the items are ranked. Item number one may be the largest or it may be the smallest.
Using ranks rather than actual observations gives the co-efficient of rank
correlation. Spearman's rank correlation co-efficient is defined as
$$R = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$
or
$$R = 1 - \frac{6\sum D^2}{n^3 - n}$$
Where, 'R' denotes rank co-efficient of correlation and 'D' refers to the difference
of rank between paired items in two series.
FEATURES OF SPEARMAN'S CORRELATION CO-EFFICIENT
• The sum of the differences of ranks between two variables shall be zero.
• Spearman's correlation co-efficient is distribution free or non-parametric because
no strict assumptions are made about the form of population from which sample
observations are drawn.
• Spearman's correlation co-efficient is nothing but Karl Pearson's correlation
co-efficient between the ranks.
In rank correlation we may have two types
• When ranks are given.
• When ranks are not given.
Where ranks are given
When actual ranks are given to us the steps required for computing rank correlation
are
(i) take the difference of the two ranks i.e., (R1-R2) and denote these differences
by D.
(ii) square these differences and obtain the total $\sum D^2$.
(iii) apply the formula $R = 1 - \frac{6\sum D^2}{n^3 - n}$
Where ranks are not given
When we are given the actual data and not the ranks, it will be necessary to
assign the ranks. Ranks can be assigned by taking either the highest value as 1 or the
lowest value as 1, but whether we start with the lowest value or the highest value
we must follow the same method in case of both the variables.
Equal Ranks
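In some cases it may be found necessary to rank two or more individuals or
entries as equal. In such a case it is customary to give each individual an average
rank. An adjustment is then made in the formula by adding $\frac{1}{12}(m^3 - m)$ to $\sum D^2$,
where m is the number of items sharing a common rank. If there is more than one
group of items with a common rank, this value is
added as many times as the number of such groups. The formula can thus be written
$$R = 1 - \frac{6\{\sum D^2 + \frac{1}{12}(m^3 - m) + \frac{1}{12}(m^3 - m) + \dots\}}{N^3 - N}$$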
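The ranking steps and the adjustment for equal ranks can be sketched as follows. This is an illustrative Python outline only: the helper names `average_ranks`, `tie_correction` and `spearman` are hypothetical, the data are invented, and the highest value is taken as rank 1 in both series.

```python
from collections import Counter

def average_ranks(values):
    """Rank values with the largest as rank 1; equal values share the average rank."""
    ordered = sorted(values, reverse=True)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)  # first position occupied by each value
    counts = Counter(values)
    # A group of m equal values occupies m consecutive positions; take their mean.
    return [first_pos[v] + (counts[v] - 1) / 2 for v in values]

def tie_correction(values):
    """Sum of (m^3 - m)/12 over every group of m equal values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman(x, y):
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)

# Hypothetical scores; the two 60s in the first series receive the average rank 3.5.
marks_a = [50, 60, 60, 70, 80]
marks_b = [30, 40, 50, 60, 70]
R_tied = spearman(marks_a, marks_b)

# With no equal ranks the correction term is zero and the plain formula applies.
R_reversed = spearman([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(round(R_tied, 4), round(R_reversed, 4))
```

In the first example the sum of the squared rank differences is 0.5 and the single tied pair adds a further 0.5, so the adjusted co-efficient is lower than the unadjusted one would be.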
When to use rank correlation co-efficient?
The rank method has two principal uses
• The initial data are in the form of ranks.
• If N is fairly small (say, not more than 25 or 30). Rank method is sometimes
applied to interval data as an approximation to the more time-consuming r.
REGRESSION ANALYSIS
INTRODUCTION
Regression is a technique for approximating the relationship existing among
variables and using the same for estimating or predicting the value of one of these
variables on the basis of observed values of other related variable.
As a statistical measure 'regression' is based on 'correlation' i.e., existence of cause
and effect relationship among the variables under study. Regression analysis
provides a useful basis for estimation and forecasting the value of dependent
variable from the given values of one or more independent variables.
Regression is the measure of the average relationship between two or more
variables in terms of the original units of the data.
Uses of regression analysis
Regression analysis is a branch of statistical theory that is widely used in
almost all the scientific disciplines. In economics, it is the basic technique for
measuring or estimating the relationship among economic variables that constitute
the essence of economic theory and economic life. The regression analysis
attempts to accomplish the following.
1. Regression analysis provides estimates of values of the dependent variables
from values of the independent variables. The device used to accomplish this
estimation procedure is the regression line.
2. A second goal of regression analysis is to obtain a measure of the error
involved in using the regression line as a basis for estimation. For this
purpose the standard error of estimate is calculated.
3. With the help of regression co-efficients we can calculate the correlation
co-efficient. The square of the correlation co-efficient r, called the co-efficient of
determination, measures the degree of association or correlation that exists
between two variables.
Difference between correlation and regression analysis
1. Whereas the correlation co-efficient is a measure of the degree of co-variability between X and
Y, the objective of regression analysis is to study the nature of relationship
between the variables so that we may be able to predict the value of one on the
basis of another.
2. Correlation is merely a tool of ascertaining the degree of relationship between
two variables. In regression analysis one variable is taken as dependent while the
other as independent. In correlation analysis there is a measure of direction and
degree of relationship between two variables. In regression analysis the regression
co-efficient is not symmetric and hence it definitely makes a difference as to which
variable is dependent and which is independent.
3. There may be non-sense correlation between two variables which is purely
due to chance and has no practical relevance such as increase in income and
increase in weight of a group of people.
4. The correlation co-efficient is independent of change of scale and origin.
Regression co-efficients are independent of change of origin but not of
scale.
REGRESSION EQUATIONS
Regression equations, also known as estimating equations, are algebraic
expressions of the regression lines. Since there are two regression lines, there are
two regression equations.
Regression equation of Y on X
The regression equation of Y on X is expressed as follows
Y= a+bX
It may be noted that in this equation 'Y' is the dependent variable and 'X' is the independent
variable.
'a' and 'b' in the equation are called numerical constants because for any given
straight line, their value does not change.
If the values of the constants 'a' and 'b' are obtained, the line is completely
determined.
A straight line fitted by least squares has the following characteristics
1. It gives the best fit to the data in the sense that it makes the sum of the
squared deviations from the line, $\sum (Y - Y_c)^2$, smaller than it would be from any
other straight line. This property accounts for the name "Least Squares".
2. The deviations above the line equal those below the line on the average. This
means that the total of the positive and negative deviations is zero, or $\sum (Y - Y_c) = 0$.
3. The straight line goes through the overall mean of the data $(\bar{X}, \bar{Y})$.
4. When the data represent a sample from a large population the least squares
line is a best estimate of the population regression line.
To determine the values of a and b, the following two equations are solved simultaneously
$$\sum Y = Na + b\sum X$$
$$\sum XY = a\sum X + b\sum X^2$$
These equations are usually called the normal equations.
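The two normal equations for the regression of Y on X can be solved for a and b by elimination. The following Python sketch does this on hypothetical data and then checks the second and third characteristics listed above (deviations sum to zero, and the line passes through the means). Variable names and data are assumptions for illustration.

```python
# Hypothetical paired observations, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 3, 5, 4, 6]
N = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

# Normal equations:
#   sum_y  = N * a     + b * sum_x
#   sum_xy = a * sum_x + b * sum_x2
# Solved simultaneously (Cramer's rule on the 2 x 2 system).
det = N * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / det
b = (N * sum_xy - sum_x * sum_y) / det

Yc = [a + b * x for x in X]                          # computed values on the line
sum_dev = sum(y - yc for y, yc in zip(Y, Yc))        # should be zero
at_mean = a + b * (sum_x / N)                        # should equal the mean of Y

print(round(a, 4), round(b, 4), round(sum_dev, 10), round(at_mean, 4))
```

The regression of X on Y is obtained in the same way by interchanging the roles of the two series.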
Regression equation of X on Y
The regression equation of X on Y is expressed as follows.
$$X_c = a + bY$$
To determine the values of a and b, the following two normal equations are to be
solved simultaneously
$$\sum X = Na + b\sum Y$$
$$\sum XY = a\sum Y + b\sum Y^2$$
Deviations taken from arithmetic means of X and Y
The above method of finding out the regression equations is tedious. The calculations
can be much simplified if, instead of dealing with the actual values of X and Y,
we take the deviations of the X and Y series from their respective means. In such a
case the two regression equations are written as follows
(i) Regression equation of X on Y
$$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$$
$\bar{X}$ is the mean of the X series.
$\bar{Y}$ is the mean of the Y series.
$r\frac{\sigma_x}{\sigma_y}$ is known as the regression co-efficient of X on Y.
The regression co-efficient of X on Y is denoted by the symbol $b_{xy}$ or $b_1$. It
measures the change in X corresponding to a unit change in Y. When deviations are
taken from actual means, it can be obtained as follows.
$$b_{xy} = r\frac{\sigma_x}{\sigma_y} = \frac{\sum xy}{\sum y^2}$$
(ii) Regression equation of Y on X
$$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$$
$r\frac{\sigma_y}{\sigma_x}$ is the regression co-efficient of Y on X, denoted by $b_{yx}$;
it measures the change in Y corresponding to a unit change in X. When deviations
are taken from actual means, the
regression co-efficient of Y on X can be obtained as follows.
$$b_{yx} = r\frac{\sigma_y}{\sigma_x} = \frac{\sum xy}{\sum x^2}$$
where $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$.
Deviations taken from assumed means
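When the actual means of the X and Y variables are in fractions, the calculations can be
simplified by taking the deviations from assumed means. When deviations are
taken from assumed means, the entire procedure of finding the regression equations
remains the same. The regression equation of X on Y is
$$X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$$
The value of $r\frac{\sigma_x}{\sigma_y}$ will now be obtained as follows
$$r\frac{\sigma_x}{\sigma_y} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_y^2 - (\sum d_y)^2}$$
where $d_x = (X - A)$ and $d_y = (Y - A)$, A being the assumed mean of the respective series.
Similarly, the regression equation of Y on X is
$$Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$$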
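The deviation methods can be compared numerically. The sketch below, in Python with hypothetical data, computes both regression co-efficients from deviations about the actual means, recomputes the co-efficient of X on Y from deviations about assumed means (5 for X and 7 for Y, chosen arbitrarily), and recovers r as the geometric mean of the two regression co-efficients. All names are illustrative.

```python
import math

# Hypothetical paired observations, for illustration only.
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]
N = len(X)
mean_x, mean_y = sum(X) / N, sum(Y) / N

# Deviations from the actual means.
x = [v - mean_x for v in X]
y = [v - mean_y for v in Y]
sum_xy = sum(p * q for p, q in zip(x, y))
b_xy = sum_xy / sum(q * q for q in y)  # regression co-efficient of X on Y
b_yx = sum_xy / sum(p * p for p in x)  # regression co-efficient of Y on X

# Same co-efficient of X on Y from deviations about assumed means.
dx = [v - 5 for v in X]
dy = [v - 7 for v in Y]
b_xy_assumed = (N * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)) / (
    N * sum(q * q for q in dy) - sum(dy) ** 2
)

# r is the geometric mean of the two co-efficients and carries their common sign.
r = math.copysign(math.sqrt(b_xy * b_yx), b_xy)

# Intercepts of the two regression lines written in the form a + b * (other variable).
a_y_on_x = mean_y - b_yx * mean_x  # Y = a_y_on_x + b_yx * X
a_x_on_y = mean_x - b_xy * mean_y  # X = a_x_on_y + b_xy * Y

print(round(b_xy, 4), round(b_yx, 4), round(b_xy_assumed, 4), round(r, 4))
```

Both regression co-efficients have the same sign, and their product cannot exceed 1, since it equals the square of r.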
Limitations of Regression Analysis
In making an estimate from a regression equation, it is important to
remember that an assumption is being made that relationship has not changed since
the regression equation was computed. Another point worth remembering is that
the relationship shown by the scatter diagram may not be the same if the equation
is extended beyond the values used in computing the equation. For example, there
may be a close linear relationship between the yield of a crop and the amount of
fertilizer applied, with the yield increasing as the amount of fertilizer is increased.
It would not be logical, however, to extend this equation beyond the limits of the
experiment for it is quite likely that if the amount of fertilizer were increased
indefinitely, the yield would eventually decline as too much fertilizer was applied.
CHAPTER – XI
REPORT WRITING
SIGNIFICANCE OF A RESEARCH REPORT
Report writing marks the final stage of a research study. After the collected
data have been analyzed and interpreted, and various generalizations have been
drawn, the report has to be prepared. The report of a survey is thus the statement
that contains in brief the procedure adopted and the findings arrived at by the
investigator of a research problem. It is not a complete description of what has
been done during the period of survey. Reporting may appear simple, but it
requires considerable thought, skill, effort, patience and presentation. This unit
discusses the purpose of research reports, types of reports, principles of writing,
drafting and finalizing the report and evaluation of a research report.
Research report is considered a major component of the research study for
the research task remains incomplete till the report has been presented and / or
written. As a matter of fact even the most brilliant hypothesis, the most well designed
and conducted research study, and the most striking generalizations and findings are
of little value unless they are effectively communicated to others. The purpose of
research is not well served unless the findings are made known to others. Research
results must invariably enter the general store of knowledge. All this explains the
significance of writing a research report. There are people who do not consider
report writing an integral part of the research process. But the general opinion is in favor of treating the
presentation of research results or the writing of report as part and parcel of the
research project. Writing of report is the last step in a research study and requires a
set of skills somewhat different from those called for in respect of the earlier stages
of research. This task should be accomplished by the researcher with utmost care;
he may seek the assistance and guidance of experts for the purpose.
STEPS IN REPORT WRITING
Research reports are the product of slow, painstaking, accurate inductive
work. The usual steps involved in report writing are
1. logical analysis of the subject matter
2. preparation of the final outline
3. preparation of the rough draft
4. rewriting and polishing
5. preparation of the final bibliography
6. writing the final draft.
Though all the steps are self explanatory, yet a brief mention of each one of these
will be appropriate for better understanding.
1. Logical Analysis of the subject matter; It is the first step which is primarily
concerned with the development of a subject. There are two ways in which to
develop a subject: a) logically; and b) chronologically. The logical development is made on the basis
of mental connections and associations between one thing and another by means of
analysis. Logical treatment often consists in developing the material from simple
propositions to the most complex structures. Chronological development is based
on a connection or sequence in time or occurrence. The directions for doing or
making something usually follow the chronological order.
2. Preparation of the final outline; It is the next step in writing the research report.
"Outlines are the framework upon which long written works are constructed. They
are an aid to the logical organization of the material and a reminder of the points to
be stressed in the report."
3. Preparation of the rough draft; This follows the logical analysis of the subject
and the preparation of the final outline. Such a step is of utmost importance, for
the researcher now sits to write down what he has done in the context of his
research study. He will write down the procedure adopted by him in collecting the
material for his study along with various limitations faced by him; the technique of
analysis adopted by him, the broad findings and generalizations and the various
suggestions he wants to offer regarding the problem concerned.
4. Rewriting and polishing the rough draft; This step happens to be the most
difficult part of all formal writing. Usually this step requires more time than the
writing of the rough draft. The careful revision makes the difference between a
mediocre and a good piece of writing. While rewriting and polishing, one should
check the report for weakness in logical development or presentation. The
researcher should also "see whether or not the material, as it is presented, has unity
and cohesion; does the report stand upright and firm and exhibit a definite pattern,
like a marble arch? Or does it resemble an old wall of moldering cement and loose
bricks." In addition the researcher should check whether or not he has been
consistent in his rough draft. He should also check the mechanics of
writing - grammar, spelling and usage.
5. Preparation of a final Bibliography; Next in order comes the task of the
preparation of the final bibliography. The bibliography which is generally
appended to the research report is a list of books in some way pertinent to the
research which has been done. It should contain all those works which the
researcher has consulted. The bibliography should be arranged alphabetically and
may be divided into two parts; the first part may contain names of the books and
pamphlets and the second part may contain the names of the magazines and
newspaper articles. Generally this pattern of bibliography is considered convenient
and satisfactory from the point of view of the reader, though it is not the only way
of presenting bibliography. The entries in bibliography should be made adopting
the following order.
For the books and the pamphlets the order may be as under
1. Name of the author, with last name first.
2. Title, underlined or the use of italics.
3. Place, publisher and date of publication.
4. Number of volumes.
Example: Subbiah, A.Sethurama, "Research Methodology", Chennai, Esores
Publications, 2005.
For magazines and newspapers the order may be as under
1. Name of the author, last name first.
2. Title of the article, in quotation marks.
3. Name of the periodical; underlined to indicate italics.
4. The volume and number.
5. The date of the issue.
6. The pagination.
Example: Subbiah, A.Sethurama. "Who are Children?" Frontline, Chennai,
September 7-15, 2004, pp 34 -37.
WRITING THE FINAL DRAFT
This constitutes the last step. The final draft should be written in concise and
objective style and in simple language, avoiding vague expressions such as "it
seems," "they may be" and the like. While writing the final draft, the researcher
must avoid abstract terminologies and technical jargon. Illustrations and examples
based on common experiences must be incorporated in the final draft as they
happen to be more effective in communicating the research findings to others. A
research report should not be dull, but must enthuse people and maintain interest
and must show originality. It must be remembered that every report should be an
attempt to solve some intellectual problem and must contribute to the solution of a
problem and must add to the knowledge of both the researcher and the reader.
LAYOUT OF THE RESEARCH REPORT
Anybody, who is reading the research report, must necessarily be conveyed
enough about the study so that he can place it in its general scientific context,
judge the adequacy of its method and thus form an opinion of how seriously the
findings are to be taken. For this purpose there is the need of proper layout of the
report. The layout of the report means as to what the research report should
contain. A comprehensive layout of the research report should comprise, (a)
Preliminary pages (b) the main text; (c) the end matter.
(a) PRELIMINARY PAGES
In its preliminary pages the report should carry a title and date, followed by
acknowledgements in the form of 'preface' or 'foreword'. Then there should be a
table of contents followed by list of tables and illustrations so that the decision
maker or anybody interested in reading the report can easily locate the required
information in the report.
(b) MAIN TEXT
The main text provides the complete outline of the research report along
with all details. The title of the research study is repeated at the top of the first page of
the main text and then follow the other details on pages numbered consecutively,
beginning with the second page. Each main section of the report should begin on a
new page. The main text of the report should have the following sections: i)
introduction; ii) statement of findings and recommendations; iii) the results; iv) the
implications drawn from the results; v) the summary.
i) Introduction
The purpose of introduction is to introduce the research project to the
readers. It should contain a clear statement of the objectives of research i.e.,
enough background should be given to make clear to the reader why the problem
was considered worth investigation. A brief summary of other relevant research
may also be stated so that the present study can be seen in that context. The
hypothesis of study, if any, and the definition of the major concepts employed in
the study should be explicitly stated in the introduction of the report.
The methodology adopted in conducting the study must be fully explained.
The scientific reader would like to know in detail about such things as; how was
the study carried out? What was its basic design? If the study was an experimental
one, then what were the experimental manipulations? If the data were collected by
means of questionnaires or interviews, then exactly what questions were asked? (The
questionnaire or interview schedule is usually given in an appendix.) If
measurements were based on observation, then what instructions were given to the observers? Regarding the
sample used in the study the reader should be told; who were the subjects? How
many were there? How were they selected?
All these questions are crucial for estimating the probable limits of
generalizability of the findings. The statistical analysis adopted must also be
clearly stated. In addition to all this, the scope of the study should be stated and the
boundary lines be demarcated. The various limitations, under which the research
project was completed, must also be narrated.
ii) Statement of findings and recommendations
After introduction, the research report must contain a statement of findings
and recommendation in non-technical language so that it can be easily understood
by all concerned. If the findings happen to be extensive, at this point they should
be put in the summarized form.
iii) Results
A detailed presentation of the findings of the study, with supporting data in
the form of tables and charts together with a validation of results, is the next step in
writing the main text of the report. This generally comprises the main body of the
report, extending over several chapters. The result section of the report should
contain statistical summaries and reductions of the data rather than the raw data; all
the results should be presented in logical sequence and split into readily
identifiable sections. All relevant results must find a place in the report. But how
one is to decide about what is relevant is the basic question. Quite often guidance
comes primarily from the research problem and from the hypothesis, if any, with
which the study was concerned. But ultimately the researcher must rely on his own
judgment in deciding the outline of his report. "Nevertheless, it is still necessary
that he states clearly the problem, the conclusion at which he arrived, and the bases
for his conclusion".
iv) Implications of the Result
Towards the end of the main text, the researcher should again put down the
results of his research clearly and precisely. He should state the implications that
flow from the results of the study, for the general reader who is interested in the
implication. For understanding the human behaviour, such implications may have
these aspects as stated below.
a) A statement of the inference drawn from the present study which may be
expected to apply in similar circumstances.
b) The conditions of the present study which may limit the extent of legitimate
generalization of the inferences drawn from the study.
c) The relevant questions that still remain unanswered or new questions raised
by the study along with suggestions for the kind of research that would provide an
answer for them.
It is considered good practice to finish the report with a short conclusion
which summarizes and recapitulates the main points of the study. The conclusion
drawn from the study should be clearly related to the hypotheses that were stated in
the introductory section. At the same time, a forecast of the probable future of the
subject and an indication of the kind of research which needs to be done in that
particular field is useful and desirable.
v) The summary
It has become customary to conclude the research report with a very brief
summary, restating in brief the research problem, the methodology, the major
findings and the major conclusions drawn from the research results.
(c) END MATTER
At the end of the report, appendices should be enlisted in respect of all technical
data such as questionnaires, sample information, mathematical derivations and the
like. Bibliography of sources consulted should also be given. Index (an
alphabetical listing of names, places and topics along with the numbers of the
pages in a book or report on which they are mentioned or discussed) should
invariably be given at the end of the report. The value of the index lies in the fact that
it works as a guide to the reader for the contents in the report.
TYPES OF REPORT
Research reports vary greatly in length and type. In each individual's case,
both the length and the form are largely dictated by the problems at hand. For
instance, business firms prefer reports in the letter form, just one or two pages in
length. Banks, insurance organizations and financial institutions are generally fond
of the short balance sheet type of tabulation for their annual reports to their
customers and shareholders. Mathematicians prefer to write the results of their
investigations in the form of algebraic notations. Chemists report their results in
symbols and formulae. Students of literature usually write long reports presenting
the critical analysis of some writer or period or the like with a liberal use of
quotations from the works of the author under discussion. In the field of education
and psychology, the favorite form is the report on the results of experimentation
accompanied by the detailed statistical tabulation. Clinical psychologists and social
pathologists frequently find it necessary to make use of the case-history form.
News items in the daily papers are also forms of report writing. They represent
first-hand, on-the-scene accounts of the events described or compilations of
interviews with persons who were on the scene. In such reports the first paragraph
usually contains the important information in detail and the succeeding paragraphs
contain materials which are progressively less and less important.
Book reviews analyze the content of the book and report on the author's
intention, his success or failure in achieving his aims, his language, his style,
scholarship, bias or his point of view. Such reviews also happen to be a kind of
short report. The reports prepared by governmental bureaus, special commissions,
and other similar organizations are generally very comprehensive records on the
issue involved. Such reports are usually considered as important research products.
Similarly, PhD theses and dissertations are also a form of report-writing, usually
completed by students in academic institutions.
The above narration throws light on the fact that the results of a research
investigation can be presented in a number of ways viz., a technical report, a
popular report, an article, a monograph or at times even in the form of oral
presentation. The method(s) of presentation to be used in a particular study
depend on the circumstances under which the study arose and the nature of the
results. A technical report is used whenever a full written report of the study is
required whether for record-keeping or for public dissemination. A popular report
is used if the research results have policy implications. We give below a few
details about the said two types of reports.
(a) TECHNICAL REPORTS
In the technical report the main emphasis is on (i) the method employed, (ii)
assumptions made in the course of the study, (iii) the detailed presentation of the
findings including their limitations and supporting data.
A general outline of a technical report can be as follows
1. SUMMARY OF RESULTS
A brief review of the main findings just in two or three pages.
2. NATURE OF THE STUDY
Description of the general objectives of study, formulation of the problem in
operational terms, the working hypothesis, the type of analysis and data required
etc.
3. METHODS EMPLOYED
Specific methods used in the study and their limitations; for instance, in
sampling studies we should give details of sample design, viz., sample size, sample
selection, etc.
4. DATA
Discussion of data collected, their sources, characteristics and limitations. If
secondary data are used, their suitability to the problem at hand is fully assessed. In
case of a survey, the manner in which data were collected should be fully
described.
5. ANALYSIS OF DATA AND PRESENTATION
The analysis of data and the presentation of the findings of the study, with
supporting data in the form of tables and charts, are fully narrated. This, in fact,
happens to be the main body of the report, usually extending over several chapters.
6. CONCLUSIONS
A detailed summary of the findings and the policy implications drawn from the
results should be explained.
7. BIBLIOGRAPHY
A bibliography of the various sources consulted should be prepared and attached.
8. TECHNICAL APPENDICES
Appendices are given for all technical matters relating to the questionnaire,
mathematical derivations, elaboration on a particular technique of analysis and the
like.
9. INDEX
An index must be prepared and given invariably at the end of the report.
The order presented above only gives a general idea of the nature of a technical
report; the order of presentation may not necessarily be the same in all the
technical reports. In other words, the presentation may vary in
different reports; even the different sections outlined above will not always be the
same, nor will all these sections appear in any particular order.
It should, however, be remembered that even in a technical report, simple
presentation and ready availability of the findings remain an important
consideration and as such the liberal use of charts and diagrams is considered
desirable.
(b) POPULAR REPORT
The popular report is one which gives emphasis on simplicity and
attractiveness. The simplification should be sought through clear writing,
minimization of technical, particularly mathematical, details and liberal use of
charts and diagrams. An attractive layout along with large print, many subheadings,
even an occasional cartoon now and then is another characteristic feature of the
popular report. Besides, in such a report emphasis is given on practical aspects and
policy implications.
We give below a general outline of a popular report.
1. The findings and their implications
Emphasis in the report is given on the findings that are most practical and on
the implications of these findings.
2. Recommendation for action
Recommendations for action on the basis of the findings of the study are made in
this section of the report.
3. Objective of the study
A general review of how the problem arises is presented along with the
specific objectives of the project under study.
4. Methods employed
A brief and non-technical description of the methods and techniques used
including a short review of the data on which the study is based, is given in this
part of the report.
5. Results
This section constitutes the main body of the report wherein the results of
the study are presented in clear and non-technical terms with liberal use of all sorts
of illustrations such as charts, diagrams and the like.
6. Technical Appendices
More detailed information on methods used, forms, etc. is presented in the
form of appendices. But the appendices are often not detailed if the report is
entirely meant for the general public.
There can be several variations of the form in which a popular report can be
prepared. The only important thing about such a report is that it gives emphasis on
simplicity and policy implications from the operational point of view avoiding the
technical details of all sorts to the extent possible.
ORAL PRESENTATIONS
At times oral presentation of the results of the study is considered effective,
particularly in cases where policy recommendations are indicated by project
results. The merit of this approach lies in the fact that it provides an opportunity for
give-and-take decisions which generally lead to a better understanding of the
findings and their implications. But the main demerit of this sort of presentation is
the lack of any permanent record concerning the research details and it may be just
possible that the findings may fade away from people's memory even before any
action is taken. In order to overcome this difficulty, a written report may be
circulated before the oral presentation is made. Oral presentation is effective when
supplemented by various visual devices. Use of slides, wall charts and blackboards
is quite helpful in contributing to clarity and in reducing the boredom, if any.
Distributing a broad outline, with a few important tables and charts concerning the
research results, keeps the listeners attentive, as they have a ready outline on which to
focus their thinking. This very often happens in academic institutions where the
researcher discusses his research findings and policy implications with others either
in a seminar or in a group discussion. Thus, research results can be reported in
more than one way, but the usual practice adopted, in academic institutions
particularly, is that of writing the Technical Report and then preparing several
research papers to be discussed at various forums in one form or the other. But in
the practical field and with problems having policy implications, the technique
followed is that of writing a popular report. Research done on governmental
account or on behalf of some major public or private organizations is usually
presented in the form of a technical report.
CHAPTER – XII
MECHANICS OF REPORT WRITING
MECHANICS OF WRITING A RESEARCH REPORT
There are very definite and set rules which should be followed in the
actual preparation of the research report or paper. Once the techniques are finally
decided, they should be scrupulously adhered to, and no deviation permitted. The
criteria of format should be decided as soon as the materials for the research paper
have been assembled. The following points deserve mention so far as the
mechanics of writing a report are concerned.
SIZE AND PHYSICAL DESIGN
The manuscript should be written on unruled paper 8½" × 11" in size. If
it is to be written by hand, then black or blue-black ink should be used. A margin
of at least one and one-half inches should be allowed at the left hand and of at least
half an inch at the right hand of the paper. There should also be one-inch margins,
top and bottom. The paper should be neat and legible. If the manuscript is to be
typed, then typing should be double-spaced on one side of the page only except for
insertion of the long quotations.
PROCEDURE
While writing the report the steps already mentioned in the earlier chapter
should be strictly adhered to.
LAYOUT
Keeping in view the objective and nature of the problem, the layout of the
report should be thought out, decided upon and adopted accordingly.
TREATMENT OF QUOTATION
Quotations should be placed in quotation marks and double spaced, forming
an immediate part of the text. But if a quotation is of a considerable length then it
should be single-spaced and indented at least half an inch to the right of the
normal text margins.
THE FOOTNOTES
Regarding footnotes one should keep in view the following:
a) The footnotes serve two purposes viz., the identification of materials used in
quotation in the report and the notice of materials not immediately necessary to the
body of the research text but still of supplemental value. The modern tendency is to
make the minimum use of footnotes, for scholarship does not need to be displayed.
b) Footnotes are placed at the bottom of the page on which the reference or
quotation which they identify or supplement ends. Footnotes are customarily
separated from the textual materials by a space of half an inch and a line about one
and a half inches long.
c) Footnotes should be numbered consecutively, usually beginning with 1 in
each chapter separately. The number should be put slightly above the line, say at
the end of a quotation.
d) Footnotes are always typed in single space though they are divided from one
another by double space.
DOCUMENTATION STYLE
Regarding documentation, the first footnote reference to any given work
should be complete in its documentation, giving all the essential facts about the
edition used. Such documentary footnotes follow a general sequence. The common
order may be described as under.
i) Regarding a single volume reference
1. author's name in normal order, followed by a comma
2. title of work, underlined to indicate italics
3. place and date of publication
4. pagination references
Example
John Gassner, Masters of the Drama, New York: Dover Publications, Inc.,
1954, p. 315.
ii) Regarding multivolume reference
1. Author's name in the normal order
2. Title of work, underlined to indicate italics
3. place and date of publication
4. number of volume
5. Pagination reference.
iii) Regarding works arranged alphabetically
For works arranged alphabetically such as encyclopedias and dictionaries,
no pagination references are usually needed. In such cases the order is illustrated as
under.
Example 1
"Salamanca", Encyclopedia Britannica, 14th Edition.
Example 2
"Mary Wollstonecraft Godwin", Dictionary of National Biography.
But if there should be a detailed reference to any encyclopedia article, volume and
pagination reference may be found necessary.
iv) Regarding periodicals references
1. name of the author in normal order
2. title of article, in quotation marks
3. name of periodical, underlined to indicate italics
4. volume number
5. date of issuance
6. pagination
v) Regarding anthologies and collections references
Quotations from anthologies or collections of literary works must be
acknowledged not only by author, but also by the name of the collector.
vi) Regarding second-hand quotations reference
In such cases the documentation should be handled as follows:
1. original author and title
2. "quoted or cited in,"
3. second author and work.
Example
J.F. Jones, Life in Polynesia, p. 16, quoted in History of the Pacific Ocean Area, by
R.B. Abel, p. 191.
vii) Case of multiple authorship
If there are more than two authors or editors, then in the documentation the
name of only the first is given and the multiple authorship is indicated by "et al."
or "and others".
PUNCTUATION AND ABBREVIATIONS IN FOOTNOTES
The first item after the number in the footnote is the author's name, given in
the normal signature order. This is followed by a comma. After the comma, the title of
the book is given; the article (such as "A", "An", "The" etc.) is omitted and only
the first word and proper nouns and adjectives are capitalized. The title is followed
by a comma. The place of publication is then stated; it may be mentioned in an
abbreviated form, if the place happens to be a famous one, such as Lond. for
London, N.Y. for New York, N.D. for New Delhi and so on. This entry is followed
by a comma. Then the name of the publisher is mentioned, and this entry is closed
by a comma. It is followed by the date of publication if the date is given on the title
page. If the date appears in the copyright notice on the reverse side of the title page
or elsewhere in the volume, the comma should be omitted and the date enclosed in
square brackets, e.g. [c 1978] or [1978].
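The ordering and punctuation rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Source` record, the function names, the "p." page prefix and the sample entry are hypothetical and are not taken from any style manual. Underlining or italics for the title are not represented in plain text.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical record holding the facts a first footnote needs."""
    author: str                      # normal signature order
    title: str
    place: str                       # may be abbreviated, e.g. "N.Y."
    publisher: str
    year: int
    page: int
    year_on_title_page: bool = True  # False: date comes from the copyright notice


def drop_leading_article(title: str) -> str:
    """Omit a leading "A", "An" or "The" from the title."""
    first, _, rest = title.partition(" ")
    if rest and first.lower() in {"a", "an", "the"}:
        return rest
    return title


def first_footnote(number: int, src: Source) -> str:
    """Build a first-reference footnote: author, title, place, publisher, date, page."""
    title = drop_leading_article(src.title)
    head = f"{number}. {src.author}, {title}, {src.place}, {src.publisher}"
    if src.year_on_title_page:
        date = f", {src.year}"       # comma closes the publisher entry
    else:
        date = f" [{src.year}]"      # comma omitted, date in square brackets
    return f"{head}{date}, p. {src.page}."


# Invented sample entry, for illustration only.
sample = Source("A. N. Author", "The Sample Study", "N.Y.", "Example Press", 1978, 12)
print(first_footnote(1, sample))
```

Setting `year_on_title_page=False` switches the date to the bracketed form used when it is taken from the copyright notice.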
Certain English and Latin abbreviations are quite often used in
bibliographies and footnotes to eliminate tedious repetition. The following is a
partial list of the most common abbreviations frequently used in report-writing
(the researcher should learn to recognize them as well as he should learn to use
them):
Anon., anonymous
Ante, before
Art., article
Aug., augmented
Bk., book
Bull., bulletin
Cf., compare
Ch., chapter
Col., column
Diss., dissertation
Ed., editor, edition, edited
Ed. cit., edition cited
e.g., exempli gratia: for example
eng., enlarged
et al., and others
ex., example
f., ff., and the following
fig(s)., figure(s)
fn., footnote
ibid., ibidem: in the same place
USE OF STATISTICS CHARTS AND GRAPHS
A judicious use of statistics in research reports is often considered a virtue
for it contributes a great deal towards the clarification and simplification of the
materials and research results. One may well remember that a good picture is often
worth more than a thousand words. Statistics are usually presented in the form of
tables, charts, bars and line-graphs and pictograms.
THE FINAL DRAFT
Revising and rewriting the rough draft of the report should be done with
great care before writing the final draft. For this purpose, the researcher should put
to himself questions like: Are the sentences written in the report clear? Are they
grammatically correct? Do they say what is meant? Do the various points
incorporated in the report fit together logically? Having at least one colleague read
the report just before the final revision is extremely helpful. Sentences that seem
crystal-clear to the writer may prove quite confusing to other people; a connection that
had seemed self-evident may strike others as not so evident.
BIBLIOGRAPHY
Bibliography should be prepared and appended to the research report as
discussed earlier.
PREPARATION OF THE INDEX
At the end of the report an index should invariably be given, the value of
which lies in the fact that it acts as a good guide to the reader. An index may be
prepared both as a subject index and as an author index. The former gives the names of
the subject topics along with the page numbers on which they have appeared or
been discussed in the report, whereas the latter gives similar information regarding
the names of authors. The index should always be arranged alphabetically. Some
people prefer to prepare only one index common for names of authors, subject
topics, concepts and the like.
NORMS FOR USING TABLES, CHARTS AND DIAGRAMS
The following general norms should be followed while constructing the charts and
diagrams.
1. Title: Every chart/ diagram must be given a suitable title. The title should
convey in as few words as possible the main idea that the diagrams intend to
portray. Title should be given either at the top or at the bottom of the chart /
diagram.
2. Proportion between width and height: If either the width or height of the
diagram / chart is too short or too long in proportion, it would give an ugly look.
While there are no fixed rules about the dimensions, Lutz suggested the "Root-Two"
norm, which may be adopted for general use. It means a ratio of 1 (short side) to
1.414 (long side).
3. Selection of scale: The scale showing the values should be in even numbers or
in multiples of five or ten, e.g. 25, 50, 75 or 20, 30, 40, etc. Odd numbers like 1, 3, 5,
7 should be avoided.
4. Foot notes: In order to clarify certain doubts about the diagram / chart, foot
notes may be given at the bottom of the diagram /chart.
5. Index: An index illustrating different types of lines or different shades or colours
should be given so that the reader can easily make out the meaning of the diagram.
6. Neatness and cleanliness: The chart / diagram should be absolutely neat and
clean.
7. Simplicity: Diagrams should be simple so that the reader can understand
clearly and easily. It is important that too many materials should not be loaded in a
single diagram which may confuse the reader.
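The "Root-Two" proportion and the scale norm from the list above can be worked out with a short Python sketch. The function names and the rule used to reject a step (a step is accepted only if it is even or a multiple of five) are assumptions made for illustration, not part of the norms themselves.

```python
import math


def root_two_long_side(short_side: float) -> float:
    """Long side for a given short side under the 1 : 1.414 "Root-Two" norm."""
    return short_side * math.sqrt(2)


def scale_ticks(max_value: float, step: int = 10) -> list:
    """Scale values from 0 up to the first multiple of `step` that covers max_value."""
    if step <= 0 or (step % 2 != 0 and step % 5 != 0):
        raise ValueError("step should be an even number or a multiple of five")
    top = math.ceil(max_value / step) * step
    return list(range(0, top + 1, step))


# A chart with a 10 cm short side gets a long side of roughly 14.14 cm.
print(round(root_two_long_side(10), 2))

# A data maximum of 68 plotted in steps of 25 needs the scale 0, 25, 50, 75.
print(scale_ticks(68, step=25))
```

Rounding the scale up to the next full step keeps the largest data value inside the plotted range.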
Tables and figures may be profitably employed to present the data more
clearly and more concisely than would be possible if the same information were
presented in text form. Most style manuals provide examples of commonly used
types of tables and figures and instruction for their construction. A well
constructed table can give the reader a concise overview of the data.
The tables one has constructed in the course of conducting the study cannot usually
be incorporated directly into the report. For example, at the completion of a study
one may have an alphabetical list of the subjects in one's study and their scores on
criterion measures. Rather than present this list as it stands one should construct a
table with the information in summary form.
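As a minimal sketch of this step, the Python below turns a raw list of subjects and scores into a small summary table. The names and scores are invented for illustration, and the choice of summary rows (number of subjects, mean, standard deviation, minimum, maximum) is an assumption; the measures actually reported should follow the hypothesis of the study.

```python
from statistics import mean, stdev

# Hypothetical raw list: (subject, score on a criterion measure).
raw_scores = [
    ("Anand", 62),
    ("Bala", 71),
    ("Chitra", 55),
    ("Deepa", 80),
    ("Elango", 67),
]


def summarize(scores):
    """Reduce the subject-by-subject list to a handful of summary figures."""
    values = [score for _, score in scores]
    return {
        "N": len(values),
        "Mean": round(mean(values), 2),
        "SD": round(stdev(values), 2),   # sample standard deviation
        "Minimum": min(values),
        "Maximum": max(values),
    }


def as_table(summary):
    """Lay the summary out as aligned plain-text rows."""
    return "\n".join(f"{name:<10}{value:>8}" for name, value in summary.items())


print(as_table(summarize(raw_scores)))
```

The raw alphabetical list can then go to an appendix, while only the summary table appears in the body of the report.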
It is desirable to arrange the tables in such a way that they illustrate the
relationship of the data to the hypothesis of the study. Beginning researchers are
frequently tempted to include the data in both forms. This merely makes the report
longer and more tedious. A better approach is to present the data in tables and
figures accompanied by sufficient text to point out the most important and
interesting findings. It is especially important to relate the information in the tables
to the hypothesis.
APPENDIX, INDEX AND BIBLIOGRAPHY
The appendix contains pertinent materials that are not important enough to
be included in the body of the report, but which may be of value to some readers.
Such materials may include complete copies of locally devised tests or
questionnaires, together with the instructions and scoring key for such instruments,
item analysis data for the measures used, pertinent instructions to subjects, and
tables that are very long or of only minimum importance to the study.
The following materials may be incorporated in the appendices.
1. original data, 2. long tables, 3. long quotations, 4. supportive legal decisions,
documents, 5. illustrative material, 6. questionnaires, 7. case studies, 8. schedules or
forms, 9. transcripts of interviews.
The appendices may be serialized with capital letters (e.g. Appendix-A,
Appendix-B) to differentiate them from the chapter or table numbers.
In the text, the reader's attention is drawn to the appendices as in the case of tables.
All the appendices are listed in the table of contents.
INDEX
An index is seldom required in a research report, but one should be included
if it would be useful to the reader. An index is sometimes loosely called a glossary,
though the two differ: an index lists topics with the pages on which they appear,
whereas a glossary is a short dictionary giving definitions and examples of terms and
phrases which are technical, used in a special connotation by the author, unfamiliar
to the reader, or foreign to the language in which the book is written. It is listed as
a major section in all capital letters in the table of contents.
The glossary appears after the bibliography. It may also appear in the introductory
pages of a book after the lists of tables and illustrations.
Items are listed in alphabetical and normal order.
BIBLIOGRAPHY
The bibliography must include all sources mentioned in the text or footnotes.
Most universities insist that only these be listed, but a few insist that pertinent
references not specifically mentioned also be listed. The style manual mentioned
earlier will give complete details on the method of listing references. It is important
to follow these rules rigorously and completely. In fact, it is a good strategy to
learn them before carrying out the search through the literature for the proposal. By
listing each reference in the correct form as it is encountered, one can avoid the
extra time involved in finding the references again in order to have them in
complete form for the bibliography. It is advisable to list them on cards so that they
can be filed in alphabetical order.
The bibliography comes after the appendices section and is separated from it
by a division sheet marked 'BIBLIOGRAPHY'. It is listed as a major section in all
capital letters in the table of contents. The title of a bibliography should indicate
what type of items are listed. Some common varieties of bibliographies are given
below.
1. Bibliography of works cited
2. Selected bibliography
3. Annotated Bibliography
When the bibliography is long, items are classified for easy reference according to
1. format, like books, periodicals and newspapers, 2. subject or theme, or 3.
chronological order.
If the name of the author is not given, the title of the book or the article
appears first. An author's works written in collaboration with others are listed after
the works which he has written alone.
If there are more than three authors, the symbol et al. is used after the first author's
name and other names are omitted.
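The listing rules above lend themselves to a short Python sketch: entries are filed alphabetically, an entry without an author is filed under its title, an author's sole works come before the collaborations, and more than three authors collapse to "et al.". The entry structure, the "Surname, Initial" name form, the "; " separator between names and all sample entries are hypothetical, chosen only to make the ordering visible.

```python
def display_authors(authors):
    """Return the author string, using "et al." when there are more than three."""
    if not authors:
        return ""
    if len(authors) > 3:
        return f"{authors[0]} et al."
    return "; ".join(authors)


def sort_key(entry):
    """Alphabetical key: first author (or title if none), sole works first."""
    authors = entry["authors"]
    title = entry["title"].lower()
    if not authors:
        return (title, 0, title)
    collaborative = 0 if len(authors) == 1 else 1
    return (authors[0].lower(), collaborative, title)


def format_entry(entry):
    """Put the author string first, or the title alone when no author is given."""
    names = display_authors(entry["authors"])
    return f"{names}, {entry['title']}" if names else entry["title"]


# Invented entries, for illustration only.
entries = [
    {"authors": ["Rao, K.", "Iyer, S."], "title": "Survey Methods"},
    {"authors": ["Rao, K."], "title": "Writing Reports"},
    {"authors": [], "title": "Manual of Style"},
    {"authors": ["Das, A.", "Bose, B.", "Nair, C.", "Paul, D."], "title": "Field Studies"},
]

for entry in sorted(entries, key=sort_key):
    print(format_entry(entry))
```

Keeping the sort key separate from the display string means the same entries can be re-ordered by format, subject or date when a long bibliography is classified.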