
Advanced Research Methods Handout (Reader)

Ethiopian Civil Service University

Center for Public Policy Studies

Course: Advanced Research Methods

Identifier: M-001

Title: Handout (Reader)

Prepared by:

Atakilt Hagos

March, 2014

Addis Ababa


Didactic Design: <Title of Module>

Summary of the Didactic Design for the Module

General Data

Module Number: PPS-562
Module Title: Advanced Research Methods
Module Level: Masters
Abbreviation: EPP
Subtitle: -
Duration in Semesters: two
Frequency: Once a year
Language: English
Mode of Delivery: Face-to-face
ECTS: 7

Module Description:

Advanced Research Methods is offered to masters students of public policy studies. The aim of the module is to equip students with the basic knowledge and skills to do research on public policy issues, processes and outcomes/impacts. The module therefore provides students with the conceptual and theoretical background of advanced research methods. It will enable participants to identify and apply appropriate research methodologies in order to plan, conduct and evaluate basic research in organizations. In addition, the module will enhance students' ability to understand public policy problems from the perspectives of various fields, identify appropriate research designs, and understand the techniques of writing a basic research report. It will, furthermore, enable participants to understand the ethical issues in scientific research while laying the foundation for research skills at higher levels. As part of the assessment, participants will individually select a topic and prepare a research proposal on which they will receive feedback, preparing them for the tasks they will face in Thesis I and II. A statistical package (e.g., SPSS) will be used whenever applicable. The module introduces participants to the meaning of scientific research and the research process. It also sheds light on types of research design and sampling design, and examines data types/sources and methods of data collection. Participants will be introduced to the basics of proposal writing, including referencing; to qualitative and quantitative research strategies; and to methods of data analysis. They will also acquire skills in using SPSS to analyze quantitative data: specifically, they will apply SPSS to generate descriptive statistics from a data set and to conduct correlation analysis, hypothesis testing and regression analysis. The module uses lecture, individual work, collaborative work, tutorial, and presentation as teaching and learning methods. The assessment methods for this module are short tests/quizzes, individual assignments, a group assignment and a final exam.


Workload

Contact Hours: 70 hrs
Non-Contact Hours: 140 hrs
Total Hours: 210 hrs

Assessment

Description: Continuous assessment, in which students are given individual assignments, a group assignment, short quizzes and a final exam.
Individual assignments: 20%
Group assignment: 20%
Short quizzes: 20%
Final examination: 40%
Total: 100%

Examination Type: Written
Examination Duration: 180 minutes
Assignments: Both individual and group assignments
Repetition: Once only, in cases of certified illness or maternity leave, and for students who scored an "F" grade in the module.

Learning Outcomes: At the end of the module, students are able to:

1. Apply the concepts, approaches and methods of scientific research in developing a research proposal and in designing data collection instruments
2. Review the literature, research proposals and research reports prepared by other researchers
3. Analyze qualitative/quantitative data using statistical packages
4. Conduct scientific research individually or as a member of a research group
5. Understand the meaning, characteristics, and steps of scientific research and the various types of research design and sampling design

Prerequisites: Basic mathematics & statistics

Content

PART ONE: Research Meaning, Process, and Proposal Writing

1. Scientific Research and the Research Process: Meaning and objective of Scientific research; Characteristics of Scientific Research; Criteria of Good Research; The Research Process: An Overview

2. Research Approaches and Sampling Design: Types of research and Research Approaches/designs; Determining Sample Design (probability and non-probability sampling);

3. Data Sources and Methods of Data Collection: Data Types and Sources; Methods of Collecting Primary Data; Sources of Secondary Data; Guidelines for Designing a Questionnaire and Other Instruments

4. Research Proposal, Referencing, Reporting Results and Ethical Considerations: The Research Proposal; Referencing Styles; Report Writing: Communicating the Results; Preparing and Delivering a Presentation; Ethical Issues and Precautions

5. The survey method and case studies

PART TWO: PRESENTING AND ANALYZING QUANTITATIVE DATA AND HYPOTHESIS TESTING (With SPSS Application)

6. Presentation and analysis of quantitative data - Descriptive Statistics: Basic concepts in statistics; Classification and Presentation of Statistical Data (bar chart, pie chart, histogram); Measures of central tendency and dispersion (mean, median, mode, mean deviation, variance, standard deviation, covariance, Z-score); Exercise with SPSS Application

7. Tests of hypothesis concerning means and proportions: Tests of hypotheses concerning means; Tests concerning the difference between two means (independent samples); Tests of mean difference between several populations (independent samples); Paired-samples t-test (Differences between dependent groups); Tests of association (the Pearson coefficient of correlation and test of its significance, The Spearman rank correlation coefficient and test of its significance); Nonparametric Correlations (The Chi-square test); Hypothesis test for the difference between two proportions; Exercise with SPSS Application

8. The simple linear regression model and Statistical Inference; the simple linear regression model, estimation of regression coefficients and interpreting results; hypothesis testing; Exercise with SPSS application

9. The multiple linear regression model and Statistical Inference; the multiple linear regression model, estimation of regression coefficients and interpreting results; hypothesis testing; Exercise with SPSS application

Learning & Teaching Methods: Lecture, tutorial, individual work, collaborative work, presentation

Media: Face-to-face lecture, notes, PowerPoint slides, internet resources, books, journal articles

Literature

Electronic Books:

Kothari, C.R. (2004) Research Methodology: Methods and Techniques. New Age International (P) Limited Publishers, New Delhi.

Dawson, Catherine (2002) Practical Research Methods. How To Books, Oxford, UK.

Punch, Keith F. (2006) Developing Effective Research Proposals.


Marczyk, DeMatteo and Festinger (2005) Essentials of Research Design and Methodology.

Gray, David E. (2004) Doing Research in the Real World.

Greener, Sue (2008) Business Research Methods.

Flick, von Kardorff and Steinke (2004) A Companion to Qualitative Research.

Singh, Yogesh Kumar (2006) Fundamentals of Research Methodology and Statistics.

Dowdy, Wearden and Chilko (2004) Statistics for Research, 3rd ed.

Yin (1994) Case Study Research.

Books from the library:

Hill, R. Carter (1997) Undergraduate Econometrics. Wiley, New York.

Gujarati, Damodar N. (1994) Basic Econometrics.


The Reader

Part I: Research Meaning and Process

Unit One:

Scientific Research and the Research Process

As students of the masters program, and as professionals after graduation, you will be engaged in scientific research. As decision makers, you may be provided with information on the progress and findings of a research project sponsored by your organization or another agency. One way or another, you are likely to be involved in research, so it is essential for you to know what research is and how it is carried out. Research requires passion, knowledge and skills. So what is research? Why do we conduct research? What are the building blocks of scientific research? What process should you follow in conducting scientific research? We will address these and other questions in this chapter.

Learning Objectives: After reading this chapter, you should be able to:

Explain the meaning and objectives of research

Discuss the characteristics of scientific research

Describe the criteria of good research

Distinguish between inductive and deductive research

Discuss the nine steps of the research process

1.1 Meaning and Objective of Scientific Research

As per the Merriam-Webster Online Dictionary, the word research is derived from the French "recherche", which means "to go about seeking". Research has been defined in a number of different ways. The Merriam-Webster Online Dictionary defines research in more detail as "a studious inquiry or examination; especially: investigation or experimentation aimed at the discovery and interpretation of facts, revision of accepted theories or laws in the light of new facts, or practical application of such new or revised theories or laws".

The Market Research Society (in the UK) defines (social science) research as "The application of scientific research methods to obtain objective information on people's attitude and behavior based usually on representative sample of the relevant populations" (Yvonne McGivern, 2003).

Objectives of Research:

To gain familiarity with a phenomenon or achieve new insights into it (Via: exploratory or formative research studies)

To portray/describe the characteristics of a particular individual, situation or group (Via: descriptive research studies)


To determine the frequency with which something occurs or with which it is associated with something else (Via: diagnostic research studies)

To test a hypothesis of a causal relationship between variables (Via: hypothesis-testing research studies)

1.2 Characteristics of Scientific Research

Generally scientific research has the following characteristics:

It is empirical: Science is grounded in observation and measurement, and the vast majority of research involves some type of practical experimentation.

It relies upon data: quantitative and qualitative

It is intellectual and visionary: Science requires vision and the ability to observe the implications of results. The visionary part of science lies in relating the findings back to the real world. This process of relating findings to the real world is known as induction, or inductive reasoning, and is a way of relating the findings to the universe around us.

It uses experiments to test predictions: The process of induction and generalization allows scientists to make predictions about how they think something should behave, and to test those predictions by designing an experiment, either in a laboratory or by observing the natural world.

It is systematic and methodical: Follows certain steps that are repeatable.

1.3 Criteria of Good Research

Whatever the type of research work or study, all meet on the common ground of the scientific method. One expects scientific research to satisfy the following criteria:

i) Good research is systematic: Research is structured, with specified steps to be taken in a specified sequence in accordance with a well-defined set of rules. Being systematic does not rule out creative thinking, but it certainly rejects the use of guessing and intuition in arriving at conclusions.

ii) Good research is logical: Research is guided by the rules of logical reasoning, and the logical processes of induction and deduction are of great value in carrying it out. Induction is the process of reasoning from a part to the whole, whereas deduction is the process of reasoning from a premise to a conclusion that follows from that premise. Logical reasoning makes research more meaningful in the context of decision making.


iii) Good research is empirical: Research relates basically to one or more aspects of a real situation and deals with concrete data, which provides a basis for the external validity of research results.

iv) Good research is replicable: This characteristic allows research results to be verified by replicating the study and thereby building a sound basis for decisions.

1.4 Deductive vs. Inductive Research

Scientific research follows logical reasoning, which can be deductive or inductive.

Deduction: moving from the general to the specific. Conclusions are based on premises; arguments rest on laws, rules and accepted principles.

Induction: moving from the specific to the general. Based on observed facts, researchers develop principles or theories, which can later be used as the basis for deductive research.


In induction, the conclusion is based on reasons, with proof and evidence offered for the facts.

1.5 The Research Process: An Overview

The research process depends on the type of the research logic (deductive or inductive).

1.5.1 The Deductive Research Process

The research process involves several steps, and the number of steps differs from author to author. Most authors, however, divide the entire research process into NINE steps, which we will discuss in this course. Bear in mind, though, that not all of these steps are equally applicable to all types of research.

Step One: Find a Research Topic and State Your Problem

This is the first step in the process of research. From my experience as an academic staff member, from my exposure to different research methodology trainings, and from my own modest research experience, I have seen two approaches or practices regarding the selection of a research topic.


The first practice is that masters and even PhD students are required to make their topic as broad as possible and to incorporate as many aspects of the problem as possible. The advantage of this approach is that the student gains wider knowledge of the problem; what it lacks is depth, since making the research both wide and deep at the same time takes more time. In the absence of sufficient time and resources:

o The literature survey becomes too broad.

o Many of the variables in this type of research are not well identified or not well defined, and relevant indicators are not sufficiently included.

o As a result, the questions in a questionnaire or interview are too general and shallow, and the research methods adopted tend in most cases to be merely descriptive.

The second practice, an increasingly dominant paradigm in western countries, is that students are obliged to narrow down their research topic, make it specific, and dig deep into the issues. In this case, the researcher can do a better job given the limited time and resources he/she has. Specifically:

o The literature survey becomes targeted.

o The variables or assessment issues are well identified

o The specific indicators for each variable are also identified

o The data collection instruments will be sharper and to the point.

o Students will have the opportunity to go beyond simple description of phenomena and attempt an analysis of causes and effects, hypothesis testing, etc.

With the quality of research in mind, the second approach is more appealing. Accordingly, you can follow the guidelines below while identifying your research problem/topic.

Decide the general area of interest or aspect of a subject matter that you would like to enquire into and consider the feasibility of a particular solution. Pick a smaller part of a bigger problem; do not try to address a big problem in one research.

Understand the problem by discussing with friends/colleagues or with those that have some expertise in the matter or with those agencies working in relation to the issue.

Narrow the problem down based on the general discussion and phrase the problem in operational terms. This process of narrowing down the problem is iterative.

Examine all available literature to acquaint oneself with the selected problem. There are two types of literature: the conceptual (concepts and theories) and the empirical. This can also help the researcher know what data are available.

Carefully verify the validity and objectivity of the background facts concerning the problem.

Define pertinent terms

o What are key variables in your study?

o What relationship do you investigate?


Checklist for a good research topic

Is the topic something in which you are really interested?

Does the topic have a clear link to theory?

Do you have, or are you able to develop, the necessary research skills to undertake the topic?

Is your topic societally relevant?

Avoid subjects that have been overdone.

A controversial subject should not be the choice of an average researcher.

Avoid too narrow or too vague problems.

The subject should be familiar and feasible

Consider importance of the areas/subject, capacity of the researcher, cost and time requirements, accessibility of necessary cooperation, etc.

Note:

i) Your problem statement must be specific to the issue at hand and often ends up with research questions. Within a given research topic, it is possible that different researchers could formulate different research questions. Therefore, it is very important to write down your research questions at the end of the problem statement. The research questions specifically indicate what your study is about.

ii) While you state the problem, you may need to provide some data or information to express the magnitude of the problem. This may require you to do a preliminary data gathering.

Step Two: Literature Survey: Theoretical and Conceptual Framework

Once you have chosen your research topic and narrowed it down, you have to carry out an extensive survey of the academic literature (journal articles, conference proceedings, books, unpublished materials, etc.) to learn more about the theories and debates surrounding the topic and to understand the specific nature of the problem as treated in the academic literature. You also need to gather some preliminary information regarding the topic and the study area from government and non-government documents, reports, etc.

Expected Outputs from the Literature Survey

The theoretical and conceptual framework is what is ultimately expected as a product of your literature survey.

i) The Theoretical Framework

The theoretical framework refers to a summary of the theories that you will refer to in your study. You will draw on these theories when developing the hypotheses and the conceptual framework, when preparing the research design, and when doing the data analysis and generalization. Your conceptual framework will indicate the important issues to be assessed or the variables to be measured; their possible indicators; the type and direction of the relationships that exist among the variables; and so on.

What you summarize as part of the theoretical framework has to be highly relevant to the topic, and particularly to the research problem and the research questions. While conducting the literature survey, students often throw in whatever literature is in one way or another related to the research area, but not necessarily to the research problem. Failing to prepare the theoretical framework properly has at least the following disadvantages:

a) You will not have the basis to define relevant concepts, identify the assessment issue or the variables and define them.

b) You don’t know what relationship to expect.

c) It will not be easy for you to choose the appropriate research design.

d) Your data collection instruments will be ill designed.

e) During analysis, you will not have any theory to compare your results with.

f) The contribution of your research to the existing theory will be blurred.

ii) The Conceptual Framework

Based on the theoretical framework, you are expected to develop your conceptual framework. In this part, you will define the concepts you will use in your research. In the literature, concepts may be defined in different ways, and you will have to make a choice here. In your study, how are the concepts defined operationally? What are the variables and indicators that you will use to measure the concepts? How do the different concepts relate to each other? These and other questions should be answered via your conceptual framework.

Conceptual frameworks are best presented graphically rather than in text. The diagram depicts the concepts/issues and how they relate to each other. You may create your own illustrative diagram or adapt one from the literature. In the latter case, you have to clearly cite the source of your diagram.

Step Three: Development of Working Hypothesis:

If you are conducting a deductive type of research, you will be required to develop hypotheses based on the theories you have encountered while doing your literature survey. The question is, what does the theory(ies) say about the phenomenon you are investigating and the relationship among the various variables involved?

A hypothesis:

Is a tentative assumption made in order to draw out and test its logical or empirical consequences; it should be specific and pertinent to the piece of research in hand.

Provides the focal point for the research: it delimits the area, sharpens thinking, and keeps the researcher on the right track.

Determines the type of data required, the data collection and sampling methods to be used, and the tests that must be conducted during data analysis.

Results from a priori thinking about the subject and examination of the available data and material.


Step Four: Preparing the Research Design:

Once you have developed your hypothesis, the next step is to craft your research design. A research design is like the blueprint for house construction. If you start building a house without first having the design (consisting of the architectural, electrical, sanitary, etc. designs), you don't know what type of house you will end up with; the work will be costly and time-consuming, often involving construction and demolition of what has already been built. Most importantly, the house will lack quality and may be prone to risks. Likewise, research conducted without a research design at hand is aimless, ambiguous, time-consuming and costly, and may be totally irrelevant and unacceptable in light of the requirements of a scientific investigation.

A research design is the conceptual structure within which research will be conducted, crafted so that the research is as efficient as possible and relevant evidence is collected with minimal expenditure of effort, time and money. More explicitly, the design decisions concern:

i. What is the study about?

ii. Why is the study being made?

iii. Where will the study be carried out?

iv. What type of data is required?

v. What periods of time will the study include?

vi. What will be the sample design?

vii. What techniques of data collection will be used?

viii. How will the data be analysed?

ix. In what style will the report be presented?

Step Five: Collecting the Data and Administering Data Collection

Once you have completed the research design and had it endorsed by the concerned parties, you can proceed to data collection using the data collection instruments you developed as part of your research design. We have seen that scientific research is empirical; therefore, you need to gather empirical data to answer your research questions and meet your research objectives. This involves collecting the data through observation, personal interviews, telephone interviews, mailed questionnaires, schedules, etc.

Administering Data Collection (Managing the project)

During the data collection process, the researcher must address the possible problem of bias in information collection. Possible sources of bias during data collection:

Defective instruments, such as questionnaires, weighing scales or other measuring equipment, etc

Observer bias

Effect of the interview on the informant


Information bias

o These sources of bias can be prevented by carefully planning the data collection process and by pre-testing the data collection tools.

o All these potential biases will threaten the validity and reliability of your study.

o By being aware of them, it is possible, to a certain extent, to prevent them.

Managing the project (data collection process) involves the following:

o Organizing fieldwork

o Briefing interviewers (enumerators) and coordinators

o Developing an analysis plan (e.g., coding)

o Organizing data processing (e.g., entering the coded questionnaire items into an Excel or SPSS spreadsheet; data entry can begin before data collection has been completed)

o Starting the analysis

o Checking and reporting progress of data collection

Step Six: Analysis of Data

A. Data Preparation

Once all the data required to answer the research questions and meet the research objectives has been collected, you can start analyzing it. First, you have to start with data preparation. Data preparation involves establishing categories, applying these categories to the raw data through coding, tabulating, and then drawing statistical inferences. Coding, data editing, tabulation, computation of different statistics, etc. follow.

Editing: Editing is the process of examining the collected raw data (especially in surveys) to detect errors and omissions and to correct these where possible.

Coding: Coding is the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes; such classes should be appropriate to the research problem.

Classification: The raw data must be grouped into classes on the basis of common characteristics; data sharing a common characteristic are placed in one class, and in this way the entire data set is divided into a number of groups or classes.

Tabulation/Compilation: Tabulation is the process of summarizing and displaying raw data in compact form (statistical tables) for further analysis. In a broader sense, tabulation is an orderly arrangement of data in columns and rows.
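The course itself uses SPSS for these steps; purely as an illustrative sketch (the variable names, responses and coding scheme below are invented for the example), the coding, classification and tabulation steps can also be expressed in Python with the pandas library:

```python
import pandas as pd

# Raw survey answers as collected (editing would already have corrected
# obvious errors and omissions at this point).
raw = pd.DataFrame({
    "sex":          ["male", "female", "female", "male", "female"],
    "satisfaction": ["high", "low", "high", "medium", "high"],
})

# Coding: assign numerals to answers so that responses fall into a
# limited number of classes appropriate to the research problem.
sat_codes = {"low": 1, "medium": 2, "high": 3}
raw["sat_code"] = raw["satisfaction"].map(sat_codes)

# Classification and tabulation: group cases by a common characteristic
# and display them as a compact statistical table (rows x columns).
table = pd.crosstab(raw["sex"], raw["satisfaction"])
print(table)
```

The cross-tabulation produced at the end is exactly the kind of compact statistical table the tabulation step describes; SPSS produces the equivalent through its Crosstabs procedure.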

B. Data Analysis:

This is a very important part of your study. Depending on the methods you selected during your research design, which in turn depend on the type of research (exploratory, descriptive, etc.) and the research approach (qualitative, quantitative, or both), you need to analyze the data using those methods. What is expected as a result of your analysis is findings pertinent to the research questions and objectives.

Step Seven: Hypothesis Testing:

For those types of research in which hypotheses are developed in advance based on theory, this is the time to test the hypotheses against the findings of your analysis. There are different types of statistical hypothesis tests: the Chi-square test, t-test, F-test, ANOVA, etc.
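As a minimal sketch of one such test (the group labels and scores are invented for the example; in this course the same test would normally be run through SPSS), an independent-samples t-test can be computed with Python's SciPy library:

```python
from scipy import stats

# Hypothetical scores for two independent groups (e.g., trained vs.
# untrained employees). H0: the two population means are equal.
group_a = [72, 75, 78, 71, 74, 77, 73]
group_b = [65, 68, 64, 70, 66, 67, 69]

# Independent-samples t-test assuming equal variances (the classic
# formulation); Welch's version is available via equal_var=False.
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# Reject H0 at the 5% significance level if p < 0.05.
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

Here the sample means differ substantially relative to the within-group spread, so the test rejects the null hypothesis of equal means.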

Step Eight: Generalization and Interpretation

Your analysis and hypothesis testing are of little use unless they lead you to certain generalizations. After a hypothesis has been tested several times, you may arrive at generalizations. Again, depending on the type of research, your generalization could take different forms. For example:

In exploratory research, it could be in the form of proposing a hypothesis that shall be tested by several other studies in the future.

In explanatory studies, your generalization could be in the form of statements regarding what factors explain the dependent variable and whether this is in accordance with what is stated in theory.

If you had no hypothesis at the beginning, you explain the findings on the basis of some theory; this is known as interpretation. The process of interpretation may trigger new questions, which can serve as a basis for further research.

Step Nine: Preparation of the Research Report

Your research effort and output are summarized and presented in the research report or thesis. If your language and writing style are not up to the expected standard, readers may find your work annoying and less interesting to read, no matter how good your topic and research questions may be. If it is your masters thesis and the report is not well written, you are likely to face problems when your advisor and examiners read your paper. If it is for publication in a journal, the editor or the blind reviewers will reject the paper unless it is well written in light of the expected style and standards.

1.5.2 The Inductive Research Process

The inductive research process typically flows as follows:

Research topic → Tentative working hypothesis → Tentative research design → Field work (data collection: observation, artifacts, interviews, hanging out, focus groups, field notes) → Analyze data → Refine the hypothesis (if necessary, collect additional data) → Develop theory → Write the report


Unit Two

Types of Research Design and the Sampling Design

2.1 Research Approaches

2.1.1 Types of Research based on the nature of enquiry

Based on the nature of the research enquiry, research approaches are classified as exploratory, descriptive, explanatory (causal) or predictive research.

A. Exploratory research:

Research undertaken to explore an issue or a topic in order to identify a problem, clarify the nature of the problem, or define the issues involved. It can be used to develop propositions (hypotheses) for further research and to gain new insights and a greater understanding of the issue, especially when little is known about it.

Characteristics of exploratory research:

It is qualitative rather than quantitative.

It is carried out on a small scale rather than a large scale.

It provides answers to the questions "what", "how" and "why".

It is concerned with hypothesis development rather than hypothesis testing.

It can be conducted as a pre-study: literature review, focus group, experience survey, brainstorming.

It can also be conducted as a main study: observation, case study/ies.

Exploratory studies can be carried out based on a literature search (review), an experience survey, or the analysis of selected cases. Observation, focus groups and interviews are useful methods of data collection for exploratory research.

B. Descriptive research (ex post facto research):

Fact-finding enquiries describing the state of affairs as it exists; the researcher has no control over the variables and can only report what has happened or what is happening, using survey or correlational methods. It aims at answering the questions: Who? What? Where? When? How? and How many? This type of research is carried out to answer more clearly defined research questions.

Classification of descriptive studies: Descriptive studies could be Longitudinal or Cross sectional.

i) Longitudinal: Studying units (e.g. households) over time (e.g. over several years). Such a study could be based on either a true panel or an omnibus panel.

True panel: In this case, the units of analysis included in the sample (e.g. households) are consistently studied over time. For example, if the study is about the consumption pattern of households and Ato Ayele’s household is part of the study, Ato Ayele’s household will be studied throughout the time period.

Omnibus panel: In this case, the members of the sample may change. In the previous example, Ato Ayele’s household could be part of the study in year I while it may not be part of the study in year II.


ii) Cross sectional: Studying different units (e.g. households, sub-cities, regions, etc) at a given point in time.

C. Causal or explanatory (hypothesis-testing/experimental):

Helps to develop causal explanations about variables/factors by addressing the ‘why’ questions: Why do people choose brand A and not brand B? Why are some customers and not others satisfied with a firm’s product? Why do some celebrities and not others use drugs? Explanatory research may involve experiments (laboratory or field experiments).

D. Predictive: to predict the likely future effects of current actions using the “if...then” proposition.

2.1.2 Types of research based on the mode of data collection

a) Continuous (longitudinal) research – data are collected over many years, with subjects followed over time.

b) Ad hoc (one time) research

2.1.3 Types of research based on the type of data

a. Quantitative: applied to phenomena that can be expressed in quantitative terms.

Involves the generation of data in quantitative form, which can be subjected to rigorous quantitative analysis in a formal or rigid fashion.

Can be further reclassified into inferential, experimental and simulation approaches to research.

o Inferential Approach: The purpose is to form a database from which to infer characteristics of, or relationships within, a population – survey research.

o Experimental Approach: Characterized by much greater control over the research environment and by the manipulation of some variables to observe their effect on other variables.

o Simulation Approach: Involves the construction of an artificial environment to permit observation of the dynamic behaviour of a system (or its sub-systems) under controlled conditions.

b. Qualitative: applied to quality or kind to describe the underlying motives of human behavior. E.g., motivation research deals with why people think or do certain things using depth interviews. Other techniques include: Word Association Test, Sentence completion test; Story completion test, etc.

o Concerned with subjective assessment of attitudes, opinions and behaviour.

o Research is a function of the researcher’s insights and impressions (judgements)

o The result is either in non-quantitative form or in forms that are not subjected to rigorous quantitative analysis.

o Utilizes techniques such as Focus Group Interviews, Projective Techniques and Depth Interviews.


2.1.4 Types of research based on the use of the research output:

a) Applied: to solve immediate problems of society

b) Fundamental/Basic/Pure: to develop theories- for knowledge’s sake.

2.1.5 Types of research based on the degree of theorization: Conceptual vs. Empirical:

Conceptual: Related to some abstract ideas or theory, generally used by philosophers and thinkers to develop new concepts or reinterpret existing ones.

Empirical: Relies on experience or observation alone; data-based research that starts with a working hypothesis or guess, followed by the collection of data to prove or disprove the hypothesis; characterized by the control and manipulation of variables.

2.1.6 Types of research based on the environment in which it is to be carried out

a) Field-Setting Research

b) Laboratory/Simulation Research

2.1.7 Types of research based on the cause of the research:

a) Conclusion-oriented (the researcher is free to pick a problem according to his/her wishes);

b) Decision-oriented (the research problem emanates from the needs of the decision maker – e.g., operations research)

2.2 Determining Sample Design

2.2.1 Key elements of the sampling design

While explaining the research design as the fourth step in your research process, we have seen that the sampling design is part of the research design or the research methodology. When you deal with the sample design, you are deciding the way of selecting a sample before data collection actually takes place. While developing a sampling design, you must pay attention to the following points:

(i) Type of universe: finite or infinite universe.

(ii) Sampling unit: A sampling unit may be a geographical one such as a state, district or village; a construction unit such as a house or flat; a social unit such as a family, club or school; or an individual. The researcher has to decide which unit(s) to select for the study.

(iii) Source list (sampling frame): the frame from which the sample is to be drawn. It contains the names of all items of the universe (in the case of a finite universe only). If a source list is not available, the researcher has to prepare one.

(iv) Size of sample: An optimum sample is one which fulfills the requirements of efficiency, representativeness, reliability and flexibility. Furthermore, the desired precision, an acceptable confidence level for the estimate, the parameters of interest and the budgetary constraint must invariably be taken into consideration when deciding the sample size.

(v) Parameters of interest: In determining the sample design, one must consider the specific population parameters which are of interest (e.g., mean, median, mode).

(vi) Budgetary constraint: cost considerations. These can even lead to the use of a non-probability sample.

(vii) Sampling procedure: There are several sample designs, out of which you must choose one for your study. Obviously, you should select the design which, for a given sample size and a given cost, has the smaller sampling error.
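The interplay of the desired precision, the confidence level and the sample size noted above is often captured by the textbook formula n = (zσ/e)². The following is a minimal sketch, assuming a hypothetical household-income study; the standard deviation and margin of error are made-up figures:

```python
import math

def sample_size(sigma, margin_of_error, z=1.96):
    """Sample size needed to estimate a population mean to within the
    desired margin of error at a given confidence level
    (z = 1.96 corresponds to 95% confidence): n = (z * sigma / e) ** 2."""
    return math.ceil((z * sigma / margin_of_error) ** 2)

# Hypothetical example: assumed income standard deviation of 500 birr,
# estimated to within +/- 50 birr at 95% confidence.
print(sample_size(sigma=500, margin_of_error=50))  # 385
```

Note that halving the margin of error quadruples the required sample size, which is exactly where the budgetary constraint above bites.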

2.2.2 Criteria of selecting a sampling procedure (mainly for deductive research)

While preparing your sampling design, you must remember that two costs are involved:

i) The cost of collecting the data and

ii) The cost of an incorrect inference resulting from the data

There are two causes of incorrect inferences: systematic bias and sampling error.

Systematic bias

i) A systematic bias results from errors in the sampling procedures, and it cannot be reduced or eliminated by increasing the sample size.

ii) At best the causes responsible for these errors can be detected and corrected.

Usually a systematic bias is the result of one or more of the following factors:

1. Inappropriate sampling frame:

2. Defective measuring device:

3. Non-respondents:

4. Indeterminacy principle: Sometimes we find that individuals act differently when kept under observation than they do in non-observed situations.

5. Natural bias in the reporting of data: People in general understate their incomes if asked about it for tax purposes, but they overstate the same if asked for social status or their affluence. Generally in psychological surveys, people tend to give what they think is the ‘correct’ answer rather than revealing their true feelings.

Sampling errors

Sampling errors are the random variations in the sample estimates around the true population parameters (e.g., the population mean).


Since they occur randomly and are equally likely to be in either direction, they are compensatory in nature and their expected value is zero.

Sampling error decreases as the sample size increases, and it is of smaller magnitude for a homogeneous population.

Increasing the sample size improves precision, but it has its own limitations: a large sample increases the cost of collecting data and can also magnify systematic bias.

Thus the effective way to increase precision is usually to select a better sampling design, one with a smaller sampling error for a given sample size at a given cost. In practice, however, researchers often prefer a less precise design because it is easier to adopt and because systematic bias can be controlled better in such a design.
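The claim that sampling error shrinks as the sample size grows can be checked empirically. Below is a minimal simulation sketch; the population values, seed and trial counts are arbitrary choices for illustration:

```python
import random
import statistics

random.seed(1)
# A synthetic population with a known centre and spread.
population = [random.gauss(50, 10) for _ in range(100_000)]

def std_error_of_mean(n, trials=500):
    """Empirical sampling error: the standard deviation of the sample
    mean across repeated simple random samples of size n."""
    means = [statistics.mean(random.sample(population, n))
             for _ in range(trials)]
    return statistics.stdev(means)

# The sampling error roughly halves each time n is quadrupled.
for n in (25, 100, 400):
    print(n, round(std_error_of_mean(n), 2))
```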

In brief, while selecting a sampling procedure, the researcher must ensure that the procedure causes a relatively small sampling error and helps to control systematic bias.

Characteristics of a good sampling design

From what has been stated above, we can list the characteristics of a good sample design as follows:

(a) Sample design must result in a truly representative sample.

(b) Sample design must be one which results in a small sampling error.

(c) Sample design must be viable in the context of funds available for the research study and in the context of practicality to pick the selected elements of the sample.

(d) Sample design must be such that systematic bias can be controlled in a better way.

(e) Sample should be such that the results of the sample study can be applied, in general, for the universe with a reasonable level of confidence.

2.2.3 Different Types of Sample Designs: Probability and non-probability

Sample design involves probability or non-probability sampling. This is necessary if we are not going for a census. A census is a complete enumeration of the entire population; the population and housing census conducted in Ethiopia in 1986 E.C. can be taken as an example.

There are several reasons for taking a sample (and hence a sampling design) instead of a complete enumeration of the whole population or census. These include:


a) A census may be very expensive.

b) A census may require too much time.

c) A carefully obtained sample may be more accurate than a census. For example, in a large inventory census or in a complete audit, errors due to fatigue or carelessness on the part of the census taker may introduce a serious bias in the results.

Broadly speaking, there are two types of sampling techniques: random sampling and non-random sampling. In random sampling, the elements to be included in the sample entirely depend on chance. Random sampling techniques often yield samples that are representative of the population from which they are drawn. In non-random sampling, the units in the sample are chosen by the investigator based on his/her personal convenience and beliefs.

Probability Sampling: includes sampling techniques such as Simple Random Sampling; Systematic Random Sampling; Stratified Sampling; Cluster/Area Sampling.

Non-Probability (Purposive) Sampling includes sampling techniques such as Convenience; Judgemental; Quota Sampling.

A. Random or Probability Sampling Techniques

Simple Random Sampling: This is a method of sampling in which every member of the population has the same chance of being included in the sample.

Systematic Random Sampling: In some instances, the most practical way of sampling is to select, say, every 20th name on a list, every 12th house on one side of a street, every 50th item coming off a production line, and so on. This is called systematic sampling, and an element of randomness can be introduced into this kind of sampling by using random numbers to pick the unit with which to start.
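Both techniques can be sketched in a few lines. This is a toy illustration with an assumed frame of 200 numbered households; the frame size and sample size are made up:

```python
import random

random.seed(42)
households = list(range(1, 201))  # an assumed sampling frame of 200 households

# Simple random sampling: every member has the same chance of inclusion.
srs = random.sample(households, 10)

# Systematic sampling: every k-th unit, with a random start
# in the first interval to introduce the element of randomness.
k = len(households) // 10      # sampling interval: every 20th household
start = random.randrange(k)
systematic = households[start::k]

print(sorted(srs))
print(systematic)
```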

Stratified Random Sampling: The methods of stratified sampling tend to be economically desirable if the population to be sampled can be divided into relatively homogeneous subdivisions or strata. Stratified random sampling is the procedure of dividing the population into relatively homogeneous groups, called strata, and then taking a simple random sample from each stratum. If the population elements are homogeneous, then there is no need to apply this technique.

Example: If our interest is the income of households in a city, then our strata may be:

low income households
middle income households
high income households

To obtain a sample from each stratum, we may follow two ways:

i. Taking a sample of size proportional to the sub-population (stratum) size, i.e., draw a large sample from a large stratum and a small sample from a small sub-population. This is known as proportional allocation.

ii. Selecting a sample from each stratum so that the variation due to sampling is minimized. This is known as optimum allocation.

iii. Selecting equal units from each stratum. This is known as equal allocation.

Cluster Sampling: This is a method of sampling in which the total population is divided into relatively small subdivisions, called clusters, and then some of these clusters are randomly selected using simple random sampling. Once the clusters are selected, one possibility is to use all the elements in the selected clusters. However, if elements within the selected clusters give similar results, it seems uneconomical to measure them all. In such cases, we take a random sample of elements from each of the selected clusters (called two-stage sampling).

Example: Suppose we want to make a survey on the ‘attitude and awareness of households about solid waste management (SWM)’ in Addis Ababa. Collecting information on each and every household is impractical from the point of view of cost and time. What we do is divide the city into a number of relatively small subdivisions, say, Kebeles. So the Kebeles are our clusters. Then we randomly select, say, 20 Kebeles using simple random sampling. To collect information about individual households, we have two options:

1) We visit all households in these 20 Kebeles, or,

2) We randomly select households from each of these 20 selected Kebeles using simple random sampling. This method is called two-stage sampling since simple random sampling is applied twice (first, to select a sample of Kebeles and second, to select a sample of households from the selected Kebeles).
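The Kebele example can be sketched as a two-stage procedure. This is a toy frame: the numbers of Kebeles, households per Kebele and sample sizes are invented for illustration:

```python
import random

random.seed(7)
# Assumed frame: 100 Kebeles (clusters), each with 50 listed households.
kebeles = {f"Kebele-{i}": [f"HH-{i}-{j}" for j in range(1, 51)]
           for i in range(1, 101)}

# Stage 1: simple random sample of 20 Kebeles.
selected_kebeles = random.sample(sorted(kebeles), 20)

# Stage 2: simple random sample of 5 households within each selected Kebele.
sample = [hh
          for kb in selected_kebeles
          for hh in random.sample(kebeles[kb], 5)]

print(len(selected_kebeles), len(sample))  # 20 100
```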

B. Non-random Sampling Techniques

Convenience, Haphazard or Accidental sampling (members of the population are chosen based on their relative ease of access)

Judgmental sampling or Purposive sampling (the researcher chooses the sample based on who he/she thinks would be appropriate for the study)

Purposive sampling starts with a purpose in mind and the sample is thus selected to include people or objects of interest and exclude those who do not suit the purpose. Purposive sampling can be subject to bias and error.


Case study (The research is limited to one group, often with a similar characteristic or of small size.)

Ad hoc quotas (A quota is established and researchers are free to choose any respondent they wish as long as the quota is met.)

Snowball sampling (The first respondent refers a friend. The friend also refers a friend, etc.)

Comparison: Probability and non-probability sampling

Probability sampling (or random sampling) is a sampling technique in which the probability of getting any particular sample may be calculated. Non-probability sampling does not meet this criterion and should be used with caution. Non-probability sampling techniques cannot be used to infer from the sample to the general population. Performing non-probability sampling is considerably less expensive than doing probability sampling, but the results are of limited value.

The difference between non-probability (accidental or purposive) and probability sampling is that non-probability sampling does not involve random selection and probability sampling does. Does that mean non-probability samples aren't representative of the population? Not necessarily. But it does mean that non-probability samples cannot depend upon the rationale of probability theory. At least with a probabilistic sample, we know the odds or probability that we have represented the population well. We are able to estimate confidence intervals for the statistic. With non-probability samples, we may or may not represent the population well, and it will often be hard for us to know how well we've done so. In general, researchers prefer probabilistic or random sampling methods over non-probabilistic ones, and consider them to be more accurate and rigorous.
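The point that a probability sample lets us attach a confidence interval to an estimate can be illustrated with a small sketch using synthetic data; the population parameters and seed are arbitrary:

```python
import math
import random
import statistics

random.seed(3)
# Synthetic population with centre ~100 and spread ~15.
population = [random.gauss(100, 15) for _ in range(50_000)]

# A simple random (probability) sample supports probability-based inference.
sample = random.sample(population, 400)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval for the population mean.
ci = (mean - 1.96 * se, mean + 1.96 * se)
print(round(mean, 1), tuple(round(x, 1) for x in ci))
```

No analogous interval can be computed for a convenience or quota sample, because the selection probabilities are unknown.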


Unit Three

Data Types/Sources and Methods of Data Collection

3.1 Data sources/types: primary and secondary

Generally, the data you need for your research are classified into two categories: primary data and secondary data.

Primary Data: Data that you collect for the first time by yourselves for your own purpose. For example, you may measure the heights of students in a class using a meter.

Secondary data: Data that have been collected by others for their own purpose or for a general purpose. In this case, you had no control over the design and data collection. Examples include government data (economic and demographic), media reports (TV, newspapers, the Internet), etc.

As a general rule, primary data sources are preferred to secondary sources since the primary source contains much pertinent information about collection methods and limitations associated with the data. If the information is derived from a secondary source, for instance, it is possible that the data might have been altered for some reason. However, it is also common that a particular research could employ both primary and secondary data.

3.2 Methods of Collecting Primary Data

3.2.1 Observation method:-

Description of the Method: In the observation method, the researcher records the behavioural patterns of people, objects and events in a systematic manner. Observational methods may be: participant or non-participant; structured or unstructured; disguised or undisguised; and personal or mechanical.

Advantages of the Observation Method:

It helps in overcoming issues of validity, bias, etc.

It is useful when the subject cannot provide information.

It is also useful when the subject is feared to provide inaccurate information.

The researcher can have first-hand experience of the study setting.

It allows the recording and reporting of findings that are true to the topic at hand (depending, of course, on the variables of the project being studied).

Disadvantages of the Observation Method:

Past events cannot be observed.

Attitudes and opinions are difficult to measure.

Selecting the sample is tricky.

Time and costs are high (though observation can be automated).

Ethical issues may arise.

There may be too few trials, studies or objects observed to draw a firm conclusion.


3.2.2 Interview Method

Description of the Method: Interviewing is a technique that is primarily used to gain an understanding of the underlying reasons and motivations for people’s attitudes, preferences or behaviour. Interviews can be undertaken on a personal, one-to-one basis or, where appropriate, by telephone. They can be conducted at work, at home, in the street, in a shopping centre, or at some other agreed location.

There are three types of interview:

i. Structured interview: A structured interview means that the questions are developed ahead of time, with some opportunity to ask pre-planned, open-ended, probing questions.

ii. Semi-structured interview: In a semi-structured interview, the interviewer has some set questions but can also ask spontaneous questions.

iii. Unstructured Interview: This interview is also called an in-depth interview. The interviewer begins by asking a general question. The interviewer then encourages the respondent to talk freely.

A. Personal interview:

A personal interview is a process taking place between an interviewer (the person asking the questions) and an interviewee (the person answering them).

Advantages of the Personal Interview:

Good response rate.

In-depth questions are possible.

Can investigate motives and feelings.

Can use recording equipment.

Can be completed in a set time, with immediate responses.

The interviewer is in control and can give help if there is a problem.

A serious approach by the respondent results in accurate information.

Characteristics of the respondent can be assessed – tone of voice, facial expression, hesitation, etc.

Disadvantages of the Personal Interview:

Interviews need to be set up.

Time consuming.

Geographic limitation.

Can be expensive.

Normally needs a set of questions.

Embarrassment is possible if there are personal questions.

If there are many interviewees, interviewer training is required.

Respondent bias – a tendency to please or impress, create a false personal image, or end the interview quickly.

Steps in conducting personal interview

List the areas in which you require information

Decide on type of interview – structured, semi-structured, unstructured

Transform the areas into actual questions

Make an appointment with respondent(s) – discussing details of why and how long.


Try and fix a venue and time when you will not be disturbed.

B. Telephonic interview:

This is an alternative form of interview to the personal, face-to-face interview.

Telephonic interviews are less time-consuming and less expensive, and the researcher has ready access to anyone who has a telephone.

Advantages of the Telephone Interview:

Relatively cheap and quick.

Can cover reasonably large numbers of people or organizations.

Wide geographic coverage.

No waiting, and responses are spontaneous.

Help can be given to the respondent.

Answers can be taped.

Disadvantages of the Telephone Interview:

A questionnaire is required.

Not everyone has a telephone.

Repeat calls are inevitable.

Questions must be straightforward.

The respondent has little time to think over an issue.

A good telephone manner is required.

3.2.3 Questionnaire Method:

Description: A questionnaire is a series of written questions on a topic about which the subjects’ opinions are sought. In this method of data collection, the questionnaire is sent to the respondent through post or e-mail, with a request to fill it in and send it back to the researcher. A questionnaire consists of a number of well-formulated questions, printed or typed in a definite order, to probe and obtain responses from respondents. The form and content of a questionnaire therefore vary from situation to situation.

Advantages of the Questionnaire Method:

Can be used as a method in its own right or as a basis for interviewing or a telephone survey.

Can be posted, e-mailed or faxed.

Can cover a large number of people or organizations.

Wide geographic coverage.

Relatively cheap and no prior arrangements are needed.

Avoids embarrassment on the part of the respondent.

Respondent can consider responses.

Possible anonymity of respondent.

No interviewer bias.

Disadvantages of the Questionnaire Method:

Design problems can hamper the research.

Questions have to be relatively simple.

Historically low response rate (although inducements may help).

Time delay whilst waiting for responses to be returned.

Requires a return deadline.

Several reminders may be required.

Assumes no literacy problems.

No control over who completes it.

Problems with incomplete questionnaires.


3.2.4 Focus Group Discussion method

A focus group could be defined as a group of interacting individuals (8-12 people in one group) having some common interest or characteristics, brought together by a researcher, who uses the group and its interaction as a way to gain information about a specific or focused issue. FGDs are facilitated by a moderator.

Advantages of Focus Groups:

Takes advantage of the fact that people naturally interact and are influenced by others (high face validity).

Provide data more quickly and at lower cost than if individuals were interviewed.

Generally requires less preparation and is comparatively easy to conduct.

The researcher can interact directly with respondents (allows clarification, follow-up questions, probing).

Data use respondents’ own words: one can obtain deeper levels of meaning, make important connections, and identify subtle nuances.

Very flexible; can be used with a wide range of topics, individuals, and settings.

Disadvantages of Focus Groups:

The researcher has less control over the group and is less able to control what information will be produced.

Produces relatively chaotic data making data analysis more difficult.

Small numbers and convenience sampling severely limit ability to generalize to larger populations.

Requires carefully trained interviewer who is knowledgeable about group dynamics.

Researcher may knowingly or unknowingly bias results by providing cues about what types of responses are desirable.

Uncertainty about accuracy of what participants say. Results may be biased by presence of a very dominant or opinionated member; more reserved members may be hesitant to talk.

3.3 Secondary data

In our modern world there is an unbelievable mass of data that is routinely collected by governments, businesses, colleges, and other national and international organizations. Much of this information is stored in electronic databases that can be accessed and analyzed. Secondary data can also be obtained from published researches, government and non-government policy documents and reports, internal records, and so on. Secondary data is taken by the researcher from secondary sources, internal or external.

The researcher must thoroughly search secondary data sources before commissioning any effort to collect primary data; there are many advantages in searching for and analyzing secondary data before attempting the collection of primary data.

Usually the cost of gathering secondary data is much lower than the cost of organizing primary data. Moreover, secondary data has several supplementary uses.

It also helps to plan the collection of primary data, in case, it becomes necessary.


Advantages of Secondary data analysis:

Secondary data analysis has several advantages:

It makes use of data that were already collected by someone else.

It often allows the researcher to extend the scope of a study considerably.

It saves time that would otherwise be spent collecting data.

It provides a larger database (usually) than would be possible to collect on one’s own.

In many small research projects it is impossible to consider taking a national sample because of the costs involved.

Many archived databases are already national in scope and, by using them, a researcher can leverage a relatively small budget into a much broader study than if the data were collected first-hand.

Disadvantage of secondary data:

You may have less control over how the data was collected.

There may be biases in the data that you don’t know about.

Its answers may not exactly fit your research questions.

It may be obsolete data.

Old secondary data collections can distort the results of the research.

Secondary data can also raise issues of authenticity and copyright.

3.4 Guideline for Designing a Questionnaire and other Instruments

3.4.1 Guideline for Choice/Design of Data Collection Instruments in general

In the design of data collection instruments, the decision about question content, wording and order are the result of a process that considers the following:

i) What is the research problem?: The problem definition and objectives of the research.

ii) What type(s) of evidence is needed to address it?: Exploratory, descriptive, causal or explanatory

iii) What ideas, concepts, variables are we measuring? Content, definition and indicators

iv) What type(s) of data is(are) appropriate? Qualitative, quantitative, both.

v) From whom should we collect the data? Nature of the target population or sample (e.g., their education level, cultural background, etc)

vi) What method of data collection is most suitable? Observation, interviews, questionnaire or schedule, face-to-face or telephone, e-mail, web or postal.


vii) Where will the data be collected? In the street/shopping centre. At respondents’ office or home.

viii) How will responses be captured? Pen and paper, computer, audio and/or video recording, photographs.

ix) What are the constraints? Time and/or budget.

x) How will the responses be analyzed? By computer and/or manually.

3.4.2 Designing a Questionnaire

Masters students often conduct surveys and use questionnaires to collect data. However, they face difficulties in designing the questionnaire. The way you design the questionnaire or schedule has a big role to play in helping you or the enumerator gather the data accurately and effectively, and in helping the respondents provide accurate, complete and reliable data.

Why worry about the quality of the questionnaire/schedule?

i) A poorly designed questionnaire can result in an unpleasant experience for the respondents and adversely affect their perception of research, reducing their willingness to cooperate in future research.

ii) A poor introduction and description of the research (e.g., purpose) can lead to high level of non-response, adversely affecting the representativeness of the sample.

iii) Poorly conceived questions not measuring what they claim to measure mean the data collected are not valid.

iv) Questions that are beyond the knowledge of the respondent, or that require recall of distant past events, result in inaccurate and unreliable data.

v) Poorly worded questions (using ambiguous, vague, difficult, unusual or overly technical terms) can be misunderstood, misinterpreted or interpreted differently by different people, resulting in unreliable and invalid data.

vi) A badly structured questionnaire (that begins with difficult, sensitive or personal questions) can result in refusal to answer or complete the questionnaire.

vii) Poor question order can result in order bias or contamination of later responses by earlier questions.

viii) Long, boring or repetitive questions may result in a loss of interest or produce inaccurate responses.

ix) Too long questionnaire results in respondents fatigue, loss of interest

x) Poor layout can lead to errors in recording, coding and data processing.


Guideline to the Questionnaire Design Process:

i) Decide on the question content: This is done by clarifying the research objectives (the information requirements) and what exactly it is that the question needs to measure.

o Some questions require standard answer options. For example, marital status has standard answer options (single or never married, married, living as married, separated, divorced, widowed). While developing the content of a questionnaire, clarify the meaning (concepts, definitions and indicators). If you are not clear about the concepts and their indicators, it is difficult to craft the questions with the right wording of the statements.

ii) Ensure Proper Wording of the Questions:

Each question should be worded so that the following hold:

It measures what it claims to measure

It is relevant, meaningful and acceptable to the respondent

It is understandable to the enumerator as well as the respondent

It is interpreted in the way in which you intend by all respondents.

It elicits a clear, unambiguous, accurate and meaningful response

Examples of vaguely worded questions:

“How much money do you earn?” - What type of earning is this question referring to (from work? investment? remittances? social benefits?); what time period is it referring to (daily? weekly? annually?)

“Do you have a personal computer?” – Is this question referring to ownership of the computer or the type of computer? What is meant by “you”? (Myself? My household?) What is meant by “have”? (Own? Or have access to?)

Other pitfalls in wording of questions:

Using technical jargon and abbreviations: Example – refurbishments, fiscal policy, monetary policy, UNHCR, UNESCO, etc.

Using words that are difficult to read out or pronounce: E.g., ‘In an anonymous form’.

Use of double-barrelled questions: Example- ‘Do you like using e-mail and the web’; ‘Would you like to be rich and famous?’

Use of negatively phrased questions: ‘Do you agree that it is NOT the job of the government to take decisions about the following?’

Use of very long questions

Use of questions that challenge the respondent’s memory: Example – ‘How many hours of television did you watch last month?’; ‘List the books you have read in the last year’

Including leading questions: Example - ‘Public speeches against racism should not be allowed. Do you agree or disagree?’


Wording questions using sensitive or loaded ‘non-neutral’ questions: Example – ‘What do you think of welfare for the poor?’

Questions that make assumptions: Example – ‘How often do you travel to rural areas?’ (this wording assumes that the respondent travels to rural areas); ‘When did you stop beating your wife?’

Questions with overlapping response categories: Example – ‘How many hours did you spend in the library yesterday?’ (response categories: 0-1 hours, 1-2 hours, 2-3 hours, etc.)

Questions with insufficient response categories: Example – ‘How do you travel to work each day?’ (response categories: by my own car, on foot, by public bus, on a bicycle). In this case other modes are missing (e.g., by service bus, by a friend/colleague’s car, by motorbike, by Bajaj, by cart, etc.) and the possibility to choose more than one mode is not provided.

Questions on sensitive topics: Example – ‘Which political party did you vote for in the May 7 election?’
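Pitfalls such as overlapping response categories can also be caught mechanically when reviewing a draft questionnaire. The following is a minimal sketch in Python (the function name and the tuple encoding of categories are illustrative, not part of this handout):

```python
def categories_overlap(categories):
    """Check whether numeric response categories overlap.

    Each category is an (inclusive_low, inclusive_high) tuple,
    e.g. the flawed scale 0-1, 1-2, 2-3 hours.
    """
    ordered = sorted(categories)
    for (lo1, hi1), (lo2, hi2) in zip(ordered, ordered[1:]):
        if lo2 <= hi1:  # next category starts before the previous one ends
            return True
    return False

# The flawed example above: 0-1, 1-2, 2-3 hours share boundary values.
flawed = [(0, 1), (1, 2), (2, 3)]
# A mutually exclusive alternative.
fixed = [(0, 1), (2, 3), (4, 5)]
```

Running the check on the flawed scale flags the shared boundary values; rewriting the scale as ‘less than 1 hour, 1 to less than 2 hours, …’ removes the overlap.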

iii) Follow the right question order

Put your questions into an effective and logical order. Do not ask sensitive and difficult questions too early. It is also preferable to ask personal questions (e.g., classification questions such as those on age, income, etc.) at the end.

Classify your questions into groups and provide a brief introduction to each group. Within each group (module), begin with general questions and then move on to specific questions.

If a particular question is not relevant to some respondents, indicate that they can skip it and specify which question they should jump to.
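Such skip instructions can be made explicit before fieldwork by writing the routing down as a table. A minimal sketch, with hypothetical question identifiers (nothing here is prescribed by the handout):

```python
# Skip instructions encoded as a routing table:
# (question, answer) -> next question.
SKIP_RULES = {
    ("Q3", "No"): "Q7",  # e.g., respondents without a computer skip the usage questions
}

def next_question(current, answer, default_order):
    """Return the next question id, honouring any skip rule for this answer."""
    if (current, answer) in SKIP_RULES:
        return SKIP_RULES[(current, answer)]
    i = default_order.index(current)
    return default_order[i + 1] if i + 1 < len(default_order) else None

ORDER = ["Q1", "Q2", "Q3", "Q4", "Q5", "Q6", "Q7"]
```

Encoding the routing this way lets you verify, before printing the questionnaire, that every (question, answer) pair leads somewhere valid.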

iv) Make good layout and appearance:

The layout should enable the respondent to fill in the questionnaire easily. Instructions to the interviewer should be in CAPITALIZED and BOLD format, while the question text and answer categories are in lower case, not bold.

v) Optimize questionnaire Length:

The questionnaire must be long enough to cover the research objectives but not so long that it inflates the research cost and the demand on respondents’ time.

Recommended maximum length:

o For an in-home face-to-face interview (schedule): 45 – 60 minutes.

o For Telephone interview: about 20 minutes

o For Street interview: 5 to 10 minutes.

vi) Conduct Pilot Study and make necessary changes:

Test the questionnaire out to identify its pitfalls and correct them before you go for the full-scale survey.


Unit Four

Research Proposal, Referencing, Reporting Results and Ethical Considerations

4.1 The Research Proposal

4.1.1 The Need for Research Proposal

Before you embark upon the research for your masters’ thesis, you will be required to submit a research proposal. Most journals and calls for conference papers also require the submission of a research proposal, abstract or synopsis.

4.1.2 Structure of Research Proposal

A research proposal should include the following sections.

1) Cover page;

2) Introduction;

3) Statement of the research problem;

4) Research objectives/hypotheses/key concepts;

5) Research methods/preliminary survey of literature;

6) References;

7) Timetable/time schedule (research plan); and

8) Budget.

4.1.3 Content of research proposal: a brief Guide

This brief guideline is prepared to help you in the process of preparing your research proposal for the research area and research topics you have chosen (as reflected in the research title you have submitted).

Cover page: The title should be as short as possible but should adequately represent the topic and the research problem/objective.

1. The Introduction

1.1 General Background

Your research proposal should have an introduction. The introduction should give a general background in relation to the research area/topic and arouse the interest of readers. Also indicate the debate/controversy in the literature over the topic or issue you intend to deal with in your research. You should demonstrate the relevance of your research to theory (to the debate) and/or to practice (policy).


1.2 Background to the study area

Provide a brief background to the geographic area in which the study will be conducted. Make your description relevant to the research topic. Avoid unnecessary descriptions.

1.3 Statement of the problem

In this part you are expected to state the problem that your research aims at addressing. The problem can be stated from theoretical point of view and/or practical point of view. As part of the problem statement, you should provide a justification as to why this research has to be done. If similar research has been done by others, you have to indicate their gaps. Your problem statement should end by specifying the general and specific research questions that the research aims at answering.

1.4 Scope of the study:

Indicate or delimit the thematic and geographic scope of the study.

1.5 Significance of the study

Indicate the theoretical and/or practical significance of your research.

1.6 Limitations of the study

Though a complete enumeration of the limitations can only be done after you have gathered the data and done the analysis, you should anticipate the possible limitations or weaknesses of your study. Shortage of time and money are not limitations. The limitations could, for instance, relate to the sampling procedures and sample size, the quality of the data, the data analysis techniques used, etc.

2. Theoretical/Conceptual Framework

This is the part where you will provide a summary of the literature review you have conducted. Your review should lead to a clear definition of the theoretical framework and the conceptual framework.

2.1 Theoretical framework

The theoretical framework refers to a summary of the theories that you will refer to in your study. Review the relevant literature and identify what the theory says about the issue/topic you are addressing. Elaborate the different perspectives. Also provide alternative definitions (if any) of the important concepts and variables that you will use in your research. The literature review will help you identify the relevant variables that could potentially be used as indicators/measures for your concepts. If you are going to investigate the relationship among variables, show what the theories state about the relationships (i.e. about the direction of relationships and significance).

What you summarize as part of the theoretical framework has to be very relevant to the topic, and particularly to the research problem and the research questions. While conducting the literature survey, students often throw in whatever literature is in one way or another related to the research area but not necessarily to the research problem. Failing to prepare the theoretical framework properly has at least the following disadvantages:

You will not have the basis to define relevant concepts, identify the assessment issue or the variables and define them.

You don’t know what relationship to expect.

It will not be easy for you to choose the appropriate research design.

Your data collection instruments will be ill designed.

During analysis, you will not have any theory to compare your results with.

The contribution of your research to the existing theory will be blurred.

2.2 Conceptual Framework

Based on the theoretical framework, you are expected to develop your conceptual framework. This is the part in which you are going to define the concepts operationally (in the way you would like your readers to understand them) and where you will select your factors/variables. In the literature, concepts may be defined in different ways, and you will have to make a choice here. In your study, how are the concepts defined operationally? What are the variables and indicators that you will use in your study to measure the concepts? How do the different concepts relate to each other? These and other questions should be answered via your conceptual framework. For instance, education quality can be measured using a number of indicators. Among these indicators (which you must have identified and defined in the conceptual framework), clearly show which ones you are going to use in your research. Consider the usefulness/appropriateness of the indicator as well as data availability and feasibility in your choice of the variables.

If you are doing qualitative research that involves, for instance, assessments, you should make clear the assessment themes/issues and the indicators. If you can add a diagrammatic representation of the conceptual framework, it will add a visual effect to your operationalization. The diagram depicts the concepts/issues and how they relate to each other. You may create your own illustrative diagram or adapt one from the literature. In the latter case, you have to clearly cite the source of your diagram.

2.3 Hypothesis

In deductive research designs, it is necessary that you formulate hypotheses. Once you are clear about the theory and your conceptual framework, you can state some hypotheses. A hypothesis is a tentative assumption made in order to draw out and test its logical or empirical consequences. It should be specific and pertinent to the piece of research in hand. The hypothesis provides the focal point for your research: it delimits the area, sharpens thinking and keeps the researcher on the right track. Also remember that your hypothesis determines the data type required, the data collection and sampling methods to be used, and the tests that must be conducted during data analysis.
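To make the last point concrete: if your hypothesis were, say, ‘mean service satisfaction differs between two districts’, then the data type (interval scores from both districts) and the test (a two-sample comparison of means) follow directly. A sketch of Welch’s t statistic using only Python’s standard library is given below; the data are hypothetical, and in practice SPSS or a similar package would normally do this:

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for comparing two independent sample means."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se

# Hypothetical 1-5 satisfaction scores from two districts.
district_a = [3, 4, 4, 5, 3, 4]
district_b = [2, 3, 3, 2, 4, 3]
t = welch_t(district_a, district_b)
```

The computed t would then be compared with the critical value of the t distribution at your chosen significance level.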

3. Methodology (Research design):


Once you have developed your hypothesis, the next step is to craft your research design. A research design is like the blueprint for house construction. If you start building a house without first having the design (consisting of the architectural, electrical, sanitary, etc. designs), you do not know what type of house you will end up with; it will be costly and time-consuming, often involving construction and demolition of what has already been built. Most importantly, the house will lack quality and may be prone to risks. Likewise, research conducted without a research design at hand is aimless, ambiguous, time-consuming, costly, and may be totally irrelevant and unacceptable in light of the requirements for a scientific investigation. Research design refers to the crafting of the conceptual structure within which the research will be conducted in a way that is as efficient as possible, collecting the relevant evidence with minimal expenditure of effort, time and money.

3.1 Research type:

Describe the type of your research based on some commonly known criteria. More specifically, describe the type of your research based on the nature of the research enquiry (e.g., exploratory, descriptive, etc); the mode of data collection; the type of the data; and so on.

3.2 Data type, source and data collection techniques

In this part, describe:

The data type (primary and secondary)

The sources of data (primary and secondary sources)

The methods of data collection (which method to which data type – interview, questionnaire, FGD, Observation, etc)

The data collection procedure you are going to follow

3.3 Sampling Design

In this part, indicate your sample population, sampling frame and sampling unit; determine the sample size; and show the sampling techniques and procedures.
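For determining the sample size when the population size N is known, one commonly used shortcut is Yamane's (1967) simplified formula, n = N / (1 + N·e²), where e is the tolerated margin of error. This particular formula is an illustration, not a requirement of the handout:

```python
import math

def yamane_sample_size(population, margin_of_error=0.05):
    """Yamane's (1967) simplified sample-size formula: n = N / (1 + N * e^2)."""
    n = population / (1 + population * margin_of_error ** 2)
    return math.ceil(n)

# e.g., a sampling frame of 1,000 households at a 5% margin of error.
```

For a frame of 1,000 households at a 5% margin of error this gives 286. More rigorous approaches (e.g., Cochran's formula with an estimated proportion) exist and may be preferable for your design.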

3.4 Method of data analysis

Indicate how you are going to analyze the data you will gather. Specify the techniques and statistical packages you will use (if applicable).
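As an illustration of what "specifying the techniques" can look like (the data and variable names are hypothetical), even simple descriptive statistics can be named precisely in the proposal, e.g. the mean, median and standard deviation of a Likert item:

```python
import statistics

# Hypothetical responses on a 1-5 Likert item.
scores = [4, 3, 5, 4, 2, 4, 3, 5, 4, 4]

summary = {
    "n": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "stdev": round(statistics.stdev(scores), 2),  # sample standard deviation
}
```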

4. Timeline and budget

Identify the major activities in your research and assign a timeline (a start and finish period) to each. The Gantt chart is a useful tool in this respect. Then attach a budget to the activities.
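A Gantt chart is simply activities plotted against time. A minimal text rendering, with hypothetical activities over a 12-month timeline (any spreadsheet or project tool would normally draw this):

```python
def gantt_row(activity, start, end, total_months=12, width=18):
    """Render one Gantt-chart row as text; months are 1-indexed and inclusive."""
    bar = "".join("#" if start <= m <= end else "." for m in range(1, total_months + 1))
    return f"{activity:<{width}} {bar}"

# Hypothetical research plan: (activity, start month, end month).
plan = [
    ("Proposal", 1, 2),
    ("Data collection", 3, 5),
    ("Analysis", 6, 8),
    ("Write-up", 9, 12),
]
chart = "\n".join(gantt_row(a, s, e) for a, s, e in plan)
```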

5. List of References

Include a list of references. References from journal articles and books are more reliable. Also include and clearly indicate other resources such as government policy documents, reports, magazine articles, conference papers and proceedings, legal rules/regulations, theses, online databases, and other unpublished works.


Annexes:

Additional information that is not directly part of the proposal, but which is considered to be relevant for the understanding of the project, should be attached to the research proposal as an annex.

4.2 Referencing Styles: The APA Referencing Style

The American Psychological Association reference style uses the Author-Date format.

Refer to the Publication Manual of the American Psychological Association (6th ed.) for more information. Check the Library Catalogue for call number and location(s).

When quoting directly or indirectly from a source, the source must be acknowledged in the text by author name and year of publication. If quoting directly, a location reference such as page number(s) or paragraph number is also required.

IN-TEXT CITATION

Direct quotation – use quotation marks around the quote and include page numbers

Samovar and Porter (1997) point out that "language involves attaching meaning to symbols" (p.188).

Alternatively, “Language involves attaching meaning to symbols" (Samovar & Porter, 1997, p.188).

Indirect quotation/paraphrasing – no quotation marks

Attaching meaning to symbols is considered to be the origin of written language (Samovar & Porter, 1997).

N.B. Page numbers are optional when paraphrasing, although it is useful to include them (Publication Manual, p. 171).

Citations from a secondary source

As Hall (1977) asserts, “culture also defines boundaries of different groups” (as cited in Samovar & Porter, 1997, p. 14).

At the end of your assignment, you are required to provide the full bibliographic information for each source.

References must be listed in alphabetical order by author.
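The author-date pattern shown above is regular enough to be generated mechanically. A small sketch covering the cases illustrated in this section (the function is illustrative, not an official APA tool):

```python
def apa_in_text(authors, year, page=None):
    """Build an APA-style parenthetical citation, e.g. (Samovar & Porter, 1997, p. 188)."""
    if len(authors) == 1:
        names = authors[0]
    elif len(authors) == 2:
        names = f"{authors[0]} & {authors[1]}"
    else:  # three or more authors: serial comma before the ampersand
        names = ", ".join(authors[:-1]) + ", & " + authors[-1]
    cite = f"{names}, {year}"
    if page is not None:
        cite += f", p. {page}"
    return f"({cite})"
```

Note that APA (6th ed.) abbreviates citations with three to five authors to "First et al." after the first occurrence; that refinement is omitted here for brevity.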

EXAMPLES OF REFERENCES BY TYPE


(Retrieved on Jan. 14, 2013, from http://www.libraries.psu.edu/psul/lls/students/apa_citation.html)

Books

Important Elements:

Author (last name, initials only for first & middle names)

Publication date

Title (in italics; capitalize only the first word of title and subtitle, and proper nouns)

Place of publication

Publisher

Citing Books

Source Example Citation

Book by a single author Rollin, B. E. (2006). Science and ethics. New York, NY: Cambridge University Press.

Book by two authors Sherman, C., & Price, G. (2001). The invisible web: Uncovering information sources search engines can’t see. Medford, NJ: CyberAge Books.

Book by three or more authors

Goodpaster, K. E., Nash, L. L., & de Bettignies, H. (2006). Business ethics: Policies and persons (3rd ed.). Boston, MA: McGraw-Hill/Irwin.

Book by a corporate author

American Medical Association. (2004). American Medical Association family medical guide (4th ed.). Hoboken, NJ: Wiley.

Article or chapter within an edited book

Winne, P. H. (2001). Self-regulated learning viewed from models of information processing. In B.J. Zimmerman & D.H. Schunk (Eds.), Self-regulated learning and academic achievement (2nd ed., pp. 160-192). Mahwah, NJ: Lawrence Erlbaum Associates.

Translation Tolstoy, L. (2006). War and peace. (A. Briggs, Trans.). New York, NY: Viking. (Original work published 1865).


Articles from Print Periodicals (magazines, journals, and newspapers)

Important Elements:

Author (last name, initials only for first & middle names)

Date of publication of article (year and month for monthly publications; year, month and day for daily or weekly publications)

Title of article (capitalize only the first word of title and subtitle, and proper nouns)

Title of publication in italics (i.e., Journal of Abnormal Psychology, Newsweek, New York Times)

Volume and issue number

Page numbers of article

Citing Articles from Print Periodicals

Source Example Citation

Article in a monthly magazine (include volume # if given)

Swedin, E. G. (2006, May/June). Designing babies: A eugenics race with China? The Futurist, 40, 18-21.

Article in a weekly magazine (include volume # if given)

Will, G. F. (2004, July 5). Waging war on Wal-Mart. Newsweek, 144, 64.

Article in a daily newspaper

Dougherty, R. (2006, January 11). Jury convicts man in drunk driving death. Centre Daily Times, p. 1A.

Rimer, S. (2003, September 3). A campus fad that’s being copied: Internet plagiarism seems on the rise. New York Times, p. B7.

Article in a scholarly journal

Stock, C. D., & Fisher, P. A. (2006). Language delays among foster children: Implications for policy and practice. Child Welfare, 85(3), 445-462.

Book review Rifkind, D. (2005, April 10). Breaking their vows. [Review of the book The mermaid chair, by S.M. Kidd]. Washington Post, p. T6.

Electronic Resources - including online articles, websites, and blogs

The following guidelines for electronic sources follow the recommendations in the sixth edition (2009) of the Publication Manual of the American Psychological Association.

Articles from the Library’s Online Subscription Databases

Important Elements:

Publication information (see Print Periodicals, above)

DOI number (if available). More information about DOI numbers is available on the American Psychological Association's APA Style page.

If the DOI number is not available, APA recommends giving the URL of the publication. If the URL is not known, include the database name and accession number, if known: Retrieved from ERIC database (ED496394).

Citing Articles from the Library’s Online Subscription Databases

Source Example Citation

Magazine article with URL

Poe, M. (2006, September). The hive. Atlantic Monthly, 298, 86-95. Retrieved from http://www.theatlantic.com

Journal article with DOI

Blattner, J., & Bacigalupo, A. (2007). Using emotional intelligence to develop executive leadership and team and organizational development. Consulting Psychology Journal: Practice and Research, 59(3), 209-219. doi:10.1037/1065-9293.59.3.209

Articles in Online Journals, Magazines and Newspapers

Important Elements

Author (last name, initials only for first & middle names)

Date of publication of article


Title of article

Title of publication (in italics)

Volume and issue number (for scholarly journals, if given)

Page numbers, if given

DOI number, if given. More information about DOI numbers is available on the American Psychological Association's APA Style page.

If the DOI is not available, give the URL (Web address) of the article.

Citing Articles in Online Journals, Magazines and Newspapers

Source Example Citation

Article in an online scholarly journal Overbay, A., Patterson, A. S., & Grable, L. (2009). On the outs: Learning

styles, resistance to change, and teacher retention. Contemporary Issues inTechnology and Teacher Education, 9(3). Retrieved from http://www.citejournal.org/vol9/iss3/currentpractice/article1.cfm

Article in an online magazine

Romm, J. (2008, February 27). The cold truth about climate change. Salon.com. Retrieved from http://www.salon.com

Article in an online newspaper

McCarthy, M. (2004, May 24). Only nuclear power can now halt global warming. Earthtimes. Retrieved from http://www.earthtimes.org

Web Sites: Important Elements

Author (if known)

Date of publication, copyright date, or date of last update

Title of Web site

Date you accessed the information (APA recommends including this if the information is likely to change)

URL (Web address) of the site

Citing Web Sites

Source Example Citation


Web site with author Kraizer, S. (2005). Safe child. Retrieved February 29, 2008, from http://www.safechild.org/

Web site with corporate author

Substance Abuse and Mental Health Services Administration (SAMHSA). (2008, February 15). Stop underage drinking. Retrieved February 29, 2008, from http://www.stopalcoholabuse.gov

Web site with unknown author

Penn State myths. (2006). Retrieved December 6, 2011, from http://www.psu.edu/ur/about/myths.html

Page within a Web site (unknown author)

Global warming 101. (2012). In Union of Concerned Scientists. Retrieved December 14, 2012, from http://www.ucsusa.org/global_warming/global_warming_101/

Electronic Books

Important Elements:

Author (last name, initials only for first & middle names)

Publication date

Title (in italics; capitalize only the first word of title and subtitle, and proper nouns)

Place of publication

Publisher

URL (Web address) of the site from which you accessed the book

Citing Electronic Books

Source Example Citation

Electronic Book McKernan, B. (2005). Digital cinema: The revolution in cinematography, postproduction, and distribution. New York, NY: McGraw-Hill. Retrieved from www.netlibrary.com.


Post, E. (1923). Etiquette in society, in business, in politics, and at home. New York, NY: Funk & Wagnalls. Retrieved from http://books.google.com/books.

Multimedia Resources - including motion pictures and television

Motion Picture (film, video, DVD): Important Elements

Director

Date of release

Title (in italics)

Country where motion picture was made

Studio

Citing Films, Videos, DVDs

Source Example Citation

Motion Picture Johnston, J. (Director). (2004). Hidalgo. [Motion Picture]. United States: Touchstone/Disney.

Television Program: Important Elements

Producer

Date of broadcast

Title of television episode

Title of series (in italics)

Location of network and network name

Citing Television Programs

Source Example Citation

Television program in series

Buckner, N. & Whittlesey, R. (Writers, Producers & Directors). (2006). Dogs and more dogs. [Television series episode]. In P. Apsell (Senior Executive Producer), NOVA. Boston: WGBH.

Government Publications: Important Elements

Government Agency

Date of publication

Title of document (in italics)

Place of publication

Publisher

Citing Government Publications

Source Example Citation

Government document

U.S. Dept. of Housing and Urban Development. (2000). Breaking the cycle of domestic violence: Know the facts. Washington, DC: U.S. Government Printing Office.

Citing Indirect Sources

If you refer to a source that is cited in another source, list only the source you consulted directly (the secondary source) in your reference list. Name the original source in the text of your paper, and cite the secondary source in parentheses: “Wallace argues that…. (as cited in Smith, 2009).” In this example, only the Smith source would be included in the reference list.

Whenever possible, try to find and consult the original source. If the Penn State University Libraries does not have the original source, we can try to get it for you through interlibrary loan.

4.3 Report Writing: Communicating the Results

4.3.1 Structure of your research report (Thesis):

It depends on the type of study. Generally, it includes:

1. Title page, Table of contents, acronyms, abstract, etc

2. Main body:


– Chapter 1: Introduction

– Chapter 2: Theoretical and Conceptual Framework

– Chapter 3: Methodology

– Chapter 4: Results and Discussion

– Chapter 5: Conclusion and Recommendations

3. Annexes

4.3.2 Your sentences and paragraphs

Pay attention to your sentences and paragraphs:

• Sentences: Grammar, spelling, mechanics, sentence construction (subject-verb agreement, flow), etc.

• Paragraphs:

– not too long, not too short;

– Convey one idea in one paragraph,

– Usually a paragraph has an introductory and a concluding sentence

– Check for flow, coherence, economy, etc

– Use connecting words/phrases; avoid repetition of words/phrases

4.3.3 While presenting the results (findings):

– Do not report one result in different formats (e.g. table, text, graph) – use one!

– Do not repeat the results that are presented in a table or graph in your paragraphs; write what you observe (trends, patterns, averages, etc)

4.3.4 When discussing the results/findings

Do not report the results again! Focus on the why part (the reasons/explanations) for the findings, drawing on your data or on theory, or give your own interpretation

Link with theory/other studies, your hypothesis

Use the specific objectives of your study as a guide

4.3.5 When writing the conclusion and recommendations:

The conclusions should be based on your findings

While concluding, answer your research questions/address your specific objectives

Recommendations should be based on your conclusions


While recommending solutions, indicate how your recommendations could be put into practice (the how part)

4.4 Preparing and Delivering a Presentation

What do the advisor and the examiner(s) expect from you? Content-wise, be focused! Don’t present everything! You could focus on:

The problem, research questions

Your conceptual model & variables

Methodology (in short)

Summary of Key findings, results of hypothesis testing, and comparison with theory (or interpretation) and other studies; Main Conclusion

What else do they need from you?

Dressing – elegant. Dress for the audience!

Language fluency – Correct grammar, pronunciations

Eye contact – don’t look at the roof or the floor or only one person!

Voice (loud enough, not so noisy, attractive sound),

Speed – medium,

Self-confidence (show that it is your work),

Openness (transparency), receptive of comments

Honesty – Say I don’t know this if you really don’t know something – Don’t pretend or be too defensive

4.5 Ethical Issues/Considerations

4.5.1 Ethical Issues related to the Purpose of Research

However lofty the stated purposes of research, the product of research in the public sector may be to provide tools for manipulation and control for some segments of society at the expense of others.

• For example, the tendency to describe some populations as deviant leads away from focusing on larger problems of the distribution of political, economic, and social power.

• Social scientists need to be aware of the possible uses to which their research may be put.

Research should not only enhance the researcher's career, but also benefit the group, organization, or population studied.

Those who fund and conduct research also reap its benefits.


4.5.2 Ethical Issues related to the Subject Matter

What populations can be studied with little risk or harm?

Are there some populations which are routinely subjects of research, while others are ignored?

Populations with little social or political power (e.g., children, the elderly, the poor, the mentally disabled, students, parents, criminals, delinquents, addicts, the military, etc.) are often targets of research, while those with substantial power are not.

Those who "own" or "run" organizations are usually in charge of the research that goes on in them.

The people who are the "subjects" of the research may have neither the power to shape the research nor the ability to refuse to participate.

Is participation voluntary?

1. Participants must be voluntary and not coerced.

2. Participants cannot be threatened with a loss of other, unrelated benefits (e.g., food stamps, bilingual education).

3. Participants cannot be offered unreasonably large inducements to participate (e.g., prisoners).

4. Information must be provided about all risks or potential risks of participation including physical harm, pain, discomfort, embarrassment, loss of privacy; exposure to illness, etc.

5. Information must be provided about all benefits or potential benefits of participation, for example, free health care, monetary incentives, the value of the research to science, etc.

6. The ratio of risks to benefits should be stated.

7. Are the benefits sufficient to allow participants to put themselves at risk? Should the study be done at all?

8. Are the participants' rights and well being sufficiently protected?

9. Are the means of obtaining informed consent adequate and appropriate?

10. Participants can withdraw from the study at any time, refuse to comply with any part of the study and refuse to answer any questions.

4.4.3 Ethical Issues Related to the Methods

Most ethical violations correspond to illegitimate use of the investigator's power.

Researchers need to be trained, as social scientists, to be concerned with people as well as with research design, methodology, etc.

Ethical concerns include:


1. Involvement without consent:

- through participant observation or covert observation;

- through unknown intervention in ongoing programs or operations;

- through field experiments.

2. Disguising the true nature or purpose of the research:

- the way it will be used is not revealed;

- information is withheld that would affect informed consent;

3. Deceiving the research participant:

- to conceal the purpose of the research;

- to conceal the true function of the participant's actions;

- to conceal the experiences the participants will have to undergo;

4. Leading participants to commit acts that lessen their self-esteem:

- cheating, lying, stealing, harming others;

- yielding to social pressure contrary to one's ideas;

- prohibiting the rendering of aid when needed;

- behavioral control or character change;

- denial of the right to self-determination;

5. Coercion that abridges freedom of choice:

- research is linked to participation in organizational or institutional programs;

- requests for participation are worded in such a way that it is difficult to say no;

- participation is made a requirement of a college course;

6. Physical or mental stress:

horror, threat to identity, failure, fear, emotional shock.

7. Invasion of privacy:

- covert observation;

- unnecessary questions of personal nature on interviews or questionnaires;


- disguised, indirect, or projective tests;

- using third-party information without consent;

- it is also the ethical responsibility of the researcher to ensure that the data are accurately collected, coded, entered, analyzed, and interpreted, so as not to perform a disservice to the subject population.

After the project is over, the researcher should:

- remove any harmful after-effects from the participants;

- maintain anonymity and/or confidentiality;

- publish the findings in reports and articles;

- store the data for use by other researchers in the future;

- inform participants of the results if they so choose;

- inform colleagues and professional associates of the research.


UNIT FIVE:

The Survey Method and Case Studies

5.1 The Survey Method

5.1.1 What is a survey?

A survey is a detailed and quantified description of a population. Surveys attempt to identify something about a population, that is, a set of objects about which we wish to make generalizations. A population is frequently a set of people, but organizations, institutions or even countries can constitute the unit of analysis.

Surveys involve the systematic collection of data, whether by interview, questionnaire or observation methods, so at the very heart of surveys lies the importance of standardization. Precise samples are selected for surveying, and attempts are made to standardize and eliminate errors from survey data-gathering tools.

A particular form of survey, the census, is a study of every member of a given population. For example, the Central Statistical Agency of Ethiopia conducts a population and housing census every ten years. A census provides essential data for government policy makers and planners, but is also useful, for example, to businesses that want to know about trends in consumer behavior such as ownership of durable goods, and demand for services.

5.1.2 Characteristics of Survey Methods

The survey method involves a team of enumerators going into urban/rural areas and eliciting data via answers to questions on a structured form.

A large number of observations can be collected within a certain period of time from a relatively large number of respondents.

The sample can be spread over a wide area, which has statistical value, thereby making the study somewhat more generalizable.

At the same time, the principal researcher need not spend too much time in the field.

Surveys require respondents to answer questions about their opinions, attitudes, or preferences and about socio demographic characteristics of respondents.

It is essentially cross-sectional (conducted during a particular period of time).

It is not concerned with the characteristics of individuals.

It involves a clearly defined problem.

It requires expert, imaginative planning.

It involves definite objectives.


It requires careful analysis and interpretation of the data gathered.

It requires logical and skilful reporting of the findings.

5.1.4 Types of Survey Studies

There are three criteria for classifying survey research:

(a) Nature of variables: i) Status survey, or ii) Survey research

(b) Group Measured: i) Sample or ii) Population

(c) Sources of data collection:

i. Questionnaire

ii. Interview

iii. Controlled observation survey.

5.1.5 Stages of the survey method


[Figure not reproduced: stages of the survey method. Source: Gray (2004)]


5.2 Case Studies

Surveys are used where large amounts of data have to be collected, often from a large, diverse and widely distributed population. In contrast, case studies tend to be much more specific in focus. While surveys tend to collect data on a limited range of topics but from many people, case studies can explore many themes and subjects, but from a much more focused range of people, organizations or contexts. The case study method can be used for a wide variety of issues, including the evaluation of training programmes (a common subject), organizational performance, project design and implementation, policy analysis and relationships between different sectors of an organization or between organizations. Case studies, then, explore subjects and issues where relationships may be ambiguous or uncertain. But, in contrast to methods such as descriptive surveys, case studies are also trying to attribute causal relationships and are not just describing a situation. The approach is particularly useful when the researcher is trying to uncover a relationship between a phenomenon and the context in which it is occurring. For example, a business might want to evaluate the factors that have made a recent merger a success (to prepare the ground for future mergers). The problem here, as with all case studies, is that the contextual variables (timing, global economic circumstances, cultures of the merging organizations, etc.) are so numerous that a purely experimental approach revealing causal associations would simply be unfeasible.

The case study approach requires the collection of multiple sources of data but, if the researcher is not to be overwhelmed, these need to become focused in some way. Therefore case studies benefit from the prior development of a theoretical position to help direct the data collection and analysis process. Note that the case study method often (but not always) tends to be deductive rather than inductive in character.

The case study is both a method and a tool for research. A case study can lead to very novel ideas that are no longer limited to the particular individual studied. In a case study, the investigator tries to collect bits of evidence in support of a proposition. Methodologically, a case study is not a longitudinal study, but it draws on whatever methods yield information about the case as fully as possible.

Therefore, a case study is conducted for a specific case. A case study is, in essence, a study in depth, where depth means exploring all the peculiarities of the case. It gives detailed knowledge about the phenomenon, but this knowledge cannot be generalized beyond the case. In the physical sciences every unit is a true representative of the population, but in the social sciences units may not be true representatives of the population, because there are individual differences as well as intra-individual differences. Therefore, predictions cannot be made on the basis of knowledge obtained from a case study, and no statistical inferences can be drawn from the exploration of a single phenomenon.

Here 'case' does not necessarily mean an individual. A case is a unit: it may be an institution, a nation, a religion, an individual or a concept.


WHEN SHOULD WE USE CASE STUDIES?

The case study method is ideal when a ‘how’ or ‘why’ question is being asked about a contemporary set of events over which the researcher has no control.

Source: Yin (1994)


SOURCES OF DATA IN CASE STUDY

[Figure not reproduced: sources of data in case study. Source: adapted from Yin (1994) by Gray (2004)]


PART TWO

Presenting and Analyzing Quantitative Data

(With SPSS Application)

Contents:

Unit 6: Analyzing Quantitative Data - Descriptive Statistics: Basic concepts in statistics; Classification and Presentation of Statistical Data (bar chart, pie chart, histogram); Measures of central tendency and dispersion (mean, median, mode, mean deviation, variance, standard deviation, covariance, Z-score); Exercise with SPSS Application

Unit 7: Analyzing Quantitative Data- Tests of hypothesis concerning means and proportions:

Tests of hypotheses concerning means; Tests concerning the difference between two means (independent samples);

Tests of mean difference between several populations (independent samples); Paired-samples t-test (Differences between dependent groups);

Tests of association (the Pearson coefficient of correlation and test of its significance, The Spearman rank correlation coefficient and test of its significance); Nonparametric Correlations (The Chi-square test);

Hypothesis test for the difference between two proportions; Exercise with SPSS Application

Unit 8: Analyzing Quantitative Data - The simple linear regression model and Statistical Inference;

The simple linear regression model, estimation of regression coefficients and interpreting results;

Hypothesis testing; Exercise with SPSS application

Unit 9: Analyzing Quantitative Data - The multiple linear regression model and Statistical Inference;

The multiple linear regression model, estimation of regression coefficients and interpreting results;

Hypothesis testing; Exercise with SPSS application


Unit Six

Analyzing Quantitative Data: Descriptive Statistics

6.1 Basic concepts in statistics

6.1.1 What is Statistics?

Statistics is a science pertaining to the collection, presentation, analysis and interpretation or explanation of data. Data can then be subjected to statistical analysis, serving two related purposes: description and inference.

Descriptive statistics summarize the data by describing what was observed in the sample, numerically or graphically.

Inferential statistics uses patterns revealed through analysis of sample data to draw inferences about the population represented.

For a sample to be used as a guide to an entire population, it is important that it is truly representative of that overall population. Appropriate and scientific sampling procedures assure that the inferences and conclusions can be safely extended from the sample to the population as a whole.

The raw materials for any statistical analysis are the data. Once data are collected, we have to organize and describe them in a concise manner so that they become meaningful. To determine their significance, we display the data in the form of tables, graphs and charts, so that we have a good overall picture of the data. Then we analyze the data, i.e., we calculate summary measures such as the mean and standard deviation, assess the extent of relationship (correlation) between two or more variables, and the like. Finally, based on the analysis, we make generalizations and arrive at reasonable decisions.
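These steps can be sketched briefly in code. The following is a minimal illustration with a small hypothetical sample (the values are my own; a package such as SPSS, used in this module, reports the same quantities):

```python
# Sketch: computing the summary measures mentioned above for a small
# hypothetical sample (values are illustrative only).
import statistics

sample = [12, 15, 11, 14, 18, 15, 13]

mean = statistics.mean(sample)    # arithmetic mean
stdev = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
print(mean, round(stdev, 3))
```

The same logic extends to the other measures listed in the Unit 6 contents (median, mode, variance, etc.), all of which are available in the same module.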

6.1.2 Limitations of statistics

Although statistics is widely applied and has shown its merit in planning, policy making, marketing decisions, quality control, medical studies, etc., it has some limitations:

(a) Statistical laws are not exact. They are probabilistic in nature, and inferences based on them are only approximate.

(b) Statistics is liable to be misused. It deals with figures which are innocent by themselves, but which can be easily distorted and manipulated.

Example: Information released from the President’s Office of a certain university concerning minority students states that their number has increased from 10 to 20 in this academic year. The release also stated that the student population of the university has also increased from 1000 to 2500. Based on this information, a newspaper headline reads:

Number of minority students doubled


The newspaper headline above strayed from the content of the main feature. Focusing on one aspect of the data – the number of minority students has increased from 10 to 20 – the newspapers ignored the other fact, that is, the student population of the university has increased from 1000 to 2500 this year. The fact is that the percentage of minority students has decreased: from 1% last year to 0.8% this year.
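The arithmetic behind this example can be checked in a couple of lines (a sketch using the figures given above):

```python
# Sketch: re-checking the minority-enrolment arithmetic from the example.
minority_last, total_last = 10, 1000
minority_now, total_now = 20, 2500

share_last = 100 * minority_last / total_last  # percentage last year
share_now = 100 * minority_now / total_now     # percentage this year

print(f"last year: {share_last}%  this year: {share_now}%")
# The count doubled, yet the share fell from 1% to 0.8%.
```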

6.1.3 Some Basic Terms in Statistics

In collecting data concerning the characteristics of a group of individuals or objects, it is often impossible or impractical (from the point of view of time and cost) to observe the entire group. In such cases, instead of examining the entire group, called the population, we examine only a small part of it, called a sample.

Definition: A population is the set of all elements that belong to a certain defined group. A sample is a part (or a subset) of the population.

Definition: Numerical characteristic of a population is called a parameter. Numerical characteristic of a sample is called a statistic.

6.2 Classification and Presentation of Statistical Data

6.2.1 Introduction

We can apply various sampling techniques and methods of data collection to obtain the data of interest. In its original form, such a data set is a mere aggregate of numbers and hence is not very helpful in extracting information. So, we need to summarize and display the information in a readily digestible form. This may take various forms, such as ordering the data according to their magnitude, compiling them into tables, or graphing them to form a visual image. By doing so, a good overall picture and sufficient information can often be attained.

6.2.2 Scales of measurement and types of classification of data

A. Scales of measurement of data

i. Nominal Scale: The nominal scale assigns numbers as a way to label or identify characteristics. The numbers assigned have no quantitative meaning beyond indicating the presence or absence of the characteristic under investigation. In other words, the numbers are not obtained as a result of a counting or measurement process.

For example, we can record the gender of respondents as 0 and 1, where 0 stands for male and 1 stands for female. The numbers we assign for the various categories are purely arbitrary, and any arithmetic operation applied to these numbers is meaningless.

ii. Ordinal Scale: The ordinal scale is the next higher level of measurement precision. It ensures that the possible categories can be placed in a specific order (rank) or in some 'natural' way. Again, the numbers are not obtained as a result of a counting or measurement process, and consequently, arithmetic operations are not allowed. For example, responses on health service provision can be coded as 1, 2, 3 and 4: 1 for poor, 2 for moderate, 3 for good and 4 for excellent. It is quite obvious that there is some natural ordering: the category 'excellent' (coded as 4) indicates better health service provision than the category 'moderate' (coded as 2), and thus order relations are meaningful.

iii. Interval Scale: The interval scale is the second highest level of measurement precision. Unlike the nominal and ordinal scales of measurement, the numbers in an interval scale are obtained as a result of a measurement process and have some unit of measurement. Also, the differences between any two adjacent points on any part of the scale are meaningful. However, a point cannot be considered to be a multiple of another; that is, ratios have no meaningful interpretation. For example, the Celsius temperature scale, which subdivides the distance between the freezing and boiling points of water into 100 equally spaced parts, is an interval scale. There is a meaningful difference between 30 degrees Celsius and 12 degrees Celsius; however, a temperature of 20 degrees Celsius cannot be interpreted as twice as hot as a temperature of 10 degrees Celsius.

iv. Ratio Scale: The ratio scale represents the highest form of measurement precision. In addition to the properties of all lower scales of measurement, it possesses the additional feature that ratios have a meaningful interpretation. Furthermore, there is no restriction on the kind of statistics that can be computed for ratio-scaled data. For example, the height of individuals (in centimeters), the annual profit of firms (in Birr) and plot elevation (in meters) are measured on ratio scales. The statement 'the annual profit of Firm X is twice as large as that of Firm Y' has a meaningful interpretation.

Why is level of measurement important?

a) First, knowing the level of measurement helps you decide on how to interpret the data. For example, if you know that a measure is nominal, then you know that the numerical values are just short codes for the longer names.

b) Second, knowing the level of measurement helps you decide what statistical analysis is appropriate on the values that were assigned. If a measure is nominal, for instance, then you know that you would never average the data values.
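As a brief illustration of point (b), using the gender and service-rating codings from the examples above (the data values themselves are hypothetical):

```python
# Sketch: the measurement scale constrains which summaries are meaningful.
# Gender coded 0 = male, 1 = female (nominal); service ratings coded
# 1 = poor ... 4 = excellent (ordinal), as in the examples above.
import statistics

gender = [0, 1, 1, 0, 1, 1, 0, 1]
rating = [3, 1, 4, 2, 4, 3, 3, 2]

# Nominal: only counting-based summaries make sense.
print(statistics.mode(gender))  # modal (most frequent) category
# statistics.mean(gender) would run, but "mean gender = 0.625" is meaningless.

# Ordinal: order-based summaries such as the median are meaningful.
print(statistics.median(rating))
```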

6.2.3 Types of classification of data

The word variable is often used in the study of statistics, so it is important to understand its meaning. A variable is a characteristic that may assume more than one of a set of values to which a numerical measure can be assigned. Sex, age, amount of income, region or country of birth, grades obtained at school and mode of transportation to work are all examples of variables.

There are broadly three types of data that can be employed in quantitative analysis: time series data, cross-sectional data, and panel data.

i) Time series data: Time series data, as the name suggests, are data that have been collected over a period of time on one or more variables. Time series data have associated with them a particular frequency of observation or collection of data points. The frequency is simply a measure of the interval over, or the regularity with which, the data are collected or recorded.

Examples: the daily Dow Jones stock market closing average for the past 90 days, a firm's quarterly sales over the past 5 years, etc.


The data may be quantitative (e.g. exchange rates, prices, number of shares outstanding), or qualitative (e.g. the day of the week, the number of the financial products purchased by private individuals over a period of time, etc.).

ii) Cross-sectional data: Cross-sectional data are data on one or more variables collected at a single point in time. Such data do not have a meaningful sequence. For example, the data might be on:

- sales of 30 companies;

- productivity of each sales division;

- a cross-section of stock returns on the New York Stock Exchange (NYSE).

iii) Panel data: Panel data have the dimensions of both time series and cross-sections, e.g. the daily prices of a number of blue chip stocks over two years.

Note:

i) For time series data, it is usual to denote the individual observation numbers using the index t, and the total number of observations available for analysis by T. For cross-sectional data, the individual observation numbers are indicated using the index i, and the total number of observations available for analysis by N.

ii) In contrast to the time series case, there is no natural ordering of the observations in a cross-sectional sample. For example, the observations i might be on the price of bonds of different firms at a particular point in time, ordered alphabetically by company name. In a time series context, on the other hand, the ordering of the data is relevant, since the data are usually ordered chronologically.

6.2.4 Continuous and discrete variables

As well as classifying data as being of the time series or cross-sectional type, we can also distinguish data as being either continuous or discrete.

i) A quantitative variable that has a ‘connected’ string of possible values at all points along the number line, with no gaps between them, is called a continuous variable. In other words, a variable is said to be continuous if it can assume an infinite number of real values within a certain range. It can take on any value and is not confined to take specific numbers. The values of such variables are often obtained by measuring. Examples of a continuous variable are distance, age and daily revenue.

The measurement of a continuous variable is restricted by the methods used, or by the accuracy of the measuring instruments. For example, the height of a student is a continuous variable because a student may be 1.6321748755... meters tall. However, when the height of a person is measured, it is usually measured to the nearest centimeter. Thus, this student's height would be recorded as 1.63 m.

ii) A quantitative variable that has ‘separate’ values at specific points along the number line, with gaps between them, is called a discrete variable. Such variables can only take on certain values, which are usually integers (whole numbers), and are often defined to be count numbers (i.e., obtained by counting).

The number of people in a particular shopping mall per hour or the number of shares traded during a day are examples of discrete variables. These can take on values such as 0, 1, 2, 3, ... In these cases, having 86.3 people in the mall or 585.7 shares traded would not make sense.


6.2.5 Methods of Summarizing and Presenting Quantitative Data

A. Grouped Frequency Distribution

Raw data in their original form are a mere aggregate of numbers and hence not very helpful in extracting information. So, we need to summarize and display the information contained in a readily digestible form. One such method is the grouped frequency distribution.

Definition: A grouped frequency distribution is a table in which the observed values of a variable are grouped into classes, together with the number of observed values falling into each class. The number of observed values that belong to a particular predefined interval (or class) is called its frequency.

Example: The following data represent the average monthly number of road fatalities (deaths due to traffic accidents) for a total of 50 major roads in a city:

46.6 61.5 50.3 57.8 52.7 56.0 55.2 56.0 51.2 53.9
52.4 59.2 55.7 59.3 50.6 58.8 47.4 57.1 48.8 60.5
55.4 53.6 53.0 53.3 48.8 58.0 50.9 63.8 55.2 52.4
43.2 52.6 49.6 58.3 47.8 54.5 47.1 53.3 59.1 45.8
56.1 54.6 57.6 57.9 56.8 57.3 51.8 56.8 57.0 53.9

These data may be summarized into a grouped frequency distribution as:

Table: Frequency distribution of the average monthly number of road fatalities

Average monthly number of road fatalities (class limits)

Frequency (number of major roads)

43.2 – 46.6 3

46.7 – 50.1 6

50.2 – 53.6 13

53.7 – 57.1 15

57.2 – 60.6 11

60.7 – 64.1 2

Total 50
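The grouped frequency distribution above can be reproduced with a short script (a sketch using the 50 observations and the class limits from the table):

```python
# Sketch: building the grouped frequency distribution for the
# road-fatality data above (class limits as in the table).
data = [
    46.6, 61.5, 50.3, 57.8, 52.7, 56.0, 55.2, 56.0, 51.2, 53.9,
    52.4, 59.2, 55.7, 59.3, 50.6, 58.8, 47.4, 57.1, 48.8, 60.5,
    55.4, 53.6, 53.0, 53.3, 48.8, 58.0, 50.9, 63.8, 55.2, 52.4,
    43.2, 52.6, 49.6, 58.3, 47.8, 54.5, 47.1, 53.3, 59.1, 45.8,
    56.1, 54.6, 57.6, 57.9, 56.8, 57.3, 51.8, 56.8, 57.0, 53.9,
]

classes = [(43.2, 46.6), (46.7, 50.1), (50.2, 53.6),
           (53.7, 57.1), (57.2, 60.6), (60.7, 64.1)]

# Count how many observations fall within each class (inclusive limits).
freq = [sum(1 for v in data if lo <= v <= hi) for lo, hi in classes]
for (lo, hi), f in zip(classes, freq):
    print(f"{lo:.1f} - {hi:.1f}: {f}")
# frequencies: [3, 6, 13, 15, 11, 2], summing to 50
```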


B. Graphical presentation of data

Graphs are effective visual tools because they present information quickly and easily. It is not surprising, then, that graphs are commonly used by print and electronic media. Data are often better understood when presented in a graph rather than a table, because a graph can easily reveal a trend (the rise or decline of a variable over time) and is a simpler visual aid for comparison purposes. Some of the reasons why we use graphs when presenting data include: they are quick and direct; they facilitate understanding of the data; they can convince readers; and they can be easily remembered.

If you have decided that using a graph is the best method to relay your message, then some of the guidelines to follow are:

1. Define your target audience.

Ask yourself the following questions to help you understand more about your audience and what their needs are: Who is your target audience? What do they know about the issue? What do they expect to see? What do they want to know? What will they do with the information?

2. Determine the message(s) to be transmitted.

Ask yourself the following questions to figure out what your message is and why it is important: What do the data show? Is there more than one main message? What aspect of the message(s) should be highlighted? Can all of the message(s) be displayed on the same graphic?

Knowing what type of graph to use with what type of information is crucial. Depending on the nature of the data some graphs might be more appropriate than others. There are many different types of graphs that can be used to convey information. These include vertical line graphs, bar graphs (charts), pie charts and histograms, among others.

The presentation of data in the form of tables, graphs and charts is an important part of the process of data analysis and report writing.

While results can be expressed within the text of a report, data are usually more digestible if they are presented in the form of a table or graphical display.

Graphs and charts can quickly convey to the reader the essential points or trends in the data.

Some general recommendations to follow when presenting data:

The presentation should be as simple as possible; avoid the trap of adding too much information. A good rule of thumb is to present only one idea, or to have only one purpose, for each graph or chart you create.

The presentation should be self-explanatory.

The title should be clear and concise, indicating what, when and where the data were obtained.

Codes, legends and labels should be clear and concise, following standard formats if possible.

The use of footnotes is advised to explain essential features of the data that are critical for the correct interpretation of the graph or chart.

Data Presentation Tools


Several types of statistical/data presentation tools exist, including:

1. charts displaying frequencies (bar, pie, and Pareto charts);

2. charts displaying trends (run and control charts);

3. charts displaying distributions (histograms); and

4. charts displaying associations (scatter diagrams).

Different types of data require different kinds of statistical tools. There are two types of data:

Attribute data are countable data or data that can be put into categories: e.g., the number of people willing to pay, the number of complaints, the percentage who want blue/red/yellow.

Variable data are measurement data, based on some continuous scale: e.g., length, time and cost.

To show: frequency of occurrence (simple percentages or comparisons of magnitude).
Use: bar chart, pie chart, Pareto chart.
Data needed: tallies by category (attribute data, or variable data divided into categories).

To show: trends over time.
Use: line graph, run chart, control chart.
Data needed: measurements taken in chronological order (attribute or variable data can be used).

To show: distribution (variation not related to time).
Use: histogram.
Data needed: forty or more measurements (not necessarily in chronological order; variable data).

To show: association (looking for a correlation between two things).
Use: scatter diagram.
Data needed: forty or more paired measurements (measures of both things of interest; variable data).

i) The bar graph (chart)

Bar graphs are one of the many techniques used to present data in a visual form so that the reader may readily recognize patterns or trends. Bar graphs usually present categorical (qualitative) variables or numeric (discrete) variables grouped in class intervals.

Example: The following data are on the bed sizes (that is, the total number of beds available for patient use) in three hospitals for the years 2003-2005.


Table: The bed sizes of three hospitals from 2003 to 2005

Hospital 2003 2004 2005

A 40 45 45

B 25 60 60

C 35 45 75

Total 100 150 180

The simple bar chart does not consider the contribution of each hospital to the total bed size. It simply provides information on the aggregate bed sizes in the three hospitals for the years 2003-2005.
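The yearly totals plotted in the simple bar chart can be derived from the table as follows (a sketch; the dictionary layout is my own):

```python
# Sketch: aggregating the hospital bed-size table above into the yearly
# totals shown in the simple bar chart (Figure 1).
beds = {
    "A": {2003: 40, 2004: 45, 2005: 45},
    "B": {2003: 25, 2004: 60, 2005: 60},
    "C": {2003: 35, 2004: 45, 2005: 75},
}

# Sum over hospitals for each year.
totals = {year: sum(h[year] for h in beds.values())
          for year in (2003, 2004, 2005)}
print(totals)  # {2003: 100, 2004: 150, 2005: 180}
```

The same per-hospital dictionaries hold the series needed for the multiple bar chart described below, where each hospital is drawn as a separate bar within each year.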


Figure 1: A simple bar chart for the total bed size in three hospitals

A multiple bar chart (double (or group) bar graph) is another effective means of comparing sets of data. This type of vertical bar graph gives two or more pieces of information for each item on the x-axis instead of just one as in Figure 1. In this particular example it allows you to make direct comparisons of values across categories on the same graph, where the values are bed sizes and the categories are the years 2003 – 2005. The graph is shown in Figure 2 below.



Figure 2: A multiple bar chart for the bed sizes in three hospitals

From Figure 2, for example, we can see that the total number of beds available for patient use in Hospital C has consistently increased from 2003 to 2005. On the other hand, the bed size of Hospital B has increased from 2003 to 2004, but remained the same in 2005. When comparison is made between hospitals, we can see that Hospital A had the highest bed size in 2003, but gave way to Hospital B in 2004 and to Hospital C in 2005.

ii) The pie chart

A pie chart is used to summarize a set of categorical data or to display the different values of a given variable by means of a percentage distribution. This type of chart is a circle divided into a series of segments (or sectors), each representing a particular category. The area of each segment is the same proportion of the circle as the category is of the total data set. The pie chart is quite popular because the circle provides a visual concept of the whole (100%).

Example: A sample of 100 adults was asked what they feel is 'the most important issue facing today's youth' among: unemployment, youth violence, rising school fees, drugs in schools, and career counselling. The results are shown below:

Table: Adults’ opinion on the most important issue facing the youth

Issue Number of adults

Unemployment 38

Youth violence 8

Rising school fees 12

Drugs in schools 22

Career counselling 20

Total 100


First we have to find the percentage contribution of each category (issue), and then the angle measures of the sectors representing each category of responses have to be calculated.

Issue Percentage share Angle measure of sector

Unemployment 38/100 = 38% 38% × 360° = 136.8°

Youth violence 8/100 = 8% 8% × 360° = 28.8°

Rising school fees 12/100 = 12% 12% × 360° = 43.2°

Drugs in schools 22/100 = 22% 22% × 360° = 79.2°

Career counselling 20/100 = 20% 20% × 360° = 72°

Total 100% 360°

Finally, we partition the circle into sectors using the angle measures above.
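The share-and-angle computation above can be reproduced with a short script (Python is used here purely for illustration; the handout's own analyses use SPSS):

```python
# Compute each category's percentage share and pie-sector angle
# from the adult-opinion data in the table above.
counts = {
    "Unemployment": 38,
    "Youth violence": 8,
    "Rising school fees": 12,
    "Drugs in schools": 22,
    "Career counselling": 20,
}

total = sum(counts.values())      # 100 adults in the sample
for issue, n in counts.items():
    share = n / total             # proportion of the whole
    angle = share * 360           # sector angle in degrees
    print(f"{issue}: {share:.0%} -> {angle:.1f} degrees")
```

Each sector angle is simply the category's share of the whole circle, so the angles must sum back to 360°.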

Figure 3: A pie chart of the opinion of adults as to the most important issue facing today's youth (unemployment 38%, drugs in schools 22%, career counselling 20%, rising school fees 12%, youth violence 8%)

iii) The histogram

The most common form of graphical presentation of a grouped frequency distribution is the histogram. It is used to summarize variables whose values are numerical and measured on an interval scale. It divides up the range of possible values in a data set into classes or groups. For each group, a rectangle is constructed with a base length equal to the range of values in that specific group (or class width), and an area proportional to the number of observations falling into that group.

Example: Consider the grouped frequency distribution of the average monthly number of road fatalities on 50 major roads in a city, and construct a histogram.


Table: Frequency distribution of the average monthly number of road fatalities

Class limits (average monthly number of road fatalities)   Class boundaries   Frequency (number of major roads)

43.2 – 46.6   43.15 – 46.65   3

46.7 – 50.1   46.65 – 50.15   6

50.2 – 53.6   50.15 – 53.65   13

53.7 – 57.1   53.65 – 57.15   15

57.2 – 60.6   57.15 – 60.65   11

60.7 – 64.1   60.65 – 64.15   2

Total   50

Figure 4: Histogram of the mean monthly number of road fatalities in a city

Summary

After being collected and processed, data need to be organized to produce useful information or output. Output is usually governed by the need to communicate specific information to a specific audience. The only limit to the different forms of output you can produce is the different types of output devices currently available. To help determine the best output type for the information you have produced, you need to ask yourself these questions: For whom is the output being produced? How will the audience best understand it?

Generally we have two types of output devices: tables and graphs. Grouping variables and presenting them as a grouped frequency distribution is part of the process of organizing data so that they become useful information. If a variable is continuous or takes a large number of values, then it is easier to present


and handle the data by grouping the values into class intervals. On the other hand, discrete variables can be grouped into class intervals or not.

The other type of output device is the graph. Graphs are effective visual tools because they present information quickly and easily. If you have decided that using a graph is the best method to relay your message, then the guidelines to remember are: define your target audience (understand more about your audience and what their needs are); determine the message to be transmitted (figure out what your message is and why it is important); and experiment with different types of graphs and select the most appropriate. Note that it is not appropriate to use a graph when there are too few data points (one, two or three) or the data show little or no variation.

6.3 Measures of central tendency and dispersion

6.3.1 Measures of central tendency

One of the most important objectives of statistical analysis is to determine various numerical measures which describe the inherent characteristics of the data. In other words, it is often necessary to represent a data set by means of a single number which is descriptive of the entire set. Such values usually tend to lie centrally within a set of data arranged in increasing or decreasing order of magnitude. Thus, we refer to these as measures of central tendency. These include the mean, median and mode.

A. The Mean

i) The arithmetic mean

The arithmetic mean, or simply the mean, of a set of n observations x₁, x₂, …, xₙ, denoted by x̄, is defined as:

x̄ = (x₁ + x₂ + … + xₙ)/n = (1/n) Σ xᵢ .................................................. (1)

ii. The weighted mean

Sometimes some of the observations in a data set may have greater weight or importance than others. For instance, the final examination in a course is often given more weight than the mid-term examination or tests in determining the overall score. In such cases, we associate weights w₁, w₂, …, wₙ with the observations x₁, x₂, …, xₙ, respectively, depending on their importance. The mean obtained in this way is called the weighted mean, and is defined as:

x̄w = (w₁x₁ + w₂x₂ + … + wₙxₙ)/(w₁ + w₂ + … + wₙ) = Σ wᵢxᵢ / Σ wᵢ ................................. (3)

Example: Portfolio Rate of Return


Portfolio expected return (an interest rate, indicating performance) is the weighted average of the expected rates of return of the assets in the portfolio, weighted by the dollars invested in each.

Suppose a portfolio contains three stocks. One ($1,000 invested) is expected to return 20%. Another ($1,800 invested) is expected to return 15%. The third ($2,200 invested) is expected to return 30%.

The total invested is 1,000 + 1,800 + 2,200 = $5,000. The weights are:

w1 = $1,000/$5,000 = 0.20

w2 = $1,800/$5,000 = 0.36

w3 = $2,200/$5,000 = 0.44

The weighted average is: 0.20 × (20%) + 0.36 × (15%) + 0.44 × (30%) = 22.6%

This is the expected (mean) return for the portfolio. Note that each stock is represented in proportion to $ invested.
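The same weighted-mean arithmetic can be sketched in a few lines of Python, with the dollar amounts and expected returns from the example hard-coded:

```python
# Weighted mean: portfolio expected return, weighted by dollars invested.
invested = [1000, 1800, 2200]   # dollars in each stock
returns = [0.20, 0.15, 0.30]    # expected rate of return of each stock

total = sum(invested)                               # $5,000
weights = [amount / total for amount in invested]   # 0.20, 0.36, 0.44

# Weighted mean: sum of w_i * x_i (the weights already sum to 1 here).
portfolio_return = sum(w * r for w, r in zip(weights, returns))
print(f"Expected portfolio return: {portfolio_return:.1%}")
```

Because the weights sum to 1, the denominator Σwᵢ in equation (3) drops out of the computation.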

iii. The grand mean (mean of means)

If the mean of n₁ numbers is x̄₁, the mean of n₂ numbers is x̄₂, …, and the mean of nₖ numbers is x̄ₖ, then we define the grand mean (or mean of means) as:

x̄ = (n₁x̄₁ + n₂x̄₂ + … + nₖx̄ₖ)/(n₁ + n₂ + … + nₖ) ................................. (4)

Example: In a company, the average salary of a group of 50 male employees is 700 Birr and that of a group of 30 female employees is 500 Birr. Find the average salary of male and female employees combined.

Solution:

Let n₁ = 50 and n₂ = 30 denote the number of male and female employees, respectively, and let x̄₁ = 700 and x̄₂ = 500 denote the average salary of male and female employees, respectively. Then the average salary of male and female employees combined is:

x̄ = (50 × 700 + 30 × 500)/(50 + 30) = 50,000/80 = 625 Birr.
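The combined average can be checked directly; this sketch simply re-implements equation (4) for the two groups:

```python
# Grand mean (mean of means): combine group means weighted by group sizes.
group_sizes = [50, 30]     # male, female employees
group_means = [700, 500]   # average salary (Birr) in each group

grand_mean = sum(n * m for n, m in zip(group_sizes, group_means)) / sum(group_sizes)
print(f"Combined average salary: {grand_mean} Birr")
```

Note that the grand mean (625) is pulled toward the larger group's mean (700), not the midpoint of the two means (600).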

B. The median

The median, denoted by x̃, is a single value that divides a set of data into two equal parts. It is the middle-most or most central item in the data set.

Note: Data values which are by far smaller or larger as compared to the bulk of data are called extreme values or outliers. Whenever such extreme values exist, the mean may give a distorted picture of the data. On the other hand, the median of such data gives a good overall picture of the data.


Example: income

Average (mean) income for a country equally divides the total, which may include some very high incomes

Median income chooses the middle person (half earn less, half earn more), giving less influence to high incomes (if any)

C. Quartiles

The median divides a set of data into two equal parts. The values that divide a set of data into four equal parts are called quartiles, and are denoted by Q₁, Q₂ and Q₃. If the data are arranged in increasing order, the positions of the quartiles are:

min ├─────┼─────┼─────┼─────┤ max
          Q₁      Q₂      Q₃

where min is the minimum observation, and max is the maximum observation. Note that the second quartile (Q₂) is the median.

D. The mode

The mode, denoted by X̂, of a set of numbers is the value which occurs with the greatest frequency. A data set is called uni-modal, bi-modal or multi-modal depending on whether it has one mode, two modes or more than two modes, respectively.

6.3.2 Measures of dispersion

The measures of central tendency help us describe a set of data by a single number or typical value. However, they do not provide any information about the extent to which the values differ from one another or from an average value. This information is essential, as the following example illustrates.

Example: Suppose we are told that the mean of two numbers is 1000. Then, these two numbers may be 4 and 1996, or 990 and 1010. Definitely 1000 is not a good descriptive measure for the numbers 4 and 1996, while it is a good representative figure for 990 and 1010. The reason behind this is that there is a considerable difference between the numbers 4 and 1996, while the difference between 990 and 1010 is relatively small.

Thus, the dispersion (spread or variability) of a data set gives us additional information that enables us to judge the reliability of our measure of central tendency: if the data are widely dispersed, then the mean (median or mode) is less representative of the data as a whole than it would be for data with small dispersion. Measures of dispersion also enable us to compare several samples with similar averages.

Example


1. Financial analysts are concerned about the dispersion of a firm’s earnings. Widely dispersed earnings, those varying from extremely high to low or even negative levels, indicate a higher risk to stockholders and creditors than do earnings that remain relatively stable.

2. Quality control experts analyze the dispersion of a product’s quality levels. A drug that is average in purity but ranges from very pure to highly impure may endanger lives.

A. The range

A quick and easy indication of dispersion is the range. The range of a set of data is the difference between the largest and smallest observed values, i.e.:

Range = max − min

where max = largest observation and min = smallest observation.

B. The interquartile range

In case extreme values exist, we use another measure of dispersion, called the interquartile range (Q), which is defined as:

Q = Q₃ − Q₁

where Q₃ and Q₁ are the third and first quartiles, respectively.

Identifying Outliers

Outliers are observations that are far from the center of the distribution. These are defined as observations which are either:

greater than Q₃ + 1.5Q, or

less than Q₁ − 1.5Q,

where Q = Q₃ − Q₁ is the interquartile range.

These are shown in the figure below. This figure is known as the box-plot.
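The quartiles, interquartile range and the 1.5×Q outlier fences can be sketched on a small made-up data set. Note that the exact quartile values depend on the convention used; the "inclusive" method of Python's standard library matches the median-based one:

```python
import statistics

# Quartiles, interquartile range and the 1.5*IQR outlier fences,
# illustrated on a small, invented data set (40 looks suspicious).
data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 40]

q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}, outliers={outliers}")
```

Observations outside the two fences are exactly the points a box plot draws as individual markers beyond the whiskers.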


C. Mean (average) deviation

If x₁, x₂, …, xₙ are n observations of a variable X, then the mean deviation (MD) about the mean is defined as:

MD = (1/n) Σ |xᵢ − x̄|,

where x̄ is the sample mean.

D. Variance and standard deviation

Suppose a population consists of the N values x₁, x₂, …, x_N. We define the population variance, denoted by σ², as:

σ² = (1/N) Σ (xᵢ − μ)²

where μ is the population mean defined by:

μ = (1/N) Σ xᵢ

The positive square root of the population variance is called the population standard deviation, i.e.:

σ = √σ²

Usually, information about the entire population is not available. The reason for this is that collecting data about the entire population is time consuming, expensive, and sometimes impossible. Thus, we often take a sample and infer something about the population based on sample statistics. If we have a sample of size n comprising the values x₁, x₂, …, xₙ, we calculate the sample variance as:

S² = (1/(n − 1)) Σ (xᵢ − x̄)²

where x̄ is the sample mean defined by:

x̄ = (1/n) Σ xᵢ


The sample standard deviation is defined as the positive square root of the sample variance, i.e.:

S = √S²

Properties of the standard deviation

a) If we add (subtract) a constant to (from) each value of a data set, then the standard deviation remains unchanged.

b) If we multiply each value of a data set by a constant, then the new standard deviation will be the original standard deviation multiplied by that constant.

Symbolically, if the standard deviation of the observations x₁, x₂, …, xₙ is S, then the standard deviation of:

a) x₁ + k, x₂ + k, …, xₙ + k will be S

b) kx₁, kx₂, …, kxₙ will be |k|S

where k is a constant.
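These two properties are easy to verify numerically; the sketch below uses the population standard deviation, but the same holds for the sample version:

```python
import statistics

# Verify: adding a constant leaves the standard deviation unchanged;
# multiplying by a constant k scales it by |k|.
data = [4, 8, 6, 5, 3, 7]
k = -3

s = statistics.pstdev(data)
s_shifted = statistics.pstdev([x + k for x in data])
s_scaled = statistics.pstdev([k * x for x in data])

assert abs(s_shifted - s) < 1e-9          # unchanged by the shift
assert abs(s_scaled - abs(k) * s) < 1e-9  # scaled by |k|
print(f"S={s:.4f}, shifted={s_shifted:.4f}, scaled={s_scaled:.4f}")
```

Using a negative k makes the point that the scale factor is the absolute value |k|, since a standard deviation can never be negative.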

E. The coefficient of variation

The standard deviation is an absolute measure of dispersion that expresses the variation in the same units as the original data. But it cannot be the sole basis for comparing two distributions. If we have a standard deviation of 10 and a mean of 5, the values vary by an amount twice as large as the mean itself. If, on the other hand, we have a standard deviation of 10 and a mean of 5,000, the variation relative to the mean is insignificant. Therefore, we cannot fully judge the dispersion of a set of data unless we know the standard deviation, the mean, and how the standard deviation compares with the mean.

Example: If the weights of certain objects or individuals have a standard deviation of 1 Kg, then this figure alone does not tell us whether there is a great deal of variation or not: if the data are weights of new born babies, S = 1 Kg shows a great deal of variability; whereas if the data are the weights of adult elephants, S = 1 Kg shows a small variability.


In such cases, what we need is a relative measure that gives us a feel for the magnitude of the deviation relative to the magnitude of the mean. One such relative measure of dispersion is the coefficient of variation, V, defined by:

V = (S / x̄) × 100%

where S = standard deviation and x̄ = mean.
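Plugging illustrative means into the weight example makes the point concrete. Only S = 1 kg comes from the text; the two mean weights below are assumed purely for illustration:

```python
# Coefficient of variation: V = (S / mean) * 100%.
# S = 1 kg in both cases; the mean weights are assumed for illustration.
def coefficient_of_variation(std_dev, mean):
    return std_dev / mean * 100  # in percent

v_babies = coefficient_of_variation(1, 3.5)        # newborns: mean ~3.5 kg
v_elephants = coefficient_of_variation(1, 3000.0)  # adult elephants: mean ~3,000 kg

print(f"Newborns:  V = {v_babies:.1f}%")    # large relative variability
print(f"Elephants: V = {v_elephants:.4f}%") # negligible relative variability
```

The same absolute spread (S = 1 kg) yields a coefficient of variation of roughly 29% for newborns but only about 0.03% for elephants, which is exactly the contrast the example describes.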

F. The standard score

Suppose a student scored 65% in a statistics test and 70% in a mathematics test. In which subject is his performance better? In order to answer such questions, we need to compare the score of the student with the average score of all students who sat for these exams in both subjects (simply comparing 65 and 70 may lead us to a wrong conclusion). For such purposes, we define the standard score (or Z-score), which is given by:

Z = (x − x̄)/S

The standard score measures the deviation of each value of a data set from the mean in units of standard deviation. It is used to compare the relative standing of values.
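A sketch of the comparison for the two-test example follows; the class means and standard deviations are hypothetical, chosen only to illustrate the idea:

```python
# Standard score Z = (x - mean) / S: how many standard deviations a value
# lies from the mean.  The class means and SDs below are hypothetical.
def z_score(x, mean, std_dev):
    return (x - mean) / std_dev

z_statistics = z_score(65, mean=55, std_dev=10)  # student's statistics result
z_mathematics = z_score(70, mean=66, std_dev=8)  # student's mathematics result

better = "statistics" if z_statistics > z_mathematics else "mathematics"
print(f"Z(statistics)={z_statistics}, Z(mathematics)={z_mathematics} "
      f"-> relatively better in {better}")
```

With these assumed class statistics the lower raw score (65) is actually the stronger relative performance, which is why comparing raw marks alone can mislead.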

6.3.3 General shape of distributions

One of the main features of a distribution is the extent to which it is symmetric.

1) A perfectly symmetric curve is one in which both sides of the distribution would exactly match the other if the figure were folded over its central point. An example is shown below:

A symmetric, uni-modal, bell-shaped distribution is called a normal distribution. In such distributions, central values do have the highest frequency. As we move to both tails, the frequency keeps on decreasing. Many phenomena, such as weight, height, intelligence quotient (IQ), etc. of individuals; daily output of a production line, and the like, can be approximately described by a normal distribution.


2) If a distribution is lop-sided, that is, if most of the data are concentrated either on the left or the right end, it is said to be skewed.

a) A distribution is said to be skewed to the right, or positively skewed, when most of the data are concentrated on the left of the distribution.

Income provides one example of a positively skewed distribution. Most people have relatively small incomes, but some make quite a bit more, and a smaller number make many millions of dollars a year. Therefore, the positive (right) tail on the line graph for income extends out quite a long way, whereas the negative (left) tail stops at zero. The right tail clearly extends farther from the distribution's centre than the left tail, as shown below:

b) A distribution is said to be skewed to the left, or negatively skewed, if most of the data are concentrated on the right of the distribution. The left tail clearly extends farther from the distribution's centre than the right tail as shown below:

Example: The following data are the scores of 41 students on a math test (rounded to the nearest integer):

31, 49, 19, 62, 50, 24, 45, 23, 51, 32, 48, 55, 60, 40, 35, 54, 26, 57, 37, 43, 65, 50, 55, 18, 53, 41, 50, 34, 67, 56, 44, 4, 54, 57, 39, 52, 45, 35, 51, 63, 42

The histogram and the frequency polygon superimposed on it are shown in the Figure below.

Figure: Histogram of the score of students on math test


The amount of distribution spread can be spotted on the graph. For example, the figure reveals that most students scored in the interval between 50 and 59, while only a few students scored less than 20. The distribution has a single peak within the 50–59 interval. The distribution also shows that most data are clustered at the right. The left tail extends farther from the data centre (median = 48) than the right tail. Therefore, the distribution is skewed to the left or negatively skewed.
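The grouping and the reported median can be verified with a few lines of Python (classes of width 10 are used to mirror the histogram):

```python
# Group the 41 math-test scores into class intervals of width 10 and
# locate the median, reproducing the reading of the histogram above.
scores = [31, 49, 19, 62, 50, 24, 45, 23, 51, 32, 48, 55, 60, 40, 35, 54,
          26, 57, 37, 43, 65, 50, 55, 18, 53, 41, 50, 34, 67, 56, 44, 4,
          54, 57, 39, 52, 45, 35, 51, 63, 42]

bins = {}
for s in scores:
    low = (s // 10) * 10                   # lower class limit, e.g. 50 for 50-59
    bins[low] = bins.get(low, 0) + 1

peak = max(bins, key=bins.get)             # modal class
median = sorted(scores)[len(scores) // 2]  # middle of the 41 ordered values

print(f"class frequencies: {dict(sorted(bins.items()))}")
print(f"modal class: {peak}-{peak + 9}, median: {median}")
```

The modal class is 50 – 59 (14 students) and the median is 48, confirming both the single peak and the centre quoted in the interpretation above.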

Note: For normal distributions, the mean and standard deviation are the best measures of central tendency and dispersion (spread), respectively. However, the standard deviation is not a good measure of spread in highly skewed distributions. It is also highly influenced by outliers (extreme values). A single outlier can raise the standard deviation and in turn, distort the picture of spread.

6.4 Exercise with SPSS Application (Computer Lab)


Unit Seven

Tests of hypothesis concerning means and proportions

7.1 Tests of hypotheses concerning means

1. Parametric and non-parametric statistics (tests)

Parametric tests are statistical tests which make certain assumptions about the parameters of the full population from which the sample is taken (e.g., a normal distribution). If those assumptions are correct, parametric methods can produce more accurate and precise estimates (they are said to have more statistical power). However, if those assumptions are incorrect, parametric methods can be very misleading. These tests normally involve data expressed in absolute numbers (interval or ratio) rather than ranks and categories (nominal or ordinal). Such tests include analysis of variance (ANOVA), t-tests, etc.

Consider the frequency distribution shown below. It can easily be observed that the distribution deviates substantially from the normal distribution (the bell-shaped distribution).

This may also be the case for many variables of interest. For example, is income distributed normally in the population? Probably not. The incidence rates of rare diseases are not normally distributed in the population, and neither are many other variables in which a researcher might be interested. With a small sample at hand, analyzing such variables using parametric tests might be misleading!

Note: We can apply parametric tests even if we are not sure that the distribution of the variable under investigation in the population is normal as long as our sample is large enough. If our sample is very small, however, then those tests can be used only if we are sure that the variable is normally distributed.

Applications of tests that are based on the normality assumptions are further limited by a lack of precise measurement. For example, course grade (A, B, C, D, F) is a crude measure of scholastic accomplishments that only allows us to establish a rank ordering of students from "good" students to "poor" students. Most common statistical techniques assume that the underlying measurements are at least


of interval scale. However, as in our example, this assumption is very often not tenable, and the data represent a rank ordering of observations (ordinal) rather than precise measurements.

Thus, the need is evident for statistical procedures that allow us to process data of ‘low quality’ (nominal or ordinal), from small samples on variables about which nothing is known (concerning their distribution). Nonparametric methods have been developed to be used in cases when the researcher knows nothing about the parameters of the variable of interest in the population (hence the name nonparametric).

In more technical terms, nonparametric methods do not rely on the estimation of parameters (such as the mean or the standard deviation) describing the distribution of the variable of interest in the population. Therefore, these methods are also sometimes (and more appropriately) called parameter-free methods or distribution-free methods.

Non-parametric methods are widely used for studying populations that take on a ranked order. The use of non-parametric methods may be necessary when data have a ranking but no clear numerical interpretation, such as when assessing preferences; in terms of levels of measurement, for data on an ordinal scale.

When to use which method

Basically, there is at least one nonparametric equivalent for each parametric general type of test. In general, these tests fall into the following categories:

Tests of differences between groups (independent samples);

Tests of differences between variables (dependent samples);

Tests of relationships between variables.

Differences between independent groups: Usually, when we have two samples that we want to compare concerning their mean value for some variable of interest, we would use the t-test for independent samples. A nonparametric alternative for this test is the Mann-Whitney U test. If we have multiple groups, we would use (the parametric) analysis of variance; a nonparametric equivalent to this method is the Kruskal-Wallis analysis of ranks test.

Differences between dependent groups: If we want to compare two variables measured in the same sample we would customarily use the t-test for dependent samples (if we want to compare students' math skills at the beginning of the semester with their skills at the end of the semester). Nonparametric alternatives to this test are the Sign test and Wilcoxon's matched pairs test. If the variables of interest are dichotomous in nature (i.e., "pass" vs. "no pass") then McNemar's Chi-square test is appropriate.

Relationships between variables: To express a relationship between two variables one usually computes the correlation coefficient. A nonparametric equivalent to the standard correlation coefficient is the Spearman rank correlation coefficient. If the two variables of interest are categorical in nature (e.g., "passed" vs. "failed" by "male" vs. "female") an appropriate nonparametric test of the relationship between the two variables is the Chi-square test.


2. Tests concerning the difference between two means (independent samples)

The null and alternative hypotheses are:

H₀: μ₁ = μ₂ versus H₁: μ₁ ≠ μ₂

where μ₁ is the mean of population 1 and μ₂ is the mean of population 2. The null hypothesis (H₀) simply states that the two populations under consideration have equal means. Note that our conclusion is about the means of the two populations (i.e., the true means); it is not about the samples (or sample means)!

a) The t-test

Assumptions:

1) The samples come from two normally distributed populations.

2) σ₁² and σ₂² are assumed equal but not known.

Under these assumptions, we can use the t-distribution with (n₁ + n₂ − 2) degrees of freedom to find the critical values ±t_{α/2} for a given level of significance (α). The test statistic is given by:

t = (x̄₁ − x̄₂) / (Sₚ √(1/n₁ + 1/n₂))

where Sₚ is the pooled standard deviation defined as:

Sₚ = √[((n₁ − 1)S₁² + (n₂ − 1)S₂²) / (n₁ + n₂ − 2)]

Here S₁² and S₂² are the variances computed from the samples.

Decision rule: Reject H₀ if |t| > t_{α/2}(n₁ + n₂ − 2).

Example 1: The following summary statistics are on the annual household income (in thousands of dollars) of individuals who previously defaulted (group 1) and not defaulted (group 2) on their bank loans (data obtained from SPSS package: bankloan.sav).


              Defaulted (group 1)   Not defaulted (group 2)

Mean          x̄₁ = 41.2131          x̄₂ = 47.1547

Variance      S₁² = 1858.949        S₂² = 1171.019

Sample size   n₁ = 183              n₂ = 517

Test if there is a significant difference in the mean income of defaulters and non-defaulters at the 5% level of significance.

Solution

The null and alternative hypotheses are:

H₀: μ₁ = μ₂ versus H₁: μ₁ ≠ μ₂

The level of significance is α = 0.05.

The pooled standard deviation is computed as:

Sₚ = √[((183 − 1)(1858.949) + (517 − 1)(1171.019)) / (183 + 517 − 2)] = 36.7477

The test statistic is thus:

t = −1.686

The critical value at the 5% level of significance is t_{0.025}(698) ≈ 1.96.

Decision: Since the absolute value of the test statistic, |t| = 1.686, does not exceed the critical value, we do not reject H₀.


Conclusion: There is no significant difference in the mean income of defaulters and non-defaulters at the 5% level of significance.
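The arithmetic can be retraced from the summary statistics alone (pure-Python sketch). Note that the pooled-Sₚ formula given earlier yields t ≈ −1.88, whereas the unpooled (Welch-type) standard error √(S₁²/n₁ + S₂²/n₂) yields t ≈ −1.686, the figure used in the text; both versions fall below the 5% critical value of about 1.96, so the decision is the same either way:

```python
import math

# Two-sample t statistic from the summary statistics in the example.
mean1, var1, n1 = 41.2131, 1858.949, 183   # defaulted
mean2, var2, n2 = 47.1547, 1171.019, 517   # not defaulted

# Pooled standard deviation (equal variances assumed).
sp = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))

# t with the pooled standard error ...
t_pooled = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
# ... and with the unpooled (Welch-type) standard error.
t_unpooled = (mean1 - mean2) / math.sqrt(var1 / n1 + var2 / n2)

print(f"Sp = {sp:.4f}")   # ~36.7477, as in the text
print(f"t (pooled)   = {t_pooled:.3f}")
print(f"t (unpooled) = {t_unpooled:.3f}")
```

Working from summary statistics like this is a useful cross-check on software output, since it exposes exactly which standard-error formula was used.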

Remark: The nonparametric equivalent - the Mann–Whitney U test

The nonparametric equivalent of the t-test is the Mann–Whitney U test. This test is virtually identical to performing an ordinary parametric two-sample t-test on the data after ranking over the combined samples. It requires the two samples to be independent, and the observations to be ordinal or continuous measurements. Unlike the parametric t-test, this non-parametric test makes no assumptions about the distribution of the data (e.g., normality).

Example 2: Consider the data on the annual household income (in thousands of dollars) of individuals who previously defaulted (group 1) and not defaulted (group 2) on their bank loans above. To apply the Mann-Whitney U test for the difference in the mean income of defaulters and non-defaulters, the procedure in SPSS is as follows:

Analyze → Nonparametric Tests → 2 Independent Samples

Test Variable List: Household income [Income]

Grouping Variable: default

Define Groups → Group 1: 1, Group 2: 0

OK

The output is as shown below:

Mann-Whitney Test

Ranks

Previously defaulted N Mean Rank Sum of Ranks

Household income in thousands No 517 368.83 190685.50

Yes 183 298.71 54664.50

Total 700


Test Statisticsa

Household income in thousands

Mann-Whitney U 37828.500

Wilcoxon W 54664.500

Z -4.032

Asymp. Sig. (2-tailed) .000

a. Grouping Variable: Previously defaulted

Decision: Since the p-value is less than 1%, we reject H₀. Thus, we conclude that there is a significant difference in the mean income of defaulters and non-defaulters at the 1% level of significance.

Question: Why did the two tests lead to different conclusions?

This is unexpected, and we have to look for the source of the discrepancy. The box plot of the data is shown below:

Figure: A box plot of the mean income of defaulters and non-defaulters


We see from the box plot that the item in the 445th row is an outlier (extreme value). The discrepancy is probably due to this value. After removing this row, the independent-samples t-test yields the following result:

Independent Samples Test

                                Levene's Test for
                                Equality of Variances   t-test for Equality of Means
                                F       Sig.            t        df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI of the Difference (Lower, Upper)

Household income in thousands
Equal variances assumed         3.517   .061            -2.836   697       .005              -8.16573          2.87926                 (-13.81879, -2.51266)

Equal variances not assumed                             -2.975   347.531   .003              -8.16573          2.74484                 (-13.56432, -2.76713)

Remark

The independent samples t-test is of two types: equal variances assumed (σ₁² = σ₂²) and equal variances not assumed (σ₁² ≠ σ₂²). In order to identify which of the two tests is appropriate, we use Levene's test for equality of variances. If the p-value for this test is less than 5%, then we reject H₀: σ₁² = σ₂² and consider the result in the ‘Equal variances not assumed’ row; otherwise, we use the result in the ‘Equal variances assumed’ row.

In our case, Levene's test for equality of variances has a p-value of 0.061, which is greater than 5%. Thus, we do not reject H₀: σ₁² = σ₂², and consequently, we refer to the result in the ‘Equal variances assumed’ row. The p-value is 0.005, which is less than 1%. Thus, we reject the hypothesis of equality of means of the two groups.

3. Tests of mean difference between several populations (independent samples)

We have seen in section 2 above how to apply hypothesis testing procedures to test the null hypothesis of no difference between two population means. It is not unusual for the investigator to be interested in testing the null hypothesis of no difference among several population means.

Assumptions: Random samples of size n are taken from each of the k populations, which are independently and normally distributed with means μ₁, μ₂, …, μₖ and common variance σ² (i.e., the variability in each group is the same). Also, all observations are continuous.


Under this general principle we want to test:

H₀: μ₁ = μ₂ = … = μₖ

against the alternative:

H₁: At least two of them are different.

ANOVA

For such test of hypothesis we use a method called analysis of variance. Analysis of variance is a method of splitting the total variation into meaningful components that measure different sources of variation.

In other words, we split the total sum of squares (SST) into the ‘between groups (samples) sum of squares’ (SSB) and the ‘within groups (samples) sum of squares’ (SSW). The test statistic for testing H₀ versus H₁ is given by the variance ratio:

F = [SSB/(k − 1)] / [SSW/(k(n − 1))]

where k is the number of groups, and n is the sample size from each group. In order to decide whether the null hypothesis has to be rejected or not, we compare the above test statistic with F_α(k − 1, k(n − 1)), the value from the F-distribution with (k − 1) and k(n − 1) degrees of freedom for a given level of significance α.

Decision rule:

If the calculated value (test statistic) exceeds F_α(k − 1, k(n − 1)), we reject H₀ and conclude that the group means are not all equal.


The analysis of variance (ANOVA) table for testing such hypotheses is as shown below.

ANOVA table for a one-way classification

Source of variation   Sum of squares   d.f.       Mean square            F-ratio

Between groups        SSB              k − 1      MSB = SSB/(k − 1)      F = MSB/MSW

Within groups         SSW              k(n − 1)   MSW = SSW/(k(n − 1))

Total                 SST              kn − 1
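The table above can be exercised on a small invented data set (k = 3 groups, n = 4 observations each; the numbers are made up purely for illustration):

```python
# One-way ANOVA by hand: split SST into SSB and SSW and form the F-ratio.
groups = [
    [1, 2, 3, 4],   # group 1
    [2, 3, 4, 5],   # group 2
    [6, 7, 8, 9],   # group 3
]
k = len(groups)      # number of groups
n = len(groups[0])   # observations per group (equal n assumed)

grand_mean = sum(sum(g) for g in groups) / (k * n)
group_means = [sum(g) / n for g in groups]

# Between-groups and within-groups sums of squares.
ssb = n * sum((m - grand_mean) ** 2 for m in group_means)
ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

msb = ssb / (k - 1)
msw = ssw / (k * (n - 1))
f_ratio = msb / msw

print(f"SSB={ssb}, SSW={ssw}, F={f_ratio:.2f} on ({k - 1}, {k * (n - 1)}) d.f.")
```

A useful sanity check is that SSB + SSW reproduces the total sum of squares computed directly from the grand mean.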

Example 3: This example uses data from SPSS:

Files\SPSSInc\Statistics17\Samples\English\car_insurance_claims.sav

The data is on: insurance policy holder’s age, vehicle group, vehicle age, average cost of claims and number of claims.

Our aim is to see if there is a significant difference in the average cost of claims for vehicles of age: 0 – 3, 4 – 7, 8 – 9 and 10+.

Before we test whether the average cost of claims for the four groups of vehicles (based on age) is the same, we have to test for the existence of a common variance (i.e., that the variability in each group is the same).

The SPSS output for One-Way ANOVA is shown below:


Test of Homogeneity of Variances

Average cost of claims

Levene Statistic df1 df2 Sig.

1.353 3 119 .261

The Levene’s test of:

H₀: σ₁² = σ₂² = σ₃² = σ₄² versus H₁: at least two of the variances are different

has a p-value of 0.261. Since this figure is greater than 5%, we do not reject the null hypothesis. Thus, we can say that the variability in the cost of claims is the same regardless of the age of a vehicle. However, such parametric tests are highly influenced by the existence of outliers (if any). The figure below shows the box plot of the average cost of claims.

Figure: a box plot of the average cost of claims

It can be seen that the 13th, 14th and 48th items are outliers. After removing these items, the Levene’s test of equality of variances yields the following result.


Test of Homogeneity of Variances

Average cost of claims

Levene Statistic df1 df2 Sig.

4.696 3 116 .004

Here the p-value is less than 1%. Thus, we reject the null hypothesis and conclude that the variability in the cost of claims differs depending on the age of a vehicle. Now we can test whether the average cost of claims for the four groups of vehicles (based on age) is the same.

The ANOVA table for testing:

H0: μ₁ = μ₂ = μ₃ = μ₄

versus

HA: at least two of the means are different

is shown below:

ANOVA

Average cost of claims

Sum of Squares df Mean Square F Sig.

Between Groups 462109.712 3 154036.571 49.721 .000

Within Groups 359371.613 116 3098.031

Total 821481.325 119

Since the p-value is less than 1%, we conclude that there is a significant difference in the mean cost of claims for vehicles of different ages at the one percent level of significance.

Question: which groups of means are different?

To answer this question, we apply pair-wise comparison of means. Since the equality of variances assumption is rejected, the appropriate tests are those listed under ‘Equal Variances Not Assumed’ in SPSS. The output of Post Hoc Tests (Multiple comparisons) is shown below:


(I) Vehicle age

(J) Vehicle age

Mean Difference (I-J) Std. Error Sig.

95% Confidence Interval

Lower Bound Upper Bound

0-3 4-7 34.774 15.546 .164 -7.68 77.23

8-9 98.677* 16.000 .000 55.04 142.31

10+ 165.570* 15.113 .000 124.17 206.97

4-7 0-3 -34.774 15.546 .164 -77.23 7.68

8-9 63.903* 13.206 .000 27.97 99.84

10+ 130.796* 12.115 .000 97.75 163.84

8-9 0-3 -98.677* 16.000 .000 -142.31 -55.04

4-7 -63.903* 13.206 .000 -99.84 -27.97

10+ 66.892* 12.693 .000 32.26 101.52

10+ 0-3 -165.570* 15.113 .000 -206.97 -124.17

4-7 -130.796* 12.115 .000 -163.84 -97.75

8-9 -66.892* 12.693 .000 -101.52 -32.26

*. The mean difference is significant at the 0.05 level.
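When equal variances are not assumed, one common substitute for SPSS's post hoc table is a set of pair-wise Welch t-tests with a Bonferroni adjustment. A hypothetical sketch (the four samples below are invented, not the insurance data; Games-Howell, which SPSS offers, is the more refined choice):

```python
# Pair-wise Welch t-tests (equal variances NOT assumed) with a
# Bonferroni correction, on hypothetical claim-cost samples.
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
samples = {"0-3": rng.normal(400., 50., 31),
           "4-7": rng.normal(365., 55., 31),
           "8-9": rng.normal(300., 60., 31),
           "10+": rng.normal(235., 45., 27)}

pairs = list(itertools.combinations(samples, 2))
results = {}
for a, b in pairs:
    t, p = stats.ttest_ind(samples[a], samples[b], equal_var=False)  # Welch
    results[(a, b)] = min(1.0, p * len(pairs))  # Bonferroni adjustment
for pair, p_adj in results.items():
    print(pair, round(p_adj, 4))
```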

Thus, there is a significant difference in the mean cost of claims between vehicles of age:

a) 0 – 3 versus 8 – 9 and 10+
b) 4 – 7 versus 8 – 9 and 10+
c) 8 – 9 versus 10+

We can see from the results that vehicles of age 10+ have the lowest mean cost of claims, followed by those aged 8 – 9 years.

That is, the mean cost of claims is statistically significantly greater for vehicles of lower age (0 – 3 and 4 – 7) than for those aged 8 – 9 and 10+; the lower the vehicle age, the higher the cost of claims.

Remark:


The nonparametric equivalent of the one-way ANOVA is the Kruskal–Wallis test. The Kruskal–Wallis one-way analysis of variance by ranks is a nonparametric method for testing equality of population medians among groups. It is identical to a one-way analysis of variance with the data replaced by their ranks. It is an extension of the Mann–Whitney U test to three or more groups. Since it is a non-parametric method, the Kruskal–Wallis test does not assume a normal population, unlike the analogous one-way analysis of variance.
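The Kruskal–Wallis test is also available in Python's scipy. A minimal sketch with hypothetical claim-cost samples in place of the SPSS file:

```python
# Kruskal-Wallis one-way analysis of variance by ranks (scipy),
# on hypothetical samples for four vehicle-age groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
g1 = rng.normal(430., 80., 31)  # vehicle age 0-3
g2 = rng.normal(390., 80., 31)  # 4-7
g3 = rng.normal(300., 80., 31)  # 8-9
g4 = rng.normal(200., 80., 27)  # 10+

# The H statistic is compared with a chi-square with k - 1 = 3 d.f.
h, p = stats.kruskal(g1, g2, g3, g4)
print(round(h, 3), round(p, 6))
```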

Example 4: Consider the data on the average cost of vehicle insurance claims and vehicle age. To apply the Kruskal–Wallis one-way analysis of variance to the difference in the mean cost of claims for vehicles of different ages, the procedure in SPSS is as follows:

Analyze → Nonparametric Tests → K Independent Samples

Test Variable List → Average cost of claims

Grouping Variable → vehicleage(? ?)

Define Range → Minimum: 1; Maximum: 4

OK

The output is as shown below:

Kruskal-Wallis Test

Ranks

Vehicle age N Mean Rank

Average cost of claims 0-3 31 89.69

4-7 31 79.68

8-9 31 48.58

10+ 27 18.65

Total 120

Test Statisticsa,b

Average cost of claims



Chi-Square 73.989

df 3

Asymp. Sig. .000

a. Kruskal Wallis Test

b. Grouping Variable: Vehicle age

Since the p-value is less than 1%, we again conclude that there is a significant difference in the mean cost of claims for vehicles of different ages at the one percent level of significance.

4. Paired-samples t-test (differences between dependent groups)

If we want to compare two variables measured in the same sample, we would customarily use the t-test for dependent samples. For example, we might be interested to see if there is a significant difference in the mean output per individual worker before and after an intensive training. In this case, we take a random sample of workers and record the amount of output each produced before the training and again after the training. We then compute the difference between the two variables (output before training and output after training) for each case, and test to see if the average difference is significantly different from zero.

Let x_j and y_j be the measurements before treatment and after treatment for the jth individual, respectively. We compute the differences as:

d_j = x_j − y_j, j = 1, 2, . . ., n

We then calculate the mean d̄ and sample variance s_d² of the differences d_j’s as:

d̄ = (Σ d_j)/n and s_d² = Σ(d_j − d̄)²/(n − 1)


Assumption:

The differences (d_j’s) are normally distributed with mean μ_d and variance σ_d².

The hypotheses to be tested are:

H0: μ_d = 0 (no significant difference in the before – after mean) versus HA: μ_d ≠ 0

The test statistic for this test is given by:

t = d̄/(s_d/√n), which has a t-distribution with n − 1 degrees of freedom under H0.

Decision rule: Reject H0 if |t| > t_{α/2}(n − 1).
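The paired t statistic can be computed directly and checked against scipy. The before/after output figures below for n = 8 workers are hypothetical:

```python
# Paired-samples t-test by hand and via scipy.stats.ttest_rel,
# on hypothetical before/after worker-output data.
import numpy as np
from scipy import stats

before = np.array([54., 61., 58., 49., 65., 57., 60., 52.])
after = np.array([58., 66., 60., 55., 67., 62., 63., 57.])

d = after - before  # differences d_j
n = len(d)
t_manual = d.mean() / (d.std(ddof=1) / np.sqrt(n))

t_scipy, p = stats.ttest_rel(after, before)
print(round(t_manual, 4), round(t_scipy, 4), round(p, 6))
```

The manual statistic and scipy's agree exactly, since both use d̄/(s_d/√n).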

Example 5: This example uses data from SPSS:

Files\SPSSInc\Statistics17\Samples\English\property_assess.sav

The data are on the current sale value of n = 1000 houses (x_j) and the value of the same houses at the last appraisal (y_j). Our aim is to see if there is a significant difference in the mean sale value.

The SPSS output for paired-sample t-test is shown below:


Paired Samples Statistics

                                 Mean       N      Std. Deviation   Std. Error Mean
Pair 1  Sale value of house      161.4920   1000   55.44955         1.75347
        Value at last appraisal  134.9620   1000   44.79421         1.41652

Paired Samples Test

Pair 1: Sale value of house − Value at last appraisal (paired differences)
Mean (D) = 26.53000, Std. Deviation = 31.74022, Std. Error Mean = 1.00371,
95% Confidence Interval of the Difference = (24.56037, 28.49963),
t = 26.432, df = 999, Sig. (2-tailed) = .000

Since the p-value is less than 0.01, we reject the null hypothesis and conclude that there is a significant difference between the current mean sale value of houses and the mean sale value in the last appraisal at the 1% level of significance (the sale value has appreciated, on average).

Remark: The nonparametric equivalent to the paired-samples t-test is the Wilcoxon Signed Ranks Test. This test does not assume a normal population.
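In Python, the same test is available as scipy's `wilcoxon`. A sketch on the hypothetical before/after pattern used earlier (not the SPSS property file):

```python
# Wilcoxon signed-ranks test (scipy) on hypothetical paired data.
import numpy as np
from scipy import stats

before = np.array([54., 61., 58., 49., 65., 57., 60., 52.])
after = np.array([58., 66., 60., 55., 67., 62., 63., 57.])

# Tests whether the differences are symmetric about zero; here every
# difference is positive, so the smaller rank sum is 0.
stat, p = stats.wilcoxon(after, before)
print(stat, round(p, 4))
```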

Example 6: For the data in example 5, the Wilcoxon Signed Ranks Test results are shown below:

Wilcoxon Signed Ranks Test


Ranks

N Mean Rank Sum of Ranks

Value at last appraisal - Sale value of house

Negative Ranks 839a 548.34 460056.50

Positive Ranks 160b 246.52 39443.50

Ties 1c

Total 1000

a. Value at last appraisal < Sale value of house
b. Value at last appraisal > Sale value of house
c. Value at last appraisal = Sale value of house

Test Statisticsb

Value at last appraisal - Sale value of house

Z -23.055a

Asymp. Sig. (2-tailed) .000

a. Based on positive ranks.

b. Wilcoxon Signed Ranks Test

Since the p-value is less than 0.01, we again reject the null hypothesis and conclude that there is a significant difference between the current mean sale value of houses and the mean sale value in the last appraisal at the 1% level of significance.

5. Tests of association

a) The Pearson coefficient of correlation and test of its significance

For two continuous variables X and Y, a measure of the strength of linear relationship is provided by the Pearson coefficient of correlation, which is defined as:

r = [nΣxy − (Σx)(Σy)] / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}

The coefficient of correlation r ranges from −1 to 1, inclusive; i.e., −1 ≤ r ≤ 1.


The sign of r indicates the direction of the relationship between the two variables X and Y. If an inverse relationship exists, then r will fall between 0 and -1. Likewise, if there is a direct relationship, then r will be a value within the range 0 to 1.

To see if this value of r is of sufficient magnitude to indicate that the two variables of interest are correlated, we test the hypothesis:

H0: ρ = 0

HA: ρ ≠ 0

where ρ is the true (population) coefficient of correlation. The test statistic is:

t = r√(n − 2)/√(1 − r²), which follows a t-distribution with n − 2 degrees of freedom under H0.

Decision: Reject the null hypothesis if:

|t| > t_{α/2}(n − 2)

Example 7: The following data are on the advertising spending and sales of a company recorded over a period of n = 24 months (obtained from the SPSS package: C:\Program Files\SPSSInc\Statistics17\Samples\English\advert.sav). Is there a significant correlation between advertising spending and sales?

Month   Advertising spending (X)   Detrended sales (Y)   |   Month   Advertising spending (X)   Detrended sales (Y)

1 4.69 12.23 13 5.15 12.27

2 6.41 11.84 14 5.25 12.57

3 5.47 12.25 15 1.72 8.87

4 3.43 11.1 16 3.04 11.15

5 4.39 10.97 17 4.92 11.86

6 2.15 8.75 18 4.85 11.07

7 1.54 7.75 19 3.13 10.38

8 2.67 10.5 20 2.29 8.71

9 1.24 6.71 21 4.9 12.07

10 1.77 7.6 22 5.75 12.74


11 4.46 12.46 23 3.61 9.82

12 1.83 8.47 24 4.62 11.51

Solution

Summary statistics:

n = 24, Σx = 89.28, Σy = 253.65, Σxy = 1001.954, Σx² = 386.58, Σy² = 2755.299

Plugging in these values, the sample coefficient of correlation is:

r = [24(1001.954) − (89.28)(253.65)] / √{[24(386.58) − (89.28)²][24(2755.299) − (253.65)²]} = 0.91627

The hypothesis of interest is:

H0: ρ = 0

HA: ρ ≠ 0

The test statistic is: t = r√(n − 2)/√(1 − r²) = 0.91627 × √22/√(1 − 0.91627²) = 10.72913

For α = 0.01, the critical value is t_{0.005}(22) = 2.819. Since the calculated value t = 10.72913 is greater than 2.819, we reject H0 and conclude that there is a significant positive (or direct) correlation between advertising spending and sales at the one percent level of significance. The SPSS output is shown below:


Correlations

Advertising spending Detrended sales

Advertising spending Pearson Correlation 1 .916**

Sig. (2-tailed) .000

N 24 24

Detrended sales Pearson Correlation .916** 1

Sig. (2-tailed) .000

N 24 24

**. Correlation is significant at the 0.01 level (2-tailed).
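For reference, Example 7 can be reproduced in Python from the 24 data points listed above:

```python
# Pearson r and its t test for the 24 months of advertising
# spending (X) and detrended sales (Y) from Example 7.
import numpy as np
from scipy import stats

x = np.array([4.69, 6.41, 5.47, 3.43, 4.39, 2.15, 1.54, 2.67, 1.24, 1.77,
              4.46, 1.83, 5.15, 5.25, 1.72, 3.04, 4.92, 4.85, 3.13, 2.29,
              4.90, 5.75, 3.61, 4.62])
y = np.array([12.23, 11.84, 12.25, 11.10, 10.97, 8.75, 7.75, 10.50, 6.71,
              7.60, 12.46, 8.47, 12.27, 12.57, 8.87, 11.15, 11.86, 11.07,
              10.38, 8.71, 12.07, 12.74, 9.82, 11.51])

r, p = stats.pearsonr(x, y)
n = len(x)
t = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)  # test statistic for H0: rho = 0
print(round(r, 5), round(t, 5), round(p, 6))
```

The computed r matches the .916 reported in the SPSS output, and t matches the hand calculation up to rounding of the summary sums.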

b) The Spearman rank correlation coefficient and test of its significance

The Pearson coefficient of correlation requires precise numerical values (i.e., continuous data) for the variables. However, in many instances such numerical measurements may not be possible (for instance, job performance, taste, intelligence, etc.). In such cases, we can compute a nonparametric measure of association that is based on ranks. This measure is known as the Spearman rank correlation coefficient (r_s), and is given by:

r_s = 1 − (6Σd²)/(n(n² − 1))

where n = number of paired observations

d = difference between the ranks for each pair of observations

The steps involved in computing r_s are as follows:

Step 1: Rank the x’s among themselves, giving rank 1 to the largest (or smallest) observation, rank 2 to the second largest (or second smallest) observation, and so on.

Step 2: Rank the y’s similarly.

Step 3: Find d = rank of x − rank of y for each pair of observations.

Step 4: Find Σd² (the sum of squares of the differences between each pair of ranks).

Step 5: Compute the rank correlation coefficient r_s using the above formula.
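The steps above can be implemented directly and checked against scipy. The two sets of scores below are hypothetical and tie-free (with tied ranks, the simple d² formula needs a correction, and scipy's version is preferable):

```python
# Spearman rank correlation: the five-step d-squared formula versus
# scipy.stats.spearmanr, on hypothetical tie-free scores.
import numpy as np
from scipy import stats

x = np.array([35., 23., 47., 17., 10., 43., 9., 6., 28.])
y = np.array([30., 33., 45., 23., 8., 49., 12., 4., 31.])

rx = stats.rankdata(x)  # Step 1: rank the x's
ry = stats.rankdata(y)  # Step 2: rank the y's
d = rx - ry             # Step 3: rank differences
n = len(x)
rs_manual = 1 - 6 * (d ** 2).sum() / (n * (n ** 2 - 1))  # Steps 4-5

rs_scipy, p = stats.spearmanr(x, y)
print(rs_manual, round(rs_scipy, 4))
```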

Example 8: For the data on the advertising spending and sales of a company recorded over a period of n = 24 months, the SPSS output of nonparametric correlations is shown below:


Nonparametric Correlations

Correlations

Advertising spending

Detrended sales

Advertising spending Correlation Coefficient 1.000 .889**

Sig. (2-tailed) . .000

N 24 24

Detrended sales Correlation Coefficient .889** 1.000

Sig. (2-tailed) .000 .

N 24 24

**. Correlation is significant at the 0.01 level (2-tailed).

Since the p-value is less than 0.01, we reject H0 and conclude that there is a significant correlation between advertising spending and sales at the one percent level of significance.

c) The Chi-square test

If the two variables whose degree of association we want to test are categorical in nature (for example, job satisfaction versus income), the appropriate nonparametric statistic for testing such a relationship is the Chi-square test.

Example 9: Here we use the data in SPSS package:

Files\SPSSInc\Statistics17\Samples\English\customer_dbase.sav

Suppose we want to check if there is a relationship between the level of income of employees (categorized into five groups) and job satisfaction (categorized from highly dissatisfied to highly satisfied). Here job satisfaction is not continuous, and hence we cannot apply the Pearson coefficient of correlation.

Before going to the test it is a good idea to see what the data look like using graphs. The multiple bar chart for the said variables is shown below:


The chart gives us some idea about the relationship between the two variables. For example, the frequency of highly dissatisfied employees keeps on decreasing as income increases. However, we should not come to a final judgement before applying objective statistical tests. In tests of independence, the null and alternative hypotheses are of the form:

H0: The two classifications are independent.

HA: The two classifications are dependent.

The null hypothesis can also be written as ‘there is no association between the two classifications.’ The SPSS procedure for testing such hypotheses is:

Analyze → Nonparametric Tests → Chi-Square

Test Variable List → Income category; Job satisfaction

OK

The SPSS output is as shown below:


Chi-Square Test

Test Statistics

Income category in thousands Job satisfaction

Chi-Square 1252.834a 24.426a

df 4 4

Asymp. Sig. .000 .000

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 1000.0.

Since the p-value is less than 0.01, we reject H0 and conclude that income and job satisfaction are dependent or associated. A cross-tabulation of income category versus job satisfaction is shown below.

Income category (in thousands)   Highly dissatisfied   Somewhat dissatisfied   Neutral   Somewhat satisfied   Highly satisfied   Total

Under $25 Count 413 303 242 219 153 1330

percentage

31.1% 22.8% 18.2% 16.5% 11.5% 100.0%

$25 - $49 Count 365 413 440 348 227 1793

percentage

20.4% 23.0% 24.5% 19.4% 12.7% 100.0%

$50 - $74 Count 119 173 182 168 177 819

percentage

14.5% 21.1% 22.2% 20.5% 21.6% 100.0%

$75 - $124 Count 53 100 143 189 183 668

percentage

7.9% 15.0% 21.4% 28.3% 27.4% 100.0%

$125+ Count 17 52 85 90 146 390

percentage

4.4% 13.3% 21.8% 23.1% 37.4% 100.0%

It can be seen that for income category ‘Under $25’, more than 50% of employees are highly dissatisfied or somewhat dissatisfied. On the other hand, for the income category ‘$125+’, about


60% of employees are either somewhat satisfied or highly satisfied. In general, the higher the income level, the more likely are the employees to be satisfied with their job.
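A test of independence can also be run directly on the cross-tabulation itself. A Python sketch using scipy's `chi2_contingency` with the counts from the table above (note that this is the contingency-table test of independence, which differs from the per-variable chi-square dialog shown earlier in SPSS):

```python
# Chi-square test of independence on the income-by-satisfaction
# cross-tabulation (counts taken from the table above).
import numpy as np
from scipy import stats

#                  HD    SD   Neu   SS    HS
table = np.array([[413, 303, 242, 219, 153],   # Under $25
                  [365, 413, 440, 348, 227],   # $25 - $49
                  [119, 173, 182, 168, 177],   # $50 - $74
                  [ 53, 100, 143, 189, 183],   # $75 - $124
                  [ 17,  52,  85,  90, 146]])  # $125+

chi2, p, dof, expected = stats.chi2_contingency(table)
print(round(chi2, 2), dof, p < 0.01)  # dof = (5 - 1)(5 - 1) = 16
```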

6. Hypothesis test for the difference between two proportions

Here our aim is to conduct a hypothesis test to determine whether the difference between two population proportions is significant or not. The test procedure, called the two-proportion z-test, is appropriate when the two samples are independent.

When the null hypothesis states that there is no difference between the two population proportions p₁ and p₂, the null and alternative hypotheses for a two-tailed test are often stated in the following form:

H0: p₁ = p₂ versus HA: p₁ ≠ p₂

Denoting the two populations by the subscripts ‘1’ and ‘2’, we take a random sample of size n₁ from population 1 and compute the sample proportion p̂₁ of individuals that possess a specific characteristic. Similarly, we take a random sample of size n₂ from population 2 and compute the sample proportion p̂₂. Using sample data, we compute the following:

Pooled sample proportion: P = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂)

Standard error (SE) of the sampling distribution of the difference between two proportions:

SE = √[P(1 − P)(1/n₁ + 1/n₂)]

where P is the pooled sample proportion.

The test statistic is given by:

z = (p̂₁ − p̂₂)/SE


We then compare this statistic with the critical value z_{α/2} from the standard normal distribution for a given level of significance α.

Decision: Reject the null hypothesis if |z| > z_{α/2}.

Example 10: Consider the data in example 9 (the level of income of employees and job satisfaction).

Now let us compare employees who earn under $25 (thousand per year) and those who earn $25 – $49.

Income category (in thousands)   Highly dissatisfied   Somewhat dissatisfied   Neutral   Somewhat satisfied   Highly satisfied   Total

Under $25 Count 413 303 242 219 153 1330

Percentage

31.1% 22.8% 18.2% 16.5% 11.5% 100.0%

$25 - $49 Count 365 413 440 348 227 1793

Percentage

20.4% 23.0% 24.5% 19.4% 12.7% 100.0%

Is there a significant difference between the proportion of those who are highly dissatisfied in the two income groups?

Solution

Here n₁ = 413 with sample proportion p̂₁ = 0.311, and n₂ = 365 with p̂₂ = 0.204. The pooled sample proportion is:

P = (n₁p̂₁ + n₂p̂₂)/(n₁ + n₂) = (413 × 0.311 + 365 × 0.204)/(413 + 365) = 0.260801

The standard error is calculated as:

SE = √[0.260801 × 0.739199 × (1/413 + 1/365)] = 0.031543

The test statistic is thus:


z = (0.311 − 0.204)/0.031543 = 3.392191

The critical value from the standard normal distribution for α = 0.01 is z_{0.005} = 2.576.

Decision: Since |z| = 3.392191 > 2.576, we reject the null hypothesis. Thus, there is a significant difference between the proportions of those who are highly dissatisfied in the two income groups at the one percent level of significance.

Note that p̂₁ − p̂₂ > 0. This indicates that a significantly higher proportion of employees who earn under $25 are highly dissatisfied with their job as compared to those who earn $25 – $49.
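The two-proportion computation generalises into a small helper function. A sketch (the counts in the call below are hypothetical, not the survey data):

```python
# Two-proportion z-test using the pooled-proportion standard error.
import math

def two_prop_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, given successes x and sizes n."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                      # pooled proportion P
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

z = two_prop_z(120, 400, 90, 400)  # 30% vs 22.5% in two samples of 400
print(round(z, 4))
```

Comparing |z| with 2.576 (for α = 0.01) or 1.96 (for α = 0.05) gives the two-tailed decision.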

Exercise with SPSS Application


Unit Eight:

The simple linear regression model and Statistical Inference

8.1 What is a regression model?

What is regression analysis? In very general terms, regression is concerned with describing and evaluating the relationship between a given variable and one or more other variables on which the given variable depends. More specifically, regression is an attempt to explain movements in a variable by reference to movements in one or more other variables.

The given variable is referred to as the dependent (or response) variable (denoted by Y), while the variables which are thought to affect it are referred to as independent (explanatory or regressor) variables (denoted by X₁, X₂, X₃, . . ., X_k). The case where we have just one explanatory variable is called simple linear regression. If we have two or more explanatory variables, then we have the multiple linear regression model.

8.2 Regression versus correlation

The correlation between two variables measures the degree of linear association between them. If X and Y are correlated, then there is evidence of a linear relationship between the two variables. However, it is not implied that changes in X cause changes in Y, or that changes in Y cause changes in X. The degree of linear relationship between these two variables is measured by the coefficient of correlation.

In regression, the dependent variable and the independent variable are treated very differently.

The dependent variable is assumed to be random or ‘stochastic’ in some way, i.e. to have a certain probability distribution.

The independent variables are, however, assumed to have fixed (‘non-stochastic’) values in repeated samples.

8.3 Simple linear regression

For simplicity, suppose for now that it is believed that Y depends on only one X variable. Examples of the kind of relationship that may be of interest include:

How asset returns vary with their level of market risk

Measuring the long-term relationship between stock prices and dividends

Suppose that a researcher has some idea that there should be a relationship between two variables Y and X, and that economic, financial, etc. theory suggests that an increase in X will lead to an increase in Y. A sensible first stage to testing whether there is indeed an association between the variables would be to form a scatter plot of them. Suppose that the outcome of this plot is as shown in figure 1.

Figure 1: Scatter plot of two variables Y and X

In this case, it appears that there is an approximate positive linear relationship between X and Y, which means that increases in X are usually accompanied by increases in Y, and that the relationship between them can be described approximately by a straight line.

It would therefore be of interest to determine to what extent this relationship can be described by an equation that can be estimated using a defined procedure. It is possible to use the general equation for a straight line:

Y = α + βX ………………….. (1)

to get the line that best ‘fits’ the data. The researcher would then be seeking to find the values of the parameters or coefficients, α and β, which would place the line as close as possible to all of the data points taken together.

However, this equation is an exact one. Assuming that this equation is appropriate, if the values of α and β had been calculated, then given a value of X, it would be possible to determine with certainty what the value of Y would be. Imagine a model which says with complete certainty what the value of one variable will be given any value of the other!

Clearly this model is not realistic. Statistically, it would correspond to the case where the model fitted the data perfectly, that is, all of the data points lay exactly on a straight line. To make the model more realistic, a random disturbance or error term, denoted by u_t, is added to the equation. Thus, we have:


Y_t = α + βX_t + u_t ………………. (2)

where the subscript t (= 1, 2, 3, . . .) denotes the observation number.

Reasons for the inclusion of the error term

Even in the general case where there is more than one explanatory variable, some determinants of Y will always in practice be omitted from the model. This might, for example, arise because the number of influences on Y is too large to place in a single model, or because some determinants of Y may be unobservable or not measurable.

There may be errors in the way that Y is measured which cannot be modelled. There are bound to be random outside influences on Y that again cannot be modelled. For example, a terrorist attack, a hurricane or a computer failure could all affect financial asset returns in a way that cannot be captured in a model and cannot be forecast reliably. Similarly, many researchers would argue that human behaviour has an inherent randomness and unpredictability!

So how are the appropriate values of α and β determined?

α and β are chosen so that the (vertical) distances from the data points to the fitted lines are minimised (so that the line fits the data as closely as possible). The parameters are thus chosen to minimise collectively the (vertical) distances from the data points to the fitted line.

The most common method used to fit a line to the data is known as ordinary least squares (OLS). This approach forms the workhorse of econometric model estimation.

Suppose now, for ease of exposition, that the sample of data contains only five observations. The method of OLS entails taking each vertical distance from the point to the line, squaring it and then minimising the total sum of squares of the errors (hence ‘least squares’) (see figure 2).


Figure 2: The estimating line together with the associated errors

Let y_t denote the actual data point for observation t and ŷ_t denote the fitted value from the regression line; in other words, for the given value of X of this observation at time t, ŷ_t is the value for Y which the model would have predicted. Note that a hat (ˆ) over a variable or parameter is used to denote a value estimated by a model. Finally, let û_t denote the residual, which is the difference between the actual (observed) value of Y and the value fitted by the model for this data point; i.e., û_t = y_t − ŷ_t. What is done is to minimise the sum of the û_t².

Note: The reason that the sum of the squared distances is minimised rather than, for example, finding the sum of û_t that is as close to zero as possible, is that in the latter case some points will lie above the line while others lie below it. Then, when the sum to be made as close to zero as possible is formed, the points above the line would count as positive values, while those below would count as negative values. So these distances will cancel each other out and the sum would be zero. However, taking the squared distances ensures that all deviations that enter the calculation are positive and therefore do not cancel out.

So the sum of squared distances to be minimised is given by:

Σ û_t² ……………. (3)

This sum is known as the residual sum of squares (RSS) or the sum of squared residuals. But what is û_t? Again, it is the difference between the actual point and the estimating line, that is, û_t = y_t − ŷ_t. So minimising Σ û_t² is equivalent to minimising Σ (y_t − ŷ_t)².


Letting α̂ and β̂ denote the values of α and β selected by minimising the RSS, respectively, the equation for the fitted line is given by ŷ_t = α̂ + β̂x_t. Now let L denote the RSS, which is also known as a loss function. Take the summation over all of the observations, i.e. from t = 1 to T, where T is the number of observations:

L = Σ_{t=1}^{T} (y_t − ŷ_t)² = Σ_{t=1}^{T} (y_t − α̂ − β̂x_t)² ………… (4)

To find the values of α and β which minimise the residual sum of squares (equivalently, to find the equation of the line that is closest to the data), L is minimised with respect to α̂ and β̂. This is achieved through differentiating L with respect to α̂ and β̂, and setting the first derivatives to zero. The resulting coefficient estimators for the slope and the intercept are given by:

β̂ = (Σ x_t y_t − T x̄ ȳ)/(Σ x_t² − T x̄²) ……. (5)

α̂ = ȳ − β̂ x̄ ……………… (6)

Thus, given only the sets of observations x_t and y_t, it is always possible to calculate the estimated values of the two parameters α̂ and β̂ so that the line ŷ_t = α̂ + β̂x_t is the best fit to the set of data. This method of finding the optimum is known as OLS.

Note (estimator and estimate)

Estimators are the formulae used to calculate the coefficients (or parameters in general). For example, the expressions given above for α̂ and β̂ are estimators. Estimates, on the other hand, are the actual numerical values for the coefficients that are obtained from the sample.

Example 1: The following data are on the excess returns of a given asset (Y) together with the excess returns on a market index (market portfolio) (X) from January 2009 to December 2010, recorded on a monthly basis:

year/month X Y

2009/01 -7.93 -17.75

2009/02 -9.93 -14.39


2009/03 8.83 12.9

2009/04 10.17 16.11

2009/05 5.33 7.21

2009/06 0.44 -1.96

2009/07 7.76 9.08

2009/08 3.23 7.7

2009/09 4.15 3.17

2009/10 -2.49 -5.38

2009/11 5.64 5.65

2009/12 2.8 0.76

2010/01 -3.51 -0.61

2010/02 3.39 3.63

2010/03 6.3 8.09

2010/04 2.13 1.85

2010/05 -7.86 -8.73

2010/06 -5.67 -6.5

2010/07 7.27 6.81

2010/08 -4.81 -7.06

2010/09 9.56 8.48

2010/10 4.02 2.05

2010/11 0.63 0.58

2010/12 6.77 9.15


The idea here is to check if there is a linear relationship between X and Y. The first stage could be to form a scatter plot of the two variables. This is shown in Figure 3 below. Clearly, there appears to be a positive, approximately linear, relationship between X and Y.

Figure 3: Scatter plot of X and Y

The next step is to estimate the parameters α and β using the above formula. The necessary calculations are shown below:

X Y XY X²

-7.93 -17.75 140.7575 62.8849

-9.93 -14.39 142.8927 98.6049

8.83 12.90 113.9070 77.9689

10.17 16.11 163.8387 103.4289

5.33 7.21 38.4293 28.4089

0.44 -1.96 -0.8624 0.1936

7.76 9.08 70.4608 60.2176

3.23 7.70 24.8710 10.4329

4.15 3.17 13.1555 17.2225

-2.49 -5.38 13.3962 6.2001

5.64 5.65 31.8660 31.8096


2.80 0.76 2.1280 7.8400

-3.51 -0.61 2.1411 12.3201

3.39 3.63 12.3057 11.4921

6.30 8.09 50.9670 39.6900

2.13 1.85 3.9405 4.5369

-7.86 -8.73 68.6178 61.7796

-5.67 -6.50 36.8550 32.1489

7.27 6.81 49.5087 52.8529

-4.81 -7.06 33.9586 23.1361

9.56 8.48 81.0688 91.3936

4.02 2.05 8.2410 16.1604

0.63 0.58 0.3654 0.3969

6.77 9.15 61.9455 45.8329

TOTAL 46.22 40.84 1164.7554 896.9532

Plugging these values into the above formulae, we get the estimates:

β̂ = (Σ x_t y_t − T x̄ ȳ)/(Σ x_t² − T x̄²) = (1164.7554 − 24 × 1.9258 × 1.7017)/(896.9532 − 24 × 1.9258²) = 1.344286

α̂ = ȳ − β̂ x̄ = 1.7017 − 1.344286 × 1.9258 = − 0.8872

The fitted line would thus be:

ŷ_t = − 0.8872 + 1.3443 x_t

where x_t is the excess return of the market portfolio over the risk-free rate (i.e., r_m − r_f), also known as the market risk premium.
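As a check, the estimates can be reproduced with numpy from the raw monthly observations of Example 1 (the same 24 data points, re-entered here):

```python
# OLS slope and intercept via formulae (5) and (6), on the monthly
# excess-return data of Example 1.
import numpy as np

x = np.array([-7.93, -9.93, 8.83, 10.17, 5.33, 0.44, 7.76, 3.23, 4.15,
              -2.49, 5.64, 2.80, -3.51, 3.39, 6.30, 2.13, -7.86, -5.67,
              7.27, -4.81, 9.56, 4.02, 0.63, 6.77])
y = np.array([-17.75, -14.39, 12.90, 16.11, 7.21, -1.96, 9.08, 7.70, 3.17,
              -5.38, 5.65, 0.76, -0.61, 3.63, 8.09, 1.85, -8.73, -6.50,
              6.81, -7.06, 8.48, 2.05, 0.58, 9.15])

T = len(x)
# Equation (5): slope estimator
beta_hat = (np.sum(x * y) - T * x.mean() * y.mean()) / \
           (np.sum(x ** 2) - T * x.mean() ** 2)
# Equation (6): intercept estimator
alpha_hat = y.mean() - beta_hat * x.mean()
print(round(beta_hat, 6), round(alpha_hat, 6))  # approx. 1.3443 and -0.8872
```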


Interpretation of α̂ and β̂

The coefficient estimate of β is interpreted as: ‘if X increases by 1 unit, Y will be expected to increase by β̂ units, everything else being equal’. If β̂ is negative, a rise in X would on average cause a fall in Y. The intercept coefficient estimate (α̂) is interpreted as the value that would be taken by the dependent variable Y if the independent variable X took a value of zero.

In Example 1, the β coefficient estimate of 1.344 is interpreted as: if the excess return of the market portfolio over the risk free rate increases by 1%, then the excess returns of this particular asset will be expected to increase by 1.344%, everything else being equal.

If an analyst tells you that he expects the market to yield a return 10% higher than the risk-free rate next month, what would you expect the excess return on this asset to be? To answer this, plug X = 10 into the estimated equation. This yields:

ŷ = − 0.8872 + 1.3443 × 10 ≈ 12.56

Thus, for a given expected market risk premium of 10%, this fund would be expected to earn an excess over the risk-free rate of approximately 12.56%.

Note: Caution should be exercised when producing predictions for Y using values of X that are a long way outside the range of values in the sample. In Example 1, X takes values between – 9.93% and 10.17% in the available data. So, it would not be advisable to use this model to determine the expected excess return on the fund if the expected excess return on the market were, say 20% or − 15% (i.e. the market was expected to fall).

Precision and standard errors

Any set of regression coefficient estimates α̂ and β̂ is specific to the sample used in their estimation. In other words, if a different sample of data were selected from within the population, the data points (the x_t and y_t) would be different, leading to different values of the OLS estimates.

Recall that the OLS estimators ( and ) are given by equations (5) and (6). It would be

desirable to have an idea of how ‘good’ these estimates of α and β are in the sense of having

some measure of the reliability or precision of the estimators and . It is thus useful to know

whether one can have confidence in the estimates, and whether they are likely to vary much from one sample to another sample within the given population. An idea of the sampling variability, and hence of the precision of the estimates, can be calculated using only the sample of data

111

111

Page 112: Advanced Research Methods Handout Final and Complete

Didactic Design: <Title of Module>

available. This estimate is given by its standard error. Valid estimators of the standard errors of $\hat{\alpha}$ and $\hat{\beta}$ are given by:

$SE(\hat{\alpha}) = s\sqrt{\dfrac{\sum x_t^2}{T\sum (x_t - \bar{x})^2}}$ …………………. (10)

$SE(\hat{\beta}) = s\sqrt{\dfrac{1}{\sum (x_t - \bar{x})^2}}$ ……………………. (11)

where $s$ is the estimated standard deviation of the residuals. It is worth noting that the standard errors give only a general indication of the likely accuracy of the regression parameters. They do not show how accurate a particular set of coefficient estimates is. If the standard errors are small, it shows that the coefficients are likely to be precise on average, not how precise they are for this particular sample. Thus, standard errors give a measure of the degree of uncertainty in the estimated values for the coefficients. It can be seen that they are a function of the actual observations on the explanatory variable, X, the sample size, T, and another term, $s$.

$s^2$ is an estimate of the variance of the disturbance term. The actual variance of the disturbance term is denoted by $\sigma^2$. An estimator of $\sigma^2$ is given by:

$s^2 = \dfrac{\sum \hat{u}_t^2}{T-2}$ ……………………. (12)

where the $\hat{u}_t$ are the OLS residuals, i.e., $\hat{u}_t = y_t - \hat{y}_t = y_t - \hat{\alpha} - \hat{\beta}x_t$. The square root of this estimator, namely $s$, is known as the residual standard deviation. It is sometimes used as a broad measure of the fit of the regression equation. Everything else being equal, the smaller this quantity is, the closer is the fit of the line to the actual data.
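The idea that estimates vary from sample to sample can be illustrated with a small simulation. The 'true' line and error variance below are invented purely for illustration; the point is that the spread of the slope estimates across repeated samples is exactly what the standard error measures:

```python
import random
import statistics

random.seed(42)

# Hypothetical "true" population line (assumed for illustration only)
ALPHA, BETA, SIGMA, T = -0.9, 1.3, 2.5, 24

def ols(x, y):
    """Return (alpha_hat, beta_hat) for a simple OLS fit."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    return ybar - beta_hat * xbar, beta_hat

# Fixed regressor values; new disturbances are drawn for each sample
x = [random.uniform(-10, 10) for _ in range(T)]

betas = []
for _ in range(2000):
    y = [ALPHA + BETA * xi + random.gauss(0, SIGMA) for xi in x]
    betas.append(ols(x, y)[1])

# The spread of beta-hat across samples is what SE(beta-hat) measures
print(statistics.mean(betas), statistics.stdev(betas))
```

The mean of the 2,000 slope estimates lands near the true value 1.3, while their standard deviation approximates the theoretical standard error $\sigma/\sqrt{\sum (x_t-\bar{x})^2}$.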

Example 2: Consider the data in Example 1. The size of the sample was T = 24, and the OLS estimators were calculated as: $\hat{\beta}$ = 1.344 and $\hat{\alpha}$ = −0.887. Thus, the equation of the fitted line is:

$\hat{y}_t = -0.887 + 1.344\,x_t$

The OLS residuals are obtained as:

$\hat{u}_t = y_t - \hat{y}_t = y_t + 0.887 - 1.344\,x_t$


The OLS residuals corresponding to each time point, their square and the total sum of squares of the residuals are shown in the following table.

year/month X Y $\hat{u}_t$ $\hat{u}_t^2$

2009/01 -7.93 -17.75 -6.2026 38.4723

2009/02 -9.93 -14.39 -0.1540 0.0237

2009/03 8.83 12.9 1.9172 3.6755

2009/04 10.17 16.11 3.3258 11.0610

2009/05 5.33 7.21 0.9322 0.8689

2009/06 0.44 -1.96 -1.6643 2.7698

2009/07 7.76 9.08 -0.4645 0.2157

2009/08 3.23 7.7 4.2452 18.0214

2009/09 4.15 3.17 -1.5216 2.3152

2009/10 -2.49 -5.38 -1.1455 1.3122

2009/11 5.64 5.65 -1.0446 1.0911

2009/12 2.8 0.76 -2.1168 4.4808

2010/01 -3.51 -0.61 4.9957 24.9565

2010/02 3.39 3.63 -0.0399 0.0016

2010/03 6.3 8.09 0.5082 0.2583

2010/04 2.13 1.85 -0.1261 0.0159

2010/05 -7.86 -8.73 2.7233 7.4163

2010/06 -5.67 -6.5 2.0093 4.0373

2010/07 7.27 6.81 -2.0758 4.3088

2010/08 -4.81 -7.06 0.2932 0.0860

2010/09 9.56 8.48 -3.4842 12.1395

2010/10 4.02 2.05 -2.4668 6.0852

2010/11 0.63 0.58 0.6203 0.3848

2010/12 6.77 9.15 0.9364 0.8768

TOTAL 0.0000 144.8748


Note that the sum of the residuals ($\sum \hat{u}_t$) is zero. The residual variance is computed as:

$s^2 = \dfrac{\sum \hat{u}_t^2}{T-2} = \dfrac{144.8748}{24-2} = 6.585217$

The residual standard deviation is the square root of the variance, i.e.,

$s = \sqrt{6.585217} = 2.566168$

Estimated standard errors of $\hat{\alpha}$ and $\hat{\beta}$ are given by:

$SE(\hat{\alpha}) = s\sqrt{\dfrac{\sum x_t^2}{T\sum (x_t - \bar{x})^2}} = 0.551918$

$SE(\hat{\beta}) = s\sqrt{\dfrac{1}{\sum (x_t - \bar{x})^2}} = 0.090281$

With the standard errors calculated, the results are written as:

$\hat{y}_t = \underset{(0.5519)}{-0.887} + \underset{(0.0903)}{1.344}\,x_t$

The standard error estimates are usually placed in parentheses under the relevant coefficient estimates.
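As a quick check, the calculations in Example 2 can be reproduced in a few lines of Python. The data are the 24 (X, Y) pairs from the table above; the variable names are our own:

```python
import math

# Monthly excess returns from Example 1: (market X, asset Y)
DATA = [(-7.93, -17.75), (-9.93, -14.39), (8.83, 12.9), (10.17, 16.11),
        (5.33, 7.21), (0.44, -1.96), (7.76, 9.08), (3.23, 7.7),
        (4.15, 3.17), (-2.49, -5.38), (5.64, 5.65), (2.8, 0.76),
        (-3.51, -0.61), (3.39, 3.63), (6.3, 8.09), (2.13, 1.85),
        (-7.86, -8.73), (-5.67, -6.5), (7.27, 6.81), (-4.81, -7.06),
        (9.56, 8.48), (4.02, 2.05), (0.63, 0.58), (6.77, 9.15)]

T = len(DATA)
x = [d[0] for d in DATA]
y = [d[1] for d in DATA]
xbar, ybar = sum(x) / T, sum(y) / T

sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

beta_hat = sxy / sxx                      # slope
alpha_hat = ybar - beta_hat * xbar        # intercept

residuals = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]
s2 = sum(u ** 2 for u in residuals) / (T - 2)    # eq. (12)
s = math.sqrt(s2)                                 # residual std deviation

se_alpha = s * math.sqrt(sum(xi ** 2 for xi in x) / (T * sxx))  # eq. (10)
se_beta = s * math.sqrt(1 / sxx)                                 # eq. (11)

print(round(alpha_hat, 3), round(beta_hat, 3))   # -0.887 1.344
print(round(s, 4), round(se_alpha, 4), round(se_beta, 4))
```

The printed values match the hand calculations above (s ≈ 2.5662, SE(α̂) ≈ 0.5519, SE(β̂) ≈ 0.0903) up to rounding.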

8.4 An introduction to statistical inference

Often, (economic, financial, etc.) theory will suggest that certain coefficients should take on particular values, or values within a given range. It is thus of interest to determine whether the relationships expected from theory are upheld by the data to hand or not. Estimates of α and β have been obtained from the sample, but these values are not of any particular interest. Instead, the population values that describe the true relationship between the variables would be of more interest, but are never available. In practice, inferences are made concerning the likely population values from the regression parameters that have been estimated from the sample of data to hand. In doing this, the aim is to determine whether or not the differences between the coefficient estimates that are actually obtained and expectations arising from theory are a long way from one another in a statistical sense.


Example 3: Suppose the following regression results have been obtained:

$\hat{\beta}$ = 0.6091 is a single (point) estimate of the unknown population parameter, β. As stated above, the reliability of the point estimate is measured by the coefficient's standard error. The information from the sample coefficients ($\hat{\alpha}$ and $\hat{\beta}$) and their standard errors ($SE(\hat{\alpha})$ and $SE(\hat{\beta})$) can be used to make inferences about the population parameters (α and β). So the estimate of the slope coefficient is $\hat{\beta}$ = 0.6091, but it is obvious that this number is likely to vary to some degree from one sample to the next. Thus, it might be of interest to answer the questions: 'Is it plausible, given this estimate, that the true population parameter, β, could be 0.5? Is it plausible that β could be 1?', etc. Answers to these questions can be obtained through hypothesis testing.

8.5 Hypothesis testing: some concepts

In the hypothesis testing framework, there are always two hypotheses that go together, known as the null hypothesis (denoted by $H_0$) and the alternative hypothesis (denoted by $H_1$ or $H_a$). The null hypothesis is the statement or the statistical hypothesis that is actually being tested, whereas the alternative hypothesis represents the remaining outcomes of interest. For example, suppose that given the regression results above, it is of interest to test the hypothesis that the true value of β is in fact 0.5. The following notation would be used:

$H_0: \beta = 0.5$
$H_1: \beta \neq 0.5$

This states that the hypothesis that the true but unknown value of β could be 0.5 is being tested against an alternative hypothesis that β is significantly different from 0.5. This is known as a two-sided test, since the outcomes of both β < 0.5 and β > 0.5 are subsumed under the alternative hypothesis.

Sometimes, some prior information may be available suggesting, for example, that β > 0.5 would be expected rather than β < 0.5. In this case, β < 0.5 is no longer of interest to us, and hence a one-sided test would be conducted:

$H_0: \beta = 0.5$
$H_1: \beta > 0.5$


Here the null hypothesis that the true value of β is 0.5 is being tested against a one-sided alternative that β is more than 0.5. On the other hand, one could envisage a situation where there is prior information that β < 0.5 is expected. For example, suppose that an investment bank bought a piece of new risk management software that is intended to better track the riskiness inherent in its traders' books and that β is some measure of the risk that previously took the value 0.5. Clearly, it would not make sense to expect the risk to have risen, and so the hypothesis β > 0.5, corresponding to an increase in risk, is not of interest. In this case, the null and alternative hypotheses would be specified as:

$H_0: \beta = 0.5$
$H_1: \beta < 0.5$

This prior information should come from (financial) theory of the problem under consideration, and not from an examination of the estimated value of the coefficient.

Note that there is always an equality sign under the null hypothesis. So, for example, β < 0.5 would not be specified under the null hypothesis.

There are two ways to conduct a hypothesis test: via the test of significance approach or via the confidence interval approach. Both methods centre on a statistical comparison of the estimated value of the coefficient, and its value under the null hypothesis. In very general terms, if the estimated value is a long way away from the hypothesised value, the null hypothesis is likely to be rejected; if the value under the null hypothesis and the estimated value are close to one

another, the null hypothesis is less likely to be rejected. For example, consider $\hat{\beta}$ = 0.6091 as above. A null hypothesis that the true value of β is 5 ($H_0: \beta = 5$) is more likely to be rejected than a null hypothesis that the true value of β is 0.5 ($H_0: \beta = 0.5$). What is required now is a statistical decision rule that will permit the formal testing of such hypotheses.

The probability distribution of the least squares estimators

In order to test hypotheses, we assume that the error terms ($u_t$) are normally distributed with mean zero and variance $\sigma^2$. This is written as: $u_t \sim N(0, \sigma^2)$. The normal distribution is a convenient one to use since it makes the algebra involved in statistical inference considerably simpler.


a) Since $Y_t$ is a linear combination of $u_t$, it can be stated that if $u_t$ is normally distributed then $Y_t$ will also be normally distributed.

b) The least squares estimators are linear combinations of the random variables $Y_t$. For instance:

$\hat{\beta} = \sum w_t Y_t$

where the $w_t$ are the weights. Since a weighted sum of normal random variables is also normally distributed, it can be said that the coefficient estimates will also be normally distributed. Thus:

$\hat{\alpha} \sim N(\alpha, \operatorname{var}(\hat{\alpha}))$ and $\hat{\beta} \sim N(\beta, \operatorname{var}(\hat{\beta}))$

where:

$\operatorname{var}(\hat{\alpha}) = \dfrac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2}$ …………………. (13)

$\operatorname{var}(\hat{\beta}) = \dfrac{\sigma^2}{\sum (x_t - \bar{x})^2}$ ……………………. (14)


Thus, inferences about the true regression coefficients α and β can be made based on the normal distribution (or 'similar' distributions). Note that relations (13) and (14) involve the unknown residual standard deviation $\sigma$.

Will the coefficient estimates still follow a normal distribution if the errors do not follow a normal distribution? The answer is ‘yes’ provided that the sample size is sufficiently large. This is due to the central limit theorem (CLT). The normal distribution is plotted below.

Figure 4: The normal distribution

For inferential purposes, we often deal with standard normal random variables whose mean is zero and whose variance is 1 (denoted by N(0,1)). Standard normal variables can be constructed from $\hat{\alpha}$ and $\hat{\beta}$ by subtracting the mean and dividing by the square root of the variance, i.e.,

$\dfrac{\hat{\alpha} - \alpha}{\sqrt{\operatorname{var}(\hat{\alpha})}} \sim N(0,1)$ and $\dfrac{\hat{\beta} - \beta}{\sqrt{\operatorname{var}(\hat{\beta})}} \sim N(0,1)$

The square roots of the coefficient variances are the standard errors given by relations (13) and (14) above. Unfortunately, the true standard errors of the regression coefficients are never known (since the true residual standard deviation $\sigma$ is unknown); all that is available are their sample counterparts, the calculated standard errors of the coefficient estimates, $SE(\hat{\alpha})$ and $SE(\hat{\beta})$, defined by relations (10) and (11), respectively. Replacing the true values of the standard errors with the sample estimated versions induces another source of uncertainty, and also means that the standardised statistics are no longer normally distributed. They instead follow another distribution, namely the Student's t-distribution with (T − 2) degrees of freedom. That is:

$\dfrac{\hat{\alpha} - \alpha}{SE(\hat{\alpha})} \sim t_{T-2}$ and $\dfrac{\hat{\beta} - \beta}{SE(\hat{\beta})} \sim t_{T-2}$

A note on the t and the normal distributions


The normal distribution is ‘bell’ shaped and is symmetric about the mean μ (or about zero for a standard normal distribution). A normal variate can be scaled to have zero mean and unit variance (or standardized) by subtracting its mean and dividing by its standard deviation. What does the t-distribution look like? It looks similar to a normal distribution, but with fatter tails and a smaller peak at the mean (shown in Figure 5 below). In addition to the two parameters (mean and variance), the t-distribution has another parameter, its degrees of freedom.

Figure 5: The t-distribution versus the normal

Some examples of the percentiles from the normal and t-distributions taken from the statistical tables are given in the table below. When used in the context of a hypothesis test, these percentiles become critical values. The values presented in the table would be those critical values appropriate for a one-sided test of the given significance level. It can be seen that as the number of degrees of freedom for the t-distribution increases from 5 to 40, the critical values fall substantially. It can also be seen that the critical values for the t-distribution are larger than those from the standard normal. This arises from the increased uncertainty associated with the situation where the error variance must be estimated. So now the t-distribution is used, and for a given statistic to constitute the same amount of reliable evidence against the null hypothesis, it has to be bigger in absolute value than in circumstances where the normal is applicable.

Significance level (%) N(0,1) t(5) t(40)

5% 1.645 2.015 1.684

2.5% 1.96 2.571 2.021

1% 2.33 3.365 2.423

Table: Critical values from the standard normal versus t-distribution
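The standard normal column of the table can be reproduced with the Python standard library; computing the t columns needs a t quantile function (e.g. `scipy.stats.t.ppf`), which is why they are quoted here from printed t-tables instead:

```python
from statistics import NormalDist

# One-sided upper-tail critical values for the standard normal N(0,1).
# The t(5) and t(40) columns in the table above come from printed
# t-tables; the stdlib has no t quantile function.
for tail in (0.05, 0.025, 0.01):
    z = NormalDist().inv_cdf(1 - tail)
    print(f"{tail:.1%} upper tail: z = {z:.3f}")
```

This prints 1.645, 1.960 and 2.326, agreeing with the N(0,1) column (the table rounds 2.326 to 2.33).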

There are broadly two approaches to testing hypotheses under regression analysis: the test of significance approach and the confidence interval approach. Each of these will now be considered in turn.


a) The test of significance approach

Assume that the regression equation is given by:

$y_t = \alpha + \beta x_t + u_t$,   t = 1, 2, . . . , T.

The steps involved in doing a test of significance are shown below:

1. Estimate $\hat{\alpha}$ and $\hat{\beta}$, and their standard errors $SE(\hat{\alpha})$ and $SE(\hat{\beta})$, using the relations that are discussed earlier.

2. Calculate the test statistic. If the null hypothesis is $H_0: \beta = \beta^*$ and the alternative hypothesis is $H_1: \beta \neq \beta^*$ (for a two-sided test), the test statistic is given by:

$test\ statistic = \dfrac{\hat{\beta} - \beta^*}{SE(\hat{\beta})}$ …………… (15)

3. A tabulated distribution with which to compare the estimated test statistics is required. Test statistics derived in this way can be shown to follow the t-distribution with (T − 2) degrees of freedom.

4. Choose a ‘significance level’, often denoted by α (not the same as the regression intercept coefficient). It is conventional to use a significance level of 5% or 1%.

5. Given a significance level, a rejection region and non-rejection region can be determined. If a 5% significance level is employed, this means that 5% of the total distribution (5% of the area under the curve) will be in the rejection region. That rejection region can either be split in half (for a two-sided test) or it can all fall on one side of the y-axis, as is the case for a one-sided test. For a two-sided test, the 5% rejection region is split equally between the two tails, as shown in figure 6(a). For a one-sided test, the 5% rejection region is located solely in one tail of the distribution, as shown in figures 6(b) and 6(c), for a test where the alternative is of the ‘less than’ form, and where the alternative is of the ‘greater than’ form, respectively.


Figure 6(a): Rejection regions for a two-sided 5% test of hypothesis

Figure 6(b): Rejection region for a one (left) -sided 5% test of hypothesis

Figure 6(c): Rejection regions for a one (right) -sided 5% test of hypothesis

6. Use the t-tables to obtain a critical value or values with which to compare the test statistic. The critical value will be that value of x that puts 5% into the rejection region. In figures 6(a) – 6(c), c1, c2, c3 and c4 denote such critical values.

7. Finally perform the test. If the test statistic lies in the rejection region, then reject the null hypothesis ($H_0$); else do not reject $H_0$.

Steps 2 – 7 require further comment. In step 2, the estimated value of β is compared with the value that is subject to test under the null hypothesis, but this difference is ‘normalised’ or scaled


by the standard error of the coefficient estimate. The standard error is a measure of how confident one is in the coefficient estimate obtained in the first stage. If a standard error is small, the value of the test statistic will be large relative to the case where the standard error is large. For a small standard error, it would not require the estimated and hypothesised values to be far away from one another for the null hypothesis to be rejected.

In this context, the number of degrees of freedom can be interpreted as the number of pieces of additional information beyond the minimum requirement. If two parameters are estimated (α and β -- the intercept and the slope of the line, respectively), a minimum of two observations is required to fit this line to the data. As the number of degrees of freedom increases, the critical values in the tables decrease in absolute terms, since less caution is required and one can be more confident that the results are appropriate.

The significance level is also sometimes called the size of the test (note that this is completely different from the size of the sample) and it determines the region where the null hypothesis under test will be rejected or not rejected. Remember that the distributions in figures 6(a) – 6(c) are for a random variable. Purely by chance, a random variable will take on extreme values (either large and positive values or large and negative values) occasionally. More specifically, a significance level of 5% means that a result as extreme as this or more extreme would be expected only 5% of the time as a consequence of chance alone. To give one illustration, if the 5% critical value for a one-sided test is 1.68, this implies that the test statistic would be expected to be greater than this only 5% of the time by chance alone. There is nothing magical about the test -- all that is done is to specify an arbitrary cut-off value for the test statistic that determines whether the null hypothesis would be rejected or not. It is conventional to use a 5% size of test, but 10% and 1% are also commonly used.

However, one potential problem with the use of a fixed (e.g. 5%) size of test is that if the sample size is sufficiently large, any null hypothesis can be rejected. This is particularly worrisome in finance, where tens of thousands of observations or more are often available. What happens is that the standard errors reduce as the sample size increases, thus leading to an increase in the value of all t-test statistics. This problem is frequently overlooked in empirical work, but some econometricians have suggested that a lower size of test (e.g. 1%) should be used for large samples.

Note also the use of terminology in connection with hypothesis tests: it is said that the null hypothesis is either rejected or not rejected. It is incorrect to state that if the null hypothesis is not rejected, it is ‘accepted’ (although this error is frequently made in practice), and it is never said that the alternative hypothesis is accepted or rejected. One reason why it is not sensible to say that the null hypothesis is ‘accepted’ is that it is impossible to know whether the null is actually true or not! In any given situation, many null hypotheses will not be rejected. For example,

suppose that $H_0$: β = 0.5 and $H_0$: β = 1 are separately tested against the relevant two-sided


alternatives and neither null is rejected. Clearly then it would not make sense to say that '$H_0$: β = 0.5 is accepted' and '$H_0$: β = 1 is accepted', since the true (but unknown) value of β cannot be

both 0.5 and 1. So, to summarise, the null hypothesis is either rejected or not rejected on the basis of the available evidence.

Example 4: Consider the data in Example 1 on the excess returns of a given asset (Y) together with the excess returns on a market portfolio (X) from January 2009 to December 2010, recorded on a monthly basis. The regression result was:

$\hat{y}_t = \underset{(0.5519)}{-0.887} + \underset{(0.0903)}{1.344}\,x_t$

where the figures in parentheses are the standard error estimates. Using both the test of significance and confidence interval approaches, test the hypothesis that $\beta = 0$ against a two-sided alternative at the 5% level of significance.

Solution

The test of significance approach

The null and alternative hypotheses are:

$H_0: \beta = 0$
$H_1: \beta \neq 0$

The test statistic is calculated as:

$\dfrac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} = \dfrac{1.344 - 0}{0.090281} = 14.887$

Now we need the critical value from the t-distribution with (T − 2) = (24 − 2) = 22 degrees of freedom and at the 5% level. This means that 5% of the total distribution will be in the rejection region, and since this is a two-sided test, 2.5% of the distribution is required to be contained in each tail. From the t-distribution table we have:

$t_{22,\,2.5\%} = 2.074$

From the symmetry of the t-distribution around zero, the critical values in the upper and lower tail will be equal in magnitude, but opposite in sign. Thus, the rejection (critical) regions are as shown below:


Figure 7: Rejection regions for a two-sided 5% test of hypothesis

Decision: Reject $H_0$ since the test statistic lies within the rejection region. Thus, the CAPM 'beta' is significantly different from zero. The implication is that movements in the given asset (Y) are significantly related to movements in the market (X).
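The arithmetic of the test can be sketched as follows; the coefficient and standard error come from the worked example, and the critical value 2.074 is taken from a printed t-table:

```python
# Test of significance for H0: beta = 0 vs H1: beta != 0 (Example 4)
beta_hat = 1.344        # estimated slope from Example 1
se_beta = 0.090281      # its standard error
beta_null = 0.0         # value of beta under H0

t_stat = (beta_hat - beta_null) / se_beta       # eq. (15)

# Two-sided 5% critical value for t with T-2 = 22 degrees of freedom,
# taken from a printed t-table
t_crit = 2.074

reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 3), reject_h0)   # 14.887 True
```

Since 14.887 lies far beyond 2.074, the null is rejected, matching the decision above.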

Some more terminology

If the null hypothesis is rejected at the 5% level, it would be said that the result of the test is ‘statistically significant’. If the null hypothesis is not rejected, it would be said that the result of the test is ‘not significant’, or that it is ‘insignificant’. Finally, if the null hypothesis is rejected at the 1% level, the result is termed ‘highly statistically significant’.

Classifying the errors that can be made in tests of hypotheses

$H_0$ is usually rejected if the test statistic is statistically significant at a chosen significance level. There are two possible errors that could be made:

1. Rejecting $H_0$ when it is really true; this is called a type I error.

2. Not rejecting $H_0$ when it is in fact false; this is called a type II error.

The possible scenarios can be summarised in the following table.

True condition (reality)

Result of test                             $H_0$ is true        $H_0$ is false

Significant (Reject $H_0$)                 Type I error         correct decision

Not significant (Do not reject $H_0$)      correct decision     Type II error

The probability of a type I error is just α, the significance level or size of the test chosen. To see this, recall what is meant by 'significance' at the 5% level: it is only 5% likely that a result as extreme as this, or more extreme, could have occurred purely by chance. Or, to put it another way, it is only 5% likely that this null would be rejected when it was in fact true.

Note that there is no chance for a free lunch (i.e. a cost-less gain) here! What happens if the size of the test is reduced (e.g. from a 5% test to a 1% test)? The chances of making a type I error would be reduced – but so would the probability that the null hypothesis would be rejected at all, so increasing the probability of a type II error. So there always exists a direct trade-off between type I and type II errors when choosing a significance level. The only way to reduce the chances of both is to increase the sample size or to select a sample with more variation, thus increasing the amount of information upon which the results of the hypothesis test are based. In practice, up to a certain level, type I errors are usually considered more serious and hence a small size of test is usually chosen (5% or 1% are the most common).
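The meaning of the test's size can be checked by simulation. Below, a hypothetical test of a zero population mean (not the regression example; the setup is invented for illustration) is repeated on samples generated with $H_0$ true, and the rejection rate settles near the nominal 5%:

```python
import random
import statistics

random.seed(7)

# Simulate the size of a test: when H0 is true, a 5%-level test should
# reject about 5% of the time. Here H0: the population mean is zero,
# tested with a z-type statistic on samples of size 50.
N, REPS, Z_CRIT = 50, 2000, 1.96   # 1.96 approximates the t(49) value

rejections = 0
for _ in range(REPS):
    sample = [random.gauss(0, 1) for _ in range(N)]   # H0 is true
    z = statistics.fmean(sample) / (statistics.stdev(sample) / N ** 0.5)
    if abs(z) > Z_CRIT:
        rejections += 1

print(rejections / REPS)   # close to 0.05
```

Lowering the critical value's tail probability to 1% would cut the rejection rate here (fewer type I errors), but in a setting where $H_0$ were false it would also reject less often (more type II errors), which is exactly the trade-off described above.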

Steps involved in formulating a model

Although there are of course many different ways to go about the process of model building, a logical and valid approach would be to follow the steps described below.

1. general statement of the problem: This will usually involve the formulation of a theoretical model based on a certain theory that two or more variables should be related to one another in a certain way. The model is unlikely to be able to completely capture every relevant real-world phenomenon, but it should present a sufficiently good approximation that it is useful for the purpose at hand.


2. collection of data relevant to the model: The data required may be available electronically from a data provider or from published government figures. Alternatively, the required data may be available only via a survey, after distributing a set of questionnaires, i.e. primary data.

3. choice of estimation method relevant to the model proposed: For example, is a single equation or multiple equation technique to be used?

4. statistical evaluation of the model: What assumptions were required to estimate the parameters of the model optimally? Were these assumptions satisfied by the data or the model? Also, does the model adequately describe the data? If the answer is ‘yes’, proceed to step 5; if not, go back to steps 1--3 and either reformulate the model, collect more data, or select a different estimation technique that has less stringent requirements.

5. evaluation of the model from a theoretical perspective: Are the parameter estimates of the sizes and signs that the theory or intuition from step 1 suggested? If the answer is ‘yes’, proceed to step 6; if not, again return to stages 1--3.

6. use of model: When a researcher is finally satisfied with the model, it can then be used for testing the theory specified in step 1, or for formulating forecasts or suggested courses of action. This suggested course of action might be for an individual, or as an input to government policy.

It is important to note that the process of building a robust empirical model is an iterative one, and it is certainly not an exact science. Often, the final preferred model could be very different from the one originally proposed, and need not be unique in the sense that another researcher with the same data and the same initial theory could arrive at a different final specification.

SPSS Application


Unit Nine:

The Multiple linear regression model and Statistical Inference

9.1 Introduction

So far we have seen the basic statistical tools and procedures for analyzing relationships between two variables. But in practice, economic models generally contain one dependent variable and two or more independent variables. Such models are called multiple regression models.

Example 1:

a) In demand studies we study the relationship between the demand for a good (Y) and the price of the good ($X_1$), prices of substitute goods ($X_2$) and the consumer's income ($X_3$). Here, Y is the dependent variable and $X_1$, $X_2$ and $X_3$ are the explanatory (independent) variables. The relationship is estimated by a multiple linear regression equation (model) of the form:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3$

where $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$ and $\hat{\beta}_3$ are estimated regression coefficients.

b) In a study of the amount of output (product), we are interested to establish a relationship between output (Q) and labour input (L) and capital input (K). The equations are often estimated in log-linear form as:

$\ln Q = \beta_0 + \beta_1 \ln L + \beta_2 \ln K + u$

c) In a study of the determinants of the number of children born per woman (Y), the possible explanatory variables include years of schooling of the woman ($X_1$), the woman's (or husband's) earnings at marriage ($X_2$), age of the woman at marriage ($X_3$) and the survival probability of children at age five ($X_4$). The relationship can thus be expressed as:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + u$

9.2 Estimation of regression coefficients

Example: Consider the following model with two independent variables $X_1$ and $X_2$:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$


Expressing all variables in deviation form, that is, $y = Y - \bar{Y}$, $x_1 = X_1 - \bar{X}_1$ and $x_2 = X_2 - \bar{X}_2$, the OLS estimators of the parameters $\beta_0$, $\beta_1$ and $\beta_2$ are given by:

$\hat{\beta}_1 = \dfrac{(\sum x_1 y)(\sum x_2^2) - (\sum x_2 y)(\sum x_1 x_2)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$

$\hat{\beta}_2 = \dfrac{(\sum x_2 y)(\sum x_1^2) - (\sum x_1 y)(\sum x_1 x_2)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2$

where $\bar{Y}$, $\bar{X}_1$ and $\bar{X}_2$ are the mean values of the variables Y, $X_1$ and $X_2$, respectively.

An estimator of the variance of the errors is given by:

$\hat{\sigma}^2 = \dfrac{\sum \hat{e}^2}{n-3}$

where $\hat{e} = Y - \hat{Y} = Y - \hat{\beta}_0 - \hat{\beta}_1 X_1 - \hat{\beta}_2 X_2$

The standard errors of the estimated regression coefficients $\hat{\beta}_1$ and $\hat{\beta}_2$ are estimated as:

$SE(\hat{\beta}_1) = \dfrac{\hat{\sigma}}{\sqrt{(\sum x_1^2)(1 - r_{12}^2)}}$ and $SE(\hat{\beta}_2) = \dfrac{\hat{\sigma}}{\sqrt{(\sum x_2^2)(1 - r_{12}^2)}}$

where $r_{12}$ is the coefficient of correlation between $X_1$ and $X_2$, that is:

$r_{12} = \dfrac{\sum x_1 x_2}{\sqrt{(\sum x_1^2)(\sum x_2^2)}}$
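The estimator formulas above translate directly into code. The function below is a sketch; the noise-free data at the bottom are invented purely to check that the formulas recover known coefficients:

```python
def ols_two_regressors(Y, X1, X2):
    """OLS for Y = b0 + b1*X1 + b2*X2 using the deviation-form formulas."""
    n = len(Y)
    ybar, x1bar, x2bar = sum(Y) / n, sum(X1) / n, sum(X2) / n
    y = [v - ybar for v in Y]
    x1 = [v - x1bar for v in X1]
    x2 = [v - x2bar for v in X2]

    s_x1y = sum(a * b for a, b in zip(x1, y))
    s_x2y = sum(a * b for a, b in zip(x2, y))
    s_x1x2 = sum(a * b for a, b in zip(x1, x2))
    s_x1x1 = sum(a * a for a in x1)
    s_x2x2 = sum(a * a for a in x2)

    denom = s_x1x1 * s_x2x2 - s_x1x2 ** 2
    b1 = (s_x1y * s_x2x2 - s_x2y * s_x1x2) / denom
    b2 = (s_x2y * s_x1x1 - s_x1y * s_x1x2) / denom
    b0 = ybar - b1 * x1bar - b2 * x2bar
    return b0, b1, b2

# Sanity check on an exact (noise-free) relationship Y = 1 + 2*X1 + 3*X2
X1 = [0, 1, 0, 1, 2, 1]
X2 = [0, 0, 1, 1, 1, 2]
Y = [1 + 2 * a + 3 * b for a, b in zip(X1, X2)]
print(ols_two_regressors(Y, X1, X2))
```

With no disturbance term, the formulas recover (1.0, 2.0, 3.0) exactly, up to floating-point error.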

Example 2: Consider the following data on per capita food consumption (Y), price of food ($X_1$) and per capita income ($X_2$) for the years 1927–1941 in the United States. Retail price of food and per capita disposable income are deflated by the Consumer Price Index.


Year Y X1 X2 Year Y X1 X2

1927 88.9 91.7 57.7 1935 85.4 88.1 52.1

1928 88.9 92.0 59.3 1936 88.5 88.0 58.0

1929 89.1 93.1 62.0 1937 88.4 88.4 59.8

1930 88.7 90.9 56.3 1938 88.6 83.5 55.9

1931 88.0 82.3 52.7 1939 91.7 82.4 60.3

1932 85.9 76.3 44.4 1940 93.3 83.0 64.1

1933 86.0 78.3 43.8 1941 95.1 86.2 73.7

1934 87.1 84.3 47.8

We want to fit a multiple linear regression model:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2$

To simplify the calculations, it is better to work with deviations: $y = Y - \bar{Y}$, $x_1 = X_1 - \bar{X}_1$ and $x_2 = X_2 - \bar{X}_2$. The transformed values are shown in the following table.


Year Y X1 X2 y x1 x2

1927 88.9 91.7 57.7 -0.007 5.800 1.173

1928 88.9 92.0 59.3 -0.007 6.100 2.773

1929 89.1 93.1 62.0 0.193 7.200 5.473

1930 88.7 90.9 56.3 -0.207 5.000 -0.227

1931 88.0 82.3 52.7 -0.907 -3.600 -3.827

1932 85.9 76.3 44.4 -3.007 -9.600 -12.127

1933 86.0 78.3 43.8 -2.907 -7.600 -12.727

1934 87.1 84.3 47.8 -1.807 -1.600 -8.727

1935 85.4 88.1 52.1 -3.507 2.200 -4.427

1936 88.5 88.0 58.0 -0.407 2.100 1.473

1937 88.4 88.4 59.8 -0.507 2.500 3.273

1938 88.6 83.5 55.9 -0.307 -2.400 -0.627

1939 91.7 82.4 60.3 2.793 -3.500 3.773

1940 93.3 83.0 64.1 4.393 -2.900 7.573

1941 95.1 86.2 73.7 6.193 0.300 17.173

Total 1333.6 1288.5 847.9

Mean 88.90667 85.90 56.52667


The necessary calculations using the transformed variables are shown below:

y x1 x2 y·x1 y·x2 x1·x2 x1² x2² y²

-0.007 5.800 1.173 -0.039 -0.008 6.805 33.640 1.377 4.45E-05

-0.007 6.100 2.773 -0.041 -0.018 16.917 37.210 7.691 4.45E-05

0.193 7.200 5.473 1.392 1.058 39.408 51.840 29.957 0.037

-0.207 5.000 -0.227 -1.033 0.047 -1.133 25.000 0.051 0.043

-0.907 -3.600 -3.827 3.264 3.470 13.776 12.960 14.643 0.822

-3.007 -9.600 -12.127 28.864 36.461 116.416 92.160 147.056 9.040

-2.907 -7.600 -12.727 22.091 36.992 96.723 57.760 161.968 8.449

-1.807 -1.600 -8.727 2.891 15.766 13.963 2.560 76.155 3.264

-3.507 2.200 -4.427 -7.715 15.523 -9.739 4.840 19.595 12.297

-0.407 2.100 1.473 -0.854 -0.599 3.094 4.410 2.171 0.165

-0.507 2.500 3.273 -1.267 -1.658 8.183 6.250 10.715 0.257

-0.307 -2.400 -0.627 0.736 0.192 1.504 5.760 0.393 0.094

2.793 -3.500 3.773 -9.777 10.540 -13.207 12.250 14.238 7.803

4.393 -2.900 7.573 -12.741 33.272 -21.963 8.410 57.355 19.301

6.193 0.300 17.173 1.858 106.36 5.152 0.090 294.923 38.357

TOTAL 27.630 257.397 275.900 355.14 838.289 99.929

Summary statistics:

n = 15, $\sum y x_1$ = 27.63, $\sum y x_2$ = 257.397, $\sum x_1 x_2$ = 275.9, $\sum x_1^2$ = 355.14, $\sum x_2^2$ = 838.289, $\sum y^2$ = 99.929, $\bar{Y}$ = 88.90667, $\bar{X}_1$ = 85.9, $\bar{X}_2$ = 56.52667

OLS estimates of the regression coefficients are:

$\hat{\beta}_1 = \dfrac{(27.63)(838.289) - (257.397)(275.9)}{(355.14)(838.289) - (275.9)^2} = -0.21596$


$\hat{\beta}_2 = \dfrac{(257.397)(355.14) - (27.63)(275.9)}{(355.14)(838.289) - (275.9)^2} = 0.378127$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}_1 - \hat{\beta}_2\bar{X}_2$

= 88.90667 – (-0.21596)(85.9) – (0.378127)(56.52667)

= 86.08318

Hence, the estimated model is:

$\hat{Y} = 86.08318 - 0.21596\,X_1 + 0.378127\,X_2$

Estimation of standard errors of estimated coefficients

The estimated errors (residuals) are: ei = Yi - Yhat_i

The error sum of squares is ESS = Σei^2 = 8.567271. Thus, an estimator of the error variance is:

sigma-hat^2 = ESS/(n - 3) = 8.567271/12 = 0.713939

The coefficient of correlation between X1 and X2 is computed as:

r12 = Σx1x2 / sqrt(Σx1^2·Σx2^2) = 275.9 / sqrt(355.14 × 838.289) = 0.505656

The standard errors of the estimated regression coefficients b1 and b2 are estimated as:

se(b1) = sqrt( sigma-hat^2 · Σx2^2 / (Σx1^2·Σx2^2 - (Σx1x2)^2) ) = 0.05197

se(b2) = sqrt( sigma-hat^2 · Σx1^2 / (Σx1^2·Σx2^2 - (Σx1x2)^2) ) = 0.033826
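As a numeric check, the deviation-form formulas above can be evaluated directly. The following Python sketch is not part of the handout (the variable names are my own); it plugs the summary sums from the worktable into the formulas for the coefficients and standard errors:

```python
import math

# Summary sums from the worktable (deviation form), n = 15
n = 15
Sx1y, Sx2y, Sx1x2 = 27.63, 257.397, 275.9
Sx1sq, Sx2sq = 355.14, 838.289
Ybar, X1bar, X2bar = 88.90667, 85.9, 56.52667

D = Sx1sq * Sx2sq - Sx1x2 ** 2           # common denominator
b1 = (Sx1y * Sx2sq - Sx2y * Sx1x2) / D   # slope on food price
b2 = (Sx2y * Sx1sq - Sx1y * Sx1x2) / D   # slope on income
b0 = Ybar - b1 * X1bar - b2 * X2bar      # intercept

ESS = 8.567271                           # error sum of squares
sigma2 = ESS / (n - 3)                   # error variance (k = 3)
se_b1 = math.sqrt(sigma2 * Sx2sq / D)    # standard error of b1
se_b2 = math.sqrt(sigma2 * Sx1sq / D)    # standard error of b2
```

Running this reproduces b1 ≈ -0.21596, b2 ≈ 0.378127, b0 ≈ 86.083, se(b1) ≈ 0.05197 and se(b2) ≈ 0.033826, matching the hand calculations above.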

10. Evaluating the regression equation

Is the estimated equation a useful one? To answer this, an objective measure of some sort is desirable. Such an objective measure, called the coefficient of determination, is available. First let us define some measures of dispersion or variability.

The total sum of squares (TSS) is a measure of dispersion of the observed values of Y about their mean. This is computed as: TSS = Σ(Yi - Ybar)^2 = Σyi^2

The regression (explained) sum of squares (RSS) measures the amount of the total variability in the observed values of Y that is accounted for by the linear relationship between the observed values of X and Y. This is computed as: RSS = Σ(Yhat_i - Ybar)^2

The error (residual or unexplained) sum of squares (ESS) is a measure of the dispersion of the observed values of Y about the regression line. This is computed as: ESS = Σ(Yi - Yhat_i)^2 = Σei^2

Note: It can be shown that the total sum of squares is the sum of the regression sum of squares and the error sum of squares; i.e., TSS = RSS + ESS.

If a regression equation does a good job of describing the relationship between the dependent variable and the independent variables, the regression (explained) sum of squares (RSS) should constitute a large proportion of the total sum of squares (TSS). Thus, it would be of interest to determine the magnitude of this proportion by computing the ratio of the explained sum of squares to the total sum of squares. This proportion is called the sample coefficient of determination, R^2. That is:

Coefficient of determination = R^2 = RSS/TSS = 1 - ESS/TSS

R^2 measures the proportion of variation in the dependent variable that is explained by the independent variables (or by the linear regression model). It is a goodness-of-fit statistic. The

proportion of total variation in the dependent variable that is accounted for by factors other than X (for example, due to excluded variables, chance, etc.) is equal to (1 - R^2) × 100%.

Example 3: Consider the data on per capita food consumption (Y), price of food (X1) and per capita income (X2). Calculate the coefficient of determination and interpret.

Solution

The variation in the dependent variable Y (food consumption) can be decomposed into:

Total sum of squares: TSS = Σy^2 = 99.929

Error sum of squares: ESS = Σe^2 = 8.567271

Regression sum of squares = RSS = TSS – ESS = 91.362

The coefficient of determination is thus:

R^2 = RSS/TSS = 91.362/99.929 = 0.914

R^2 = 0.914 indicates that 91.4% of the variation (change) in food consumption is attributed to the effect of food price and/or consumer income.

1 - R^2 = 0.086. This indicates that 8.6% of the variation in food consumption is due to factors (variables) not included in our specification (such as habit persistence, geographical and time variation, etc.).

Analysis of variance (ANOVA)

R^2 measures the proportion of variation in the dependent variable Y that is explained by the explanatory variables (or by the multiple linear regression model). The largest value that R^2 can assume is 1 (in which case all observations fall on the regression line), and the smallest it can assume is zero. A small value of R^2 casts doubt about the usefulness of the regression equation. We do not, however, pass final judgment on the equation until it has been subjected to an objective statistical test.

A test for the significance of R^2 (i.e., of the adequacy of the multiple linear regression model) is equivalent to testing the hypotheses:

H0: beta1 = beta2 = 0 versus H1: at least one beta_j ≠ 0 (j = 1, 2)

The null hypothesis (H0) states that all regression coefficients are insignificant (none of them explains the dependent variable). Not rejecting H0 means that such a model is inadequate, and is useless for prediction or inferential purposes.

The above test is accomplished by means of analysis of variance (ANOVA), which enables us to test the significance of R^2. The ANOVA table for the multiple linear regression model is given below:

ANOVA table for multiple linear regression

Source of variation   Sum of squares   Degrees of freedom   Mean square          Variance ratio
Regression            RSS              k - 1                MSR = RSS/(k - 1)    F = MSR/MSE
Residual              ESS              n - k                MSE = ESS/(n - k)
Total                 TSS              n - 1

Here k is the number of parameters estimated from the sample data and n is the sample size. To test for the significance of R^2, we compare the variance ratio F with F_alpha(k - 1, n - k), the critical value from the F distribution with (k - 1) and (n - k) degrees of freedom in the numerator and denominator, respectively, for a given significance level alpha.

Decision rule: Reject H0 if F > F_alpha(k - 1, n - k).

If H0 is rejected, we then conclude that R^2 is significant (or that the fitted model is adequate and is useful for prediction purposes).

Note:

As the number of explanatory (independent) variables increases, R^2 always increases. This implies that the goodness-of-fit of an estimated model depends on the number of independent (explanatory) variables, regardless of whether they are important or not. To eliminate this dependency, we calculate the adjusted R^2 (denoted by adj-R^2) as:

adj-R^2 = 1 - (1 - R^2)(n - 1)/(n - k)

Unlike R^2, adj-R^2 may increase or decrease when new variables are added into the model.
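The adjusted R^2 can be checked against the value SPSS reports for this example (0.900). A quick Python sketch, not part of the handout:

```python
# Adjusted R-squared for the food-consumption model; n = 15 observations,
# k = 3 estimated coefficients. RSS and TSS are from the worked example.
n, k = 15, 3
R2 = 91.362 / 99.929                       # R^2 = RSS / TSS
R2_adj = 1 - (1 - R2) * (n - 1) / (n - k)  # adjusted R^2
```

This gives adj-R^2 ≈ 0.900, the figure SPSS labels "Adjusted R Square".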

Example 4: Consider the multiple regression model of per capita food consumption (Y) on price of food (X1) and per capita income (X2) given by:

Y = beta0 + beta1·X1 + beta2·X2 + e

The fitted multiple regression model from the sample data was:

Yhat = 86.083 - 0.216X1 + 0.378X2

Is the model adequate?

Solution

Here, k = 3 (since we have estimated three regression coefficients b0, b1, b2), n = 15, TSS = 99.929, and ESS = 8.567271.

A test of model adequacy is accomplished by testing the hypothesis:

H0: beta1 = beta2 = 0 versus H1: at least one beta_j ≠ 0

The ANOVA table is:

ANOVA table for multiple linear regression

Source of variation   Sum of squares   Degrees of freedom   Mean square   Variance ratio
Regression            91.362           3 - 1 = 2            45.681        F = 45.681/0.714 = 63.98
Residual              8.567            15 - 3 = 12          0.714
Total                 99.929           15 - 1 = 14

We then compare this F-ratio with F_alpha(k - 1, n - k) = F_alpha(2, 12):

For alpha = 0.01, F_0.01(2, 12) = 6.93

For alpha = 0.05, F_0.05(2, 12) = 3.89

Since the test statistic is greater than both tabulated values, the above ratio is significant at the conventional levels of significance (1% and 5%). Thus, we reject the null hypothesis and conclude that the model is adequate, that is, variation (change) in per capita food consumption is significantly attributed to the effect of food price and/or per capita disposable income.
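The F-test decision rule above can be sketched in a few lines of Python (not part of the handout; the critical values are read from standard F tables, not computed):

```python
# Overall F-test of model adequacy from the ANOVA decomposition.
RSS, ESS = 91.362, 8.567
k, n = 3, 15
F = (RSS / (k - 1)) / (ESS / (n - k))  # variance ratio, about 63.98
F_crit_01, F_crit_05 = 6.93, 3.89      # F(2, 12) critical values from tables
adequate = F > F_crit_01 and F > F_crit_05
```

Since F exceeds both tabulated values, `adequate` is True and the null hypothesis of no linear relationship is rejected.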

11. Tests on the regression coefficients

Once we come up with the conclusion that the model is adequate, the next step would be to test for the significance of each of the coefficients in the model. To test whether each of the coefficients is significant or not, the null and alternative hypotheses are given by:

H0: beta_j = 0 versus H1: beta_j ≠ 0, for j = 1, 2.

The test statistic is the ratio of the estimated regression coefficient to its estimated standard error, that is:

t = b_j / se(b_j)

Decision rule: If |t| > t_{alpha/2}(n - k), we reject H0 and conclude that b_j is significant, that is, the regressor variable X_j significantly affects the dependent variable Y.

Example 5: Consider our fitted multiple regression model of per capita food consumption (Y) on price of food (X1) and per capita income (X2):

Yhat = 86.083 - 0.216X1 + 0.378X2

We have already calculated the standard errors of the estimated regression coefficients b1 and b2 as: se(b1) = 0.05197, se(b2) = 0.033826.

a) Does food price significantly affect per capita food consumption? The hypothesis to be tested is:

H0: beta1 = 0 versus H1: beta1 ≠ 0

The test statistic is calculated as:

t = b1/se(b1) = -0.21596/0.05197 = -4.155

For significance level alpha = 0.01 and degrees of freedom n - 3 = 15 - 3 = 12, the critical value from the Student's t-distribution is t_0.005(12) = 3.055.

Decision: Since |t| = 4.155 > 3.055, we reject the null hypothesis and conclude that food price significantly affects per capita food consumption at the 1% level of significance.

b) Does disposable income significantly affect per capita food consumption? The hypothesis to be tested is:

H0: beta2 = 0 versus H1: beta2 ≠ 0

The test statistic is calculated as:

t = b2/se(b2) = 0.378127/0.033826 = 11.178

The 1% critical value from the student’s t-distribution is again 3.055.

Decision: Since |t| = 11.178 > 3.055, we reject the null hypothesis and conclude that disposable income significantly affects per capita food consumption at the 1% level of significance.

Generally we have the following:

Food price significantly and negatively affects per capita food consumption, while disposable income significantly and positively affects per capita food consumption.
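The two t-tests can be reproduced with a short Python sketch (not part of the handout; the 1% two-sided critical value is taken from standard t tables):

```python
# t-tests on the individual coefficients; t(0.005, 12 df) = 3.055.
b1, se_b1 = -0.21596, 0.05197    # food price
b2, se_b2 = 0.378127, 0.033826   # disposable income
t1 = b1 / se_b1                  # about -4.155
t2 = b2 / se_b2                  # about 11.178
t_crit = 3.055
price_significant = abs(t1) > t_crit
income_significant = abs(t2) > t_crit
```

Both flags come out True, so each coefficient is significant at the 1% level.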

The estimated coefficient of food price is -0.21596. Holding disposable income constant, a one dollar increase in food price results in a 0.216 dollar decrease in per capita food consumption.

The estimated coefficient of disposable income is 0.378127. Holding food price constant, a one dollar increase in disposable income results in a 0.378 dollar increase in per capita food consumption.
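These ceteris-paribus interpretations can be read directly off the fitted equation. A minimal sketch (the price and income values below are illustrative, not observations from the data set):

```python
# Marginal effect of a one-dollar price increase, income held fixed,
# from the fitted equation Yhat = 86.083 - 0.216*X1 + 0.378*X2.
def predict(price, income):
    return 86.083 - 0.216 * price + 0.378 * income

base = predict(85.9, 56.5)
effect = predict(86.9, 56.5) - base  # price +1 dollar, income unchanged
```

The difference equals the price coefficient, -0.216, regardless of the starting point, which is exactly what "holding income constant" means in a linear model.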

12. Fitting a multiple linear regression model using computer software

In a multiple linear regression analysis involving a large number of explanatory variables, the computations are complicated and tedious. Fortunately, there are a number of computer packages readily available for such analysis, and thus, one does not need to go through the details of the calculations involved. The SPSS output for the above data is shown below.

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .956(a)   .914       .900                .84495

a. Predictors: (Constant), income, price

R^2 = 0.914 indicates that 91.4% of the variation (change) in food consumption is attributed to the effect of food price and/or consumer income.

ANOVA(b)

Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    91.362           2    45.681        63.984   .000(a)
Residual      8.567            12   .714
Total         99.929           14

a. Predictors: (Constant), income, price
b. Dependent Variable: consumption

A test of model adequacy is accomplished by means of analysis of variance (ANOVA), which enables us to test the null hypothesis of no linear relationship between the dependent variable and the set of explanatory variables. In this particular example, the ANOVA table is used to test the hypothesis H0: beta1 = beta2 = 0 against H1: at least one beta_j ≠ 0.

Definition: (p-value)

A p-value is the smallest level of significance, that is, the smallest value of alpha, at which the null hypothesis can be rejected.

Example:

If p-value = 0.002, then we can reject the null hypothesis for all values of alpha greater than 0.002 (such as alpha = 0.01 or alpha = 0.05).

If p-value = 0.03, then we can reject the null hypothesis for all values of alpha greater than 0.03 (such as alpha = 0.05). However, we cannot reject the null hypothesis at alpha = 0.01.

If p-value = 0.07, then we cannot reject the null hypothesis at either the 5% or the 1% level of significance.

Note: In SPSS output, the p-values are displayed in the column headed Sig.

In this particular example, Sig. = 0.000, which is less than 0.01 or 0.05. Thus, we can conclude that the model is adequate at the 1% level of significance. That is, there is a significant linear relationship between food consumption and food price and/or consumer income. This means that, based on food price and consumer income, we can make valid inferences about per capita food consumption at the 99% level of confidence.
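The p-value decision rule above amounts to a single comparison. A minimal Python sketch (not part of the handout), replaying the three worked examples:

```python
# Reject H0 whenever the reported p-value (Sig.) falls below alpha.
def reject_h0(p_value, alpha):
    return p_value < alpha

r1 = reject_h0(0.002, 0.01)  # rejected at the 1% level
r2 = reject_h0(0.03, 0.01)   # not rejected at the 1% level
r3 = reject_h0(0.03, 0.05)   # rejected at the 5% level
r4 = reject_h0(0.07, 0.05)   # not rejected at the 5% level
```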

Coefficients(a)

Model 1       Unstandardized Coefficients   Standardized Coefficients
              B        Std. Error           Beta                        t        Sig.
(Constant)    86.083   3.873                                            22.226   .000
price         -.216    .052                 -.407                       -4.155   .001
income        .378     .034                 1.095                       11.178   .000

a. Dependent Variable: consumption

As can be seen from the table above, the p-values for price and income are both less than 0.01. Thus, we can conclude that both variables significantly affect consumption at the 1% level of significance. From the signs of the estimated regression coefficients we can see that the direction of influence is opposite: price affects consumption negatively while income affects consumption positively. The constant term (intercept) is also significant.

Note: In general, if Sig. > 0.05, we doubt the importance of the variable!

Multicollinearity

Introduction

In the construction of an econometric model, it may happen that two or more variables giving rise to the same piece of information are included, that is, we may have redundant information or unnecessarily included related variables. This is what we call a multicollinearity (MC) problem.

MC of this kind is common in macroeconomic time series data (such as GNP, money supply, income, etc.) since economic variables tend to move together over time.

Consequences of a high degree of MC (moderate to strong MC)

Consider the case when there is a high degree (moderate to strong) MC but not perfect MC. What happens to the parameter estimates?

Again consider the model in deviations form (k = 3):

y = b1·x1 + b2·x2 + e

A high degree of MC means that r12, the correlation coefficient between X1 and X2, tends to 1 or -1.

We have seen earlier that the variances of b1 and b2 are estimated by:

Var(b1) = sigma-hat^2 / (Σx1^2·(1 - r12^2))  and  Var(b2) = sigma-hat^2 / (Σx2^2·(1 - r12^2))

Now, as r12 tends towards 1 or -1:

r12^2 approaches one;

(1 - r12^2) approaches zero;

the denominators Σx1^2·(1 - r12^2) and Σx2^2·(1 - r12^2) approach zero;

both Var(b1) and Var(b2) become very large (are inflated).

In particular, if r12^2 = 1, then the variances become infinite.

Recall that to test whether each of the coefficients is significant or not, that is, to test H0: beta_j = 0 versus H1: beta_j ≠ 0, the test statistic is:

t = b_j / se(b_j), where se(b_j) = sqrt(Var(b_j)).

Thus, under a high degree of MC, the standard errors will be inflated and the test statistic will be a very small number. This often leads to incorrectly accepting (not rejecting) the null hypothesis when in fact the parameter is significantly different from zero!
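The inflation of Var(b1) as r12 approaches 1 can be illustrated numerically. In the sketch below (not from the handout), the error variance and sum of squares are hypothetical values chosen only for illustration:

```python
# Var(b1) = sigma^2 / (S11 * (1 - r12^2)) grows without bound as r12 -> 1.
sigma2, S11 = 1.0, 100.0  # hypothetical error variance and sum of squares

def var_b1(r12):
    return sigma2 / (S11 * (1 - r12 ** 2))

low = var_b1(0.1)     # mild correlation between the regressors
high = var_b1(0.999)  # near-perfect correlation: variance is inflated
```

Moving r12 from 0.1 to 0.999 multiplies the variance by a factor of several hundred, which is why t-ratios collapse under severe MC.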

Major implications of a high degree of MC

1. OLS coefficient estimates are still unbiased.
2. OLS coefficient estimates will have large variances (or the variances will be inflated).
3. There is a high probability of accepting the null hypothesis of zero coefficient (using the t-test) when in fact the coefficient is significantly different from zero.
4. The regression model may do well, that is, R^2 may be quite high.
5. The OLS estimates and their standard errors may be quite sensitive to small changes in the data.

Example: Consider the following data on imports (Y), GDP (X1), stock formation (X2) and consumption (X3) for the years 1949 - 1967.

Year   Y     X1     X2    X3       Year   Y     X1     X2    X3

1949 15.9 149.3 4.2 108.1   1959 26.3 239.0 0.7 167.6

1950 16.4 161.2 4.1 114.8   1960 31.1 258.0 5.6 176.8

1951 19.0 171.5 3.1 123.2   1961 33.3 269.8 3.9 186.6

1952 19.1 175.5 3.1 126.9   1962 37.0 288.4 3.1 199.7

1953 18.8 180.8 1.1 132.1   1963 43.3 304.5 4.6 213.9

1954 20.4 190.7 2.2 137.7   1964 49.0 323.4 7.0 223.8

1955 22.7 202.1 2.1 146.0   1965 50.3 336.8 1.2 232.0

1956 26.5 212.4 5.6 154.1   1966 56.6 353.9 4.5 242.9

1957 28.1 226.1 5.0 162.3   1967 59.9 369.7 5.0 252.0

1958 27.6 231.9 5.1 164.3            

Applying OLS, we obtain the following results (using SPSS):

Coefficient Standard error t-ratio

Constant -19.982 4.372 -4.570

GDP 0.100 0.194 0.515

Stock formation 0.447 0.341 1.309

Consumption 0.149 0.297 0.501

R^2 = 0.975, F = 197.873 (p-value < 0.001)

The value of R^2 is close to 1, meaning GDP, stock formation and consumption together explain 97.5% of the variation in imports. Also, the F-statistic is significant at the 1% level of significance. Thus, the linear regression model is adequate. However, all of the estimated regression coefficients (save the constant term) are insignificant at the conventional levels of significance. This is an indication that the standard errors are inflated due to MC. Since an increase in GDP is often associated with an increase in consumption, the two tend to grow together over time, leading to MC. The coefficient of correlation between GDP and consumption is 0.999. Thus, it seems that the problem of MC is due to the joint appearance of these two variables.

Methods of detection of MC

Multicollinearity almost always exists in most applications. So the question is not whether it is present or not; it is a question of degree! Also, MC is not a statistical problem; it is a data (sample) problem. Therefore, we do not "test for MC"; rather, we measure its degree in any particular sample (using some rules of thumb).

Some of the methods of detecting MC are:

1. High R^2 but few (or no) significant t-ratios.
2. High pair-wise correlations among regressors. Note that this is a sufficient but not a necessary condition; that is, small pair-wise correlations for all pairs of regressors do not guarantee the absence of MC.
3. Variance inflation factor (VIF).

Consider the regression model:

Y = beta0 + beta1·X1 + beta2·X2 + ... + beta_k·X_k + e ……… (*)

The VIF of X_j is defined as:

VIF(X_j) = 1 / (1 - R_j^2)

where R_j^2 is the coefficient of determination obtained when the variable X_j is regressed on the remaining explanatory variables (called the auxiliary regression). For example, the VIF of X1 is defined as:

VIF(X1) = 1 / (1 - R_1^2)

where R_1^2 is the coefficient of determination of the auxiliary regression of X1 on X2, ..., X_k.

Rule of thumb:

a) If VIF(X_j) exceeds 10, then b_j is poorly estimated because of MC (or the jth regressor variable X_j is responsible for MC).

b) (Klein's rule) MC is troublesome if any of the R_j^2 exceeds the overall R^2 (the coefficient of determination of the regression equation (*)).

Example: Consider the data on imports (Y), GDP (X1), stock formation (X2) and consumption (X3) for the years 1949 - 1967. The coefficient of determination of the auxiliary regression of GDP (X1) on stock formation (X2) and consumption (X3) is (using SPSS) R_1^2 = 0.998203. The VIF of X1 is thus:

VIF(X1) = 1 / (1 - 0.998203) = 556.5

Since this figure far exceeds 10, we can conclude that the coefficient of GDP is poorly estimated because of MC (or that GDP is responsible for MC). It can also be shown that the VIF of consumption far exceeds 10, indicating that consumption is also responsible for MC.
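The VIF calculation is a one-liner once the auxiliary-regression R^2 is known. A sketch in Python (not part of the handout), using the R^2 reported above:

```python
# VIF of GDP from its auxiliary regression on stock formation and consumption.
R2_aux = 0.998203
VIF_gdp = 1 / (1 - R2_aux)   # about 556, far above the cutoff of 10
troublesome = VIF_gdp > 10   # rule of thumb: VIF > 10 flags the regressor
```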

Remedial measures

To circumvent the problem of MC, some of the possibilities are:

1. Include additional observations, maintaining the original model, so that a reduction in the correlation among variables is attained.

2. Drop a variable. This may result in an incorrect specification of the model (called specification bias). If we consider our example, we expect both GDP and consumption to have an impact on imports. By dropping one or the other, we introduce specification bias.

Exercise with SPSS Application
