Cairo University
Faculty of Economics and Political Science
Department of Statistics
ESTIMATION OF THE INTERCLASS CORRELATION
COEFFICIENT USING MINQUE APPROACH
WITH APPLICATIONS
Prepared by Sohier Kotb Ahmed
Supervised by
Prof. Heba El-Laithy
Department of Statistics
Faculty of Economics and Political Science
Cairo University

Prof. Sahar El-Tawela
Department of Statistics
Faculty of Economics and Political Science
Cairo University
A Thesis Submitted to the
Department of Statistics
In Partial Fulfillment of the Requirements
For the Degree of Master of Science
In the Faculty of Economics and Political Science
Cairo University 2012
CAIRO UNIVERSITY
FACULTY OF ECONOMICS AND POLITICAL SCIENCE
DEPARTMENT OF STATISTICS
The undersigned hereby certify that they have read and recommend to the
Faculty of Economics and Political Science for acceptance a thesis entitled
“Estimation of the Interclass Correlation Coefficient Using MINQUE
Approach with Applications” by Sohier Kotb Ahmed
in partial fulfillment of the requirements for the degree of Master of Science.
Dated: Nov. 21, 2012
Research Supervisor: _____________________________
Prof. Heba El-Laithy
External Examiner: ______________________________
Prof. Amany Mousa
Internal Examiner:
______________________________
Prof. Dina Magdy
Abstract
This study is concerned with the estimation of the interclass correlation. A MINQUE
estimator for the interclass correlation is derived.
We apply this estimator to data from the ‘Egypt Demographic and Health Survey,
2000’ (EDHS-2000) to study the relationship between a mother’s education
and her children’s education. Relevant data from the EDHS-2000 file were
used to construct the different indices in SPSS, and a MATLAB program was used to
estimate the interclass correlation from these data.
Key words: Familial data; Variance-covariance components; The interclass correlation;
Point estimation; Analysis of variance; Minimum Norm Quadratic Unbiased Estimation
(MINQUE); The 2000 Egypt Demographic and Health Survey (2000 EDHS).
Name: Sohier Kotb Ahmed Eldahshan
Nationality: Egyptian.
Date and place of birth: 27/9/1966, Cairo.
Degree: Master.
Specialization: Statistics.
Supervisors: Prof. Dr. Heba El-Laithy
Prof. Sahar El-Tawela
Department of Statistics
Faculty of Economics and Political Science
Cairo University
Title of the thesis: Estimation of the Interclass Correlation Coefficient Using MINQUE
Approach with Applications.
Summary
Estimating the degree of resemblance among family members, and in particular
the interclass correlation, has long been a topic of interest to a large number
of researchers.
The present study is mainly concerned with the estimation of the interclass correlation.
A MINQUE estimator for the interclass correlation is derived. We apply this estimator
to data from the “Egypt Demographic and Health Survey, 2000” (EDHS-2000) to study
the relationship between a mother’s education and her children’s education.
Relevant data from the EDHS-2000 file were used to construct the different
indices in SPSS, and a MATLAB program was used to estimate the interclass correlation
from the EDHS-2000 data.
The aim of this study can be summarized in two main objectives. The first is
to estimate the interclass correlation using MINQUE. The second is to estimate
the interclass correlation between a mother’s education and the level of education of her
children, using the derived MINQUE estimator. Data from the "Egypt Demographic and Health
Survey, 2000" were used for this purpose.
The present study is divided into five chapters, organized as follows:
Chapter one: contains an introduction to the study, which includes the
objectives of the study, the data source, background information on both
familial correlation and the MINQUE technique, a literature review of
the relationship between the education of the mother and that of her
children, and the structure of the study.
Chapter two: presents the principal concepts of the MINQUE approach.
Chapter three: gives an extensive review of the literature on methods of estimating
the interclass correlation.
Chapter four: describes the data under consideration and their source, identifies the
study variables, and presents the derivation of the interclass correlation
estimator by the MINQUE method.
Chapter five: presents the conclusions of this research.
Finally, this study includes a list of references and two appendices.
Acknowledgements
My profound appreciation and gratitude go to Prof. Heba El-Laithy for her kind
supervision, creative suggestions, valuable comments and great help throughout the
accomplishment of this thesis.
I am also thankful to Dr. Zakaria Abdel-Wahed for his guidance through the early
years of confusion, and for the time he spent revising the formulas appearing in the
thesis.
Special thanks go to Dr. Sahar El-Tawela for her constant support and kind help.
Many thanks go to Prof. Fatma El-Zanaty for providing me with the data used in the
empirical application.
I would like to thank my colleagues at the National Center of Social and
Criminological Research, especially Prof. Magda Abdel-Ghani, and also my friend
Dr. Abeer Saleh, for their great help.
Special thanks go to my husband and my daughters for their hard efforts and sacrifices
to help me.
Of course, I am grateful to my parents for their patience, love and prayers.
Without them this work would never have come into existence.
Sohier Kotb Ahmed
Cairo, Egypt
--, 2012
Notations & Abbreviations
$N$ : the number of families.
$K$ : the total number of children in all $N$ families ($K = n_1 + n_2 + \cdots + n_N$).
$n_i$ : the total number of offspring in the $i$th family.
$y_i$ : the measurement made on the parent in the $i$th family.
$x_i$ : the vector of measurements made on the offspring in the $i$th family, $x_i = (x_{i1}, x_{i2}, \ldots, x_{in_i})$.
$\mu_m$ : the mean of individual measurements of mothers.
$\mu_s$ : the mean of individual measurements of offspring.
$\rho_{ms}$ : the interclass correlation coefficient between a parent and offspring.
$\rho_{ss}$ : the intraclass correlation coefficient among siblings.
$\sigma_m^2$ : the variance of individual measurements of parents.
$\sigma_s^2$ : the variance of individual measurements of offspring.
MINQUE : the Minimum Norm Quadratic Unbiased Estimator.
MLE : the Maximum Likelihood Estimator.
Table of Contents
Chapter (1): Introduction
1-1 Familial data
1-2 Objectives of the study
1-3 Familial Correlation Literature Review
1-4 The MINQUE technique
1-5 Source of Data
1-6 The relationship between education of the mother and her children
1-7 Organization of the thesis
Chapter (2): The Minimum Norm Quadratic Unbiased Estimation
2-1 Introduction
2-2 The model
2-3 The Principles of MINQUE in the Linear Model
2-4 MINQUE with a priori weights
Chapter (3): Methods of estimating the interclass correlation
3.1 Introduction
3.2 The Model for Familial Data
3.3 Estimators of Interclass Correlation
3.3.1 The Pairwise Estimator
3.3.2 The Sib-Mean Estimator
3.3.3 The Random-Sib Estimator
3.3.4 The Ensemble Estimator
3.3.5 The Maximum Likelihood Estimator
3.3.6 The Weighted Sums of Squares Estimator
3.3.7 The MINQUE Estimator
Chapter (4): Estimation of interclass correlation between mother’s education and her children’s
4.1 Introduction
4.2 Source of Data
4.2.1 Correlates to children and mother's education
4.3 The Study Variables
4.3.1 Children’s Education Index
4.3.2 Mother’s Education Index
4.4 Derivation of interclass correlation by the MINQUE
4.5 The result
Chapter (5): Concluding remarks
References
Appendix (1)
Appendix (2)
Chapter one
Introduction
1-1 Familial data
Familial data are observed in many different fields of research, including
epidemiology, genetics, heredity, and psychology. A common assumption about
familial data is dependency between family members, as relatives tend to
have similar attributes. Welson (2010) presented an extended history of
research on estimating this dependency using familial correlations.
Formally, familial correlations measure the degree of resemblance
between family members with respect to some specified quantitative
characteristic such as height, weight, cholesterol, lung capacity, or blood pressure.
There are two types of familial correlation in familial data: the intraclass
correlation coefficient and the interclass correlation coefficient.
The intraclass correlation measures the degree of resemblance between
members of the same group, for example: it might refer to the measure of
resemblance between the children of a family, the sons of a family, or the
daughters of a family.
The interclass correlation measures the degree of resemblance between
members of different groups, for example: it can refer to the measure of
resemblance between the parents and children of a family, the parents and
sons of a family, the parents and daughters of a family, or the sons and
daughters of a family.
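As a rough illustration for the mother-children case, the two coefficients can be written in symbols. The sketch below assumes a setting that is not spelled out at this point in the text: the mother's measurement $y_i$ has variance $\sigma_m^2$, every child's measurement $x_{ij}$ in family $i$ has the same variance $\sigma_s^2$, and the covariances do not depend on the family or on which children are chosen.

```latex
\[
\rho_{ms} = \frac{\operatorname{Cov}(y_i, x_{ij})}{\sigma_m \, \sigma_s},
\qquad
\rho_{ss} = \frac{\operatorname{Cov}(x_{ij}, x_{ik})}{\sigma_s^{2}},
\quad j \neq k .
\]
```

Here $\rho_{ms}$ is the interclass (mother-child) correlation and $\rho_{ss}$ is the intraclass (sib-sib) correlation.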
These types of familial correlations have applications in several areas of
study. Estimation of interclass correlations is the main interest in the present
work.
1-2 Objectives of the study
This study is mainly concerned with the estimation of the interclass
correlation using the MINQUE approach. Accordingly, a MINQUE estimator for
the interclass correlation is derived.
We apply this estimator to the data of the
‘Egypt Demographic and Health Survey, 2000’ to investigate the relationship
between a mother’s education and her children’s education.
Relevant data from the EDHS-2000 file were used to construct the different
indices in SPSS, and a MATLAB program was used to estimate the interclass
correlation for these data. The aim of this study can be
summarized in two main objectives as follows:
The first is to derive an estimator of the interclass correlation using
MINQUE.
The second is to estimate the interclass correlation between a mother’s
education and the level of education of her children, using the
derived MINQUE estimator.
Data from the "Egypt Demographic and Health Survey,
2000" were used for this purpose.
1-3 Familial Correlation Literature Review
Several estimators have been proposed for the interclass correlation
coefficient. Some of these estimators have been discussed in detail by
Rosner et al. (1977). The first is the pairwise estimator, where each child in a
family is paired with the mother of that family. The second is the sib-mean
estimator, where the mean offspring score from a family is paired with the
mother of that family. The third is the random-sib estimator, where a random
offspring is chosen from each family and paired with the mother of that
family. The last is the ensemble estimator, whereby an ‘expected value’
of the random-sib estimator is computed over all possible choices of random
sibs from each family. For the pairwise, sib-mean, and random-sib estimators,
an ordinary Pearson correlation is computed from the set of pairs formed
over all families in the sample.
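The three Pearson-based estimators described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB code used in the thesis: the function names, the toy data, and the choice of NumPy are assumptions made here, and the ensemble estimator is left out because its formula is not given in this section.

```python
import numpy as np

def pairwise_estimator(mothers, sibs):
    """Pair every child with the mother of the same family, then correlate."""
    # Repeat each mother's score once per child in her family.
    y = np.concatenate([np.full(len(s), m) for m, s in zip(mothers, sibs)])
    x = np.concatenate([np.asarray(s, dtype=float) for s in sibs])
    return np.corrcoef(y, x)[0, 1]

def sib_mean_estimator(mothers, sibs):
    """Pair each mother with the mean score of her children, then correlate."""
    means = np.array([np.mean(s) for s in sibs])
    return np.corrcoef(np.asarray(mothers, dtype=float), means)[0, 1]

def random_sib_estimator(mothers, sibs, rng):
    """Pair each mother with one randomly chosen child, then correlate."""
    picks = np.array([rng.choice(s) for s in sibs])
    return np.corrcoef(np.asarray(mothers, dtype=float), picks)[0, 1]

# Hypothetical toy data: one score per mother, a list of scores per family.
mothers = [2.0, 4.0, 6.0, 8.0]
sibs = [[1.0, 3.0], [3.0], [5.0, 6.0, 7.0], [9.0]]

print("pairwise  :", pairwise_estimator(mothers, sibs))
print("sib-mean  :", sib_mean_estimator(mothers, sibs))
print("random-sib:", random_sib_estimator(mothers, sibs, np.random.default_rng(0)))
```

When every family has exactly one child, the three functions form the same set of pairs and therefore return the same value; they differ only when family sizes exceed one.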
Under the assumption of normality, Srivastava (1984) derived
iterative maximum likelihood estimators of the parent-children correlation
using a canonical reduction of the data. He also proposed two sets of
alternative estimators based on the canonical reduction that do not require an
iterative procedure and have better distributional properties. All three sets of
estimators allow families to have different numbers of children.
Srivastava and Keen (1988) derived a noniterative method, the weighted
sums of squares technique, for estimating the interclass correlation.
Rosner et al. (1977) showed that the pairwise and
ensemble estimators are superior to the sib-mean and random-sib estimators in
terms of mean squared error, with the pairwise estimator being superior when the
intraclass correlation $\rho_{ss}$ is low and the ensemble estimator being
superior when $\rho_{ss}$ is high.
Rosner (1979), in a further simulation study, showed that the pairwise
estimator is roughly equivalent in mean squared error to the maximum
likelihood estimator for small values of the intraclass correlation $\rho_{ss}$.
For equal numbers of siblings per
family, the pairwise estimator is the maximum likelihood estimator.
Accordingly, the pairwise procedure has generally been accepted as a
reasonable approach for estimating the interclass correlation in most practical
situations, especially since maximum likelihood estimation, in general,
presents computational difficulties.
Srivastava and Katapa (1986) compared the asymptotic distributions of the
maximum likelihood estimators and the alternative estimators proposed by
Srivastava (1984).
Eliasziw et al. (1990) demonstrated that the estimator proposed by
Srivastava (1984) is identical to the modified sib-mean
estimator when the sib-sib correlation is estimated by the method of
unweighted group means, and is only slightly more efficient than the ensemble
estimator. Their additional finite-sample Monte Carlo simulation results reaffirm that
the ensemble and Srivastava estimators are essentially indistinguishable in
terms of mean squared error and bias.
1-4 The MINQUE technique
Hartley, J. N. K. Rao, and Kiefer (1969) proposed a new method of
estimation for general linear models with heteroscedastic error variances.
C. R. Rao (1970) named it MINQUE (minimum norm quadratic
unbiased estimation or estimator(s), depending on the context).
C. R. Rao (1971a, 1971b, 1972) generalized it to variance and covariance
components models. J. N. K. Rao (1971, 1973) compared the MINQUE and
modified MINQUE estimators of heteroscedastic variances (the latter being
just the average of the squared residuals) with the usual sample variances in
the case of replicated data.
P. S. R. S. Rao and Chaubey (1978) considered some modifications of
MINQUE and gave generalizations. These authors made it possible to
estimate the distinct elements of the covariance matrix using methods
similar to MINQUE in univariate as well as multivariate situations.
The novelty of the MINQUE method is that it lays down a new
optimality criterion for estimators and yields explicit estimators in
complicated situations. Chaubey (1980b) used this method to estimate the
variances and covariances arising from an unbalanced regression with
residuals having a covariance matrix of intraclass form. Kleffe (1993)
derived the explicit form of the estimate of the interclass correlation using
MINQUE.
1-5 Source of Data
The source of data used in this study is the 2000 Egypt Demographic and
Health Survey (2000 EDHS). This survey interviewed a nationally
representative sample of 15,573 ever-married women aged 15-49. It is the
sixth in the series of Demographic and Health Surveys conducted in Egypt.
In addition to its main purpose of obtaining data from the
community on the current health situation, the 2000 EDHS included a special
module collecting data on children’s education. From this module, we construct
two indices: one for the mother’s education and another for her children’s
education.
1-6 The relationship between education of the mother and her children
Most research on the relationship between parental education and
children’s education focuses on the duration of child schooling as the primary
outcome measure. Fewer studies have analyzed the relationship
between parental education and children’s learning within school.
Brown (2003) discusses the landmark study of education in the United
States known as the “Coleman Report” (United States National Center for
Educational Statistics, 1966), which reported that family characteristics are
more important determinants of educational achievement than school quality
or teacher experience, particularly in the early stages of schooling.
The first objective of Brown’s paper is to understand how parental education
affects investments in children’s human capital. Using a new survey of
children, households, schools, and communities in Gansu, China, he
estimated the demand for six education-related investments. He found that
more educated parents provide higher levels of both education-related goods
(e.g., the provision of children’s books) and education-related time (e.g.,
time spent reading to children). The study suggests that the perceived returns
to education are higher for the children of more educated parents.
The second objective of the paper is to analyze the extent to which these
investments explain the robust relationship between parental education and
children’s learning described in the literature. To this end, he estimates
the effect of parental education on children’s Chinese and mathematics test
scores with and without controlling for individual investments; a reduction in
the estimated effect of parental education when controlling for an investment
is interpreted as the degree to which that investment explains the
relationship between parental education and test scores. Parental education
has a strong positive effect on children’s test scores. Even though the
direction of causality is uncertain in some estimates, the paper shows that a
correlation between parental education and various investments in children’s
human capital development is evident.
Magnuson (2002) addresses the following question: "Does an increase in a
mother’s education improve her young child’s academic performance?"
Positive correlations between mothers’ educational attainment and children’s
well-being, in particular children’s cognitive development and academic
outcomes, are among the most replicated results from developmental studies.
Yet, surprisingly little is known about the causal nature of this relationship.
Because conventional regression (e.g., OLS) and analysis of variance (e.g.,
ANOVA) approaches to estimating the effect of maternal schooling on child
outcomes may be biased by omitted variables, the study uses experimentally
induced differences in mothers’ education to estimate instrumental variable
(IV) models. The data come from the National Evaluation of Welfare to
Work Strategies Child Outcomes Study (NEWWS-COS).
The study found that increases in maternal education are significantly and
positively associated with children’s academic school readiness, and
negatively associated with children’s academic problems. The IV models
produce larger, although less precise, estimates than the OLS
models.
Jerrim and Micklewright (2009) focus on the socio-economic
characteristics of each parent and the different influences they exert on boys
and girls. Their data come from the Programme for International Student
Assessment (2003 PISA).
In 2003, PISA tested children’s ability in one major domain (maths) and two
‘minor’ domains (reading and science). Focusing in particular on the
results for maths, they tested the association between each parent’s
education and their child’s cognitive skills at age 15 using regression models
in which each parent’s years of education enter separately. It is not easy to
make broad generalizations from their results about the relative importance
of father’s and mother’s education for their children’s cognitive ability in
secondary school and how this varies for sons and daughters. They
attempted to present a summary of the general picture, focusing on the
results for ability in maths.
First, it is more common for father’s education to have a greater effect than
mother’s education (they use ‘effect’ without implying causality). Second,
this seems to be particularly true of sons, but there are plenty of countries
that are counterexamples for both sons and daughters. Third, they found
more variation across countries in the differences between the effects of fathers
and mothers than in the differences in the effect either parent has on sons and
daughters. Fourth, there is some suggestion of a common pattern across countries
that mothers have more effect on their daughters than on their sons, although the
differences are small. Fifth, they frequently found complementarities
between mothers’ and fathers’ education that warrant further attention.
1-7 Organization of the thesis
This thesis consists of five chapters. The first is an introductory
chapter, which presents the importance and the objectives of the study.
Chapter two presents the principles of the MINQUE approach. Chapter
three reviews the literature on methods of estimating the interclass
correlation. Chapter four describes the source of data and the
study variables and derives the MINQUE estimator for the interclass
correlation. Finally, the conclusions are presented in the last chapter.