a nonparametric measure of inequality
TRANSCRIPT
-
8/7/2019 A Nonparametric Measure of Inequality
1/44
by
Tanmoy Talukdar, Avik Sarkar, Ritam Bhaumik and
Riddhipratim Basu
-
The problem of measuring inequality has been a very important one in economic statistics.
When the data are univariate, we know of various parametric inequality measures, which we generalize for multivariate applications.
However, it seems that distribution-
-
On why we prefer a nonparametric inequality measure
Nonparametric measures, being distribution-free, do not depend on (not necessarily valid) distributional assumptions, as parametric measures do.
They will, moreover, be robust to small errors in measurement which might be present in the data.
-
The problem that we had as our motivation:
Suppose we have data available on all states of India on more than one variable or attribute that is an indicator of the socio-economic growth of the state. We are interested in finding whether there is a significant amount of inequality present among the states in this respect.
As we want to construct a nonparametric measure, it is most natural to look at the
-
Let us suppose we have n > 1 blocks, denoted by $B_1, B_2, \ldots, B_n$.
Let us assume we also have m > 1 categories.
We assume that all the categories are equally important.
For each of the m categories we have a ranking of the n blocks.
The nature of the data may tempt us to use
-
A necessary assumption at this stage will be that there are no ties.
Under the above assumption, each rank vector will be a permutation of {1, 2, ..., n}.
-
Denote the rank vectors as $\pi_1, \pi_2, \ldots, \pi_m$.
$A_{m \times n} = [\pi_1, \pi_2, \ldots, \pi_m]$ is the Rank matrix, with $\pi_i$ as its i-th row.
Example:
1 5 2 3 4
3 4 1 5 2
3 4 1 5 2
5 2 1 4 3
The above is a typical example of a rank matrix with four categories and five blocks.
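A rank matrix of this kind could be built from raw category scores roughly as sketched below. This is only an illustration: the helper names `ranks_without_ties` and `rank_matrix`, and the convention that rank 1 goes to the largest value, are assumptions and are not stated on the slides.

```python
def ranks_without_ties(values):
    """Rank the blocks within one category; rank 1 = largest value (assumed convention)."""
    if len(set(values)) != len(values):
        # The slides assume there are no ties at this stage.
        raise ValueError("ties present in this category")
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=True)
    ranks = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        ranks[k] = rank
    return ranks


def rank_matrix(scores):
    """scores[i][k] = raw value of block k in category i; returns the m x n rank matrix."""
    return [ranks_without_ties(row) for row in scores]


if __name__ == "__main__":
    # Two hypothetical categories observed on three blocks.
    print(rank_matrix([[10, 2, 7], [1, 5, 3]]))
```

Each row of the result is a permutation of {1, 2, ..., n}, which is what the no-ties assumption guarantees.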
-
Definition:
A Rank matrix $A_{m \times n} = [\pi_1, \pi_2, \ldots, \pi_m]$ is said to be a Complete Inequality configuration if all the rows of A are identical, i.e.,
\[ \pi_1 = \pi_2 = \cdots = \pi_m . \]
(NOTE: A Complete Inequality configuration is not unique.)
Here is a typical example:
3 4 1 5 2
3 4 1 5 2
3 4 1 5 2
3 4 1 5 2
-
Null hypothesis:
$H_0 : P(\pi_i = \pi) = 1/n!$ for all permutations $\pi$ of {1, 2, ..., n}.
-
Given a rank matrix A, our objective is to find a measure of inequality corresponding to that matrix.
With respect to the earlier definition of Complete Inequality, we try to propose a measure of inequality based on some notion of distance of the given rank matrix A from a Complete Inequality
-
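Let $S_n$ denote the set of all permutations of {1, 2, ..., n}.
Definitions:
$d : S_n \times S_n \to \mathbb{R}^{+} \cup \{0\}$ is a distance function if, for all $\pi, \sigma \in S_n$:
1. $d(\pi, \sigma) = 0$ iff $\pi = \sigma$, and
2. $d(\pi, \sigma) = d(\sigma, \pi)$.
d is called a metric if, in addition to conditions 1 and 2 above, it satisfies the triangle inequality, i.e.,
\[ \forall\, \pi, \sigma, \tau \in S_n, \quad d(\pi, \tau) \le d(\pi, \sigma) + d(\sigma, \tau) . \]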
-
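1. Spearman's Distance:
\[ d_1(\pi, \sigma) = \sum_{k=1}^{n} \big( \pi(k) - \sigma(k) \big)^2 . \]
2. Spearman's Footrule*:
\[ d_2(\pi, \sigma) = \sum_{k=1}^{n} \big| \pi(k) - \sigma(k) \big| . \]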
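The two Spearman distances can be written down directly, as in the minimal sketch below. The function names are illustrative choices, and permutations are assumed to be given as equal-length sequences of ranks.

```python
def spearman_distance(pi, sigma):
    """Sum of squared rank differences (a distance, but not a metric)."""
    return sum((a - b) ** 2 for a, b in zip(pi, sigma))


def spearman_footrule(pi, sigma):
    """Sum of absolute rank differences (a metric)."""
    return sum(abs(a - b) for a, b in zip(pi, sigma))


if __name__ == "__main__":
    pi, sigma, tau = (1, 2, 3), (1, 3, 2), (2, 3, 1)
    # Direct squared distance versus the route through sigma.
    print(spearman_distance(pi, tau),
          spearman_distance(pi, sigma) + spearman_distance(sigma, tau))
```

With $\pi$ = (1, 2, 3), $\sigma$ = (1, 3, 2) and $\tau$ = (2, 3, 1), the squared version gives $d(\pi, \tau) = 6$ but $d(\pi, \sigma) + d(\sigma, \tau) = 4$, so the triangle inequality fails for Spearman's Distance, while the Footrule satisfies it.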
-
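3. Kendall's Distance*:
\[ d_K(\pi, \sigma) = \sum_{1 \le k < l \le n} \mathbf{1}\{ (\pi(k) - \pi(l)) (\sigma(k) - \sigma(l)) < 0 \} . \]
4. Cayley's Distance*:
Cayley's Distance $d_C$ between two permutations $\pi$ and $\sigma$ is given by the minimum number of transpositions needed to reach $\sigma$ from $\pi$.
[The * denotes the distances that are metrics as well.]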
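Both distances on this slide are easy to compute, as sketched below. The function names are illustrative, and the Cayley computation relies on the standard fact that the minimum number of transpositions equals n minus the number of cycles of the permutation carrying one rank vector to the other.

```python
from itertools import combinations


def kendall_distance(pi, sigma):
    """Number of position pairs (k, l) that pi and sigma order in opposite ways."""
    return sum(
        1
        for k, l in combinations(range(len(pi)), 2)
        if (pi[k] - pi[l]) * (sigma[k] - sigma[l]) < 0
    )


def cayley_distance(pi, sigma):
    """Minimum number of transpositions taking pi to sigma: n minus the cycle count."""
    mapping = dict(zip(pi, sigma))  # value of pi -> value of sigma at the same position
    seen, cycles = set(), 0
    for start in mapping:
        if start in seen:
            continue
        cycles += 1
        x = start
        while x not in seen:
            seen.add(x)
            x = mapping[x]
    return len(pi) - cycles


if __name__ == "__main__":
    print(kendall_distance([1, 2, 3, 4], [3, 1, 4, 2]),
          cayley_distance([1, 2, 3, 4], [3, 1, 4, 2]))
```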
-
Given a Rank matrix $A_{m \times n} = [\pi_1, \pi_2, \ldots, \pi_m]$ and a distance d on $S_n$, we propose the following D-Measures:
(i) \[ D^1_d(A) = \min_{\pi \in S_n} \sum_{i=1}^{m} d(\pi_i, \pi) . \]
(ii) \[ D^2_d(A) = \sum_{1 \le i < j \le m} d(\pi_i, \pi_j) . \]
[NB: Both the D-Measures above attain the value 0 if and only if A is a Complete Inequality configuration.]
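For small n both D-Measures can be evaluated by brute force, as in the sketch below. The names `d_measure_1`, `d_measure_2` and `footrule` are illustrative, and the first measure is computed by scanning all n! permutations, which is only feasible for a handful of blocks.

```python
from itertools import combinations, permutations


def footrule(pi, sigma):
    """Spearman's Footrule, used here only as an example distance."""
    return sum(abs(a - b) for a, b in zip(pi, sigma))


def d_measure_1(A, d):
    """Minimum over all permutations pi of the total distance from the rows of A to pi."""
    n = len(A[0])
    return min(
        sum(d(row, pi) for row in A)
        for pi in permutations(range(1, n + 1))
    )


def d_measure_2(A, d):
    """Sum of the pairwise distances between the rows of A."""
    return sum(d(A[i], A[j]) for i, j in combinations(range(len(A)), 2))


if __name__ == "__main__":
    A = [[1, 5, 2, 3, 4], [3, 4, 1, 5, 2], [3, 4, 1, 5, 2], [5, 2, 1, 4, 3]]
    print(d_measure_1(A, footrule), d_measure_2(A, footrule))
```

The second measure needs only m(m-1)/2 distance evaluations, whereas the first needs a search over $S_n$.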
-
Theorem 1:
For any distance d on $S_n$ and any rank matrix $A_{m \times n}$:
(i) \[ D^2_d(A) \ge \frac{m}{2}\, D^1_d(A) . \]
(ii) If d is a metric, then
\[ D^2_d(A) \le (m-1)\, D^1_d(A) . \]
Thus, if d is a metric, then for a fixed matrix A, both the D-Measures are of the same order. But $D^2_d$ is far easier to compute for well-behaved d than $D^1_d$.
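One way to see why the two D-Measures are of the same order is sketched below. It is not taken from the slides; $\pi^\ast$ is a name introduced here for a permutation attaining the minimum in $D^1_d(A)$.

```latex
\begin{align*}
% Lower bound: each row \pi_j is itself a candidate in the minimum defining D^1_d(A).
\sum_{i=1}^{m} d(\pi_i, \pi_j) &\ge D^1_d(A) \qquad (1 \le j \le m), \\
% Summing over j counts every unordered pair twice, by symmetry of d.
2\, D^2_d(A) = \sum_{j=1}^{m} \sum_{i=1}^{m} d(\pi_i, \pi_j) &\ge m\, D^1_d(A). \\
% Upper bound (d a metric): route each pair through the minimiser \pi^\ast.
d(\pi_i, \pi_j) &\le d(\pi_i, \pi^\ast) + d(\pi^\ast, \pi_j), \\
% Each row appears in exactly m - 1 of the pairs i < j.
D^2_d(A) \le \sum_{1 \le i < j \le m} \big[ d(\pi_i, \pi^\ast) + d(\pi_j, \pi^\ast) \big]
  &= (m-1) \sum_{i=1}^{m} d(\pi_i, \pi^\ast) = (m-1)\, D^1_d(A).
\end{align*}
```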
-
We want to choose a suitable distance function on $S_n$ which will be sensitive to inequality in the rank matrix A.
Spearman's Distance and Spearman's Footrule are both restrictions of distance functions on $\mathbb{R}^n$ to $S_n$, and hence are not reflective of the special structure of $S_n$ and A.
Cayley's Distance gives equal
-
Call $\tau_{(i)}$ the (i)-value adjacent transposition if, for all $\pi$ in $S_n$, $\tau_{(i)}$ acting upon $\pi$ swaps the values i and (i+1).
Call $\tau$ a value adjacent transposition if $\tau = \tau_{(i)}$ for some i.
Example:
Place Adjacent Transposition: 3 1 4 2
Value Adjacent Transposition: 3 1 4 2
-
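We choose a suitable metric $d^*$ on $S_n$ as follows:
$d^*(\pi, \sigma)$ is defined as the minimum number of value adjacent transpositions needed to reach $\sigma$ starting from $\pi$.
Consider the problem of calculating $d^*(\pi, \sigma)$ where $\pi$ = (1, 2, 3, 4) and $\sigma$ = (3, 1, 4, 2).
Step 1: 1 2 3 4 → 2 1 3 4 (swap the values 1 and 2)
Step 2: 2 1 3 4 → 2 1 4 3 (swap the values 3 and 4)
Step 3: 2 1 4 3 → 3 1 4 2 (swap the values 2 and 3)
So, $d^*(\pi, \sigma)$ = 3.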
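The definition of $d^*$ can be checked mechanically with a breadth-first search over value adjacent transpositions. The sketch below is a direct, unoptimised reading of the definition; the function names are illustrative and the search is practical only for small n.

```python
from collections import deque


def value_adjacent_neighbours(pi):
    """All permutations reachable from pi by swapping the values i and i + 1."""
    n = len(pi)
    for i in range(1, n):
        yield tuple(i + 1 if v == i else i if v == i + 1 else v for v in pi)


def d_star(pi, sigma):
    """Minimum number of value adjacent transpositions from pi to sigma (BFS)."""
    start, goal = tuple(pi), tuple(sigma)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return dist[current]
        for nxt in value_adjacent_neighbours(current):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    raise ValueError("sigma is not a permutation of the same values as pi")


if __name__ == "__main__":
    print(d_star([1, 2, 3, 4], [3, 1, 4, 2]))  # the example on this slide
```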
-
We propose the following D-Measure: $D^* = D^2_{d^*}$,
i.e., for a rank matrix $A_{m \times n}$,
\[ D^*(A) = D^2_{d^*}(A) = \sum_{1 \le i < j \le m} d^*(\pi_i, \pi_j) . \]
-
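Proposition 1:
Let $\pi, \pi' \in S_n$. Then
\[ d^*(\pi, \pi') = d_K(\pi, \pi') . \]
Proposition 2:
Let $\pi, \sigma \in S_n$ and $\tau \in S_n$. Then
\[ d^*(\pi, \sigma) = d^*(\pi\tau, \sigma\tau) . \]
Proposition 3:
\[ D^*(A) = \sum_{1 \le i < j \le m} \; \sum_{1 \le k < l \le n} \mathbf{1}\{ (\pi_i(k) - \pi_i(l)) (\pi_j(k) - \pi_j(l)) < 0 \} . \]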
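Written as a count of discordant pairs, $D^*$ needs no search over permutations at all. The sketch below counts, for every pair of rows and every pair of blocks, whether the two rows order the two blocks in opposite ways; the function name `d_star_measure` is an illustrative choice.

```python
from itertools import combinations


def d_star_measure(A):
    """Total number of (row pair, block pair) combinations ranked in opposite order."""
    n = len(A[0])
    total = 0
    for row_i, row_j in combinations(A, 2):
        for k, l in combinations(range(n), 2):
            if (row_i[k] - row_i[l]) * (row_j[k] - row_j[l]) < 0:
                total += 1
    return total


if __name__ == "__main__":
    identical_rows = [[3, 4, 1, 5, 2]] * 4  # a Complete Inequality configuration
    opposed_rows = [[1, 2, 3, 4], [1, 2, 3, 4], [4, 3, 2, 1], [4, 3, 2, 1]]
    print(d_star_measure(identical_rows), d_star_measure(opposed_rows))
```

The cost is of order $m^2 n^2$ comparisons.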
-
Theorem 2: Let $A_{m \times n}$ be a rank matrix. Then,
1. $D^*(A) = 0$ iff A is a Complete Inequality configuration, and
2. $D^*$ is invariant under row and column permutations.
Theorem 3: Let $A_{m \times n}$ be a rank matrix. Let $B_i$ dominate $B_j$ w.r.t. A. Let r be fixed, $1 \le r \le m$. Let $A^*$ be the matrix obtained from A by swapping $\pi_r(i)$ and $\pi_r(j)$. Then,
\[ D^*(A) \le D^*(A^*) . \]
-
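A natural upper bound for $D^*$ is $\binom{m}{2} \binom{n}{2}$.
Theorem 4: Let $A_{m \times n}$ be a rank matrix. Then,
\[ D^*(A) \le \begin{cases} \dfrac{m^2 - 1}{4} \dbinom{n}{2}, & \text{if m is odd;} \\[2ex] \dfrac{m^2}{4} \dbinom{n}{2}, & \text{if m is even.} \end{cases} \]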
-
Attainment of the improved upper bound: a construction
$A_1$ =
1 2 3 4
1 2 3 4
4 3 2 1
4 3 2 1
$A_2$ =
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
5 4 3 2 1
5 4 3 2 1
For both the rank matrices above, the aforesaid upper bound is attained.
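The construction can be checked numerically for other sizes as well. In the sketch below, `extremal_matrix` generalises the two matrices on this slide (about half the rows in increasing order, the rest reversed); that generalisation, and all function names, are assumptions made for illustration. For odd m, $(m^2 - 1)/4$ equals the integer part of $m^2/4$, so one integer formula covers both cases of the bound.

```python
from itertools import combinations
from math import comb


def d_star_measure(A):
    """Count of (row pair, block pair) combinations ranked in opposite order."""
    n = len(A[0])
    return sum(
        1
        for row_i, row_j in combinations(A, 2)
        for k, l in combinations(range(n), 2)
        if (row_i[k] - row_i[l]) * (row_j[k] - row_j[l]) < 0
    )


def extremal_matrix(m, n):
    """ceil(m/2) rows equal to 1..n followed by floor(m/2) rows equal to n..1."""
    increasing = list(range(1, n + 1))
    decreasing = increasing[::-1]
    half = (m + 1) // 2
    return [increasing] * half + [decreasing] * (m - half)


def improved_upper_bound(m, n):
    """floor(m^2 / 4) * C(n, 2), covering both the odd and the even case."""
    return (m * m // 4) * comb(n, 2)


if __name__ == "__main__":
    for m, n in [(4, 4), (5, 5)]:
        print(m, n, d_star_measure(extremal_matrix(m, n)), improved_upper_bound(m, n))
```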
-
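Definition:
For a rank matrix $A_{m \times n}$ we define its Inequality Coefficient, I, as follows:
\[ I = 1 - \frac{1}{\binom{m}{2} \binom{n}{2}} \sum_{1 \le i < j \le m} \; \sum_{1 \le k < l \le n} \mathbf{1}\{ (\pi_i(k) - \pi_i(l)) (\pi_j(k) - \pi_j(l)) < 0 \} . \]
Proposition 4:
Let $A_{m \times n}$ be a rank matrix, and $D^*$ and I be as defined before. Then,
\[ I = 1 - \frac{D^*}{\binom{m}{2} \binom{n}{2}} . \]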
-
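Theorem 5:
For a rank matrix $A_{m \times n}$ with Inequality Coefficient I, we have
(i) $I \le 1$, with equality iff A is a Complete Inequality configuration.
(ii)
\[ I \ge \begin{cases} \dfrac{1}{2} - \dfrac{1}{2m}, & \text{if m is odd;} \\[2ex] \dfrac{1}{2} - \dfrac{1}{2(m-1)}, & \text{if m is even.} \end{cases} \]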
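The lower bound in (ii) follows from combining the relation $I = 1 - D^* / \big(\binom{m}{2}\binom{n}{2}\big)$ with the improved upper bound on $D^*$. The short derivation below is a sketch added for illustration and is not taken from the slides.

```latex
\begin{align*}
% m odd: D^* \le \tfrac{m^2-1}{4}\binom{n}{2} and \binom{m}{2} = \tfrac{m(m-1)}{2}
I &\ge 1 - \frac{(m^2 - 1)/4}{m(m-1)/2}
   = 1 - \frac{m+1}{2m}
   = \frac{1}{2} - \frac{1}{2m}, \\
% m even: D^* \le \tfrac{m^2}{4}\binom{n}{2}
I &\ge 1 - \frac{m^2/4}{m(m-1)/2}
   = 1 - \frac{m}{2(m-1)}
   = \frac{1}{2} - \frac{1}{2(m-1)}. \\
% Example with m = n = 4 and D^* = 24 (two rows 1 2 3 4, two rows 4 3 2 1):
I &= 1 - \frac{24}{6 \cdot 6} = \frac{1}{3} = \frac{1}{2} - \frac{1}{2 \cdot 3}.
\end{align*}
```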
-
Theorem 5 (contd.):
For a rank matrix $A_{m \times n}$ with Inequality Coefficient I, we have
(iii) I is invariant under row and column permutations.
(iv) Let $B_i$ dominate $B_j$ in A. Let $A_1$ be the rank matrix obtained from A by swapping the ranks of $B_i$ and $B_j$ in any one category $C_k$. Let the inequality coefficient for $A_1$ be $I_1$. Then, $I_1 \le I$.
-
Theorem 6:
For a rank matrix $A_{m \times n}$ with Inequality Coefficient I, if $\pi_1, \pi_2, \ldots, \pi_m$ are i.i.d. random permutations of {1, 2, ..., n}, then,
\[ E(I) = \frac{1}{2} . \]
-
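Theorem 7:
For a rank matrix $A_{m \times n}$ with Inequality Coefficient I, if $\pi_1, \pi_2, \ldots, \pi_m$ are i.i.d. random permutations of {1, 2, ..., n}, then,
\[ \operatorname{var}(I) = \frac{2n + 5}{36 \binom{m}{2} \binom{n}{2}} . \]
Corollary:
Let n be fixed. Then as m goes to infinity, we have
\[ I \xrightarrow{P} \frac{1}{2} . \]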
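For very small m and n, the null mean and variance of I can be checked by enumerating every rank matrix with exact rational arithmetic. The sketch below does this by brute force; the function names are illustrative, and the enumeration visits $(n!)^m$ matrices, so it is meant only as a sanity check.

```python
from fractions import Fraction
from itertools import combinations, permutations, product
from math import comb


def inequality_coefficient(A):
    """I = 1 - (number of discordant comparisons) / (C(m,2) * C(n,2)), as an exact fraction."""
    m, n = len(A), len(A[0])
    discordant = sum(
        1
        for row_i, row_j in combinations(A, 2)
        for k, l in combinations(range(n), 2)
        if (row_i[k] - row_i[l]) * (row_j[k] - row_j[l]) < 0
    )
    return 1 - Fraction(discordant, comb(m, 2) * comb(n, 2))


def exact_null_moments(m, n):
    """Mean and variance of I when the m rows are i.i.d. uniform permutations."""
    perms = list(permutations(range(1, n + 1)))
    values = [inequality_coefficient(A) for A in product(perms, repeat=m)]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


if __name__ == "__main__":
    print(exact_null_moments(3, 3))
```

For m = n = 3 the enumeration returns mean 1/2 and variance 11/324.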
-
The distribution of I under $H_0$ is slightly right-tailed. As m or n increases, it rapidly becomes concentrated around 0.5.
The graph of the simulated distribution of I for m=5 and n=29 is provided below.
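A simulation of this kind could be set up roughly as follows. The sketch draws each row as an independent uniform random permutation and computes I for every simulated matrix. The function names, the seed handling, and the use of an upper-tail proportion as the P-value are assumptions made for illustration; the slides do not describe how the P-values were obtained.

```python
import random
from itertools import combinations
from math import comb


def inequality_coefficient(A):
    """I = 1 - (number of discordant comparisons) / (C(m,2) * C(n,2))."""
    m, n = len(A), len(A[0])
    discordant = sum(
        1
        for row_i, row_j in combinations(A, 2)
        for k, l in combinations(range(n), 2)
        if (row_i[k] - row_i[l]) * (row_j[k] - row_j[l]) < 0
    )
    return 1 - discordant / (comb(m, 2) * comb(n, 2))


def simulate_null(m, n, reps, seed=0):
    """Simulated values of I when the rows are independent uniform permutations."""
    rng = random.Random(seed)
    base = list(range(1, n + 1))
    return [
        inequality_coefficient([rng.sample(base, n) for _ in range(m)])
        for _ in range(reps)
    ]


def upper_tail_p_value(observed, sims):
    """Share of simulated values at least as large as the observed I (assumed test direction)."""
    return sum(s >= observed for s in sims) / len(sims)


if __name__ == "__main__":
    sims = simulate_null(m=5, n=29, reps=2000)
    print(sum(sims) / len(sims), upper_tail_p_value(0.605, sims))
```

The simulated mean should sit close to 0.5; the estimated tail proportion depends on the seed and the number of replications.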
-
Collection of Data
We use the results of the 59th and 61st rounds of the NSSO household survey.
The 59th round survey was carried out in 2003 and the 61st in 2004-05.
We have excluded the Union Territories from our study.
-
Inequality among States in India
Variable Selection
The values of the following variables or attributes are used as categories to rank the states:
1. MPCE (Monthly Per-capita Consumption Expenditure),
2. Level of Education,
3. Employment,
4. Primary Source of Lighting, and
5. Area of Land Possessed.
-
Inequality among States in India
Results: 59th Round
Data were not available for all the states.
We used the data for the 17 states for which data on all categories were available.
For this round we have a 5 × 17 matrix.
-
Inequality among States in India
Results: 59th Round (contd.)
I = 0.591
P-value = 0.003
-
Inequality among States in India
Results: 61st Round
Data were available for all the states.
We used the data for 28 states and Delhi.
For this round we have a 5 × 29 matrix.
-
Inequality among States in India
Results: 61st Round (contd.)
I = 0.605
P-value = 0.00001
-
Inequality among States in India
Results: 61st Round, TRUNCATED
Data were available for all the states in the 61st round, but only for 17 states in the 59th.
To compare the two, we truncate the 61st round data so as to include only those states that were included in the 59th Round study.
For this round also, we have a 5 × 17 matrix.
-
Inequality among States in India
Results: 61st Round, TRUNCATED (contd.)
I = 0.575
P-value = 0.010
-
Comparison with other Statistics
Other statistics used for comparison are:
(1) Friedman Statistic:
\[ D''(A) = \sum_{j} \Big( \sum_{i} \Big( \pi_i(j) - \frac{n+1}{2} \Big) \Big)^2 . \]
(2) Statistic used by Sarkar et al.:
\[ D' = D^1_{d_C} . \]
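The Friedman-type statistic can be computed from the column totals of the rank matrix. The sketch below assumes the unnormalised form: the sum over blocks of the squared deviation of each block's rank total from m(n+1)/2, its value when the ranks are perfectly balanced. The function name is an illustrative choice.

```python
def friedman_type_statistic(A):
    """Sum over blocks of (rank total of the block - m(n+1)/2)^2 (assumed unnormalised form)."""
    m, n = len(A), len(A[0])
    centre = m * (n + 1) / 2
    return sum(
        (sum(row[j] for row in A) - centre) ** 2
        for j in range(n)
    )


if __name__ == "__main__":
    balanced = [[1, 2, 3, 4], [1, 2, 3, 4], [4, 3, 2, 1], [4, 3, 2, 1]]
    identical = [[1, 2, 3], [1, 2, 3]]
    print(friedman_type_statistic(balanced), friedman_type_statistic(identical))
```

Because it depends on the rank matrix only through the column totals, any two matrices with equal column totals receive the same value.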
-
Comparison with D'
$A_1$ =
1 2 3 4
4 2 3 1
$A_2$ =
1 2 3 4
2 1 4 3
$D'(A_1) = 1$, $D'(A_2) = 2$
$D^*(A_1) = 5$, $D^*(A_2) = 2$
$A_1$ has more inequality w.r.t. D'.
But w.r.t. D*, more inequality is present in $A_2$.
-
Comparison with D''
$A_1$ =
1 2 3 4
1 2 3 4
4 3 2 1
4 3 2 1
$A_2$ =
1 4 3 2
2 1 4 3
3 2 1 4
4 3 2 1
$D''(A_1) = D''(A_2) = 0$
So D'' cannot distinguish between $A_1$ and $A_2$.
$D^*(A_1) = 24$, $D^*(A_2) = 20$
$A_2$ has more inequality w.r.t. D*.
-
In real-life scenarios we often end up with situations where ties exist between the ranks of blocks, or where the data are incomplete.
In such cases, we give a natural extension to our measure by using the formula
\[ I = 1 - \frac{1}{\binom{m}{2} \binom{n}{2}} \sum_{1 \le i < j \le m} \; \sum_{1 \le k < l \le n} \mathbf{1}\{ (\pi_i(k) - \pi_i(l)) (\pi_j(k) - \pi_j(l)) < 0 \} . \]
In case of ties we replace the indicator function by 0.5.
In case of incomplete data, we ignore those cases and scale by the number of meaningful observation pairs.
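One possible reading of this extension is sketched below. The details are assumptions made for illustration: a missing rank is marked by `None`, a tie is two equal ranks within a category, a "case" is one comparison of two blocks across two categories, and the scaling divides by the number of comparisons in which all four ranks are present. The function name is hypothetical.

```python
from itertools import combinations


def extended_inequality_coefficient(A):
    """Inequality coefficient allowing ties (score 0.5) and missing ranks (None, skipped)."""
    n = len(A[0])
    score = 0.0
    usable = 0
    for row_i, row_j in combinations(A, 2):
        for k, l in combinations(range(n), 2):
            cells = (row_i[k], row_i[l], row_j[k], row_j[l])
            if any(c is None for c in cells):
                continue  # incomplete data: ignore this comparison
            usable += 1
            prod = (row_i[k] - row_i[l]) * (row_j[k] - row_j[l])
            if prod < 0:
                score += 1.0  # the two categories disagree on this pair of blocks
            elif prod == 0:
                score += 0.5  # a tie in at least one of the two categories
    if usable == 0:
        raise ValueError("no usable comparisons")
    return 1.0 - score / usable


if __name__ == "__main__":
    print(extended_inequality_coefficient([[1, 1, 3], [1, 2, 3]]))
    print(extended_inequality_coefficient([[1, None, 2], [2, 1, 3]]))
```

With complete, tie-free data the number of usable comparisons is $\binom{m}{2}\binom{n}{2}$, so the sketch reduces to the original coefficient.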
-
Large sample distribution of I.
Further investigation of the combinatorial properties of D*.
Effect of an outlier block.
Effect of different clusters in categories.
Exploring the case when all the categories are not of equal importance.