A Nonparametric Measure of Inequality

Upload: tanmoy-talukdar

Post on 08-Apr-2018


  • 8/7/2019 A Nonparametric Measure of Inequality

    1/44

    by

    Tanmoy Talukdar, Avik Sarkar, Ritam Bhaumik and Riddhipratim Basu

    The problem of measuring inequality has been a very important one in economic statistics.

    When the data is univariate, we know of various parametric inequality measures, which we generalize for multivariate applications.

    However, it seems that distribution-

    On why we prefer a nonparametric inequality measure

    Nonparametric measures, being distribution-free, do not depend on (not necessarily valid) distributional assumptions, as parametric measures do.

    They are, moreover, robust to small errors in measurement which might be present in the data.

    The problem that we had as our motivation

    Suppose we have data available on all states of India on more than one variable/attribute, each an indicator of the socio-economic growth of the state, and we are interested in finding whether there is a significant amount of inequality present among the states in this respect.

    As we want to construct a nonparametric measure, it is most natural to look at the

    Let us suppose we have n > 1 blocks, denoted by B_1, B_2, ..., B_n.

    Let us assume we also have m > 1 categories.

    We assume that all the categories are equally important.

    For each of the m categories we have a ranking of the n blocks.

    The nature of the data may tempt us to use

    A necessary assumption at this stage is that there are no ties.

    Under this assumption, each rank vector is a permutation of {1, 2, ..., n}.
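As a rough illustration of how one category's raw values could be turned into a rank vector, here is a minimal sketch. The function name `rank_vector`, the sample scores, and the choice that rank 1 goes to the smallest value are all assumptions made for the example; the slides do not specify a ranking direction.

```python
def rank_vector(scores):
    """Return ranks 1..n for one category (1 = smallest score); reject ties."""
    if len(set(scores)) != len(scores):
        raise ValueError("ties present: the no-ties assumption is violated")
    # Block indices ordered from smallest to largest score.
    order = sorted(range(len(scores)), key=lambda b: scores[b])
    ranks = [0] * len(scores)
    for rank, block in enumerate(order, start=1):
        ranks[block] = rank
    return ranks


# Hypothetical scores for n = 5 blocks in a single category.
ranks = rank_vector([12.5, 40.1, 18.0, 22.3, 31.7])
```

Stacking one such vector per category, row by row, gives a rank matrix of the kind used in the rest of the talk.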

    Denote the rank vectors by $\pi_1, \pi_2, \ldots, \pi_m$. The matrix $A_{m \times n} = [\pi_1, \pi_2, \ldots, \pi_m]$, whose rows are the rank vectors, is the Rank matrix.

    Example:

    $$\begin{pmatrix} 1 & 5 & 2 & 3 & 4 \\ 3 & 4 & 1 & 5 & 2 \\ 3 & 4 & 1 & 5 & 2 \\ 5 & 2 & 1 & 4 & 3 \end{pmatrix}$$

    The above is a typical example of a rank matrix with four categories and five blocks.

    Definition: A Rank matrix $A_{m \times n} = [\pi_1, \pi_2, \ldots, \pi_m]$ is said to be a Complete Inequality configuration if all the rows of A are identical, i.e., $\pi_1 = \pi_2 = \cdots = \pi_m$.

    (NOTE: A Complete Inequality configuration is not unique.)

    Here is a typical example:

    $$\begin{pmatrix} 3 & 4 & 1 & 5 & 2 \\ 3 & 4 & 1 & 5 & 2 \\ 3 & 4 & 1 & 5 & 2 \\ 3 & 4 & 1 & 5 & 2 \end{pmatrix}$$
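The definition translates directly into a row-equality check. The helper name `is_complete_inequality` is hypothetical and the sketch assumes the rank matrix is stored as a list of rows, one row per category.

```python
def is_complete_inequality(rank_matrix):
    """True when every row (category) ranks the blocks identically."""
    first = list(rank_matrix[0])
    return all(list(row) == first for row in rank_matrix)


# The Complete Inequality example from the slide: four identical rows.
complete = [[3, 4, 1, 5, 2] for _ in range(4)]
# Two rows of the earlier rank-matrix example, which differ.
mixed = [[1, 5, 2, 3, 4], [3, 4, 1, 5, 2]]
```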

    Null hypothesis:

    $H_0 : P(\pi_i = \pi) = 1/n!$ for all permutations $\pi$ of {1, 2, ..., n}.

    Given a rank matrix A, our objective is to find a measure of inequality corresponding to that matrix.

    With respect to the earlier definition of Complete Inequality, we try to propose a measure of inequality based on some notion of distance of the given rank matrix A from a Complete Inequality

    Let $S_n$ denote the set of all permutations of {1, 2, ..., n}.

    Definitions:

    $d : S_n \times S_n \to \mathbb{R}^{+} \cup \{0\}$ is a distance function if, for all $\pi, \sigma \in S_n$:

    1. $d(\pi, \sigma) = 0$ iff $\pi = \sigma$, and

    2. $d(\pi, \sigma) = d(\sigma, \pi)$.

    d is called a metric if, in addition to conditions 1 and 2 above, it satisfies the triangle inequality, i.e., for all $\pi, \sigma, \rho \in S_n$, $d(\pi, \rho) \le d(\pi, \sigma) + d(\sigma, \rho)$.

    1. Spearman's Distance: $d_1(\pi, \sigma) = \sum_{k=1}^{n} (\pi(k) - \sigma(k))^2$.

    2. Spearman's Footrule*: $d_2(\pi, \sigma) = \sum_{k=1}^{n} |\pi(k) - \sigma(k)|$.
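The two Spearman distances, and the reason only the footrule carries the metric star, can be checked by brute force on small n. This is a sketch with hypothetical function names; permutations are represented as tuples of ranks.

```python
from itertools import permutations


def spearman_distance(p, s):
    """Sum of squared rank differences."""
    return sum((a - b) ** 2 for a, b in zip(p, s))


def spearman_footrule(p, s):
    """Sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(p, s))


def satisfies_triangle_inequality(d, n):
    """Exhaustively test d(p, r) <= d(p, s) + d(s, r) over all of S_n."""
    perms = list(permutations(range(1, n + 1)))
    return all(d(p, r) <= d(p, s) + d(s, r)
               for p in perms for s in perms for r in perms)
```

For instance, with (1,2,3), (1,3,2) and (2,3,1) the squared distance gives 6 for the outer pair but only 2 + 2 through the middle permutation, so it is a distance but not a metric.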

    3. Kendall's Distance*: $d_K(\pi, \sigma) = \sum_{1 \le k < l \le n} \mathbf{1}\{(\pi(k) - \pi(l))(\sigma(k) - \sigma(l)) < 0\}$.

    4. Cayley's Distance*: Cayley's Distance $d_C$ between two permutations $\pi$ and $\sigma$ is given by the minimum number of transpositions needed to reach $\sigma$ from $\pi$.

    [The * denotes the distances that are metrics as well.]
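Both distances are easy to compute directly. The sketch below counts discordant position pairs for Kendall's distance and, for Cayley's distance, uses the standard fact that the minimum number of transpositions equals n minus the number of cycles of the permutation carrying one ranking to the other. Function names are hypothetical.

```python
from itertools import combinations


def kendall_distance(p, s):
    """Number of position pairs (k, l) ordered oppositely by p and s."""
    return sum(1 for k, l in combinations(range(len(p)), 2)
               if (p[k] - p[l]) * (s[k] - s[l]) < 0)


def cayley_distance(p, s):
    """n minus the number of cycles of the map sending p(k) to s(k)."""
    target = {a: b for a, b in zip(p, s)}
    seen, cycles = set(), 0
    for start in target:
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:  # walk one full cycle
                seen.add(x)
                x = target[x]
    return len(p) - cycles
```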

    Given a Rank matrix $A_{m \times n} = [\pi_1, \pi_2, \ldots, \pi_m]$ and a distance d on $S_n$, we propose the following D-Measures:

    (i) $D_1^d(A) = \min_{\pi \in S_n} \sum_{1 \le i \le m} d(\pi_i, \pi)$.

    (ii) $D_2^d(A) = \sum_{1 \le i < j \le m} d(\pi_i, \pi_j)$.

    [NB: Both the D-Measures above attain the value 0 if and only if A is a Complete Inequality configuration.]
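A direct, unoptimized sketch of the two D-Measures follows. The first searches all n! candidate centres, so it is only usable for small n; the second is a plain sum over row pairs. The function names, the footrule used as the example distance, and the small matrix `A` are assumptions for illustration.

```python
from itertools import combinations, permutations


def footrule(p, s):
    return sum(abs(a - b) for a, b in zip(p, s))


def d1_measure(rank_matrix, d):
    """Minimum over all centre permutations of the total distance to the rows."""
    n = len(rank_matrix[0])
    return min(sum(d(row, centre) for row in rank_matrix)
               for centre in permutations(range(1, n + 1)))


def d2_measure(rank_matrix, d):
    """Sum of distances over all unordered pairs of rows."""
    return sum(d(p, s) for p, s in combinations(rank_matrix, 2))


A = [(1, 2, 3, 4), (4, 3, 2, 1), (2, 1, 4, 3)]  # illustrative 3 x 4 rank matrix
d1 = d1_measure(A, footrule)
d2 = d2_measure(A, footrule)
```

The cost difference is visible here: `d2_measure` needs m(m-1)/2 distance evaluations, while the brute-force `d1_measure` needs m times n!.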

    Theorem 1: For any distance d on $S_n$ and any rank matrix $A_{m \times n}$:

    (i) $D_2^d(A) \ge \frac{m}{2} D_1^d(A)$.

    (ii) If d is a metric, then $D_2^d(A) \le (m - 1) D_1^d(A)$.

    Thus, if d is a metric, then for a fixed matrix A, both the D-Measures are of the same order. But $D_2^d$ is far easier to compute for well-behaved d than $D_1^d$.
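The slides state Theorem 1 without proof. One possible argument, offered only as a sketch and not necessarily the authors' own, uses symmetry for (i) and the triangle inequality for (ii); here $\pi^\ast$ denotes a permutation attaining the minimum in $D_1^d(A)$.

```latex
% (i) Only symmetry and d(\pi,\pi)=0 are used.
2\,D_2^d(A) = \sum_{i=1}^{m}\sum_{j=1}^{m} d(\pi_j,\pi_i)
  \;\ge\; \sum_{i=1}^{m}\,\min_{\pi\in S_n}\sum_{j=1}^{m} d(\pi_j,\pi)
  \;=\; m\,D_1^d(A).

% (ii) Triangle inequality through the minimiser \pi^\ast;
%      each row index occurs in exactly m-1 of the pairs i<j.
D_2^d(A) = \sum_{1\le i<j\le m} d(\pi_i,\pi_j)
  \;\le\; \sum_{1\le i<j\le m}\bigl[d(\pi_i,\pi^\ast)+d(\pi^\ast,\pi_j)\bigr]
  \;=\; (m-1)\sum_{i=1}^{m} d(\pi_i,\pi^\ast)
  \;=\; (m-1)\,D_1^d(A).
```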

    We want to choose a suitable distance function on $S_n$ which will be sensitive to inequality in the rank matrix A.

    Spearman's Distance and Spearman's Footrule are both restrictions of distance functions on $\mathbb{R}^n$ to $S_n$, and hence are not reflective of the special structure of $S_n$ and A.

    Cayley's Distance gives equal

    Call $\tau_{(i)}$ the (i)-value adjacent transposition if, for all $\pi$ in $S_n$, $\tau_{(i)}$ acting upon $\pi$ swaps the values i and (i+1).

    Call $\tau$ a value adjacent transposition if $\tau = \tau_{(i)}$ for some i.

    Example: [Figure: the permutation 3 1 4 2 shown twice, once under a Place Adjacent Transposition and once under a Value Adjacent Transposition.]

    We choose a suitable metric $d^*$ on $S_n$ as follows: $d^*(\pi, \sigma)$ is defined as the minimum number of value adjacent transpositions needed to reach $\sigma$ starting from $\pi$.

    Consider the problem of calculating $d^*(\pi, \sigma)$ where $\pi = (1, 2, 3, 4)$ and $\sigma = (3, 1, 4, 2)$.

    Step 1: 1 2 3 4 → 2 1 3 4 (swap the values 1 and 2)

    Step 2: 2 1 3 4 → 2 1 4 3 (swap the values 3 and 4)

    Step 3: 2 1 4 3 → 3 1 4 2 (swap the values 2 and 3)

    So, $d^*(\pi, \sigma) = 3$.
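The definition of d* can be checked literally with a breadth-first search over value adjacent transpositions. This is a sketch for small n only, with hypothetical function names; it also includes a discordant-pair count so the two quantities can be compared on examples.

```python
from collections import deque
from itertools import combinations


def value_adjacent_neighbours(p):
    """All permutations obtained from p by swapping the values i and i+1."""
    for i in range(1, len(p)):
        swap = {i: i + 1, i + 1: i}
        yield tuple(swap.get(v, v) for v in p)


def d_star(p, s):
    """Minimum number of value adjacent transpositions from p to s (BFS)."""
    p, s = tuple(p), tuple(s)
    dist = {p: 0}
    queue = deque([p])
    while queue:
        cur = queue.popleft()
        if cur == s:
            return dist[cur]
        for nxt in value_adjacent_neighbours(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    raise ValueError("inputs must be permutations of the same set 1..n")


def kendall_distance(p, s):
    """Number of position pairs ordered oppositely by p and s."""
    return sum(1 for k, l in combinations(range(len(p)), 2)
               if (p[k] - p[l]) * (s[k] - s[l]) < 0)
```

On the worked example, `d_star((1, 2, 3, 4), (3, 1, 4, 2))` returns 3, matching the three steps above.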

    We propose the following D-Measure: $D^* = D_2^{d^*}$, i.e., for a rank matrix $A_{m \times n}$,

    $$D^*(A) = D_2^{d^*}(A) = \sum_{1 \le i < j \le m} d^*(\pi_i, \pi_j).$$

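    Proposition 1: Let $\pi, \pi' \in S_n$. Then $d^*(\pi, \pi') = d_K(\pi, \pi')$.

    Proposition 2: Let $\pi, \sigma \in S_n$. Then $d^*(\pi\rho, \sigma\rho) = d^*(\pi, \sigma)$ for all $\rho \in S_n$, where $(\pi\rho)(k) = \pi(\rho(k))$.

    Proposition 3: $D^*(A) = \sum_{1 \le i < j \le m} \sum_{1 \le k < l \le n} \mathbf{1}\{(\pi_i(k) - \pi_i(l))(\pi_j(k) - \pi_j(l)) < 0\}$.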
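Proposition 3 gives a formula that can be evaluated without any search: count, over every pair of rows and every pair of columns, the comparisons on which the two rows disagree. A minimal sketch, with the hypothetical name `d_star_measure`:

```python
from itertools import combinations


def d_star_measure(rank_matrix):
    """D*(A): number of (row pair, column pair) combinations that are discordant."""
    n = len(rank_matrix[0])
    total = 0
    for p, s in combinations(rank_matrix, 2):
        for k, l in combinations(range(n), 2):
            if (p[k] - p[l]) * (s[k] - s[l]) < 0:
                total += 1
    return total


def permute_columns(rank_matrix, order):
    """Relabel the blocks: column order[c] of the input becomes column c."""
    return [[row[c] for c in order] for row in rank_matrix]
```

The cost is of order m²n², and because the count only depends on how pairs of columns compare, relabeling the blocks with `permute_columns` leaves the value unchanged.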

    Theorem 2: Let $A_{m \times n}$ be a rank matrix. Then,

    1. $D^*(A) = 0$ iff A is a Complete Inequality configuration, and

    2. $D^*$ is invariant under row and column permutations.

    Theorem 3: Let $A_{m \times n}$ be a rank matrix. Let $B_i$ dominate $B_j$ w.r.t. A. Let r be fixed, $1 \le r \le m$. Let $A^*$ be the matrix obtained from A by swapping $\pi_r(i)$ and $\pi_r(j)$. Then, $D^*(A) \le D^*(A^*)$.

    A natural upper bound for $D^*$ is $\binom{m}{2}\binom{n}{2}$.

    Theorem 4: Let $A_{m \times n}$ be a rank matrix. Then,

    $$D^*(A) \le \begin{cases} \dfrac{m^2 - 1}{4}\dbinom{n}{2}, & \text{if } m \text{ is odd;} \\[1ex] \dfrac{m^2}{4}\dbinom{n}{2}, & \text{if } m \text{ is even.} \end{cases}$$

    Attainment of the improved upper bound: a construction

    $$A_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \\ 4 & 3 & 2 & 1 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 3 & 2 & 1 \\ 5 & 4 & 3 & 2 & 1 \end{pmatrix}$$

    For both the rank matrices above, the aforesaid upper bound is attained.
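The attainment claim can be verified numerically for the two matrices shown. The sketch below evaluates D* through the discordant-pair count and compares it with the Theorem 4 bound; `d_star_measure` and `theorem4_bound` are hypothetical names.

```python
from itertools import combinations
from math import comb


def d_star_measure(rank_matrix):
    """D*(A) as the number of discordant (row pair, column pair) combinations."""
    n = len(rank_matrix[0])
    return sum(1
               for p, s in combinations(rank_matrix, 2)
               for k, l in combinations(range(n), 2)
               if (p[k] - p[l]) * (s[k] - s[l]) < 0)


def theorem4_bound(m, n):
    """(m^2 - 1)/4 * C(n, 2) for odd m, m^2/4 * C(n, 2) for even m."""
    factor = (m * m - 1) // 4 if m % 2 else (m * m) // 4
    return factor * comb(n, 2)


A1 = [[1, 2, 3, 4]] * 2 + [[4, 3, 2, 1]] * 2
A2 = [[1, 2, 3, 4, 5]] * 3 + [[5, 4, 3, 2, 1]] * 2
```

In both cases roughly half the rows carry one ranking and the rest carry its reverse, so every mixed pair of rows disagrees on every pair of blocks.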

    Definition: For a rank matrix $A_{m \times n}$ we define its Inequality Coefficient, I, as follows:

    $$I = \frac{1}{\binom{m}{2}\binom{n}{2}} \sum_{1 \le i < j \le m} \sum_{1 \le k < l \le n} \mathbf{1}\{(\pi_i(k) - \pi_i(l))(\pi_j(k) - \pi_j(l)) > 0\}.$$

    Proposition 4: Let $A_{m \times n}$ be a rank matrix, and $D^*$ and I be as defined before. Then,

    $$I = 1 - \frac{D^*}{\binom{m}{2}\binom{n}{2}}.$$
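A small sketch of the Inequality Coefficient, counting concordant comparisons and returning an exact fraction. The function name and the use of `fractions.Fraction` are choices made for this example, and the input is assumed to be a tie-free rank matrix given as a list of rows.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def inequality_coefficient(rank_matrix):
    """Share of (row pair, column pair) combinations on which the rows agree."""
    m, n = len(rank_matrix), len(rank_matrix[0])
    concordant = sum(1
                     for p, s in combinations(rank_matrix, 2)
                     for k, l in combinations(range(n), 2)
                     if (p[k] - p[l]) * (s[k] - s[l]) > 0)
    return Fraction(concordant, comb(m, 2) * comb(n, 2))


complete = [[3, 4, 1, 5, 2]] * 4                   # Complete Inequality example
split = [[1, 2, 3, 4]] * 2 + [[4, 3, 2, 1]] * 2    # two rows reversed
```

For `split`, only the two pairs of identical rows contribute, giving 12 concordant comparisons out of 36.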

    Theorem 5: For a rank matrix $A_{m \times n}$ with Inequality Coefficient I, we have

    (i) $I \le 1$, with equality iff A is a Complete Inequality configuration.

    (ii) $$I \ge \begin{cases} \dfrac{1}{2} - \dfrac{1}{2m}, & \text{if } m \text{ is odd;} \\[1ex] \dfrac{1}{2} - \dfrac{1}{2(m - 1)}, & \text{if } m \text{ is even.} \end{cases}$$

    Theorem 5 (contd.): For a rank matrix $A_{m \times n}$ with Inequality Coefficient I, we have

    (iii) I is invariant under row and column permutations.

    (iv) Let $B_i$ dominate $B_j$ in A. Let $A_1$ be the rank matrix obtained from A by swapping the ranks of $B_i$ and $B_j$ in any one category $C_k$. Let the inequality coefficient for $A_1$ be $I_1$. Then, $I_1 \le I$.

    Theorem 6: For a rank matrix $A_{m \times n}$ with Inequality Coefficient I, if $\pi_1, \pi_2, \ldots, \pi_m$ are i.i.d. random permutations of {1, 2, ..., n}, then $E(I) = \frac{1}{2}$.

    Theorem 7: For a rank matrix $A_{m \times n}$ with Inequality Coefficient I, if $\pi_1, \pi_2, \ldots, \pi_m$ are i.i.d. random permutations of {1, 2, ..., n}, then

    $$\operatorname{var}(I) = \frac{2n + 5}{36 \binom{m}{2}\binom{n}{2}}.$$

    Corollary: Let n be fixed. Then as m goes to infinity, we have $I \xrightarrow{P} \frac{1}{2}$.
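For very small m and n, the mean and variance of I can be checked by enumerating every possible rank matrix, assuming each row is uniformly distributed over all permutations as under the null hypothesis. This is only a sanity check with hypothetical function names; the enumeration has (n!)^m terms and is infeasible beyond toy sizes.

```python
from fractions import Fraction
from itertools import combinations, permutations, product
from math import comb


def inequality_coefficient(rank_matrix):
    m, n = len(rank_matrix), len(rank_matrix[0])
    concordant = sum(1
                     for p, s in combinations(rank_matrix, 2)
                     for k, l in combinations(range(n), 2)
                     if (p[k] - p[l]) * (s[k] - s[l]) > 0)
    return Fraction(concordant, comb(m, 2) * comb(n, 2))


def exact_moments(m, n):
    """Exact mean and variance of I over all (n!)^m equally likely rank matrices."""
    perms = list(permutations(range(1, n + 1)))
    values = [inequality_coefficient(rows) for rows in product(perms, repeat=m)]
    mean = sum(values, Fraction(0)) / len(values)
    var = sum(((v - mean) ** 2 for v in values), Fraction(0)) / len(values)
    return mean, var


def theorem7_variance(m, n):
    return Fraction(2 * n + 5, 36 * comb(m, 2) * comb(n, 2))
```

Comparing `exact_moments(3, 3)` with `(Fraction(1, 2), theorem7_variance(3, 3))` gives one quick consistency check of Theorems 6 and 7.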

    The distribution of I under $H_0$ is slightly right-tailed. As m or n increases, it rapidly becomes concentrated around 0.5.

    The graph of the simulated distribution of I for m=5 and n=29 is provided below.
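The slides do not say how the null distribution was simulated or how P-values were obtained. One common approach, sketched here purely as an assumption, is to draw rank matrices with independent uniformly random rows and to report the share of simulated coefficients at least as large as the observed one. All names, the seed, and the add-one correction are choices made for this example.

```python
import random
from itertools import combinations
from math import comb


def inequality_coefficient(rank_matrix):
    m, n = len(rank_matrix), len(rank_matrix[0])
    concordant = sum(1
                     for p, s in combinations(rank_matrix, 2)
                     for k, l in combinations(range(n), 2)
                     if (p[k] - p[l]) * (s[k] - s[l]) > 0)
    return concordant / (comb(m, 2) * comb(n, 2))


def simulate_null(m, n, draws, seed=0):
    """Simulated values of I when each row is a uniformly random permutation."""
    rng = random.Random(seed)
    base = list(range(1, n + 1))
    return [inequality_coefficient([rng.sample(base, n) for _ in range(m)])
            for _ in range(draws)]


def right_tail_p_value(observed, sims):
    """Monte Carlo P-value for large I, with the usual add-one correction."""
    exceed = sum(1 for v in sims if v >= observed)
    return (exceed + 1) / (len(sims) + 1)
```

A call such as `right_tail_p_value(0.605, simulate_null(5, 29, 10000))` would follow this recipe for a 5 × 29 matrix; the result depends on the seed and number of draws and need not reproduce the figures reported in the talk.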

    Collection of Data

    We use the results of the 59th and 61st rounds of the NSSO household survey.

    The 59th round survey was carried out in 2003 and the 61st in 2004-05.

    We have excluded the Union Territories from our study.

    Inequality among States in India

    Variable Selection

    The values of the following variables or attributes are used as categories to rank the states:

    1. MPCE (Monthly Per-capita Consumption Expenditure),
    2. Level of Education,
    3. Employment,
    4. Primary Source of Lighting, and
    5. Area of Land Possessed.

    Results: 59th Round

    Data were not available for all the states.

    We used the data for the 17 states for which data on all categories were available.

    For this round we have a 5 × 17 matrix.

    Results: 59th Round (contd.)

    I = 0.591, P-value = 0.003

    Results: 61st Round

    Data were available for all the states.

    We used the data for 28 states and Delhi.

    For this round we have a 5 × 29 matrix.

    Results: 61st Round (contd.)

    I = 0.605, P-value = 0.00001

    Results: 61st Round, TRUNCATED

    Data were available for all the states in the 61st round, but only for 17 states in the 59th.

    To compare the two, we truncate the 61st round data so as to include only those states that were included in the 59th Round study.

    For this round also, we have a 5 × 17 matrix.

    Results: 61st Round, TRUNCATED (contd.)

    I = 0.575, P-value = 0.010

    Comparison with other Statistics

    Other statistics used for comparison are:

    (1) Friedman Statistic: $D'' = \sum_{j} \left( \sum_{i} \left( \pi_i(j) - \frac{n + 1}{2} \right) \right)^2$.

    (2) Statistic used by Sarkar et al.: $D' = D_1^{d_C}$.

    Comparison with D'

    $$A_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 2 & 3 & 1 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix}$$

    $D'(A_1) = 1$, $D'(A_2) = 2$

    $D^*(A_1) = 5$, $D^*(A_2) = 2$

    $A_1$ has more inequality w.r.t. D'. But w.r.t. $D^*$, more inequality is present in $A_2$.

    Comparison with D''

    $$A_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \\ 4 & 3 & 2 & 1 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 1 & 4 & 3 & 2 \\ 2 & 1 & 4 & 3 \\ 3 & 2 & 1 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}$$

    $D''(A_1) = D''(A_2) = 0$

    $D^*(A_1) = 24$, $D^*(A_2) = 20$

    $A_2$ has more inequality w.r.t. $D^*$, whereas D'' cannot distinguish between $A_1$ and $A_2$.
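The comparison on this slide can be reproduced with a few lines. The sketch assumes the Friedman-type statistic is the sum over blocks of the squared deviation of each column's rank total from its null expectation m(n+1)/2, and evaluates D* by counting discordant comparisons; both function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def friedman_statistic(rank_matrix):
    """Sum over blocks of (column rank total - m(n+1)/2) squared."""
    m, n = len(rank_matrix), len(rank_matrix[0])
    expected = Fraction(m * (n + 1), 2)
    return sum((sum(row[j] for row in rank_matrix) - expected) ** 2
               for j in range(n))


def d_star_measure(rank_matrix):
    """D*(A) as the number of discordant (row pair, column pair) combinations."""
    n = len(rank_matrix[0])
    return sum(1
               for p, s in combinations(rank_matrix, 2)
               for k, l in combinations(range(n), 2)
               if (p[k] - p[l]) * (s[k] - s[l]) < 0)


A1 = [[1, 2, 3, 4], [1, 2, 3, 4], [4, 3, 2, 1], [4, 3, 2, 1]]
A2 = [[1, 4, 3, 2], [2, 1, 4, 3], [3, 2, 1, 4], [4, 3, 2, 1]]
```

Every column of both matrices sums to 10, which is why the column-total statistic vanishes for each, while the pairwise comparisons still separate them.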

    In real-life scenarios we often end up with situations where ties exist between the ranks of blocks, or where the data is incomplete.

    In such cases, we give a natural extension to our measure by using the formula

    $$I = \frac{1}{\binom{m}{2}\binom{n}{2}} \sum_{1 \le i < j \le m} \sum_{1 \le k < l \le n} \mathbf{1}\{(\pi_i(k) - \pi_i(l))(\pi_j(k) - \pi_j(l)) > 0\}.$$

    In case of ties we replace the indicator function by 0.5.

    In case of incomplete data, we ignore those cases and scale by the number of meaningful observation pairs.
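One way to read the extension is sketched below. It is an interpretation, not the authors' stated algorithm: a comparison scores 0.5 whenever either row ties the two blocks, a comparison is skipped when any of the four ranks involved is missing (represented here by `None`), and the total is divided by the number of comparisons actually used. The function name is hypothetical.

```python
from itertools import combinations


def extended_inequality_coefficient(rank_matrix):
    """Inequality Coefficient with ties scored 0.5 and missing ranks skipped."""
    n = len(rank_matrix[0])
    score, usable = 0.0, 0
    for p, s in combinations(rank_matrix, 2):
        for k, l in combinations(range(n), 2):
            if None in (p[k], p[l], s[k], s[l]):
                continue  # incomplete data: this comparison is not meaningful
            usable += 1
            product = (p[k] - p[l]) * (s[k] - s[l])
            if product > 0:
                score += 1.0   # concordant
            elif product == 0:
                score += 0.5   # tie in at least one of the two rows
    if usable == 0:
        raise ValueError("no usable comparisons")
    return score / usable


tied = [[1, 2, 3], [1.5, 1.5, 3]]    # second category ties the first two blocks
partial = [[1, 2, 3], [2, 1, None]]  # third block unobserved in second category
```

With complete, tie-free data this reduces to the original coefficient, since every comparison is usable and none scores 0.5.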

    Large sample distribution of I.

    Further investigation of the combinatorial properties of $D^*$.

    Effect of an outlier block.

    Effect of different clusters in categories.

    Exploring the case when all the categories are not of equal importance.
