clog log

Upload: askazy007

Post on 01-Mar-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/25/2019 Clog Log

    1/18

    COMPLEMENTARY LOG LOG MODEL

    Under the assumption of binary response, there are two alternatives to logit model: probit

    model and complementary-log-log model.

    They all follow the same form( ) ( )xx += (1)

    for a continuous cdf .

    Complementary log-log model says1

    log{-log[1- ( )]} Tp n p

    x X

    = . The expression on

    the left-hand side is called Complementary Log-Log transformation. Like the logit and

    the probit transformation, the complementary log-log transformation takes a response

    restricted to the (0,1) interval and converts it into something in ( , ) + interval. Here,

    we need mentioned that the log of 1- (x) is always a negative number. This is changed

    to a positive number before taking the log a second time. We can also write the model

    down like form (1), as 1)]( ) 1 exp[ exp( Tp n px X = .

    Both logit and probit links have the same property, which is link[(x)]=-link[1-(x)] .

    This means that the response curve for (x) has a symmetric appearance about the point

    (x) =0.5 and so (x) has the same rate for approaching 0 as well as for approaching 1.

    When the data given is not symmetric in the [0,1] interval and increase slowly at small to

    moderate value but increases sharply near 1. The logit and probit models areinappropriate. However, in this situation, the complementary log-log model might give a

    satisfied answer.

    PROBIT CLOGLOG AND LOGIT MODELS COMPARISON

    0

    0. 2

    0. 4

    0. 6

    0. 8

    1

    1. 2

    - 12 - 11 - 10 - 9 - 8 - 7 - 6 - 5 - 4 - 3 - 2 - 1 0 1 2 3 4 5 6 7 8 9 10 11 12

    Explanatory var i abl e

    PI(X)

    CLOGLOG

    PROBI T

    LOGI T

    Unlike logit and probit the complementary log-log model is asymmetrical, it is frequentlyused when the probability of an event is very small or very large. Under the assumption

  • 7/25/2019 Clog Log

    2/18

    that the general features are not lost, let us only consider a simple complementary log-log

    model with one predictor (x)=1-exp[-exp( + x)] in the report left. The response has

    an S-shaped curve, it approach 0 fairly slowly but approaching 1 quite sharply, when

    >0.

    compl ement ary l og l og model

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    - 9 - 8 - 7 - 6 - 5 - 4 - 3 - 2 - 1 0 1 2 3 4 5 6 7 8

    Explanatory var i abl e

    PI(x)

    bet a>0

    bet a

  • 7/25/2019 Clog Log

    3/18

    For 2 1 1x x = , the complement probability at 2x equals the complement probability at

    1x raised to the power exp( ) .

    Here, we will give out another related model (x)=exp[-exp( + x)] , it is called log-

    log model. For it, (x) approaches 0 sharply but approaches 1 slowly. As x increases, the

    curve is monotone decreasing when >0, and monotone increasing when 0

    bet a

  • 7/25/2019 Clog Log

    4/18

    F(t)=P(T )t . For fixed dosage x, the probability a randomly selected subject dies is

    (x)=P(Y=1| X=x)=P(T x)=F(x)

    That is the appropriate binary model is the one having the shape of the cdf F of the

    tolerance distribution. Let denote the standard cdf for the family to which F belongs.

    A common standardization uses the mean and standard deviation of T, so that

    (x)=F(x)= [( ) / ]x Then, the model has form (x)= ( + x) .Let us use the beetle data as an example.

    Beetles Killed after Exposure to Carbon Disulfide

    Fitted Values

    Log DoseNumber of

    BeetlesNumber

    KilledComp. Log-

    LogProbit Logit

    1.691 59 6 5.7 3.4 35

    1.724 60 13 11.3 10.7 9.8

    1.755 62 18 20.9 23.4 22.4

    1.784 56 28 30.3 33.8 33.9

    1.811 63 5 47.7 49.6 50.0

    1.837 59 53 54.2 53.4 53.3

    1.861 62 61 61.1 59.7 59.2

    1.884 60 60 59.9 59.2 58.8

    In the table, we find that the underlying cdf of number killed increases moderately beforex=1.811,then there is a big jump on the number of beetles killed.

    SAS result:

    For logistic proc:the logit model for beetles data 31

    21:41 Sunday, March 14, 2004

    The LOGISTIC Procedure

    Model Fit Statistics

    InterceptIntercept and

    Criterion Only Covariates

    AIC 647.441 376.354SC 651.617 384.706-2 Log L 645.441 372.354

    Testing Global Null Hypothesis: BETA=0

    Test Chi-Square DF Pr > ChiSq

    Likeli hood Ratio 273.0869 1

  • 7/25/2019 Clog Log

    5/18

    Intercep t 1 -60.7339 5.1814 137.3964

  • 7/25/2019 Clog Log

    6/18

    logdos e 1 19.7408 1.4880 175.9925 ChiSq

    Intercept 1 -34.9561 2.6413 -40.1330 -29.7793 175.15

  • 7/25/2019 Clog Log

    7/18

    Model Information

    Data Set WORK.BEETLES1Distribution BinomialLink Function LogitResponse Variable (Events) nkilledResponse Variable (Trials) nbeetlesObservations Used 8Number Of Events 291Number Of Trials 481

    Criteria For Assessing Goodness Of Fit

    Criterion DF Value Value/DF

    Deviance 6 11.1156 1.8526Scaled Deviance 6 11.1156 1.8526Pearson Chi-Square 6 9.9067 1.6511Scaled Pearson X2 6 9.9067 1.6511Log Likelihood -186.1771

    Algorithm converged.Analysis Of Parameter Estimates

    Standard Wald 95% Confidence Chi-Parameter DF Estimate Error Limits Square Pr > ChiSq

    Intercept 1 -60.7401 5.1819 -70.8964 -50.5838 137.40

  • 7/25/2019 Clog Log

    8/18

    of survival is^

    1 (x)=exp[-exp(-39.5224+22.0148 1.7)] = 0.884445 where at dosage

    =1.8 it is 0.3296029, and at dosage =1.9,it is 4.39966e-05, the probability of survival at

    dosage+0.1 equals the probability at dosage raised to the power

    Exp(22.0148 0.1)=9.03838 . For instance, 9.038380.3296029 (0.884445)

    (9.03838

    0.3296027=(0.884445) )

    Underlying the LOGISTIC proc:

    Model intercept logdose Standard ErrorAIC with Intercept

    and Covariates

    logit -60.7339 34.2824 5.1814/2.9129 376.354

    probit -34.9557 19.7408 2.6490/1.4880 375.226

    Complementarylog-log

    -39.5224 22.0148 3.2356/1.7970 368.753

    Underlying the GENMOD proc:

    Model intercept logdoseStandard

    Error2Deviance-G DF

    logit -60.7401 34.2859 5.1819 /2.9132 11.1156 6

    probit -34.9561 19.7410 2.6413 /1.4853 9.9870 6

    Complementary

    log-log-39.5223 22.0148 3.2229/1.7899 3.5143 6

    From the table we can find that under logistic proc, complementary log-log has the

    smallest AIC =368.753, under Genmod proc, it is still complementary log-log that has the

    smallest 2G 3.5143= .The last few things I need mentioned are there are

    The reason is that GENMOD uses the Newton-Raphson algorithm to get the ML

    estimates, and LOGISTIC uses iteratively reweighed least squares(also called Fisherscoring ), these two algorithms are equivalent for logit models but diverge for any other

    model.(That is because logit is the unique canonical link function).

    (With the coefficients, we can see with the one unit change on x, the logit will

  • 7/25/2019 Clog Log

    9/18

    ?????, the compare between coefficients and standard error)

    Finally, we need mentioned that complementary log-log model is not only used for binaryrespond but also can be used for ordinal responses with cumulative link, the form is

    j+log{ log[1 ( | x)]}= xTP Y j

    and the ordinal model using this link is sometimes called a Proportional hazards model

    for survival data to handle grouped survival times.

    model using cumulative link

    Bookmark:

    Proof for symmetric property of logit distribution and probit model:

    logit[(x)]=log[ (x)/(1- (x))]

    =-log[(1- (x))/ (x)]

    =-logit[1-(x)]

    1

    1

    1 1

    (x) + x

    (x) ( + x)

    (x) (x)

    [ ]

    [1 ]

    [ ] [1 ]

    =

    =

    =

    log{ log[ (x)]}=log{-log(1-exp[-exp( + x)])}

    The model says that

    for large p, log-log close to logit

    /*************storage plant***********************/

    F(x)=exp{-exp[-(x-a)/b]}

  • 7/25/2019 Clog Log

    10/18

    2 1 2 1

    2

    2 1

    1

    2 1

    2 1

    exp[ ( )]

    ( )

    ( )

    log[ log(1 ( ))] log[ log(1 ( ))]

    log[1 ( )]exp[ ]

    log[1 ( )]

    1 ( ) [1 ( )]

    link[(x)]=-link[1-(x)](x)=exp[-exp( + x)]

    log[-log(1- (x))]= + x

    F(x)=exp{-exp[-(x-a)

    x x

    x x

    xx

    x

    x x

    xx

    x

    =

    =

    =

    2 1

    /b]}

    1

    (x)=exp[-exp( + x)]

    F(t)=P(T )

    (x)=P(Y=1| X=x)=P(T x)=F(x)

    (x)=F(x)= [( ) / ]

    (x)=1-exp[-exp( + x)]

    (x)=1-exp[-exp( + x)]

    link[(x)]=-link[1-(x)]

    logit[(x)]=log[ (x)/(1- (x))]

    x x

    t

    x

    =

    ^

    1

    1

    1)]

    ( , )

    =-log[(1- (x))/ (x)]

    =-logit[1-(x)]

    log[-log( (x))]=

    ( ) 1 exp[ exp(

    1 (x)=exp[-exp(-39.5224+22.0148 1.7)]

    exp( )

    (x) + x

    (x) ( + x)

    x

    p [0.1 0.9]

    [ ]

    [1 ]

    T

    Tp n px X

    +

    =

    =

    =

    1 1

    9.03838

    9.03838

    2

    (x) (x)log{ log[ (x)]}=log{-log(1-exp[-exp( + x)])}

    log[-log(1- (x))]=-39.5224+22.0148 logdose

    Exp(22.0148 0.1)=9.03838

    0.3296029 (0.884445)

    0.3296027=(0.884445)

    Deviance-G 3.51

    [ ] [1 ]

    =

    =

    j

    +

    43

    log{ log[1 ( | x)]}= xTP Y j

  • 7/25/2019 Clog Log

    11/18

    2 1

    2 1

    2

    1

    2 1

    2 1

    2 1

    exp[ ( )]

    ( )

    ( )

    log[ log(1 ( ))] log[ log(1 ( ))]

    log[1 ( )]exp[ ]

    log[1 ( )]

    1 ( ) [1 ( )]

    link[(x)]=-link[1-(x)](x)=exp[-exp( + x)]

    log[-log( (x))]= + x

    F(x)=exp{-exp[-(x-a)/b

    x x

    x x

    xx

    x

    x x

    xx

    x

    =

    =

    =

    2 1

    ]}

    1

    (x)=exp[-exp( + x)]

    F(t)=P(T )

    (x)=P(Y=1| X=x)=P(T x)=F(x)

    (x)=F(x)= [( ) / ]

    (x)=1-exp[-exp( + x)]

    link[(x)]=-link[1-(x)]

    logit[(x)]=log[ (x)/(1- (x))]

    =-log[(1- (x))/ (x)

    x x

    t

    x

    =

    1)]

    ( , )

    ]

    =-logit[1-(x)]

    log[-log( (x))]=

    ( ) 1 exp[ exp(

    x

    p [0.1 0.9]

    T

    Tp n px X

    +

    =

    /*************************************/

  • 7/25/2019 Clog Log

    12/18

    APPENDIX I

    For logistic proc/ *t he begi n of sas codet he code i s used f or t wo goal , so ther e i s some r esul tt hat got f r om i t and use i t back t o t he code agai n f r o pl ot t i ng*/ data beet l es1;i nput l ogdose nbeet l es nki l l ed;nsur vi ve=nbeet l es- nki l l ed;/ *f or sampl e pr opor t i on*/

    r pr ob1=nki l l ed/ nbeet l es;/ *f or l ogi t model */ yl ogi t =exp( - 60.7339+34.2824*l ogdose) / (1+exp( - 60.7339+34.2824*l ogdose) ) ;/ *f or compl ement ar y l og l og*/ ycl l = 1- exp( - exp( - 39.5224+22.0148*l ogdose) ) ;dat al i nes;

    1. 691 59 61. 724 60 131. 755 62 181. 784 56 281. 811 63 521. 837 59 531. 861 62 61

    1. 884 60 60;/ ***usi ng l ogi st i c pr oc f or t est compar i ng t he t hr ee ki nds of model */ proc logistic data= beet l es1; / * for l ogi t * / model nki l l ed/ nbeet l es = l ogdose ;t i t l e t he l ogi t model f or beet l es dat a ;

    proc logistic dat a= beet l es1; / *f or cl ogl og*/ model nki l l ed/ nbeet l es = l ogdose / l i nk=cl ogl og OUTROC=cl l pl ot d;t i t l e t he compl ement ar y l og l og model f or beet l es dat a ;

    proc logistic dat a= beet l es1; / *f or probi t */ model nki l l ed/ nbeet l es = l ogdose / l i nk=pr obi t ;t i t l e t he pr obi t model f or beet l es dat a ;

    / *pl ot a cur ve*/

    symbol 1 col or =r ed val ue=st ar i nt erpol =NONE hei ght =1 wi dt h=1;symbol 2 col or =gr een val ue=pl us i nt er pol =spl i ne hei ght =1 wi dt h=1;symbol 3 col or =bl ue val ue=DI AMOND i nterpol =spl i ne hei ght =1 wi dt h=1;proc gplot data=beet l es1;pl ot r pr ob1*l ogdose yl ogi t *l ogdose ycl l *l ogdose/haxi s=1.65 t o 1.90 by.05 over l ay l egend=l egend2;t i t l e ' sampl e pr oport i on, cl l and l ogi t model compar i son' ;

  • 7/25/2019 Clog Log

    13/18

    run;/ ***t he end of usi ng l ogi st i c f or t est compar i ng t he three ki nd ofmodel */ / *t he end of sas code*/

    /*result begin*//*for logistic proc*/

    the logit model for beetles data 3121:41 Sunday, March 14, 2004

    The LOGISTIC Procedure

    Model Information

    Data Set WORK.BEETLES1Response Variable (Events) nkilledResponse Variable (Trials) nbeetlesNumber of Observations 8Model binary logitOptimization Technique Fisher's scoring

    Response Profile

    Ordered Binary TotalValue Outcome Frequency

    1 Event 2912 Nonevent 190

    Model Convergence Status

    Convergence criterion (GCONV=1E-8) satisfied.

    Model Fit Statistics

    InterceptIntercept and

    Criterion Only Covariates

    AIC 647.441 376.354

    SC 651.617 384.706-2 Log L 645.441 372.354

    Testing Global Null Hypothesis: BETA=0

    Test Chi-Square DF Pr > ChiSq

    Likelihood Ratio 273.0869 1

  • 7/25/2019 Clog Log

    14/18

    Association of Predicted Probabilities and Observed Responses

    Percent Concordant 87.0 Somers' D 0.802Percent Discordant 6.8 Gamma 0.856Percent Tied 6.3 Tau-a 0.384Pairs 55290 c 0.901

    -------------------------------------------------------------------------------------------------------------------the complementary log log model for beetles data 33

    21:41 Sunday, March 14, 2004

    The LOGISTIC Procedure

    Model Information

    Data Set WORK.BEETLES1Response Variable (Events) nkilledResponse Variable (Trials) nbeetlesNumber of Observations 8Model binary cloglogOptimization Technique Fisher's scoring

    Response Profile

    Ordered Binary TotalValue Outcome Frequency

    1 Event 2912 Nonevent 190

    Model Convergence Status

    Convergence criterion (GCONV=1E-8) satisfied.

    Model Fit Statistics

    InterceptIntercept and

    Criterion Only Covariates

    AIC 647.441 368.753SC 651.617 377.105-2 Log L 645.441 364.753

    Testing Global Null Hypothesis: BETA=0

    Test Chi-Square DF Pr > ChiSq

    Likelihood Ratio 280.6881 1

  • 7/25/2019 Clog Log

    15/18

    The LOGISTIC Procedure

    Model Information

    Data Set WORK.BEETLES1Response Variable (Events) nkilledResponse Variable (Trials) nbeetlesNumber of Observations 8Model binary probitOptimization Technique Fisher's scoring

    Response Profile

    Ordered Binary TotalValue Outcome Frequency

    1 Event 2912 Nonevent 190

    Model Convergence Status

    Convergence criterion (GCONV=1E-8) satisfied.

    Model Fit Statistics

    InterceptIntercept and

    Criterion Only Covariates

    AIC 647.441 375.226SC 651.617 383.577-2 Log L 645.441 371.226

    Testing Global Null Hypothesis: BETA=0

    Test Chi-Square DF Pr > ChiSq

    Likelihood Ratio 274.2155 1

  • 7/25/2019 Clog Log

    16/18

    1. 691 59 61. 724 60 131. 755 62 181. 784 56 281. 811 63 521. 837 59 531. 861 62 611. 884 60 60;

    / *t he end of sas code*/ proc genmod dat a= beet l es1;model nki l l ed/ nbeet l es= l ogdose / di st =bi n l i nk=pr obi t ;

    proc genmod data= beet l es1;model nki l l ed/ nbeet l es= l ogdose / di st =bi n l i nk=cl ogl og;

    proc genmod data= beet l es1;model nki l l ed/ nbeet l es= l ogdose / di st =bi n l i nk=l ogi t ;qui t ;

    run;/ ** *t he end of usi ng genmode f or t est compar i ng t he thr ee ki nd of

    model */

    ***************************************************************************************************************************The SAS System 23:18 Monday, March 15, 2004 4

    The GENMOD Procedure

    Model Information

    Data Set WORK.BEETLES1Distribution BinomialLink Function ProbitResponse Variable (Events) nkilledResponse Variable (Trials) nbeetlesObservations Used 8Number Of Events 291

    Number Of Trials 481

    Criteria For Assessing Goodness Of Fit

    Criterion DF Value Value/DF

    Deviance 6 9.9870 1.6645Scaled Deviance 6 9.9870 1.6645Pearson Chi-Square 6 9.3690 1.5615Scaled Pearson X2 6 9.3690 1.5615Log Likelihood -185.6128

    Algorithm converged.Analysis Of Parameter Estimates

    Standard Wald 95% Confidence Chi-Parameter DF Estimate Error Limits Square Pr > ChiSq

    Intercept 1 -34.9561 2.6413 -40.1330 -29.7793 175.15

  • 7/25/2019 Clog Log

    17/18

    Response Variable (Trials) nbeetlesObservations Used 8Number Of Events 291Number Of Trials 481

    Criteria For Assessing Goodness Of Fit

    Criterion DF Value Value/DF

    Deviance 6 3.5143 0.5857Scaled Deviance 6 3.5143 0.5857Pearson Chi-Square 6 3.3592 0.5599Scaled Pearson X2 6 3.3592 0.5599Log Likelihood -182.3765

    Algorithm converged.Analysis Of Parameter Estimates

    Standard Wald 95% Confidence Chi-Parameter DF Estimate Error Limits Square Pr > ChiSq

    Intercept 1 -39.5223 3.2229 -45.8391 -33.2055 150.38

  • 7/25/2019 Clog Log

    18/18