
Page 1: An Introduction and Application of IML to Weighted ChiSquare Statistics Lisa Price, Bruce Johnston Junming Yang, Dan DiPrimeo BASAS April 14, 2008

An Introduction and Application of IML to Weighted ChiSquare Statistics

Lisa Price, Bruce Johnston

Junming Yang, Dan DiPrimeo

BASAS April 14, 2008

Page 2

What is IML?

► Interactive
► Matrix
► Language

Page 3

What is IML?

► A powerful, flexible programming language in a dynamic, interactive environment.
► The fundamental object is a data matrix.
► Use IML interactively (I = interactive!) to see results immediately, or store statements in a module.
► Powerful: built-in operators perform matrix operations.
► Data management commands.

Page 4

What is IML?

Page 5

Why IML?

► Can do graphs and analyses that other SAS modules don't readily do, in programming statements that translate easily from mathematical and statistical notation.
► A powerful graphics package for scientific exploration.
► In SAS 9.2, a mechanism for submitting R statements in IML Workshop.
► Write a program in a language nobody else knows. Great job security.

Page 6

Goal

► To get you acquainted with IML via
    a VERY brief intro
    a real-life example

Page 7

Define Matrix A

proc iml;
reset print;  /* send to the .lst file */
A = {1 3 5, 4 4 1, 2 2 6};  /* Define A to be a 3 x 3 matrix */

A 3 rows 3 cols (numeric)

1 3 5
4 4 1
2 2 6

Page 8

Define Matrix B

B = {1 3 4, 3 5 2, 4 2 1};  /* Define B to be a 3 x 3 symmetric matrix */

B 3 rows 3 cols (numeric)

1 3 4
3 5 2
4 2 1

Note: B is symmetric but not positive definite; its leading 2 x 2 minor is 1*5 - 3*3 = -4.

Page 9

Define Matrix C

C = {2 1 1, 3 4 6};  /* C is a 2 x 3 matrix */

C 2 rows 3 cols (numeric)

2 1 1
3 4 6

Page 10

Define Matrix D

D = {2 2, 4 5, 5 1};  /* D is a 3 x 2 matrix */

D 3 rows 2 cols (numeric)

2 2
4 5
5 1

Page 11

Define Matrix X

X = {1 1 0, 1 1 0, 1 1 0, 1 0 1, 1 0 1, 1 0 1};
/* X is a design matrix for ANOVA: 6 obs, 1 independent variable with 2 levels */

X 6 rows 3 cols (numeric)

1 1 0
1 1 0
1 1 0
1 0 1
1 0 1
1 0 1

Page 12

Compute the Inverse of A

inverseA = inv(A);  /* compute the inverse of A */

INVERSEA 3 rows 3 cols (numeric)

-0.5 0.1818182 0.3863636
0.5 0.0909091 -0.431818
0 -0.090909 0.1818182
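One way to sanity-check the printed inverse outside SAS is to multiply it back against A and look for the identity matrix. The sketch below is plain Python, not IML; `matmul` is a hypothetical helper written for this check, and the inverse is typed in at the rounded precision shown on the slide, so the product is only approximately the identity.

```python
# Plain-Python check (not IML): A times the printed inverse should be close to I.
A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]

# Inverse as printed on the slide (rounded by the SAS listing).
inverse_a = [
    [-0.5, 0.1818182, 0.3863636],
    [0.5, 0.0909091, -0.431818],
    [0.0, -0.090909, 0.1818182],
]


def matmul(p, q):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]


product = matmul(A, inverse_a)
for row in product:
    print([round(value, 4) for value in row])
```

Because the inputs are rounded, the off-diagonal entries come out as very small numbers rather than exact zeros.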

Page 13

Compute the Transpose of C

transposeC = t(C);  /* or, C` */
/* compute the transpose of C */

TRANSPOSEC 3 rows 2 cols (numeric)

2 3
1 4
1 6

Page 14

Binary Operation: A+B

AplusB = A + B;  /* Add 2 matrices of the same size */

APLUSB 3 rows 3 cols (numeric)

2 6 9
7 9 3
6 4 7

Page 15

Binary Operation: A*B (Matrix Multiplication)

AtimesB = A * B;  /* Matrix multiplication */

ATIMESB 3 rows 3 cols (numeric)

30 28 15
20 34 25
32 28 18

Page 16

Question:

► I would like to multiply two matrices term by term, rather than by matrix multiplication. That is,

\[
\begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix} \;?\; \begin{pmatrix} 2 & 2 \\ 4 & 4 \end{pmatrix} = \begin{pmatrix} 6 & 2 \\ 16 & 8 \end{pmatrix}
\]

► What will be the operator?
► Answer: #
► Compare with *, which we already saw:

\[
\begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix} * \begin{pmatrix} 2 & 2 \\ 4 & 4 \end{pmatrix} = \begin{pmatrix} 10 & 10 \\ 16 & 16 \end{pmatrix}
\]
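The difference between the two operators can be reproduced with a few lines of plain Python (not IML). This is only an illustrative sketch: the names `P` and `Q` are made up here, since the slide does not name the two 2 x 2 matrices.

```python
# Plain-Python illustration (not IML) of elementwise vs. matrix multiplication.
P = [[3, 1], [4, 2]]
Q = [[2, 2], [4, 4]]

# Elementwise product: the counterpart of IML's # operator.
elementwise = [[a * b for a, b in zip(p_row, q_row)] for p_row, q_row in zip(P, Q)]

# Matrix product: the counterpart of IML's * operator.
matrix_product = [
    [sum(a * b for a, b in zip(p_row, q_col)) for q_col in zip(*Q)] for p_row in P
]

print(elementwise)     # [[6, 2], [16, 8]]
print(matrix_product)  # [[10, 10], [16, 16]]
```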

Page 17

Reduction Operators

Asumrows = A[,+];  /* Reduction operator, sum the rows */

ASUMROWS 3 rows 1 col (numeric)

9
9
10

Page 18

Reduction Operators

Asumcols = A[+,];  /* Reduction operator, sum the columns */

ASUMCOLS 1 row 3 cols (numeric)

7 9 12

Page 19

Question: How to get column or row products?

► prodC = C[op,]
► What is op?
► Answer: #

\[
C = \begin{pmatrix} 2 & 1 & 1 \\ 3 & 4 & 6 \end{pmatrix}, \qquad
\text{prodC} = \begin{pmatrix} 6 & 4 & 6 \end{pmatrix}
\]
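For readers who want to confirm what the subscript reduction operators return, the following plain-Python sketch (not IML) recomputes the row sums, column sums, and column products. The Python variable names are chosen here for illustration; the IML expression each one mirrors is given in the comment.

```python
from math import prod

A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]
C = [[2, 1, 1], [3, 4, 6]]

row_sums = [sum(row) for row in A]              # mirrors A[,+]
col_sums = [sum(col) for col in zip(*A)]        # mirrors A[+,]
col_products = [prod(col) for col in zip(*C)]   # mirrors C[#,]

print(row_sums)      # [9, 9, 10]
print(col_sums)      # [7, 9, 12]
print(col_products)  # [6, 4, 6]
```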

Page 20

Question

► Suppose we wanted to sum all the terms of the matrix A. How to do this?
► Answer: sumA = sum(A);
► Answer: sumA = A[+,][,+];
► Answer: sumA = A[,+][+,];
► Answer: sumA = A[+,+];
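The answers above agree because the grand total does not depend on whether rows or columns are collapsed first. A minimal plain-Python sketch (not IML) of that equivalence, with illustrative variable names:

```python
A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]

total_direct = sum(value for row in A for value in row)  # like sum(A)
cols_then_rows = sum([sum(col) for col in zip(*A)])      # like A[+,][,+]
rows_then_cols = sum([sum(row) for row in A])            # like A[,+][+,]

print(total_direct, cols_then_rows, rows_then_cols)  # 28 28 28
```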

Page 21

Catenating Matrices

AnextB = A||B;  /* Put 2 matrices side by side */

ANEXTB 3 rows 6 cols (numeric)

1 3 5 1 3 4
4 4 1 3 5 2
2 2 6 4 2 1

Page 22

Catenating Matrices

AtopB = A//B;  /* Put 2 matrices on top of each other */

ATOPB 6 rows 3 cols (numeric)

1 3 5
4 4 1
2 2 6
1 3 4
3 5 2
4 2 1

Page 23

Diagonal Operators

diagA = diag(A);
/* Change non-diagonal elements of A to 0, keep diagonals as they are */

DIAGA 3 rows 3 cols (numeric)

1 0 0
0 4 0
0 0 6

Page 24

Diagonal Operators

vdiagA = vecdiag(A);
/* Take the diagonal elements of A, put into a column matrix */

VDIAGA 3 rows 1 col (numeric)

1
4
6

Page 25

Functions

logA = log(A);  /* Log of each term */

LOGA 3 rows 3 cols (numeric)

0 1.0986123 1.6094379
1.3862944 1.3862944 0
0.6931472 0.6931472 1.7917595

Page 26

One more function

ssgvdiagA = ssq(vdiagA);
/* Square each element of vdiagA, sum them */

SSGVDIAGA 1 row 1 col (numeric)

53
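The diagonal functions and the sum of squares are easy to mimic in plain Python (not IML), which gives a quick check on the printed results. The variable names below are illustrative; each comment names the IML function being mirrored.

```python
A = [[1, 3, 5], [4, 4, 1], [2, 2, 6]]
size = len(A)

# Keep the diagonal, zero everything else (mirrors diag(A)).
diag_a = [[A[i][j] if i == j else 0 for j in range(size)] for i in range(size)]

# Pull the diagonal out as a vector (mirrors vecdiag(A)).
vdiag_a = [A[i][i] for i in range(size)]

# Sum of squared elements (mirrors ssq(vdiagA)).
ssq_vdiag = sum(value * value for value in vdiag_a)

print(diag_a)     # [[1, 0, 0], [0, 4, 0], [0, 0, 6]]
print(vdiag_a)    # [1, 4, 6]
print(ssq_vdiag)  # 53
```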

Page 27

Programming Statements

► In addition to the functions and operators that we talked about earlier, we have programming statements:
► If/Then
► Do
► Jumping (GOTO, LINK statements)
► Modules

Page 28

Modules

► Modules are similar to subroutines, or functions, that can be called anywhere in a program and reused later.

Page 29

Programming Statements, Modules

start TransMatrix(D);
  Dcol = ncol(D);  /* Number of columns in D */
  Drow = nrow(D);  /* Number of rows in D */
  Dtranspose_temp = shape(., Dcol, Drow);
  do i = 1 to Dcol;
    do j = 1 to Drow;
      Dtranspose_temp[i,j] = D[j,i];  /* transpose matrix D */
    end;
  end;
  return (Dtranspose_temp);
finish TransMatrix;

Dtranspose = TransMatrix(X);
print Dtranspose;
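The same loop structure can be written as a plain-Python function (not IML), which may help readers who know Python map the START/FINISH module onto a familiar function definition. The name `trans_matrix` is a hypothetical stand-in for the TransMatrix module, and the design matrix X is copied from the earlier slide.

```python
def trans_matrix(d):
    """Transpose a matrix (list of rows) with explicit loops, like the IML module."""
    n_row, n_col = len(d), len(d[0])
    # Pre-allocate an n_col x n_row result, as shape(., Dcol, Drow) does in IML.
    result = [[None] * n_row for _ in range(n_col)]
    for i in range(n_col):
        for j in range(n_row):
            result[i][j] = d[j][i]
    return result


X = [[1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1], [1, 0, 1]]
d_transpose = trans_matrix(X)
for row in d_transpose:
    print(row)
```

As in IML, variables created inside the function are local to it; only the returned matrix is visible to the caller.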

Page 30

Programming Statements, Modules

The previous slide illustrates
1. Programming statements (in the form of a Do loop)
2. Modules (creating groups of statements that can be invoked anywhere in the program, i.e., a subroutine, and creating a separate environment local to the module).

Page 31

Bring a dataset into IML, Convert to a Matrix

proc iml;
use Dset1;  /* dataset to read data from */

Page 32

Bring a dataset into IML, Convert to a Matrix

read all var{x} into xobs;  /* X variable into xobs matrix */

XOBS 10 rows 1 col (numeric)

550
200
280
340
410
160
380
510
510
475

Page 33

A real-life example

► Consider the following table, with x the number of responders, n the number of patients, and p the proportion (x/n):

             Group 0            Group 1
Stratum 1    x10/n10 = p10      x11/n11 = p11
Stratum 2    x20/n20 = p20      x21/n21 = p21
Stratum 3    x30/n30 = p30      x31/n31 = p31
Total        p0                 p1

Page 34

A real-life example

► The problem is to test the equality of the proportions responding between the two treatment groups, controlling for stratum, and to generate point estimates and confidence intervals.
► PROC FREQ with the CMH option provides p-values for hypothesis testing, but not point estimates or CIs.
► The desired test can be found in Mehrotra and Railkar, Statistics in Medicine, 2000.
► How to proceed?

Page 35

Defining the X, N matrices

► We need to define two matrices of the same size, X and N, where
    X is the number of patients who respond to treatment in group j, j = 1 (1st column) or j = 2 (2nd column), in stratum i.
    N is the number of patients in treatment group j, j = 1 (1st column) or j = 2 (2nd column), in stratum i.
► In the table on Page 33, Group 0 corresponds to column 1 and Group 1 to column 2.
► We can read in the data or enter it; we show the matrices on the next page, but leave off the coding.

Page 36

Defining the X, N matrices

\[
N_{k \times 2} =
\begin{pmatrix}
n_{11} & n_{12} \\
n_{21} & n_{22} \\
n_{31} & n_{32} \\
\vdots & \vdots \\
n_{k1} & n_{k2}
\end{pmatrix},
\qquad
X_{k \times 2} =
\begin{pmatrix}
x_{11} & x_{12} \\
x_{21} & x_{22} \\
x_{31} & x_{32} \\
\vdots & \vdots \\
x_{k1} & x_{k2}
\end{pmatrix}
\]

Page 37

Compute row sums, point estimates

► We need to calculate 3 terms on the next page:
    The row sums of the matrix N (this gives a K x 1 matrix: the total within each stratum).
    Another matrix, phat, which is, term by term, each element of x divided by the corresponding element of n.
    The proportion of patients responding within each stratum, pooled over the two groups (pbar).

Page 38

Compute row sums, point estimates

► ndot = n[,+];
► phat = x/n;
► pbar = (n#phat)[,+]/ndot;

\[
n_{i\cdot} = n_{i1} + n_{i2}, \quad i = 1, 2, 3, \ldots, k
\]

\[
\hat{p}_{ij} = x_{ij} / n_{ij}
\]

\[
\bar{p}_i = \frac{n_{i1}\hat{p}_{i1} + n_{i2}\hat{p}_{i2}}{n_{i1} + n_{i2}}
\]
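As a cross-check on these three quantities, the sketch below recomputes them in plain Python (not IML) using the x and n matrices from the worked example later in the deck (Page 50). The Python names mirror the IML names; the list-of-rows representation is an assumption of this sketch.

```python
# Responders (x) and patients (n): rows are strata, columns are the two groups.
x = [[140, 180], [50, 90], [60, 60]]
n = [[450, 440], [170, 200], [200, 180]]

# ndot = n[,+]: total patients in each stratum.
ndot = [sum(n_row) for n_row in n]

# phat = x/n: elementwise proportion of responders in each cell.
phat = [[xi / ni for xi, ni in zip(x_row, n_row)] for x_row, n_row in zip(x, n)]

# pbar = (n#phat)[,+]/ndot: pooled proportion of responders in each stratum.
pbar = [
    sum(ni * pi for ni, pi in zip(n_row, p_row)) / nd
    for n_row, p_row, nd in zip(n, phat, ndot)
]

for nd, p_row, pb in zip(ndot, phat, pbar):
    print(nd, [round(p, 7) for p in p_row], round(pb, 7))
```

The printed values can be compared against the NDOT, PHAT, and PBAR listings on Pages 51 and 52.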

Page 39

Compute Weights

► On the next page, we will compute the weights associated with each stratum.
► Each stratum's weight is proportional to n_i1 * n_i2 / (n_i1 + n_i2), the product of the two group sizes in that row divided by their sum. This is half the harmonic mean of the two group sizes.

Page 40

Compute Weights

► wnum = n[,#]/ndot;
► wden = wnum[+,];
► w = wnum/wden;

\[
w_i = \frac{\dfrac{n_{i1} n_{i2}}{n_{i1} + n_{i2}}}{\displaystyle\sum_{m=1}^{k} \dfrac{n_{m1} n_{m2}}{n_{m1} + n_{m2}}}
\]
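The weights can be reproduced with a short plain-Python sketch (not IML), again using the n matrix from the worked example on Page 50. Variable names mirror the IML names.

```python
# Patients per cell: rows are strata, columns are the two groups.
n = [[450, 440], [170, 200], [200, 180]]

# wnum = n[,#]/ndot: product of the two group sizes over their sum, per stratum.
wnum = [n1 * n2 / (n1 + n2) for n1, n2 in n]

# wden = wnum[+,]: the normalizing constant.
wden = sum(wnum)

# w = wnum/wden: weights that sum to 1.
w = [value / wden for value in wnum]

print([round(value, 6) for value in wnum])
print(round(wden, 5))
print([round(value, 7) for value in w])
```

Larger, better-balanced strata receive more weight; here the first stratum carries a little over half of the total.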

Page 41

Weighted Proportion of Responders (Each Group) and Variance

On the next page, we need to compute the proportion of patients in each treatment group (each column) responding to treatment, controlling for stratum (using the weights calculated previously).

Also, we compute the estimated variance of each estimate of the proportion.

Page 42

Weighted Proportion of Responders (Each Group) and Variance

pwt = (w#phat)[+,];

varpwt = (w#w#phat#(1-phat)/n)[+,];

\[
\hat{p}_{wj} = \sum_{i=1}^{k} w_i \hat{p}_{ij}
\]

\[
\hat{V}(\hat{p}_{wj}) = \sum_{i=1}^{k} w_i^2 \, \frac{\hat{p}_{ij}(1 - \hat{p}_{ij})}{n_{ij}}
\]
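A plain-Python sketch (not IML) of the weighted proportions and their variances follows, using the Page 50 data. It recomputes phat and w so that it runs on its own; the names mirror the IML names, and `groups` is a helper introduced only for this sketch.

```python
x = [[140, 180], [50, 90], [60, 60]]
n = [[450, 440], [170, 200], [200, 180]]

phat = [[xi / ni for xi, ni in zip(x_row, n_row)] for x_row, n_row in zip(x, n)]
wnum = [n1 * n2 / (n1 + n2) for n1, n2 in n]
w = [value / sum(wnum) for value in wnum]

groups = range(2)  # column 0 is the first group, column 1 the second

# pwt = (w#phat)[+,]: weighted proportion of responders per group.
pwt = [sum(wi * p_row[j] for wi, p_row in zip(w, phat)) for j in groups]

# varpwt = (w#w#phat#(1-phat)/n)[+,]: estimated variance of each weighted proportion.
varpwt = [
    sum(wi * wi * p_row[j] * (1 - p_row[j]) / n_row[j]
        for wi, p_row, n_row in zip(w, phat, n))
    for j in groups
]

print([round(value, 7) for value in pwt])
print([round(value, 7) for value in varpwt])
```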

Page 43

Compute the weighted difference of proportions between two groups, and the estimated variance

► We will compute on the next page the difference in proportions between the two treatment groups (column 2 minus column 1).
► To do this, for each row, we calculate the unweighted difference between column 2 and column 1. Then, we multiply each difference by the weight associated with that row.
► Then we sum these terms, and get the point estimate.

Page 44

Compute the weighted difference of proportions between two groups, and the estimated variance

► deltahat = phat[,2] - phat[,1];
► deltawt = (w # deltahat)[+,];

\[
\hat{\delta}_i = \hat{p}_{i2} - \hat{p}_{i1}
\]

\[
\hat{\delta}_w = \sum_{i=1}^{k} w_i \hat{\delta}_i
\]

Page 45

Compute the weighted difference of proportions between two groups, and the estimated variance

► To compute confidence intervals, we will need the estimated variance of the point estimates.
► The variance of this point estimate is provided on the next page, and the confidence intervals on the page after that.

Page 46

Compute the weighted difference of proportions between two groups, and the estimated variance

► vardeltai = (phat#(1-phat) / n)[,+];
► vardeltaw = (w # w # vardeltai)[+,];

\[
\hat{V}(\hat{\delta}_i) = \frac{\hat{p}_{i1}(1 - \hat{p}_{i1})}{n_{i1}} + \frac{\hat{p}_{i2}(1 - \hat{p}_{i2})}{n_{i2}}
\]

\[
\hat{V}(\hat{\delta}_w) = \sum_{i=1}^{k} w_i^2 \, \hat{V}(\hat{\delta}_i)
\]
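The weighted difference and its variance can be cross-checked with the plain-Python sketch below (not IML), again on the Page 50 data. It is self-contained, so phat and w are recomputed; names mirror the IML names used in the worked example (delta, deltawt, vardelta, vardelw).

```python
x = [[140, 180], [50, 90], [60, 60]]
n = [[450, 440], [170, 200], [200, 180]]

phat = [[xi / ni for xi, ni in zip(x_row, n_row)] for x_row, n_row in zip(x, n)]
wnum = [n1 * n2 / (n1 + n2) for n1, n2 in n]
w = [value / sum(wnum) for value in wnum]

# Per-stratum difference: second column minus first column.
delta = [p2 - p1 for p1, p2 in phat]

# Weighted difference across strata.
deltawt = sum(wi * di for wi, di in zip(w, delta))

# Per-stratum variance of the difference: sum of the two binomial variances.
vardelta = [
    sum(p * (1 - p) / ni for p, ni in zip(p_row, n_row))
    for p_row, n_row in zip(phat, n)
]

# Variance of the weighted difference.
vardelw = sum(wi * wi * vi for wi, vi in zip(w, vardelta))

print(round(deltawt, 7), round(vardelw, 7))
```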

Page 47

Confidence intervals for proportion of each treatment group, and for the difference in proportions

► pwtcil = pwt - 1.96*sqrt(varpwt);
► pwtciu = pwt + 1.96*sqrt(varpwt);
► delwcil = deltawt - 1.96*sqrt(vardeltaw);
► delwciu = deltawt + 1.96*sqrt(vardeltaw);

\[
\hat{p}_{wj} \pm z_{\alpha/2} \sqrt{\hat{V}(\hat{p}_{wj})}
\]

\[
\hat{\delta}_w \pm z_{\alpha/2} \sqrt{\hat{V}(\hat{\delta}_w)}
\]
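A plain-Python sketch (not IML) of the interval for the weighted difference is given below, using the Page 50 data and the same 1.96 multiplier as the slides. `Z` and `half_width` are names introduced for this sketch only.

```python
from math import sqrt

x = [[140, 180], [50, 90], [60, 60]]
n = [[450, 440], [170, 200], [200, 180]]
Z = 1.96  # two-sided 95% normal quantile, as used on the slide

phat = [[xi / ni for xi, ni in zip(x_row, n_row)] for x_row, n_row in zip(x, n)]
wnum = [n1 * n2 / (n1 + n2) for n1, n2 in n]
w = [value / sum(wnum) for value in wnum]

# Weighted difference and its estimated variance.
deltawt = sum(wi * (p2 - p1) for wi, (p1, p2) in zip(w, phat))
vardelw = sum(
    wi * wi * sum(p * (1 - p) / ni for p, ni in zip(p_row, n_row))
    for wi, p_row, n_row in zip(w, phat, n)
)

# Normal-approximation confidence limits.
half_width = Z * sqrt(vardelw)
delwcil = deltawt - half_width
delwciu = deltawt + half_width

print(round(delwcil, 7), round(delwciu, 7))
```

The interval excludes zero for this data set, which is consistent with the small p-value reported at the end of the worked example.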

Page 48

p-values for testing the hypothesis

► The chi-square statistic on the next page matches the CMH statistic generated by PROC FREQ, except for a factor of (n-1)/n within each stratum.
► That is, each stratum's variance term here divides by ndot, where the CMH statistic divides by (ndot - 1).

Page 49

p-values for testing the hypothesis

► chictmp = ((n[,#] # deltahat) / ndot)[+,];
► chicnum = chictmp ** 2;
► chicden = (n[,#] # pbar # (1-pbar) / ndot)[+,];
► chic2 = chicnum/chicden;

\[
\chi^2_c = \frac{\left[ \displaystyle\sum_{i=1}^{k} \frac{n_{i1} n_{i2}}{n_{i\cdot}} \left( \hat{p}_{i2} - \hat{p}_{i1} \right) \right]^2}{\displaystyle\sum_{i=1}^{k} \frac{n_{i1} n_{i2}}{n_{i\cdot}} \, \bar{p}_i \left( 1 - \bar{p}_i \right)}
\]
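The statistic and its p-value can be cross-checked in plain Python (not IML) with the sketch below, using the Page 50 data. One assumption to flag: instead of SAS's probchi, the upper tail of a chi-square with 1 degree of freedom is computed as erfc(sqrt(x/2)), which is the same quantity written with the complementary error function.

```python
from math import erfc, sqrt

x = [[140, 180], [50, 90], [60, 60]]
n = [[450, 440], [170, 200], [200, 180]]

ndot = [sum(n_row) for n_row in n]
delta = [x2 / n2 - x1 / n1 for (x1, x2), (n1, n2) in zip(x, n)]
# Pooled proportion per stratum; n#phat equals x, so this is sum(x) / sum(n).
pbar = [sum(x_row) / nd for x_row, nd in zip(x, ndot)]
# n[,#]/ndot: product of the two group sizes over the stratum total.
wnum = [n1 * n2 / nd for (n1, n2), nd in zip(n, ndot)]

chictmp = sum(wn * di for wn, di in zip(wnum, delta))
chicnum = chictmp ** 2
chicden = sum(wn * pb * (1 - pb) for wn, pb in zip(wnum, pbar))
chic2 = chicnum / chicden

# Upper-tail probability for chi-square with 1 df.
pchic2 = erfc(sqrt(chic2 / 2))

print(round(chic2, 6), round(pchic2, 7))
```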

Page 50

proc iml;
reset print;
x = {140 180, 50 90, 60 60};
n = {450 440, 170 200, 200 180};

X 3 rows 2 cols (numeric)

140 180
50 90
60 60

N 3 rows 2 cols (numeric)

450 440
170 200
200 180

Page 51

/* ndot: sum the rows of the n matrix */
ndot = n[,+];

NDOT 3 rows 1 col (numeric)

890
370
380

/* phat: point estimates, proportion of responders in each cell */
phat = x/n;

PHAT 3 rows 2 cols (numeric)

0.3111111 0.4090909
0.2941176 0.45
0.3 0.3333333

Page 52

/* pbar: proportion of responders in each row, weighted by size */
pbar = (n#phat)[,+]/ndot;

PBAR 3 rows 1 col (numeric)

0.3595506
0.3783784
0.3157895

/* difference in proportions: column 2 (group 1) minus column 1 (group 0) */
delta = phat[,2] - phat[,1];

DELTA 3 rows 1 col (numeric)

0.0979798
0.1558824
0.0333333

Page 53

/* Compute weights. */
/* wnum is, for each row, the product of the two columns divided by their sum */
/* wden is the sum of the column vector wnum */
/* w is wnum normalized to sum to 1: each row of wnum divided by wden */
wnum = n[,#]/ndot;
wden = wnum[+,];
w = wnum/wden;

WNUM 3 rows 1 col (numeric)

222.47191
91.891892
94.736842

WDEN 1 row 1 col (numeric)

409.10064

W 3 rows 1 col (numeric)

0.5438073
0.2246193
0.2315734

Page 54

/* pwt: multiply each row of phat by that row's weight (w is a 3 x 1 column), then sum down each column to get the weighted proportion for each group */
/* varpwt is the corresponding variance of pwt */
pwt = (w#phat)[+,];
varpwt = (w#w#phat#(1-phat)/n)[+,];

PWT 1 row 2 cols (numeric)

0.304721 0.4007364

VARPWT 1 row 2 cols (numeric)

0.0002588 0.0002911

Page 55

/* delta is a 3 x 1, the differences in each row. multiply by w, a 3 x 1. deltawt is the sum */
/* vardelta is the variance of the difference of each row */
/* vardelw is the variance of the weighted difference */
deltawt = (w#delta)[+,];
vardelta = (phat#(1-phat)/n)[,+];
vardelw = (w#w#vardelta)[+,];

DELTAWT 1 row 1 col (numeric)

0.0960154

VARDELTA 3 rows 1 col (numeric)

0.0010257
0.0024587
0.0022846

VARDELW 1 row 1 col (numeric)

0.0005499

Page 56

/* Lower Confidence Limits for each of the column proportions */
pwtctil = pwt - 1.96*sqrt(varpwt);

/* Upper Confidence Limits for each of the column proportions */
pwtctiu = pwt + 1.96*sqrt(varpwt);

PWTCTIL 1 row 2 cols (numeric)

0.2731918 0.3672948

PWTCTIU 1 row 2 cols (numeric)

0.3362502 0.4341781

Page 57

/* Lower, Upper Limits for deltawt, the difference between the two groups */
delwcil = deltawt - 1.96*sqrt(vardelw);
delwciu = deltawt + 1.96*sqrt(vardelw);

DELWCIL 1 row 1 col (numeric)

0.0500542

DELWCIU 1 row 1 col (numeric)

0.1419766

Page 58

/* ChiSquare Computations */
chictmp = (n[,#]#delta/ndot)[+,];
chicnum = chictmp**2;
chicden = (n[,#]#pbar#(1-pbar)/ndot)[+,];
chic2 = chicnum/chicden;
pchic2 = 1-probchi(chic2,1);

CHICTMP 1 row 1 col (numeric)
39.279972

CHICNUM 1 row 1 col (numeric)
1542.9162

CHICDEN 1 row 1 col (numeric)
93.312668

CHIC2 1 row 1 col (numeric)
16.534906

PCHIC2 1 row 1 col (numeric)
0.0000478

Page 59

Other Uses

► Graphics.
► IML Studio can submit FORTRAN, C++, and R statements
    Formerly Stat Studio
    Which was formerly IML Workshop

Page 60

For more reference

► See the SAS/IML User's Guide, or the online documentation.
► Mehrotra, D., and Railkar, R. Minimum risk weights for comparing treatments in stratified binomial trials. Statistics in Medicine 2000; 19: 811-825.
► Graybill, F. Theory and Application of the General Linear Model. Wadsworth & Brooks: Pacific Grove, 1976.

Page 61

Summary

► IML is a powerful tool for data analysis, statistics, and graphics.
► Consider IML instead of DATA step programming when the opportunity presents itself.
    Not always the best alternative, but considering it helps inform the decision.

Page 62

Questions?