![Page 1: Score Tests in Semiparametric Models Raymond J. Carroll Department of Statistics Faculties of Nutrition and Toxicology Texas A&M University carroll](https://reader035.vdocuments.us/reader035/viewer/2022062314/56649d595503460f94a39825/html5/thumbnails/1.jpg)
Score Tests in Semiparametric Models
Raymond J. Carroll
Department of Statistics
Faculties of Nutrition and Toxicology
Texas A&M University
http://stat.tamu.edu/~carroll
Papers available at my web site
Texas is surrounded on all sides by foreign countries: Mexico to the
south and the United States to the east, west and north
College Station, home of Texas A&M University
I-35
I-45
Big Bend National Park
Wichita Falls, Wichita Falls, that’s my hometown
West Texas
Palo Duro Canyon, the Grand Canyon of Texas
Guadalupe Mountains National Park
East Texas
Palo Duro Canyon of the Red River
Co-Authors
Arnab Maity
Co-Authors
Nilanjan Chatterjee
Co-Authors
Kyusang Yu
Enno Mammen
Outline
• Parametric Score Tests
• Straightforward extension to semiparametric models
• Profile Score Testing
• Gene-Environment Interactions
• Repeated Measures
Parametric Models
• Parametric Score Tests
• Parameter of interest = $\beta$
• Nuisance parameter = $\theta$
• Interested in testing whether $\beta = 0$
• Log-likelihood function = $L(Y, X, Z, \beta, \theta)$
Parametric Models
• Score tests are convenient when it is easy to maximize the null loglikelihood
$$\sum_{i=1}^{n} L(Y_i, X_i, Z_i, 0, \theta)$$
• But hard to maximize the entire loglikelihood
$$\sum_{i=1}^{n} L(Y_i, X_i, Z_i, \beta, \theta)$$
Parametric Models
• Let $\hat\theta(\beta)$ be the MLE of $\theta$ for a given value of $\beta$
• Let subscripts denote derivatives
• Then the normalized score test statistic is just
$$S = n^{-1/2} \sum_{i=1}^{n} L_\beta\{Y_i, X_i, Z_i, 0, \hat\theta(0)\}$$
Parametric Models
• Let $\mathcal{I}$ be the Fisher information evaluated at $\beta = 0$, with sub-matrices such as $\mathcal{I}_{\beta\theta}$
• Then using likelihood properties, the score statistic under the null hypothesis is asymptotically equivalent to
$$n^{-1/2} \sum_{i=1}^{n} \left[ L_\beta\{Y_i, X_i, Z_i, 0, \theta\} - \mathcal{I}_{\beta\theta} \mathcal{I}_{\theta\theta}^{-1} L_\theta\{Y_i, X_i, Z_i, 0, \theta\} \right]$$
Parametric Models
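• The asymptotic variance of the score statistic is
$$T = \mathcal{I}_{\beta\beta} - \mathcal{I}_{\beta\theta} \mathcal{I}_{\theta\theta}^{-1} \mathcal{I}_{\theta\beta}$$
• Remember, all computed at the null $\beta = 0$
• Under the null $\beta = 0$, if $\beta$ has dimension $p$, then
$$S^\top T^{-1} S \Rightarrow \chi^2_p$$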
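The recipe on the last three slides can be sketched in a few lines. The sketch below assumes a specific parametric model that the slides do not name: logistic regression with pr(Y = 1) = H(Xᵀβ + Wᵀθ), where `w` holds an intercept and Z. The function names `fit_null_logistic` and `score_test` are hypothetical. Only the null model (β = 0) is ever fitted.

```python
import numpy as np


def expit(u):
    """Logistic distribution function H(u)."""
    return 1.0 / (1.0 + np.exp(-u))


def fit_null_logistic(y, w, n_iter=25):
    """Newton-Raphson MLE of theta in pr(Y=1) = H(w @ theta), i.e. with beta = 0."""
    theta = np.zeros(w.shape[1])
    for _ in range(n_iter):
        p = expit(w @ theta)
        grad = w.T @ (y - p)                           # sum of L_theta
        hess = (w * (p * (1.0 - p))[:, None]).T @ w    # minus sum of L_{theta theta}
        theta = theta + np.linalg.solve(hess, grad)
    return theta


def score_test(y, x, w):
    """Return (S' T^{-1} S, S, T) for H0: beta = 0 in the assumed logistic model."""
    n = len(y)
    theta_hat = fit_null_logistic(y, w)
    p = expit(w @ theta_hat)
    v = p * (1.0 - p)
    # S = n^{-1/2} sum_i L_beta{Y_i, X_i, Z_i, 0, theta_hat(0)}
    s = x.T @ (y - p) / np.sqrt(n)
    # Blocks of the Fisher information, estimated at the null fit
    i_bb = (x * v[:, None]).T @ x / n
    i_bt = (x * v[:, None]).T @ w / n
    i_tt = (w * v[:, None]).T @ w / n
    # T = I_bb - I_bt I_tt^{-1} I_tb
    t = i_bb - i_bt @ np.linalg.solve(i_tt, i_bt.T)
    stat = float(s @ np.linalg.solve(t, s))
    return stat, s, t
```

The returned statistic would be compared with a chi-squared quantile with p degrees of freedom, for example 5.99 at the 5% level when p = 2.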
Parametric Models
• The key point about the score test is that all computations are done at the null hypothesis
• Thus, if maximizing the loglikelihood at the null is easy, the score test is easy to implement.
Semiparametric Models
• Now the loglikelihood has the form
$$L\{Y_i, X_i, \beta, \theta(Z_i)\}$$
• Here, $\theta(\cdot)$ is an unknown function. The obvious score statistic is
$$n^{-1/2} \sum_{i=1}^{n} L_\beta\{Y_i, X_i, 0, \hat\theta(Z_i, 0)\}$$
• Where $\hat\theta(Z_i, 0)$ is an estimate under the null
Semiparametric Models
• Estimating $\theta(\cdot)$ in a loglikelihood like
$$L\{Y_i, X_i, 0, \theta(Z_i)\}$$
• This is standard
• Kernel methods use local likelihood
• Splines use penalized loglikelihood
Simple Local Likelihood
• Let $K$ be a density function, and $h$ a bandwidth
• Your target is the function at $z$
• The kernel weights for local likelihood are
$$K\left(\frac{Z_i - z}{h}\right)$$
• If $K$ is the uniform density, only observations within $h$ of $z$ get any weight
Simple Local Likelihood
Only observations within h = 0.25 of x = -1.0 get any weight
Simple Local Likelihood
• Near $z$, the function should be nearly linear
• The idea then is to do a likelihood estimate local to $z$ via weighting, i.e., maximize
$$\sum_{i=1}^{n} K\left(\frac{Z_i - z}{h}\right) L\{Y_i, X_i, 0, \alpha_0 + \alpha_1 (Z_i - z)\}$$
• Then announce $\hat\theta(z) = \hat\alpha_0$
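A minimal sketch of this local-likelihood step follows. It assumes a binary Y with logistic H and no X in the null model, which matches the null model of the later example but is not the only case the slides cover. The names `uniform_kernel` and `local_linear_logistic` are hypothetical.

```python
import numpy as np


def expit(u):
    """Logistic distribution function H(u)."""
    return 1.0 / (1.0 + np.exp(-u))


def uniform_kernel(u):
    """Uniform density on [-1, 1]: weight 1/2 inside the window, 0 outside."""
    return 0.5 * (np.abs(u) <= 1.0)


def local_linear_logistic(z0, y, z, h, kernel=uniform_kernel, n_iter=50):
    """Maximize sum_i K((Z_i - z0)/h) * loglik{Y_i; a0 + a1 (Z_i - z0)} and return a0.

    The returned intercept a0 is the local-likelihood estimate of theta(z0).
    """
    k = kernel((z - z0) / h)                               # kernel weights
    d = np.column_stack([np.ones_like(z), z - z0])         # local linear design
    a = np.zeros(2)
    for _ in range(n_iter):
        p = expit(d @ a)
        grad = d.T @ (k * (y - p))                         # weighted score
        hess = (d * (k * p * (1.0 - p))[:, None]).T @ d    # weighted information
        step = np.linalg.solve(hess, grad)
        a = a + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return a[0]
```

Calling `local_linear_logistic` on a grid of `z0` values traces out the fitted curve. With the uniform kernel the window must contain enough observations of both outcomes for the Newton step to be well defined.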
Simple Local Likelihood
• It is well-known that the optimal bandwidth is
$$h \propto n^{-1/5}$$
• The bandwidth can be estimated from data using such things as cross-validation
Score Test Problem
• The score statistic is
$$S = n^{-1/2} \sum_{i=1}^{n} L_\beta\{Y_i, X_i, 0, \hat\theta(Z_i, 0)\}$$
• Unfortunately, when $h \propto n^{-1/5}$ this statistic is no longer asymptotically normally distributed with mean zero
• The asymptotic test level = 1!
Score Test Problem
• The problem can be fixed up in an ad hoc way by setting
$$h \propto n^{-1/3}$$
• This defeats the point of the score test, which is to use standard methods, not ad hoc ones.
Profiling in Semiparametrics
• In profile methods, one does a series of steps
• For every $\beta$, estimate the function by using local likelihood to maximize
$$\sum_{i=1}^{n} K\left(\frac{Z_i - z}{h}\right) L\{Y_i, X_i, \beta, \alpha_0 + \alpha_1 (Z_i - z)\}$$
• Call it $\hat\theta(z, \beta)$
Profiling in Semiparametrics
• Then maximize the semiparametric profile loglikelihood
$$n^{-1/2} \sum_{i=1}^{n} L\{Y_i, X_i, \beta, \hat\theta(Z_i, \beta)\}$$
• Often difficult to do the maximization, hence the need to do score tests
Profiling in Semiparametrics
• The semiparametric profile loglikelihood has many of the same features as profiling does in parametric problems.
• The key feature is that it is a projection, so that it is orthogonal to the score for $\theta(Z)$, or to any function of $Z$ alone.
Profiling in Semiparametrics
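• The semiparametric profile score is
$$n^{-1/2} \sum_{i=1}^{n} \frac{\partial}{\partial\beta} L\{Y_i, X_i, \beta, \hat\theta(Z_i, \beta)\}\Big|_{\beta = 0} \approx n^{-1/2} \sum_{i=1}^{n} \left[ L_\beta\{Y_i, X_i, 0, \hat\theta(Z_i, 0)\} + L_\theta\{Y_i, X_i, 0, \hat\theta(Z_i, 0)\} \frac{\partial}{\partial\beta} \hat\theta(Z_i, \beta)\Big|_{\beta = 0} \right]$$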
Profiling in Semiparametrics
• The problem is to compute
$$\frac{\partial}{\partial\beta} \hat\theta(Z_i, \beta)\Big|_{\beta = 0}$$
• Without doing profile likelihood!
Profiling in Semiparametrics
• The definition of local likelihood is that for every $\beta$,
$$0 = E\left[ L_\theta\{Y, X, \beta, \theta(Z, \beta)\} \mid Z = z \right]$$
• Differentiate with respect to $\beta$.
Profiling in Semiparametrics
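• Then
$$\frac{\partial}{\partial\beta} \hat\theta(Z, 0) = -\frac{E\left[ L_{\beta\theta}\{Y, X, 0, \theta(Z, 0)\} \mid Z = z \right]}{E\left[ L_{\theta\theta}\{Y, X, 0, \theta(Z, 0)\} \mid Z = z \right]}$$
• Algorithm: Estimate numerator and denominator by nonparametric regression
• All done at the null model!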
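The step from the local-likelihood identity to this ratio is one application of the chain rule. It is written out below for the population function $\theta(z, \beta)$, assuming enough smoothness to exchange differentiation and conditional expectation (an assumption the slides do not state):

```latex
\begin{aligned}
0 &= \frac{\partial}{\partial\beta}
     E\left[ L_\theta\{Y, X, \beta, \theta(Z, \beta)\} \mid Z = z \right] \\
  &= E\left[ L_{\beta\theta}\{Y, X, \beta, \theta(Z, \beta)\} \mid Z = z \right]
   + E\left[ L_{\theta\theta}\{Y, X, \beta, \theta(Z, \beta)\} \mid Z = z \right]
     \frac{\partial}{\partial\beta} \theta(z, \beta), \\
\frac{\partial}{\partial\beta} \theta(z, \beta)\Big|_{\beta = 0}
  &= -\frac{E\left[ L_{\beta\theta}\{Y, X, 0, \theta(Z, 0)\} \mid Z = z \right]}
           {E\left[ L_{\theta\theta}\{Y, X, 0, \theta(Z, 0)\} \mid Z = z \right]}.
\end{aligned}
```

Both conditional expectations on the right involve only quantities evaluated at $\beta = 0$, which is why no profile fit is needed.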
Results
• There are two things to estimate at the null model
$$\hat\theta(Z, 0), \qquad \frac{\partial}{\partial\beta} \hat\theta(Z, 0) = \hat\theta_\beta(Z, 0)$$
• Any method can be used without affecting the asymptotic properties
• Not true without profiling
Results
• We have implemented the test in some cases using the following methods:
  • Kernels
  • Splines from gam in Splus
  • Splines from R
  • Penalized regression splines
• All results are similar, as they should be: because we have projected and profiled, the method of fitting does not matter
Results
• The null distribution of the score test is asymptotically the same as if the following were known
$$\theta(Z), \qquad \frac{\partial}{\partial\beta} \theta(Z, 0) = \theta_\beta(Z, 0)$$
Results
• This means its variance is the same as the variance of
$$n^{-1/2} \sum_{i=1}^{n} \left[ L_\beta\{Y_i, X_i, 0, \theta(Z_i)\} + L_\theta\{Y_i, X_i, 0, \theta(Z_i)\}\, \theta_\beta(Z_i, 0) \right]$$
• This is trivial to estimate
• If you use different methods, the asymptotic variance may differ
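Putting the pieces together, here is a hedged sketch of the profiled score test, fitted entirely at the null. It assumes the model pr(Y = 1 | X, Z) = H{Xᵀβ + θ(Z)} with logistic H, for which L_βθ = -X H(1 - H) and L_θθ = -H(1 - H). For brevity it swaps the local linear fit of the slides for a local-constant (Nadaraya-Watson) smoother with a Gaussian kernel. The bandwidth `h` and the function names are hypothetical.

```python
import numpy as np


def nw_weights(z, h):
    """Row-normalized Gaussian kernel weights; row i smooths toward Z_i."""
    k = np.exp(-0.5 * ((z[:, None] - z[None, :]) / h) ** 2)
    return k / k.sum(axis=1, keepdims=True)


def profiled_score_test(y, x, z, h):
    """Return (S' T^{-1} S, S, T) for H0: beta = 0, using only null-model fits."""
    n = len(y)
    w = nw_weights(z, h)
    # Null fit: local-constant likelihood gives H{theta_hat(Z_i, 0)} = smoothed mean of Y
    p = w @ y
    v = p * (1.0 - p)
    # theta_beta(z) = -E[L_{beta theta} | Z = z] / E[L_{theta theta} | Z = z],
    # with numerator and denominator each estimated by nonparametric regression
    theta_beta = -(w @ (x * v[:, None])) / (w @ v)[:, None]
    # Profiled score terms: L_beta + L_theta * theta_beta, all at beta = 0
    psi = (y - p)[:, None] * (x + theta_beta)
    s = psi.sum(axis=0) / np.sqrt(n)
    # Variance estimate: sample second moment of the profiled score terms
    t = psi.T @ psi / n
    stat = float(s @ np.linalg.solve(t, s))
    return stat, s, t
```

As in the parametric case, the statistic would be referred to a chi-squared distribution with p degrees of freedom. Any other smoother could replace `nw_weights` in this sketch.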
Results
• With this substitution, the semiparametric score test requires no undersmoothing
• Any method works
• How does one do undersmoothing for a spline or an orthogonal series?
Results
• Finally, the method is a locally semiparametric efficient test for the null hypothesis
• The power is likewise unaffected: the method of nonparametric regression that you use does not matter
Example
• Colorectal adenoma: a precursor of colorectal cancer
• N-acetyltransferase 2 (NAT2): plays an important role in detoxification of certain aromatic carcinogens present in cigarette smoke
• Case-control study of colorectal adenoma
• Association between colorectal adenoma and the candidate gene NAT2 in relation to smoking history.
Example
• Y = colorectal adenoma
• X = genetic information (below)
• Z = years since stopping smoking
More on the Genetics
• Subjects were genotyped for six known functional SNPs related to NAT2 acetylation activity
• Genotype data were used to construct diplotype information, i.e., the pair of haplotypes the subjects carried along their pair of homologous chromosomes
More on the Genetics
• We identified the 14 most common diplotypes
• We ran analyses on the k most common ones, for k = 1,…,14
The Model
• The model is a version of what is done in genetics, namely for arbitrary $\gamma$,
$$\mathrm{pr}(Y = 1 \mid X, Z) = H\{X^\top\beta + \theta(Z) + \gamma X^\top\beta\, \theta(Z)\}$$
• The interest is in the genetic effects, so we want to know whether $\beta = 0$
• However, we want more power if there are interactions
The Model
• For the moment, pretend $\gamma$ is fixed
$$\mathrm{pr}(Y = 1 \mid X, Z) = H\{X^\top\beta + \theta(Z) + \gamma X^\top\beta\, \theta(Z)\}$$
• This is an excellent example of why score testing is useful: the model is very difficult to fit numerically
• With extensions to such things as longitudinal data and additive models, it is nearly impossible to fit
The Model
• Note however that under the null, the model is simple nonparametric logistic regression
$$\mathrm{pr}(Y = 1 \mid X, Z) = H\{\theta(Z)\}$$
• Our methods only require fits under this simple null model
The Method
$$\mathrm{pr}(Y = 1 \mid X, Z) = H\{X^\top\beta + \theta(Z) + \gamma X^\top\beta\, \theta(Z)\}$$
• The parameter $\gamma$ is not identified at the null
• However, the derivative of the loglikelihood evaluated at the null depends on $\gamma$
• Thus, the score statistic $S_n(\gamma)$ depends on $\gamma$
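The dependence on $\gamma$ can be seen directly. The following worked version assumes $H$ is the logistic distribution function, consistent with the null model being nonparametric logistic regression, and writes $\eta$ for the linear predictor:

```latex
\begin{aligned}
L(Y, X, \beta, \theta) &= Y\eta - \log\{1 + \exp(\eta)\},
  \qquad \eta = X^\top\beta\,\{1 + \gamma\theta(Z)\} + \theta(Z), \\
L_\beta(Y, X, \beta, \theta) &= X\{1 + \gamma\theta(Z)\}\{Y - H(\eta)\}, \\
L_\beta(Y, X, 0, \theta) &= X\{1 + \gamma\theta(Z)\}\left[ Y - H\{\theta(Z)\} \right].
\end{aligned}
```

At $\beta = 0$ the loglikelihood itself is free of $\gamma$, but its derivative in $\beta$ carries the factor $1 + \gamma\theta(Z)$.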
The Method
• Our theory gives a linear expansion and an easily calculated covariance matrix for each $\gamma$
$$S_n(\gamma) = n^{-1/2} \sum_{i=1}^{n} \Psi_i(\gamma) + o_p(n^{-1/2})$$
$$\mathrm{cov}\{S_n(\gamma)\} \to T(\gamma)$$
• The statistic $S_n(\gamma)$ as a process in $\gamma$ converges weakly to a Gaussian process
The Method
• Following Chatterjee, et al. (AJHG, 2006), the overall test statistic is taken as
$$\Omega_n = \max_{a \le \gamma \le c} \left\{ S_n^\top(\gamma)\, T^{-1}(\gamma)\, S_n(\gamma) \right\}$$
• $(a, c)$ are arbitrary, but we take them as $(-3, 3)$
Critical Values
• Critical values are easy to obtain via simulation
• Let $b = 1, \ldots, B$, and let
$$N_{ib} = \mathrm{Normal}(0, 1)$$
• Recall
$$S_n(\gamma) = n^{-1/2} \sum_{i=1}^{n} \Psi_i(\gamma) + o_p(n^{-1/2})$$
• By the weak convergence, this has the same limit distribution as (with estimates under the null)
$$S_{bn}(\gamma) = n^{-1/2} \sum_{i=1}^{n} \hat\Psi_i(\gamma)\, N_{ib}$$
in the simulated world
Critical Values
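• This means that the following have the same limit distributions under the null, with $N_{ib} = \mathrm{Normal}(0, 1)$
$$\Omega_{bn} = \max_{a \le \gamma \le c} \left\{ S_{bn}^\top(\gamma)\, T^{-1}(\gamma)\, S_{bn}(\gamma) \right\}$$
$$\Omega_n = \max_{a \le \gamma \le c} \left\{ S_n^\top(\gamma)\, T^{-1}(\gamma)\, S_n(\gamma) \right\}$$
• This means you just simulate $\Omega_{bn}$ a lot of times to get the null critical value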
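The simulation of critical values can be sketched as follows. The sketch assumes the estimated influence terms have already been computed on a finite grid of γ values in (a, c) and stored in a hypothetical array `psi` of shape (n, G, p). It also assumes T(γ) is estimated by the sample second moment of those terms. All function names are illustrative.

```python
import numpy as np


def estimate_t(psi):
    """T_hat(gamma) = n^{-1} sum_i psi_i(gamma) psi_i(gamma)^T for each grid point."""
    n = psi.shape[0]
    return np.einsum("igp,igq->gpq", psi, psi) / n


def max_quadratic_form(s, t_inv):
    """Maximum over the gamma grid of S(gamma)^T T^{-1}(gamma) S(gamma)."""
    return float(np.max(np.einsum("gp,gpq,gq->g", s, t_inv, s)))


def simulate_critical_value(psi, t, level=0.05, n_sim=2000, seed=0):
    """Multiplier simulation of the null distribution of the max statistic.

    psi : (n, G, p) estimated influence terms on a grid of G gamma values
    t   : (G, p, p) covariance estimates, one per grid point
    Returns the (1 - level) quantile and all simulated maxima.
    """
    rng = np.random.default_rng(seed)
    n = psi.shape[0]
    t_inv = np.linalg.inv(t)                  # batched inverse over the grid
    sims = np.empty(n_sim)
    for b in range(n_sim):
        mult = rng.standard_normal(n)         # N_ib ~ Normal(0, 1)
        # S_bn(gamma) = n^{-1/2} sum_i psi_hat_i(gamma) N_ib
        s_b = np.einsum("i,igp->gp", mult, psi) / np.sqrt(n)
        sims[b] = max_quadratic_form(s_b, t_inv)
    return float(np.quantile(sims, 1.0 - level)), sims
```

The observed statistic is computed with `max_quadratic_form` applied to the observed scores on the same grid, and the null is rejected when it exceeds the simulated quantile.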
Simulation
• We did a simulation under a more complex model (theory easily extended)
$$\mathrm{pr}(Y = 1 \mid X, Z) = H\{S^\top\eta + X^\top\beta + \theta(Z) + \gamma X^\top\beta\, \theta(Z)\}$$
• Here X = independent BVN, variances = 1, and with means given as
$$\beta = c\,(1, 1)^\top, \qquad c = 0, 0.01, \ldots, 0.15$$
• $c = 0$ is the null
Simulation
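$$\mathrm{pr}(Y = 1 \mid X, Z) = H\{S^\top\eta + X^\top\beta + \theta(Z) + \gamma X^\top\beta\, \theta(Z)\}$$
• In addition,
$$Z = \mathrm{Uniform}[-2, 2], \qquad \theta(z) = \sin(2z), \qquad S = \mathrm{Normal}(0, 1), \qquad \eta = 1, \qquad -3 \le \gamma \le 3$$
• We varied the true values as
$$\gamma_{\mathrm{true}} = 0, 1, 2$$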
Power Simulation
Simulation Summary
• The test maintains its Type I error
• Little loss of power compared to the test that assumes no interaction, when there is in fact no interaction
• Great gain in power when there is interaction
• Results here were for kernels: almost numerically identical for penalized regression splines
NAT2 Example
• Case-control study with 700 cases and 700 controls
• As stated before, there were 14 common diplotypes
• Our X was the design matrix for the k most common, k = 1,2,…,14
NAT2 Example
• Z was years since stopping smoking
• Co-factors S were age and gender
• The model is slightly more complex because of the non-smokers (Z=0), but those details are hidden here
NAT2 Example Results
NAT2 Example Results
• Stronger evidence of genetic association seen with the new model
• For example, with 12 diplotypes, our p-value was 0.036, while the usual method gave 0.214
Extensions: Repeated Measures
• We have extended the results to repeated measures models
• If there are $J$ repeated measures, the loglikelihood is
$$L\{Y_{i1}, \ldots, Y_{iJ}, X_{i1}, \ldots, X_{iJ}, \beta, \theta(Z_{i1}), \ldots, \theta(Z_{iJ})\}$$
• Note: one function, but evaluated multiple times
Extensions: Repeated Measures
• If there are $J$ repeated measures, the loglikelihood is
$$L\{Y_{i1}, \ldots, Y_{iJ}, X_{i1}, \ldots, X_{iJ}, \beta, \theta(Z_{i1}), \ldots, \theta(Z_{iJ})\}$$
• There is no straightforward kernel method for this
• Wang (2003, Biometrika) gave a solution in the Gaussian case with no parameters
• Lin and Carroll (2006, JRSSB) gave the efficient profile solution in the general case including parameters
Extensions: Repeated Measures
• It is straightforward to write out a profiled score at the null for this loglikelihood
$$L\{Y_{i1}, \ldots, Y_{iJ}, X_{i1}, \ldots, X_{iJ}, \beta, \theta(Z_{i1}), \ldots, \theta(Z_{iJ})\}$$
• The form is the same as in the non-repeated measures case: a projection of the score for $\beta$ onto the score for $\theta(\cdot)$
Extensions: Repeated Measures
• Here the estimation of
$$\frac{\partial}{\partial\beta} \theta(Z_i, \beta)\Big|_{\beta = 0}$$
is not trivial because it is the solution of a complex integral equation
Extensions: Repeated Measures
• Using the Wang (2003, Biometrika) method of nonparametric regression using kernels, we have figured out a way to estimate
$$\frac{\partial}{\partial\beta} \theta(Z_i, \beta)\Big|_{\beta = 0}$$
• This solution is the heart of a new paper (Maity, Carroll, Mammen and Chatterjee, JRSSB, 2009)
Extensions: Repeated Measures
• The result is a score based method: it is based entirely on the null model and does not need to fit the profile model
• It is a projection, so any estimation method can be used, not just kernels
• There is an equally impressive extension to testing genetic main effects in the possible presence of interactions
Extensions: Nuisance Parameters
• Nuisance parameters are easily handled with a small change of notation
Extensions: Additive Models
• We have developed a version of this for the case of repeated measures with additive models in the nonparametric part
$$Y_{ij} = X_{ij}^\top\beta + \sum_{d=1}^{D} \theta_d(Z_{ijd}) + \epsilon_{ij}$$
$$(\epsilon_{i1}, \ldots, \epsilon_{iJ})^\top = [0, \Sigma]$$
Extensions: Additive Models
• The additive model method uses smooth backfitting (see multiple papers by Park, Yu and Mammen)
Summary
• Score testing is a powerful device in parametric problems.
• It is generally computationally easy
• It is equivalent to projecting the score for $\beta$ onto the score for the nuisance parameters
Summary
• We have generalized score testing from parametric problems to a variety of semiparametric problems
• This involved a reformulation using the semiparametric profile method
• It is equivalent to projecting the score for $\beta$ onto the score for $\theta(\cdot)$
• The key was to compute this projection while doing everything at the null model
Summary
• Our approach avoided artificialities such as ad hoc undersmoothing
• It is semiparametric efficient
• Any smoothing method can be used, not just kernels
• Multiple extensions were discussed