Package ‘bstats’
February 15, 2013
Version 1.0-12-3
Date 2011-10-31
Title Basic statistical functions for R
Author Bin Wang <[email protected]>.
Maintainer Bin Wang <[email protected]>
Description This package collects commonly used procedures or algorithms for general data analysis. In addition, routines for linear regression analysis, statistical computing and graphics, and many others have been implemented in R for some courses taught at the University of South Alabama.
License Unlimited
Repository CRAN
Date/Publication 2011-12-04 09:26:34
NeedsCompilation yes
R topics documented:
ac
birth
bptest
bstats
dw.test
edf
edu75
influential.plot
ld50.logit
ld50.logitfit
lm.ci
mediation.test
model.check
model.test
oddsratio
predictor.plot
residual.plot
river
scb
supervisor
vif
white.test
wls
Index
ac Autocorrelation
Description
Removal of autocorrelation by transformation.
Usage
ac(lmobj,type='cochrane', ...)
## S3 method for class 'lm'
ac(lmobj,type='cochrane', ...)
Arguments
lmobj an object that inherits from class lm, such as an lm or glm object.
type method selection: ’iterative’, ’cochrane’.
... not used.
Details
’iterative’: simultaneously estimate the regression coefficients and rho by minimizing the sum of squared errors. A grid search is used.
’cochrane’:
1. Fit a linear regression model and compute the OLS estimates.
2. Calculate the residuals and use them to estimate rho.
3. Fit the transformed model (1) to obtain estimates of the regression coefficients.
4. Check whether autocorrelation still exists. If it does, repeat, using the estimated coefficients from step 3 in step 1.
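The Cochrane-Orcutt steps can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used by ac: the helper name cochrane_orcutt, the simulated data, the single-predictor model and the convergence rule are all hypothetical.

```r
# Illustrative Cochrane-Orcutt iteration for one predictor (hypothetical helper,
# not the bstats implementation).
set.seed(1)
n <- 200
x <- rnorm(n)
# Simulated AR(1) errors with rho = 0.6
e <- as.numeric(stats::filter(rnorm(n), 0.6, method = "recursive"))
y <- 1 + 2 * x + e

cochrane_orcutt <- function(y, x, max_iter = 20, tol = 1e-6) {
  b <- coef(lm(y ~ x))                  # step 1: OLS estimates
  rho <- 0
  n <- length(y)
  for (i in seq_len(max_iter)) {
    r <- y - b[1] - b[2] * x            # residuals on the original scale
    rho_new <- sum(r[-1] * r[-n]) / sum(r[-n]^2)   # step 2: estimate rho
    ys <- y[-1] - rho_new * y[-n]       # step 3: fit the transformed model
    xs <- x[-1] - rho_new * x[-n]
    bs <- coef(lm(ys ~ xs))
    b <- c(bs[1] / (1 - rho_new), bs[2])  # back-transform the intercept
    converged <- abs(rho_new - rho) < tol
    rho <- rho_new
    if (converged) break                # step 4: stop once rho stabilizes
  }
  list(coefficients = unname(b), rhohat = rho)
}

co <- cochrane_orcutt(y, x)
co
```

Here the loop stops when the estimate of rho stops changing; checking the Durbin-Watson statistic of the transformed fit instead would be closer to step 4 as worded above.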
Value
coefficients, rhohat, dwtest, re-fitted model.
Author(s)
Wang, B.
References
Cochrane and Orcutt (1949)
St 335 text
Examples
data(edu75)
lm0 = lm(Y~X1+X2+X3, data=edu75)
ac.lm(lm0,type='iterative')
ac.lm(lm0, type='cochrane')
birth Birth data
Description
Birth data for singleton live births with gestational age at least 38 weeks.
Usage
data(birth)
Format
A data frame with 400 observations on 9 variables.
Sex           character  ’male’ or ’female’
Gestation     numeric    Gestational age (in weeks).
Weight        numeric    birth weight.
Length        numeric    height.
Head          numeric    head size.
Chest         numeric    chest size.
Mother.s.age  numeric    mother’s age.
type          factor     ’r’ = rural or ’u’ = urban.
region        factor     region of the birth.
References
Wang, CSDA and JSS papers.
bptest Breusch-Pagan Test
Description
Performs the Breusch-Pagan test against heteroskedasticity.
Usage
bptest(formula, varformula = NULL, studentize = TRUE, data = list())
Arguments
formula a symbolic description for the model to be tested (or a fitted "lm" object).
varformula a formula describing only the potential explanatory variables for the variance (no dependent variable needed). By default the same explanatory variables are taken as in the main regression model.
studentize logical. If set to TRUE Koenker’s studentized version of the test statistic will be used.
data an optional data frame containing the variables in the model. By default the variables are taken from the environment which bptest is called from.
Details
The Breusch-Pagan test fits a linear regression model to the residuals of a linear regression model (by default the same explanatory variables are taken as in the main regression model) and rejects if too much of the variance is explained by the additional explanatory variables.
Under H0 the test statistic of the Breusch-Pagan test follows a chi-squared distribution with parameter (the number of regressors without the constant in the model) degrees of freedom.
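As a rough illustration of the studentized (Koenker) version described above, the statistic can be computed by hand as n times the R-squared of an auxiliary regression of the squared residuals on the regressors. The sketch below uses base R only and simulated data; it is an assumption about how such a statistic is usually formed, not a copy of the bptest source.

```r
# Hand-computed studentized Breusch-Pagan statistic (illustrative sketch).
set.seed(2)
x <- rep(c(-1, 1), 50)
y <- 1 + x + rnorm(100, sd = rep(c(1, 2), 50))  # heteroskedastic errors

fit <- lm(y ~ x)
u2 <- resid(fit)^2                  # squared OLS residuals
aux <- lm(u2 ~ x)                   # auxiliary regression on the same regressors

bp_stat <- length(u2) * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1      # regressors without the constant
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

c(statistic = bp_stat, df = bp_df, p.value = bp_p)
```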
Value
A list with class "htest" containing the following components:
statistic the value of the test statistic.
p.value the p-value of the test.
parameter degrees of freedom.
method a character string indicating what type of test was performed.
data.name a character string giving the name(s) of the data.
References
T.S. Breusch & A.R. Pagan (1979), A Simple Test for Heteroscedasticity and Random Coefficient Variation. Econometrica 47, 1287–1294.
R. Koenker (1981), A Note on Studentizing a Test for Heteroscedasticity. Journal of Econometrics 17, 107–112.
W. Krämer & H. Sonnberger (1986), The Linear Regression Model under Test. Heidelberg: Physica.
Examples
## generate a regressor
x <- rep(c(-1,1), 50)
## generate heteroskedastic and homoskedastic disturbances
err1 <- rnorm(100, sd=rep(c(1,2), 50))
err2 <- rnorm(100)
## generate a linear relationship
y1 <- 1 + x + err1
y2 <- 1 + x + err2
## perform Breusch-Pagan test
bptest(y1 ~ x)
bptest(y2 ~ x)
bstats R package: bstats
Description
In this package, some R functions are written for convenient use in class, especially for my ST 315, ST 210, ST 335, and ST 475/575 courses.
Author(s)
B. Wang <[email protected]>
dw.test Durbin-Watson Test
Description
Performs the Durbin-Watson test for autocorrelation of disturbances.
Usage
dw.test(formula, order.by = NULL, alternative = c("greater", "two.sided", "less"),
        iterations = 15, exact = NULL, tol = 1e-10, data = list())
Arguments
formula a symbolic description for the model to be tested (or a fitted "lm" object).
order.by Either a vector z or a formula with a single explanatory variable like ~ z. The observations in the model are ordered by the size of z. If set to NULL (the default) the observations are assumed to be ordered (e.g., a time series).
alternative a character string specifying the alternative hypothesis.
iterations an integer specifying the number of iterations when calculating the p-value with the "pan" algorithm.
exact logical. If set to FALSE a normal approximation will be used to compute the p-value, if TRUE the "pan" algorithm is used. The default is to use "pan" if the sample size is < 100.
tol tolerance. Eigenvalues computed have to be greater than tol to be treated as non-zero.
data an optional data frame containing the variables in the model. By default the variables are taken from the environment which dw.test is called from.
Details
The Durbin-Watson test has the null hypothesis that the autocorrelation of the disturbances is 0. It is possible to test against the alternative that it is greater than, not equal to, or less than 0, respectively. This can be specified by the alternative argument.
Under the assumption of normally distributed disturbances, the null distribution of the Durbin-Watson statistic is the distribution of a linear combination of chi-squared variables. The p-value is computed using the Fortran version of Applied Statistics Algorithm AS 153 by Farebrother (1980, 1984). This algorithm is called "pan" or "gradsol". For large sample sizes the algorithm might fail to compute the p-value; in that case a warning is printed and an approximate p-value will be given; this p-value is computed using a normal approximation with mean and variance of the Durbin-Watson test statistic.
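For orientation, the Durbin-Watson statistic itself is simple to compute from the OLS residuals: the sum of squared successive differences divided by the sum of squares. The base R sketch below uses simulated data and shows the statistic only; it does not reproduce the "pan" p-value calculation.

```r
# Durbin-Watson statistic by hand (illustrative; no p-value computed).
set.seed(3)
x <- rep(c(-1, 1), 50)
err <- as.numeric(stats::filter(rnorm(100), 0.9, method = "recursive"))
y <- 1 + x + err

e <- resid(lm(y ~ x))
dw <- sum(diff(e)^2) / sum(e^2)

# Lag-1 sample autocorrelation of the residuals; dw is roughly 2 * (1 - r1)
r1 <- sum(e[-1] * e[-length(e)]) / sum(e^2)
c(dw = dw, approx = 2 * (1 - r1))
```

Values of the statistic near 2 are consistent with no autocorrelation, while values well below 2 point to positive autocorrelation.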
For an overview on R and econometrics see Racine & Hyndman (2002).
Value
An object of class "htest" containing:
statistic the test statistic.
p.value the corresponding p-value.
method a character string with the method used.
data.name a character string with the data name.
References
J. Durbin & G.S. Watson (1950), Testing for Serial Correlation in Least Squares Regression I. Biometrika 37, 409–428.
J. Durbin & G.S. Watson (1951), Testing for Serial Correlation in Least Squares Regression II. Biometrika 38, 159–178.
J. Durbin & G.S. Watson (1971), Testing for Serial Correlation in Least Squares Regression III. Biometrika 58, 1–19.
R.W. Farebrother (1980), Pan’s Procedure for the Tail Probabilities of the Durbin-Watson Statistic (Corr: 81V30 p189; AS R52: 84V33 p363-366; AS R53: 84V33 p366-369). Applied Statistics 29, 224–227.
R.W. Farebrother (1984), [AS R53] A Remark on Algorithms AS 106 (77V26 p92-98), AS 153 (80V29 p224-227) and AS 155: The Distribution of a Linear Combination of χ2 Random Variables (80V29 p323-333). Applied Statistics 33, 366–369.
W. Krämer & H. Sonnberger (1986), The Linear Regression Model under Test. Heidelberg: Physica.
J. Racine & R. Hyndman (2002), Using R To Teach Econometrics. Journal of Applied Econometrics 17, 175–189.
See Also
lm
Examples
## generate two AR(1) error terms with parameter
## rho = 0 (white noise) and rho = 0.9 respectively
err1 <- rnorm(100)

## generate regressor and dependent variable
x <- rep(c(-1,1), 50)
y1 <- 1 + x + err1

## perform Durbin-Watson test
dw.test(y1 ~ x)

err2 <- filter(err1, 0.9, method="recursive")
y2 <- 1 + x + err2
dw.test(y2 ~ x)
edf To compute the empirical distribution function.
Description
To compute the empirical distribution function.
Usage
edf(x,y=NULL)
Arguments
x A sample. ’NA’ values will be automatically removed.
y A grid of points where the edf will be evaluated.
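To make the two arguments concrete, the sketch below evaluates an empirical distribution function on a grid with base R, after dropping NA values. It mirrors the described behaviour only; the variable names are illustrative and the object returned by edf may be structured differently.

```r
# Empirical distribution function on a grid (illustrative sketch).
x <- c(0.3, NA, -1.2, 0.8, 0.3)
x <- x[!is.na(x)]                 # NA values are removed first
grid <- c(-2, 0, 0.3, 1)          # points where the edf is evaluated

# Proportion of sample values less than or equal to each grid point
Fhat <- sapply(grid, function(t) mean(x <= t))
Fhat

# The same values from the built-in step function
ecdf(x)(grid)
```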
Author(s)
B. Wang <[email protected]>
See Also
scb.
Examples
x = rnorm(100)
(out = edf(x))
plot(out)
(out2= scb(out))
lines(out2)
edu75 Education expenditure data (1975)
Description
Education expenditure data for all 50 states in U.S.A in 1975.
Usage
data(edu75)
Format
A data frame with 50 observations on 6 variables.
States  character  Initial of state names
Y       numeric    Educational expenditure.
X1      numeric    X1.
X2      numeric    X2.
X3      numeric    X3.
Region  character  region, 1=northwest, 2,3,4.
References
Stat 335 text
influential.plot Draw plots for the influence measures
Description
Draw plots for the influence measures.
Usage
influential.plot(lmobj,type='hadi',ID=FALSE,col=1)
Arguments
lmobj An R object by fitting an OLS model to a data set.
type Plot type. ’hadi’: Hadi’s influence measure; ’potential-residual’: potential-residual plot; ’dfits’: DFITS plot; ’hat’: leverage plot; ’cook’: Cook’s distance.
ID Whether to identify points in the plots. Default: FALSE
col Color of the plot.
Value
Output the influence measures, including leverage values (Leverage), Hadi’s measure (Hadi), Welsch and Kuh measure (DFIT) and Cook’s distance (CookD). In addition, the standard residuals are also exported.
Author(s)
B. Wang <[email protected]>
See Also
residual.plot.
Examples
data(river)
lm0 = lm(Nitrogen~Agr+Forest+Rsdntial+ComIndl, data=river)
influential.plot(lm0)
influential.plot(lm0,type='hadi')
influential.plot(lm0,type='potential')
influential.plot(lm0,type='leve')
influential.plot(lm0,type='dfit')
influential.plot(lm0,type='cook')
influential.plot(lm0,type='potential',ID=TRUE)
ld50.logit Predict Doses for Binomial Assay model (using counts)
Description
Calibrate binomial assays, generalizing the calculation of LD50 based on a logistic regression model.
Usage
ld50.logit(ndead, ntotal, dose, cf = 1:2, p = 0.5)
Arguments
ndead A vector of number of failures.
ntotal Total number of trials.
dose A vector of dosages.
cf The terms in the coefficient vector giving the intercept and coefficient of (log-)dose
p Probabilities at which to predict the dose needed.
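The role of cf and p can be illustrated with a plain glm fit: on the logit scale the dose giving response probability p solves b0 + b1 * dose = logit(p). The sketch below follows that general idea from Venables and Ripley's treatment of dose prediction; it is an assumption about the calculation, not the ld50.logit source, and it omits standard errors.

```r
# Dose at which the fitted response probability equals p (illustrative sketch).
ldose <- rep(0:5, 2)
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
n <- 20

# Logistic regression on counts of failures and survivors
fit <- glm(cbind(numdead, n - numdead) ~ ldose, family = binomial)

cf <- 1:2                 # positions of the intercept and the dose coefficient
p <- 0.5
b <- coef(fit)[cf]
dose_p <- unname((qlogis(p) - b[1]) / b[2])
dose_p
```

With p = 0.5 the logit is zero, so the estimate reduces to minus the intercept divided by the slope.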
References
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Springer.
Examples
ldose <- rep(0:5, 2)
numdead <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)
n=20

ld50.logit(numdead,n,ldose,p = 0.5)
ld50.logitfit Predict Doses for Binomial Assay model (using counts)
Description
Calibrate binomial assays, generalizing the calculation of LD50 based on a logistic regression model.
Usage
ld50.logitfit(rate, dose, p = 0.5)
Arguments
rate A vector of percentages of successes among all trials.
dose A vector of dosages.
p Probabilities at which to predict the dose needed.
References
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Springer.
Examples
ldose <- rep(0:5, 2)
rate <- c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16)/20

ld50.logitfit(rate,ldose,p = 0.5)
lm.ci To compute the confidence interval of the regression parameters.
Description
To compute the confidence interval of the regression parameters.
Usage
lm.ci(lmobj,level=0.95)
Arguments
lmobj An R object by fitting a linear regression model to a data set.
level Confidence level. Default: 0.95.
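For reference, the usual t-based interval for each coefficient is the estimate plus or minus a t quantile times its standard error. The base R sketch below computes it by hand on simulated data and compares it with confint; whether lm.ci uses exactly this form is an assumption.

```r
# t-based confidence intervals for regression coefficients (illustrative sketch).
set.seed(10)
w <- rnorm(40, mean = 3.3, sd = 0.5)
h <- 30 + 1.5 * w + rnorm(40)
fit <- lm(h ~ w)

level <- 0.95
est <- coef(summary(fit))                      # estimates and standard errors
tq <- qt(1 - (1 - level) / 2, df.residual(fit))
ci <- cbind(lower = est[, "Estimate"] - tq * est[, "Std. Error"],
            upper = est[, "Estimate"] + tq * est[, "Std. Error"])
ci

# Built-in equivalent for comparison
confint(fit, level = level)
```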
Author(s)
B. Wang <[email protected]>
See Also
model.test.
Examples
data(birth)
attach(birth)
lm0 = lm(Head~Weight)
lm.ci(lm0)
lm1 = lm(Head~Weight+Gestation)
lm.ci(lm1, level=0.99)
mediation.test The Sobel mediation test
Description
To compute statistics and p-values for the Sobel test. Results for three versions of "Sobel test" are provided: Sobel test, Aroian test and Goodman test.
Usage
mediation.test(mv,iv,dv)
Arguments
mv The mediator variable.
iv The independent variable.
dv The dependent variable.
Details
To test whether a mediator carries the influence of an IV to a DV.
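The three statistics share the numerator a*b, where a is the slope of the IV in the regression of the mediator on the IV and b is the slope of the mediator in the regression of the DV on both; they differ only in the standard error. The sketch below uses the commonly cited formulas (for example in the Preacher and Hayes references) on simulated data; it is not taken from the mediation.test source.

```r
# Sobel, Aroian and Goodman statistics by hand (illustrative sketch).
set.seed(4)
iv <- rnorm(100)
mv <- 0.5 * iv + rnorm(100)
dv <- 0.4 * mv + rnorm(100)

fa <- coef(summary(lm(mv ~ iv)))        # path a: IV -> mediator
fb <- coef(summary(lm(dv ~ iv + mv)))   # path b: mediator -> DV, given IV
a <- fa["iv", "Estimate"]; sa <- fa["iv", "Std. Error"]
b <- fb["mv", "Estimate"]; sb <- fb["mv", "Std. Error"]

base <- b^2 * sa^2 + a^2 * sb^2
se <- sqrt(c(Sobel = base,
             Aroian = base + sa^2 * sb^2,
             Goodman = base - sa^2 * sb^2))
z <- a * b / se
p <- 2 * pnorm(-abs(z))                 # two-sided normal p-values
rbind(z = z, p.value = p)
```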
Value
Missing values are not allowed.
Author(s)
B. Wang <[email protected]>
References
MacKinnon, D. P., & Dwyer, J. H. (1993). Estimating mediated effects in prevention studies. Evaluation Review, 17, 144-158.
MacKinnon, D. P., Warsi, G., & Dwyer, J. H. (1995). A simulation study of mediated effect measures. Multivariate Behavioral Research, 30, 41-62.
Preacher, K. J., & Hayes, A. F. (2004). SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behavior Research Methods, Instruments, & Computers, 36, 717-731.
Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, Instruments, & Computers, 40, 879-891.
Examples
mv = rnorm(100)
iv = rnorm(100)
dv = rnorm(100)
mediation.test(mv,iv,dv)
model.check Linear Regression Model Check
Description
Performs tests to check the least squares assumptions for a linear regression model.
Usage
model.check(lmobj)
Arguments
lmobj A fitted model
Details
In this function, we check the normality, independence, and constant variance assumptions of the error terms, and the presence of multicollinearity.
Value
A list with class "htest" containing the following components:
statistic the value of the test statistic.
p.value the p-value of the test.
parameter degrees of freedom.
method a character string indicating what type of test was performed.
data.name a character string giving the name(s) of the data.
References
To be updated.
Examples
data(river)
lm0 = lm(Nitrogen~Agr+Forest+Rsdntial+ComIndl, data=river)
model.check(lm0)
model.test To compare two models and determine which one is adequate.
Description
To compare a full model and reduced model to test whether the reduced model is adequate or not.
Usage
model.test(fmobj,rmobj,alpha=0.05)
Arguments
fmobj An R object by fitting a full linear regression model (FM) to a data set.
rmobj An R object by fitting a reduced linear regression model (RM) to a data set.
alpha Significance level. Default: alpha=0.05.
Details
To test a null hypothesis "H0: the RM is adequate" against "H1: the FM is adequate". The values of test statistic, p-value and critical value based on an F test will be given.
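The F test for nested models compares the increase in residual sum of squares under the reduced model with the residual variance of the full model. The base R sketch below computes the statistic, p-value and critical value on simulated data and can be checked against anova; the variable names are illustrative and the output layout of model.test may differ.

```r
# F test of a reduced model against a full model (illustrative sketch).
set.seed(5)
n <- 60
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n))
d$Y <- 1 + 2 * d$X1 + 0.5 * d$X3 + rnorm(n)

rm_fit <- lm(Y ~ X1 + X3, data = d)        # reduced model (RM)
fm_fit <- lm(Y ~ X1 + X2 + X3, data = d)   # full model (FM)
alpha <- 0.05

sse_r <- sum(resid(rm_fit)^2); df_r <- df.residual(rm_fit)
sse_f <- sum(resid(fm_fit)^2); df_f <- df.residual(fm_fit)

f_stat <- ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)
p_value <- pf(f_stat, df_r - df_f, df_f, lower.tail = FALSE)
f_crit <- qf(1 - alpha, df_r - df_f, df_f)

c(F = f_stat, p.value = p_value, critical = f_crit)
```

H0 (the RM is adequate) is rejected at level alpha when the statistic exceeds the critical value, or equivalently when the p-value is below alpha.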
Value
Missing values are not allowed.
Author(s)
B. Wang <[email protected]>
See Also
lm.ci.
Examples
data(supervisor)
attach(supervisor)
lm0 = lm(Y~X1+X3)
lm1 = lm(Y~X1+X2+X3+X4+X5+X6)
model.test(lm1,lm0)
oddsratio Odds Ratio and Relative Risk
Description
To compute the odds ratio and relative risk based on a 2 X 2 table.
Usage
oddsratio(x,alpha=0.05,n,...)
Arguments
x A vector of length 2 of the number of events from the case and control studies.
n A vector of length 2 of the sample sizes.
alpha The significance level. Default: 0.05.
... Controls
Details
x can be a matrix or a data.frame: the first column shows the number of events and the second column shows the sample sizes.
Exact confidence limits for the odds ratio are computed by using an algorithm based on Thomas (1971). See also Gart (1971). If the sample sizes are too large, the exact confidence interval may not work due to an overflow problem.
Asymptotic confidence limits are computed according to SAS/STAT(R) 9.2 User’s Guide, Second Edition.
Score method: code has been published for generating confidence intervals by inverting a score test. It is available from http://web.stat.ufl.edu/~aa/cda/R/two_sample/R2/
See also "riskratio" and "oddsratio" in R package epitools.
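As a small worked case, the point estimates and the simplest asymptotic (Wald, log-scale) limits can be computed by hand from the event counts and sample sizes. The sketch below uses the counts from the twin-convictions example and treats the first group as the numerator, which is an assumption; it does not reproduce the exact or score intervals, and the Wald form breaks down when a cell count is zero.

```r
# Odds ratio, relative risk and Wald limits from a 2 x 2 table (illustrative sketch).
x <- c(2, 10)      # number of events in each group
n <- c(17, 13)     # sample sizes
alpha <- 0.05

ev1 <- x[1]; non1 <- n[1] - x[1]   # events / non-events, group 1
ev2 <- x[2]; non2 <- n[2] - x[2]   # events / non-events, group 2

or <- (ev1 * non2) / (non1 * ev2)
rr <- (ev1 / n[1]) / (ev2 / n[2])

z <- qnorm(1 - alpha / 2)
se_log_or <- sqrt(1 / ev1 + 1 / non1 + 1 / ev2 + 1 / non2)
se_log_rr <- sqrt(1 / ev1 - 1 / n[1] + 1 / ev2 - 1 / n[2])
or_ci <- exp(log(or) + c(-1, 1) * z * se_log_or)
rr_ci <- exp(log(rr) + c(-1, 1) * z * se_log_rr)

list(OR = or, ORCI = or_ci, RR = rr, RRCI = rr_ci)
```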
Value
OR an estimate of the odds ratio;
RR an estimate of the relative risk;
ORCI a table showing various 100(1-alpha)% confidence limits for OR;
RRCI a table showing various 100(1-alpha)% confidence limits for RR.
References
Agresti, A. (1990) Categorical Data Analysis. New York: Wiley. Pages 59-66.
Agresti, A. (1992) A Survey of Exact Inference for Contingency Tables. Statistical Science, 7(1), 131-153.
Agresti, A. (2002) Categorical Data Analysis, Second Edition. New York: John Wiley & Sons.
Fisher, R. A. (1935) The logic of inductive inference. Journal of the Royal Statistical Society Series A, 98, 39-54.
Fisher, R. A. (1962) Confidence limits for a cross-product ratio. Australian Journal of Statistics, 4, 41.
Fisher, R. A. (1970) Statistical Methods for Research Workers. Oliver & Boyd.
Mehta, C. R. and Patel, N. R. (1986) Algorithm 643. FEXACT: A Fortran subroutine for Fisher’s exact test on unordered r*c contingency tables. ACM Transactions on Mathematical Software, 12, 154-161.
Clarkson, D. B., Fan, Y. and Joe, H. (1993) A Remark on Algorithm 643: FEXACT: An Algorithm for Performing Fisher’s Exact Test in r x c Contingency Tables. ACM Transactions on Mathematical Software, 19, 484-488.
Patefield, W. M. (1981) Algorithm AS159. An efficient method of generating r x c tables with given row and column totals. Applied Statistics, 30, 91-97.
Stokes, M. E., Davis, C. S., and Koch, G. G. (2000) Categorical Data Analysis Using the SAS System, Second Edition. Cary, NC: SAS Institute Inc.
See Also
fisher.test, chisq.test
Examples
# library(bstats)
x = c(1,0)
n = c(72370,73058)
oddsratio(x,n=n)

Convictions <- matrix(c(2, 10, 15, 3), nrow = 2,
                      dimnames = list(c("Dizygotic", "Monozygotic"),
                                      c("Convicted", "Not convicted")))
Convictions
fisher.test(Convictions, conf.level = 0.95)$conf.int

x = matrix(c(2,10,17,13), ncol=2)
oddsratio(x)

Convictions <- matrix(c(8, 492, 0, 500), nrow = 2, byrow=TRUE)
fisher.test(Convictions, conf.level = 0.95)$conf.int

x = c(8,0)
n = c(500,500)
oddsratio(x,n=n)
predictor.plot Draw plots for predictor impacts on the dependent variable
Description
Draw an added-variable plot (av) or a residual plus component (rc) plot.
Usage
predictor.plot(lmobj,type='av',ID=FALSE, col=1)
Arguments
lmobj An R object by fitting an OLS model to a data set.
type Plot type. ’av’: added variable plot; ’rc’: residual plus component plot.
ID Whether to identify points in the plots. Default: FALSE
col Color of the plot.
Value
Missing values are not allowed.
Author(s)
B. Wang <[email protected]>
See Also
residual.plot.
Examples
data(river)
lm0 = lm(Nitrogen~Agr+Forest+Rsdntial+ComIndl, data=river)
predictor.plot(lm0)
predictor.plot(lm0,type='rc')
residual.plot Draw residual plots for an ordinary regression model.
Description
Draw residual plots for an ordinary regression model.
Usage
residual.plot(lmobj,type='fitted',col=1)
Arguments
lmobj An R object by fitting an OLS model to a data set.
type Type of residual plot(s): ’fitted’, residuals against fitted values; ’index’, residuals against index; ’predictor’, residuals against each of the predictors in the fitted model; ’qqplot’, qq-plot of the standardized residuals to check the normality assumption.
col Color of the plot.
Value
Missing values are not allowed.
Author(s)
B. Wang <[email protected]>
See Also
influential.plot.
Examples
data(river)
lm0 = lm(Nitrogen~Agr+Forest+Rsdntial+ComIndl, data=river)
residual.plot(lm0)
residual.plot(lm0,type='index')
residual.plot(lm0,type='predictor')
river New York river data
Description
This is a data set selected from the book "Regression Analysis by Example" by Samprit Chatterjee and Ali S. Hadi.
Usage
data(river)
Format
In a 1976 study exploring the relationship between water quality and land use, Haith (1976) obtained the measurements on 20 river basins in New York State. A question of interest here is how the land use around a river basin contributes to the water pollution as measured by the mean nitrogen concentration (mg/liter).
River     character  River names
Agr       numeric    percentage of land area currently in agricultural use
Forest    numeric    percentage of forest land
Rsdntial  numeric    percentage of land area in residential use
ComIndl   numeric    percentage of land area either in commercial or industrial use
Nitrogen  numeric    mean nitrogen concentration
References
"Regression analysis by example" by Samprit Chatterjee and Ali S. Hadi, Wiley. ISBN: 978-0-471-74696-6.
scb To compute the simultaneous confidence bands.
Description
To compute the simultaneous confidence bands.
Usage
scb(x,alpha=0.05)
Arguments
x An R object. Currently, only ’edf’ objects are supported.
alpha Significance level. Default 0.05 for a 95 percent confidence level.
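One standard way to build a simultaneous band around an empirical distribution function is the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, which gives a constant half-width sqrt(log(2/alpha)/(2n)). The sketch below shows that construction in base R as an illustration of what a simultaneous band is; the method actually used by scb is not stated here, so the DKW choice is an assumption.

```r
# DKW simultaneous confidence band for an empirical distribution function
# (illustrative; not necessarily the method used by scb).
set.seed(9)
x <- rnorm(100)
alpha <- 0.05

Fn <- ecdf(x)
eps <- sqrt(log(2 / alpha) / (2 * length(x)))   # constant half-width

grid <- sort(x)
lower <- pmax(Fn(grid) - eps, 0)   # clip the band to [0, 1]
upper <- pmin(Fn(grid) + eps, 1)

plot(Fn, main = "EDF with DKW band")
lines(grid, lower, type = "s", lty = 2)
lines(grid, upper, type = "s", lty = 2)
```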
Author(s)
B. Wang <[email protected]>
See Also
edf.
Examples
x = rnorm(100)
(out = edf(x))
plot(out)
(out2= scb(out))
lines(out2)
supervisor Supervisor performance data
Description
This is a data set selected from the book "Regression Analysis by Example" by Samprit Chatterjee and Ali S. Hadi.
Usage
data(supervisor)
Format
A data frame with 28829 observations on 8 variables.
Y       numeric  overall rating of job being done by supervisor
X1--X6  numeric  average score for six different aspects
References
"Regression analysis by example" by Samprit Chatterjee and Ali S. Hadi, Wiley. ISBN: 978-0-471-74696-6.
vif Variance Inflation Factors
Description
Calculates variance-inflation and generalized variance-inflation factors for linear and generalized linear models.
Usage
vif(object, ...)
## S3 method for class 'lm'
vif(object, ...)
Arguments
object an object that inherits from class lm, such as an lm or glm object.
... not used.
Details
If all terms in an unweighted linear model have 1 df, then the usual variance-inflation factors are calculated.
If any terms in an unweighted linear model have more than 1 df, then generalized variance-inflation factors (Fox and Monette, 1992) are calculated. These are interpretable as the inflation in size of the confidence ellipse or ellipsoid for the coefficients of the term in comparison with what would be obtained for orthogonal data.
The generalized vifs are invariant with respect to the coding of the terms in the model (as long as the subspace of the columns of the model matrix pertaining to each term is invariant). To adjust for the dimension of the confidence ellipsoid, the function also prints GVIF^(1/(2×df)) where df is the degrees of freedom associated with the term.
Through a further generalization, the implementation here is applicable as well to other sorts of models, in particular weighted linear models and generalized linear models, that inherit from class lm.
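For the simple case where every term has 1 df, the usual variance-inflation factor of a predictor is 1/(1 - R^2), with R^2 taken from regressing that predictor on the others. The base R sketch below computes it two equivalent ways on simulated data; it covers only this 1-df case, not the generalized factors.

```r
# Usual variance-inflation factors for 1-df terms (illustrative sketch).
set.seed(6)
n <- 50
d <- data.frame(X1 = rnorm(n))
d$X2 <- 0.8 * d$X1 + rnorm(n, sd = 0.5)   # correlated with X1
d$X3 <- rnorm(n)

preds <- c("X1", "X2", "X3")

# VIF_j = 1 / (1 - R_j^2), regressing predictor j on the remaining predictors
vif_manual <- sapply(preds, function(v) {
  others <- setdiff(preds, v)
  r2 <- summary(lm(reformulate(others, response = v), data = d))$r.squared
  1 / (1 - r2)
})

# Equivalent: diagonal of the inverse correlation matrix of the predictors
vif_cor <- diag(solve(cor(d[preds])))

rbind(vif_manual, vif_cor)
```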
Value
A vector of vifs, or a matrix containing one row for each term in the model, and columns for the GVIF, df, and GVIF^(1/(2×df)).
Author(s)
Henric Nilsson and John Fox <[email protected]>
References
Fox, J. and Monette, G. (1992) Generalized collinearity diagnostics. JASA, 87, 178–183.
Fox, J. (2008) Applied Regression Analysis and Generalized Linear Models, Second Edition. Sage.
Fox, J. and Weisberg, S. (2011) An R Companion to Applied Regression, Second Edition, Sage.
Examples
data(edu75)
lm0 = lm(Y~X1+X2+X3, data=edu75)
vif(lm0)
white.test White test of constant variance
Description
Perform a test to check the common variance assumption for a linear regression model.
Usage
white.test(lmobj)
Arguments
lmobj A fitted model
Details
In this function, we check the constant variance assumption of the error terms.
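The textbook form of White's test regresses the squared OLS residuals on the regressors, their squares and their cross products, and compares n times the R-squared of that auxiliary regression with a chi-squared distribution. The sketch below shows this for two predictors on simulated data; the exact set of auxiliary terms used by white.test is not documented here, so this form is an assumption.

```r
# White's test via an auxiliary regression (illustrative sketch, two predictors).
set.seed(7)
n <- 100
x1 <- runif(n, 1, 5)
x2 <- rnorm(n)
y <- 1 + x1 + x2 + rnorm(n, sd = x1)    # error variance grows with x1

fit <- lm(y ~ x1 + x2)
u2 <- resid(fit)^2

# Regressors, their squares, and their cross product
aux <- lm(u2 ~ x1 + x2 + I(x1^2) + I(x2^2) + I(x1 * x2))

white_stat <- n * summary(aux)$r.squared
white_df <- length(coef(aux)) - 1
white_p <- pchisq(white_stat, df = white_df, lower.tail = FALSE)

c(statistic = white_stat, df = white_df, p.value = white_p)
```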
Value
A list with class "htest" containing the following components:
statistic the value of the test statistic.
p.value the p-value of the test.
parameter degrees of freedom.
method a character string indicating what type of test was performed.
data.name a character string giving the name(s) of the data.
References
White test, From Wikipedia, the free encyclopedia.
Examples
data(river)
lm0 = lm(Nitrogen~Agr+Forest+Rsdntial+ComIndl, data=river)
white.test(lm0)
wls Weighted least squares estimate by groups
Description
Weighted least squares estimate by groups.
Usage
wls(lmobj,group)
Arguments
lmobj An R object by fitting an OLS model to a data set.
group used to cluster the data. Can be a factor or a numerical vector.
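A common way to use a grouping variable for weighted least squares is to estimate a separate error variance in each group from the OLS residuals and weight each observation by the inverse of its group's variance. The base R sketch below does this on simulated data; whether wls forms its weights in exactly this way is an assumption.

```r
# Weighted least squares with group-wise variance estimates (illustrative sketch).
set.seed(8)
g <- factor(rep(1:2, each = 30))
x <- rnorm(60)
y <- 1 + 2 * x + rnorm(60, sd = ifelse(g == 1, 1, 3))  # group 2 is noisier

ols <- lm(y ~ x)

# Mean squared OLS residual within each group, repeated per observation
s2 <- as.numeric(ave(resid(ols)^2, g))
w <- 1 / s2                                # inverse-variance weights

wfit <- lm(y ~ x, weights = w)
rbind(OLS = coef(ols), WLS = coef(wfit))
```

Observations from the noisier group receive smaller weights, so they pull less on the fitted line.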
Value
Output the updated regression model fitted with WLS.
Author(s)
B. Wang <[email protected]>
See Also
residual.plot.
Examples
data(edu75)
lm0 = lm(Y~X1+X2+X3, data=edu75)
wls(lm0,group=edu75$Region)
Index
∗Topic datasets: birth, edu75, river, supervisor
∗Topic htest: bptest, dw.test, model.check, oddsratio, white.test
∗Topic models: ld50.logit, ld50.logitfit
∗Topic regression: ac, ld50.logit, ld50.logitfit, vif
∗Topic stats: bstats, edf, influential.plot, lm.ci, model.test, predictor.plot, residual.plot, scb, wls
∗Topic test: mediation.test

ac
birth
bptest
bstats
chisq.test
dw.test
edf
edu75
fisher.test
influential.plot
ld50.logit
ld50.logitfit
lines.glm.dose (ld50.logit)
lines.scb (scb)
lm
lm.ci
mediation.test
model.check
model.test
oddsratio
plot.edf (edf)
plot.glm.dose (ld50.logit)
plot.scb (scb)
predictor.plot
print.edf (edf)
print.glm.dose (ld50.logit)
print.odds (oddsratio)
print.scb (scb)
residual.plot
river
scb
supervisor
vif
white.test
wls