manual de sas

Upload: isur-edrey-papa

Post on 08-Aug-2018


  • 8/22/2019 Manual de SAS


    Overview of SAS/STAT Software

    SAS/STAT software, a component of the SAS System, provides comprehensive statistical tools for a wide range of statistical analyses, including analysis of variance, regression, categorical data analysis, multivariate analysis, survival analysis, psychometric analysis, cluster analysis, and nonparametric analysis. A few examples include mixed models, generalized linear models, correspondence analysis, and structural equations. The software is constantly being updated to reflect new methodology.

    In addition to 54 procedures for statistical analysis, SAS/STAT software also includes the Market Research Application (MRA), a point-and-click interface to commonly used techniques in market research. Also, the Analyst Application in the SAS System provides convenient access to some of the more commonly used statistical analyses in SAS/STAT software, including analysis of variance, regression, logistic regression, mixed models, survival analysis, and some multivariate techniques.

    About This Book

    Since SAS/STAT software is a part of the SAS System, this book assumes that you are familiar with base SAS software and with the books SAS Language Reference: Dictionary, SAS Language Reference: Concepts, and the SAS Procedures Guide. It also assumes that you are familiar with basic SAS System concepts such as creating SAS data sets with the DATA step and manipulating SAS data sets with the procedures in base SAS software (for example, the PRINT and SORT procedures).

    Chapter Organization

    Typographical Conventions

    Options Used in Examples

    Chapter Organization

    This book is organized as follows.

    Chapter 1, this chapter, provides an overview of SAS/STAT software and summarizes related information, products, and services. The next ten chapters provide some introduction to the broad areas covered by SAS/STAT software. Subsequent chapters describe the SAS procedures that make up SAS/STAT software. These chapters appear in alphabetical order by procedure name.


    The chapters documenting the SAS/STAT procedures are organized as follows:

    The Overview section provides a brief description of the analysis provided by the procedure.

    The Getting Started section provides a quick introduction to the procedure through a simple example.

    The Syntax section describes the SAS statements and options that control the procedure.

    The Details section discusses methodology and miscellaneous details.

    The Examples section contains examples using the procedure.

    The References section contains references for the methodology and examples for the procedure.

    Following the chapters on the SAS/STAT procedures, Appendix A, "Special SAS Data Sets," documents the special SAS data sets associated with SAS/STAT procedures.

    Typographical Conventions

    This book uses several type styles for presenting information. The following list explains the meaning of the typographical conventions used in this book:

    roman
    is the standard type style used for most text.

    UPPERCASE ROMAN
    is used for SAS statements, options, and other SAS language elements when they appear in the text. However, you can enter these elements in your own SAS programs in lowercase, uppercase, or a mixture of the two.

    UPPERCASE BOLD
    is used in the "Syntax" sections' initial lists of SAS statements and options.

    oblique
    is used for user-supplied values for options in the syntax definitions. In the text, these values are written in italic.

    helvetica
    is used for the names of variables and data sets when they appear in the text.

    bold
    is used to refer to matrices and vectors.

    italic
    is used for terms that are defined in the text, for emphasis, and for references to publications.


    monospace
    is used for example code. In most cases, this book uses lowercase type for SAS code.

    Options Used in Examples

    Output of Examples

    Most of the output shown in this book is produced with the following SAS System options:

    options linesize=80 pagesize=200 nonumber nodate;

    The template STATDOC.TPL is used to create the HTML output that appears in the online (CD) version. A style template controls stylistic HTML elements such as colors, fonts, and presentation attributes. The style template is specified in the ODS HTML statement as follows:

    ODS HTML style=statdoc;

    If you run the examples, you may get slightly different output. This is a function of the SAS System options used and the precision used by your computer for floating-point calculations.

    Graphics Options

    The examples that contain graphical output are created with a specific set of options and symbol statements. The code you see in the examples creates the color graphics that appear in the online (CD) version of this book. A slightly different set of options and statements is used to create the black and white graphics that appear in the printed version of the book.

    If you run the examples, you may get slightly different results. This may occur because not all graphic options for color devices translate directly to black and white output formats. For complete information on SAS/GRAPH software and graphics options, refer to SAS/GRAPH Software: Reference.

    The following GOPTIONS statement is used to create the online (color) version of the graphic output.

    filename GSASFILE '';

    goptions gsfname=GSASFILE gsfmode = replace
             fileonly
             transparency dev = gif
             ftext = swiss lfactor = 1
             htext = 4.0pct htitle = 4.5pct
             hsize = 5.625in vsize = 3.5in
             noborder cback = white
             horigin = 0in vorigin = 0in ;

    The following GOPTIONS statement is used to create the black and white version of the graphic output, which appears in the printed version of the manual.

    filename GSASFILE '';

    goptions gsfname=GSASFILE gsfmode = replace
             gaccess = sasgaedt fileonly
             dev = pslepsf
             ftext = swiss lfactor = 1
             htext = 3.0pct htitle = 3.5pct
             hsize = 5.625in vsize = 3.5in
             border cback = white
             horigin = 0in vorigin = 0in ;

    In most of the online examples, the plot symbols are specified as follows:

    symbol1 value=dot color=white height=3.5pct;

    The SYMBOLn statements used in online examples order the symbol colors as follows: white, yellow, cyan, green, orange, blue, and black.

    In the examples appearing in the printed manual, symbol statements specify COLOR=BLACK and order the plot symbols as follows: dot, square, triangle, circle, plus, x, diamond, and star.

    The %PLOTIT Macro

    Examples that use the %PLOTIT macro are generated by defining a special macro variable to specify graphics options. See Appendix B, "Using the %PLOTIT Macro," for details on the options specified in these examples.

    Where to Turn for More Information

    This section describes other sources of information about SAS/STAT software.

    Accessing the SAS/STAT Sample Library

    Online Help System

    SAS Institute Technical Support Services

    Introduction to Regression Procedures


    Overview

    Statistical Background

    References

    Overview

    This chapter reviews SAS/STAT software procedures that are used for regression analysis: CATMOD, GLM, LIFEREG, LOGISTIC, NLIN, ORTHOREG, PLS, PROBIT, REG, RSREG, and TRANSREG. The REG procedure provides the most general analysis capabilities; the other procedures give more specialized analyses. This chapter also briefly mentions several procedures in SAS/ETS software.

    Introduction

    Introductory Example

    General Regression: The REG Procedure

    Nonlinear Regression: The NLIN Procedure

    Response Surface Regression: The RSREG Procedure

    Partial Least Squares Regression: The PLS Procedure

    Regression for Ill-conditioned Data: The ORTHOREG Procedure

    Logistic Regression: The LOGISTIC Procedure

    Regression With Transformations: The TRANSREG Procedure

    Regression Using the GLM, CATMOD, LOGISTIC, PROBIT, and LIFEREG Procedures

    Interactive Features in the CATMOD, GLM, and REG Procedures


    Introduction

    Many SAS/STAT procedures, each with special features, perform regression analysis. The following procedures perform at least one type of regression analysis:

    CATMOD
    analyzes data that can be represented by a contingency table. PROC CATMOD fits linear models to functions of response frequencies, and it can be used for linear and logistic regression. The CATMOD procedure is discussed in detail in Chapter 4, "Introduction to Categorical Data Analysis Procedures."

    GENMOD
    fits generalized linear models. PROC GENMOD is especially suited for responses with discrete outcomes, and it performs logistic regression and Poisson regression as well as fitting Generalized Estimating Equations for repeated measures data. See Chapter 4, "Introduction to Categorical Data Analysis Procedures," and Chapter 30, "The GENMOD Procedure," for more information.

    GLM
    uses the method of least squares to fit general linear models. In addition to many other analyses, PROC GLM can perform simple, multiple, polynomial, and weighted regression. PROC GLM has many of the same input/output capabilities as PROC REG, but it does not provide as many diagnostic tools or allow interactive changes in the model or data. See Chapter 3, "Introduction to Analysis-of-Variance Procedures," for a more detailed overview of the GLM procedure.

    LIFEREG
    fits parametric models to failure-time data that may be right censored. These types of models are commonly used in survival analysis. See Chapter 9, "Introduction to Survival Analysis Procedures," for a more detailed overview of the LIFEREG procedure.

    LOGISTIC
    fits logistic models for binomial and ordinal outcomes. PROC LOGISTIC provides a wide variety of model-building methods and computes numerous regression diagnostics. See Chapter 4, "Introduction to Categorical Data Analysis Procedures," for a brief comparison of PROC LOGISTIC with other procedures.

    NLIN
    builds nonlinear regression models. Several different iterative methods are available.


    ORTHOREG
    performs regression using the Gentleman-Givens computational method. For ill-conditioned data, PROC ORTHOREG can produce more accurate parameter estimates than other procedures such as PROC GLM and PROC REG.

    PLS
    performs partial least squares regression, principal components regression, and reduced rank regression, with cross validation for the number of components.

    PROBIT
    performs probit regression as well as logistic regression and ordinal logistic regression. The PROBIT procedure is useful when the dependent variable is either dichotomous or polychotomous and the independent variables are continuous.

    REG
    performs linear regression with many diagnostic capabilities, selects models using one of nine methods, produces scatter plots of raw data and statistics, highlights scatter plots to identify particular observations, and allows interactive changes in both the regression model and the data used to fit the model.

    RSREG
    builds quadratic response-surface regression models. PROC RSREG analyzes the fitted response surface to determine the factor levels of optimum response and performs a ridge analysis to search for the region of optimum response.

    TRANSREG
    fits univariate and multivariate linear models, optionally with spline and other nonlinear transformations. Models include ordinary regression and ANOVA, multiple and multivariate regression, metric and nonmetric conjoint analysis, metric and nonmetric vector and ideal point preference mapping, redundancy analysis, canonical correlation, and response surface regression.

    Several SAS/ETS procedures also perform regression. The following procedures are documented in the SAS/ETS User's Guide.

    AUTOREG
    implements regression models using time-series data where the errors are autocorrelated.

    PDLREG
    performs regression analysis with polynomial distributed lags.


    SYSLIN
    handles linear simultaneous systems of equations, such as econometric models.

    MODEL
    handles nonlinear simultaneous systems of equations, such as econometric models.


    Introductory Example

    Regression analysis is the analysis of the relationship between one variable and another set of variables. The relationship is expressed as an equation that predicts a response variable (also called a dependent variable or criterion) from a function of regressor variables (also called independent variables, predictors, explanatory variables, factors, or carriers) and parameters. The parameters are adjusted so that a measure of fit is optimized. For example, the equation for the ith observation might be

    \[ y_i = \beta_0 + \beta_1 x_i + \epsilon_i \]

    where $y_i$ is the response variable, $x_i$ is a regressor variable, $\beta_0$ and $\beta_1$ are unknown parameters to be estimated, and $\epsilon_i$ is an error term.

    You might use regression analysis to find out how well you can predict a child's weight if you know that child's height. Suppose you collect your data by measuring heights and weights of 19 school children. You want to estimate the intercept $\beta_0$ and the slope $\beta_1$ of a line described by the equation

    \[ \mathrm{Weight}_i = \beta_0 + \beta_1 \, \mathrm{Height}_i + \epsilon_i \]

    where

    $\mathrm{Weight}_i$
    is the response variable.

    $\beta_0$, $\beta_1$
    are the unknown parameters.

    $\mathrm{Height}_i$
    is the regressor variable.

    $\epsilon_i$
    is the unknown error.


    The data are included in the following program. The results are displayed in Figure 2.1 and Figure 2.2.

    data class;
       input Name $ Height Weight Age;
       datalines;
    Alfred  69.0 112.5 14
    Alice   56.5  84.0 13
    Barbara 65.3  98.0 13
    Carol   62.8 102.5 14
    Henry   63.5 102.5 14
    James   57.3  83.0 12
    Jane    59.8  84.5 12
    Janet   62.5 112.5 15
    Jeffrey 62.5  84.0 13
    John    59.0  99.5 12
    Joyce   51.3  50.5 11
    Judy    64.3  90.0 14
    Louise  56.3  77.0 12
    Mary    66.5 112.0 15
    Philip  72.0 150.0 16
    Robert  64.8 128.0 12
    Ronald  67.0 133.0 15
    Thomas  57.5  85.0 11
    William 66.5 112.0 15
    ;

    symbol1 v=dot c=blue height=3.5pct;

    proc reg;
       model Weight=Height;
       plot Weight*Height/cframe=ligr;
    run;

    The REG Procedure
    Model: MODEL1
    Dependent Variable: Weight

    Analysis of Variance

    Source    DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model      1        7193.24912     7193.24912      57.08    <.0001

    Root MSE           11.22625    R-Square    0.7705
    Dependent Mean    100.02632    Adj R-Sq    0.7570
    Coeff Var          11.22330

    Parameter Estimates

    Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
    Intercept     1            -143.02692          32.27459      -4.43      0.0004
    Height        1               3.89903           0.51609       7.55     <.0001

    Weight = -143.0 + 3.9 * Height

    Regression is often used in an exploratory fashion to look for empirical relationships, such as the relationship between Height and Weight. In this example, Height is not the cause of Weight. You would need a controlled experiment to confirm the relationship scientifically. See the "Comments on Interpreting Regression Statistics" section for more information.

    The method most commonly used to estimate the parameters is to minimize the sum of squares of the differences between the actual response value and the value predicted by the equation. The estimates are called least-squares estimates, and the criterion value is called the error sum of squares

    \[ \mathrm{SSE} = \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right)^2 \]

    where $b_0$ and $b_1$ are the estimates of $\beta_0$ and $\beta_1$ that minimize SSE.
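
    Because SSE is a quadratic function of $b_0$ and $b_1$, the minimizers in the one-regressor case have a closed form. The following is a worked sketch rather than part of the original text; the sample means $\bar{x}$ and $\bar{y}$ are notation introduced here, and the result assumes that the $x_i$ are not all equal.

```latex
\begin{align*}
\frac{\partial\,\mathrm{SSE}}{\partial b_0}
  &= -2\sum_{i=1}^{n}\left(y_i - b_0 - b_1 x_i\right) = 0, \\
\frac{\partial\,\mathrm{SSE}}{\partial b_1}
  &= -2\sum_{i=1}^{n} x_i\left(y_i - b_0 - b_1 x_i\right) = 0, \\
b_1 &= \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}
            {\sum_{i=1}^{n}(x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1\,\bar{x}.
\end{align*}
```

    The first condition says that the residuals sum to zero, so the fitted line passes through the point $(\bar{x}, \bar{y})$.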

    For a general discussion of the theory of least-squares estimation of linear models and its application to regression and analysis of variance, refer to one of the applied regression texts, including Draper and Smith (1981), Daniel and Wood (1980), Johnston (1972), and Weisberg (1985).

    SAS/STAT regression procedures produce the following information for a typical regression analysis.

    parameter estimates using the least-squares criterion

    estimates of the variance of the error term

    estimates of the variance or standard deviation of the sampling distribution of the parameter estimates

    tests of hypotheses about the parameters

    SAS/STAT regression procedures can produce many other specialized diagnostic statistics, including

    collinearity diagnostics to measure how strongly regressors are related to other regressors and how this affects the stability and variance of the estimates (REG)

    influence diagnostics to measure how each individual observation contributes to determining the parameter estimates, the SSE, and the fitted values (LOGISTIC, REG, RSREG)

    lack-of-fit diagnostics that measure the lack of fit of the regression model by comparing the error variance estimate to another pure error variance that is not dependent on the form of the model (CATMOD, PROBIT, RSREG)


    diagnostic scatter plots that check the fit of the model and highlighted scatter plots that identify particular observations or groups of observations (REG)

    predicted and residual values, and confidence intervals for the mean and for an individual value (GLM, LOGISTIC, REG)

    time-series diagnostics for equally spaced time-series data that measure how much errors may be related across neighboring observations. These diagnostics can also measure functional goodness of fit for data sorted by regressor or response variables (REG, SAS/ETS procedures).
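
    As a minimal sketch of how several of these diagnostics might be requested in PROC REG, the following step refits the height and weight data with Age as a second regressor. It assumes the CLASS data set from the introductory example; the choice of Age as a regressor and the names DIAG, PRED, and RESID are illustrative assumptions, not taken from the original text.

```sas
/* Assumes the CLASS data set created in the introductory example. */
proc reg data=class;
   /* VIF, COLLIN  : collinearity diagnostics                        */
   /* INFLUENCE    : influence statistics for each observation       */
   /* R, CLM, CLI  : residual analysis and confidence limits for the */
   /*                mean and for an individual predicted value      */
   /* DW           : Durbin-Watson statistic (meaningful only when   */
   /*                the observations have a natural ordering)       */
   model Weight = Height Age / vif collin influence r clm cli dw;

   /* Save predicted values and residuals; DIAG, PRED, and RESID are */
   /* hypothetical names chosen for this sketch.                     */
   output out=diag p=pred r=resid;
run;
quit;
```

    Requesting everything at once produces a lot of output; in practice you would probably ask only for the diagnostics relevant to the question at hand.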


    Copyright 2002 by SAS Institute Inc., Cary, NC, USA. All rights reserved.


    General Regression: The REG Procedure

    The REG procedure is a general-purpose procedure for regression that

    handles multiple regression models

    provides nine model-selection methods

    allows interactive changes both in the model and in the data used to fit the model

    allows linear equality restrictions on parameters

    tests linear hypotheses and multivariate hypotheses

    produces collinearity diagnostics, influence diagnostics, and partial regression leverage plots

    saves estimates, predicted values, residuals, confidence limits, and other diagnostic statistics in output SAS data sets

    generates plots of data and of various statistics

    "paints" or highlights scatter plots to identify particular observations or groups of observations

    uses, optionally, correlations or crossproducts for input
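
    The hypothesis-testing and interactive features listed above can be sketched as follows. This example assumes the CLASS data set from the introductory example and uses Age as a second regressor purely for illustration.

```sas
/* Assumes the CLASS data set created in the introductory example. */
proc reg data=class;
   /* Fit a two-regressor model. */
   model Weight = Height Age;

   /* Test the linear hypothesis that the Age coefficient is zero. */
   test Age = 0;
run;

   /* PROC REG remains active after RUN, so the model can be */
   /* changed interactively: drop Age and reprint the fit.   */
   delete Age;
   print;
run;
quit;
```

    The QUIT statement ends the procedure; until then, further interactive statements can be submitted against the same model.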

    Model-selection Methods in PROC REG

    The nine methods of model selection implemented in PROC REG are

    NONE
    no selection. This method is the default and uses the full model given in the MODEL statement to fit the linear regression.

    FORWARD


    forward selection. This method starts with no variables in the model and adds variables one by one to the model. At each step, the variable added is the one that maximizes the fit of the model. You can also specify groups of variables to treat as a unit during the selection process. An option enables you to specify the criterion for inclusion.

    BACKWARD
    backward elimination. This method starts with a full model and eliminates variables one by one from the model. At each step, the variable with the smallest contribution to the model is deleted. You can also specify groups of variables to treat as a unit during the selection process. An option enables you to specify the criterion for exclusion.

    STEPWISE
    stepwise regression, forward and backward. This method is a modification of the forward-selection method in that variables already in the model do not necessarily stay there. You can also specify groups of variables to treat as a unit during the selection process. Again, options enable you to specify criteria for entry into the model and for remaining in the model.

    MAXR
    maximum R² improvement. This method tries to find the best one-variable model, the best two-variable model, and so on. The MAXR method differs from the STEPWISE method in that many more models are evaluated with MAXR, which considers all switches before making any switch. The STEPWISE method may remove the "worst" variable without considering what the "best" remaining variable might accomplish, whereas MAXR would consider what the "best" remaining variable might accomplish. Consequently, MAXR typically takes much longer to run than STEPWISE.

    MINR
    minimum R² improvement. This method closely resembles MAXR, but the switch chosen is the one that produces the smallest increase in R².

    RSQUARE
    finds a specified number of models having the highest R² in each of a range of model sizes.

    CP
    finds a specified number of models with the lowest Cp within a range of model sizes.

    ADJRSQ
    finds a specified number of models having the highest adjusted R² within a range of model sizes.
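
    A selection method is requested with the SELECTION= option in the MODEL statement. The following sketch assumes the CLASS data set from the introductory example; the regressors, the model labels, and the threshold values are illustrative choices, not recommendations from the original text.

```sas
/* Assumes the CLASS data set created in the introductory example. */
proc reg data=class;
   /* Stepwise selection: SLENTRY= is the significance level for */
   /* entry and SLSTAY= the level for remaining in the model.    */
   /* The value 0.15 is used here only for illustration.         */
   stepwise: model Weight = Height Age
                   / selection=stepwise slentry=0.15 slstay=0.15;

   /* RSQUARE selection: report up to BEST=2 models of each size, */
   /* and also print Cp and adjusted R-square for each one.       */
   subsets:  model Weight = Height Age
                   / selection=rsquare best=2 cp adjrsq;
run;
quit;
```

    With only two candidate regressors the search is trivial; the methods differ more noticeably when many regressors are available.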


    Nonlinear Regression: The NLIN Procedure

    The NLIN procedure implements iterative methods that attempt to find least-squares estimates for nonlinear models. The default method is Gauss-Newton, although several other methods, such as Newton and Marquardt, are available. You must specify parameter names, starting values, and expressions for the model. All necessary analytical derivatives are calculated automatically for you. Grid search is also available to select starting values for the parameters. Since nonlinear models are often difficult to estimate, PROC NLIN may not always find the globally optimal least-squares estimates.
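
    The pieces named above (parameter names, starting values, a model expression, and a grid of starting values) might be combined as in the following sketch. The data set DECAY, its values, and the exponential model are invented for illustration and do not come from the original text.

```sas
/* Illustrative data only: these values are invented for the sketch. */
data decay;
   input t y @@;
   datalines;
0 10.2  1 7.4  2 5.6  3 4.0  4 3.1  5 2.2  6 1.7  7 1.2
;

/* METHOD= selects the iterative method; Gauss-Newton is the default. */
proc nlin data=decay method=marquardt;
   /* Each parameter is given a grid of starting values; the   */
   /* grid is evaluated and iteration starts from the best one. */
   parms a = 5 to 15 by 5
         b = 0.1 to 0.5 by 0.2;

   /* Model expression; no derivative statements are supplied. */
   model y = a * exp(-b * t);
run;
```

    Because convergence to the global optimum is not guaranteed, trying a wider grid or a different method is a reasonable check on the estimates.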


    Response Surface Regression: The RSREG Procedure

    The RSREG procedure fits a quadratic response-surface model, which is usefulin searching for factor values that optimize a response. The following features inPROC RSREG make it preferable to other regression procedures for analyzingresponse surfaces:

    automatic generation of quadratic effects

    a lack-of-fit test

    solutions for critical values of the surface

    eigenvalues of the associated quadratic form

    a ridge analysis to search for the direction of optimum response
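
    To illustrate what "critical values of the surface" and "eigenvalues of the associated quadratic form" refer to, the following Python sketch locates the stationary point of a fitted two-factor quadratic surface and classifies it. It is a hedged illustration, not PROC RSREG output: the function name, the restriction to two factors, and the coefficient values are assumptions made for this example.

```python
import math


def stationary_point(b1, b2, b11, b22, b12):
    """Stationary point of y = b0 + b1*x1 + b2*x2 + b11*x1**2 + b22*x2**2 + b12*x1*x2."""
    # Symmetric matrix B of the quadratic form: the off-diagonal entries
    # take half of the cross-product coefficient.
    h = b12 / 2.0
    det = b11 * b22 - h * h
    if det == 0.0:
        raise ValueError("the surface has no unique stationary point")
    # Setting the gradient b + 2*B*x to zero gives x = -(1/2) * inverse(B) * b.
    x1 = -0.5 * (b22 * b1 - h * b2) / det
    x2 = -0.5 * (b11 * b2 - h * b1) / det
    # Eigenvalues of the symmetric 2 x 2 matrix B in closed form.
    mean = (b11 + b22) / 2.0
    radius = math.hypot((b11 - b22) / 2.0, h)
    eigenvalues = (mean - radius, mean + radius)
    if eigenvalues[1] < 0.0:
        kind = "maximum"
    elif eigenvalues[0] > 0.0:
        kind = "minimum"
    else:
        kind = "saddle point"
    return (x1, x2), eigenvalues, kind


# Hypothetical fitted coefficients, chosen only for illustration.
point, eigenvalues, kind = stationary_point(b1=4.0, b2=6.0, b11=-1.0, b22=-1.5, b12=1.0)
print(point, eigenvalues, kind)
```

    When all eigenvalues are negative the stationary point is a maximum, when all are positive it is a minimum, and mixed signs indicate a saddle point.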

    Partial Least Squares Regression: The PLS Procedure

    The PLS procedure fits models using any one of a number of linear predictive methods, including partial least squares (PLS). Ordinary least-squares regression, as implemented in SAS/STAT procedures such as PROC GLM and PROC REG, has the single goal of minimizing sample response prediction error, seeking linear functions of the predictors that explain as much variation in each response as possible. The techniques implemented in the PLS procedure have the additional goal of accounting for variation in the predictors, under the assumption that directions in the predictor space that are well sampled should provide better prediction for new observations when the predictors are highly correlated. All of the techniques implemented in the PLS procedure work by extracting successive linear combinations of the predictors, called factors (also called components or latent vectors), which optimally address one or both of these two goals: explaining response variation and explaining predictor variation. In particular, the method of partial least squares balances the two objectives, seeking factors that explain both response and predictor variation.

    Regression for Ill-conditioned Data: The ORTHOREG Procedure

    The ORTHOREG procedure performs linear least-squares regression using the Gentleman-Givens computational method, and it can produce more accurate parameter estimates for ill-conditioned data. PROC GLM and PROC REG produce very accurate estimates for most problems. However, if you have very ill-conditioned data, consider using the ORTHOREG procedure. The collinearity diagnostics in PROC REG can help you to determine whether PROC ORTHOREG would be useful.
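
    The idea behind a Givens-based method is to rotate the data one observation at a time into a triangular system instead of forming and inverting X'X. The Python sketch below uses ordinary Givens rotations; it is an assumption-laden illustration and not the square-root-free Gentleman variant that PROC ORTHOREG is described as using. The function name and the data are hypothetical, and the design matrix is assumed to have full column rank.

```python
import math


def givens_least_squares(X, y):
    """Least-squares coefficients obtained by row-by-row Givens rotations."""
    k = len(X[0])
    # R becomes the upper-triangular factor; z holds the rotated response.
    R = [[0.0] * k for _ in range(k)]
    z = [0.0] * k
    for row, yi in zip(X, y):
        row = list(row)
        for j in range(k):
            if row[j] == 0.0:
                continue  # nothing to eliminate in this column
            r = math.hypot(R[j][j], row[j])
            c, s = R[j][j] / r, row[j] / r
            # Rotate row j of R against the incoming observation so that
            # the observation's entry in column j becomes zero.
            for m in range(j, k):
                R[j][m], row[m] = (c * R[j][m] + s * row[m],
                                   -s * R[j][m] + c * row[m])
            z[j], yi = c * z[j] + s * yi, -s * z[j] + c * yi
    # Back substitution solves R * b = z (assumes a nonzero diagonal).
    b = [0.0] * k
    for j in reversed(range(k)):
        acc = z[j] - sum(R[j][m] * b[m] for m in range(j + 1, k))
        b[j] = acc / R[j][j]
    return b


# Hypothetical data: intercept column plus one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 4.0, 7.0]
coefficients = givens_least_squares(X, y)
print(coefficients)
```

    Because the cross-product matrix is never formed explicitly, methods of this family tend to lose less precision when the regressors are nearly collinear.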

    Logistic Regression: The LOGISTIC Procedure

    The LOGISTIC procedure fits logistic models, in which the response can be either dichotomous or polychotomous. Stepwise model selection is available. You can request regression diagnostics, and predicted and residual values.

    Regression With Transformations: The TRANSREG Procedure

    The TRANSREG procedure can fit many standard linear models. In addition, PROC TRANSREG can find nonlinear transformations of data and fit a linear model to the transformed variables. This is in contrast to PROC REG and PROC GLM, which fit linear models to data, or PROC NLIN, which fits nonlinear models to data. The TRANSREG procedure fits many types of linear models, including

    ordinary regression and ANOVA

    metric and nonmetric conjoint analysis

    metric and nonmetric vector and ideal point preference mapping

    simple, multiple, and multivariate regression with variable transformations

    redundancy analysis with variable transformations

    canonical correlation analysis with variable transformations

    response surface regression with variable transformations

    Regression Using the GLM, CATMOD, LOGISTIC, PROBIT, and LIFEREG Procedures

    The GLM procedure fits general linear models to data, and it can perform regression, analysis of variance, analysis of covariance, and many other analyses. The following features for regression distinguish PROC GLM from other regression procedures:

    direct specification of polynomial effects

    ease of specifying categorical effects (PROC GLM automatically generates dummy variables for class variables)

    Most of the statistics based on predicted and residual values that are available in PROC REG are also available in PROC GLM. However, PROC GLM does not produce collinearity diagnostics, influence diagnostics, or scatter plots. In addition, PROC GLM allows only one model and fits the full model.

    See Chapter 3, "Introduction to Analysis-of-Variance Procedures," and Chapter 31, "The GLM Procedure," for more details.

    The CATMOD procedure can perform linear regression and logistic regression of response functions for data that can be represented in a contingency table. See Chapter 4, "Introduction to Categorical Data Analysis Procedures," and Chapter 21, "The CATMOD Procedure," for more details.

    The LOGISTIC and PROBIT procedures can perform logistic and ordinal logistic regression. See Chapter 4, "Introduction to Categorical Data Analysis Procedures," Chapter 40, "The LOGISTIC Procedure," and Chapter 57, "The PROBIT Procedure," for additional details.

    The LIFEREG procedure is useful in fitting equations to data that may be right-censored. See Chapter 9, "Introduction to Survival Analysis Procedures," and Chapter 37, "The LIFEREG Procedure," for more details.

    Interactive Features in the CATMOD, GLM, and REG Procedures

    The CATMOD, GLM, and REG procedures do not stop after processing a RUN statement. More statements can be submitted as a continuation of the previous statements. Many new features in these procedures are useful to request after you have reviewed the results from previous statements. The procedures stop if a DATA step or another procedure is requested or if a QUIT statement is submitted.

    Statistical Background

    The rest of this chapter outlines the way many SAS/STAT regression procedures calculate various regression quantities. Exceptions and further details are documented with individual procedures.

    Linear Models

    Parameter Estimates and Associated Statistics

    Comments on Interpreting Regression Statistics

    Predicted and Residual Values

    Testing Linear Hypotheses

    Multivariate Tests

    Linear Models

    In matrix algebra notation, a linear model is written as

    $y = X\beta + \epsilon$

    where $X$ is the $n \times k$ design matrix (rows are observations and columns are the regressors), $\beta$ is the $k \times 1$ vector of unknown parameters, and $\epsilon$ is the $n \times 1$ vector of unknown errors. The first column of $X$ is usually a vector of 1s used in estimating the intercept term.

    The statistical theory of linear models is based on strict classical assumptions. Ideally, the response is measured with all the factors controlled in an experimentally determined environment. If you cannot control the factors experimentally, some tests must be interpreted as being conditional on the observed values of the regressors.

    Other assumptions are that

    the form of the model is correct (all important explanatory variables have been included)

    regressor variables are measured without error

    the expected value of the errors is zero

    the variance of the error (and thus the dependent variable) for the ith observation is $\sigma^2 / w_i$, where $w_i$ is a known weight factor. Usually, $w_i = 1$ for all $i$ and thus $\sigma^2$ is the common, constant variance.

    the errors are uncorrelated across observations

    When hypotheses are tested, the additional assumption is made that the errors are normally distributed.

    Statistical Model

    If the model satisfies all the necessary assumptions, the least-squares estimates are the best linear unbiased estimates (BLUE). In other words, the estimates have minimum variance among the class of estimators that are unbiased and are linear functions of the responses. If the additional assumption that the error term is normally distributed is also satisfied, then

    the statistics that are computed have the proper sampling distributions for hypothesis testing

    parameter estimates are normally distributed

    various sums of squares are distributed proportional to chi-square, at least under proper hypotheses

    ratios of estimates to standard errors are distributed as Student's t under certain hypotheses

    appropriate ratios of sums of squares are distributed as F under certain hypotheses

    When regression analysis is used to model data that do not meet the assumptions, the results should be interpreted in a cautious, exploratory fashion. The significance probabilities under these circumstances are unreliable.

    Box (1966) and Mosteller and Tukey (1977, chaps. 12 and 13) discuss the problems that are encountered with regression data, especially when the data are not under experimental control.

    Parameter Estimates and Associated Statistics

    Parameter estimates are formed using least-squares criteria by solving the normal equations

    $(X'WX) b = X'Wy$

    for the parameter estimates $b$, where $W$ is a diagonal matrix with the observed weights on the diagonal, yielding

    $b = (X'WX)^{-1} X'Wy$

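    Assume for the present that $X'WX$ has full column rank $k$ (this assumption is relaxed later). The variance of the error $\sigma^2$ is estimated by the mean square error

    $s^2 = \mathrm{MSE} = \frac{\mathrm{SSE}}{n - k} = \frac{1}{n - k} \sum_{i=1}^{n} w_i (y_i - x_i b)^2$

    where $x_i$ is the ith row of regressors. The parameter estimates are unbiased:

    $E(b) = \beta, \quad E(s^2) = \sigma^2$

    The covariance matrix of the estimates is

    $\mathrm{VAR}(b) = (X'WX)^{-1} \sigma^2$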

    The estimate of the covariance matrix is obtained by replacing $\sigma^2$ with its estimate, $s^2$, in the preceding formula:

    $\mathrm{COVB} = (X'WX)^{-1} s^2$

    The correlations of the estimates are derived by scaling to 1s on the diagonal. Let

    $S = \mathrm{diag}\left((X'WX)^{-1}\right)^{-1/2}$

    $\mathrm{CORRB} = S (X'WX)^{-1} S$

    Standard errors of the estimates are computed using the equation

    $\mathrm{STDERR}(b_i) = \sqrt{(X'WX)^{-1}_{ii} \, s^2}$

    where $(X'WX)^{-1}_{ii}$ is the ith diagonal element of $(X'WX)^{-1}$. The ratio

    $t = \frac{b_i}{\mathrm{STDERR}(b_i)}$

    is distributed as Student's t under the hypothesis that $\beta_i$ is zero. Regression procedures display the t ratio and the significance probability, which is the probability under the hypothesis $\beta_i = 0$ of a larger absolute t value than was actually obtained. When the probability is less than some small level, the event is considered so unlikely that the hypothesis is rejected.
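
    The chain from the normal equations to the t ratio can be followed numerically. The Python sketch below does this for a model with an intercept and one regressor, so that X'WX is a 2 x 2 matrix that can be inverted by hand. It is a minimal illustration: the function name weighted_fit and the data are assumptions of this example, and it does not reproduce the computational method of any SAS procedure.

```python
import math


def weighted_fit(x, y, w):
    """Weighted least squares for y = b0 + b1*x via the normal equations."""
    n, k = len(x), 2
    # Elements of X'WX and X'Wy for the design matrix X = [1, x].
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx
    # Inverse of the 2 x 2 matrix X'WX (assumes full rank, det != 0).
    inv = [[swxx / det, -swx / det], [-swx / det, sw / det]]
    # b = inverse(X'WX) * X'Wy
    b = [inv[0][0] * swy + inv[0][1] * swxy,
         inv[1][0] * swy + inv[1][1] * swxy]
    # Mean square error: weighted SSE divided by n - k.
    sse = sum(wi * (yi - b[0] - b[1] * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    s2 = sse / (n - k)
    # Standard errors from the diagonal of inverse(X'WX) * s2, then t ratios.
    stderr = [math.sqrt(inv[i][i] * s2) for i in range(k)]
    t = [b[i] / stderr[i] for i in range(k)]
    return b, s2, stderr, t


# Hypothetical data with unit weights (ordinary least squares).
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 7.0]
w = [1.0, 1.0, 1.0, 1.0]
b, s2, stderr, t = weighted_fit(x, y, w)
print(b, s2, stderr, t)
```

    The significance probability would then come from the Student's t distribution with n - k degrees of freedom, which this sketch leaves out.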

    Type I SS and Type II SS measure the contribution of a variable to the reduction in SSE. Type I SS measure the reduction in SSE as that variable is entered into the model in sequence. Type II SS are the increment in SSE that results from removing the variable from the full model. Type II SS are equivalent to the Type III and Type IV SS reported in the GLM procedure. If Type II SS are used in the numerator of an F test, the test is equivalent to the t test for the hypothesis that the parameter is zero. In polynomial models, Type I SS measure the contribution of each polynomial term after it is orthogonalized to the previous terms in the model. The four types of SS are described in Chapter 11, "The Four Types of Estimable Functions."
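
    The stated equivalence between the F test based on Type II SS and the t test can be checked on a small case. In the Python sketch below there is a single regressor entered after the intercept, so its Type I and Type II SS coincide and both equal the reduction in SSE from the intercept-only model. The function name and the data are hypothetical and serve only as an illustration.

```python
def simple_regression_ss(x, y):
    """SS for x, its F value, and the t ratio in the model y = b0 + b1*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # SSE of the intercept-only model and of the full model.
    sse_reduced = sum((yi - ybar) ** 2 for yi in y)
    sse_full = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    ss_x = sse_reduced - sse_full        # reduction in SSE due to x
    mse = sse_full / (n - 2)
    f_value = ss_x / mse                 # one numerator degree of freedom
    t_value = b1 / (mse / sxx) ** 0.5    # estimate divided by its standard error
    return ss_x, f_value, t_value


ss_x, f_value, t_value = simple_regression_ss([0.0, 1.0, 2.0, 3.0],
                                              [1.0, 3.0, 4.0, 7.0])
print(ss_x, f_value, t_value ** 2)
```

    The F value and the squared t ratio agree up to rounding, which is the equivalence described in the paragraph.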

    Standardized estimates are defined as the estimates that result when all variables are standardized to a mean of 0 and a variance of 1. Standardized estimates are computed by multiplying the original estimates by the sample standard deviation of the regressor variable and dividing by the sample standard deviation of the dependent variable.
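
    As a brief, hedged illustration of this rescaling, the Python sketch below computes the standardized slope for a one-regressor model. The function name and the data are assumptions of the example. With a single regressor, the standardized slope equals the sample correlation between the regressor and the response, which provides a convenient check.

```python
import statistics


def standardized_estimate(b, x, y):
    """Rescale an estimate by the sample standard deviations of x and y."""
    return b * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical data; the slope is the ordinary least-squares estimate.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 7.0]
xbar, ybar = statistics.mean(x), statistics.mean(y)
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
slope = sxy / sxx
std_slope = standardized_estimate(slope, x, y)
correlation = sxy / (sxx * syy) ** 0.5
print(slope, std_slope, correlation)
```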

    $R^2$ is an indicator of how much of the variation in the data is explained by the model. It is defined as

    $R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{TSS}}$

    where SSE is the sum of squares for error and TSS is the corrected total sum of squares. The Adjusted $R^2$ statistic is an alternative to $R^2$ that is adjusted for the number of parameters in the model. This is calculated as

    $\mathrm{ADJRSQ} = 1 - \frac{n - i}{n - p} (1 - R^2)$

    where $n$ is the number of observations used to fit the model, $p$ is the number of parameters in the model (including the intercept), and $i$ is 1 if the model includes an intercept term, and 0 otherwise.
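
    Both statistics are simple enough to compute directly from the sums of squares. The Python sketch below is a minimal illustration; the function names and the numeric inputs are hypothetical and chosen only for this example.

```python
def r_square(sse, tss):
    """R-square from the error and corrected total sums of squares."""
    return 1.0 - sse / tss


def adjusted_r_square(r2, n, p, intercept=True):
    """Adjusted R-square for n observations and p parameters."""
    i = 1 if intercept else 0
    return 1.0 - (n - i) / (n - p) * (1.0 - r2)


# Hypothetical values: 4 observations, intercept plus one regressor.
r2 = r_square(sse=0.70, tss=18.75)
adj_r2 = adjusted_r_square(r2, n=4, p=2)
print(r2, adj_r2)
```

    The adjusted statistic is never larger than the unadjusted one, because the factor (n - i)/(n - p) is at least 1 whenever the model has at least as many parameters as the intercept indicator counts.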

    Tolerances and variance inflation factors measure the strength of interrelationships among the regressor variables in the model. If all variables are orthogonal to each other, both tolerance and variance inflation are 1. If a variable is very closely related to other variables, the tolerance goes to 0 and the variance inflation gets very large. Tolerance (TOL) is 1 minus the $R^2$ that results from the regression of that regressor on the other variables in the model. Variance inflation (VIF) is the diagonal of $(X'WX)^{-1}$ if $(X'WX)$ is scaled to correlation form. The statistics are related as

    $$\mathrm{VIF} = \frac{1}{\mathrm{TOL}}$$
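As a rough illustration of the relation between tolerance and variance inflation, the sketch below treats the special case of exactly two regressors, where the $R^2$ from regressing one on the other is the squared correlation between them. The function names and data are hypothetical, and this shortcut is an assumption that holds only for two regressors.

```python
import math


def correlation(x, z):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz / math.sqrt(sxx * szz)


def tolerance_and_vif(x, z):
    """TOL = 1 - R^2 (here R^2 = r^2 because there are only two regressors); VIF = 1/TOL."""
    tol = 1.0 - correlation(x, z) ** 2
    return tol, 1.0 / tol


x1 = [1.0, 2.0, 3.0, 4.0]
x2_orth = [1.0, -1.0, -1.0, 1.0]    # uncorrelated with x1
x2_close = [1.1, 1.9, 3.2, 3.8]     # nearly a copy of x1

tol_orth, vif_orth = tolerance_and_vif(x1, x2_orth)
tol_close, vif_close = tolerance_and_vif(x1, x2_close)
print(tol_orth, vif_orth)
print(tol_close, vif_close)
```

The orthogonal pair gives tolerance and variance inflation of 1, while the nearly collinear pair gives a tolerance close to 0 and a large variance inflation.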

    Models Not of Full Rank

    If the model is not full rank, then a generalized inverse can be used to solve the normal equations to minimize the SSE:

    $$b = (X'WX)^{-} X'Wy$$

    However, these estimates are not unique since there are an infinite number of solutions using different generalized inverses. PROC REG and other regression procedures choose a nonzero solution for all variables that are linearly independent of previous variables and a zero solution for other variables. This corresponds to using a generalized inverse in the normal equations, and the expected values of the estimates are the Hermite normal form of $X'WX$ multiplied by the true parameters:

    $$E(b) = (X'WX)^{-}(X'WX)\,\beta$$

    Degrees of freedom for the zeroed estimates are reported as zero. The hypotheses that are not testable have t tests displayed as missing. The message that the model is not full rank includes a display of the relations that exist in the matrix.
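The non-uniqueness described above can be seen in a small unweighted example. This is a sketch in Python with hypothetical data, in which the second regressor is exactly twice the first, so the model is not of full rank. The first solution mimics the convention of giving a zero estimate to a variable that is linearly dependent on previous variables; the other two are alternative solutions that fit equally well.

```python
def sse(y, x1, x2, b0, b1, b2):
    """Error sum of squares for the model y = b0 + b1*x1 + b2*x2."""
    return sum((yi - (b0 + b1 * a + b2 * c)) ** 2 for yi, a, c in zip(y, x1, x2))


x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0 * v for v in x1]          # exact linear function of x1
y = [3.1, 4.9, 7.2, 8.8]

# Least-squares fit of y on x1 alone.
n = len(y)
mean_x, mean_y = sum(x1) / n, sum(y) / n
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x1, y))
         / sum((a - mean_x) ** 2 for a in x1))
intercept = mean_y - slope * mean_x

solutions = [
    (intercept, slope, 0.0),          # zero estimate for the dependent column
    (intercept, 0.0, slope / 2.0),    # all weight moved to x2
    (intercept, slope - 2.0, 1.0),    # one of infinitely many mixtures
]
sse_values = [sse(y, x1, x2, *b) for b in solutions]
print(sse_values)
```

All three coefficient vectors minimize the SSE, which is why individual estimates in a model that is not of full rank cannot be interpreted without knowing which solution was chosen.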


    Copyright 2002 by SAS Institute Inc., Cary, NC, USA. All rights reserved.


    Introduction to Regression Procedures

    Comments on Interpreting Regression Statistics

    In most applications, regression models are merely useful approximations. Reality is often so complicated that you cannot know what the true model is. You may have to choose a model more on the basis of what variables can be measured and what kinds of models can be estimated than on a rigorous theory that explains how the universe really works. However, even in cases where theory is lacking, a regression model may be an excellent predictor of the response if the model is carefully formulated from a large sample. The interpretation of statistics such as parameter estimates may nevertheless be highly problematic.

    Statisticians usually use the word "prediction" in a technical sense. Prediction in this sense does not refer to "predicting the future" (statisticians call that forecasting) but rather to guessing the response from the values of the regressors in an observation taken under the same circumstances as the sample from which the regression equation was estimated. If you developed a regression model for predicting consumer preferences in 1958, it may not give very good predictions in 1988 no matter how well it did in 1958. If it is the future you want to predict, your model must include whatever relevant factors may change over time. If the process you are studying does in fact change over time, you must take observations at several, perhaps many, different times. Analysis of such data is the province of SAS/ETS procedures such as AUTOREG and STATESPACE. Refer to the SAS/ETS User's Guide for more information on these procedures.

    The comments in the rest of this section are directed toward linear least-squares regression. Nonlinear regression and non-least-squares regression often introduce further complications. For more detailed discussions of the interpretation of regression statistics, see Darlington (1968), Mosteller and Tukey (1977), Weisberg (1985), and Younger (1979).

    Interpreting Parameter Estimates from a Controlled Experiment

    Parameter estimates are easiest to interpret in a controlled experiment in which the regressors are manipulated independently of each other. In a well-designed experiment, such as a randomized factorial design with replications in each cell, you can use lack-of-fit tests and estimates of the standard error of prediction to determine whether the model describes the experimental process with adequate precision. If so, a regression coefficient estimates the amount by which the mean response changes when the regressor is changed by one unit while all the other regressors are unchanged. However, if the model involves interactions or polynomial terms, it may not be possible to interpret individual regression coefficients. For example, if the equation includes both linear and quadratic terms for a given variable, you cannot physically change the value of the linear term without also changing the value of the quadratic term. Sometimes it may be possible to recode the regressors, for example by using orthogonal polynomials, to make the interpretation easier.

    If the nonstatistical aspects of the experiment are also treated with sufficient care (including such things as use of placebos and double blinds), then you can state conclusions in causal terms; that is, this change in a regressor causes that change in the response. Causality can never be inferred from statistical results alone or from an observational study.

    If the model that you fit is not the true model, then the parameter estimates may depend strongly on the particular values of the regressors used in the experiment. For example, if the response is actually a quadratic function of a regressor but you fit a linear function, the estimated slope may be a large negative value if you use only small values of the regressor, a large positive value if you use only large values of the regressor, or near zero if you use both large and small regressor values. When you report the results of an experiment, it is important to include the values of the regressors. It is also important to avoid extrapolating the regression equation outside the range of regressors in the sample.
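The quadratic example in the preceding paragraph can be reproduced numerically. The sketch below assumes a hypothetical noise-free response $(x - 5)^2$ and fits a straight line by least squares over three different sets of regressor values; the function names and values are illustrative only.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def response(x):
    """Hypothetical true response: quadratic with its minimum at x = 5."""
    return (x - 5.0) ** 2


small = [0.0, 1.0, 2.0, 3.0]
large = [7.0, 8.0, 9.0, 10.0]
both = small + large

slope_small = ols_slope(small, [response(v) for v in small])
slope_large = ols_slope(large, [response(v) for v in large])
slope_both = ols_slope(both, [response(v) for v in both])
print(slope_small, slope_large, slope_both)
```

The same underlying process yields a steep negative slope, a steep positive slope, or a slope of zero, depending only on which regressor values were included.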

    Interpreting Parameter Estimates from an Observational Study

    In an observational study, parameter estimates can be interpreted as the expected difference in response of two observations that differ by one unit on the regressor in question and that have the same values for all other regressors. You cannot make inferences about "changes" in an observational study since you have not actually changed anything. It may not be possible even in principle to change one regressor independently of all the others. Neither can you draw conclusions about causality without experimental manipulation.

    If you conduct an observational study and if you do not know the true form of the model, interpretation of parameter estimates becomes even more convoluted. A coefficient must then be interpreted as an average over the sampled population of expected differences in response of observations that differ by one unit on only one regressor. The considerations that are discussed under controlled experiments for which the true model is not known also apply.


    Comparing Parameter Estimates

    Two coefficients in the same model can be directly compared only if the regressors are measured in the same units. You can make any coefficient large or small just by changing the units. If you convert a regressor from feet to miles, the parameter estimate is multiplied by 5280.
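The effect of changing units can be verified directly. In this sketch (hypothetical data and function name), the same distances are expressed once in feet and once in miles, and the slope is estimated by least squares in each case.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


FEET_PER_MILE = 5280.0

x_feet = [5280.0, 10560.0, 15840.0, 21120.0]     # hypothetical distances
x_miles = [f / FEET_PER_MILE for f in x_feet]    # the same distances in miles
y = [2.1, 3.9, 6.2, 7.8]

slope_feet = ols_slope(x_feet, y)      # response units per foot
slope_miles = ols_slope(x_miles, y)    # response units per mile
print(slope_feet, slope_miles, slope_miles / slope_feet)
```

The fitted line and its predictions are identical in both cases; only the numerical size of the coefficient changes, by the factor 5280.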

    Sometimes standardized regression coefficients are used to compare the effects of regressors measured in different units. Standardizing the variables effectively makes the standard deviation the unit of measurement. This makes sense only if the standard deviation is a meaningful quantity, which usually is the case only if the observations are sampled from a well-defined population. In a controlled experiment, the standard deviation of a regressor depends on the values of the regressor selected by the experimenter. Thus, you can make a standardized regression coefficient large by using a large range of values for the regressor.

    In some applications you may be able to compare regression coefficients in terms of the practical range of variation of a regressor. Suppose that each independent variable in an industrial process can be set to values only within a certain range. You can rescale the variables so that the smallest possible value is zero and the largest possible value is one. Then the unit of measurement for each regressor is the maximum possible range of the regressor, and the parameter estimates are comparable in that sense. Another possibility is to scale the regressors in terms of the cost of setting a regressor to a particular value, so comparisons can be made in monetary terms.

    Correlated Regressors

    In an experiment, you can often select values for the regressors such that the regressors are orthogonal (not correlated with each other). Orthogonal designs have enormous advantages in interpretation. With orthogonal regressors, the parameter estimate for a given regressor does not depend on which other regressors are included in the model, although other statistics such as standard errors and p-values may change.

    If the regressors are correlated, it becomes difficult to disentangle the effects of one regressor from another, and the parameter estimates may be highly dependent on which regressors are used in the model. Two correlated regressors may be nonsignificant when tested separately but highly significant when considered together. If two regressors have a correlation of 1.0, it is impossible to separate their effects.

    It may be possible to recode correlated regressors to make interpretation easier. For example, if X and Y are highly correlated, they could be replaced in a linear regression by X+Y and X-Y without changing the fit of the model or statistics for other regressors.
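The reason the recoding leaves the fit unchanged is that any linear combination of X and Y can be rewritten as a linear combination of the sum and the difference. The identity below is a short worked illustration; the coefficient symbols $b_1$, $b_2$, $g_1$, and $g_2$ are introduced here for the example and do not appear in the original text.

```latex
\begin{aligned}
b_1 X + b_2 Y
  &= \frac{b_1 + b_2}{2}\,(X + Y) + \frac{b_1 - b_2}{2}\,(X - Y) \\
  &= g_1\,(X + Y) + g_2\,(X - Y),
\qquad g_1 = \frac{b_1 + b_2}{2}, \quad g_2 = \frac{b_1 - b_2}{2}.
\end{aligned}
```

Because the mapping between $(b_1, b_2)$ and $(g_1, g_2)$ is one-to-one, both parameterizations span the same set of fitted values, so the predicted values and the SSE are the same in either form.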


    Errors in the Regressors

    If there is error in the measurements of the regressors, the parameter estimates must be interpreted with respect to the measured values of the regressors, not the true values. A regressor may be statistically nonsignificant when measured with error even though it would have been highly significant if measured accurately.

    Probability Values (p-values)

    Probability values (p-values) do not necessarily measure the importance of a regressor. An important regressor can have a large (nonsignificant) p-value if the sample is small, if the regressor is measured over a narrow range, if there are large measurement errors, or if another closely related regressor is included in the equation. An unimportant regressor can have a very small p-value in a large sample. Computing a confidence interval for a parameter estimate gives you more useful information than just looking at the p-value, but confidence intervals do not solve problems of measurement errors in the regressors or highly correlated regressors.

    The p-values are always approximations. The assumptions required to compute exact p-values are never satisfied in practice.

    Interpreting R2

    $R^2$ is usually defined as the proportion of variance of the response that is predictable from (that can be explained by) the regressor variables. It may be easier to interpret $\sqrt{1 - R^2}$, which is approximately the factor by which the standard error of prediction is reduced by the introduction of the regressor variables.

    $R^2$ is easiest to interpret when the observations, including the values of both the regressors and response, are randomly sampled from a well-defined population. Nonrandom sampling can greatly distort $R^2$. For example, excessively large values of $R^2$ can be obtained by omitting from the sample observations with regressor values near the mean.

    In a controlled experiment, $R^2$ depends on the values chosen for the regressors. A wide range of regressor values generally yields a larger $R^2$ than a narrow range. In comparing the results of two experiments on the same variables but with different ranges for the regressors, you should look at the standard error of prediction (root mean square error) rather than $R^2$.
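A small constructed example shows why the root mean square error is the better basis for comparison. In the sketch below, two hypothetical experiments use the same true slope and the same error values, but one spreads the regressor over a range ten times wider. The error pattern is chosen (as an assumption for illustration) to be exactly uncorrelated with the regressor, so both fits recover the same line.

```python
import math


def fit_summary(x, y):
    """Fit y = b0 + b1*x by least squares; return (R^2, root mean square error)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    b0 = my - b1 * mx
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    tss = sum((b - my) ** 2 for b in y)
    return 1.0 - sse / tss, math.sqrt(sse / (n - 2))


pattern = [-3.0, -1.0, 1.0, 3.0]    # design points around a center of 5
errors = [1.0, -1.0, -1.0, 1.0]     # fixed errors, uncorrelated with the pattern


def experiment(spread):
    """Same process and errors; only the spread of the regressor differs."""
    x = [5.0 + spread * d for d in pattern]
    y = [2.0 * a + e for a, e in zip(x, errors)]
    return fit_summary(x, y)


r2_narrow, rmse_narrow = experiment(0.1)
r2_wide, rmse_wide = experiment(1.0)
print(r2_narrow, rmse_narrow)
print(r2_wide, rmse_wide)
```

The two experiments have the same root mean square error, yet the wide-range experiment reports a far larger $R^2$ purely because of the design.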

    Whether a given $R^2$ value is considered to be large or small depends on the context of the particular study. A social scientist might consider an $R^2$ of 0.30 to be large, while a physicist might consider 0.98 to be small.


    You can always get an $R^2$ arbitrarily close to 1.0 by including a large number of completely unrelated regressors in the equation. If the number of regressors is close to the sample size, $R^2$ is very biased. In such cases, the adjusted $R^2$ and related statistics discussed by Darlington (1968) are less misleading.

    If you fit many different models and choose the model with the largest $R^2$, all the statistics are biased and the p-values for the parameter estimates are not valid.

    Caution must be taken with the interpretation of $R^2$ for models with no intercept term. As a general rule, no-intercept models should be fit only when theoretical justification exists and the data appear to fit a no-intercept framework. The $R^2$ in those cases is measuring something different (refer to Kvalseth 1985).

    Incorrect Data Values

    All regression statistics can be seriously distorted by a single incorrect data value. A decimal point in the wrong place can completely change the parameter estimates, $R^2$, and other statistics. It is important to check your data for outliers and influential observations. The diagnostics in PROC REG are particularly useful in this regard.
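The following sketch illustrates how much damage one misplaced decimal point can do. The data are hypothetical: the last response is recorded as 101.0 instead of 10.1, and the least-squares slope is computed for both versions.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_clean = [2.0, 4.1, 5.9, 8.0, 10.1]
y_typo = [2.0, 4.1, 5.9, 8.0, 101.0]   # decimal point misplaced in the last value

slope_clean = ols_slope(x, y_clean)
slope_typo = ols_slope(x, y_typo)
print(slope_clean, slope_typo)
```

A single bad value at the edge of the regressor range changes the slope by roughly a factor of ten here, which is why outlier and influence diagnostics are worth running before interpreting any estimate.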


    Predicted and Residual Values

    After the model has been fit, predicted and residual values are usually calculated and output. The predicted values are calculated from the estimated regression equation; the residuals are calculated as actual minus predicted. Some procedures can calculate standard errors of residuals, predicted mean values, and individual predicted values.

    Consider the ith observation where $x_i$ is the row of regressors, $b$ is the vector of parameter estimates, and $s^2$ is the mean squared error.

    Let

    $$h_i = w_i\, x_i (X'WX)^{-} x_i'$$

    where $X$ is the design matrix for the observed data, $x_i$ is an arbitrary regressor vector (possibly but not necessarily a row of $X$), $W$ is a diagonal matrix with the observed weights on the diagonal, and $w_i$ is the weight corresponding to $x_i$.

    Then the predicted mean value and its standard error are

    $$\hat{y}_i = x_i b$$

    $$\mathrm{STDERR}(\hat{y}_i) = \sqrt{h_i s^2 / w_i}$$

    The standard error of the individual (future) predicted value $y_i$ is

    $$\mathrm{STDERR}(y_i) = \sqrt{(1 + h_i)\, s^2 / w_i}$$

    If the predictor vector $x_i$ corresponds to an observation in the analysis data, then the residual for that observation and its standard error are defined as

    $$\mathrm{RESID}_i = y_i - x_i b$$

    $$\mathrm{STDERR}(\mathrm{RESID}_i) = \sqrt{(1 - h_i)\, s^2 / w_i}$$

    The ratio of the residual to its standard error, called the studentized residual, is sometimes shown as

    $$\mathrm{STUDENT}_i = \frac{\mathrm{RESID}_i}{\mathrm{STDERR}(\mathrm{RESID}_i)}$$

    There are two kinds of confidence intervals for predicted values. One type of confidence interval is an interval for the mean value of the response. The other type, sometimes called a prediction or forecasting interval, is an interval for the actual value of a response, which is the mean value plus error.

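    For example, you can construct for the ith observation a confidence interval that contains the true mean value of the response with probability $1 - \alpha$. The upper and lower limits of the confidence interval for the mean value are

    $$\hat{y}_i \pm t_{\alpha/2}\, \mathrm{STDERR}(\hat{y}_i)$$

    where $\hat{y}_i$ is the predicted value, $\mathrm{STDERR}(\hat{y}_i)$ is the standard error of the predicted mean, and $t_{\alpha/2}$ is the tabulated t statistic with degrees of freedom equal to the degrees of freedom for the mean squared error.

    The limits for the confidence interval for an actual individual response are

    $$\hat{y}_i \pm t_{\alpha/2}\, \mathrm{STDERR}(y_i)$$

    where $\mathrm{STDERR}(y_i)$ is the standard error of the individual predicted value.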


    Influential observations are those that, according to various criteria, appear to have a large influence on the parameter estimates. One measure of influence, Cook's D, measures the change to the estimates that results from deleting each observation:

    $$\mathrm{COOKD}_i = \frac{1}{k}\, \mathrm{STUDENT}_i^2\, \frac{\mathrm{STDERR}(\hat{y}_i)^2}{\mathrm{STDERR}(\mathrm{RESID}_i)^2}$$

    where $k$ is the number of parameters in the model (including the intercept). For more information, refer to Cook (1977, 1979).

    The predicted residual for observation i is defined as the residual for the ith observation that results from dropping the ith observation from the parameter estimates. The sum of squares of predicted residual errors is called the PRESS statistic:

    $$\mathrm{PRESS} = \sum_{i=1}^{n} \left( y_i - x_i b_{(i)} \right)^2$$

    where $b_{(i)}$ denotes the parameter estimates computed without the ith observation.
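The diagnostics in this section can be sketched for the simplest case. The Python code below assumes an unweighted simple linear regression (all weights equal to 1, $k = 2$), for which the leverage of observation i has the standard closed form $1/n + (x_i - \bar{x})^2 / S_{xx}$. The function names and data are hypothetical, and the shortcut "predicted residual = residual / (1 - leverage)" is a standard identity assumed here and checked against a brute-force refit that drops each observation in turn.

```python
import math


def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - b1 * mx, b1


def diagnostics(x, y):
    """Leverage, studentized residuals, Cook's D, and PRESS for a simple regression."""
    n, k = len(x), 2                      # k parameters: intercept and slope
    b0, b1 = fit_line(x, y)
    mx = sum(x) / n
    sxx = sum((a - mx) ** 2 for a in x)
    resid = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - k)          # mean squared error
    lev = [1.0 / n + (a - mx) ** 2 / sxx for a in x]  # leverage h_i
    student = [r / math.sqrt((1.0 - h) * s2) for r, h in zip(resid, lev)]
    cooks_d = [t * t / k * h / (1.0 - h) for t, h in zip(student, lev)]
    press = sum((r / (1.0 - h)) ** 2 for r, h in zip(resid, lev))
    return lev, student, cooks_d, press


def press_by_refitting(x, y):
    """PRESS computed literally: drop each observation, refit, predict it."""
    total = 0.0
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.4]

lev, student, cooks_d, press = diagnostics(x, y)
press_check = press_by_refitting(x, y)
print(press, press_check)
```

The two PRESS values agree, so the statistic can be obtained from a single fit rather than from n separate refits.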


    Testing Linear Hypotheses

    The general form of a linear hypothesis for the parameters is

    $$H_0\colon\ L\beta = c$$

    where $L$ is $q \times k$, $\beta$ is $k \times 1$, and $c$ is $q \times 1$. To test this hypothesis, the linear function is taken with respect to the parameter estimates:

    $$Lb - c$$


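    This has variance

    $$\mathrm{Var}(Lb - c) = L\, \mathrm{Var}(b)\, L' = L (X'WX)^{-} L'\, \sigma^2$$

    where $b$ is the estimate of $\beta$ and $\sigma^2$ is the error variance.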

    A quadratic form called the sum of squares due to the hypothesis is calculated:

    $$\mathrm{SS}(Lb - c) = (Lb - c)'\, \left( L (X'WX)^{-} L' \right)^{-1} (Lb - c)$$

    If you assume that this is testable, the SS can be used as a numerator of the F test:

    $$F = \frac{\mathrm{SS}(Lb - c) / q}{s^2}$$

    This is compared with an F distribution with $q$ and dfe degrees of freedom, where dfe is the degrees of freedom for residual error.
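For a concrete instance of this F test, consider an unweighted simple linear regression and the hypothesis that the slope equals a constant $c$. With $L = (0\ \ 1)$, the quantity $L(X'X)^{-1}L'$ reduces to $1/S_{xx}$, so the sum of squares due to the hypothesis is $(b_1 - c)^2 S_{xx}$. The sketch below uses that reduction; the function name and data are hypothetical.

```python
def f_test_slope(x, y, c=0.0):
    """F statistic for H0: slope = c in the model y = b0 + b1*x (unweighted)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    b0 = my - b1 * mx
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    dfe = n - 2                            # degrees of freedom for residual error
    s2 = sse / dfe                         # mean squared error
    q = 1                                  # one linear restriction
    ss_hyp = (b1 - c) * sxx * (b1 - c)     # (Lb - c)' (L (X'X)^-1 L')^-1 (Lb - c)
    f_value = (ss_hyp / q) / s2
    return f_value, q, dfe, ss_hyp, sse


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.0, 10.1]

f_value, q, dfe, ss_hyp, sse = f_test_slope(x, y, c=0.0)
print(f_value, q, dfe)
```

For $c = 0$ the sum of squares due to the hypothesis equals the reduction in SSE obtained by adding the slope to an intercept-only model, which gives an independent check on the quadratic form.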


    Multivariate Tests

    Multivariate hypotheses involve several dependent variables in the form

    $$H_0\colon\ L\beta M = d$$

    where $L$ is a linear function on the regressor side, $\beta$ is a matrix of parameters, $M$ is a linear function on the dependent side, and $d$ is a matrix of constants. The special case (handled by PROC REG) in which the constants are the same for each dependent variable is written

    $$(L\beta - cj)\, M = 0$$

    where $c$ is a column vector of constants and $j$ is a row vector of 1s. The special case in which the constants are 0 is

    $$L\beta M = 0$$

    These multivariate tests are covered in detail in Morrison (1976); Timm (1975); Mardia, Kent, and Bibby (1979); Bock (1975); and other works cited in Chapter 5, "Introduction to Multivariate Procedures."

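    To test this hypothesis, construct two matrices, $H$ and $E$, that correspond to the numerator and denominator of a univariate F test:

    $$H = M' (LB - cj)' \left( L (X'WX)^{-} L' \right)^{-1} (LB - cj)\, M$$

    $$E = M' \left( Y'WY - B' (X'WX) B \right) M$$

    where $Y$ is the matrix of dependent variables and $B = (X'WX)^{-} X'WY$ is the matrix of parameter estimates.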

    Four test statistics, based on the eigenvalues of $E^{-1}H$ or $(E+H)^{-1}H$, are formed.

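    Let $\lambda_i$ be the ordered eigenvalues of $E^{-1}H$ (if the inverse exists), and let $\xi_i$ be the ordered eigenvalues of $(E+H)^{-1}H$. It happens that $\xi_i = \lambda_i / (1 + \lambda_i)$ and $\lambda_i = \xi_i / (1 - \xi_i)$, and it turns out that $\rho_i = \sqrt{\xi_i}$ is the ith canonical correlation.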
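The relation between the two sets of eigenvalues follows in a few lines. The derivation below is a sketch that assumes $E$ and $E + H$ are invertible; the symbols $\lambda$, $\xi$, and the eigenvector $v$ are notation chosen here for the illustration.

```latex
\begin{aligned}
E^{-1} H v = \lambda v
  &\;\Longrightarrow\; H v = \lambda E v \\
  &\;\Longrightarrow\; (E + H)\, v = (1 + \lambda)\, E v \\
  &\;\Longrightarrow\; H v = \lambda E v = \frac{\lambda}{1 + \lambda}\,(E + H)\, v \\
  &\;\Longrightarrow\; (E + H)^{-1} H v = \frac{\lambda}{1 + \lambda}\, v .
\end{aligned}
```

So each eigenvector of $E^{-1}H$ is also an eigenvector of $(E+H)^{-1}H$ with eigenvalue $\xi = \lambda / (1 + \lambda)$, and solving for $\lambda$ gives $\lambda = \xi / (1 - \xi)$. Because $\lambda \ge 0$, each $\xi$ lies between 0 and 1.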

    Let $p$ be the rank of $(H+E)$, which is less than or equal to the number of columns of $M$. Let $q$ be the rank of $L(X'WX)^{-}L'$. Let $v$ be the