powerpoint presentation · 2017-09-22 · 4-3 hypothesis testing 4-3.1 statistical hypotheses test...

99

Upload: others

Post on 29-Mar-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

  • 4-1 Statistical Inference

    • The field of statistical inference consists of those

    methods used to make decisions or draw

    conclusions about a population.

    •These methods utilize the information contained in

    a random sample from the population in drawing

    conclusions.

  • 4-1 Statistical Inference

  • 4-2 Point Estimation

  • 4-2 Point Estimation

  • 4-2 Point Estimation

  • 4-2 Point Estimation

  • 4-3 Hypothesis Testing

    We like to think of statistical hypothesis testing as the

    data analysis stage of a comparative experiment, in which the engineer is interested, for example, in

    comparing the mean of a population to a specified

    value (e.g. mean pull strength).

    4-3.1 Statistical Hypotheses

  • 4-3 Hypothesis Testing

    4-3.1 Statistical Hypotheses

    Two-sided Alternative Hypothesis

    One-sided Alternative Hypotheses

  • 4-3 Hypothesis Testing

    4-3.1 Statistical Hypotheses

    Test of a Hypothesis

    • A procedure leading to a decision about a particular

    hypothesis

    • Hypothesis-testing procedures rely on using the information

    in a random sample from the population of interest.

    • If this information is consistent with the hypothesis, then we

    will conclude that the hypothesis is true; if this information is

    inconsistent with the hypothesis, we will conclude that the

    hypothesis is false.

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

    Sometimes the type I error probability is called the

    significance level, or the -error, or the size of the test.

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

    Reduce α

    1. push critical regions further

    2. Increasing sample size

    increasing Z value

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

    error when μ =52 and n=16.

    Increasing the sample size results

    in a decrease in β.

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

    error when μ =50.5 and n=10.

    β increases as the true value of μ

    approaches the hypothesized value.

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

  • 4-3 Hypothesis Testing

    4-3.2 Testing Statistical Hypotheses

    • The power is computed as 1 - b, and power can be interpreted as

    the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power

    properties.

    • For example, consider the propellant burning rate problem when

    we are testing H 0 : m = 50 centimeters per second against H 1 : m not equal 50 centimeters per second . Suppose that the true value of the

    mean is m = 52. When n = 10, we found that b = 0.2643, so the power of this test is 1 - b = 1 - 0.2643 = 0.7357 when m = 52.

  • 4-3 Hypothesis Testing

    4-3.3 P-Values in Hypothesis Testing

    P-value carries info about the weight of evidence against H0.

    The smaller the p-value, the greater the evidence against H0.

  • 4-3 Hypothesis Testing

    4-3.4 One-Sided and Two-Sided Hypotheses

    Two-Sided Test:

    One-Sided Tests:

  • 4-3 Hypothesis Testing

    4-3.5 General Procedure for Hypothesis Testing

  • OPTIONS NODATE NONUMBER;

    DATA EX415;

    mu_h0=12; mu_h1=11.25; sigma=0.5; n=4; alpha=0.05;

    xbar=11.7; sem=sigma/sqrt(n); /* sem = mean standard error */

    Z=(xbar-mu_h0)/sem; z_beta= probit(alpha);

    xbar_beta=mu_h0+z_beta*sem;

    z_beta=(xbar_beta-mu_h1)/sem;

    PVAL=PROBNORM(Z);

    beta=1-probnorm(z_beta);

    power=1-beta;

    PROC PRINT;

    VAR PVAL beta power;

    TITLE '4-15 (page 168)';

    RUN; QUIT;

    4-15 (page 168)

    OBS PVAL beta power

    1 0.11507 0.087685 0.91231

    4-3 Hypothesis Testing

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    Assumptions

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.1 Hypothesis Testing on the Mean

    We wish to test:

    The test statistic is:

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.1 Hypothesis Testing on the Mean

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.1 Hypothesis Testing on the Mean

    Reject H0 if the observed value of the test statistic z0 is

    either:

    or

    Fail to reject H0 if

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.1 Hypothesis Testing on the Mean

  • 4-4 Inference on the Mean of a Population,

    Variance KnownEx. 4-3

    X – burning rate

    μ = 50, s = 2 a = 0.05, n = 25, ҧ𝑥 = 51.3

    i) Hypothesis

    H0: μ = 50

    H1: μ ≠ 50

    ii) Test Statistic

    z0 =ഥ𝑋 − 𝜇0

    ൗ𝜎

    𝑛

    =51.3 −50

    ൗ2 25= 1.3/0.4 = 3.25

    iii) p-value = 2[1 – Φ(3.25)] = 0.0012

    Critical Region (CR) 또는 Region of Rejection (ROR)z0 > z0.025= 1.96 or z0< - z0.025= -1.96

    iv) Conclusion

    p-value=0.0012 값이유의수준 a = 0.05보다작기때문에(또는, 통계량 z0=3.25 값이기각역 (z0 = z0.025> 1.96) 에포함됨으로) H0를기각함. 즉,평균 burning rate이 50m/sec 이아니다라고결론내림.

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    OPTIONS NODATE NONUMBER;

    DATA EX43;

    MU=50; SD=2; N=25; ALPHA=0.05; XBAR=51.3;

    SEM=SD/SQRT(N); /*MEAN STANDARD ERROR*/

    Z=(XBAR-MU)/SEM;

    PVAL=2*(1-PROBNORM(Z));

    PROC PRINT;

    VAR Z PVAL ALPHA;

    TITLE 'TEST OF MU=50 VS NOT=50';

    RUN; QUIT;

    TEST OF MU=50 VS NOT=50

    OBS Z PVAL ALPHA

    1 3.25 .001154050 0.05

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.2 Type II Error and Choice of Sample Size

    Finding The Probability of Type II Error b

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.2 Type II Error and Choice of Sample Size

    Finding The Probability of Type II Error b

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.2 Type II Error and Choice of Sample Size

    Sample Size Formulas

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.2 Type II Error and Choice of Sample Size

    Sample Size Formulas

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.2 Type II Error and Choice of Sample Size

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.2 Type II Error and Choice of Sample Size

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.3 Large Sample Test

    In general, if n 30, the sample variance s2

    will be close to σ2 for most samples, and so s

    can be substituted for σ in the test procedures

    with little harmful effect.

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.4 Some Practical Comments on Hypothesis

    Testing

    The Seven-Step Procedure

    Only three steps are really required:

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.4 Some Practical Comments on Hypothesis

    Testing

    Statistical versus Practical Significance

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.4 Some Practical Comments on Hypothesis

    Testing

    Statistical versus Practical Significance

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

    Two-sided confidence interval:

    One-sided confidence intervals:

    Confidence coefficient:

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.6 Confidence Interval on the Mean

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

    Relationship between Tests of Hypotheses and

    Confidence Intervals

    If [l,u] is a 100(1 - ) percent confidence interval for the

    parameter, then the test of significance level of the

    hypothesis

    will lead to rejection of H0 if and only if the hypothesized

    value is not in the 100(1 - ) percent confidence interval

    [l, u].

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

    Confidence Level and Precision of Estimation

    The length of the two-sided 95% confidence interval is

    whereas the length of the two-sided 99% confidence

    interval is

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

    Choice of Sample Size

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

    Choice of Sample Size

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

    Choice of Sample Size

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.5 Confidence Interval on the Mean

    One-Sided Confidence Bounds

  • 4-4 Inference on the Mean of a Population,

    Variance Known

    4-4.6 General Method for Deriving a Confidence

    Interval

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.1 Hypothesis Testing on the Mean

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.1 Hypothesis Testing on the Mean

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.1 Hypothesis Testing on the Mean

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.1 Hypothesis Testing on the Mean

    Calculating the P-value

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.1 Hypothesis Testing on the Mean

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.1 Hypothesis Testing on the Mean

    Fixed Significance Level Approach

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.1 Hypothesis Testing on the Mean

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    Ex. 4-7ത𝑋 – mean coefficient of restitution n = 15

    i) Hypothesis

    H0: μ = 0.82

    H1: μ > 0.82

    ii) Normality Check!

    iii) Test Statistic

    t0 =ഥ𝑋 − 𝜇0

    ൗ𝑠

    𝑛

    =0.83725 −0.82

    ൗ0.02456 15= 2.72

    iv) Fixed significance level approach

    Critical Region (CR) 또는 Region of Rejection (ROR): t0 > t0.05= 1.761P-value approach

    t0= 2.72는 t0.01;14= 2.624 값과 t0.005;14= 2.977 사이에있으므로0.005

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    OPTIONS NODATE NONUMBER;

    DATA ex47;

    xbar=0.83725; sd=0.02456;

    mu=0.82; n=15;alpha=0.05; df=n-1;

    t0=(xbar-mu)/(sd/sqrt(n));

    pval = 1- PROBT(t0, df); /* PROBT function returns the probability from a t distribution

    less than or equal to x */

    cv=tinv(1-alpha, df); /* critical value */

    /* TINV function returns a quantile from the t distribution */

    PROC PRINT;

    var t0 pval cv;

    title 'Test of mu=0.82 vs mu>0.82';

    title2 'Ex. 4-7 in pp193';

    RUN; QUIT;

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    OPTIONS NOOVP NODATE NONUMBER;

    DATA ex47;

    input coeffs @@;

    cards;

    0.8411 0.8191 0.8182 0.8125 0.8750

    0.8580 0.8532 0.8483 0.8276 0.7983

    0.8042 0.8730 0.8282 0.8359 0.8660

    proc univariate data=ex47 mu0=0.82 plot normal;

    var coeffs;

    title 'using proc uinivariate for t-test for one population mean';

    title2 “Example 4-7 in page 191”;

    RUN; QUIT;

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    OPTIONS NOOVP NODATE NONUMBER LS=80;

    DATA ex47;

    input coeffs @@;

    cards;

    0.8411 0.8191 0.8182 0.8125 0.8750

    0.8580 0.8532 0.8483 0.8276 0.7983

    0.8042 0.8730 0.8282 0.8359 0.8660

    ods graphics on;

    proc ttest data=ex47 h0=0.82 sides=u;

    /* h0는귀무가설, sides는양측일경우 2, lower 단측검정일경우 L, upper 단측검정일경우 u*/var coeffs;

    title 'using t-test for one population mean';

    title “Example 4-7 in page 191”;

    RUN; ods graphics off;

    QUIT;

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.2 Type II Error and Choice of Sample Size

    Fortunately, this unpleasant task has already been done,

    and the results are summarized in a series of graphs in

    Appendix A Charts Va, Vb, Vc, and Vd that plot for the

    t-test against a parameter d for various sample sizes n.

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.2 Type II Error and Choice of Sample Size

    These graphics are called operating characteristic

    (or OC) curves. Curves are provided for two-sided

    alternatives on Charts Va and Vb. The abscissa scale

    factor d on these charts is defined as

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.2 Type II Error and Choice of Sample Size

    Chart V (c) in pp. 496

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.3 Confidence Interval on the Mean

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.3 Confidence Interval on the Mean

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    4-5.4 Confidence Interval on the Mean

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

    OPTIONS NOOVP NODATE NONUMBER LS=80;

    DATA ex60;

    input diam @@;

    cards;

    8.23 8.31 8.42 8.29 8.19 8.24

    8.19 8.29 8.30 8.14 8.32 8.40

    ods graphics on;

    proc univariate plot normal mu0=8.2;

    var diam;

    title 'univariate for t-test with H0: mu=8.2';

    proc ttest data=ex60 h0=8.2;

    var diam;

    title 't-test for one population mean';

    title “Example 4-60 in page 198”;

    RUN; ods graphics off;

    QUIT;

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

  • 4-5 Inference on the Mean of a Population,

    Variance Unknown

  • 4-6 Inference on the Variance of a

    Normal Population

    4-6.1 Hypothesis Testing on the Variance of a

    Normal Population

  • 4-6 Inference on the Variance of a

    Normal Population

    4-6.1 Hypothesis Testing on the Variance of a

    Normal Population

  • 4-6 Inference on the Variance of a

    Normal Population

    4-6.1 Hypothesis Testing on the Variance of a

    Normal Population

  • 4-6 Inference on the Variance of a

    Normal Population

    4-6.1 Hypothesis Testing on the Variance of a

    Normal Population

  • 4-6 Inference on the Variance of a

    Normal Population

    4-6.1 Hypothesis Testing on the Variance of a

    Normal Population

  • 4-6 Inference on the Variance of a

    Normal Population

    4-6.1 Hypothesis Testing on the Variance of a

    Normal Population

  • 4-6 Inference on the Variance of a

    Normal Population

    4-6.2 Confidence Interval on the Variance of a

    Normal Population

  • 4-6 Inference on the Variance of a

    Normal Population

    OPTIONS NODATE NONUMBER;

    DATA ex410;

    p_var=0.01; s_var=0.0153;

    n=20;alpha=0.05; df=n-1;

    chisq=(n-1)*s_var/p_var;

    p_value=1-probchi(chisq, df);

    cv1=cinv(1-alpha, df); /* critical value */

    cv2=cinv(alpha, df);

    lcl=sqrt((n-1)*s_var/cv1); /* ucl= upper confidence limit */

    PROC PRINT;

    var chisq p_value cv1 cv2 lcl;

    title 'Test of H0: var=0.01 vs H0: var>0.01';

    title2 'Ex. 4-10 in pp202';

    RUN; QUIT;

  • 4-7 Inference on Population Proportion

    4-7.1 Hypothesis Testing on a Binomial Proportion

    We will consider testing:

  • 4-7 Inference on Population Proportion

    4-7.1 Hypothesis Testing on a Binomial Proportion

  • 4-7 Inference on Population Proportion

    4-7.1 Hypothesis Testing on a Binomial Proportion

  • 4-7 Inference on Population Proportion

    4-7.1 Hypothesis Testing on a Binomial Proportion

  • 4-7 Inference on Population Proportion

    4-7.2 Type II Error and Choice of Sample Size

  • 4-7 Inference on Population Proportion

    4-7.2 Type II Error and Choice of Sample Size

  • 4-7 Inference on Population Proportion

    4-7.3 Confidence Interval on a Binomial Proportion

  • 4-7 Inference on Population Proportion

    4-7.3 Confidence Interval on a Binomial Proportion

  • 4-7 Inference on Population Proportion

    4-7.3 Confidence Interval on a Binomial Proportion

    Choice of Sample Size

  • 4-8 Other Interval Estimates for a

    Single Sample

    4-8.1 Prediction Interval

  • 4-8 Other Interval Estimates for a

    Single Sample

    4-8.2 Tolerance Intervals for a Normal Distribution

  • 4-10 Testing for Goodness of Fit

    • So far, we have assumed the population or probability

    distribution for a particular problem is known.

    • There are many instances where the underlying

    distribution is not known, and we wish to test a particular

    distribution.

    • Use a goodness-of-fit test procedure based on the chi-

    square distribution.