1 prof. indrajit mukherjee, school of management, iit bombay supplies the data to confirm a...


Upload: kathryn-merritt

Post on 06-Jan-2018


Prof. Indrajit Mukherjee, School of Management, IIT Bombay

Scatter Diagram

• Supplies the data to confirm a hypothesis that two variables are related
• Provides both a visual and statistical means to test the strength of a relationship
• Provides a good follow-up to cause and effect diagrams


Mathematical Model Driven Quality Decisions

Y = F(X), where X = X1, . . ., XN

X:
• Independent variables
• Inputs and in-process variables
• Cause
• Problem
• Control
• Input conditions

Y:
• Dependent variable(s)
• Output(s)
• Effect(s)
• Symptom
• Monitor
• Response


Scatter Plot

[Figure: scatter plot with the dependent variable (output) Y on the vertical axis and the independent variable (inputs) X on the horizontal axis]


Scatter Diagram Example

Volume per day    Cost per day
23                125
26                140
29                146
33                160
38                167
42                170
50                188
55                195
60                200
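As a rough check of how strongly the two columns move together, the Pearson correlation coefficient can be computed directly from the nine pairs in the table. This is a minimal sketch; the helper name `pearson_r` is illustrative and not from the slides.

```python
from math import sqrt

# Data copied from the scatter diagram example table
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]           # volume per day
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]    # cost per day


def pearson_r(x, y):
    """Pearson correlation coefficient from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(volume, cost)
print(f"r = {r:.3f}")
```

A value close to +1 indicates a strong positive linear relationship between volume and cost.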


Scatter Plot Examples

[Figure: example scatter plots of Y against X showing linear relationships and curvilinear relationships]


Scatter Plot Examples (Continued)

[Figure: example scatter plots of Y against X showing strong relationships and weak relationships]


Scatter Plot Examples (Continued)

[Figure: example scatter plots of Y against X showing no relationship]


Types of Correlation

[Figure: three scatter plot panels: Positive Correlation, Negative Correlation, No Correlation]


A Nonlinear Relationship for Which r = 0

[Figure: scatter plot of Y (axis from 0 to 6) against X (axis from -3 to 3)]
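The point of this slide can be reproduced numerically. The sketch below assumes a hypothetical symmetric curve, y = x² on x = -3, ..., 3 (the actual points in the slide's figure are not recoverable), and shows that the correlation coefficient is zero even though Y is completely determined by X.

```python
from math import sqrt

# Hypothetical data: a perfect quadratic relationship, symmetric about x = 0
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

r = sxy / sqrt(sxx * syy)
print(f"r = {r:.3f}")
```

Because r only measures linear association, a value of zero does not rule out a relationship; it is worth looking at the scatter plot before drawing that conclusion.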


Calculation Example

Tree Height    Trunk Diameter
yi             xi                xiyi    yi²      xi²
35             8                 280     1225     64
49             9                 441     2401     81
27             7                 189     729      49
33             6                 198     1089     36
60             13                780     3600     169
21             7                 147     441      49
45             11                495     2025     121
51             12                612     2601     144
Σ = 321        Σ = 73            3142    14111    713
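The column totals in the table are exactly what the computational formula for the correlation coefficient needs. A minimal sketch, using only the totals shown above (variable names are illustrative):

```python
from math import sqrt

n = 8            # number of trees in the table
sum_y = 321      # Σ yi   (tree height)
sum_x = 73       # Σ xi   (trunk diameter)
sum_xy = 3142    # Σ xi*yi
sum_y2 = 14111   # Σ yi²
sum_x2 = 713     # Σ xi²

# r = [nΣxy - ΣxΣy] / sqrt([nΣx² - (Σx)²][nΣy² - (Σy)²])
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"r = {r:.3f}")
```

With these totals the numerator is 1703 and r comes out at roughly 0.886, a fairly strong positive linear association between trunk diameter and tree height.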


Excel Output

Excel correlation output (Tools / Data Analysis / Correlation…): correlation between Tree Height and Trunk Diameter


Explaining Attitude Toward the City of Residence

Respondent number    Attitude toward the city    Duration of the residence
1                    6                           10
2                    9                           12
3                    8                           12
4                    3                           4
5                    10                          12
6                    4                           6
7                    5                           8
8                    2                           2
9                    11                          18
10                   9                           9
11                   10                          17
12                   2                           2


Simple Linear Regression

Simple linear regression describes the linear relationship between a predictor variable, plotted on the x-axis, and a response variable, plotted on the y-axis.

[Figure: response (dependent variable, Y) plotted against predictor (independent variable, X)]


Types of Regression Models

Regression Models
• Simple (X = 1 variable)
  – Linear
  – Non-Linear
• Multiple (X ≥ 2 variables)
  – Linear
  – Non-Linear


Simple Linear Regression Model

• Only one independent variable, x

• Relationship between x and y is described by a linear function

• Changes in y are assumed to be caused by changes in x


Parameter Estimation Table [Volume Sales/month (Y) vs. Advertising/month (X)]

Xi        Yi        Xi²       Yi²       XiYi
1         1         1         1         1
2         1         4         1         2
3         2         9         4         6
4         2         16        4         8
5         4         25        16        20
Σ = 15    Σ = 10    Σ = 55    Σ = 26    Σ = 37
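The totals in the parameter estimation table are sufficient to compute the least squares slope and intercept of the fitted line Ŷ = b0 + b1·X. The sketch below uses the standard closed-form estimates; the variable names are illustrative.

```python
n = 5          # number of (X, Y) pairs in the table
sum_x = 15     # Σ Xi
sum_y = 10     # Σ Yi
sum_x2 = 55    # Σ Xi²
sum_xy = 37    # Σ XiYi

# Slope: b1 = [nΣXY - ΣXΣY] / [nΣX² - (ΣX)²]
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Intercept: b0 = Ybar - b1 * Xbar
b0 = sum_y / n - b1 * (sum_x / n)

print(f"b1 = {b1:.2f}, b0 = {b0:.2f}")
```

For this table the slope is 35/50 = 0.7 and the intercept is 2 - 0.7 × 3 = -0.1, so the fitted line is Ŷ = -0.1 + 0.7 X.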


Empirical Models

Table: Oxygen and hydrocarbon levels

Observation number    Hydrocarbon level    Purity
1                     0.99                 90.01
2                     1.02                 89.05
3                     1.05                 91.43
4                     1.29                 93.74
5                     1.46                 96.73
6                     1.36                 94.45
7                     0.87                 87.59
8                     1.23                 91.77
9                     1.55                 99.42
10                    1.4                  93.65
11                    1.19                 93.54
12                    1.15                 92.52
13                    0.98                 90.56
14                    1.01                 89.54
15                    1.11                 89.85
16                    1.2                  90.39
17                    1.26                 93.25
18                    1.32                 93.41
19                    1.43                 94.98
20                    0.95                 87.33


Adequacy of the Regression Model

Observation number    Hydrocarbon level    Purity    Predicted value    Residual
1                     0.99                 90.01     89.069009          0.940991
2                     1.02                 89.05     89.518136          -0.468136
3                     1.05                 91.43     91.464353          -0.034353
4                     1.29                 93.74     93.560279          0.179721
5                     1.46                 96.73     96.105332          0.624668
6                     1.36                 94.45     94.608242          -0.158242
7                     0.87                 87.59     87.272501          0.317499
8                     1.23                 91.77     92.662025          -0.892025
9                     1.55                 99.42     97.452713          1.967287
10                    1.4                  93.65     95.207078          -1.557078
11                    1.19                 93.54     92.063189          1.476811
12                    1.15                 92.52     91.614062          0.905938
13                    0.98                 90.56     88.9193            1.6407
14                    1.01                 89.54     89.368427          0.171573
15                    1.11                 89.85     90.865517          -1.015517
16                    1.2                  90.39     92.212898          -1.822898
17                    1.26                 93.25     93.111152          0.138848
18                    1.32                 93.41     94.009406          -0.599406
19                    1.43                 94.98     95.656205          -0.676205
20                    0.95                 87.33     88.470173          -1.140173
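Each residual in the table is the observed purity minus the predicted value. One common way to summarise adequacy, sketched here as an illustration rather than as the slide's own procedure, is to compare the residual sum of squares with the total sum of squares and report R² = 1 - SSE/SST. Only the purity and residual columns are used.

```python
# Purity (observed y) and residual columns copied from the table
purity = [90.01, 89.05, 91.43, 93.74, 96.73, 94.45, 87.59, 91.77, 99.42, 93.65,
          93.54, 92.52, 90.56, 89.54, 89.85, 90.39, 93.25, 93.41, 94.98, 87.33]
residual = [0.940991, -0.468136, -0.034353, 0.179721, 0.624668,
            -0.158242, 0.317499, -0.892025, 1.967287, -1.557078,
            1.476811, 0.905938, 1.6407, 0.171573, -1.015517,
            -1.822898, 0.138848, -0.599406, -0.676205, -1.140173]

# Residual (error) sum of squares
sse = sum(e * e for e in residual)

# Total sum of squares about the mean purity
mean_y = sum(purity) / len(purity)
sst = sum((y - mean_y) ** 2 for y in purity)

# Share of the variation in purity explained by the fitted line
r_squared = 1 - sse / sst
print(f"SSE = {sse:.2f}, SST = {sst:.2f}, R^2 = {r_squared:.3f}")
```

An R² near 0.88 suggests the straight line captures most of the variation, though a plot of the residuals against the predictor is still worth inspecting for patterns.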


Sample Data for House Price Model

House Price in Rs. 1000's (Y)    Square Feet (X)
245                              1400
312                              1600
279                              1700
308                              1875
199                              1100
219                              1550
405                              2350
324                              2450
319                              1425
255                              1700
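A simple linear regression of house price on square feet can be fitted to this sample with the deviation form of the least squares estimates. The following is a minimal sketch in plain Python; the variable names are illustrative.

```python
# Data copied from the house price table
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]            # Rs. 1000's (Y)
sqft = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]   # square feet (X)

n = len(price)
mean_x, mean_y = sum(sqft) / n, sum(price) / n

# Sums of squares and cross-products about the means
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sqft, price))
sxx = sum((x - mean_x) ** 2 for x in sqft)
syy = sum((y - mean_y) ** 2 for y in price)

b1 = sxy / sxx                   # slope: change in price per extra square foot
b0 = mean_y - b1 * mean_x        # intercept
r_squared = sxy ** 2 / (sxx * syy)

print(f"price = {b0:.3f} + {b1:.5f} * sqft, R^2 = {r_squared:.4f}")
```

The fit gives roughly price = 98.25 + 0.1098 × (square feet), with R² of about 0.58, so size alone explains a little over half of the variation in price in this sample.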


The Data

Data on sales of breadstick baskets and margaritas for 25 weeks are shown below.

Week    Breadstick orders    Margaritas
1       860                  1330
2       850                  1350
3       800                  1290
4       850                  1350
5       880                  1360
6       780                  1250
7       815                  1275
8       780                  1250
9       750                  1160
10      710                  1140
11      740                  1140
12      675                  1080
13      720                  1140
14      730                  1150
15      645                  1020
16      650                  1000
17      730                  1200
18      870                  1380
19      890                  1390
20      910                  1380
21      940                  1400
22      830                  1250
23      840                  1250
24      815                  1245
25      800                  1250


Check This Regression Analysis in Excel

Year    Income (X)    Retail sales (Y)
1       9098          5492
2       9138          5540
3       9094          5305
4       9282          5507
5       9229          5418
6       9347          5320
7       9525          5538
8       9756          5692
9       10282         5871
10      10662         6157
11      11019         6342
12      11307         5907
13      11432         6124
14      11449         6186
15      11697         6224
16      11871         6496
17      12018         6718
18      12523         6921
19      12053         6471
20      12088         6394
21      12215         6555
22      12494         6755


Multiple Linear Regression Models

Example

Observation number    Pull strength    Wire length    Die height
1                     9.95             --             --
2                     24.45            8              110
3                     31.75            11             120
4                     35               10             550
5                     25.02            8              295
6                     16.86            4              200
7                     14.38            2              375
8                     9.6              2              52
9                     24.35            9              100
10                    27.5             8              300
11                    17.08            4              412
12                    37               11             400
13                    41.95            12             500
14                    11.66            --             360
15                    21.65            4              205
16                    17.89            4              400
17                    69               20             600
18                    10.3             1              585
19                    34.93            10             540
20                    46.59            15             250
21                    44.88            15             290
22                    54.12            16             510
23                    56.63            17             590
24                    22.13            6              100
25                    21.15            5              400


Multiple Linear Regression Models

Least Squares Estimation of the Parameters

The method of least squares may be used to estimate the regression coefficients in the multiple regression model. Suppose that n > k observations are available, and let xij denote the ith observation or level of variable xj. The observations are (xi1, xi2, …, xik, yi), i = 1, 2, ..., n, with n > k. It is customary to present the data for multiple regression in a table such as the following.

Table: Data for multiple regression

y     x1     x2     …    xk
y1    x11    x12    …    x1k
y2    x21    x22    …    x2k
⋮     ⋮      ⋮           ⋮
yn    xn1    xn2    …    xnk


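Table: Data for multiple regression

y     x1     x2     …    xk
y1    x11    x12    …    x1k
y2    x21    x22    …    x2k
⋮     ⋮      ⋮           ⋮
yn    xn1    xn2    …    xnk

Least squares normal equations:

$$n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_{i1} + \hat{\beta}_2 \sum_{i=1}^{n} x_{i2} + \cdots + \hat{\beta}_k \sum_{i=1}^{n} x_{ik} = \sum_{i=1}^{n} y_i$$

$$\hat{\beta}_0 \sum_{i=1}^{n} x_{i1} + \hat{\beta}_1 \sum_{i=1}^{n} x_{i1}^2 + \hat{\beta}_2 \sum_{i=1}^{n} x_{i1} x_{i2} + \cdots + \hat{\beta}_k \sum_{i=1}^{n} x_{i1} x_{ik} = \sum_{i=1}^{n} x_{i1} y_i$$

$$\vdots$$

$$\hat{\beta}_0 \sum_{i=1}^{n} x_{ik} + \hat{\beta}_1 \sum_{i=1}^{n} x_{ik} x_{i1} + \hat{\beta}_2 \sum_{i=1}^{n} x_{ik} x_{i2} + \cdots + \hat{\beta}_k \sum_{i=1}^{n} x_{ik}^2 = \sum_{i=1}^{n} x_{ik} y_i$$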


Hypothesis Tests in Multiple Linear Regression

Test for Significance of Regression

Source of variation    Sum of squares    Degrees of freedom    Mean square    F0
Regression             SSR               k                     MSR            MSR/MSE
Error or residual      SSE               n-p                   MSE
Total                  SST               n-1

(p = k + 1)

$$SS_T = SS_R + SS_E, \qquad \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

$$F_0 = \frac{SS_R / k}{SS_E / (n - k - 1)} = \frac{MS_R}{MS_E}$$


Hypothesis Tests in Multiple Linear Regression

Test for Significance of Regression

Source of variation    Sum of squares    Degrees of freedom    Mean square    F0
Regression             SSR               k                     MSR            MSR/MSE
Error or residual      SSE               n-p                   MSE
Total                  SST               n-1

Source of variation    Sum of squares    Degrees of freedom    Mean square    F0
Regression             5990.7712         2                     2995.3856      572.17
Error or residual      115.1735          22                    5.2352
Total                  6105.9447         24
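The entries of the numerical ANOVA table can be reproduced from the two sums of squares and their degrees of freedom. A minimal sketch, taking n = 25 observations and k = 2 regressors as implied by the degrees of freedom shown (variable names are illustrative):

```python
ss_r = 5990.7712   # regression sum of squares
ss_e = 115.1735    # error (residual) sum of squares
n, k = 25, 2       # observations and regressors implied by df = 24 and df = 2

ss_t = ss_r + ss_e             # total sum of squares
df_error = n - k - 1           # 22 degrees of freedom for error

ms_r = ss_r / k                # mean square for regression
ms_e = ss_e / df_error         # mean square for error
f0 = ms_r / ms_e               # test statistic for significance of regression

print(f"SST = {ss_t:.4f}, MSR = {ms_r:.4f}, MSE = {ms_e:.4f}, F0 = {f0:.2f}")
```

The computed F0 of about 572 would then be compared with the upper-tail critical value of the F distribution with 2 and 22 degrees of freedom; a statistic this large indicates that at least one regressor contributes significantly.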


Multiple Linear Regression Models

Example

Observation number    Pull strength (observed)    Predicted value    Residual
1                     9.95                        8.38               1.57
2                     24.45                       25.60              -1.15
3                     31.75                       33.95              -2.20
4                     35                          36.60              -1.60
5                     25.02                       27.91              -2.89
6                     16.86                       15.75              1.11
7                     14.38                       12.45              1.93
8                     9.6                         8.40               1.20
9                     24.35                       28.21              -3.86
10                    27.5                        27.98              -0.48
11                    17.08                       18.40              -1.32
12                    37                          37.46              -0.46
13                    41.95                       41.46              0.49
14                    11.66                       12.26              -0.60
15                    21.65                       15.81              5.84
16                    17.89                       18.25              -0.36
17                    69                          64.67              4.33
18                    10.3                        12.34              -2.04
19                    34.93                       36.47              -1.54
20                    46.59                       46.56              0.03
21                    44.88                       47.06              -2.18
22                    54.12                       52.56              1.56
23                    56.63                       56.31              0.32
24                    22.13                       19.98              2.15
25                    21.15                       21.00              0.15


Assignment (Contd)

Table:

Observation    Temp (X1)    Feed rate (X2)    Viscosity (Y)
1              80           8                 2256
2              93           9                 2340
3              100          10                2426
4              82           12                2293
5              90           11                2330
6              99           8                 2368
7              81           8                 2250
8              96           10                2409
9              94           12                2364
10             93           11                2379
11             97           13                2440
12             95           11                2364
13             100          8                 2404
14             85           12                2317
15             86           9                 2309
16             87           12                2328


Year    Revenue    Number of offices    Profit margin
1       3.92       7298                 0.75
2       3.61       6855                 0.71
3       3.32       6636                 0.66
4       3.07       6506                 0.61
5       3.06       6450                 0.7
6       3.11       6402                 0.72
7       3.21       6368                 0.77
8       3.26       6340                 0.74
9       3.42       6349                 0.9
10      3.42       6352                 0.82
11      3.45       6361                 0.75
12      3.58       6369                 0.77
13      3.66       6546                 0.78
14      3.78       6672                 0.84
15      3.82       6890                 0.79
16      3.97       7115                 0.7
17      4.07       7327                 0.68
18      4.25       7546                 0.72
19      4.41       7931                 0.55
20      4.49       8097                 0.63
21      4.7        8468                 0.56
22      4.58       9717                 0.41
23      4.69       8991                 0.51
24      4.71       9179                 0.47
25      4.78       9318                 0.32


Neural Networks

• Neural Network: A collection of neurons which are interconnected. The output of one connects to several others with different strength connections.

– Initially, neural networks have no knowledge. (All information is learned from experience using the network.)

[Figure: two neurons (Neuron 1, Neuron 2) with Input 1, Input 2 and Input 3, and the outputs from Neuron 1 and Neuron 2]
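The idea of connections with different strengths can be illustrated with a small forward pass: each neuron forms a weighted sum of the three inputs and passes it through an activation function. Everything specific in this sketch is an assumption made for illustration (the sigmoid activation, the weights, the biases and the input values); none of it comes from the slide.

```python
from math import exp


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a sigmoid activation (assumed)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + exp(-z))


# Hypothetical values for Input 1, Input 2, Input 3
inputs = [0.5, -1.0, 2.0]

# Hypothetical connection strengths; an untrained network would start from
# arbitrary weights like these and adjust them from experience
output_1 = neuron(inputs, weights=[0.4, 0.3, -0.2], bias=0.0)   # Neuron 1
output_2 = neuron(inputs, weights=[-0.6, 0.1, 0.5], bias=0.1)   # Neuron 2

print(f"Output from Neuron 1: {output_1:.4f}")
print(f"Output from Neuron 2: {output_2:.4f}")
```

Learning, in this picture, amounts to changing the weights so that the outputs move closer to the desired responses.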


Multiple Regression Modeling

What to Do in Such Cases? - Check Book

Observation number    Surface finish    RPM    Type of cutting tool
1                     45.44             225    302
2                     42.03             200    302
3                     50.01             250    302
4                     48.75             245    302
5                     47.92             235    302
6                     47.79             237    302
7                     52.26             265    302
8                     50.52             259    302
9                     45.58             221    302
10                    44.78             218    302
11                    33.5              224    416
12                    31.23             212    416
13                    37.52             248    416
14                    37.13             260    416
15                    34.7              243    416
16                    33.92             238    416
17                    32.13             224    416
18                    35.47             251    416
19                    33.49             232    416
20                    32.29             216    416
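One common way to handle a categorical regressor such as tool type is to code it as a 0/1 indicator variable and fit an ordinary multiple regression. The sketch below assumes that approach (the slide itself only points to the book): it codes tool 302 as 0 and tool 416 as 1, builds the least squares normal equations, and solves them with a small Gaussian elimination routine. The helper name `solve` and the coding choice are illustrative.

```python
# (surface finish, RPM, tool type) rows copied from the table
data = [
    (45.44, 225, 302), (42.03, 200, 302), (50.01, 250, 302), (48.75, 245, 302),
    (47.92, 235, 302), (47.79, 237, 302), (52.26, 265, 302), (50.52, 259, 302),
    (45.58, 221, 302), (44.78, 218, 302), (33.5, 224, 416), (31.23, 212, 416),
    (37.52, 248, 416), (37.13, 260, 416), (34.7, 243, 416), (33.92, 238, 416),
    (32.13, 224, 416), (35.47, 251, 416), (33.49, 232, 416), (32.29, 216, 416),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Design matrix rows: intercept, RPM, indicator (0 = tool 302, 1 = tool 416)
X = [[1.0, float(rpm), 0.0 if tool == 302 else 1.0] for _, rpm, tool in data]
y = [finish for finish, _, _ in data]

# Normal equations (X'X) b = X'y
p = 3
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
b0, b1, b2 = solve(xtx, xty)

residuals = [yi - (b0 + b1 * row[1] + b2 * row[2]) for row, yi in zip(X, y)]
print(f"finish = {b0:.3f} + {b1:.4f} * RPM + {b2:.3f} * tool416")
```

Under this coding, b1 is the common slope with respect to RPM and b2 is the shift in surface finish when tool 416 is used instead of tool 302 at the same RPM.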