bivariate data when two variables are measured on a single experimental unit, the resulting data are...

Bivariate Data When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable individually, and you can also explore the relationship between the two variables. Chapter 3: Describing Bivariate Data

Upload: diane-underwood

Post on 30-Dec-2015


Page 1

Bivariate Data

When two variables are measured on a single experimental unit, the resulting data are called bivariate data.

You can describe each variable individually, and you can also explore the relationship between the two variables.

Chapter 3: Describing Bivariate Data

Page 2

Graphs for Qualitative Variables

When at least one of the variables is qualitative, you can use comparative pie charts or bar charts.

Variable #1 = Gender

Variable #2 = Opinion: Do you think that men and women are treated equally in the workplace?

Page 3

Comparative Bar Charts

[Figures: Stacked Bar Chart; Side-by-Side Bar Chart]

Describe the relationship between opinion and gender:

More women than men feel that they are not treated equally in the workplace.

Page 4

Two Quantitative Variables

When both of the variables are quantitative, call one variable x and the other y. A single measurement is a pair of numbers (x, y) that can be plotted using a two-dimensional graph called a scatterplot.

[Figure: scatterplot with axes x and y, showing the point (2, 5) plotted at x = 2, y = 5]

Page 5

Describing the Scatterplot

[Figure panels: Positive linear, strong; Negative linear, weak; Curvilinear; No relationship]

Page 6

The Correlation Coefficient

Assume that the two variables x and y exhibit a linear pattern or form. The strength and direction of the relationship between x and y are measured using the correlation coefficient, r:

$$r = \frac{s_{xy}}{s_x s_y}$$

where

$s_x$ = standard deviation of the x's

$s_y$ = standard deviation of the y's

$$s_{xy} = \frac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{n - 1}$$
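The formulas on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the small data set are hypothetical, not part of the slides, and the standard deviations are assumed to use the n - 1 divisor so that they match the covariance.

```python
from math import sqrt

def sample_covariance(xs, ys):
    """s_xy via the computational formula: (sum(xy) - sum(x)*sum(y)/n) / (n - 1)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)

def sample_std(values):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def correlation(xs, ys):
    """Correlation coefficient r = s_xy / (s_x * s_y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))

# Hypothetical data, chosen only to exercise the functions.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

cov = sample_covariance(xs, ys)
r = correlation(xs, ys)
print(cov, round(r, 4))
```

Because r divides the covariance by both standard deviations, it is unit-free and always lies between -1 and 1.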

Page 7

Example

Number of transistors in a CPU and its integer performance.

CPU Model                  1    2    3    4    5
x (million transistors)   14   15   17   19   16
y (SPECint)              178  230  240  275  200

• The scatterplot indicates a positive linear relationship.

Page 8

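Example

  x      y      xy
 14    178    2492
 15    230    3450
 17    240    4080
 19    275    5225
 16    200    3200
 81   1123   18447   (column totals)

Calculate:

$\bar{x} = 16.2$, $s_x = 1.924$

$\bar{y} = 224.6$, $s_y = 37.360$

$$s_{xy} = \frac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{n - 1} = \frac{18447 - \dfrac{(81)(1123)}{5}}{4} = 63.6$$

$$r = \frac{s_{xy}}{s_x s_y} = \frac{63.6}{1.924(37.36)} = .885$$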
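The arithmetic in this example can be checked with a short Python sketch. The data are the five CPU measurements from the slide; the variable names are illustrative, and `statistics.stdev` is used because it computes the sample (n - 1) standard deviation.

```python
from statistics import mean, stdev

# Data from the example: transistor count (millions) and SPECint score.
x = [14, 15, 17, 19, 16]
y = [178, 230, 240, 275, 200]
n = len(x)

# Column totals from the table.
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Covariance via the computational formula, then r = s_xy / (s_x * s_y).
s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)
s_x = stdev(x)
s_y = stdev(y)
r = s_xy / (s_x * s_y)

print("totals:", sum_x, sum_y, sum_xy)
print("means:", mean(x), mean(y))
print("s_x, s_y:", round(s_x, 3), round(s_y, 3))
print("s_xy, r:", round(s_xy, 1), round(r, 3))
```

The rounded results should agree with the values on the slide: totals 81, 1123 and 18447, a covariance of 63.6, and r of about .885.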

Page 9

Interpreting r

• $-1 \le r \le 1$

• Sign of r indicates direction of the linear relationship.

• $r \approx 0$: Weak relationship; random scatter of points

• $r \approx 1$ or $-1$: Strong relationship; either positive or negative

• $r = 1$ or $-1$: All points fall exactly on a straight line.

Page 10

The Regression Line

Sometimes x and y are related in a particular way: the value of y depends on the value of x.
• y = dependent variable
• x = independent variable

The form of the linear relationship between x and y can be described by fitting a line as best we can through the points. This is the regression line, y = a + bx.
• a = y-intercept of the line
• b = slope of the line

Page 11

The Regression Line

To find the slope and y-intercept of the best fitting line, use:

$$b = r\,\frac{s_y}{s_x}$$

$$a = \bar{y} - b\,\bar{x}$$

• The least squares regression line is y = a + bx
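As a rough sketch, the slope and intercept formulas translate directly into Python. The function name `least_squares_line` and the test points are hypothetical; the points were chosen to lie exactly on y = 1 + 2x so the expected answer is obvious.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (a, b) for y = a + b*x, using b = r*(s_y/s_x) and a = ybar - b*xbar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    # Sample covariance from deviations about the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    r = s_xy / (s_x * s_y)
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b

# Hypothetical points lying exactly on y = 1 + 2x.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
a, b = least_squares_line(xs, ys)
print(round(a, 6), round(b, 6))
```

Since r = s_xy / (s_x s_y), the slope can equivalently be computed as s_xy divided by the variance of x, which avoids computing r when only the line is needed.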

Page 12

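Example

  x      y      xy
 14    178    2492
 15    230    3450
 17    240    4080
 19    275    5225
 16    200    3200
 81   1123   18447   (column totals)

Recall from the previous example:

$\bar{x} = 16.2$, $s_x = 1.9235$

$\bar{y} = 224.6$, $s_y = 37.3604$

$r = .885$

$$b = r\,\frac{s_y}{s_x} = (.885)\frac{37.3604}{1.9235} = 17.189$$

$$a = \bar{y} - b\,\bar{x} = 224.6 - 17.189(16.2) = -53.86$$

Regression line: $y = -53.86 + 17.189x$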

Page 13

Example

Predict the CPU integer performance of a CPU containing 16 million transistors.

Regression line: $y = -53.86 + 17.189x$

Predict: $y = -53.86 + 17.189(16) = 221.16$
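The fitted line and the prediction can be reproduced with the sketch below. It uses the CPU data from the example and computes the slope as the covariance divided by the variance of x, which is algebraically the same as r(s_y / s_x); the helper name `predict` is illustrative.

```python
# Data from the example: transistor count (millions) and SPECint score.
x = [14, 15, 17, 19, 16]
y = [178, 230, 240, 275, 200]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance and sample variance of x (both with the n - 1 divisor).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)

b = s_xy / var_x          # slope, equal to r * s_y / s_x
a = y_bar - b * x_bar     # intercept

def predict(x_new):
    """Predicted y on the fitted line y = a + b*x."""
    return a + b * x_new

print(round(a, 2), round(b, 3), round(predict(16), 2))
```

Note that the intercept is negative. The prediction is made at x = 16, which lies inside the observed range of 14 to 19 million transistors; the line says nothing reliable about CPUs far outside that range.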

Page 14

Nonlinear Regression

• Not all relationships between two variables are linear; we may need to fit some other type of function.

• Nonlinear regression deals with relationships that are NOT linear. For example: polynomial, logarithmic and exponential, reciprocal.

• We can use the method of least squares if we can transform the data to make the relationship appear linear (linearization).

Page 15

When To Use Nonlinear Regression?

• Often requires a lot of mathematical intuition

• Always draw a scatterplot: if the plot looks non-linear, try nonlinear regression

• If a nonlinear relationship is suspected based on theoretical information

• Relationship must be convertible to a linear form

Page 16

Types of Curvilinear Regression

There are many possible types of nonlinear relationships that can be linearized:

$y = b x^{a}$

$y = a + \dfrac{b}{x}$

$y = b e^{ax}$

$y = a + b \log(x)$

Many other forms can be transformed!

Page 17

Transforming to Linear Forms

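Example: if the relation between y and x is exponential (i.e., $y = \alpha\beta^{x}$), we take the logarithms of both sides of the equation to get: $\log y = \log\alpha + x(\log\beta)$. So log y is a linear function of x, with intercept $\log\alpha$ and slope $\log\beta$.

Note that $\alpha$ and $\beta$ are constants.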

We can perform similar transformations for reciprocal and power functions
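A minimal sketch of this linearization in Python follows. It assumes the exponential form is y = αβ^x (the constants' symbols are missing from the transcript, so α and β are assumed names), and it uses synthetic, noise-free data generated from hypothetical values α = 2 and β = 1.5 so the recovered constants can be checked.

```python
import math

def fit_line(xs, ys):
    """Least squares intercept and slope for ys = a + b*xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b

# Synthetic data from hypothetical constants (not from the slides).
alpha_true, beta_true = 2.0, 1.5
xs = [0, 1, 2, 3, 4, 5]
ys = [alpha_true * beta_true ** x for x in xs]

# Linearize: log y = log(alpha) + x * log(beta), then fit a straight line.
log_ys = [math.log(y) for y in ys]
intercept, slope = fit_line(xs, log_ys)

# Transform the fitted constants back to the original scale.
alpha_hat = math.exp(intercept)
beta_hat = math.exp(slope)
print(round(alpha_hat, 6), round(beta_hat, 6))
```

With real, noisy data the recovered constants would only be estimates, and least squares is then minimizing error on the log scale rather than on the original y scale.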

Page 18

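Examples:

• Logarithmic: $y = \alpha + \beta \ln x$. Set $X = \ln x$, $Y = y$; then $Y = \alpha + \beta X$.

• Reciprocal: $y = \alpha + \beta / x$. Set $X = 1/x$, $Y = y$; then $Y = \alpha + \beta X$.

• Exponential: $y = \alpha \exp(\beta x) = \alpha e^{\beta x}$. Set $Y = \ln y$, $X = x$; then $Y = \ln\alpha + \beta X$.

• Power: $y = \alpha x^{\beta}$. Set $Y = \ln y$, $X = \ln x$; then $Y = \ln\alpha + \beta X$.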
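One way to organize these transformations in code is a lookup table of (x-transform, y-transform, back-transform) triples, sketched below. The model names, the assumed forms (logarithmic, reciprocal, exponential, power), the constant names `alpha` and `beta`, and the synthetic data are all illustrative assumptions, since the symbols on this slide did not survive in the transcript.

```python
import math

def fit_line(X, Y):
    """Least squares intercept and slope for Y = c + d*X."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    d = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
         / sum((x - x_bar) ** 2 for x in X))
    return y_bar - d * x_bar, d

def identity(v):
    return v

def reciprocal(v):
    return 1.0 / v

# model name -> (transform of x, transform of y, map fitted intercept back to alpha)
TRANSFORMS = {
    "logarithmic": (math.log, identity, identity),    # y = alpha + beta*ln(x)
    "reciprocal":  (reciprocal, identity, identity),  # y = alpha + beta/x
    "exponential": (identity, math.log, math.exp),    # y = alpha*exp(beta*x)
    "power":       (math.log, math.log, math.exp),    # y = alpha*x**beta
}

def fit_linearized(model, xs, ys):
    """Fit the named model by linear least squares on the transformed data."""
    fx, fy, back = TRANSFORMS[model]
    intercept, slope = fit_line([fx(x) for x in xs], [fy(y) for y in ys])
    return back(intercept), slope   # (alpha, beta)

# Synthetic, noise-free checks with hypothetical constants.
xs = [1, 2, 3, 4, 5]
power_fit = fit_linearized("power", xs, [3.0 * x ** 0.5 for x in xs])
recip_fit = fit_linearized("reciprocal", xs, [1.0 + 4.0 / x for x in xs])
print(power_fit, recip_fit)
```

The logarithmic and power models require x > 0, and the exponential and power models require y > 0, because the logarithm is undefined otherwise.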

Page 19

Review of Logarithmic Functions

The inverse of the exponential function is the natural logarithm function:

Ln(exp(x)) = x

Product Rule for Logarithms: Ln(a·b) = Ln(a) + Ln(b)

Change of Base: Log_b(x) = Ln(x) / Ln(b)

Log_e(x) = Ln(x) / Ln(e) = Ln(x)

Log_10(x) = Ln(x) / Ln(10)
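These identities can be spot-checked numerically with Python's `math` module, where `math.log(x)` is the natural logarithm and `math.log(x, b)` takes an explicit base. The sample values below are arbitrary positive numbers chosen for illustration.

```python
import math

# Arbitrary positive sample values.
x, a, b = 7.0, 3.0, 5.0

# Ln(exp(x)) = x
inverse_ok = math.isclose(math.log(math.exp(x)), x)

# Product rule: Ln(a*b) = Ln(a) + Ln(b)
product_ok = math.isclose(math.log(a * b), math.log(a) + math.log(b))

# Change of base: Log_b(x) = Ln(x) / Ln(b)
change_of_base_ok = math.isclose(math.log(x, b), math.log(x) / math.log(b))

# Special case: Log_10(x) = Ln(x) / Ln(10)
log10_ok = math.isclose(math.log10(x), math.log(x) / math.log(10))

print(inverse_ok, product_ok, change_of_base_ok, log10_ok)
```

`math.isclose` is used instead of `==` because floating-point results can differ in the last few digits.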

Page 20

Key Concepts

I. Bivariate Data
1. Both qualitative and quantitative variables
2. Describing each variable separately
3. Describing the relationship between the variables

II. Describing Two Qualitative Variables
1. Side-by-side pie charts
2. Comparative line charts
3. Comparative bar charts: side-by-side, stacked
4. Relative frequencies to describe the relationship between the two variables

Page 21

Key Concepts

III. Describing Two Quantitative Variables
1. Scatterplots
   • Linear or nonlinear pattern
   • Strength of relationship
   • Unusual observations; clusters and outliers
2. Covariance and correlation coefficient
3. The best fitting line
   • Calculating the slope and y-intercept
   • Graphing the line
   • Using the line for prediction