Chapter 3: Descriptive Study of Bivariate Data
• Univariate Data: data involving a single variable.
• Multivariate Data: data involving more than one variable.
• Bivariate Data: data involving two variables.
Bivariate Data
• There are two types of Bivariate Data: Bivariate Categorical Data and Bivariate Measurement Data.
Univariate vs. Bivariate
• Univariate Categorical:
• Bivariate Categorical:
Univariate vs. Bivariate
• Univariate Measurement:
• Bivariate Measurement:
SUMMARIZATION OF BIVARIATE CATEGORICAL DATA
Calculating Relative Frequencies and Making a Contingency Table
Data:
• The total frequency for any row is given in the right-hand margin, and the total for any column is given in the bottom margin.
• Both are called marginal totals.
• Depending on the specific context of a cross-tabulation, one may also wish to examine the cell frequencies relative to a marginal total.
• Data in this summary form are commonly called cross-classified or cross-tabulated data.
• In statistical terminology, the resulting tables are called contingency tables.
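The marginal totals and relative frequencies described above can be sketched in a few lines of Python. This is only an illustration: the cell counts and the labels "Row A", "Row B", "Col 1", "Col 2" are hypothetical and are not the data from the slide, and the function names are made up for this example.

```python
# Hypothetical cell counts (NOT the chapter's data): the outer keys are the
# categories of one variable (rows), the inner keys those of the other (columns).
counts = {
    "Row A": {"Col 1": 20, "Col 2": 30},
    "Row B": {"Col 1": 10, "Col 2": 40},
}


def marginal_totals(table):
    """Return (row totals, column totals, grand total) of a two-way table."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    grand_total = sum(row_totals.values())
    return row_totals, col_totals, grand_total


def relative_to_grand_total(table):
    """Each cell frequency divided by the grand total."""
    _, _, grand_total = marginal_totals(table)
    return {
        row: {col: n / grand_total for col, n in cells.items()}
        for row, cells in table.items()
    }


def relative_to_row_totals(table):
    """Each cell frequency divided by the marginal total of its own row."""
    row_totals, _, _ = marginal_totals(table)
    return {
        row: {col: n / row_totals[row] for col, n in cells.items()}
        for row, cells in table.items()
    }


row_totals, col_totals, grand_total = marginal_totals(counts)
print("Row totals:", row_totals)
print("Column totals:", col_totals)
print("Grand total:", grand_total)
print("Relative to grand total:", relative_to_grand_total(counts))
print("Relative to row totals:", relative_to_row_totals(counts))
```

Dividing by the grand total describes the joint distribution of the two variables, while dividing by a row (or column) total describes one variable within a single category of the other. Which one to report depends on the question being asked.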
SIMPSON’S PARADOX
Consider the data:
The proportion of males admitted: 233/557 = .418.
The proportion of females admitted: 88/282 = .312.
• Does there appear to be a gender bias?
• In mechanical engineering, the proportion of males admitted, 151/186 = .812, is smaller than the proportion of females admitted, 16/18 = .889.
• In the history department, the proportion of males admitted, 82/371 = .221, is smaller than the proportion of females admitted, 72/264 = .273.
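• When the data are studied department by department, the opposite, and correct, conclusion emerges: females have a higher admission rate in both departments!
• “Department” is an unrecorded, or lurking, variable: most female applicants (264 of 282) applied to history, which admits a much smaller share of its applicants.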
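The reversal can be checked numerically. The sketch below uses the admitted and applicant counts quoted in the bullets above; the dictionary layout and the function names are illustrative choices, not part of the original material.

```python
# (admitted, applicants) per department and gender, taken from the counts
# quoted in the text above.
admissions = {
    "Mechanical engineering": {"male": (151, 186), "female": (16, 18)},
    "History": {"male": (82, 371), "female": (72, 264)},
}


def rate(admitted, applicants):
    """Proportion of applicants who were admitted."""
    return admitted / applicants


def pooled_rate(data, group):
    """Admission rate for one group after adding up all departments."""
    admitted = sum(dept[group][0] for dept in data.values())
    applicants = sum(dept[group][1] for dept in data.values())
    return rate(admitted, applicants)


# Within each department, compare the two groups.
for dept, groups in admissions.items():
    rates = {g: round(rate(*pair), 3) for g, pair in groups.items()}
    print(dept, rates)

# Pooled over departments, the comparison points the other way.
pooled = {g: round(pooled_rate(admissions, g), 3) for g in ("male", "female")}
print("Pooled", pooled)
```

Within each department the female rate is higher, yet the pooled male rate is higher, because the pooled figures weight each department by how many applicants of each gender applied there.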
• Group Work 10: Find two examples of Simpson’s Paradox.
• Due: Wednesday, Sept 10th.