stat 101 - employees.oneonta.eduemployees.oneonta.edu/.../daily_assignment_sheets_sp…  · web...

71
KEY STAT 101 Topics: Syllabus, Nature of Statistics Handouts: Green Sheet #1; Syllabus; Assignment Schedule; Anatomy of Statistics (2): 1) Statistical Alphabet, 2) Statistical Relationships; Quiz & Homework Recording sheet Online: All handouts (except Q&H sheet) plus PowerPoint for the Nature of Statistics Course Assignments: 1) Obtain text (see syllabus). Scan through text for general format, etc. Review: Ch. 1, section 1.1 2) Go online to the Anatomy of Statistics link and look at a couple of these documents. 3) Go online, scan through the SPSS Manual Procedures section to get an idea of the current manual’s format. 4) H1: Items below represent the items from which the first homework grade will be selected. 5) Q1: Syllabus and today’s topics represent the items from which the first quiz will be selected. Next Class: Nature of Statistics – Terminology (cont.) Stat Essentials (What I should know from today): Be able to: 1) define “Data”; 2) define a “Variable”; 2) distinguish between objective (response) and explanatory variables; 3) identify variables to be studied when provided a study scenario; 4) other terms if we get there; and 5) syllabus. Problem 1.1: Nutrition According to a study published in the Journal of the American Dietetic Association: CHICAGO – The intake of added sugars in the United States is excessive, estimated by the US Department of Agriculture in 1999-2002 as 17% of calories a day. Consuming foods with added sugars displaces nutrient-dense foods in the diet. Reducing or limiting intake of added sugars is an important objective in providing overall dietary guidance. In a study of nearly 30,000 Americans published in the August 2009 issue of the Journal of the American Dietetic Association, researchers report that race/ethnicity, family income and educational status are independently associated with intake of added sugars. Groups with low income and education are particularly vulnerable to eating diets with high added sugars. [For Release August 4, 2009; excerpt from ADA webpage 8.26.2009] 1. Identify an objective variable for this study. What values could this variable assume? Sugar intake 2. Identify three explanatory variables. What values could these variables assume? Race/ethnicity; education level; family income Problem 1.2: Diet Calcium & Blood Pressure A heart researcher is interested in studying the relationship between diets which are high in calcium and blood pressure in adult females. The researcher randomly selects 20 female subjects who have high blood pressure. Ten subjects are randomly assigned to try a diet which is high in calcium. The other ten subjects are assigned to a diet with a standard amount of calcium. After one year the average blood pressures for subjects in both groups will be measured and compared to decide if diets high in calcium decrease the average blood pressure. 1 1.18.18 1

Upload: phungquynh

Post on 02-Aug-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

KEYSTAT 101Topics: Syllabus, Nature of StatisticsHandouts: Green Sheet #1; Syllabus; Assignment Schedule; Anatomy of Statistics (2): 1) Statistical Alphabet, 2) Statistical Relationships; Quiz & Homework Recording sheetOnline: All handouts (except Q&H sheet) plus PowerPoint for the Nature of StatisticsCourse Assignments:

1) Obtain text (see syllabus). Scan through text for general format, etc. Review: Ch. 1, section 1.12) Go online to the Anatomy of Statistics link and look at a couple of these documents.3) Go online, scan through the SPSS Manual Procedures section to get an idea of the current manual’s format. 4) H1: Items below represent the items from which the first homework grade will be selected.5) Q1: Syllabus and today’s topics represent the items from which the first quiz will be selected.

Next Class: Nature of Statistics – Terminology (cont.)Stat Essentials (What I should know from today): Be able to: 1) define “Data”; 2) define a “Variable”; 2) distinguish between objective (response) and explanatory variables; 3) identify variables to be studied when provided a study scenario; 4) other terms if we get there; and 5) syllabus.

Problem 1.1: Nutrition According to a study published in the Journal of the American Dietetic Association:CHICAGO – The intake of added sugars in the United States is excessive, estimated by the US Department of Agriculture in 1999-2002 as 17% of calories a day. Consuming foods with added sugars displaces nutrient-dense foods in the diet. Reducing or limiting intake of added sugars is an important objective in providing overall dietary guidance. In a study of nearly 30,000 Americans published in the August 2009 issue of the Journal of the American Dietetic Association, researchers report that race/ethnicity, family income and educational status are independently associated with intake of added sugars. Groups with low income and education are particularly vulnerable to eating diets with high added sugars. [For Release August 4, 2009; excerpt from ADA webpage 8.26.2009]

1. Identify an objective variable for this study. What values could this variable assume? Sugar intake

2. Identify three explanatory variables. What values could these variables assume? Race/ethnicity; education level; family incomeProblem 1.2: Diet Calcium & Blood PressureA heart researcher is interested in studying the relationship between diets which are high in calcium and blood pressure in adult females. The researcher randomly selects 20 female subjects who have high blood pressure. Ten subjects are randomly assigned to try a diet which is high in calcium. The other ten subjects are assigned to a diet with a standard amount of calcium. After one year the average blood pressures for subjects in both groups will be measured and compared to decide if diets high in calcium decrease the average blood pressure.

1. Identify an objective variable for this study and its associated values. CA level; blood pressure

2. Identify some potential explanatory variables and their associated values (they may not be specifically identified in the paragraph). Age, socio-economic level; region of country

Problem 1.3: Get Married - Gain WeightResearcher Penny Larson and her associates wanted to determine whether young couples who marry or cohabitate are more likely to gain weight than those who stay single. The researchers followed 8,000 men and women from 1995 through 2002 as they matured from the teens to young adults. When the study began, none of the participants were married or living with a romantic partner. By 2002, 14% of the participants were married and 16% were living with a romantic partner. At the end of the study, married or cohabitating women gained, on average, nine (9) pounds more than single women, and married or cohabitating men gained, on average, six (6) pounds more than single men. [p21sullivan]

1. Identify an objective variable for this study and its associated values. Weight gain; weights vary2. Identify an explanatory variable and its associated values. Marital status/cohabitate; values: married, not married cohabitating

Problem 1.4: Obesity and Artery CalcificationScientists were interested in learning if abdominal obesity is related to coronary artery calcification (CAC). The scientists studied 2,951 participants in the Coronary Artery Risk Development in Young Adults Study to investigate a possible link. Waist and hip girths were measured in 1985-86, 1995-96 (year 10) and in 2000-01 (waist girth only). CAC measurements were taken in 2001-02. The results of the study indicated that abdominal obesity measured by waist girth is associated with early atherosclerosis as measured by the presence of CAC in participants. [p21sullivan]

1. Identify an objective variable for this study and its associated values1. Identify an objective variable for this study and its associated values. Coronary artery calcification; values vary

2. Identify an explanatory variables and its associated values. Waist/hip girth; values vary

11.18.18

1

KEYSTAT 101

TOPICS: Nature of Statistics Terms and RelationshipsDOCUMENTS:

HANDOUTS: Green #2; Yellow worksheet #1& 2: 1) Top Films and 2) Twenty-Five Q’s classification

AVAILABLE ONLINE: All handouts listed above; Extra Credit #1ASSIGNMENT:

Readings: Ch. 1: sections 1.1-1.2; 1.3 (p. 20-24; sampling); Problems: p. 6 # 1-14, 21,24, 25-29; p. 13 # 1-12, 15, 16, 19–22, 29 Finish Worksheets #1 - Top Films and #2 - Twenty-Five Questions Extra Credit #1 – available online Items below

NEXT CLASS: Nature (cont.?); m&m’s Data Collection; ExCr #1 due (optional); Q&H2Stat Essentials (taken from today): Be able to: 1) define terms and relationships as presented on the sheet Anatomy of the Basics: Statistical Terms and Relationships; 2) identify variables and their characteristics

Problem 2.1:

FREE SHIPPING on orders $39+*Code: SHIP39

An 8x12, 20 page Shutterfly photo book costs $29.00. How much would it cost given the above discount?

$29.00*.5 = $14.50 30% of $14.50 = 14.50*.3 = $4.35 $14.50-$4.35 = $10.15 = final cost

Problem 2.2: A heart researcher is interested in studying the relationship between diets which are high in calcium and blood pressure in adult females. The researcher randomly selects 20 female subjects who have high blood pressure. Ten subjects are randomly assigned to try a diet which is high in calcium. The other ten subjects are assigned to a diet with a standard amount of calcium. After one year the average blood pressures for subjects in both groups will be measured and compared to decide if diets high in calcium decrease the average blood pressure.

1) Identify the population. Adult females

2) What characteristic of the population is being measured? CA & BP relationship

3) Identify the sample. 20 females

4) Is the purpose of this data collection to perform descriptive or inferential statistics? [P15#1H&M]

21.23.18

2

KEY5) Could blood pressure be used as an explanatory variable in this situation? No, all BP values may be different

3

KEYProblem 2.3:Heroin Use: The National Center for Drug Abuse is conducting a study to determine if heroin usage among teenagers has changed. Historically, about 1.3 percent of teenagers between the ages of 15 and 19 have used heroin one or more times. In a recent survey of 1,824 teenagers, 37 indicated they had used heroin one or more times.

1) Identify the population. Teenagers 15-19 yrs. old

2) Identify a variable of interest. Heroin use

3) Identify a sample. 1824 teenagers (15-19 yrs old); 37 are the YES respondents (2.085%)

4) Is the purpose of this data collection descriptive or inferential?

Problem 2.4:Cell Phone Frau d : Lambert and Pinheiro (2006) described a study in which researchers try to identify characteristics of cell phone calls that suggest the phone is being used fraudulently. For each cell phone call, the researchers recorded information on its direction (incoming or outgoing), location (local or roaming), duration, time of day, day of week, and whether the call took place on a weekday or weekend. [WSed3p6]

1) Identify the observational units in this study. Cell phones2) Identify the qualitative variables and their characteristics. Direction: qual, nominalLocation: qual, nominalday of week: qual, nominalweekend/weekday qual, nominal3) Identify the quantitative variables and their characteristics. Duration: quant, cont., ratiotime of day: quant, cont, interval??4) Would call duration be a good explanatory variable? Why/why not? No – too many values

Problem 2.5:Student Characteristics: A Case represents all of the information collected from one source, such as a student.Student #1 is a male who does not smoke, who lives in an urban area, and who would prefer to win an Olympic gold medal over an Academy Award or Nobel Prize. He indicates that he exercises 10 hours a week, watches television one hour per week, and has a GPA of 3.33. A resting pulse rate of 58 beats per minute, the oldest of three children and a desire to become a fireman represent other characteristics of this student.

Identify the variables for which data were obtained and classify them as qualitative (categorical)/quantitative, discrete/continuous, and provide a measurement level for each variable.Qual, Nominal: Gender; Smoke; Residence/location; Award type; Planned occupationQual, Ordinal: position among siblingsQuant, Interval: noneQuant, Discrete, ratio: number of siblingsQuant, continuous, ratio: exercise; TV watching; GPA; heart rate

If similar information were obtained from 49 other students, which variables might most likely be used as explanatory variables?Nominal and possibly ordinal variables as they have few categories

Problem 2.6:Iceland: According to World Bank data, 90% of Icelanders have access to the Internet. In order to determine this value, what were the units from which this figure was obtained? What was the variable of interest (objective variable) and what were the values of this variable? Identify the variable’s characteristics (Qual/Quant etc.) (L5p.7)

Units Icelanders Varible: Internet access; qual., nominal (Yes/No)

If one were to look at the number of people worldwide with access to the Internet, we could record the proportion within each country. In doing so, what would be the population units? What was the variable of interest (objective variable) and what were the values of this variable? Identify the variable’s characteristics (Qual/Quant etc.)

Units: Countries Variable: Proportion of country inhabitants with internet access; quant, cont. ratio

4

KEY

STAT 101

TOPICS: Sampling; Worksheet reviews; m&m’s data collectionDOCUMENTS:

HANDOUTS: Green #3; Yellow #3; m&m’s form AVAILABLE ONLINE: Green #3; Yellow #3; PowerPoint placed online (2): 1) Sampling; 2)

Combinations & PermutationsASSIGNMENTS:

Text Readings: Ch. 1: pp. 20-22; Text Problems: p. 6 #15-20, 22, 25-27, 35-38; p.13 #17, 18, 23-25, 30 Items below

NEXT CLASS: Topics: Sampling, Combinations (cont. ?); Qualitative Data (?)Stat Essentials (taken from today): Be able to: 1) identify sampling approaches; 2) understand relationships among basic statistical terms Problem 3.1: The grade for this course is based upon 400 points. These points are converted to a 100-point base to result in a final course grade.

1. How many of the 400 points represent one point of the final grade? 4; 400/100 = x/1 > 100x = 400 > x = 4

2. Extra credit points are added to those you have accrued throughout the course via exams, etc. If you complete ten extra credit exercises, and receive full credit for them all (i.e. 10 points), by how many points will your final grade increase? 10/4 = 2.5

3. Over the course of the semester Elijah elects to not submit three, 12-point class assignments. By how many points would his final grade decrease as a result of not having submitted these three assignments? 36/4 = 12

Problem 3.2: Burglaries: ADT Security Systems advertised that “when you go on vacation, burglars go to work.” Their ad stated that “according to FBI statistics, over 26% of home burglaries take place between Memorial Day and Labor Day.” What is misleading about this statement? (Triola7ed,p15#6)

Approximately 100 days +/- between Memorial Day and Labor Day. One-hundred days is 27.45 of the year, so about ¼ of burglaries in about ¼ of the year.

Problem 3.3: Election: Review the cartoon to the right. Assume that there are 100 boys and 100 girls. Demonstrate using these 200 students how this student’s conclusion is either correct or incorrect. Present your answer using both numerical computations and sufficient discussion to support your findings.

21% of 100 boys = 2130% of 100 girls = 30Together 51 students/ 200 = 25.64% of all support him

31.25.18

5

KEY

Problem 3.4: Variables: Identify the explanatory and objective variables in the following pairs of variables.

A)B) Lung capacity (objective) and number of years smoking cigarettes (explanatory)

C) Blood alcohol content (objective) and the number of alcoholic drinks consumed (explanatory)

D) Year (explanatory) and world record time in a marathon (objective)

Problem 357: O-Tiger price hike: For many years Oneonta had a single-A farm team of the NY Yankees, which was followed by a farm team of the Detroit Tigers. When the Tigers arrived, the prices for seating changed. The following comes from an editorial in the local newspaper about the rise in ticket prices for the local single-A professional baseball team. “General admission season passes for adults will be $155, up from $70, in 2009, while six-seat boxes will go up 500 percent, from $300 to $1,500.” (source: The Daily Star, In Our Opinion column for Feb. 7 & 8, 2009, p D3; this team has since left town)

A) The $85 increase in the single seat price represents how much in terms of a percentage increase?$85/$70 = 1.21 or 121% increase

B) Demonstrate using a numerical analysis whether or not the cost of a six-seat box increased by 500%. $1,500 - $300 = $1,200, which is the cost increase; $1,200/$300 = 4.00 or 400% increase

Problem 3.6: Seat Belts: Suppose that there are 300 students taking statistics and that they are asked if they always use seat belts.

A) If 27% of the students indicate that they do not always use seat belts, how many students is this?

300 * .27 = 81 students OR

27100

= X300

>> X=27∗300100

>> X=81 students

B) Suppose that in different course 20% of the students do not use seat belts and that the 20% represents 43 students. How many students are in this class?

20100

=43X

>> 20 X=43∗100>>20 X=4300 >> X=215 students

6

KEYSTAT 101

TOPICS: Combinations & Permutations; Sampling; Qualitative DataDOCUMENTS

HANDOUTS: Green #4; Yellow #4 AVAILABLE ONLINE: Green #4; Yellow #4; Combinations ppt.; Qualitative ppt.; Related Anatomy Sheets (5): 1)

Anatomy of a Systematic Random Sample; 2) Qualitative Frequency Table; 3) Pie Chart; 4) Pareto Chart; 5) Bar Chart; ExCr#2

HWK: Text Readings: Ch. 2 – pp. 56-57 (pie & pareto charts); Anatomy of Statistics – Bar, Pie, Pareto Charts Text Problems: p. 23 #1-3, 7-10, ODD problems 19-27 Additional problems available on Yellow #3 (Sampling) & #4 (Combinations) Connect your computer to the SUNY Oneonta system: mydesktop.oneonta.edu > follow instructions > should

allow you to use your PC or Mac to access SPSS (no guarantee you will be able to do SPSS from your PC) REVIEW Writing Descriptive Statements on the reverse of this page. ExCr #2 & Items below

Next Class: Qualitative Data; ExCr #2 dueStat Essentials (taken from today): Be able to: Combinations etc.:1) determine the number of samples via combinations; 2) calculate permutations, tree diagrams and the multiplication rule for independent events; Sampling: 3) understand sampling terminology; 4) identify sampling approaches; and if we get there, Qualitative Data: 5) basics of qualitative data analysis (maybe).

TI Calculations: Combinations & Permutations: Math > Prob > select P or C, input the n and r values > enter; Factorials: enter the number, then go to Math > Prob > !

Writing Descriptive Statements:

Descriptive statements merely report data presented in a table, a graph/chart, or a statistic, such as the mean. To write a paragraph about a table: 1) introduce the table; 2) provide descriptive sentences describing some aspect of the data; and 3) write a summary sentence.

Example using the table to the right.[1) Introduction=>] The following table presents residents’ ratings of life in the village. [2) Descriptives=>] Approximately 78% of surveyed residents rated life in the village positively (good to excellent). In contrast, 77 of 347 residents (22.2%) rated life in the village as poor to fair. One hundred Eighty-three residents (52.7%) rated the quality of life as “good.”[3) Conclusion=>] In general, it would appear that most residents (77.8%) are satisfied with the quality of life in the village.

NOTES: The Rating of Quality table is a SPSS generated table. 1) Use the VALID PERCENT column for percentages. DO NOT use the “Percent” column as it includes missing data. 2) If you use words such as most, more than, fewer, approximately, etc., you MUST include supporting statistical evidence. Example: “Most residents rated life in the village as good.” How much is “most,” 30%, 80%? In contrast, “Most residents (52.7%) rated life in the village as good,” provides the reader with context for the descriptive statement.3) If you include numbers representing counts, also include the associated percentage value (e.g. “Eight respondents (15%) liked the movie.”). It is much easier for a reader to understand 15% than to have to figure out what 8 of 53 represents (15%).4) If you start a sentence with a number, as done above in the third descriptive statement, write it out (e.g. NOT “7 respondents liked …,” but rather “Seven respondents liked …” Additionally, the numbers one (1) through ten (10) are generally written out within sentences; others may be displayed numerically.

Problem 4.1: Write a paragraph containing: a sentence introducing the table; two sentences that contain descriptive statements resulting from the table’s content; and a conclusion you can draw from these data. Answers vary

41.30.18

Rating of qual i ty of l i fe in vi l lage

4 1.1 1.2 1.283 23.1 23.9 25.1

183 50.8 52.7 77.860 16.7 17.3 95.117 4.7 4.9 100.0

347 96.4 100.013 3.6

360 100.0

ExcellentVery GoodGoodFairPoorTotal

Valid

SystemMissingTotal

Frequency Percent Valid PercentCumulat ivePercent

7

KEYProblem 4.2: Identify the variables and their characteristics:1: The number of doctors who wash their hands between patient visits.2: The majors of randomly selected students at a university.3: The average weight of mature German Shepherds.4: The category which best describes how frequently a person eats chocolate: Frequently, Occasionally, Seldom, Never.5: The temperature this morning at 7:00 a.m.6: The diameter of major league baseballs.7: The average horsepower of ten randomly selected 1.6L MINI Cooper engines.

Data Source* Variable Qual/Quant Discrete/Cont Nom/Ord/Int/Ratio

1. doctors wash hands qual. N/A N2. students major qual N/A N3. dogs weight quant cont R4. people eat chocolate qual N/A O5. thermometer temperature quant cont I6. baseballs ball diameter quant cont R7. Mini Coopers engine horsepower quant cont R*NOTE: Data Source is the population or sample unit from which you obtain the data, not the variable information (data) collected.

Problem 4.3: The Tax Man Cometh: The Internal Revenue Service wants to sample 1000 tax returns that were submitted last year to determine the percentage of returns that had a refund. Identify a sampling method that would be appropriate in this situation.

Simple Random Sample (SRS) would work

Problem 4.4: Prescription Drug Program: The director of a hospital pharmacy chooses at random 100 people age 60 or older from each of three surrounding counties to ask their opinion of a new prescription drug program. Identify the type of sampling used.

Stratified (non-proportional allocation)

Problem 4.5: Combinations & PermutationsIce Cream: Thirty-one ice cream flavors > three scoops (different flavors) & a banana = one banana split. A) If the order of the flavors did not matter, how many different combinations of ice cream flavors could be made into a banana split? B) How many different ways could a banana split be made if order matters?

Permutation Calculation Combination Calculation

31 P3=31!(31−3 )!

=31∗30∗29∗28!28 !

=26970banana splits

31 C3=31 !(31−3)!3 !

=31∗30∗29∗28 !28 !*3∗2∗1

=31∗5∗291

=4495banana splits

8

KEYSTAT 101

TOPICS: Qualitative Data; SPSSDOCUMENTS

HANDOUTS: Green Sheet #5; Yellow #5 AVAILABLE ONLINE: Green #5; Yellow #5; CA#2

ASSIGNMENTS: Text Readings - Ch. 2 – pp. 56-57 (pie & pareto charts) Text Problems – if we get here: p. 62 #23, 24 (make bar chart instead of pie), 25, 26. CA#2 & Items below.

FORMULAS: Multiplication Rule for Independent Events: k1*k2*k3∗. .. *kn-1 *kn [read as: event 1* event 2 * etc.]

Permutations: n Pr=

n!(n−r )! Combinations:

n C r=n !

(n−r )! r ! Next Class: Topics: Qualitative Data, Contingency Tables; Quantitative Data Analysis (maybe); CA#2 dueStat Essentials (taken from today): Be able to: Sampling - 1) distinguish between probability–based sampling approaches; Qualitative Data - 1) build qualitative tables; 2) build qualitative charts: pie, bar, pareto.

Problem 5.1: Combinations & PermutationsCoca Cola Directors: There are 11 members on the board of directors for the Coca Cola Company. A) If they must elect a chairperson, first vice president, second vice president; and secretary, how many different slates of four candidates are possible? B) If they must form a four-member ethics committee, how many different committees are possible?

Permutation Calculation Combination Calculation

B) Coca Cola

11 P4=11!(11−4 ) !

=11∗10∗9∗8∗7 !7 !

=7920 possible election orders

11 C4=11!(11−4 )! 4 !

=11∗10∗9∗8∗7 !7 !*4∗3∗2∗1

=11∗10∗31

=330possible committees

Problem 5.2:In 2005 a television advertisement for Allstate Auto Insurance noted that last year (2004) 1.3 million people switched to Allstate. What is missing here?

How many left.

Problem 5.3:Permutations & Combinations: You have ten paintings to hang, but only space to hang three. A) How many different ways could these paintings be hung if order matters? B) If order didn’t matter, how many different groups of three paintings could occur?

52.1.18

9

KEY

10 P3=10 !(10−3 ) !

=10∗9∗8∗7 !7 !

=720 10 C3=10 !(10−3 )!3 !

=10∗9∗8∗7 !7!*3∗2∗1

=10∗3∗41

=120

Problem 5.4:Village Life: Identify the variable characteristics below.

Variable: Quality of life in village

Qual/Quant: qualitative Measurement Level: ordinal

Write a brief paragraph regarding the information in this table.

Answers vary

Problem 5.5:Cell Phones:

Variable: Cell Phone Satisfaction Characteristics are: Categorical/Quant Discrete/Continuous/Neither N/O/I/R

Values: 1 = Fair; 2 = Good; 3 = Very Good; 4 = ExcellentData (n=31): 1,2, 3, 3, 3, 2, 2, 3, 3, 4, 3, 1, 3, 1, 3, 3, 3, 2, 2, 4, 3, 3, 3, 2, 2, 4, 4, 3, 3, 3, 3

Task 1: Build a qualitative frequency table of the variable Cell Phone. Include a table title, the variable values, frequencies, relative frequencies, cumulative frequencies, and cumulative relative frequencies.

Task 2: Build a Bar Chart of the variable Anxiety Level. Task 3: Build a Pie Chart of the variable Anxiety Level. Task 4: Build a Pareto Chart of the variable Anxiety Level. Task5: Write a paragraph that introduces the tables & charts, two sentences that describe information contained

within the frequency table, and a summary statement.

Degrees34.9281.36197.2846.44360

Rating of qual i ty of l i fe in vi l lage

4 1.1 1.2 1.283 23.1 23.9 25.1

183 50.8 52.7 77.860 16.7 17.3 95.117 4.7 4.9 100.0

347 96.4 100.013 3.6

360 100.0

ExcellentVery GoodGoodFairPoorTotal

Valid

SystemMissingTotal

Frequency Percent Valid PercentCumulat ivePercent

Rate Cell Phone Company

3 9.7 9.7 9.7

7 22.6 22.6 32.3

17 54.8 54.8 87.1

4 12.9 12.9 100.0

31 100.0 100.0

Fair

Good

Very Good

Excellent

Total

ValidFrequency Percent Valid Percent

CumulativePercent

10

KEY

11

KEYSTAT 101

TOPICS: Qualitative Data; Contingency tablesDOCUMENTS HANDOUTS: Greene Sheet #6; Yellow #6; Contingency Tables Reference (in SPSS:

Crosstabs) VAILABLE ONLINE: Green #6; Yellow #6; Qualitative PowerPoint; Anatomy Sheets (5): Contingency tables;

Qualitative Frequency Table; Pie Chart; Bar Chart; Pareto Chart; PowerPoint: Quantitative DataHWK:

Text Readings - Ch. 2 sections 2.1 – 2.2 Text Problems: p. 61-64 - # 26, 33, 35, 37, 38. ALSO, problems from Green #5. Yellow #6 problems not used in class Items below

Next Class: Contingency Tables (?); Quantitative Data: Tables & Charts

Stat Essentials (taken from today): Be able to: 1) calculate relative frequency for response values; 2) build a qualitative frequency tables & charts (pie, bar, pareto); 3) build & interpret contingency tables; 4) utilize SPSS to make tables and charts.

Problem 6.1:Contingency Table: During a past semester there were 39 purple cars registered on campus. Who owned these cars and what types of purple cars are various registrants driving? Contingency tables can be used to break the data into sub-groups, thereby providing more information about who, in this case, owns purple cars. Consider:

Which is the column variable and why? What is the size of this table? What happens to cases where one or both variables are

not available? Once built, do you see a trend within the data?

Create a 4x2 contingency table of the variables Status and Gender. Include column and row percentages. What does this crossing of these two variables tell you about the population of PCP’s (purple car people)?

1) Gender is the column variable (explanatory/independent variable). Status (registration type) is the objective/dependent variable (rows).

2) Table size is dependent on the number of objective variable values, here 4, followed b the number of values of the explanatory variable, here 2.

3) Deleted from the table as you would not know where to place the count.

4) Look at the shift in percentages, not counts.

COLOR TYPE STATUS GENDER MAKE ORIGINPurple Four-Door Commuter F DODGE AmericanPurple Four-Door Commuter F BUICK AmericanPurple Four-Door Commuter F CHRYSLER AmericanPurple Four-Door Commuter M SATURN AmericanPurple Four-Door Commuter M HONDA AsianPurple Four-Door Commuter F CHEVROLET AmericanPurple Four-Door Commuter F CHEVROLET AmericanPurple Four-Door Commuter F DODGE AmericanPurple Four-Door Commuter F DODGE AmericanPurple Four-Door Commuter F CHRYSLER AmericanPurple Four-Door Commuter F MERCURY AmericanPurple Four-Door Commuter F CHEVROLET AmericanPurple Suburban Commuter M SATURN AmericanPurple Two-Door Commuter F DODGE AmericanPurple Two-Door Commuter F CHEVROLET AmericanPurple Four-Door Commuter F HONDA AsianPurple Four-Door Commuter F PLYMOUTH AmericanPurple Four-Door Commuter F FORD AmericanPurple Four-Door Commuter F DODGE AmericanPurple Two-Door Commuter M ACURA AsianPurple Suburban Faculty M CHEVROLET AmericanPurple Suburban Faculty F DODGE AmericanPurple Two-Door Faculty M VOLKSWAGEN EuropeanPurple Two-Door Resident F CHEVROLET AmericanPurple Four-Door Resident M FORD AmericanPurple Two-Door Resident F CHEVROLET AmericanPurple Two-Door Resident M VOLKSWAGEN AmericanPurple Four-Door Resident M OLDSMOBILE AmericanPurple Four-Door Resident M SATURN AmericanPurple Four-Door Resident F PONTIAC AmericanPurple Four-Door Resident M PLYMOUTH AmericanPurple Four-Door Resident F TOYOTA AsianPurple Suburban Resident M MERCURY AmericanPurple Two-Door Resident M ACURA AsianPurple Four-Door Staff M HONDA AsianPurple Four-Door Staff F TOYOTA AsianPurple Pickup Truck Staff M DODGE AmericanPurple Suburban Staff F PLYMOUTH AmericanPurple Van Staff F PLYMOUTH American

62.6.18

12

KEY

Problem 6.2:Contingency Table: Create a contingency table that crosses vehicle Type with Status. Which variable should be the column variable and which the row variable?

Consider: Which is the column variable and why? What is the size of this table? What happens to cases where one or both variables are

not available? Once built, do you see a trend within the data?

1) Status is the column variable (explanatory/independent variable). Type (car type) is the objective/dependent variable (rows).

2) Table size is dependent on the number of objective variable values, here 5, followed b the number of values of the explanatory variable, here 4.

3) Deleted from the table as you would not know where to place the count.

4) Look at the shift in percentages, not counts.

COLOR TYPE STATUS GENDER MAKE ORIGINPurple Four-Door Commuter F DODGE AmericanPurple Four-Door Commuter F BUICK AmericanPurple Four-Door Commuter F CHRYSLER AmericanPurple Four-Door Commuter M SATURN AmericanPurple Four-Door Commuter M HONDA AsianPurple Four-Door Commuter F CHEVROLET AmericanPurple Four-Door Commuter F CHEVROLET AmericanPurple Four-Door Commuter F DODGE AmericanPurple Four-Door Commuter F DODGE AmericanPurple Four-Door Commuter F CHRYSLER AmericanPurple Four-Door Commuter F MERCURY AmericanPurple Four-Door Commuter F CHEVROLET AmericanPurple Suburban Commuter M SATURN AmericanPurple Two-Door Commuter F DODGE AmericanPurple Two-Door Commuter F CHEVROLET AmericanPurple Four-Door Commuter F HONDA AsianPurple Four-Door Commuter F PLYMOUTH AmericanPurple Four-Door Commuter F FORD AmericanPurple Four-Door Commuter F DODGE AmericanPurple Two-Door Commuter M ACURA AsianPurple Suburban Faculty M CHEVROLET AmericanPurple Suburban Faculty F DODGE AmericanPurple Two-Door Faculty M VOLKSWAGEN EuropeanPurple Two-Door Resident F CHEVROLET AmericanPurple Four-Door Resident M FORD AmericanPurple Two-Door Resident F CHEVROLET AmericanPurple Two-Door Resident M VOLKSWAGEN AmericanPurple Four-Door Resident M OLDSMOBILE AmericanPurple Four-Door Resident M SATURN AmericanPurple Four-Door Resident F PONTIAC AmericanPurple Four-Door Resident M PLYMOUTH AmericanPurple Four-Door Resident F TOYOTA AsianPurple Suburban Resident M MERCURY AmericanPurple Two-Door Resident M ACURA AsianPurple Four-Door Staff M HONDA AsianPurple Four-Door Staff F TOYOTA AsianPurple Pickup Truck Staff M DODGE AmericanPurple Suburban Staff F PLYMOUTH AmericanPurple Van Staff F PLYMOUTH American

13

KEYSTAT 101

TOPICS: Contingency Tables (cont.); Quantitative Data Tables, Charts & GraphsDOCUMENTS

HANDOUTS: Green #7; Yellow #7 (Quantitative Data) AVAILABLE ONLINE:

o Green #7; Yellow #7; o PowerPoint (2): Quantitative Data; Frequency Polygons & Ogives; o Anatomy Sheets (4): Quantitative Frequency Table; Histogram, Dot Plot, Stem-and-Leaf; o ExCR#3 online

HWK: Text Read: Ch. 2 sections 2.1, 2.2. Text Problems: p. 47 #2 – 5, 7, 8, 11, 19, 21, 25, 31 (#31: freq. table AND freq. histogram) Items below.

Next Class: Quantitative Data – Charts (cont.); ExCr#3 due; Exam #1 distributedStat Essentials (taken from today): Be able to: 1) build a frequency table appropriate for the presentation of quantitative data; 2) identify frequency table components – classes, boundaries, etc.; 3) build charts/graphs appropriate to quantitative data – histogram, dot plot, stem-and-leaf, frequency polygon, ogive (won’t get to all today).

Problem 7.1: Contingency Tables: Using the table below, find the requested percentages or counts. The data present three cities in which houses were sold and during which month the houses sold.

What is the size of this contingency table? __4__ by _3___

Among houses sold in Arlington, what percent were sold in June and July? 48.6%

During August 32.3% of the houses were sold in Fort Worth.

Among all houses 11.2% were sold in Dallas during September.

Looking at the right marginal totals column what does the 131 value represent? Houses sold in July

T or F: Thirty percent of the houses sold in June were sold in Dallas.

T or F: Thirty-two percent of the houses were sold in Fort Worth.

T or F: Of the houses sold in July, approximately 25% were sold in Dallas

72.8.18

14

KEYProblem 7.2: Obtaining statistical output and providing analysisIdeal Weight: Twenty five students reported their ideal weights (in most cases, not their current weight). Weights (lb): 110 115 123 130 105 119 130 125 120 115 120 120

120 110 120 150 110 130 120 118 120 135 130 135110

Create a frequency table containing five classesWrite a paragraph containing an introduction and a minimum of two descriptive statements.

Identify:Midpoint of the third class: 129.5 lb. Boundaries of the first class: 104.5, 114.5 lb.

Class limits of the second class: 115, 124 lb. Width of the classes: 10 lb.

Based upon your table make the following charts/graphs (NOTE: do only those demonstrated today)Histogram Dot plot Stem and Leaf Frequency Polygon Ogive

double] Stem-and-leaf of weight lb. N = 25Leaf Unit = 1.0

10 5 11 0000 11 5589 12 00000003 12 5 13 0000 13 55 14 14 15 0

Histogram not made.

Frequency polygon and ogive not available. See class notes.

IDEAL WEIGHTS

Weight (lb.) f rf cf crf105-114 5 0.20 5 0.20115-124 12 0.48 17 0.58125-134 5 0.20 22 0.88135-144 2 0.08 24 0.96145-154 1 0.04 25 1.00Totals: 25 1.00 - -

15

KEYSTAT 101

TOPICS: Quantitative Tables & Charts (cont.)DOCUMENTS:

HANDOUTS: Green #8; EXAM #1 AVAILABLE ONLINE: Green #8; PowerPoints: Distribution Shapes; Sigma; Quantitative

Data – Freq. Polygon, Ogive, Line Chart ppt. ANATOMY REFERENCE SHEETS: Contingency Tables; various charts for quantitative data

HWK: Text Reading: Ch. 2, review sections 2.1-2.2 Text Problems: p. 47 #13, 15, 19, 29, 37, using #32, only make a dot plot, no table; using #35, only make a stem-and-

leaf plot (no table) Items below

Next Class: DUE EXAM #1; Sigma; Line Charts; Distribution Shapes; and …

Stat Essentials (as with prior class): Be able to: 1) build a frequency table appropriate for the presentation of quantitative data; 2) identify frequency table components – classes, boundaries, etc.; 3) build charts/graphs appropriate to quantitative data – histogram, dot plot, stem-and-leaf, frequency polygon, ogive ; 4) build a time series chart (line chart).

Problem 8.1: Nutrition Bars: (NOTE: Lower Class Limits shown)

How many values are in the data set? 75

How many classes are there? 11

What is the width of a class? 50

How many milligrams of sodium are in the nutrition bar with the highest value? >500

Explain how a relative frequency histogram would differ from the displayed chart. (Hint: Think about how frequency and relative frequency bar charts differ?) CHANGE OF Y-AXIS LABEL AND SCALE

Problem 8.2: OOPS – a repeat of 7.1 Contingency Tables: Using the table below, find the requested percentages or counts. The data present three cities in which houses were sold and during which month the houses sold.

What is the size of this contingency table? __4__ by _3___

Among houses sold in Arlington, what percent were sold in June and July? 48.6%

During August 32.3% of the houses were sold in Fort Worth.

Among all houses 11.2% were sold in Dallas during September.

Looking at the right marginal totals column what does the 131 value represent? Houses sold in July

T or F: Thirty percent of the houses sold in June were sold in Dallas.

T or F: Thirty-two percent of the houses were sold in Fort Worth.

T or F: Of the houses sold in July, approximately 25% were sold in Dallas

82.13.18

16

KEYProblem 8.3: Retirement Ages: (Larson & Farber p. 51 #39)Using the following data, build both a single stem and a double stem, stem-and-leaf.

Ages:70 54 55 71 57 58 63 65 60 66 57 6263 60 63 60 66 60 67 69 69 52 61 73

SINGLE STEM-AND-LEAF DOUBLE STEM-AND-LEAF

Retirement Age Retirement AgeKey: Leaf = 1.0 Key:52 = 5|2 Key: Leaf = 1.0 Key:52 = 5|2

Stem Leaf Stem Leaf5 2 4 5 7 7 8 5 2 46 0 0 0 0 1 2 3 3 3 5 6 6 7 9 9 5 5 7 7 87 0 1 3 6 0 0 0 0 1 2 3 3 3

6 5 6 6 7 9 97 0 1 3

Problem 8.4:Retirement Ages: Using the data from problem 8.3, build a frequency table containing five classes. [NOTE: Without a stated starting point, make sure that the minimum value fits into the first class and the maximum value is within the fifth class. If the latter does not occur, either shift the starting point of the first class, while maintaining class width, or increase the size of the class width.]

Ages (yrs.) Calc. Class Size Retirement Ages52 60 66 Max: 73 Age (yrs.) f rf cf crf54 60 66 Min: 52 50-54 2 0.083 2 0.08355 61 67 Range: 21 55-59 4 0.167 6 0.2557 62 69 60-64 9 0.375 15 0.62557 63 69 5 Classes divided 65-69 6 0.250 21 0.87558 63 70 into range = 4.2=> 70-74 3 0.125 24 1.00060 63 71 round up to 5 Total: 24 1.000 - -60 65 73

17

KEYSTAT 101:

TOPICS: Distribution Shapes; Time Series Charts; Sigma; Measures of Center DOCUMENTS:

HANDOUTS: Green #9; Yellow #9; CA#3 AVAILABLE ONLINE: Green #9; Yellow #9; PowerPoints for: 1) Sigma, 2) Measures of

Center; 3) Distributions, 4) Time Series Charts; Anatomy sheets: SigmaASSIGNMENTS:

Text Readings: Ch. 2, p. 59 (Time Series), sections 2.3, 2.4 Text Problems: p. 48 #21, 23; p. 60 #17 (single stem), 18 (double stem), 29, 32, 34 (part c & d only; scatter plots

when we get to correlation) =>>>NOTE: one of these problems may be H8 next class. CA#3 Items below

NEXT CLASS: Measures of Center & Measures of Variation; =>>>>DUE: CA#3Stat Essentials (taken from today): Be able to: DISTRIBUTIONS: 1) identify distribution shapes and characteristics; TABLES & CHARTS: 1) build a time series line chart; SIGMA: 1) understand what the symbol (Sigma) means; 2) successfully demonstrate the application of MEASURES OF CENTER 1) calculate an arithmetic mean, median, mode; 2) identify when you would use each of these measures.

Problem 9.1:An introductory statistics class had three exams, for which the grade distribution of each exam is presented to the right. For each exam describe the distribution’s shape and comment on the exam’s difficulty. [Consider the x-axis point C1 to be the exam’s mean grade.]

Distribution Shape Difficulty of Exam (easy, average, hard)

Exam #1: Positive (right-skewed) Hard

Exam #2: Normal Average

Exam #3: Negative (left-tailed) Easy

Problem 9.2:SIGMA: Given the following data for X and Y, determine the values of Sxx and Syy.

X Y S xx=Σx 2

−(Σx )2

n =

7131.5 S yy=Σy 2

−(Σy )2

n = 559.5

92.15.18

X Y x2 Y2 XY20 4 400 16 8038 10 1444 100 38010 2 100 4 2068 8 4624 64 544

104 28 10816 784 291287 83 7569 6889 7221

327 135 24953 7857 11157X Y X2 Y2 XY

18

KEY

Problem 9.3:SIGMA: Given the following data for X and Y, determine the values of r. [Note: same data as in problem 9.2.]

X Y r= nΣ xy−(Σx)(Σy )

√ [nΣx2−(Σx )2 ] [nΣy2−(Σy )2 ]=

.92620 4 See table prior page

38 10

10 2

68 8

104 28

87 23

Problem 9.4Time Series: Create a time series chart of the Murder Rates in New York City

19

KEYMurder Rates in NYC*

Year Rate1990 14.51991 14.21992 13.21993 13.31994 11.11995 8.51996 7.41997 6.01998 5.11999 5.02000 5.02001 5.02002 4.82003 4.92004 4.62005 4.52006 4.82007 4.22008 4.32009 4.02010 4.52011 3.92012 3.52013 3.02014 2.7

*per hundred thousand STAT 101

TOPICS: Measures of Center & VariationDOCUMENTS:

HANDOUTS: Green sheet #10; Yellow #10 AVAILABLE ONLINE: Green #10; Yellow #10; PowerPoint: Measures of Center, Measures of Variation, Measures

of Position.ASSIGNMENTS:

Text Review: Ch. 2 section 2.3-2.4; read 2.5 Text Problems: p. 77 # 55, 56 [NOTE: INSTRUCTIONS FOR THESE TWO PROBLEMS ARE ON P. 76] Yellow Sheet #13, as indicated in class Items below

FORMULAS (For Samples): Mean Variance Standard Deviation

Definition formula): x=∑i=1

n

x i

ns2=

Σ( x−x¿)2

n−1 s=√∑( x− x )2

n−1

Computational formula): aboves2=

nΣx2−(Σx )2

n(n−1) s=√ n∑ x2−(Σx)2

n (n−1 )Median: 1) middle score if odd number of values; 2) mid-point between two middle scores if even number of valuesMode: most frequent value (multiple modes may exist); represents the center of qualitative data

Midrange: minimum+maximum

2NEXT CLASS: Measures of Variation & Position

Stat Essentials (taken from today): MEASURES OF CENTER: Be able to: 1) calculate an arithmetic mean, median, mode; 2) identify when you would use each of these measures; MEASURES OF VARIATON: Be able to:1) explain what a

102.20.18

20

KEYstandard deviation represents; 2) obtain a standard deviation using either formula (definition or computational); 3) interpret what the standard deviation represents in a given situation; 4) Empirical Rule;.5) Chebychev’s Theorem.

Problem 10.1:Blood Pressure: Given the following sample of systolic blood pressures, determine their mean, median, mode, and midrange.

Systolic Pressures: 120 145 86 133 115 124 153 98 144 132

Problem 10.2 CRICKETS:

THE DATA: Temperature vs. Cricket Chirps: Crickets make a chirping noise by sliding their wings over each other. Perhaps you have noticed that the number of chirps seems to increase with the temperature. The following data list the temperature (Fahrenheit) and the number of chips per second for the striped ground cricket.

X: Temperature (Fo): 69.4 69.7 71.6 75.2 76.3 79.6 80.6 80.6 82.0 82.6 83.3 83.5 84.3 88.6 93.3Y: Chirps/second: 15.4 14.7 16.0 15.5 14.4 15.0 17.1 16.0 17.1 17.2 16.2 17.0 18.4 20.0 19.8 Given: xxyyxy

Determine the mean, median, mode for both of these variables.

Temp ChirpIQR 8.3 1.81.5(IQR) 12.45 2.7LL 62.75 12.7UL 95.95 19.9

21

KEY

22

KEYSTAT 101:

TOPICS: Measures of Variation; Measures of PositionDOCUMENTS:

HANDOUTS: Green #11; Yellow #11 AVAILABLE ONLINE: Green #11; Yellow #11, PowerPoint: Measures of

Position; CA#4ASSIGNMENT:

Text Review: Ch. 2 section 2.4-2.5 Text Problems: p. 90 #1 – 11, 15 – 18, 25, 29 Yellow #11, as needed CA#4, ExCr#4 (optional)

FORMULAS (For Samples): Mean, Median, Mode, Midrange, Variance, Standard Deviation on prior pageFive-Number-Summary: Minimum – Q1 – Q2 – Q3 - MaximumChebychev’s Theorem: 1- 1/k2 where k ≥2; Used with any shape distributionEmpirical Rule: Used with normally distribution data

z-score: z= x−μ

σ Pearson’s I=3( x−median )

sCVAR= s

x¿ ∗100 %

Q1=n+1

4 M=Q 2=n+1

2 Q3=3(n+1)

4NEXT CLASS: Measures of Position (cont.); DUE: CA#4, ExCr#4Stat Essentials (taken from today): Measures of Variation: Be able to:1) explain what a standard deviation represents; 2) obtain a standard deviation using either formula (definition or computational); 3) interpret what the standard deviation represents in a given situation; 4) Empirical Rule; Measures of Position: 1) obtain the five-number-summary; 2) interquartile range; 3) build a box plot; 4) determine variability measures: skew (Pearson’s I), Coefficient of Variability, z-score.

Problem 11.1:Blood Pressure: Given the following sample of systolic blood pressures, determine their variance, standard deviation.

Systolic Pressures: 120 145 86 133 115 124 153 98 144 132NOTE: Calculations for the mean, median, mode are on prior green sheet (Problem 10.1).

112.22.18

23

KEYCRICKETS:THE DATA: Temperature vs. Cricket Chirps: Crickets make a chirping noise by sliding their wings over each other. Perhaps you have noticed that the number of chirps seems to increase with the temperature. The following data list the temperature (Fahrenheit) and the number of chips per second for the striped ground cricket.

X: Temperature (Fo): 69.4 69.7 71.6 75.2 76.3 79.6 80.6 80.6 82.0 82.6 83.3 83.5 84.3 88.6 93.3Y: Chirps/second: 15.4 14.7 16.0 15.5 14.4 15.0 17.1 16.0 17.1 17.2 16.2 17.0 18.4 20.0 19.8 Given:

xxyyxy

NOTE: Calculations for the mean, median, mode for both of these variables are on prior green sheet (Problem 10.2).

Problem 11.2: Using the computational formula, determine the standard deviations for these two variables.

Temp ChirpIQR 8.3 1.81.5(IQR) 12.45 2.7LL 62.75 12.7UL 95.95 19.9

Problem 11.3 (z-score): A temperature of 93.3o is how many standard deviations away from the mean? Sixteen chirps per second is how many standard deviations away from the mean? (z-score questions)

Temp: z = (93.3 – 80.04)/6.707 = 1.977 s.d above meanChirps: z = (16 – 16.65)/1.70 = -.38 s.d below mean

Problem 11.4 (CVar): Which variable, Temperature or Chirps/sec, shows greater variability?

Variation: Temp: CVAR = (6.71/80.04)*100 = 8.4%Chirps: CVAR = (1.7/16.65)*100 = 10.2%

Chirps has greater variability

Problem 11.5 (Skew): Determine the degree of skew for each of these variables.24

KEY

Value in table above, but it will vary some from a value calculated by the Pearson’s I formula.

EXAMINATION PROBLEMS TO BE COMPLETED WITHOUT ASSISTANCE STAT 101:TOPICS: Exam ReviewDOCUMENTS:

HANDOUTS: Green #12; Yellow #12 AVAILABLE ONLINE: Green #12; Yellow #12; Exam #2 review sheet – review

sample exams for Exam #1 (if a topic is unfamiliar, we may not have done that yet)HWK:

Review Ch. 1, Ch. 2 Yellow Sheet #12, as needed - finish off review problems (answers online) Exam #2 take-home portion (this sheet)

FORMULAS: Available on prior orange sheetsNEXT CLASS: Exam #2; Take-home section dueStat Essentials (taken from today): A review of sample items representing potential exam topics.

Exam 1 Topic Areas for Review:See Handouts distributed through today. This is a comprehensive, cumulative exam.Potential exam topics:1. terms and their relationships2. assessing table & chart content for accuracy3. interpretation & discussion of table & chart content4. recognition of appropriate/inappropriate data presentation5. sampling techniques – identification and calculation6. combinations, permutations, multiplication rule for independent events, tree diagrams7. building qualitative tables (2 types) and charts (pie, pareto, bar)8. building quantitative tables (2 types) and charts (histogram, dot plot, stem-and-leaf, ogive; freq. polygon)9. measures of center: mean, median, mode, midrange10. measures of variation: range, standard deviation, inter-quartile range (Q3 –Q1)11. measures of position: minimum, Q3, Q2, Q1, maximum12. Empirical Rule & Chebychev’s Theorem13. SIGMA

EXAM #1 TAKE-HOME PROBLEMS [10]:

GENERAL INSTRUCTIONS FOR TAKE-HOME PROBLEMS (on reverse): On this sheet , complete the problem set to which you have been assigned. Submit this sheet at the beginning of class #13 (exam day). The problems completed on other/additional sheets

or the substitution of other sheets (e.g. notebook page) will not be accepted. This sheet will not be accepted after the exam has begun . Grading values listed with problems. Where not identified, the general grading process occurs (-.5 first error,

etc.).

Problem Set Assignment: See the listing below for the problem set to which you have been assigned.

122.27.2018

25

KEYProblem Set#1 Problem Set #2 Problem Set #3 Problem Set#1 Problem Set #2 Problem Set #3 Problem Set#1 Problem Set #2 Problem Set #38:30 AM 8:30 AM 8:30 AM 10:00 AM 10:00 AM 10:00 AM 2:30 PM 2:30 PM 2:30 PMAmbrosio Kelly Andrade Anson Buritica Miller Alston Fitzpatrick DevineBull Koch Biittig Assini Geyer Nicholas Argondizza Gocool OramDigiovanni Leone Ruggieri Boyd Heisner Pepe Blydenburgh Karafian RosensteinDougan Martinez, A Russo Brock Held Smith, A Bretts Klein SachDuran Martinez, K Schweikhard Buscemi Jacobs Smih, C Brun Lewis SchmidtFitzpatrick Mathew Seuling Dobert Kilichowski Stieglitz Cates Lucic SchraderGallo Mcbride Struble Drabczyk Krupunich Tilke Chambers MacMillan SherwinGaul McGee Szemansco Evaniak Ludovico Trumble Cody Miro SrourKarpinski Montgomery Taranovich Garcia Margolin Vanglad Clark Moulder TyburskiPetix Scaglione Thornton Kisacky McGlynn Vermette Emerson Wescott MurilloSausville Sakiel

26

KEYNAME: _____________________ Problem Set #: _____ EXAM #1: Take-home problems [10 pts; No name or problem number -1]

The systolic blood pressures (in mmHg) of 12 women between the ages of 20 and 35 were measured before and after administration of a newly developed oral contraceptive are presented in Table 1.

Problem Set 1:1. [6] Using the data provided in Table 1, complete the following sigma problem.2. [4] Text problem: page 73 #20.

Problem Set 2:1. [6] Using the data provided in Table 1, complete the following sigma problem.2. [4] Text problem: page 73 #22.

Problem Set 3:1. [6] Using the data provided in Table 1, complete the following sigma problem.2. [4] Text problem: page 74 #30.

SIGMA PROBLEM [6; show work in an orderly fashion; grading 4 – work, 2 - answer]

Answers are on the exam #2 KEY

TEXT PROBLEM [4] (See instructions prior to problem #17 in text; only answers, no computations)

MEAN: _____ MEDIAN: _____ MODE: _____ (got labels?)

Above measure(s) that should not be used to represent the data center: __________________

To discuss the center of the data which measure would seem most appropriate? Mean Median Mode

nxx

nyxxy

b 22

1 )(

))((

22

2

0 )()())(())((

xxnxyxxyb

221 )()())((

xxnyxxynb

Table 1: Systolic Blood Pressure of 12 Women1

Systolic BP workspace for calculations if neededn = 12 X Y X2 Y2 XYSubject Before After

1 132 1372 126 1283 132 1404 120 1195 142 1456 130 1307 142 1488 137 1359 128 12910 120 13211 128 12812 129 133

SUMS:

n = 12 x y x2 y2 xy1between ages of 20 and 35 in mmHg

27

KEYSTAT 101

TOPICS: EXAM #1DOCUMENTS:

HANDOUTS: Green #13 AVAILABLE ONLINE: Green #13

ASSIGNMENTS: TAKE A BREAK AFTER YOU READ THIS (could be the next quiz)

NEXT CLASS: Measures of Position; Correlation & Regression beginsStat Essentials (taken from today): To see what we know

Portion sizes increase in 'Last Supper' paintings (An application of statistics)By Nanci Hellmich, USA TODAY (3/23/2010)

If your food portions seem to have grown larger over the years, you have some blessed company.

Two researchers analyzed the food and plate sizes in 52 of the most famous paintings of The Last Supper and found that the portion sizes in the paintings have increased dramatically over the past millennium, from years 1000 to 2000.

Using a computer program, they compared the size of loaves of bread, main dishes and plates to the size of the heads of the disciples and Jesus in the artwork, including Leonardo da Vinci's famous depiction of the event.

Findings published in April's International Journal of Obesity: Over that 1,000-year period, the main course size increased by 69%, plate size 66% and loaves of bread 23%. The biggest increases in size came after 1500.

The researchers used paintings of this event "because it is the most famous supper in history," which artists have been painting for centuries, so the paintings provide information about plate and entree sizes over time, says Brian Wansink, director of the Cornell (University) Food and Brand Lab in Ithaca, N.Y. One possible reason for the increase: Food may have become more available and less expensive, he says.

He did the research with his brother, Craig, a professor of religious studies at Virginia Wesleyan College in Norfolk, and a Presbyterian minister.

The three Gospels (Matthew, Mark and Luke), which include descriptions of The Last Supper, mention only bread and wine, but many of the paintings have other foods, such as fish, lamb, pork and even eel, says Craig Wansink.

The use of fish in the meals is symbolic because it's an image that is used to represent Christianity, he says. Among the reasons for the symbolism: A number of the disciples were fishermen, and Jesus told them "to be fishers of men," he says. Plus, he says, Jesus performed several miracles with fishes and loaves.

As Easter approaches, he says, people may want to study the paintings because they illustrate one of the "most important moments in Christianity

ABOVE: "The Last Supper" painting by Duccio, 1308-11. Note the size of the food and drink on the table compared to the size of the heads of Jesus and his disciples.BELOW: "The Last Supper" painting by Tiziano Vecellio Titian.

133.1.18

28

KEYSTAT 101

TOPICS: Box plots; z-score; Pearson’s I, Coefficient of Variability (CVAR)DOCUMENTS:

HANDOUTS: Green #14 AVAILABLE ONLINE: Green #14; PowerPoints: Measures of Position,

Correlation, RegressionHWK:

Text Readings: Ch. 9,sections 9.1, 9.2; Ch. 2 scatter plot p. 58 Text Problems: p. 107 #7 – 14, 21, 27, 28, 31, 61. Items below

FORMULAS: Mean, median, quartiles, variance, standard deviation on prior sheets

z-score: z= x−μ

σ Pearson’s I= 3( x−median )

sCVAR= s

x¿ ∗100 %

NEXT CLASS: Correlation & Regression; ScatterplotsStat Essentials (taken from today): Be able to: 1) obtain the five-number-summary; 2) build a box plot; 3) determine variability measures: skew (Pearson’s I), Coefficient of Variability, z-score. If we get there: 1) describe general steps leading to regression; 2) make a scatter plot; 3) calculate the Pearson Product Moment Correlation Coefficient, r.

CRICKETS:

THE DATA: Temperature vs. Cricket Chirps: Crickets make a chirping noise by sliding their wings over each other. Perhaps you have noticed that the number of chirps seems to increase with the temperature. The following data list the temperature (Fahrenheit) and the number of chips per second for the striped ground cricket.

X: Temperature (Fo): 69.4 69.7 71.6 75.2 76.3 79.6 80.6 80.6 82.0 82.6 83.3 83.5 84.3 88.6 93.3Y: Chirps/second: 15.4 14.7 16.0 15.5 14.4 15.0 17.1 16.0 17.1 17.2 16.2 17.0 18.4 20.0 19.8 Given: xxyyxy

REFER TO ORANGE #11 FOR YOUR ANSWERS TO THE FOLLOWING: Determine the mean, median, mode for both of these variables. Using the computational formula, determine the standard deviations for these two variables.

Problem 14.1: Skew (repeat of 11.5)Determine, using Pearson’s I, whether or not the variables are skewed.

Temperature I: _____ Chirps I: _____

Yes No (circle one): Given the value of “I,” we would consider this variable approximately normally distributed.

Temperature: Yes No Chirps: Yes No

Value in table above, but it will vary some from a value calculated by the Pearson’s I formula.

143.13.18

29

KEY

Problem 14.2: z-scoreFor the variable’s maximum value, determine how many standard deviations it is away from the mean.

Temperature z = _____ Chirps z: _____

TEMP: (93.3-80.04)/6.707 = 1.97

CHIRPS: z= x−x

s=20−16 . 65

1 . 70=1. 97

Problem 14.3: Variability (repeat of 11.4) Determine the variability of the variable.

Temperature CVAR = _____ Chirps CVAR: _____

Variation: Temp: CVAR = (6.71/80.04)*100 = 8.4%Chirps: CVAR = (1.7/16.65)*100 = 10.2%

Chirps has greater variability

As a result of comparing variability via CVAR, it appears that ______________ has greater variability than ___________.

Problem 14.4: Box plotsCreate a modified box plot for each variable. Although you may not need these values, calculate the IQR, upper limit, and lower limit.

See table on #11 for quartiles, etc.

Temperature (all Celsius; see prior table for many values)

Min: ____ Q1: ____ Q2: ____ Q3: ____

MAX: ____ IQR: _8.3___ L. Limit: _62.75___

U. Limit: _95.95___ Adj. Pt. (if any): _none___

Chirps (all seconds [not/minute as in box plot]cm; see prior table for many values)

Min: ____ Q1: ____ Q2: ____ Q3: ____

MAX: ____ IQR: _1.8___ L. Limit: _12.7___

U. Limit: _19.9___ Adj. Pt. (if any): _20.0___

30

KEY

STAT 101

TOPICS: Correlation & Regression DOCUMENTS:

HANDOUTS: Green #15; Yellow #15 AVAILABLE ONLINE: Green #15; Yellow #15; Correlation ppt. & Regression ppt. TABLE: Pearson Correlation Coefficient table

HWK: Text Readings (review): Ch. 9, sections 9.1, 9.2; Ch. 2 p. 58 (scatter plot) Text Problems: p. 495 #1-4, 9-12, 15-21, 23 CA#5 ExCr #5 (optional) Items below

FORMULAS: Hypotheses: Null Hypothesis (H0) vs. Alternative Hypothesis (H1) for correlations:

Statistically: In Words:

H0: ρ=0 (rho=0 ) H0: There is no linear relationship between variable X and variable Y.

H1: ρ≠0 (rho≠0 ) H1: There is a linear relationship between variable X and variable Y.[NOTE: Substitute actual variables in for “variable X” and “variable Y.”]

EXAMPLE:

H0: ρ=0 (rho=0 ) H0: There is no linear relationship between HEIGHT and WEIGHT.

H1: ρ≠0 (rho≠0 ) H1: There is a linear relationship between HEIGHT and WEIGHT.

Correlation:r= nΣ xy−(Σx )(Σy )

√nΣx2−(Σx )2√nΣy2−(Σy )2

Regression: [where b1 = slope and b0 = y-intercept]

y¿

=b0+b1 x b1=

nΣ xy−(Σx )(Σy)n (Σx2)−(Σx )2 b0= y−b1 x or

b0=(Σy )(Σx2 )−(Σx )(Σ xy )

n(Σx2 )−(Σx)2

NEXT CLASS: Correlation & Regression; ExCr#5 due; CA#5 distributedStat Essentials (taken from today): 1) Be able to: 1) identify the steps leading to correlation and regression; 2) conduct a correlation by hand and via SPSS; 3) Scatter Plot – building and interpreting; 4) Basics of Regression.

Problem 15.1:Woodpeckers: Forest managers are increasingly concerned about the damage done to animal populations when forests are clear-cut. Woodpeckers are a valuable forest asset, both because they provide nest and roost holes for other animals and birds and because they prey on many forest insect pests. The article “Artificial Trees as a Cavity Substrate for Woodpeckers,” (Journal of Wildlife Management [1983]) reported on a study of how woodpeckers behaved when provided with polystyrene cylinders as an alternative roost and nest cavity substrate. Noted below are selected values of X = ambient temperature (Co) and Y = cavity depth (in centimeters). [Devore & Peck p. 558]

Observation: 1 2 3 4 5 6 7 8 9 10 11 12 X: Temperature (Co): -6 -3 -2 1 6 10 11 19 21 23 25 26Y: Hole Depth (cm): 21.1 26.0 18.0 19.2 16.9 18.1 16.8 11.8 11.0 12.1 14.8 10.5

Given: n = 12 x = 131 y = 196.3 x2 = 2939 y2 = 3445.25 xy = 1622.3

1) What is being studied? Relationship between ____________________________________

153.15.18

NOTE: Insert the study’s variables in place of “variable X” and “variable Y.”

31

KEY

2) Does it make sense to study a relationship between these two variables? _____ Why? ________________________________________________________________________3) State in words the null and alternative correlation hypotheses to be tested.

H 0 : ρ=0 H 0 : There is no relationship between temperature and hole depth

H 1 : ρ≠0 H 1 : There is a relationship between temperature and hole depth

4) Make a scatter plot of these variables.

5) Given that the scatter plot indicates a relationship between these two variables. Calculate the correlation for the two variables.

r = -0.876 Sig. Level: = .01

Given: n = 12 x = 131 y = 196.3 x2 = 2939 y2 = 3445.25 xy = 1622.3

r= nΣ xy−(Σx)(Σy )

√ [nΣx2−(Σx )2 ] [nΣy2−(Σy )2 ]=

12(1622 .3)−(131)(196 .3)[ Sqrt (12(2939 )−1312 ]∗[ sqrt (12(3445 .25)−196 .32 ]

=

r=19467.6−25715 .3[ Sqrt (181079 ) ]∗[ sqrt (2809 .31 ]

= −6247 .7134 .56∗53. 00

=−6247 . 77131. 799

=−0 .876

6) Determine if the correlation is statistically significant: not sig. = .05 = .01(Use: Critical Values for the Pearson’s Correlation Coefficient Table; text page A26)

32

KEYIf yes, go on to regression. If no, no regression and best estimate becomes the mean of the Y variable.

33

KEYSTAT 101

TOPICS: Correlation & RegressionDOCUMENTS:

HANDOUTS: Green #16 AVAILABLE ONLINE: Green #16; Normal Distribution ppt.

ASSIGNMENT: Text Readings: Ch. 9 section 9.3; Ch. 5, sections 5.1, 5.2 Text Problems: p. 505 #7-12, 17, 19 Items below.

FORMULAS: Correlation & Regression on prior sheetCoefficient of Determination, r2, = Correlation Coefficient, r, squared (r2) OR

r2=SSExplainedSSTotal =

r2=Σ( y¿̂− y−)2

Σ( y− y−

)2¿

NEXT CLASS: Regression (cont?); Normal Distribution (?)Stat Essentials (taken from today): Review from prior classes 1) Steps leading to correlation and regression; 2) building and interpreting: scatter plot; hypotheses; correlation, statistical sig. of r; Today: 3) Regression – developing the regression equation; 4) terminology related to correlation & regression.

Problem 17.1:Given the following data set obtain: 1) in words identify what the null and alternative hypotheses state for this set of data; 2) a correlation coefficient; 3) the significance level of the correlation; 4) the coefficient of determination; 5) the regression equation for the line of best fit; 6) Add the regression line to the scatter plot to right (approximate location); and 7) calculate values for Yogi and Booboo. You may do this problem by manually calculating the values on the back of this sheet or using SPSS.

Lengths & Weights of Male BearsLength (in.): 53.0 67.5 72.0 72.0 73.5 68.5 73.0 37.0Weight (lbs.): 80 344 416 348 262 360 332 34

In words state the null and alternative hypotheses for these two variables.

H 0 : ρ=0 ______There is no linear relationship between length and weight.

H 1 : ρ≠0 ______There is a linear relationship between length and weight.

r = .897 Sig. level of r = .01 r2 = 80.46%

r= nΣ xy−(Σx )(Σy )

√nΣx2−(Σx )2√nΣy2−(Σy )2=

(8∗151879)−(516 .5 )(2176 )

√(8∗34525 .8 )−(516 .52)√(8∗728520)−(21762 )=1215032−1123904√9434 .15∗√1093184

=9112897 . 13∗1045 . 55

=91128101554 .68

=. 897

Regression Equation: y-hat = -351.59 + 9.659(x)

Given the regression equation you identified estimate the weight of the following two bears:

Yogi who is 76.0 inches long. Booboo (aka Bobo) who is 43 inches long.

Estimated weight: outside data range Estimated weight: 63.747 lb.

163.20.2018

34

KEYSTAT 101

TOPICS: Corr. & Reg.; da Vinci measures; Standard Normal Table DOCUMENTS:

HANDOUTS: Green #17, Yellow #17 (da Vinci measures); Standard Normal Table; Practice Sheets; CA#6

AVAILABLE ONLINE: Green #17, Yellow #17; CA#6; EX#6; Normal Distribution and Distribution of Sample Means PowerPoints; Anatomy (2): Normal Distribution, Standard Normal

ASSIGNMENT: Text Readings: –Ch. 5, sections 5.1 – 5.3; Text Problems: – (sect., page, problems); p. 506 #18, 30 (note: just determine the regression equation (calories = X),

no plot or correlation); p. 112 #62 CA#5; ExCr #6 Items below: PROBLEMS 19.1 & 19.2 ONLY.

FORMULAS: Probability Rules: 1) ΣP (X )=1 2) 0≤P(X )≤1 z= x−μσ

z= x−xs

NEXT CLASS: Normal Distribution, Dist. Of Sample Means, Probability; DUE SS#4; Ex#6Statistics Essentials: Know: 1) the characteristics of a standard normal distribution; 2) how to identify a probability distribution; 3) find an area associated with a z-score (use the standard normal table); 4) find a z-score associated with an area (use the standard normal table

Problem 17.1: Identify the area associated Problem 17.2: Identify the z-score associatedwith the following z-scores. with the following areas.

1) z = -1.54: area = .0618 2) z = .50: area = .6915 1) area = .0207: z = -2.04 2) area = .5000: z = 0

3) z = 2.33: area = .9901 4) z = -1.645: area = .0500 3) area = .9500: z = 1.645 4) area = .9900: z = 2.33

Draw the area under the standard normal curve and determine the probability noted.Problem 17.3: P(z ¿ 1.62) Problem 17.4: P(-.42 ¿ z or z ¿ .42)

Problem 17.5: P(1.00 ¿ z ¿ 2.50) Problem 17.6: P(-.26 ¿ z ¿ 0.00)

173.22.2018

1.0000-.9474.0526

.9474.6628

.3372

.3372

ANSWER:.3372*2 =.6744

.9938

.9938-.8413.1525

?.3974

.5000.5000-.3974.1026

.8413

35

KEY

36

KEYSTAT 101

TOPICS: Non-Standard Normal Dist.; Dist. of Sample Means; Central Limit TheoremDOCUMENTS:

HANDOUTS:; Green #18; Yellow #18 AVAILABLE ONLINE: Green #18; Yellow #18; Dist. Of Sample Means ppt.; Normal

Dist. Ppt.HWK:

Text Readings: Ch. 5, section 5.4 Text Problems: p. 262 #1-4, 33-35; Prior Topics: p. 505 #19; p. 110 #39, 44 Problems from Yellow #18 – selected ones we do not get to in class Items below

FORMULAS: Dist. Sample Means: μx=μ σ x=

σ√n

z= x−μσ√n ; Pop:

μ=∑i=1

n

xi

N σ=√∑ x2

N−μ2

NEXT CLASS: Dist. of Sample Means; Central Limit Theorem; Confidence Intervals (?); SS#5 dueStatistics Essentials: Know: 1) application of standard normal to non-standard normal situations; and, if we get there: 2) the characteristics of the distribution of sample means; 3) the Central Limit Theorem.

Problem 18.1: Driven to distraction: It seems almost silly to say: Keep your eyes on the road. But with cars now more than ever resembling mobile offices, massive entertainment centers, telephone booths and lunch counters – well, the road is sometimes the last thing we’re looking at. It’s much more interesting to jabber away on the cell phone or toy with your iPod – but those distractions can cut your reaction time in half. And with most accidents occurring within a few seconds, you need all the time you can get. So hang up, find a radio station you like and keep looking forward. (Source: MetLife yourlife, Summer 2007; italics added)

So what do you know? Assume it takes you 2 seconds to react and apply the brakes when driving and paying attention to the road. How long will it take you to react while being distracted by food, phone, etc?

Problem 18.2: Marriage Patterns: Are more people living together before getting married? The results of a recent survey indicated that 48% of respondents indicated that they were in an unwed relationship and that forty percent of these couples marry within three years. If there were 1200 respondents to the survey, how many of the respondents would marry within three years? 1200*.48 = 576; 576*.40 =230.4 couples

Problem 18.3: Education and self-employment: According to a recent Current Population Reports, the population distribution of number of years of education for self-employed individuals in the United States has a mean if 13.6 and a standard deviation of 3.0.

If one self-employed person was randomly selected, what would be the probability that he/she would have an education level less than 11 years?

z= x−μσ=11−13 .6

3=−2. 6

3=−.87

prob. associated with -.87 = .1922

Problem 18.4: Pregnancy Duration: The lengths of pregnancies are normally distributed with a mean of 268 days and a standard deviation of 15 days.

Wanda has been pregnant for a really, really long time. So long in fact that only one-half of a percent of women have been pregnant longer. For how many days has Wanda been pregnant?

.5% = .005 area; every area has a z-score; 1.0000 - .0050 = .9950; look up the z-score associated with an area of .9950 (area to left)x=μ+zσ=268+(2.575)(15 )=306 . 625 days

183.27.2018

37

KEYSTAT 101

TOPICS: Non-Standard Normal Dist.; Dist. of Sample Means; Central Limit TheoremDOCUMENTS:

HANDOUTS: Green #20; Yellow #19 AVAILABLE ONLINE: Green #20; Yelow #19; Dist. Of Sample Means ppt.;

Confidence Intervpls Ppt.HWK:

Text Readings: Ch. 6, section 6.1-6.3 Text Problems: p. 274 #1-8, 13, 17; Prior Topics: p. 505 #21; p. 112 #57 [do box plot for “You” data only] Problems from Yellow #18 & #19 – try some problems we do not get to in class; KEYS online Items below

FORMULAS: Dist. Sample Means: μx=μ σ x=

σ√n

z= x−μσ√n ; Pop:

μ=∑i=1

n

xi

N σ=√∑ x2

N−μ2

NEXT CLASS: Confidence IntervalsStatistics Essentials: Know: 1) application of standard normal to non-standard normal situations; 2) the characteristics of the distribution of sample means; 3) the Central Limit Theorem; 4) application to non-standard normal where n > 1.

Problem 19.1 (NOTE: continuation of 18.3)Education and self-employment: According to a recent Current Population Reports, the population distribution of number of years of education for self-employed individuals in the United States has a mean if 13.6 and a standard deviation of 3.0.

Find the mean and standard error of the sampling distribution of the mean for a random sample of size 100.

μx=μ=13 . 6 years σ x=σ√n=

3√100=

310 =. 3

If a sample of 100 self-employed individuals is selected, find the probability that their mean education level is less than 11.0 years.

z=x−μσ√n

=11−13 .63√100

=−2 . 6. 3

=−8 .66

probability essentially 0

Problem 19.2: Normal Distribution: PINE TREES: A buyer for the You-Build-It Lumber Company must decide whether or not to buy logging rights on a piece of land containing 15,000 mature pine trees. The heights of mature pine trees are normally distributed and the owner reports that the trees have a mean height of 36 feet and a standard deviation of 4 feet. The buyer will purchase the logging rights if less than 1,000 of the trees are estimated to be shorter than 30 feet tall. Based upon this information, what is the buyer’s decision?

A) Draw a picture of this problem. Not shownB) How many trees are estimated to be shorter than 30 feet tall? 1002 treesC) Explain your answer (i.e. show how you came to your recommendation). z = -1.5, therefore area below -1.5 = .0668 times 15,000 = number trees shorter than 30 ft. = 1002D) According to the stipulations cited above, is this a BUY ______ or a DO NOT BUY _______ situation?

193.29.18

38

KEYProblem 19.3Box plot: The table to the right contains the amount of fat per serving in grams of 12 Kelloggs “Children’s” cereals. Construct a box plot of these data.

Descriptive Statistics: Fat ALL MEASURES IN GRAMSVariable N N* Mean SE Mean StDev Minimum Q1 Median Q3 MaximumFat 12 0 0.542 0.144 0.498 0.000 0.00 0.500 1.000 1.500IQR: 1.000 L. Limit: -1.5 U. Limit: 2.5

Problem 19.4Correlation & Regression:

A)

Corr. Sig. at alpha = .o1 (p-vale = .001)

B) C)

D) -10.94+(1.9686)(30 years)=48.118 percent damageThese data were obtained from a survey of the number of years people smoked

and the percentage of lung damage they sustained.Years, x Damage, y x2 y2 xy

22 20 484 400 44014 14 196 196 19631 54 961 2916 167436 63 1296 3969 22689 17 81 289 153

41 71 1681 5041 291119 23 361 529 437

Sum 172 262 5060 13340 8079

a) Calculate the correlation coefficient and determine if the variables have a sig. corr.b) Find the equation of the regression line.c) Construct a scatter plot of the data.d) Predict the percentage of lung damage for a person who has smoked for 30 years.

4540353025201510

70

60

50

40

30

20

10

Years

Dam

age

Scatterplot of years vs. damage

Fat

1.6

1.4

1.2

1.0

0.8

0.6

0.4

0.2

0.0

Boxplot of Fat

39

r= nΣ xy−(Σx)(Σy )

√ [nΣx2−(Σx )2 ] [nΣy 2−(Σy )2 ]=

(7×8079 )−(172×262)

√ [(7×5060 )−1722 ][ (7×13340)−2622 ]=11489√5836×24736

=. 95622

b1=nΣ xy−(Σx )(Σy)

n (Σx2)−(Σx )2=(7×8079 )−(172×262)(7×5060 )−1722 =11489

5836=1 .9686

y¿

=b0+b1 x=−10 .94+1 . 9686 x

b0= y−b1 x=37 .43−(1 .9686 )(24 .57 )=−10 . 94

KEYSTAT 101

TOPICS: Confidence Intervals DOCUMENTS:

HANDOUTS: Green #20; Yellow #20; EXAM #3 AVAILABLE ONLINE: Green #20; Yellow #20; EXAM #3; Anatomy Sheets for

Confidence Intervals; Confidence Intervals ppt RELEVANT ANATOMY SHEETS (4): Z, Confidence Intervals for Small Samples, Large Samples, &

ProportionsASSIGNMENTS:

Text Readings: – Ch. 6, sections 6.1 – 6.3 EXAM #3 Item(s) below.

FORMULAS: Confidence Intervals & sample sizes [NOTE: Probabilities are on Orange #19]

x±zα2( σ√n ) x±zα

2( s√n ) x±t α

2, df ( s√n )

p¿±z α

2¿¿

n=( zα2σ

E )2

n= p̂ q̂( zα

2

E )2

NEXT CLASS: Confidence Intervals; CA#5 DUE (NOTE: not accepted after 11/21/17, no exceptions)Statistics Essentials for Confidence Intervals: Know: 1) what a confidence interval represents and how to calculate one for specific conditions; 2) what a margin of error is and how to obtain it; 3) the table above; 4) obtaining Z critical values.

Problem 20.1 (18.4 continued)Pregnancy Duration: The lengths of pregnancies are normally distributed with a mean of 268 days and a standard deviation of 15 days.

Find the mean and standard error (standard deviation) of the sampling distribution of the mean for a random sample 10 pregnancies.

μx=μ=268 days σ x=σ√n=

15√10=4 .74( prob .essentially 0 )

A sample of 10 pregnant woman is selected. Find the probability that their mean pregnancy is greater than 283 days.

z= x−μσ√n

=283−26815√10

=154 .74

=3 .16

Problem 20.2 Critical Value of za/2: Calculate the za/2 critical value associated with the following Confidence Levels.

92%: za/2 = 1.75 96%: za/2 = 2.05 90%: za/2 = 1.645 98%: za/2 = 2.33

204.3.18

C onfidence Intervals for a Mean C onfidence Intervals for a P roportion

X % C .I. = X % C .I. =

S ample S ize

sknown:

s known:(sunknown)

30

n

x z s

2

30

nx z s

2

nsx t df,2

nsx z 2

)(__

ExEx

npqp z 2

)( EpEp

40

KEYProblem 20.3Balance-1: Data were collected on the length of time subjects could stand on one foot with eyes closed? Using the data provided in the accompanying table for a random sample of 30 time measures (from N = 94), build 95% confidence intervals about the observed means for left foot up and right foot up.

Left Foot up: 95% C.I. = (17.05, 27.52) sec.Right Foot up: 95% C.I. = (19.51, 28.97) sec.Stats to right

Problem 20.4Balance-2: Using the data from the prior problem, obtain a 90% confidence interval for the left foot up data.

Left Foot up: 90% C.I. = 22.2883 +/- 4.21 sec.Right Foot up: 90% C.I. = 24.2397 +/- 2.31 sec.

Problem 20.5Balance-3: Examine the Left Foot 90% and 95% confidence intervals from the prior problems. What happened to confidence and precision when going from a 90% C.I. to a 95% C.I? Explain why this happens?

Confidence goes up; precision goes down. Why? To be more confident that the interval contains the population value, the interval is widened thereby reducing precision (which is reflected by a smaller interval).

Problem 20.6Balance-4: The N = 94 statistics table to the right presents the mean, , and standard deviation, s, for the studied population. Review the means and determine if they fall within the 95% confidence intervals generated in the previous problems for the samples of size n = 30. So, can samples be used to estimate population parameters? What would be some advantages of using samples to estimate population parameters?

Population value is within 95% C.I. of sample.Population value is not within 90% C.I. of sample (may be a rounding issue).

STAT 101 214.5.2018

41

KEYTOPICS: Confidence Intervals for Small Samples & ProportionsDOCUMENTS:

HANDOUTS: Green #21; Yellow #21; CA#7 AVAILABLE ONLINE: Green #21; Yellow #21; CA#7; Hypothesis Testing ppt.

ASSIGNMENT: Text LARSON: Ch. 7, section 7.1 – 7.3 Yellow #22: problems #5, 6, 14, 18 CA#7 Items below and green 20.3-20.6

FORMULAS: On prior sheets (Confidence intervals on sheet #20).NEXT CLASS: Confidence Intervals; Hypothesis Testing (maybe); DUE: CA#7Statistics Essentials for Confidence Intervals (proportions): Know: 1) how to identify a confidence interval for small samples & proportions; 2) Calculating the intervals; 3) characteristics of the t distribution.

Problem 21.1 (Correlation)

d) -29.958+(.560)(74 feet)=11.482 inches -29.958+(.560)(81 feet)=15.402 inches It is not meaningful to predict the value of y for x=95 because x=95 is outside the range of the original data -29.958+(.560)(79 feet)=14.282 inches

Problem 21.2 (Confidence Interval)Health Clubs: A random sample of 60 female members of health clubs in Los Angles showed that they spend on average 4 hours per week doing physical exercise with a standard deviation of .75 hours. Find a 95% confidence interval for the population mean

95%C . I .=x±zα2 ( σ√n )=4 . 0±1 . 96( .75

√60)=4 . 0±.19=(3 . 81 , 4 . 19)

Problem 21.3: (Normal Distribution)Tall Clubs International: Tall Clubs International is a social organization for tall people. It has a requirement that men must be at least 74 inches tall, and women must be at least 70 inches tall. Man’s heights are normally distributed with a mean of 69 inches and a standard deviation of 2.8 inches. Women’s heights are normally distributed with a mean of 63.6 inches and a standard deviation of 2.5 inches.

A) What percent of men meet the requirement?B) What percent of women meet the requirement?C) Are the height requirements for men and women fair? Why or why not?

A) z= x−μ

σ=74−69

2 .8=1. 785

Area to right = .0367 or 3.67% (using z=1.79)

B) z= x−μ

σ=70−63.6

2.5=2 .56

Area to right = .0052 or .52%C) Are the height requirements for men and women fair? Why or why not?C) Not fair as a larger proportion of males are eligible for membership.

Trees: Heights and Diameters The height (in feet) and trunk diameters (in inches) of 11 trees are shown below. Height, x Diameter, y x2 y2 xy

70 8.3 4900 68.89 58172 10.5 5184 110.25 75675 11 5625 121 82576 11.4 5776 129.96 866.471 9.2 5041 84.64 653.273 10.9 5329 118.81 795.785 14.9 7225 222.01 1266.578 14 6084 196 109277 16.3 5929 265.69 1255.180 18 6400 324 144082 15.8 6724 249.64 1295.6

Sum 839 140.3 64217 1890.89 10826.5

a) Calculate the correlation coefficient and determine if the variables have a significant correlation. b) Find the equation of the regression line.c) Construct a scatter plot of the data.d) Using the regression equation, what is the predicted diameter for x=74 feet? 81 feet? 95 feet? 79 feet?

42

x=76 . 273y=12 . 755b1=

nΣ xy−(Σx )(Σy)n (Σx2)−(Σx )2

=(11×108265)−(839×140 .3 )

(11×64217 )−8392 =1379. 82466

=.560

b0= y−b1 x=12. 755−( .560×76 .273 )=−29 . 958

y¿

=b0+b1 x=−29 .958+. 560 x

r= nΣ xy−(Σx)(Σy )

√ [nΣx2−(Σx )2 ] [nΣy2−(Σy )2 ]=

(11×10826 .5 )−(839×140 . 3)

√ [(11×64217 )−8392 ][ (11×1890 . 9 )−140 .32 ]=1379. 8√2466×1115. 81

. 8318

KEYSTAT 101

TOPICS: Confidence Intervals for Proportions; Data collectionDOCUMENTS:

HANDOUTS: Green #22 AVAILABLE ONLINE: Green #22; Hypothesis Testing ppt.

ASSIGNMENT: Text LARSON: Ch. 7, section 7.1 – 7.3 Yellow #21: problems #4, 8, 12, 13 Items below

FORMULAS: p¿±zα

2¿¿

n= p̂ q̂( zα

2

E )2

NEXT CLASS: Hypothesis TestingStatistics Essentials for Confidence Intervals (proportions): Know: 1) how to identify a confidence interval for small samples & proportions; 2) Calculating the intervals; 3) characteristics of the t distribution.

Problem 22.1: (C.I. Proportions)Facebook: In a survey of 1550 U.S. adults, 1054 said that they use the social media website Facebook.

(L&F 7ed p.320, 322)a) Find a point estimate for the population of U.S. adults who use Facebook.

x/n = 1054/1550 = .68 = p¿

b) Construct a 90 % confidence interval about the point estimate of the proportion of U.S. adults who use Facebook.

p¿±zα

2¿¿

= . 68±1 .645(√ ( .68 )( .32 )

1550 )= .68 ± .019 = 90% C.I. => (.661, .699) proportion

c) Construct a 95 % confidence interval about the point estimate of the proportion of U.S. adults who use Facebook.

p¿±zα

2¿¿

= . 68±1 . 96(√ ( . 68)( . 32)

1550 )= .68 ± .023 = 95% C.I. => (.657, .703) proportion

d) Does it seem possible that the population proportion could equal 0.59? Explain why or why not.

NO as .59 does not fall within the intervals.

Problem 22.2: (C.I. Proportions)New Year’s Resolution: From a recent survey of 2241 U.S. adults, 29 % of respondents indicated that they had made a resolution to eat healthier. (L&F 7ed p.325, #12)

a) Determine the number of individuals indicating an intent to eat healthier.n*(%/100) = 2241*.29 = 650 adults

b) Construct a 90% confidence interval for the population proportion.

p¿±zα

2¿¿

= . 29±1 .645 (√ ( .29 )( .71 )

2241 )= .29 ± .016 = 90% C.I. => (.274, .306) proportion

p¿±zα

2¿¿

= . 29±1 . 96(√ ( . 29)( . 71)

2241 )= .29 ± .019 = 95% C.I. => (.271, .309) proportion

22

43

KEYProblem 22.3: (C.I. Sample Size)

Fast Food: You wish to estimate, with 90% confidence, the population proportion of U.S. adults who eat fast food four to six times per week. Your estimate must be accurate within 3% of the population proportion. (L&F 7ed p.326, #19)

a) How large a sample do you need if no preliminary information is available? (where to start - see p. 331)

Where there is no information regarding p-hat, use .5 for both p-hat and q-hat.

n= p¿̂ q

¿̂( zα2

E )2

¿

¿=

n=( . 5)( . 5) ( 1.645.03 )

2

= 752

b) Find the minimum sample size using the results from a prior study that found that 11% of the populations ate fast food four to six times per week.

n= p¿̂ q

¿̂( zα2

E )2

¿

¿=

n=( . 11)( . 89 ) ( 1.645

.03 )2

= 295

c) Compare the results from parts a) and b). Having an estimate of the population proportion (11%) reduces sample size needed.

Problem 22.4: (C.I. Small Sample)Commute Time: From a random sample of eight people, the mean commute time to work was 35.5 minutes with a standard deviation of 5.8 minutes. Build a 95% confidence interval about the sample mean, for the population mean. (L&F 7ed p.315, #17)

x±t α

2, df ( s√n ) =

35 .5±2 .365( 5 . 8√8 ) = 35.5 ± 4.85 = 95% C.I. => (30.65, 40.35) minutes

Problem 22.5: (Normal Distribution)Test Scores: Assume the test scores for a large class are normally distributed with a mean of 74 and a standard deviation of 10. (Le p.144, #3.14)

a) Suppose that you received a score of 88. What portion of the class received scores higher than you?

z= x−μσ =

88−7410 = 1.4; associated area to left = .9192; area to right = .0808

b) Suppose that the instructor wants to limit the number of A grades to 20%. What would be the lowest score for an A?

x=μ+zσ = 74 + (.84)(10) = 82.4

44

KEY

45

KEYSTAT 101

TOPICS: Hypothesis Testing DOCUMENTS:

HANDOUTS: Green #23; Yellow #23; CA#8; ExCr #7 AVAILABLE ONLINE: Green #23; Yellow #23; CA#8; ECxCr#7; Hypothesis Testing ppt;

STAT EXAM #4 REVIEW SHEETS #1-5ASSIGNMENT:

Text LARSON: Ch. 7, section 7.1 – 7.3 Text Problems: p. 367 #1-11, 13, 17, 18, 21-23, 25, 26 CA#8 ExCr#7 Items below

FORMULAS: Hypothesis Testing

z=x−μ0

σ√n

t=x−μ0

s√n

z= p¿̂−p

√ pqn

¿

Steps in testing a hypothesis:1. List the given information2. State in statistical and word format the null hypothesis and the alternative hypothesis

a. Identify the “claim”3. Determine if the test is one-tailed (left or right) or two-tailed4. Identify the critical value for the test (or the stated p-value level at which the test is being conducted)5. Draw a picture containing the critical value(s) and, after testing, the test statistic6. Identify the statistical test7. Conduct the test and obtain the test statistic (place it on your drawing)8. Analyze your results (relationship between critical value and test statistic (or p-values comparison)9. State a conclusion. Include a statement of the null hypothesis, the level at which it was tested, and whether it was retained

or rejected.NEXT CLASS: Hypothesis Testing; DUE: CA#8; ExCr#7 Statistics Essentials Hypothesis Testing: Know: 1) how to write hypotheses in statistical format and in written format; 2) what statistical significance means; 3) what terms associated with hypothesis testing mean – e.g. critical value(s), test statistic, p-value, etc.; 4) how to conduct a hypothesis test and interpret the results.

Problem 23.1: Doctor Salaries: Is there a doctor in the house? The Bureau of Labor Statistics reported that in May 2009, the mean annual earnings of all family practitioners in the United States was $168,550. A random sample of 55 family practitioners in Missouri that month had mean earnings of $154,590 with a standard deviation of $42,750. Do the data provide sufficient evidence at to conclude that the mean salary for family practitioners in Missouri is less than the national average?

Problem 23.2: Good credit: The Fair Isaac Corporation (FICO) credit score is used by banks and other lenders to determine whether someone is a good credit risk. Scores range from 300 to 850, with a score of 720 or more indicating that a person is a very good credit risk. An economist wants to determine whether the mean FICO credit score is lower than the cutoff of 720. She finds that a random sample of 100 people had a mean FICO score of 703 with a standard deviation of 92. Can the economist conclude that the mean FICO score is less than 720? Use α=.05 level of significance.

Problem 23.3: Measuring lung function: One of the measurements used to determine the health of a person’s lungs is the amount of air a person can exhale under force in one second. This is called the forced expiratory volume in one second, and is abbreviated FEV1. Assume the mean FEV1 for 10-year-old boys is 2.1 liters and that the standard deviation is 0.3. A random sample of 100 10-year-old boys who live in a community with high levels of ozone pollution are found to have a sample mean FEV1 of 1.95 liters. Can you conclude that the mean FEV1 in the high-pollution community is less than 2.1 liters? Use the α = 0.05 level of significance.

23

Testing a Hypothesis

Decision: TRUE FALSERETAIN

Correct Type II ErrorDecision b

REJECT Type I Error Correct Decision

Null Hypothesis

46

KEYTesting a hypothesis: (format to be used for upcoming exam)

A) Provide a complete list of the GIVEN information:

B) State the null and alternative hypotheses in both statistical form and written form.

Statistically: In Words:

Ho: _________________ Ho: _________________________________________________________

Ha: _________________ Ha: _________________________________________________________

C) The hypothesis test is (circle one): left-tailed right-tailed two-tailed

D) Identify the Critical Value(s) for the test. C.V. = __________

E) Draw a detailed picture displaying the region(s) of retention and rejection of the null hypothesis. Include the location of the Critical Value(s) and Test Statistic (once calculated).

F) Present the formula to be used, determine the Test Statistic and locate the T.S. on the picture.

Show Formula, Calculations and test statistic

State Formula:

Calculations:

Test Statistic: _________ (place here and on diagram)

G) Using the format presented in class, state a conclusion for this hypothesis test.

________________________________________________________________________________________________

________________________________________________________________________________________________

________________________________________________________________________________________________

item E EE

47

KEY

Problem 23.1: Doctor Salaries H0: µ≥168,550 The mean doctor salary is ≥$168,500 NationallyHa: µ<168,550 The mean doctor salary is < $168,500 in Missouri

Given: x¿

= $154,590 = $168,500 s = $42,750 n = 55 = .05Left-tailed test Critical Value = -1.645

z= x−μs√n

=154 ,590−168 ,55042 , 750√55

=−2 . 42

Reject the null hypothesis. At the α=.05 significance level, we can conclude that the mean salary for family practitioners in Missouri is less than the national average.

Problem 23.2 Good Credit H0:µ≥720 The mean FICO score is ≥ 720.Ha:µ<720 The mean FICO score is < 720.

Given: x¿

= 703 = 720 s = 92 n = 100 = .05Left-tailed test Critical Value = -1.645.

z= x−μs√n

=703−72092√100

=−1 . 848

(test statistic)Reject the null hypothesis. At the α=.05 significance level, we can conclude that the mean FICO score is less than 720.

Problem 23.3 Lung FunctionH0: µ≥2.1; The mean FEV1 in a high-pollution community is ≥ 2.1 liters.Ha: µ<2.1; The mean FEV1 in a high-pollution community is < 2.1 liters.

Given: x¿

= 1.95 liters = 2.1 liters s = 0.3 l. n = 100 = .05Left-tailed test Critical Value = -1.645

z= x−μσ√n

=1.95−2. 10 .3√100

=−5

(test statistic)Reject the null hypothesis. At the α=0.05 significance level, we can conclude that the mean FEV1 in the high-pollution community is less than 2.1 liters.

Retain

Reject

C.V. = -1.645T.S.-2.42

Retain

Reject

C.V. -1.645T.S. -5

Retain

Reject

C.V. -1.645T.S. -1.848

48

KEYSTAT 101

TOPICS: Hypothesis Testing DOCUMENTS:

HANDOUTS: Green #24 AVAILABLE ONLINE: Green #24

HWK: Problems from Yellow #23 and this sheet

FORMULAS: Steps in testing a hypothesis: On Green 23

Hypothesis Testing:

z=x−μ0

σ√n

t=x−μ0

s√n

z= p¿̂−p

√ pqn

¿

NEXT CLASS: Review SessionStatistics Essentials: Know: 1) how to write hypotheses in statistical format and in written format; 2) what statistical significance means; 3) what terms associated with hypothesis testing mean – e.g. critical value(s), test statistic, p value, etc.; 4) how to conduct a hypothesis test and interpret the results.

PROBLEM 24.1: BMI for Miss America: The trend of thinner Miss America winners has generated charges that the contest encourages unhealthy diet habits among young women. The mean BMI for ten recent Miss America winners was 18.76 with a standard deviation of 1.19. At = .01, test the claim that recent winners are from a population with a mean BMI less than 20.16, which was the BMI for winners from the 1920s and 1930s.

BMI of Miss America

Ho : μ≥20 .16The mean BMI of recent Miss Americas is 20.16

Ha : μ<20. 16The mean BMI of recent Miss Americas is less than 20.16. CLAIM

Given: x¿

= 18.76 BMI = 20.16 BMI s = 1.19 n = 10 = .01 Left-tailed

t=x−μ0

s√n = -3.72 (Test Stat.) Critical Value = -2.821

Conclusion: Reject null. There is insufficient evidence at the = .01 level to conclude that the BMI of recent Miss America pagent winners differs from the BMI of women from the 1920’s and 30’s..

PROBLEM 24.2: Curing diabetes: vertical banded gastroplasty is a surgical procedure that reduces the volume of the stomach in order to produce weight loss. In a recent study, 82 patients with type 2 diabetes underwent this procedure, and 59 of them experienced a recovery from diabetes. Does this study provide convincing evidence that more than 60% of those with diabetes who undergo this surgery will recover from diabetes? Use the α=.05 level of significance.

H0:p=.60Ha:p>60

p=5982=. 7195

z= p¿̂− p

√ pqn

=.7195−. 60

√. 60×. 4082

=2 . 21 ¿

(test statistic)Critical value=1.645Reject the null hypothesis. This does provide convincing evidence that more than 60% of those with diabetes who undergo this surgery will recover from diabetes.

244.17.2018

Retain

T.S.-3.72C.V.-2.821

Reject

49

KEY

PROBLEM 24.3:High salaries for executives: A Washington Post-ABC News poll conducted in October 2009 surveyed a random sample of 1004 adults in the United States. Of these people, 713 said they would support federal legislation putting limits on the amounts that top executives are paid at companies that receive emergency government loans. One highly paid executive claims that less than 75% of U.S. adults support limits on the amounts that executives are paid.

a) State the appropriate null and alternate hypotheses.b) Compute the test statistic.c) Using α=.05, can you conclude that the executive’s claim is true?d) Using α=.01, can you conclude that the executive’s claim is true?

a) H0:p=.75Ha:p<.75

b)

p=7131004

=. 7102

z= p¿̂−p

√ pqn

=.7102−.75

√ .75×.251004

=−2 .91 ¿

(test statistic)c) Critical values= -1.645Reject null hypothesis. There is sufficient evidence to support the claim at the α=.05 level.d) Critical values= -2.33Reject the null hypothesis. There is sufficient evidence to support the claim at the α=.01 level

PROBLEM 24.4: Environment: In 2008, the General Social Survey asked 1493 U.S. adults to rate their level of interest in environmental issues. Of these, 751 said that they were “very interested.” Does the survey provide convincing evidence that more than half of U.S. adults are very interested in environmental issues? Use the α=.05 level of significance.

H0:p=.5Ha:p>.5

p=7511493

=. 503

z= p¿̂− p

√ pqn

=.503−. 50

√. 50×. 501493

=. 232 ¿

(test statistic)Critical value=1.645Fail to reject the null hypothesis. The survey does not provide convincing evidence that more than half of U.S. adults are very interested in environmental issues at the α=.05 significance level.

PROBLEM 24.5: Heavy children: Are children heavier now than they were in the past? The National Health and Nutrition Examination Survey (NHANES) taken between 1999 and 2002 reported that the mean weight of six-year-old girls in the United States was 49.3 pounds. Another NHANES, published in 2008, reported that a sample of 193 six-year-old girls weighed between 2003 and 2006 had an average weight of 51.5 pounds. The standard deviation of this sample is 15 pounds. Can you conclude that the mean weight of six-year-old girls is higher in 2006 than in 2002? Use the α=0.01 level of significance.

H0: µ=49.3H1: µ>49.3

z= x−μs√n

=51 .5−49 .315√193

=2 .038

(test statistic)Critical value=2.33

50

KEYFail to reject the null hypothesis. At the α=0.01 significance level, we can conclude that the mean weight of six-year-old girls in 2006 is approximately equal to the mean weight of six-year-old girls in 2002.

51

KEYSTAT 101

TOPICS: Application of SPSS to large data sets DOCUMENTS:

HANDOUTS: Green #25; CA#9 AVAILABLE ONLINE: Green #25; CA#9; ExCr#8

HWK: CA#9 ExCr#8 (optional)

THE FINAL FOUR DAYS: 4.24 – data collection 4.26 – Exam Review 5.1 – Exam #4 5.3 & 5.8 – Application Lab

o NOTE: Due to space limitations, May 8th class members cannot be accommodated at one of the May 3rd time slots.

NEXT CLASS: Data Collection for Exam Week Application Lab; DUE CA#9; EcCr#8Statistics Essentials: Using SPSS to analyze large data sets in preparation for the final’s week lab.

EXAM #4 Review Topics: Descriptive Statistics: Creating tables, charts; Obtaining Statistics – Measures of Center & Variation. Measures of Position – five-number summary, modified box plots with associated calculations; skewness; z-score; etc. Correlation & Regression – scatter plots, r, r2, least squares line, calculating regression estimates Normal distribution – standard normal table, finding areas or z values; non-standard normal data, finding probabilities and

variable values

Distribution of sample means - μx ,σ x , central limit theorem, problems where n>1

Confidence Intervals - s vs. s, < 30 vs.≥ 30, intervals for proportions, finding x and E given interval, Standard Normal table, t-table, determining sample size for means problems

Hypothesis Testing – one sample for means and proportions

REFERENCES/STUDY MATERIALSo Green Sheet - Keys onlineo Yellow Sheet - Keys online (most)o Course Review Materials – Additional problems & KEYS for areas since Exam #2 (online Supporting Materials link)o Text problemso Topic-based PowerPointso Essential Statistics: the first PowerPoint – listing items to knowo Statistics Essentials: Review this section of each Green sheet. Can you do these things?

254.19.2018

52

KEYSTAT 101 NAME: _________________________________

TOPICS: Data Collection DOCUMENTS:

HANDOUTS: Green #26 AVAILABLE ONLINE: Green #26

HWK: Begin Review: See Green #24 for topic areas and resources

NEXT CLASS: Review Session

1) Place your data on this sheet2) Enter your data into the computer file.3) Answer the questions below.4) Leave this form in the three-ringed binder.

BENTON'S INDIVIDUAL COOKIE WEIGHTS (grams) OREO INDIVIDUAL COOKIE FILLING WEIGHT (gms)

COOKIE Benton's Plain Wt.

FILLING Benton's Plain Wt.

COOKIE Benton's

Double Wt.

FILLING Benton's

Double Wt.COOKIE Oreo

Plain Wt.FILLING Oreo

Plain Wt.COOKIE Oreo

Double Wt.FILLING Oreo

Double Wt.

Cookie Stats:

Questions: For each of the following items think of the various descriptive and inferential statistics we have explored, generate a question that could be explored during the final class meeting, and what we might learn via that investigation.

Example: Descriptive Statistics: What: Generate a histogram of the weights of double Stuf Oreos. Why – an opportunity to look at normality.

1) Descriptive tables, graphs & statistics:

What: ________________________________________________________________________________________

Why: _________________________________________________________________________________________

2) Normal Distribution/Dist. Of Sample Means:

What: ________________________________________________________________________________________

Why: _________________________________________________________________________________________

3) Confidence Intervals:

What: ________________________________________________________________________________________

Why: _________________________________________________________________________________________

4) Hypothesis Testing:

What: ________________________________________________________________________________________

Why: _________________________________________________________________________________________

264.24.18

53

KEYEXAMINATION PROBLEM

NOTICE TO CADE AND INDIVIDUAL TUTORS: THIS SHEET IS PART OF AN EXAM. IT SHOULD BE COMPLETED BY CLASS STUDENTS INDEPENDENT OF TUTOR ASSISTANCE. FOR REVIEW PURPOSES SIMILAR PROBLEMS ARE AVAILABLE IN THE TEXT AND ONLINE.STAT 101 TOPICS: Exam #4 reviewDOCUMENTS: HANDOUTS: Green #27 (Exam Problems); Yellow #27 (Exam Review) AVAILABLE ONLINE: Green #27 (Exam Problems); Yellow #27 (Exam

Review)HWK: Exam #4 Take-home Measures of Position & VariationNEXT CLASS: Exam #4EXAM #4 Review Topics: Listed on Orange #25

EXAM #4: Take-Home Problems. [Value: 15 points] DIRECTIONS: Provide your answer to the assigned problem ON THE REVERSE OF THIS

SHEET. Only this Green sheet or a photo copy thereof will be accepted. There is no credit for doing the wrong problem. Grading: Problem values in [ ]; general deductions at .5 per occurrence. N.B: THIS PORTION OF THE EXAM IS DUE AT THE BEGINNING OF THE

CLASS PERIOD ON TUESDAY MAY 1, 2018. IT WILL NOT BE ACCEPTED AFTER THAT POINT IN TIME. Email it if necessary.

Labels: Missing = -1 point.

DATA SET #1: Menarchal ages of female swimmers: The following are menarchal ages (in years) of 20 female swimmers who began training after they had reached menarche. (source: Le p. 93)

Age in years: 14.0 16.1 13.4 14.6 13.7 13.2 13.7 14.3 12.9 14.115.1 14.8 12.8 14.2 14.1 13.6 14.2 15.8 12.7 15.6

DATA SET #2: Exercise and menstrual cycles: A study on the effects of exercise on the menstrual cycle provided the following ages (years) of menarche for 20 female swimmers who began training prior to menarche. (source: Le p. 93)

Age in years: 15.0 17.1 14.6 15.2 14.9 14.4 14.7 15.3 13.6 15.1 16.2 15.9 13.8 15.0 15.4 14.9 14.2 16.5 13.2 16.8

DATA SET #3: Daily fat intake: The following are the daily fat intake (grams) of a group of 20 adult males. (source: Le p. 94)

Fat Intake: 22 62 91 102 117 129 137 141 73 117 135 143 37 82 100 114 124 135 142 122

274.26.18

DATA SET #1: Menarchal Ages of Female Swimmers8:30 AM 10:00 AM 2:30 PMAmbrosio Anson AlstonAndrade Assini ArgondizzaBull Boyd BlydenburghDigiovanni Brock BrettsDuran Dobert BrunFitzpatrick, K Drabczyk ChambersGallo Evaniak CodyGaul Garcia ClarkKarpinski Margolin SakielRuggieri

DATA SET #2: Exercise & Menstrual Cycles8:30 AM 10:00 AM 2:30 PMBiittig Geyer Fitzpatrick,DKelly Heisner GocoolKoch Held KarafianLeone Jacobs KleinMartinez, K Kilichowski LewisMathew Ludovico LucicMcbride Krupunich MacMillanMcGee McGlynn MiroMontgomery Miller MoulderPetix Wescott

DATA SET #3: Daily Fat Intake8:30 AM 10:00 AM 2:30 PMDougan Buritica CatesMartinez, A Pepe DevineRusso Smith, A EmersonSausville Smih, C OramScaglione Stieglitz RosensteinSchweikhard Tilke SachSeuling Trumble SchmidtStruble Vanglad SchraderSzemansco Vermette SherwinTaranovich SrourThornton Tyburski

54

KEYEXAMINATION PROBLEM: SEE NOTICE ON REVERSE OF THIS PAGE.

NAME: _______________________ Data Set #: _____ CLASS (circle): 8:30 10:00 2:30EXAM #4: TAKE-HOME [15]

A) [4] Determine the standard deviation of the data set. STANDARD DEVIATION = __________ (Use the front of this page for calculations of x and x2. Calculate here to TWO decimal places)

B) [2] Determine the skew of the data set. SKEW = __________

C) [2] Determine the percent variability of the data set. VARIABILITY = __________

D) [7] Build a modified box plot of the data set.

Statistics [3]: Min: ______ Q1: ______ Q2: ______ Q3: ______ MAX: ______

IQR: ______ L. Limit: ______ U. Limit: ______ Adj. Pt. (if any): ______

Box Plot Here [4] (use line as the x-axis)

Work space:

I certify that I completed this examination without the assistance of other individuals. Signature: ___________________________

______________________________________________________________________

55

KEY

STAT 101

TOPICS: Exam #2 DOCUMENTS:

HANDOUTS: Orange #28 AVAILABLE ONLINE: Orange #28

HWK: None

NEXT CLASS: Lab (see times & locations below)

FINALS WEEK CLASS SCHEDULE8:30 AM CLASS:THURSDAY MAY 3, 2018LOCATION: HUEC 136 (computer lab)CLASS EXAM TIME: 8:00 – 10:30

10:00 AM CLASS: TUESDAY MAY 8, 2018LOCATION: PSCI 128 (computer lab)CLASS EXAM TIME: 8:00 – 10:30

2:30 PM CLASS: THURSDAY MAY 3, 2018LOCATION: HUEC 136 (computer lab)CLASS EXAM TIME: 2:00 – 4:30

Schedule of Events, etc: Course Evaluation followed by Data analysis completed individually or in groups of 2-3 max. Resources: Bring any books, handouts, notes, consultants, etc. that you feel may be of help in the data analysis. IF YOU ARRIVE LATE, YOU MUST COMPLETE THE LAB PROJECT ON

YOUR OWN.

Ah, Statistics…The government are very keen on amassing statistics. They collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of these figures comes in the first instance from the village watchman, who just puts down what he damn pleases. [--Comment of an English judge on the subject of Indian statistics; Quoted by Sir Josiah Stamp in Some Economic Matters in Modern Life_]

There are three types of people in this world: Those who can count, and those who can't. [Seen on a bumper sticker]

The statistics on sanity are that one out of every four Americans is suffering from some form of mental illness. Think of your three best friends. If they're okay, then it's you. [Rita Mae Brown]

I always find that statistics are hard to swallow and impossible to digest. The only one I can remember is that if all the people who go to sleep in church were laid end to end they would be a lot more comfortable. [Mrs. Robert A. Taft]

Before the curse of statistics fell upon mankind we lived a happy, innocent life, full of merriment and go, and informed by fairly good judgment. [Hilarie Belloc The Silence of the Sea]

285.1.18

NOTE: College policy holds that the final exam (for us #4) must be retained by the faculty member. To assure that this policy is followed, you may not receive credit (up to 30 points) for participation in the final class lab session without exam four’s return.

56

KEY

57