
MTH332 - Homework 5 – Multiple Linear Regression – Maternal Smoking and Infant Health (Continued)

Marie Carver
April 20, 2020

 

Abstract: This report analyzes data on 1236 male births to women with different characteristics. The goal was to see how smoking, parity, and the mother’s age, weight, and height affected the birthweight of the babies. Summary statistics such as means, correlations, R^2 values, and regression lines were calculated using built-in functions in R, and the results are presented in histograms, tables, and regression models. The results support all three claims: mothers who smoke have increased rates of premature delivery, the newborns of smokers are smaller at every gestational age, and smoking seems to be a more significant determinant of birth weight than the mother’s pre-pregnancy height and weight, parity, and age.


Introduction:

The data in this report is taken from the Child Health and Development Studies (CHDS), which focused on the pregnancies that occurred between 1960 and 1967 among women in the Kaiser Foundation Health Plan in Oakland, California. This report focuses on the data collected from only one year of this study. It includes 1236 male single births where the baby lived at least 28 days. The purpose of this report is to see how the smoking status of mothers affects the gestation period and birthweight of newborns. It also compares smoking status with other variables to see which is a more significant determinant of birthweight.

Methods: 

The data was imported into RStudio from a provided data site. It contains variables such as the baby’s birthweight and gestation period, along with the mother’s parity, age, weight, height, and smoking status. All of the data was manipulated in R. The histogram was made in R by plotting one histogram on top of another so that the two groups could be compared more easily. Tables were made in Word to compare other sets of data; most of the values in them were obtained with the “summary()” command in R. Regression models were created with the “lm()” function, and one or more parallel regression lines were drawn through the scatterplots with the “abline()” function.
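The report's analysis was done in R. As a rough illustration of what a one-predictor least-squares fit such as “lm()” computes, here is a minimal Python sketch. The toy records and the helper names (`drop_missing`, `fit_line`) are hypothetical and are not taken from the CHDS data; the only detail borrowed from the report is that a gestation value of 999 marks a missing answer.

```python
# Minimal sketch (not the report's R code): drop records whose gestation
# is coded 999 (missing), then fit
#   birthweight = intercept + slope * gestation
# by ordinary least squares.

MISSING_GESTATION = 999  # sentinel for an unknown gestation period

# Hypothetical toy records: (gestation in days, birthweight in oz)
toy_records = [(270, 110), (280, 120), (290, 130), (999, 125)]


def drop_missing(records):
    """Keep only records with a recorded gestation period."""
    return [(g, w) for g, w in records if g != MISSING_GESTATION]


def fit_line(records):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(records)
    mean_x = sum(g for g, _ in records) / n
    mean_y = sum(w for _, w in records) / n
    sxy = sum((g - mean_x) * (w - mean_y) for g, w in records)
    sxx = sum((g - mean_x) ** 2 for g, _ in records)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


clean = drop_missing(toy_records)
intercept, slope = fit_line(clean)
print(len(clean), intercept, slope)
```

Filtering before fitting matters here because a single 999 left in the gestation column would act as an extreme outlier and pull the fitted line toward it.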

Results:

Claim 1: Mothers who smoke have increased rates of premature delivery.

| | Do Not Smoke | Do Smoke |
|---|---|---|
| Sample Size (not including those who answered “999” for gestation rate) | 733 | 480 |
| Gestation Period Min (in days) | 148 | 223 |
| Gestation Period Max (in days) | 353 | 330 |
| Gestation Period Mean (in days) | 280 | 278 |
| Standard Deviation | 16.63 | 15.07 |
| Number of Babies Born 259 Days or Earlier (Premature) | 59 | 41 |
| Percent of Babies Born 259 Days or Earlier (Premature) | 8.049% | 8.54% |

Table 5.1: Important Information
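The percentages in Table 5.1 follow directly from the counts in the same table. The short Python sketch below recomputes them as a check; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Recompute the premature-delivery percentages from the counts in Table 5.1.
counts = {
    "do_not_smoke": {"n": 733, "premature": 59},
    "do_smoke": {"n": 480, "premature": 41},
}


def premature_percent(group):
    """Percent of births at 259 days or earlier within one group."""
    return 100 * group["premature"] / group["n"]


rates = {name: premature_percent(group) for name, group in counts.items()}
difference = rates["do_smoke"] - rates["do_not_smoke"]

for name, rate in rates.items():
    print(f"{name}: {rate:.3f}%")
print(f"difference: {difference:.2f} percentage points")
```

The two groups differ by about half a percentage point, which is worth keeping in mind when judging how strongly the table supports Claim 1.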

 Figure 5.1: Histogram of Gestation Period vs. Frequency

Claim 2: The newborns of smokers are smaller at every gestational age.  

| | Premature Gestation Age (0-259 days) | Normal Gestation Age (260-280 days) | Long Gestation Age (281-353 days) |
|---|---|---|---|
| Sample Size | 59 | 289 | 385 |
| Birthweight Min (in oz) | 55 | 71 | 90 |
| Birthweight Max (in oz) | 145 | 163 | 176 |
| Birthweight Mean (in oz) | 106.67 | 119.21 | 128.64 |
| Standard Deviation | 21.91 | 15.35 | 15.5 |

Table 5.2: Birthweights at Different Gestation Ages of Those who Do Not Smoke

| | Premature Gestation Age (0-259 days) | Normal Gestation Age (260-280 days) | Long Gestation Age (281-353 days) |
|---|---|---|---|
| Sample Size | 41 | 239 | 200 |
| Birthweight Min (in oz) | 58 | 72 | 71 |
| Birthweight Max (in oz) | 127 | 154 | 163 |
| Birthweight Mean (in oz) | 91.73 | 111.27 | 122.04 |
| Standard Deviation | 17.05 | 15.34 | 16.67 |

Table 5.3: Birthweights at Different Gestation Ages of Those who Do Smoke
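Claim 2 rests on comparing the mean birthweights in Table 5.2 and Table 5.3 group by group. The Python sketch below subtracts the two rows of means; the variable names are illustrative, and the numbers are copied from the tables.

```python
# Gap in mean birthweight (non-smokers minus smokers) per gestation group.
gestation_groups = ["premature", "normal", "long"]
mean_non_smokers = [106.67, 119.21, 128.64]  # Table 5.2, in oz
mean_smokers = [91.73, 111.27, 122.04]       # Table 5.3, in oz

gaps = {
    group: round(non_smoker - smoker, 2)
    for group, non_smoker, smoker in zip(
        gestation_groups, mean_non_smokers, mean_smokers
    )
}

# True when the smokers' mean is lower in every gestation group.
smaller_in_every_group = all(gap > 0 for gap in gaps.values())

print(gaps)
print(smaller_in_every_group)
```

The gap is largest in the premature group and shrinks for longer gestation periods, but it stays positive in all three groups.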

Data With Outliers: 

  Figure 5.2: How gestation affects birthweights of non-smokers. Including outliers


 Figure 5.3: How gestation affects birthweights of smokers. Including outliers

Figure 5.4: How gestation affects birthweights of non-smokers and smokers. Including outliers

Data Without Outliers:  


 Figure 5.6: How gestation affects birthweights of non-smokers. Excluding outliers

 Figure 5.7: How gestation affects birthweights of smokers. Excluding outliers


Figure 5.8: How gestation affects birthweights of non-smokers and smokers. Excluding outliers 

| | Do Not Smoke | Do Smoke |
|---|---|---|
| Sample Size | 733 | 480 |
| Correlation Coefficient (R) | 0.35 | 0.49 |
| Coefficient of Determination (R^2) | 0.1225 | 0.2401 |
| Mean Squared Error | 264.33 | 248.88 |

Table 5.4: Information for Smokers and Non-Smokers (Including Outliers)
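For a regression with a single predictor, the coefficient of determination is the square of the correlation coefficient, so the R^2 row of Table 5.4 can be checked from the R row. The Python sketch below does that and also takes the square root of each mean squared error, which puts the typical prediction error back in ounces. The `derived` dictionary is an illustrative name, not something from the report.

```python
import math

# Values copied from Table 5.4.
table_5_4 = {
    "do_not_smoke": {"r": 0.35, "mse": 264.33},
    "do_smoke": {"r": 0.49, "mse": 248.88},
}

derived = {}
for name, row in table_5_4.items():
    derived[name] = {
        "r_squared": row["r"] ** 2,     # share of variation explained
        "rmse": math.sqrt(row["mse"]),  # root mean squared error, in oz
    }
    print(
        f"{name}: R^2 = {derived[name]['r_squared']:.4f}, "
        f"RMSE = {derived[name]['rmse']:.2f} oz"
    )
```

Read this way, gestation period explains roughly 12% of the variation in birthweight among non-smokers and roughly 24% among smokers.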

Claim 3: Smoking seems to be a more significant determinant of birth weight than the mother’s prepregnancy height and weight, parity, payment status, or history of previous pregnancy outcomes, or the infant’s sex.  

| | Short (0-64 inches) | Tall (65-72 inches) | Light (0-170 lbs) | Heavy (171-250 lbs) | First Born | Not First Born | Young Mother (0-25 years) | Older Mother (26-45 years) |
|---|---|---|---|---|---|---|---|---|
| Sample Size | 680 | 534 | 1127 | 51 | 909 | 311 | 531 | 687 |
| Birthweight Min (in oz) | 58 | 55 | 55 | 72 | 55 | 63 | 62 | 55 |
| Birthweight Max (in oz) | 174 | 176 | 174 | 176 | 174 | 176 | 176 | 174 |
| Birthweight Mean (in oz) | 116.86 | 122.84 | 119.67 | 124.78 | 120.35 | 118.39 | 118.69 | 120.74 |
| Standard Deviation | 17.51 | 18.73 | 18.08 | 21.48 | 18.54 | 17.12 | 17.25 | 18.89 |

Table 5.5: How different determinants affect birthweights

 Different Variables’ Effect on Birthweight (NOT Adjusting for Each Other) 

| | Smoke | Parity | Height | Weight | Age |
|---|---|---|---|---|---|
| Estimate | -8.938 | -1.929 | 1.433 | 0.135 | 0.106 |
| Multiple R-Squared | 0.058 | 0.002 | 0.039 | 0.024 | 0.001 |
| P-Value | < 2.2e-16 | 0.105 | 2.966e-12 | 8.207e-08 | 0.238 |
| Residual Standard Error (on degrees of freedom) | 17.68 on 1224 | 18.22 on 1234 | 17.94 on 1212 | 18.15 on 1198 | 18.25 on 1232 |

Table 5.6: Different Variables’ Effect on Birthweight (NOT Adjusting for Each Other)
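One way to read Table 5.6 is to rank the determinants by the share of variation each explains on its own and to check which p-values fall below a significance level. The Python sketch below does both. The 0.05 level is an assumption, since the report does not state one, and the p-value for smoking is entered as its reported upper bound.

```python
ALPHA = 0.05  # assumed significance level; the report does not state one

# Values copied from Table 5.6 (one separate regression per determinant).
table_5_6 = {
    "smoke": {"r_squared": 0.058, "p_value": 2.2e-16},  # reported as "< 2.2e-16"
    "parity": {"r_squared": 0.002, "p_value": 0.105},
    "height": {"r_squared": 0.039, "p_value": 2.966e-12},
    "weight": {"r_squared": 0.024, "p_value": 8.207e-08},
    "age": {"r_squared": 0.001, "p_value": 0.238},
}

# Determinants ordered from most to least variation explained.
ranked = sorted(
    table_5_6, key=lambda name: table_5_6[name]["r_squared"], reverse=True
)

# Determinants whose p-value is below the assumed significance level.
significant = [name for name in ranked if table_5_6[name]["p_value"] < ALPHA]

for name in ranked:
    print(f"{name}: explains {100 * table_5_6[name]['r_squared']:.1f}% of variation")
print("significant at 0.05:", significant)
```

Under that assumed level, smoking, height, and weight are significant on their own, while parity and age are not.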

Different Variables’ Effect on Birthweight (Adjusting for Each Other)

| | Smoke | Parity | Height | Weight | Age |
|---|---|---|---|---|---|
| Estimate | -9.18492 | -2.05378 | 1.31760 | 0.05084 | -0.03746 |
| Multiple R-Squared | 0.1069 | 0.1069 | 0.1069 | 0.1069 | 0.1069 |
| P-Value | < 2.2e-16 | < 2.2e-16 | < 2.2e-16 | < 2.2e-16 | < 2.2e-16 |
| Residual Standard Error (on degrees of freedom) | 17.35 on 1180 | 17.35 on 1180 | 17.35 on 1180 | 17.35 on 1180 | 17.35 on 1180 |

Table 5.7: Different Variables’ Effect on Birthweight (Adjusting for Each Other)
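The estimates in Table 5.7 can be combined to compare two mothers who differ in some determinants while the rest are held equal. Because the table does not report an intercept, the Python sketch below computes only differences in predicted birthweight, not the predictions themselves. The function name is illustrative, and the units (smoking and parity as 0/1 indicators, height in inches, weight in lbs, age in years) are assumed from the categories used in Table 5.5.

```python
# Estimates copied from Table 5.7 (all determinants in one model).
coefficients = {
    "smoke": -9.18492,   # 1 = smoker, 0 = non-smoker (assumed coding)
    "parity": -2.05378,  # 1 = not first born, 0 = first born (assumed coding)
    "height": 1.31760,   # per inch
    "weight": 0.05084,   # per lb
    "age": -0.03746,     # per year
}


def predicted_difference(changes):
    """Predicted change in birthweight (oz) for the given changes in the
    determinants, holding every determinant not listed constant."""
    return sum(coefficients[name] * delta for name, delta in changes.items())


smoking_effect = predicted_difference({"smoke": 1})
two_inches_taller = predicted_difference({"height": 2})

# Extra inches of height needed to cancel the smoking estimate.
inches_to_offset_smoking = -coefficients["smoke"] / coefficients["height"]

print(f"smoker vs. non-smoker: {smoking_effect:.2f} oz")
print(f"2 inches taller: {two_inches_taller:.2f} oz")
print(f"inches of height to offset smoking: {inches_to_offset_smoking:.2f}")
```

Expressing the smoking estimate in units of another determinant gives a sense of scale: under this model it takes about seven extra inches of maternal height to offset it.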

Conclusion: 

The data supports the first claim that “mothers who smoke have increased rates of premature delivery,” although the difference is small. As seen in Table 5.1, 8.049% of the babies of non-smokers were born prematurely, compared with 8.54% of the babies of smokers. Although non-smokers have a smaller minimum gestation period than smokers (148 days vs. 223 days), this is most likely an outlier, which can also be seen on the left side of the histogram (Figure 5.1). Looking again at Figure 5.1, as the gestation period decreases and gets closer to prematurity, the frequency of births is larger among smokers than among non-smokers. Similarly, as the gestation period approaches the average and above, the frequency of births is larger among non-smokers than among smokers. Therefore, in this sample, mothers who smoke have a slightly higher rate of premature delivery.

The data also supports the second claim that “the newborns of smokers are smaller at every gestational age than the newborns of non-smokers.” This can be seen in Table 5.2 and Table 5.3. In each gestational age group, the newborns of smokers have a smaller mean birthweight than the newborns of non-smokers. The maximum birthweight in each gestational age group is also larger for non-smokers than for smokers. The minimum birthweights do not follow this pattern in the premature and normal groups, most likely because of the outliers seen in the regression models. The regression models show how the gestation period affects birthweight when the mother was a non-smoker (Figure 5.2 and Figure 5.6) or a smoker (Figure 5.3 and Figure 5.7). Figure 5.4 and Figure 5.8 compare the parallel regression lines for the two groups. The blue line represents the smokers while the red line represents the non-smokers. The non-smokers’ regression line lies above the smokers’ regression line, which indicates that the newborns of smokers are smaller at every gestational age than the newborns of non-smokers. The correlation between gestation period and birthweight is modest in both groups, at 0.35 for non-smokers and 0.49 for smokers (Table 5.4), so gestation period alone explains only part of the variation in birthweight. This is consistent with smoking having a noticeable impact on birthweight. Next, we will see if other determinants affect birthweights more than smoking.

The data also supports the third claim that “smoking seems to be a more significant determinant of birth weight than the mother’s pre-pregnancy height and weight, parity, or the mother’s age.” As seen in Table 5.5, mean birthweight differs between the categories of every listed determinant. Exactly how much each one affects the birthweight is seen in Table 5.6 and Table 5.7. Table 5.6 compares the estimate, multiple R-squared, p-value, and residual standard error of each determinant’s individual effect on the birthweight.

The estimates show the effect of each determinant on the birthweight. For example, smoking is associated with an 8.938 oz decrease in birthweight, and when the parity is 1 (meaning this isn’t the mother’s first child) the birthweight decreases by 1.929 oz. A 1-inch increase in height is associated with a 1.433 oz increase in birthweight. Similar conclusions can be drawn for weight and age. The multiple R-squared values show that approximately 5.8% of the variation in birthweight can be explained by smoking, while approximately 0.2%, 3.9%, 2.4%, and 0.1% of the variation in birthweight can be explained by parity, height, weight, and age respectively. Since the p-value is so low for smoking, height, and weight, we can reject the null hypothesis of no association for each of them and conclude that they all influence birthweight. The p-values for parity (0.105) and age (0.238) are too large to draw the same conclusion. This agrees with the multiple R-squared values. Table 5.7 uses the same determinants, but each estimate is adjusted for the other determinants instead of being computed separately. The multiple R-squared, p-value, and residual standard error in Table 5.7 are identical in every column, which suggests that they describe the combined model as a whole, so only the estimates differ between determinants. Both tables indicate that smoking is a more significant determinant of birthweight than the others: in Table 5.6 it explains the largest share of the variation and has the smallest p-value, and in Table 5.7 its estimate remains about -9 oz after adjusting for the other determinants.