library(smss)
Warning: package 'smss' was built under R version 4.2.2
Question 1
The prediction equation for selling price of homes in Jacksonville, FL (\(y\)) is given as:
\(\hat{y} = -10,536 + 53.8x_1 + 2.84x_2\)
where \(x_1\) is the size of the home and \(x_2\) is the size of the lot (both in square feet).
Part A
According to the model, the predicted selling price was roughly 118,000 USD. The residual (observed minus predicted price) of roughly 28,000 USD is positive, which means that the model underpredicted the selling price by that amount.
Part B
For a fixed lot size, the predicted house price increases by 53.8 USD for each one-square-foot increase in the size of the house itself. This is because the coefficient for home size is 53.8.
Part C
Lot size would need to increase by about 18.94 square feet (53.8 / 2.84) to have the same impact on predicted price as a one-square-foot increase in home size.
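The reasoning in Parts B and C can be checked numerically with a minimal sketch of the prediction equation. The function name `predict_price` and the example home and lot sizes are hypothetical, chosen only for illustration; only the three coefficients come from the equation above.

```r
# Prediction equation from Question 1 (coefficients taken from the text).
predict_price <- function(home_sqft, lot_sqft) {
  -10536 + 53.8 * home_sqft + 2.84 * lot_sqft
}

# Hypothetical inputs, used only to illustrate the marginal effects.
home <- 1500
lot <- 10000

# Part B: effect of one extra square foot of home, lot size held fixed.
gain_home <- predict_price(home + 1, lot) - predict_price(home, lot)

# Part C: lot-size increase with the same effect on predicted price.
lot_equiv <- 53.8 / 2.84
gain_lot <- predict_price(home, lot + lot_equiv) - predict_price(home, lot)

c(gain_home = gain_home, lot_equiv = lot_equiv, gain_lot = gain_lot)
```

Because the model is linear with no interaction, the two gains agree regardless of which starting home and lot sizes are used.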
Question 2
data(salary)
salary
degree rank sex year ysdeg salary
1 Masters Prof Male 25 35 36350
2 Masters Prof Male 13 22 35350
3 Masters Prof Male 10 23 28200
4 Masters Prof Female 7 27 26775
5 PhD Prof Male 19 30 33696
6 Masters Prof Male 16 21 28516
7 PhD Prof Female 0 32 24900
8 Masters Prof Male 16 18 31909
9 PhD Prof Male 13 30 31850
10 PhD Prof Male 13 31 32850
11 Masters Prof Male 12 22 27025
12 Masters Assoc Male 15 19 24750
13 Masters Prof Male 9 17 28200
14 PhD Assoc Male 9 27 23712
15 Masters Prof Male 9 24 25748
16 Masters Prof Male 7 15 29342
17 Masters Prof Male 13 20 31114
18 PhD Assoc Male 11 14 24742
19 PhD Assoc Male 10 15 22906
20 PhD Prof Male 6 21 24450
21 PhD Asst Male 16 23 19175
22 PhD Assoc Male 8 31 20525
23 Masters Prof Male 7 13 27959
24 Masters Prof Female 8 24 38045
25 Masters Assoc Male 9 12 24832
26 Masters Prof Male 5 18 25400
27 Masters Assoc Male 11 14 24800
28 Masters Prof Female 5 16 25500
29 PhD Assoc Male 3 7 26182
30 PhD Assoc Male 3 17 23725
31 PhD Asst Female 10 15 21600
32 PhD Assoc Male 11 31 23300
33 PhD Asst Male 9 14 23713
34 PhD Assoc Female 4 33 20690
35 PhD Assoc Female 6 29 22450
36 Masters Assoc Male 1 9 20850
37 Masters Asst Female 8 14 18304
38 Masters Asst Male 4 4 17095
39 Masters Asst Male 4 5 16700
40 Masters Asst Male 4 4 17600
41 Masters Asst Male 3 4 18075
42 PhD Asst Male 3 11 18000
43 Masters Assoc Male 0 7 20999
44 Masters Asst Female 3 3 17250
45 Masters Asst Male 2 3 16500
46 Masters Asst Male 2 1 16094
47 Masters Asst Female 2 6 16150
48 Masters Asst Female 2 2 15350
49 Masters Asst Male 1 1 16244
50 Masters Asst Female 1 1 16686
51 Masters Asst Female 1 1 15000
52 Masters Asst Female 0 2 20300
Welch Two Sample t-test
data: salary_men$salary and salary_women$salary
t = 1.7744, df = 21.591, p-value = 0.09009
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-567.8539 7247.1471
sample estimates:
mean of x mean of y
24696.79 21357.14
The two-sample Welch’s t-test comparing salary by sex is inconclusive: the difference in means (about $3,340 in favor of men) is not significant at a 95% confidence level (p = 0.090), though it would be at a 90% confidence level. This suggests that further investigation (i.e. multiple regression) is warranted.
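The code behind this test is not shown, so the following is only a sketch of how it could be reproduced. The salaries are transcribed by sex from the data listing above, and the vector names `men` and `women` are stand-ins for `salary_men$salary` and `salary_women$salary`.

```r
# Salaries transcribed from the printed salary table, split by sex.
men <- c(36350, 35350, 28200, 33696, 28516, 31909, 31850, 32850, 27025,
         24750, 28200, 23712, 25748, 29342, 31114, 24742, 22906, 24450,
         19175, 20525, 27959, 24832, 25400, 24800, 26182, 23725, 23300,
         23713, 20850, 17095, 16700, 17600, 18075, 18000, 20999, 16500,
         16094, 16244)
women <- c(26775, 24900, 38045, 25500, 21600, 20690, 22450, 18304, 17250,
           16150, 15350, 16686, 15000, 20300)

# t.test() defaults to the Welch test (var.equal = FALSE), two-sided.
welch <- t.test(men, women)
welch
```

The group means, t statistic, and p-value printed by this sketch should agree with the output shown above.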
Call:
lm(formula = salary ~ ., data = salary)
Residuals:
Min 1Q Median 3Q Max
-4045.2 -1094.7 -361.5 813.2 9193.1
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 15746.05 800.18 19.678 < 2e-16 ***
degreePhD 1388.61 1018.75 1.363 0.180
rankAssoc 5292.36 1145.40 4.621 3.22e-05 ***
rankProf 11118.76 1351.77 8.225 1.62e-10 ***
sexFemale 1166.37 925.57 1.260 0.214
year 476.31 94.91 5.018 8.65e-06 ***
ysdeg -124.57 77.49 -1.608 0.115
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2398 on 45 degrees of freedom
Multiple R-squared: 0.855, Adjusted R-squared: 0.8357
F-statistic: 44.24 on 6 and 45 DF, p-value: < 2.2e-16
The 95% confidence interval for the coefficient of sexFemale runs from roughly -697.82 to 3030.56 (the estimate of 1166.37 plus or minus about 2.014 standard errors of 925.57, using 45 residual degrees of freedom). Because this interval contains 0, the coefficient is not significant at a 95% confidence level.
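As a cross-check, the interval can be recomputed from the values printed in the regression table. This is a sketch that uses only the rounded estimate, standard error, and residual degrees of freedom for sexFemale; the variable names are illustrative.

```r
# Values read from the sexFemale row and the residual df of the table above.
estimate <- 1166.37
std_error <- 925.57
df_resid <- 45

# Two-sided 95% interval: estimate +/- t critical value * standard error.
t_crit <- qt(0.975, df_resid)
ci <- estimate + c(-1, 1) * t_crit * std_error
ci

# The interval spans zero, so the coefficient is not significant at 5%.
ci[1] < 0 && ci[2] > 0
```

With the fitted model object available, `confint()` on that object would give the same interval without the rounding in the inputs.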
Part C
15746.05 (Intercept): The expected salary for the baseline case, a male Assistant Professor with a Master’s degree and zero years both in rank and since degree, is $15746.05. This coefficient is significant beyond a 99% confidence level.
1388.61 (degreePhD): A professor with a PhD would be expected to make $1388.61 more than one with a Master’s degree, all else being equal. However, this coefficient is not significant at a 95% confidence level.
5292.36 (rankAssoc): An Associate Professor would be expected to make $5292.36 more than an Assistant Professor. This coefficient is significant at a 95% confidence level.
11118.76 (rankProf): A Full Professor would be expected to make $11118.76 more than an Assistant Professor. This coefficient is significant at a 95% confidence level.
1166.37 (sexFemale): A female professor would be expected to make $1166.37 more than a male professor based on this model. However, the coefficient is not significant at a 95% confidence level. Neither the sign nor the lack of significance supports the notion that female professors are systematically paid less than male professors at this university.
476.31 (year): Each additional year of experience in one’s current rank would be expected to add $476.31 to a professor’s salary. This coefficient is significant at a 95% confidence level.
-124.57 (ysdeg): Each additional year since earning the highest degree is associated with $124.57 less in salary according to this model. However, this coefficient is not significant at a 95% confidence level, which is reassuring, because a negative effect would make little sense given the real-world meaning of the variable.
Call:
lm(formula = salary ~ ., data = salary)
Residuals:
Min 1Q Median 3Q Max
-4045.2 -1094.7 -361.5 813.2 9193.1
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 26864.81 1375.29 19.534 < 2e-16 ***
degreePhD 1388.61 1018.75 1.363 0.180
rankAsst -11118.76 1351.77 -8.225 1.62e-10 ***
rankAssoc -5826.40 1012.93 -5.752 7.28e-07 ***
sexFemale 1166.37 925.57 1.260 0.214
year 476.31 94.91 5.018 8.65e-06 ***
ysdeg -124.57 77.49 -1.608 0.115
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2398 on 45 degrees of freedom
Multiple R-squared: 0.855, Adjusted R-squared: 0.8357
F-statistic: 44.24 on 6 and 45 DF, p-value: < 2.2e-16
Reordering the rank variable to put the “Prof” level first yields the above regression table. As in the first regression, Assistant Professors are here expected to make $11118.76 less per year than Full Professors. Associate Professors are expected to make $5826.40 less per year than Full Professors. Both of these coefficients are significant at a 95% confidence level.
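The reordering code is not shown, so the sketch below only illustrates one common way to do it, with `relevel()`, and confirms that the new contrasts follow from the first table. Whether the original analysis used `relevel()` is an assumption, and the variable names are illustrative.

```r
# Rank labels as they appear in the data; Asst is the reference level
# implied by the first regression table (it has no rankAsst row).
rank <- factor(c("Asst", "Assoc", "Prof"), levels = c("Asst", "Assoc", "Prof"))

# Move Prof to the front so it becomes the reference level in lm().
rank_prof_first <- relevel(rank, ref = "Prof")
levels(rank_prof_first)

# Contrasts against Prof, derived from the first table's estimates.
b_assoc <- 5292.36
b_prof <- 11118.76
asst_vs_prof <- -b_prof
assoc_vs_prof <- b_assoc - b_prof
c(rankAsst = asst_vs_prof, rankAssoc = assoc_vs_prof)
```

Changing the reference level only re-expresses the same fitted model, which is why the residuals, R-squared, and the non-rank coefficients are identical in both tables.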
Part E
summary(lm(salary ~ . - rank, data = salary))
Call:
lm(formula = salary ~ . - rank, data = salary)
Residuals:
Min 1Q Median 3Q Max
-8146.9 -2186.9 -491.5 2279.1 11186.6
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17183.57 1147.94 14.969 < 2e-16 ***
degreePhD -3299.35 1302.52 -2.533 0.014704 *
sexFemale -1286.54 1313.09 -0.980 0.332209
year 351.97 142.48 2.470 0.017185 *
ysdeg 339.40 80.62 4.210 0.000114 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3744 on 47 degrees of freedom
Multiple R-squared: 0.6312, Adjusted R-squared: 0.5998
F-statistic: 20.11 on 4 and 47 DF, p-value: 1.048e-09
With rank excluded, the coefficient for ysdeg becomes positive and significant, and the coefficient for degreePhD becomes negative and significant. The coefficient for sexFemale is now negative, but is still not significant.
Call:
lm(formula = salary ~ . - ysdeg, data = salary_dean)
Residuals:
Min 1Q Median 3Q Max
-3403.3 -1387.0 -167.0 528.2 9233.8
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 24425.32 1107.52 22.054 < 2e-16 ***
degreePhD 818.93 797.48 1.027 0.3100
rankAsst -11096.95 1191.00 -9.317 4.54e-12 ***
rankAssoc -6124.28 1028.58 -5.954 3.65e-07 ***
sexFemale 907.14 840.54 1.079 0.2862
year 434.85 78.89 5.512 1.65e-06 ***
dean 2163.46 1072.04 2.018 0.0496 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2362 on 45 degrees of freedom
Multiple R-squared: 0.8594, Adjusted R-squared: 0.8407
F-statistic: 45.86 on 6 and 45 DF, p-value: < 2.2e-16
Because dean is based on ysdeg, and because year and ysdeg measure overlapping lengths of time, I excluded ysdeg for this model. The dean coefficient (2163.46) is positive and only marginally significant at a 95% confidence level (p = 0.0496). The other results are similar to those above, most notably in the lack of significance for the coefficient of sexFemale.
Question 3
Call:
lm(formula = Price ~ Size + New, data = house.selling.price)
Residuals:
Min 1Q Median 3Q Max
-205102 -34374 -5778 18929 163866
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -40230.867 14696.140 -2.738 0.00737 **
Size 116.132 8.795 13.204 < 2e-16 ***
New 57736.283 18653.041 3.095 0.00257 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 53880 on 97 degrees of freedom
Multiple R-squared: 0.7226, Adjusted R-squared: 0.7169
F-statistic: 126.3 on 2 and 97 DF, p-value: < 2.2e-16
All coefficients are significant. The intercept (i.e. the predicted price of a theoretical house of no size that is not new) is -$40230.87. Each additional square foot increases the predicted house price by $116.13. A new house would be expected to sell for $57736.28 more than an old house of equal size.
Part B
\(y\) is equal to predicted selling price in USD, and \(x\) is equal to house size in square feet.
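The prediction equations themselves do not appear in this part. Assuming it asks for separate equations for new and not-new homes, they follow directly from the additive model's coefficients, where New = 1 shifts the intercept by 57736.283 and leaves the slope unchanged:

```latex
\begin{aligned}
\text{Not new } (\text{New} = 0):\quad \hat{y} &= -40230.867 + 116.132x \\
\text{New } (\text{New} = 1):\quad \hat{y} &= (-40230.867 + 57736.283) + 116.132x \\
                                          &= 17505.416 + 116.132x
\end{aligned}
```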
Call:
lm(formula = Price ~ Size * New, data = house.selling.price)
Residuals:
Min 1Q Median 3Q Max
-175748 -28979 -6260 14693 192519
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -22227.808 15521.110 -1.432 0.15536
Size 104.438 9.424 11.082 < 2e-16 ***
New -78527.502 51007.642 -1.540 0.12697
Size:New 61.916 21.686 2.855 0.00527 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 52000 on 96 degrees of freedom
Multiple R-squared: 0.7443, Adjusted R-squared: 0.7363
F-statistic: 93.15 on 3 and 96 DF, p-value: < 2.2e-16
The intercept and the New main effect are no longer significant, but the Size:New interaction is (p = 0.005). Selling price is expected to increase by about $104 per square foot for all houses, and by an additional $62 per square foot for new houses.
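To make the interaction concrete, the sketch below derives the implied intercept and slope for new homes from the coefficients in the table above. The function name `predict_price_int` and the variable names are illustrative, not part of the original analysis.

```r
# Coefficients read from the Price ~ Size * New table above.
b0 <- -22227.808      # intercept (not-new homes)
b_size <- 104.438     # slope per square foot (not-new homes)
b_new <- -78527.502   # intercept shift for new homes
b_int <- 61.916       # extra slope per square foot for new homes

# Prediction under the interaction model; new is 0 or 1.
predict_price_int <- function(size, new) {
  b0 + b_size * size + b_new * new + b_int * size * new
}

# Implied line for new homes: shifted intercept and steeper slope.
intercept_new <- b0 + b_new
slope_new <- b_size + b_int
c(intercept_new = intercept_new, slope_new = slope_new)
```

For not-new homes the line is simply the intercept and Size coefficient as printed. For new homes both terms change, which is why the New main effect cannot be interpreted on its own in this model.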