#A we can reject the null hypothesis that the mean salary for men and women is the same. There is a significant difference in salary between men and women, with male faculty earning more on average.
#C Based on the coefficients, we can interpret that degreePhD,sexFemale,ysdeg are Not statistically significant and rankAssoc,rankProf,year are Statistically significant.
#E The coefficient for sex show that females make $1286 less than males.
Code
library(alr4)
Loading required package: car
Loading required package: carData
Loading required package: effects
lattice theme set by effectsTheme()
See ?effectsTheme for details.
Code
data(salary)summary(lm(salary ~ sex, data = salary))
Call:
lm(formula = salary ~ sex, data = salary)
Residuals:
Min 1Q Median 3Q Max
-8602.8 -4296.6 -100.8 3513.1 16687.9
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 24697 938 26.330 <2e-16 ***
sexFemale -3340 1808 -1.847 0.0706 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 5782 on 50 degrees of freedom
Multiple R-squared: 0.0639, Adjusted R-squared: 0.04518
F-statistic: 3.413 on 1 and 50 DF, p-value: 0.0706
Code
#Bmodel <-lm(salary ~ ., data = salary)summary(model)
Call:
lm(formula = salary ~ ., data = salary)
Residuals:
Min 1Q Median 3Q Max
-4045.2 -1094.7 -361.5 813.2 9193.1
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 15746.05 800.18 19.678 < 2e-16 ***
degreePhD 1388.61 1018.75 1.363 0.180
rankAssoc 5292.36 1145.40 4.621 3.22e-05 ***
rankProf 11118.76 1351.77 8.225 1.62e-10 ***
sexFemale 1166.37 925.57 1.260 0.214
year 476.31 94.91 5.018 8.65e-06 ***
ysdeg -124.57 77.49 -1.608 0.115
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2398 on 45 degrees of freedom
Multiple R-squared: 0.855, Adjusted R-squared: 0.8357
F-statistic: 44.24 on 6 and 45 DF, p-value: < 2.2e-16
H. I prefer the model with interaction as it provides a better fit to the data. The R-squared value for the model with interaction is 0.72, which is higher than the R-squared value for the model without interaction (0.67). This indicates that the model with interaction is able to explain more of the variation in the data.
Source Code
---title: "Homework4"author: "Rahul Somu"description: "Homework4"date: "05/16/2023"format: html: toc: true code-fold: true code-copy: true code-tools: truecategories: - hw4editor_options: chunk_output_type: inline---# Question 1```{r, echo=T}calculate_prediction <-function(home_size, lot_size) { intercept <--10536 coefficient_home_size <-53.8 coefficient_lot_size <-2.84 predicted_price <- intercept + coefficient_home_size * home_size + coefficient_lot_size * lot_size actual_price <-145000 residual <- actual_price - predicted_price result <-list(predicted_price = predicted_price,residual = residual )return(result)}result <-calculate_prediction(home_size =1240, lot_size =18000)predicted_price <- result$predicted_priceresidual <- result$residualpredicted_priceresidualincrease_price <-53.8increase_lot_size <-53.8*1/2.84increase_priceincrease_lot_size```#Question 2#Awe can reject the null hypothesis that the mean salary for men and women is the same. There is a significant difference in salary between men and women, with male faculty earning more on average.#CBased on the coefficients, we can interpret that degreePhD,sexFemale,ysdeg are Not statistically significant and rankAssoc,rankProf,year are Statistically significant.#EThe coefficient for sex show that females make $1286 less than males.```{r, echo=T}library(alr4)data(salary)summary(lm(salary ~ sex, data = salary))#Bmodel <-lm(salary ~ ., data = salary)summary(model)summary(model)$coefficients#Dsalary$rank <-relevel(salary$rank, ref ="Prof")model <-lm(salary ~ ., data = salary)summary(model)$coefficientsmodel <-lm(salary ~ . - rank, data = salary)summary(model)#Fsalary$new_dean <-ifelse(salary$ysdeg <=15, 1, 0)cor.test(salary$new_dean, salary$ysdeg)summary(lm(salary ~ . -ysdeg, data = salary))```#Question 3```{r, echo=T}library(smss)data(house.selling.price)model1 <-lm(Price ~ Size + New, data = house.selling.price)summary(model1)summary(lm(Price ~ Size * New, data = house.selling.price))```#Bselling_price_new = -40230.867 + 116.132 * Size + 57736.283selling_price_not_new = -40230.867 + 116.132 * Size#Cselling_price_new = -40230.867 + 116.132 * 3000 + 57736.283 selling_price_not_new = -40230.867 + 116.132 * 3000 = 25693.6```{r, echo=T}#Dnew_interaction_term <-lm(Price ~ Size + New + Size * New, data = house.selling.price)summary(new_interaction_term)```#Enew_sp_pred = -22228 + 104*Size - 78528 + 62*Size = -100756 + 168*Sizenot_new_pred = -22228 + 104*Size```{r, echo=T}#Fnew_pred <--100756+168*(3000)not_new_pred <-22228+104*(3000)new_prednot_new_pred``````{r, echo=T}#Gnew_pred <--100756+168*(1500)not_new_pred <-22228+104*(1500)new_prednot_new_pred```H. I prefer the model with interaction as it provides a better fit to the data. The R-squared value for the model with interaction is 0.72, which is higher than the R-squared value for the model without interaction (0.67). This indicates that the model with interaction is able to explain more of the variation in the data.