       region       group       fertility         ppgdp
 Africa    :53   oecd  : 31   Min.   :1.134   Min.   :   114.8
 Asia      :50   other :115   1st Qu.:1.754   1st Qu.:  1283.0
 Europe    :39   africa: 53   Median :2.262   Median :  4684.5
 Latin Amer:20                Mean   :2.761   Mean   : 13012.0
 Caribbean :17                3rd Qu.:3.545   3rd Qu.: 15520.5
 Oceania   :17                Max.   :6.925   Max.   :105095.4
 (Other)   : 3
    lifeExpF        pctUrban
 Min.   :48.11   Min.   : 11.00
 1st Qu.:65.66   1st Qu.: 39.00
 Median :75.89   Median : 59.00
 Mean   :72.29   Mean   : 57.93
 3rd Qu.:79.58   3rd Qu.: 75.00
 Max.   :87.12   Max.   :100.00
The variable ppgdp is the predictor and the variable fertility is the response.
When ppgdp is below about 25,000, fertility rises sharply as ppgdp decreases. When ppgdp is above about 25,000, fertility stays roughly stable. A straight-line mean function is therefore not a plausible summary of this graph.
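One rough numerical check on this reading is to compare a straight-line fit on the raw scale with one on log(ppgdp), the transformation used in part (3) of the source at the end of this page. The following is a minimal sketch, assuming the alr4 package (which provides UN11) is installed; the names fit_raw, fit_log, r2_raw, and r2_log are introduced here for illustration only.

```r
# Assumption: the alr4 package is installed; it provides the UN11 data.
library(alr4)
data(UN11)

# Straight-line fit on the raw ppgdp scale.
fit_raw <- lm(fertility ~ ppgdp, data = UN11)

# Straight-line fit after log-transforming the predictor.
fit_log <- lm(fertility ~ log(ppgdp), data = UN11)

# R-squared for each fit, as a rough numerical companion to the scatterplots.
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared
print(c(raw = r2_raw, log = r2_log))
```

A clearly larger R-squared for the log-scale fit would agree with the visual impression, although R-squared alone does not show that either mean function is adequate.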
The scatterplot matrix suggests positive correlations between quality and helpfulness, between helpfulness and clarity, and between quality and clarity. It shows no strong correlation among the remaining pairs of variables.
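The visual impressions from the scatterplot matrix can be put next to the pairwise Pearson correlations. This is a minimal sketch, assuming the alr4 package (which provides Rateprof) is installed; rating_vars and cor_matrix are illustrative names, and the five columns are the ones passed to pairs() in the source.

```r
# Assumption: the alr4 package is installed; it provides the Rateprof data.
library(alr4)
data(Rateprof)

# The same five variables that appear in the scatterplot matrix.
rating_vars <- c("quality", "helpfulness", "clarity", "easiness", "raterInterest")

# Pairwise Pearson correlations, rounded for easier reading.
cor_matrix <- cor(Rateprof[, rating_vars])
print(round(cor_matrix, 2))
```

Entries close to 1 correspond to the tight, upward-sloping panels in the plot, and entries nearer 0 correspond to the diffuse ones.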
Question 5
data("student.survey")
summary(student.survey)
      subj       ge          ag              hi              co
 Min.   : 1.00   f:31   Min.   :22.00   Min.   :2.000   Min.   :2.600
 1st Qu.:15.75   m:29   1st Qu.:24.00   1st Qu.:3.000   1st Qu.:3.175
 Median :30.50          Median :26.50   Median :3.350   Median :3.500
 Mean   :30.50          Mean   :29.17   Mean   :3.308   Mean   :3.453
 3rd Qu.:45.25          3rd Qu.:31.00   3rd Qu.:3.625   3rd Qu.:3.725
 Max.   :60.00          Max.   :71.00   Max.   :4.000   Max.   :4.000
       dh             dr               tv               sp
 Min.   :   0   Min.   : 0.200   Min.   : 0.000   Min.   : 0.000
 1st Qu.: 205   1st Qu.: 1.450   1st Qu.: 3.000   1st Qu.: 3.000
 Median : 640   Median : 2.000   Median : 6.000   Median : 5.000
 Mean   :1232   Mean   : 3.818   Mean   : 7.267   Mean   : 5.483
 3rd Qu.:1350   3rd Qu.: 5.000   3rd Qu.:10.000   3rd Qu.: 7.000
 Max.   :8000   Max.   :20.000   Max.   :37.000   Max.   :16.000
       ne               ah              ve          pa
 Min.   : 0.000   Min.   : 0.000   Mode :logical   d:21
 1st Qu.: 2.000   1st Qu.: 0.000   FALSE:60        i:24
 Median : 3.000   Median : 0.500                   r:15
 Mean   : 4.083   Mean   : 1.433
 3rd Qu.: 5.250   3rd Qu.: 2.000
 Max.   :14.000   Max.   :11.000
                    pi                re              ab              aa
 very liberal         : 8   never       :15   Mode :logical   Mode :logical
 liberal              :24   occasionally:29   FALSE:60        FALSE:59
 slightly liberal     : 6   most weeks  : 7                   NA's :1
 moderate             :10   every week  : 9
 slightly conservative: 6
 conservative         : 4
 very conservative    : 2
     ld
 Mode :logical
 FALSE:44
 NA's :16
In the first regression analysis, students who are very conservative attend church every week, whereas students who are liberal or very liberal rarely attend. This indicates a close relationship between political ideology and religiosity.
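The regression behind this paragraph treats both ordered factors as numeric scores. The sketch below illustrates that step, assuming the smss package is installed and that the factor levels are stored in the order shown in the summary output (very liberal to very conservative for pi, never to every week for re). The names pi_score, re_score, and fit_pi_re are introduced here for illustration only.

```r
# Assumption: the smss package is installed; it provides student.survey.
library(smss)
data(student.survey)

# as.numeric() on a factor returns its integer level codes,
# so the level order determines the scores. Check it first.
print(levels(student.survey$pi))
print(levels(student.survey$re))

# Integer scores: 1 = first level, 2 = second level, and so on.
pi_score <- as.numeric(student.survey$pi)
re_score <- as.numeric(student.survey$re)

# Regress the ideology score on the religiosity score.
fit_pi_re <- lm(pi_score ~ re_score)
print(coef(fit_pi_re))

# Correlation between the two scores.
print(cor(pi_score, re_score))
```

Under this coding, the slope is the estimated change in ideology score for a one-step increase in attendance frequency. Equal spacing between adjacent categories is an assumption built into the coding, not something the data confirm.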
In the second regression analysis, the plot shows that students who spend more time watching TV tend to have a lower GPA. In other words, there is a negative association between GPA and hours of TV watched.
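The direction of this association can also be read from the fitted slope rather than from the plot alone. This is a minimal sketch, assuming the smss package is installed; fit_hi_tv and slope_tv are illustrative names, and the model is the same hi ~ tv regression that appears in the source.

```r
# Assumption: the smss package is installed; it provides student.survey.
library(smss)
data(student.survey)

# Regress GPA (hi) on weekly hours of TV (tv).
fit_hi_tv <- lm(hi ~ tv, data = student.survey)

# The sign of the tv coefficient gives the direction of the association.
slope_tv <- coef(fit_hi_tv)[["tv"]]
print(slope_tv)

# Coefficient table: estimate, standard error, t value, and p-value.
print(coef(summary(fit_hi_tv)))

# 95% confidence interval for the slope.
print(confint(fit_hi_tv, "tv"))
```

A negative slope matches the description above. The confidence interval indicates how precisely that slope is estimated from the 60 students in the sample.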
Source Code
---
title: "Homework 3"
author: "Guanhua Tan"
description: "Homework 3"
date: "04/01/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - hw3
  - regression analysis
---

```{r}
library(tidyverse)
library(smss)
library(alr4)
data(UN11)
```

# Question 1

```{r}
glimpse(UN11)
summary(UN11)
```

(1) The variable ppgdp is the predictor and the variable fertility is the response.

(2)

```{r}
# scatterplot
ggplot(UN11, aes(x=ppgdp, y=fertility))+
  geom_point()+
  geom_smooth(method='lm')
```

When ppgdp is below about 25,000, fertility rises sharply as ppgdp decreases. When ppgdp is above about 25,000, fertility stays roughly stable. A straight-line mean function is therefore not a plausible summary of this graph.

(3)

```{r}
# scatterplot with log(ppgdp)
ggplot(UN11, aes(x=log(ppgdp), y=fertility))+
  geom_point()+
  geom_smooth(method='lm')
```

Yes, a simple linear regression model seems plausible as a summary of this graph.

# Question 2

```{r}
ggplot(UN11, aes(x=log(ppgdp), y=fertility))+
  geom_point()+
  geom_smooth(method="lm")
cor(UN11$ppgdp, UN11$fertility)
```

```{r}
UN11$income.pound=UN11$ppgdp*1.33
ggplot(UN11, aes(x=log(income.pound), y=fertility))+
  geom_point()+
  geom_smooth(method='lm')
cor.test(UN11$income.pound, UN11$fertility)
```

(a) The slope of the prediction equation stays the same. On the log scale used in these plots, multiplying income by a constant only shifts log(income) by a constant, so only the intercept changes.

(b) The correlation does not change.

# Question 3

```{r}
data("water")
summary(water)
pairs(~APMAM+APSAB+APSLAKE+OPBPC+OPRC+OPSLAKE+BSAAM, data=water)
```

The scatterplot matrix shows positive correlations between every pair of sites.

# Question 4

```{r}
data("Rateprof")
summary(Rateprof)
pairs(~quality+helpfulness+clarity+easiness+raterInterest, data=Rateprof)
```

The scatterplot matrix suggests positive correlations between quality and helpfulness, between helpfulness and clarity, and between quality and clarity. It shows no strong correlation among the remaining pairs of variables.

# Question 5

```{r}
data("student.survey")
summary(student.survey)
# (i)
lm_ideology_religiosity <- lm(as.numeric(pi)~as.numeric(re), data=student.survey)
summary(lm_ideology_religiosity)
```

```{r}
# (ii)
lm_hi_tv <- lm(hi~tv, data=student.survey)
summary(lm_hi_tv)
```

```{r}
ggplot(student.survey, aes(x=re, fill=pi))+
  geom_bar(position="fill")
ggplot(student.survey, aes(x=tv, y=log(hi)))+
  geom_smooth(method="lm")
```

In the first regression analysis, students who are very conservative attend church every week, whereas students who are liberal or very liberal rarely attend. This indicates a close relationship between political ideology and religiosity.

In the second regression analysis, the plot shows that students who spend more time watching TV tend to have a lower GPA. In other words, there is a negative association between GPA and hours of TV watched.