Code
library(tidyverse)
::opts_chunk$set(echo = TRUE) knitr
Quinn He
October 17, 2022
I want to create a data set with the information provided in the HW.
I did it manually below because using the t.test() function would give me the 95% CI. The CI for 90% is 15.34 and 21.65.
[1] 15.34312 21.65688
I found p by dividing the number of people that said college is essential for success with the total number of participants. Then I found the margin of error to calculate the low and high of the confidence interval for 95%. I believe this formula is accurate. https://www.geeksforgeeks.org/how-to-calculate-point-estimates-in-r/
The mean would fall between 503 and 630 with a confidence interval of 95%.
The sample size should be about 278.
I create a data frame with the parameters of the question in order to perform a t test to compare the means.
One Sample t-test
data: df
t = -3.0972, df = 8, p-value = 0.01473
alternative hypothesis: true mean is not equal to 500
95 percent confidence interval:
340.4972 476.6449
sample estimates:
mean of x
408.5711
The p value is 0.002.
For Jones. Plug t value into t value formula
For Smith.
For Jones, we can retain the null hypothesis since the level for Jones is greater than 0.05. For Smith we must reject the null hypothesis since his significance level is below 0.05.
If a result from a test is less than 0.05 we must reject the null hypothesis, but if it’s greater than 0.05, we must fail to reject. There is no accepting a null hypothesis, we must either reject or fail to reject.
One Sample t-test
data: gas_taxes
t = -1.8857, df = 17, p-value = 0.03827
alternative hypothesis: true mean is less than 45
95 percent confidence interval:
-Inf 44.67946
sample estimates:
mean of x
40.86278
Since the sample mean is lower than the 45 and the p value is lower than 0.05, it is statistically significant. This means the null hypothesis should be rejected.
---
title: "Homework 2"
author: "Quinn He"
desription: "DACSS 603 HW 2"
date: "10/17/2022"
format:
html:
toc: true
code-fold: true
code-copy: true
code-tools: true
categories:
- hw2
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE)
```
I want to create a data set with the information provided in the HW.
```{r}
surgical_proc <- c("Bypass", "Angiography")
samp_size <- c(539,847)
mean_wait <- c(19,18)
stand_dev <-c(10,9)
df <- tibble(surgical_proc,samp_size,mean_wait,stand_dev)
```
## 1
I did it manually below because using the t.test() function would give me the 95% CI. The CI for 90% is 15.34 and 21.65.
```{r}
mn <- mean(df$mean_wait)
standard_dev <- sd(df$mean_wait)
size <- length(df$mean_wait)
std_er <- standard_dev / sqrt(size)
confidence <- 0.9
tail_area <- (1-confidence)/2
t_score <- qt(p = 1 - tail_area, df = size-1)
CI <- c(mn - t_score * std_er,
mn + t_score * std_er)
print(CI)
```
## 2
I found p by dividing the number of people that said college is essential for success with the total number of participants. Then I found the margin of error to calculate the low and high of the confidence interval for 95%. I believe this formula is accurate.
https://www.geeksforgeeks.org/how-to-calculate-point-estimates-in-r/
The mean would fall between 503 and 630 with a confidence interval of 95%.
```{r}
p <- 567/1031
margin <- qt(0.975, df = 1031 - 1) * sqrt(1031)
low <- 567 - margin
high <- 567 + margin
CI <- c(low, high)
print(CI)
```
## 3
The sample size should be about 278.
```{r}
#1.96 * 42.5/sqrt(n) = 5
(1.96*42.5/5)^2
```
## 4a
```{r}
men_mean <- 500
sample_mean <- 410
sd <- 90
```
I create a data frame with the parameters of the question in order to perform a t test to compare the means.
```{r}
#pnorm(9, mean = 410, sd = 90)
df <- rnorm(9, mean = 410, sd = 90)
t.test(df, mu = 500, alternative = "two.sided")
```
## 4b
The p value is 0.002.
## 4c
## 5a
For Jones. Plug t value into t value formula
```{r}
2 * (1 - pt(1.95, df = 999))
```
For Smith.
```{r}
2 * (1 - pt(1.97, df = 999))
```
## 5b
For Jones, we can retain the null hypothesis since the level for Jones is greater than 0.05. For Smith we must reject the null hypothesis since his significance level is below 0.05.
## 5c
If a result from a test is less than 0.05 we must reject the null hypothesis, but if it's greater than 0.05, we must fail to reject. There is no accepting a null hypothesis, we must either reject or fail to reject.
## 6
```{r}
gas_taxes <- c(51.27, 47.43, 38.89, 41.95, 28.61, 41.29, 52.19, 49.48, 35.02, 48.13, 39.28, 54.41, 41.66, 30.28, 18.49, 38.72, 33.41, 45.02)
```
```{r}
t.test(gas_taxes, mu = 45, alternative = "less", conf.level = 0.95)
```
Since the sample mean is lower than the 45 and the p value is lower than 0.05, it is statistically significant. This means the null hypothesis should be rejected.