Below is the code used for setting up the degrees of freedom (df = sample size - 1). Thus, the Bypass df is 539 - 1 = 538, and the Angiography df is 847 - 1 = 846.
Code
Bypass_df = 538
Angio_df = 846
Below is the code for setting up the mean and standard deviation.
The 90% confidence interval for Bypass is [18.29, 19.71], and for Angio it is [17.49, 18.51]. These use the two-sided critical value qt(0.95, df); qt(0.9, df) leaves 10% in each tail and therefore yields an 80% interval. Angio has the narrower confidence interval, as expected: it has the larger sample size (847 > 539) and the smaller standard deviation (9 < 10), and both reduce the margin of error.
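The interval calculation can be cross-checked in a few lines of R. This is a minimal sketch, assuming a two-sided 90% interval built from the summary statistics given above; the helper name `t_ci` and its arguments are hypothetical and not part of the original assignment.

```r
# Hypothetical helper: two-sided t confidence interval from summary statistics
t_ci <- function(xbar, s, n, level = 0.90) {
  # A two-sided interval leaves (1 - level) / 2 in each tail
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)
  me <- t_crit * s / sqrt(n)  # margin of error = critical value * standard error
  c(lower = xbar - me, upper = xbar + me)
}

bypass_ci <- t_ci(xbar = 19, s = 10, n = 539)
angio_ci  <- t_ci(xbar = 18, s = 9,  n = 847)

round(bypass_ci, 2)
round(angio_ci, 2)
```

Comparing `diff(bypass_ci)` with `diff(angio_ci)` gives the two interval widths directly.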
Question 2
Code
# Number who believed college ed is essential for success
N <- 567
# Total sample size
S <- 1031
# Calculating point estimate
PE <- N/S
# Using the function prop.test() to find the confidence interval range & p-value
prop.test(N, S, PE)
1-sample proportions test without continuity correction
data: N out of S, null probability PE
X-squared = 6.9637e-30, df = 1, p-value = 1
alternative hypothesis: true p is not equal to 0.5499515
95 percent confidence interval:
0.5194543 0.5800778
sample estimates:
p
0.5499515
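The 95% confidence interval is [0.519, 0.580], with a point estimate of 0.550. The reported p-value of 1 is not meaningful here: the null probability passed to prop.test() is the sample proportion itself, so the test statistic is essentially zero.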
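To keep the interval separate from the test, one option is to let `prop.test()` use its default null value of 0.5 and switch off the continuity correction, which reproduces the interval in the output above. This is a minimal sketch, assuming p = 0.5 is the null hypothesis of interest; the original question may not call for a test at all.

```r
N <- 567   # respondents who said a college education is essential
S <- 1031  # total sample size

# Default null is p = 0.5; correct = FALSE gives the uncorrected interval
res <- prop.test(N, S, correct = FALSE)

res$estimate   # point estimate N / S
res$conf.int   # 95% confidence interval
res$p.value    # p-value for H0: p = 0.5 (not the point estimate)
```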
Question 3
Code
# Critical z-value for a 95% confidence level
CI95 <- qnorm(0.025, lower.tail = FALSE)
# Calculating sample size needed using the margin-of-error equation (sd = 170*0.25, margin = 5)
Student_sample <- ((170*0.25/5)*CI95)^2
Student_sample
[1] 277.5454
Based on these calculations, and rounding up to the next whole student, the needed sample size is 278 students for a 95% confidence level.
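The rounding step can be made explicit. The sketch below restates the same calculation with named quantities, using the formula n = (z * sd / margin)^2; the variable names are illustrative and not from the original code.

```r
sd_guess <- 170 * 0.25     # standard deviation used in the code above
margin   <- 5              # desired margin of error
z        <- qnorm(0.975)   # critical value for 95% confidence

# Round up: a fractional student is not possible, and rounding down
# would leave the margin of error slightly above the target
n_required <- ceiling((z * sd_guess / margin)^2)
n_required
```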
Question 4
Part A
Code
# Calculating the standard error (sd = 90, sample size = 9)
company_se <- 90/sqrt(9)
company_se
[1] 30
Code
# Calculating the t-score
company_tscore <- (410-500)/company_se
company_tscore
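The t-score feeds directly into the p-values. The following is a minimal sketch, assuming a two-sided alternative for the main test and using the same sample mean (410), null mean (500), standard deviation (90), and sample size (9); the variable names are illustrative.

```r
xbar <- 410; mu0 <- 500; s <- 90; n <- 9

se     <- s / sqrt(n)        # standard error
t_stat <- (xbar - mu0) / se  # t-score

# Two-sided p-value: probability of a t-score at least this extreme in either tail
p_two_sided <- 2 * pt(-abs(t_stat), df = n - 1)

# One-sided tail probabilities, which together sum to 1
p_lower <- pt(t_stat, df = n - 1)
p_upper <- pt(t_stat, df = n - 1, lower.tail = FALSE)

c(t = t_stat, two_sided = p_two_sided, lower = p_lower, upper = p_upper)
```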
Question 5
Part B
If the significance level is 0.05, then Smith has statistically significant study findings (two-sided p-value just under 0.05), while Jones does not (p-value just over 0.05).
Part C
Both of the studies could be statistically significant, depending on the significance level. For example, at a 0.1 significance level, Jones would also have statistically significant results. Since the t-scores are so similar (only 0.02 apart), it would not be unreasonable for both studies to reject the null hypothesis.
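The comparison can be sketched numerically. This assumes, as in the source code for this question, a standard error of 10, a null mean of 500, and 999 degrees of freedom (n = 1000); the variable names are illustrative.

```r
n <- 1000; se <- 10; mu0 <- 500

t_jones <- (519.5 - mu0) / se
t_smith <- (519.7 - mu0) / se

# Two-sided p-values from the t distribution with n - 1 degrees of freedom
p_jones <- 2 * pt(abs(t_jones), df = n - 1, lower.tail = FALSE)
p_smith <- 2 * pt(abs(t_smith), df = n - 1, lower.tail = FALSE)

p_values <- c(Jones = p_jones, Smith = p_smith)
p_values
p_values < 0.05  # significant at the 5% level?
p_values < 0.10  # significant at the 10% level?
```

The two p-values sit on either side of 0.05, which is why a fixed cutoff separates two studies with nearly identical evidence.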
Question 6
One Sample t-test
data: gas_taxes
t = -1.8857, df = 17, p-value = 0.03827
alternative hypothesis: true mean is less than 45
95 percent confidence interval:
-Inf 44.67946
sample estimates:
mean of x
40.86278
Because the p-value (0.03827) is below the 0.05 significance level, we can reject the null hypothesis that the mean gas tax is equal to or greater than 45 cents ($0.45). Consistent with this, the upper bound of the one-sided 95% confidence interval (44.68) lies below 45.
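The `t.test()` output can be reproduced by hand, which makes the link between the t-statistic, the p-value, and the one-sided confidence bound explicit. This is a minimal sketch using the `gas_taxes` data from the source code below; the other variable names are illustrative.

```r
gas_taxes <- c(51.27, 47.43, 38.89, 41.95, 28.61, 41.29, 52.19, 49.48, 35.02,
               48.13, 39.28, 54.41, 41.66, 30.28, 18.49, 38.72, 33.41, 45.02)

n  <- length(gas_taxes)
se <- sd(gas_taxes) / sqrt(n)  # standard error of the mean

# t-statistic against the null mean of 45
t_stat <- (mean(gas_taxes) - 45) / se

# Lower-tail p-value for the alternative "true mean is less than 45"
p_less <- pt(t_stat, df = n - 1)

# Upper bound of the one-sided 95% confidence interval
upper_bound <- mean(gas_taxes) + qt(0.95, df = n - 1) * se

c(t = t_stat, p = p_less, upper = upper_bound)
```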
Source Code
---
title: "Homework 2 - Emily Duryea"
author: "Emily Duryea"
description: "The second homework assignment for DACSS 603"
date: "10/17/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - hw2
  - Emily Duryea
---

# Homework 2

Uploading packages to be used for this assignment:

```{r}
library(readxl)
library(tidyverse)
library(dplyr)
```

## Question 1

Below is the code used for setting up the degrees of freedom (df = sample size - 1). Thus, the Bypass df is 539 - 1 = 538, and the Angiography df is 847 - 1 = 846.

```{r}
Bypass_df = 538
Angio_df = 846
```

Below is the code for setting up the mean and standard deviation.

```{r}
Bypass_mean = 19
Bypass_sd = 10
Angio_mean = 18
Angio_sd = 9
```

The code for calculating the critical t-score for a two-sided 90% confidence interval is below (5% in each tail, so p = 0.95).

```{r}
Bypass_tscore <- qt(p = 0.95, df = Bypass_df)
Angio_tscore <- qt(p = 0.95, df = Angio_df)
```

The code for calculating the standard error (sd/sqrt(sample size)) is below.

```{r}
Bypass_se <- Bypass_sd/sqrt(539)
Angio_se <- Angio_sd/sqrt(847)
```

The code for calculating the margin of error (t-score multiplied by the standard error) is below.

```{r}
Bypass_me <- Bypass_tscore*Bypass_se
Angio_me <- Angio_tscore*Angio_se
```

Below is the code for calculating the upper and lower ranges (add the margin of error to the mean for upper, subtract for lower).

```{r}
Bypass_low <- Bypass_mean - Bypass_me
Bypass_up <- Bypass_mean + Bypass_me
Angio_low <- Angio_mean - Angio_me
Angio_up <- Angio_mean + Angio_me
Bypass <- c(Bypass_low, Bypass_up)
Angio <- c(Angio_low, Angio_up)
Bypass
Angio
```

The 90% confidence interval for Bypass is \[18.29, 19.71\], and for Angio it is \[17.49, 18.51\]. Angio has the narrower confidence interval, as expected: it has the larger sample size (847 \> 539) and the smaller standard deviation (9 \< 10), and both reduce the margin of error.

## Question 2

```{r}
# Number who believed college ed is essential for success
N <- 567
# Total sample size
S <- 1031
# Calculating point estimate
PE <- N/S
# Using the function prop.test() to find the confidence interval range & p-value
prop.test(N, S, PE)
```

The 95% confidence interval is \[0.519, 0.580\], with a point estimate of 0.550. The reported p-value of 1 is not meaningful here: the null probability passed to prop.test() is the sample proportion itself, so the test statistic is essentially zero.

## Question 3

```{r}
# Critical z-value for a 95% confidence level
CI95 <- qnorm(0.025, lower.tail = FALSE)
# Calculating sample size needed using the margin-of-error equation (sd = 170*0.25, margin = 5)
Student_sample <- ((170*0.25/5)*CI95)^2
Student_sample
```

Based on these calculations, and rounding up to the next whole student, the needed sample size is 278 students for a 95% confidence level.

## Question 4

### Part A

```{r}
# Calculating the standard error (sd = 90, sample size = 9)
company_se <- 90/sqrt(9)
company_se
# Calculating the t-score
company_tscore <- (410-500)/company_se
company_tscore
# Calculating the two-sided p-value (df = 9-1 = 8)
company_pvalue <- (pt(q=-3, df=8))*2
company_pvalue
```

We can reject the null hypothesis, since the p-value (0.017) is less than 0.05.

### Part B

```{r}
# Calculating the lower-tail probability of the t-score (-3, df = 8)
less_company <- pt(-3, 8)
less_company
```

The p-value for the lower tail is 0.00854.

### Part C

```{r}
# Calculating the upper-tail probability of the t-score (-3, df = 8)
more_company <- pt(-3, 8, lower.tail = FALSE)
more_company
```

The p-value for the upper tail is 0.991.

```{r}
total_company <- less_company + more_company
total_company
```

The total of both tails is equal to 1.

## Question 5

### Part A

```{r}
# Calculating t-scores
Jones_tscore <- (519.5-500)/10
Jones_tscore
Smith_tscore <- (519.7-500)/10
Smith_tscore
# Calculating two-sided p-values
Jones_pvalue <- (pt(q=1.95, df=999, lower.tail=FALSE))*2
Jones_pvalue
Smith_pvalue <- (pt(q=1.97, df=999, lower.tail=FALSE))*2
Smith_pvalue
```

### Part B

If the significance level is 0.05, then Smith has statistically significant study findings (two-sided p-value just under 0.05), while Jones does not (p-value just over 0.05).

### Part C

Both of the studies could be statistically significant, depending on the significance level. For example, at a 0.1 significance level, Jones would also have statistically significant results. Since the t-scores are so similar (only 0.02 apart), it would not be unreasonable for both studies to reject the null hypothesis.

## Question 6

```{r}
gas_taxes <- c(51.27, 47.43, 38.89, 41.95, 28.61, 41.29, 52.19, 49.48, 35.02, 48.13, 39.28, 54.41, 41.66, 30.28, 18.49, 38.72, 33.41, 45.02)
# Getting the mean
mean_gtaxes <- mean(gas_taxes)
mean_gtaxes
# Conducting t-test
t.test(gas_taxes, mu=45.0, alternative="less")
```

Because the p-value (0.03827) is below the 0.05 significance level, we can reject the null hypothesis that the mean gas tax is equal to or greater than 45 cents (\$0.45).