Homework 2

HW_2_603

Author

Mani Kanta Gogula

Published

October 17, 2022

library(tidyverse)
library(ggplot2)
library(stats)

knitr::opts_chunk$set(echo = TRUE)

Question 1

procedure <- c("Bypass", "Angiography")
s_size <- c(539, 847)
mean_wait_time <- c(19, 18)
s_sd <- c(10, 9)

surgery <- data.frame(procedure, s_size, mean_wait_time, s_sd)
surgery

standard_error <- s_sd / sqrt(s_size)
standard_error

[1] 0.4307305 0.3092437

confidence_level <- 0.90
tail_area <- (1-confidence_level)/2
tail_area

[1] 0.05

t_score <- qt(p = 1-tail_area, df = s_size-1)
t_score

[1] 1.647691 1.646657

CI <- c(mean_wait_time - t_score * standard_error,
        mean_wait_time + t_score * standard_error)
CI

[1] 18.29029 17.49078 19.70971 18.50922

We can be 90% confident that the population mean wait time for the bypass procedure is between 18.29029 and 19.70971 days.

We can be 90% confident that the population mean wait time for the angiography procedure is between 17.49078 and 18.50922 days.

The confidence interval for the angiography procedure is narrower (about 1.02 days wide, versus about 1.42 days for bypass), because its sample is larger and its sample standard deviation is smaller.
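The interval calculation above can be wrapped in a small function so that the bounds and widths of both procedures sit side by side. This is only a sketch: `t_ci` and `ci_table` are hypothetical names introduced here, not part of the original solution.

```r
# Hypothetical helper (not in the original solution): t-based confidence
# interval for a mean, computed from summary statistics only.
t_ci <- function(mean, sd, n, conf_level = 0.90) {
  se <- sd / sqrt(n)                                   # standard error of the mean
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)   # two-sided critical value
  data.frame(lower = mean - t_crit * se,
             upper = mean + t_crit * se,
             width = 2 * t_crit * se)
}

# Same summary statistics as in the surgery data frame above
ci_table <- data.frame(
  procedure = c("Bypass", "Angiography"),
  t_ci(mean = c(19, 18), sd = c(10, 9), n = c(539, 847))
)
ci_table
```

Keeping the width as its own column makes the comparison between the two procedures explicit instead of reading it off a flat vector.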

Question 2

prop.test(567, 1031, conf.level = .95)


    1-sample proportions test with continuity correction

data:  567 out of 1031, null probability 0.5
X-squared = 10.091, df = 1, p-value = 0.00149
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
 0.5189682 0.5805580
sample estimates:
        p 
0.5499515

The point estimate of p, the proportion of all adult Americans who believe that a college education is essential for success, is 0.5499515, and the 95% confidence interval for p is [0.5189682, 0.5805580].
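As a rough cross-check, the textbook large-sample (Wald) interval can be computed by hand. It is not the method `prop.test()` uses (that function inverts the score test and applies a continuity correction by default), so the bounds should be close to, but not identical with, the ones reported above. The variable names below are illustrative.

```r
# Wald (normal-approximation) interval, for comparison only;
# prop.test() itself uses a score-based interval with continuity correction.
x <- 567     # respondents who said a college education is essential
n <- 1031    # sample size

p_hat <- x / n
se <- sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error of p_hat
z <- qnorm(0.975)                     # critical value for 95% confidence

wald_ci <- c(lower = p_hat - z * se, upper = p_hat + z * se)
wald_ci
```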

Question 3

ME <- 5
z <- 1.96
s_sd <- (200-30)/4

s_size <- ((z*s_sd)/ME)^2
s_size

[1] 277.5556

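Estimating the standard deviation as one quarter of the range, (200 - 30)/4 = 42.5, and rounding the result up, the necessary sample size is 278.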
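The same calculation can be written as a reusable function that rounds up automatically. This is a sketch under the same assumptions as above (standard deviation guessed as range/4); `required_n` and `sd_guess` are hypothetical names, and `qnorm()` replaces the rounded value 1.96.

```r
# Hypothetical helper: smallest n such that the margin of error of a
# z-based interval for a mean is at most `margin`.
required_n <- function(margin, sd, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)   # about 1.96 for 95% confidence
  ceiling((z * sd / margin)^2)           # always round up to a whole unit
}

sd_guess <- (200 - 30) / 4   # assumption: sd is roughly range / 4
required_n(margin = 5, sd = sd_guess)
```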

Question 4

A

We assume that the sample is random and that the population has a normal distribution.

Null hypothesis: H0: μ = 500

Alternative hypothesis: Ha: μ ≠ 500

We will reject the null hypothesis if the p-value is less than or equal to 0.05.

s_mean <- 410
μ <- 500
s_sd <- 90
s_size <- 9

Calculating test-statistic

t_score <- (s_mean-μ)/(s_sd/sqrt(s_size))
t_score

[1] -3

Calculating p-value

p <- 2*pt(t_score, s_size-1)
p

[1] 0.01707168

Because the p-value (0.01707168) is less than 0.05, we reject the null hypothesis. There is sufficient evidence that the mean income of female employees differs from $500.

B

We assume that the sample is random and that the population has a normal distribution.

Null hypothesis: H0: μ = 500

Alternative hypothesis: Ha: μ < 500

We will reject the null hypothesis if the p-value is less than 0.05.

p <- pt(t_score, s_size-1, lower.tail = TRUE)
p

[1] 0.008535841

The p-value is 0.008535841. Because it is less than 0.05, we reject the null hypothesis. There is sufficient evidence that the mean income of female employees is less than $500.

C

We assume that the sample is random and that the population has a normal distribution.

Null hypothesis: H0: μ = 500

Alternative hypothesis: Ha: μ > 500

We will reject the null hypothesis if the p-value is less than 0.05.

p <- pt(t_score, s_size-1, lower.tail = FALSE)
p

[1] 0.9914642

The p-value is 0.9914642. Because it is greater than 0.05, we fail to reject the null hypothesis. There is no evidence that the mean income of female employees is greater than $500.
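The three tests in parts A to C differ only in which tail of the t distribution is used. A minimal sketch that makes this explicit is shown below; `t_test_summary` and `results` are hypothetical names, and the function mirrors the `alternative` argument of `t.test()` while working from summary statistics.

```r
# Hypothetical helper: one-sample t-test from summary statistics.
t_test_summary <- function(xbar, mu0, sd, n,
                           alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  t_stat <- (xbar - mu0) / (sd / sqrt(n))
  df <- n - 1
  p_value <- switch(alternative,
    two.sided = 2 * pt(-abs(t_stat), df),             # both tails
    less      = pt(t_stat, df),                       # lower tail
    greater   = pt(t_stat, df, lower.tail = FALSE)    # upper tail
  )
  c(t = t_stat, df = df, p_value = p_value)
}

# Parts A, B and C with the sample values used above
results <- sapply(c("two.sided", "less", "greater"), function(alt) {
  t_test_summary(xbar = 410, mu0 = 500, sd = 90, n = 9, alternative = alt)
})
results
```

The two one-sided p-values sum to 1, and the two-sided p-value is twice the smaller of them, which is why only the direction matching the sample mean (below 500) gives a small p-value.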

Question 5

A

We assume that the sample is random and that the population has a normal distribution.

Null hypothesis: H0: μ = 500

Alternative hypothesis: Ha: μ ≠ 500

We will reject the null hypothesis if the p-value is less than 0.05.

Calculating t-statistic and p-value for Jones

s_mean <- 519.5
μ <- 500
se <- 10
s_size <- 1000

jt <- (s_mean-μ)/se
jt

[1] 1.95

p <- 2*pt(jt, s_size-1, lower.tail = FALSE)
p

[1] 0.05145555

Calculating t-statistic and p-value for Smith

s_mean <- 519.7
μ <- 500
se <- 10
s_size <- 1000

jt <- (s_mean-μ)/se
jt

[1] 1.97

p <- 2*pt(jt, s_size-1, lower.tail = FALSE)
p

[1] 0.04911426

For Jones, the test statistic is 1.95 and the p-value is 0.05145555. For Smith, the test statistic is 1.97 and the p-value is 0.04911426.

B

The p-value for Jones is 0.05145555. Because it is greater than 0.05, we fail to reject the null hypothesis. The p-value for Smith is 0.04911426. Because it is less than 0.05, we reject the null hypothesis. Therefore, the result is statistically significant for Smith but not for Jones.

C

If we report only whether the P-value falls at or below the significance level or above it, and omit the P-value itself, readers cannot judge the strength of the evidence. In the Jones/Smith example, reporting the results only as P ≤ 0.05 versus P > 0.05 leads to opposite conclusions from nearly identical results.
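One way to see how similar the two studies are is to compare their 95% confidence intervals, which convey the same information as the exact P-values. The sketch below uses the sample means, standard error and sample size from part A; `jones_ci` and `smith_ci` are illustrative names.

```r
# 95% t-based confidence intervals for the two studies in part A
se <- 10
n <- 1000
t_crit <- qt(0.975, df = n - 1)

jones_ci <- 519.5 + c(-1, 1) * t_crit * se
smith_ci <- 519.7 + c(-1, 1) * t_crit * se

rbind(Jones = jones_ci, Smith = smith_ci)
```

The two intervals are shifted by only 0.2 and overlap almost entirely, yet the value 500 falls just inside the Jones interval and just outside the Smith interval.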

Question 6

gas_taxes <- c(51.27, 47.43, 38.89, 41.95, 28.61, 41.29, 52.19, 49.48, 35.02, 48.13, 39.28, 54.41, 41.66, 30.28, 18.49, 38.72, 33.41, 45.02)

t.test(gas_taxes, mu = 18.4, conf.level = .95)


    One Sample t-test

data:  gas_taxes
t = 10.238, df = 17, p-value = 1.095e-08
alternative hypothesis: true mean is not equal to 18.4
95 percent confidence interval:
 36.23386 45.49169
sample estimates:
mean of x 
 40.86278

The 95% confidence interval for the mean tax per gallon is 36.23386 to 45.49169 cents. Based on this two-sided interval, we cannot conclude with 95% confidence that the mean tax is less than 45 cents, since the interval contains values above 45 cents.
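The interval reported by `t.test()` can be reproduced by hand, which also makes the comparison with 45 cents explicit. This is a sketch using the same data; `ci` and the intermediate names are illustrative.

```r
gas_taxes <- c(51.27, 47.43, 38.89, 41.95, 28.61, 41.29, 52.19, 49.48, 35.02,
               48.13, 39.28, 54.41, 41.66, 30.28, 18.49, 38.72, 33.41, 45.02)

n <- length(gas_taxes)
se <- sd(gas_taxes) / sqrt(n)        # standard error of the sample mean
t_crit <- qt(0.975, df = n - 1)      # two-sided 95% critical value, df = 17

ci <- mean(gas_taxes) + c(-1, 1) * t_crit * se
ci

# Is the whole interval below 45 cents?
ci[2] < 45
```

The confidence interval does not depend on the `mu` argument passed to `t.test()`; that argument only affects the test statistic and p-value.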