Homework1
sai Pothula
Author

Sai Padma Pothula

Published

May 2, 2023


Code
library(readxl)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Code
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0     ✔ purrr   0.3.5
✔ tibble  3.1.8     ✔ stringr 1.5.0
✔ tidyr   1.2.1     ✔ forcats 0.5.2
✔ readr   2.1.3     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Code
library(ggplot2)

lung_data <- read_excel("_data/LungCapData.xls")
view(lung_data)

1A:

Code
hist(lung_data$LungCap, freq = FALSE, main = "Distribution of Lung Capacity", xlab = "Lung Capacity", ylab = "Density")

1B:

Code
boxplot(LungCap ~ Gender, data = lung_data)
title("Lung Capacity by Gender")

1C:

Code
boxplot(LungCap ~ Smoke, data = lung_data)
title("Lung Capacity between Smokers and Non-Smokers")

1D:

Code
# Bin ages into four categories; the data frame loaded above is lung_data.
lung2 <- lung_data %>%
  mutate(Age_Cat = case_when(
    Age >= 0 & Age <= 13 ~ "13 or less",
    Age >= 14 & Age <= 15 ~ "14 to 15",
    Age >= 16 & Age <= 17 ~ "16 to 17",
    Age >= 18 ~ "18 or more"
  ))
Code
box_plot_crop2 <- ggplot(data = lung2, aes(x = Smoke, y = LungCap, fill = Smoke))
Code
# One panel per age category, comparing smokers with non-smokers.
box_plot_crop2 + geom_boxplot() +
  theme(legend.position = "right",
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) +
  coord_cartesian(ylim = c(0, 15)) +
  labs(title = "Box Plot - Lung Capacity",
       x = "Smoke", y = "Lung Capacity") +
  facet_wrap(. ~ Age_Cat, scales = "free")
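
The age binning used for this part can also be sketched with base R's `cut()`. The `ages` vector below is hypothetical and made up for illustration, and the sketch assumes ages are whole numbers, so that the interval (13, 15] corresponds to "14 to 15".

```r
# Hypothetical ages, invented for illustration (not taken from LungCapData).
ages <- c(3, 13, 14, 15, 16, 17, 18, 19)

# cut() intervals are right-closed by default:
# (-Inf, 13], (13, 15], (15, 17], (17, Inf)
age_cat <- cut(
  ages,
  breaks = c(-Inf, 13, 15, 17, Inf),
  labels = c("13 or less", "14 to 15", "16 to 17", "18 or more")
)

# Count how many ages fall into each category.
table(age_cat)
```

Unlike `case_when()`, `cut()` returns a factor whose levels follow the order of `labels`, which keeps facet panels in age order without extra work.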

1E: Within three of the four age groups, smoking is negatively associated with lung capacity. Age is a plausible confounder: older individuals tend to smoke more and also have higher lung capacity, so comparing smokers with non-smokers without accounting for age can misrepresent the true association.
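
How age can distort the pooled comparison is easier to see on a tiny example. The data frame `toy` below is entirely hypothetical, invented only to reproduce the pattern described above; it is not derived from LungCapData.

```r
# Hypothetical toy data, invented purely to illustrate confounding by age.
toy <- data.frame(
  Age_Cat = c(rep("13 or less", 5), rep("16 to 17", 5)),
  Smoke   = c("no", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"),
  LungCap = c(5, 6, 5, 6, 4.5, 11, 10, 10.5, 10, 9.5)
)

# Pooled means: smokers appear to have the higher lung capacity,
# because most smokers in the toy data are in the older group.
pooled <- aggregate(LungCap ~ Smoke, data = toy, FUN = mean)

# Means within each age group: smokers have the lower mean in both groups.
by_age <- aggregate(LungCap ~ Age_Cat + Smoke, data = toy, FUN = mean)

print(pooled)
print(by_age)
```

In this made-up sample the pooled comparison and the within-group comparison point in opposite directions, which is why the comparison is stratified by age category.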

2A:

Code
X <- c(0, 1, 2, 3, 4)
frequency <- c(128, 434, 160, 64, 24)
Code
prob_two <- frequency[3] / sum(frequency)
print(paste("the probability:", prob_two))
[1] "the probability: 0.197530864197531"

2B:

Code
prob_under_two <- sum(frequency[1:2]) / sum(frequency)
print(paste("the probability:", prob_under_two))
[1] "the probability: 0.693827160493827"

2C:

Code
prob_at_most_two <- sum(frequency[1:3]) / sum(frequency)
print(paste("the probability:", prob_at_most_two))
[1] "the probability: 0.891358024691358"

2D:

Code
prob_over_two <- sum(frequency[4:5]) / sum(frequency)
print(paste("the probability:", prob_over_two))
[1] "the probability: 0.108641975308642"
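
The four probabilities in 2A to 2D can also be read off a single probability table. The sketch below reuses `X` and `frequency` from 2A; the names `prob`, `cum_prob`, and the `p_*` variables are introduced here for illustration only.

```r
X <- c(0, 1, 2, 3, 4)
frequency <- c(128, 434, 160, 64, 24)

# Relative frequencies: each count divided by the total (810).
prob <- frequency / sum(frequency)
names(prob) <- X

# Running totals give P(X <= x) for every x at once.
cum_prob <- cumsum(prob)

p_exactly_2   <- prob[["2"]]           # 2A: P(X = 2)
p_less_than_2 <- cum_prob[["1"]]       # 2B: P(X < 2) = P(X <= 1)
p_at_most_2   <- cum_prob[["2"]]       # 2C: P(X <= 2)
p_more_than_2 <- 1 - cum_prob[["2"]]   # 2D: P(X > 2)

round(c(p_exactly_2, p_less_than_2, p_at_most_2, p_more_than_2), 4)
```

Dividing by `sum(frequency)` rather than a hard-coded 810 keeps the denominators consistent if the counts ever change.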

2E:

Code
# Probabilities from the counts; a shared denominator avoids typos such as 64/801.
prob <- frequency / sum(frequency)
expected_value <- weighted.mean(X, prob)
print(paste("the expected value:", round(expected_value, 4)))
[1] "the expected value: 1.2864"

2F:

Code
variance <- sum((X - expected_value)^2 * prob)
print(paste("the variance:", round(variance, 4)))
[1] "the variance: 0.8562"
Code
standard_deviation <- sqrt(variance)
Code
print(paste("the standard deviation:", round(standard_deviation, 4)))
[1] "the standard deviation: 0.9253"
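
As a cross-check on 2E and 2F, the variance can be computed two ways: from the definition and from the shortcut E[X^2] - (E[X])^2. This is a minimal sketch using the counts from 2A; the names `prob`, `mu`, `var_def`, `var_short`, and `sd_x` are introduced here for illustration.

```r
X <- c(0, 1, 2, 3, 4)
frequency <- c(128, 434, 160, 64, 24)
prob <- frequency / sum(frequency)

mu        <- sum(X * prob)               # E[X]
var_def   <- sum((X - mu)^2 * prob)      # definition: E[(X - mu)^2]
var_short <- sum(X^2 * prob) - mu^2      # shortcut:   E[X^2] - mu^2
sd_x      <- sqrt(var_def)

# Both variance formulas should agree up to floating-point error.
stopifnot(isTRUE(all.equal(var_def, var_short)))

round(c(mean = mu, variance = var_def, sd = sd_x), 4)
```

With these counts the mean is about 1.2864, the variance about 0.8562, and the standard deviation about 0.9253.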