HW4

This is HW4, which uses the sleep data cleaned in HW3.

Eunsol Noh
2022-03-19
library(readr)
library(tidyverse)
library(tidyr)
library(dplyr)
library(lubridate)

setwd('/Users/eunsolnoh/Desktop/dacss601/R') #set path
sleep_data<-read.table(file="sleep_data.csv", header = TRUE) #load the file that was cleaned through HW3
knitr::opts_chunk$set(echo = TRUE)

Goals 1

Although “Time.in.bed” and “TST_mins” usually have different meanings in sleep studies, HW3 confirmed that in this data set “Time.in.bed” holds the same values as “TST_mins”. I will therefore omit the sleep-efficiency check (C.Goals 1 in HW3).

head(sleep_data$Time.in.bed)
[1] 512  16 510 404 432 438
head(sleep_data$TST_mins)
[1] 512  16 510 404 432 438
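`head()` only compares the first six rows. A fuller check, sketched here on a small made-up data frame (the name `toy` and its values are illustrative, not the real sleep data), compares every row at once:

```r
# Toy stand-in for sleep_data; the values are invented for illustration.
toy <- data.frame(
  Time.in.bed = c(512, 16, 510, 404, NA),
  TST_mins    = c(512, 16, 510, 404, NA)
)

# identical() is TRUE only if the two columns match exactly, including NAs and type.
same_everywhere <- identical(toy$Time.in.bed, toy$TST_mins)

# Count rows where the values differ; rows with an NA in either column are skipped.
n_mismatch <- sum(toy$Time.in.bed != toy$TST_mins, na.rm = TRUE)

print(same_everywhere)
print(n_mismatch)
```

On the real data, the same two calls would be applied to `sleep_data$Time.in.bed` and `sleep_data$TST_mins`.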

mean() #mean
median() #median
min() #lowest value
max() #highest value
sd() #standard deviation
var() #variance (standard deviation squared)
IQR() #interquartile range
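As a quick illustration of these functions, the sketch below applies each one to the six `Time.in.bed` values printed by `head()` earlier. Each function takes a numeric vector; the variable name `x` is only for this example.

```r
# The first six Time.in.bed values shown in the head() output.
x <- c(512, 16, 510, 404, 432, 438)

mean(x)    # arithmetic mean
median(x)  # middle value
min(x)     # lowest value
max(x)     # highest value
sd(x)      # sample standard deviation
var(x)     # sample variance, equal to sd(x)^2
IQR(x)     # interquartile range (75th minus 25th percentile)
```

When a column contains missing values, adding `na.rm = TRUE` to any of these calls drops the `NA`s before computing.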

#Means, medians, minimums, maximums, and SDs of sleep quality, time in bed, and heart rate are presented below for each combination of states (coffee, tea, exercise, and stress)

sleep_data1<-sleep_data[ -c(1,2,5,6,12) ] #sleep_data1 is created for further analyses (2.2-2.4).

sleep_data1 %>% #mean
  group_by(coffee_state,tea_state,working_out_state,stress_state) %>%
  summarize_all(mean, na.rm = TRUE) 
# A tibble: 14 × 7
# Groups:   coffee_state, tea_state, working_out_state [7]
   coffee_state tea_state working_out_state stress_state Sleep.quality
   <chr>        <chr>     <chr>             <chr>                <dbl>
 1 No           No        No                No                    71.9
 2 No           No        No                Yes                    3  
 3 No           Yes       No                No                    84  
 4 No           Yes       No                Yes                   93  
 5 No           Yes       Yes               No                    79.9
 6 No           Yes       Yes               Yes                   90.3
 7 Yes          No        No                No                    75.8
 8 Yes          No        No                Yes                   99  
 9 Yes          No        Yes               No                    79.3
10 Yes          No        Yes               Yes                   82  
11 Yes          Yes       No                No                    82.0
12 Yes          Yes       No                Yes                   80.6
13 Yes          Yes       Yes               No                    76  
14 Yes          Yes       Yes               Yes                   60.5
# … with 2 more variables: Time.in.bed <dbl>, Heart.rate <dbl>
sleep_data1 %>% #median
  group_by(coffee_state,tea_state,working_out_state,stress_state) %>%
  summarize_all(median, na.rm = TRUE) 
# A tibble: 14 × 7
# Groups:   coffee_state, tea_state, working_out_state [7]
   coffee_state tea_state working_out_state stress_state Sleep.quality
   <chr>        <chr>     <chr>             <chr>                <dbl>
 1 No           No        No                No                    91  
 2 No           No        No                Yes                    3  
 3 No           Yes       No                No                    82  
 4 No           Yes       No                Yes                   93  
 5 No           Yes       Yes               No                    78  
 6 No           Yes       Yes               Yes                   92  
 7 Yes          No        No                No                    76  
 8 Yes          No        No                Yes                   99  
 9 Yes          No        Yes               No                    82  
10 Yes          No        Yes               Yes                   82  
11 Yes          Yes       No                No                    82  
12 Yes          Yes       No                Yes                   77  
13 Yes          Yes       Yes               No                    77.5
14 Yes          Yes       Yes               Yes                   60  
# … with 2 more variables: Time.in.bed <dbl>, Heart.rate <dbl>
sleep_data1 %>% #min
  group_by(coffee_state,tea_state,working_out_state,stress_state)%>%
  summarize_all(min, na.rm = TRUE) 
# A tibble: 14 × 7
# Groups:   coffee_state, tea_state, working_out_state [7]
   coffee_state tea_state working_out_state stress_state Sleep.quality
   <chr>        <chr>     <chr>             <chr>                <int>
 1 No           No        No                No                      16
 2 No           No        No                Yes                      3
 3 No           Yes       No                No                      75
 4 No           Yes       No                Yes                     93
 5 No           Yes       Yes               No                      59
 6 No           Yes       Yes               Yes                     86
 7 Yes          No        No                No                      58
 8 Yes          No        No                Yes                     98
 9 Yes          No        Yes               No                      53
10 Yes          No        Yes               Yes                     82
11 Yes          Yes       No                No                      64
12 Yes          Yes       No                Yes                     76
13 Yes          Yes       Yes               No                      50
14 Yes          Yes       Yes               Yes                     54
# … with 2 more variables: Time.in.bed <int>, Heart.rate <int>
sleep_data1 %>% #max
  group_by(coffee_state,tea_state,working_out_state,stress_state)%>%
  summarize_all(max, na.rm = TRUE) 
# A tibble: 14 × 7
# Groups:   coffee_state, tea_state, working_out_state [7]
   coffee_state tea_state working_out_state stress_state Sleep.quality
   <chr>        <chr>     <chr>             <chr>                <int>
 1 No           No        No                No                     100
 2 No           No        No                Yes                      3
 3 No           Yes       No                No                     100
 4 No           Yes       No                Yes                     93
 5 No           Yes       Yes               No                     100
 6 No           Yes       Yes               Yes                     93
 7 Yes          No        No                No                      90
 8 Yes          No        No                Yes                    100
 9 Yes          No        Yes               No                      97
10 Yes          No        Yes               Yes                     82
11 Yes          Yes       No                No                     100
12 Yes          Yes       No                Yes                     93
13 Yes          Yes       Yes               No                      93
14 Yes          Yes       Yes               Yes                     68
# … with 2 more variables: Time.in.bed <int>, Heart.rate <int>
sleep_data1 %>% #sd
  group_by(coffee_state,tea_state,working_out_state,stress_state)%>%
  summarize_all(sd, na.rm = TRUE) 
# A tibble: 14 × 7
# Groups:   coffee_state, tea_state, working_out_state [7]
   coffee_state tea_state working_out_state stress_state Sleep.quality
   <chr>        <chr>     <chr>             <chr>                <dbl>
 1 No           No        No                No                   36.3 
 2 No           No        No                Yes                  NA   
 3 No           Yes       No                No                    8.12
 4 No           Yes       No                Yes                  NA   
 5 No           Yes       Yes               No                   10.9 
 6 No           Yes       Yes               Yes                   3.79
 7 Yes          No        No                No                   10.1 
 8 Yes          No        No                Yes                   1.41
 9 Yes          No        Yes               No                   14.4 
10 Yes          No        Yes               Yes                  NA   
11 Yes          Yes       No                No                    9.19
12 Yes          Yes       No                Yes                   6.05
13 Yes          Yes       Yes               No                    9.53
14 Yes          Yes       Yes               Yes                   5.80
# … with 2 more variables: Time.in.bed <dbl>, Heart.rate <dbl>
ggplot(sleep_data1, aes(coffee_state)) + geom_bar() + theme_bw() + labs(title = "Conditions before sleep", y = "Number of Responses", x = "Drank coffee") 
ggplot(sleep_data1, aes(tea_state)) + geom_bar() + theme_bw() + labs(title = "Conditions before sleep", y = "Number of Responses", x = "Drank tea") 
ggplot(sleep_data1, aes(working_out_state)) + geom_bar()+ theme_bw() + labs(title = "Conditions before sleep", y = "Number of Responses", x = "Working-out state") 
ggplot(sleep_data1, aes(stress_state)) + geom_bar()+ theme_bw() + labs(title = "Conditions before sleep", y = "Number of Responses", x = "Stress state") 
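Several rows of the SD table above show `NA` (for example the No/No/No/Yes group, whose mean, minimum, and maximum are all 3). That pattern is consistent with groups that contain a single night, because `sd()` returns `NA` for one observation. A minimal sketch of reporting the group size next to the statistics, using invented values and only two of the four state columns:

```r
library(dplyr)

# Toy data with invented values; only the column names mirror sleep_data1.
toy <- tibble(
  coffee_state  = c("No", "No", "No", "Yes", "Yes", "Yes"),
  stress_state  = c("No", "No", "Yes", "No", "No", "No"),
  Sleep.quality = c(70, 90, 3, 76, 80, 72)
)

group_summary <- toy %>%
  group_by(coffee_state, stress_state) %>%
  summarise(
    n         = n(),                              # nights in the group
    mean_qual = mean(Sleep.quality, na.rm = TRUE),
    sd_qual   = sd(Sleep.quality, na.rm = TRUE),  # NA when n == 1
    .groups   = "drop"
  )

print(group_summary)
```

Showing `n` beside each mean makes it easier to see which group averages rest on only one or two nights.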

Goals 2

Goals 2.2-2.4 (C.Goals 2.2-2.4 which I described in HW3).

2.1 One paper reported that subjects who woke up early had a "worse mood" than those who fell asleep late, assuming the same amount of sleep. So, I will look at the relationship between **time to fall asleep or time to wake up vs. sleep quality**.

2.2 I will see the effects of coffee consumed during the day on sleep features -> **Drank coffee vs. time to fall asleep, time to wake up, sleep quality or TST**

2.3 I will see the effects of tea consumed during the day on sleep features -> **Drank tea vs. time to fall asleep, time to wake up, sleep quality or TST**

2.4 I will see the effects of stress during the day on sleep features -> **Stressful day vs. time to fall asleep, time to wake up, sleep quality or TST**

2.5 I will see the effects of exercise during the day on sleep features -> **Worked out vs. time to fall asleep, time to wake up, sleep quality or TST**

#2.1 One paper reported a higher proportion of "worse mood" among subjects who woke up early than among those who fell asleep late, assuming the same amount of sleep. So, I will look at the relationship between time to fall asleep or time to wake up vs. sleep quality and mood at awake. In addition, other sleep features will be analyzed.

#2.1-1 relationship between time to go to bed and mood at awake
ggplot(sleep_data, aes(Start,Mood.at.awake)) + geom_point()
#2.1-2 relationship between time to wake up and mood at awake
ggplot(sleep_data, aes(End, Mood.at.awake)) + geom_point()
#No relationship was observed between time to go to bed or time to wake up and mood at awake. One limitation is that the amount of sleep was not the same for early sleepers and late sleepers, but I still wanted to check whether the time to go to bed affects mood at awake; I couldn't see any relationship between them. The relationship between time to wake up and mood at awake was also examined, and it did not show any correlation either.

#2.1-3 relationship between time to go to bed and sleep quality
ggplot(sleep_data, aes(Start, Sleep.quality)) + geom_point()
#2.1-4 relationship between time to wake up and sleep quality
ggplot(sleep_data, aes(End, Sleep.quality)) + geom_point()
#Regardless of time to sleep or wake up, sleep quality is generally high across the subjects.
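The scatterplots in 2.1 put `Start` and `End` on the x-axis. If those columns are stored as text timestamps, they may be easier to compare after conversion to a numeric clock hour. The sketch below assumes a `"YYYY-MM-DD HH:MM:SS"` format, which is a guess because the real format is not shown here; the vector `start_txt` and the name `bed_hour` are hypothetical.

```r
# Hypothetical timestamps; the real format of Start/End is an assumption.
start_txt <- c("2022-01-03 22:30:00", "2022-01-04 23:45:00", "2022-01-06 00:15:00")

# Parse as date-times in a fixed time zone so results do not depend on locale.
start_time <- as.POSIXct(start_txt, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Clock time as decimal hours (0 to 24).
clock_hour <- as.numeric(format(start_time, "%H")) +
  as.numeric(format(start_time, "%M")) / 60

# Shift early-morning bedtimes past 24 so 00:15 sorts after 23:45.
bed_hour <- ifelse(clock_hour < 12, clock_hour + 24, clock_hour)

print(bed_hour)
```

The noon cut-off used for the shift is itself a choice; it treats any bedtime before 12:00 as belonging to the previous evening.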
#2.2 I will see the effects of coffee consumed during the day on sleep features -> **Drank coffee vs. time to fall asleep, time to wake up, sleep quality or TST**

ggplot(sleep_data, aes(Start,coffee_state)) + geom_point()
ggplot(sleep_data, aes(End,coffee_state)) + geom_point()
#There seems to be no effect of drinking coffee before sleep on the time to sleep or to wake up.

ggplot(sleep_data1, aes(coffee_state, Sleep.quality)) + geom_violin()
ggplot(sleep_data1, aes(coffee_state, Time.in.bed)) + geom_violin()
#Those who drank coffee showed better sleep quality and more time in bed than the group who had no coffee.
#2.3 I will see the effects of tea consumed during the day on sleep features -> **Drank tea vs. time to fall asleep, time to wake up, sleep quality or TST**
ggplot(sleep_data, aes(Start,tea_state)) + geom_point()
ggplot(sleep_data, aes(End,tea_state)) + geom_point()
#Drinking tea did not appear to affect time to sleep or to wake up either.

ggplot(sleep_data1, aes(tea_state, Sleep.quality)) + geom_violin()
ggplot(sleep_data1, aes(tea_state, Time.in.bed)) + geom_violin()
#The tea group showed better sleep quality and more time in bed than the no-tea group: those who had no tea covered the whole range of sleep amount and quality, while all of the subjects who had tea showed better sleep quality and a higher amount of sleep.
#2.4 I will see the effects of stress during the day on sleep features -> **Stressful day vs. time to fall asleep, time to wake up, sleep quality or TST**
ggplot(sleep_data, aes(Start,stress_state)) + geom_point()
ggplot(sleep_data, aes(End,stress_state)) + geom_point()
#Stress did not appear to affect time to sleep or to wake up either.

ggplot(sleep_data1, aes(stress_state, Sleep.quality)) + geom_violin()
ggplot(sleep_data1, aes(stress_state, Time.in.bed)) + geom_violin()
#Compared to those who had stress, most people who had no stress showed more time in bed.
#2.5 I will see the effects of exercise during the day on sleep features -> **Worked out vs. time to fall asleep, time to wake up, sleep quality or TST**
ggplot(sleep_data, aes(Start,working_out_state)) + geom_point()
ggplot(sleep_data, aes(End,working_out_state)) + geom_point()
#Working out did not appear to affect time to sleep or to wake up either.

ggplot(sleep_data1, aes(working_out_state, Sleep.quality)) + geom_violin()
ggplot(sleep_data1, aes(working_out_state, Time.in.bed)) + geom_violin()
#Compared to those who didn't work out, most people who worked out showed more time in bed and higher sleep quality as well.
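The group comparisons in 2.2 to 2.5 are read off violin plots. One possible way to put numbers beside them, sketched here on invented values (the data frame `toy` is hypothetical and only the column names mirror `sleep_data1`), is to report each group's size and median and run a Wilcoxon rank-sum test. This is an illustration, not an analysis that was run on the real data.

```r
# Toy data with invented values; column names mirror sleep_data1.
toy <- data.frame(
  working_out_state = rep(c("No", "Yes"), each = 6),
  Time.in.bed       = c(380, 402, 415, 431, 447, 390,
                        455, 468, 472, 489, 501, 460)
)

# Group sizes and medians give numbers to set beside each violin.
group_n      <- table(toy$working_out_state)
group_median <- tapply(toy$Time.in.bed, toy$working_out_state, median)

# Wilcoxon rank-sum test: a nonparametric comparison of the two groups.
test_result <- wilcox.test(Time.in.bed ~ working_out_state, data = toy)

print(group_n)
print(group_median)
print(test_result$p.value)
```

The same pattern would apply to `coffee_state`, `tea_state`, or `stress_state` by swapping the grouping column, keeping in mind that some groups in this data set are very small.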
#For fun, some extra analyses were run.

#relationship between time in bed(TST) and sleep quality
ggplot(data = sleep_data1, mapping = aes(x = Time.in.bed, y = Sleep.quality)) +
  geom_point(mapping = aes(color = Sleep.quality)) +
  geom_smooth()
#This figure shows that sleep quality has a positive relationship with the amount of total sleep. 

#relationship between heart rate and sleep quality
ggplot(data = sleep_data1, mapping = aes(x = Heart.rate, y = Sleep.quality)) +
  geom_point(mapping = aes(color = Sleep.quality)) +
  geom_smooth()
#This figure shows that sleep quality has a negative relationship with heart rate. 
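The two scatterplots above describe a positive and a negative relationship by eye. A correlation coefficient could make the direction and strength explicit. The sketch below uses invented, deliberately monotonic values (the data frame `toy` is hypothetical) and Spearman's rank correlation, chosen here as an assumption because it does not require a linear relationship.

```r
# Toy data with invented values; column names mirror sleep_data1.
toy <- data.frame(
  Time.in.bed   = c(300, 360, 400, 430, 470, 510, 540),
  Heart.rate    = c(72, 70, 66, 65, 61, 60, 58),
  Sleep.quality = c(55, 62, 70, 74, 83, 90, 96)
)

# Spearman's rho measures monotonic association and does not assume linearity.
rho_tib <- cor(toy$Time.in.bed, toy$Sleep.quality, method = "spearman")
rho_hr  <- cor(toy$Heart.rate,  toy$Sleep.quality, method = "spearman")

# cor.test() adds a p-value for the same coefficient.
hr_test <- cor.test(toy$Heart.rate, toy$Sleep.quality, method = "spearman")

print(c(rho_tib = rho_tib, rho_hr = rho_hr))
print(hr_test$p.value)
```

With real data containing missing values, `use = "complete.obs"` can be passed to `cor()` so that incomplete rows are dropped.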

HW 4.4 For the final project, I will think about ways to show the sleep features together in one graph so that they are easier to compare. If possible, I will apply colors to the graphs as well. As a limitation of 2.1, I didn’t control the amount of sleep for early sleepers and late sleepers. So, I will think about a way to hold the amount of sleep constant and compare the two groups (early vs. late) under the same condition (the same amount of sleep) to see the effects of time to fall asleep and time to wake up on total sleep time and sleep quality.
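Two common ways to compare early and late sleepers at a similar amount of sleep are sketched below on invented data. Both are offered as possibilities rather than the planned method: the column `bed_hour` (a decimal-hour bedtime), the band boundaries, and the early/late cut-off at 24 are all hypothetical.

```r
# Toy data with invented values; bed_hour is a hypothetical decimal-hour bedtime.
toy <- data.frame(
  bed_hour      = c(22.0, 22.5, 23.0, 23.5, 24.0, 24.5, 25.0, 25.5),
  Time.in.bed   = c(500, 430, 480, 410, 470, 400, 450, 390),
  Sleep.quality = c(90, 78, 86, 74, 85, 70, 80, 69)
)

# Option 1: regression with sleep amount as a covariate, so the bed_hour
# coefficient describes bedtime differences at a fixed Time.in.bed.
fit <- lm(Sleep.quality ~ bed_hour + Time.in.bed, data = toy)

# Option 2: stratify. Split nights into sleep-amount bands, label each night
# early or late, and compare mean quality within each band.
toy$sleep_band <- cut(toy$Time.in.bed, breaks = c(380, 440, 510))
toy$bed_group  <- ifelse(toy$bed_hour < 24, "early", "late")
band_means <- aggregate(Sleep.quality ~ sleep_band + bed_group,
                        data = toy, FUN = mean)

print(coef(fit))
print(band_means)
```

The regression uses every night but assumes a linear effect of sleep amount; the stratified comparison makes no such assumption but needs enough nights in each band.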

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Noh (2022, March 23). Data Analytics and Computational Social Science: HW4. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomenoh879265/

BibTeX citation

@misc{noh2022hw4,
  author = {Noh, Eunsol},
  title = {Data Analytics and Computational Social Science: HW4},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomenoh879265/},
  year = {2022}
}