A few important items (presenting the libraries used) were missing from the previous final submission, so they have been added here. This final project examines how daily features relate to sleep duration, sleep quality and heart rate during sleep. The daily features chosen are ones known to be related to sleep. For example, coffee is known to delay sleep onset, tea is thought to help people fall asleep faster, stress is known to impair sleep quality and delay sleep onset, and working out is associated with better sleep quality and faster sleep onset. These daily features are therefore used in this project to examine their relationship to sleep from several angles.
<Sleep project-how daily features are related to our sleep?>
The data are from Kaggle (https://www.kaggle.com/danagerous/sleep-data). Sleep was recorded with a Swedish iOS application from 180 subjects. Eighteen people were excluded because information was missing for some categories (i.e., total participants = 162).
variables:
#Libraries
library(readr)
library(tidyverse)
library(tidyr)
library(dplyr)
library(lubridate)
library(ggplot2)
library(GGally)
library(hrbrthemes)
library(viridis)
library(ggridges)
library(forcats)
library(patchwork)
library(ggExtra)
library(dygraphs)
library(chron)
library(hexbin)
library(RColorBrewer)
library(hms)
knitr::opts_chunk$set(echo = TRUE)
#setup
setwd('/Users/eunsolnoh/Desktop/dacss601/R') #set path
#import data
sleep<-read.csv(file="sleepdatacsv.csv",sep=";")
sleep_data<-rename(sleep,Mood.at.awake=Wake.up)
#change one col name of Wake.up to Mood.at.awake for better understanding
sleep_data = select(sleep_data, 1:7) #excluded the variable of "activity steps",
#which was the last variable (8th) as many subjects didn't include this information
sleep_data<-sleep_data %>%
drop_na(Sleep.quality,Sleep.Notes,Mood.at.awake,Heart.rate) #excluded rows that have
#blank for some variables
#convert hours to minutes for the column of "time in bed" in order to compare it
#with total sleep time that was the column created by subtracting sleep start time
#from sleep end time.
times<-as.POSIXlt(sleep_data$Time.in.bed, format="%H:%M")
sleep_data$Time.in.bed<-times$hour*60+times$min
#changing the percentage character of the column of "sleep quality" to numeric
sleep_data$Sleep.quality<-as.numeric(sub("%","",sleep_data$Sleep.quality))
#mutate: Sleep.Notes are divided to columns corresponding to "Drank coffee",
#"Drank tea", "Worked out" and "Stressful day"in the variable named states for
#each for further analyses.
sleep_data<-sleep_data %>%
mutate(coffee_state = case_when(
grepl("Drank coffee",Sleep.Notes) == TRUE ~ 'Yes', #yes
grepl("Drank coffee",Sleep.Notes) == FALSE ~'No' #no
)) %>%
mutate(tea_state = case_when(
grepl("Drank tea",Sleep.Notes) == TRUE ~ 'Yes', #yes
grepl("Drank tea",Sleep.Notes) == FALSE ~ 'No' #no
)) %>%
mutate(working_out_state = case_when(
grepl("Worked out",Sleep.Notes) == TRUE ~ 'Yes', #yes
grepl("Worked out",Sleep.Notes) == FALSE ~ 'No' #no
)) %>%
mutate(stress_state = case_when(
grepl("Stressful day",Sleep.Notes) == TRUE ~ 'Yes', #yes
grepl("Stressful day",Sleep.Notes) == FALSE ~ 'No' #no
)) %>%
#compute the time between sleep start and sleep end in minutes in order to check
#whether it is the same as time in bed (the 4th column); the units are set
#explicitly so that difftime() does not choose them automatically.
mutate(TST_mins = as.integer(difftime(End,Start,units="mins")))
head(arrange(sleep_data, Sleep.quality))
Start End Sleep.quality Time.in.bed
1 2014-12-30 21:17:50 2014-12-30 21:33:54 3 16
2 2015-01-19 05:06:38 2015-01-19 06:20:29 16 73
3 2015-06-05 03:45:52 2015-06-05 05:41:01 23 115
4 2015-05-06 21:47:25 2015-05-07 05:21:38 50 454
5 2015-04-28 21:41:45 2015-04-29 05:00:17 53 438
6 2015-03-04 20:53:47 2015-03-05 06:13:31 54 559
Mood.at.awake Sleep.Notes
1 :| Stressful day
2 :)
3 :)
4 :) Ate late:Drank coffee:Drank tea:Worked out
5 :) Drank coffee:Worked out
6 :) Drank coffee:Drank tea:Stressful day:Worked out
Heart.rate coffee_state tea_state working_out_state stress_state
1 72 No No No Yes
2 58 No No No No
3 57 No No No No
4 59 Yes Yes Yes No
5 59 Yes No Yes No
6 68 Yes Yes Yes Yes
TST_mins
1 16
2 73
3 115
4 454
5 438
6 559
#save the cleaned data just in case (tab-separated, despite the .csv extension).
#save(sleep_data, file = "sleep_data.csv")
write.table(sleep_data, file = "sleep_data.csv",
sep = "\t", row.names = FALSE)
My initial plan included an analysis of the relationship between actual total sleep time and time in bed. When I compared the Time.in.bed and TST_mins (total sleep time in minutes) columns, they were the same. In sleep research these terms have different meanings: time in bed is the time spent in bed, including time awake before and after the actual sleep, whereas total sleep time usually refers to the actual sleep only. Because it was not clear whether the dataset distinguished the two, I computed TST_mins by subtracting the sleep start time from the sleep end time. TST_mins turned out to be the same as time in bed. Therefore, TST_mins was dropped from further analyses because the dataset does not contain enough information about the actual sleep time.
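The check described above can be sketched on a couple of rows. This is only an illustration: the two Start/End pairs are the first two rows printed by head() earlier, while the "H:MM" strings for Time.in.bed, the toy data frame and the helper column names (tib_mins, tst_mins) are assumptions made for the example.

```r
# Two rows in the layout of the app export; Time.in.bed strings are assumed.
toy <- data.frame(
  Start       = c("2014-12-30 21:17:50", "2015-01-19 05:06:38"),
  End         = c("2014-12-30 21:33:54", "2015-01-19 06:20:29"),
  Time.in.bed = c("0:16", "1:13"),
  stringsAsFactors = FALSE
)

# Time in bed, converted from "H:MM" to whole minutes.
tib <- as.POSIXlt(toy$Time.in.bed, format = "%H:%M", tz = "UTC")
toy$tib_mins <- tib$hour * 60 + tib$min

# Elapsed minutes between the two timestamps, with the units fixed explicitly.
start <- as.POSIXct(toy$Start, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
end   <- as.POSIXct(toy$End,   format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
toy$tst_mins <- as.integer(difftime(end, start, units = "mins"))

# TRUE only if the two durations agree on every row.
same_duration <- all(toy$tib_mins == toy$tst_mins)
```

If same_duration is TRUE for the whole dataset, the two columns carry the same information, which is the situation described above.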
Further analyses were conducted with the cleaned dataset.
sleep_data<-rename(sleep_data,Total.sleep.time=Time.in.bed) #rename the column
#Time.in.bed to Total.sleep.time, since the previous step showed that they have the
#same meaning in this dataset.
#convert date-time to time of day, as only the time information is used in the further analyses.
sleep_data$Start<-as.POSIXct(sleep_data$Start, format = "%Y-%m-%d %H:%M:%S", tz = "EST5EDT")
sleep_data$Start <- format(sleep_data$Start, format = "%H:%M:%S")
sleep_data$End<-as.POSIXct(sleep_data$End, format = "%Y-%m-%d %H:%M:%S", tz = "EST5EDT")
sleep_data$End <- format(sleep_data$End, format = "%H:%M:%S")
1. What daily features (e.g., drinking coffee) affect sleep features (quality, duration and heart rate during sleep) and sleep onset/offset time.
1.1 daily features -> sleep features?
#Bring needed information columns
sleep_data1_1 <- sleep_data[ , c(3,4,7:11)]
###############################sleep quality#################################
# Create new_sleep for sleep quality
set.seed(112)
new_sleep<- matrix(0,4,4)
colnames(new_sleep) <- c("90~100","80~90","70~80","<=70")
#row order must follow columns 4:7 of sleep_data1_1 (coffee, tea, working out, stress)
rownames(new_sleep) <- c("drinking coffee","drinking tea","working out","stress")
for (y in 4:7) {
y1<-y-3
for (x in 1:162) {
if(sleep_data1_1[x,1]> 90 & sleep_data1_1[x,1] <= 100 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep[y1,1]=new_sleep[y1,1]+1; #adding 1 if sleep quality is between 90 and 100
}
}
else if(sleep_data1_1[x,1]> 80 & sleep_data1_1[x,1] <= 90 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep[y1,2]=new_sleep[y1,2]+1; #adding 1 if sleep quality is between 80 and 90
}
}
else if(sleep_data1_1[x,1]> 70 & sleep_data1_1[x,1] <= 80 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep[y1,3]=new_sleep[y1,3]+1; #adding 1 if sleep quality is between 70 and 80
}
}
else if(sleep_data1_1[x,1] <= 70 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep[y1,4]=new_sleep[y1,4]+1; #adding 1 if sleep quality is less than or equal to 70
}
}
}
}
###############################Total sleep amount###############################
# Create new_sleep1 for total sleep amount
set.seed(112)
new_sleep1<- matrix(0,4,4)
colnames(new_sleep1) <- c(">8","7~8(typical)","5~7","<5")
#row order must follow columns 4:7 of sleep_data1_1 (coffee, tea, working out, stress)
rownames(new_sleep1) <- c("drinking coffee","drinking tea","working out","stress")
for (y in 4:7) {
y1<-y-3
for (x in 1:162) {
if(sleep_data1_1[x,2]> 480) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep1[y1,1]=new_sleep1[y1,1]+1; #adding 1 if sleep amount (hours) is more than 8 hours
}
}
else if(sleep_data1_1[x,2]>=420 & sleep_data1_1[x,2] <= 480 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep1[y1,2]=new_sleep1[y1,2]+1; #adding 1 if sleep amount (hours) is between 7 and 8 hours
}
}
else if(sleep_data1_1[x,2]>= 300 & sleep_data1_1[x,2] < 420 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep1[y1,3]=new_sleep1[y1,3]+1; #adding 1 if sleep amount (hours) is between 5 and 7 hours
}
}
else if(sleep_data1_1[x,2] < 300 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep1[y1,4]=new_sleep1[y1,4]+1; #adding 1 if sleep amount (hours) is less than 5 hours
}
}
}
}
###############################Heart rate######################################
# Create new_sleep2 for heart rate
set.seed(112)
new_sleep2<- matrix(0,4,4)
colnames(new_sleep2) <- c(">80","50~80","40~50(typical)","<=40")
#row order must follow columns 4:7 of sleep_data1_1 (coffee, tea, working out, stress)
rownames(new_sleep2) <- c("drinking coffee","drinking tea","working out","stress")
for (y in 4:7) {
y1<-y-3
for (x in 1:162) {
if(sleep_data1_1[x,3]> 80 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep2[y1,1]=new_sleep2[y1,1]+1; #adding 1 if heart rate (bpm) is more than 80
}
}
else if(sleep_data1_1[x,3]> 50 & sleep_data1_1[x,3] <= 80 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep2[y1,2]=new_sleep2[y1,2]+1; #adding 1 if heart rate (bpm) is between 50 and 80
}
}
else if(sleep_data1_1[x,3]> 40 & sleep_data1_1[x,3] <= 50 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep2[y1,3]=new_sleep2[y1,3]+1; #adding 1 if heart rate (bpm) is between 40 and 50
}
}
else if(sleep_data1_1[x,3] <=40 ) {
if(sleep_data1_1[x,y]=="Yes") {
new_sleep2[y1,4]=new_sleep2[y1,4]+1; #adding 1 if heart rate (bpm) is less than or equal to 40
}
}
}
}
#######################################plot#####################################
# Grouped barplot
barplot(new_sleep,
border="white",
font.axis=2,
beside=T,
col = 1:nrow(new_sleep),
legend.text = TRUE,
args.legend = list(x = "topleft",
inset = c(- 0.001, 0)),
xlab="Sleep quality(%)",
ylab="Numbers",
font.lab=2)
# Grouped barplot
barplot(new_sleep1,
border="white",
font.axis=2,
beside=T,
col = 1:nrow(new_sleep1),
legend.text = TRUE,
args.legend = list(x = "topright",
inset = c(- 0.001, 0)),
xlab="Total sleep time(Hrs)",
ylab="Numbers",
font.lab=2)
# Grouped barplot
barplot(new_sleep2,
border="white",
font.axis=2,
beside=T,
col = 1:nrow(new_sleep2),
legend.text = TRUE,
args.legend = list(x = "topright",
inset = c(- 0.001, 0)),
xlab="Heart rate(BPM)",
ylab="Numbers",
font.lab=2)
1.1
-Method: I used grouped bar plots to show the effect of each daily feature on each sleep feature. First, I divided each sleep feature into categories (good, normal and so on). In this step, I generated new_sleep, new_sleep1 and new_sleep2, one for each sleep feature, which hold the number of participants who met each condition. For example, for sleep quality, if a participant had sleep quality above 90 percent and had coffee, 1 is added to the corresponding element. The goal of this analysis is to show approximate good ranges for each category. For instance, sleep quality is considered good if it falls within 90~100%, since a sleep quality slightly below 100% is not usually regarded as bad sleep. Likewise, I defined categories for the normal (i.e., typical or good) range and for values outside it, in order to show how many people fall within the good range in each category and which daily features are related to each category.
-Result:
1) Sleep quality: Less stress seems to lead to better sleep quality. Drinking more coffee and tea appears to be inversely related to sleep quality.
2) Total sleep time: The effect looks small, but working out plays a slight role in longer sleep. The other variables do not show meaningful effects.
3) Heart rate: Notably, having coffee, tea, stress or a workout on the day of sleep is associated with a higher heart rate during sleep, relative to the known normal heart rate range during sleep (40~50 bpm).
4) Overall, the number of people who worked out is low compared with the other daily features in all three categories, indicating that few people worked out on the day of sleep in this dataset, so working out did not produce meaningful results for any of the three sleep features.
-Summary:
1) less stress, coffee and tea -> better-quality sleep
2) working out -> more time to sleep
3) more coffee, tea, stress and working out -> higher heart rate
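The binning-and-counting step described in the Method for 1.1 can also be written without explicit loops. The following is a minimal sketch on made-up values, not the code used for the figures: the toy data frame, the helper count_yes() and the use of cut() are assumptions for illustration, and only two of the four daily features are shown.

```r
# Toy data; column names mirror the cleaned data, values are made up.
toy <- data.frame(
  Sleep.quality = c(95, 85, 72, 60, 91, 88),
  coffee_state  = c("Yes", "No", "Yes", "Yes", "No", "Yes"),
  tea_state     = c("No", "Yes", "Yes", "No", "No", "Yes")
)

# Right-closed bins: [0,70], (70,80], (80,90], (90,100].
toy$quality_bin <- cut(
  toy$Sleep.quality,
  breaks = c(0, 70, 80, 90, 100),
  labels = c("<=70", "70~80", "80~90", "90~100"),
  include.lowest = TRUE
)

# Count the "Yes" rows per bin for one daily feature (hypothetical helper).
count_yes <- function(flag) {
  as.integer(table(toy$quality_bin[flag == "Yes"]))
}

# One row per daily feature, one column per bin.
counts <- rbind(
  "drinking coffee" = count_yes(toy$coffee_state),
  "drinking tea"    = count_yes(toy$tea_state)
)
colnames(counts) <- levels(toy$quality_bin)
```

Because the row names are attached next to the column they are computed from, the labels cannot drift out of order relative to the data, and a matrix like counts can be passed to barplot(counts, beside = TRUE) in the same way as new_sleep.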
1.2 daily features -> sleep onset/offset time?
sleep_data1_2<-sleep_data[ , c(1,2,8:11)]
#change the string to time format for the columns of sleep start time and sleep end time
sleep_data1_2$Start<-strptime(sleep_data1_2$Start, format ="%H:%M:%S")
sleep_data1_2$Start<-as.POSIXct(sleep_data1_2$Start, format ="%H:%M:%S")
sleep_data1_2$End<-strptime(sleep_data1_2$End, format ="%H:%M:%S")
sleep_data1_2$End<-as.POSIXct(sleep_data1_2$End, format ="%H:%M:%S")
# relationship between sleep onset and daily features
p1 <- ggplot(sleep_data1_2, aes(x=coffee_state, y=Start)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
theme(legend.position="none")+
xlab("coffee_state") +
ylab("sleep onset")
p2 <- ggplot(sleep_data1_2, aes(x=tea_state, y=Start)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
xlab("tea_state") +
ylab("sleep onset")
p3 <- ggplot(sleep_data1_2, aes(x=working_out_state, y=Start)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
xlab("workingout_state") +
ylab("sleep onset")
p4 <- ggplot(sleep_data1_2, aes(x=stress_state, y=Start)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
xlab("stress_state") +
ylab("sleep onset")
# Display the four charts together with the patchwork package
p1 + p2 + p3 +p4
# relationship between sleep offset and daily features
p1 <- ggplot(sleep_data1_2, aes(x=coffee_state, y=End)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
xlab("coffee_state") +
ylab("sleep offset")
p2 <- ggplot(sleep_data1_2, aes(x=tea_state, y=End)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
xlab("tea_state") +
ylab("sleep offset")
p3 <- ggplot(sleep_data1_2, aes(x=working_out_state, y=End)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
xlab("workingout_state") +
ylab("sleep offset")
p4 <- ggplot(sleep_data1_2, aes(x=stress_state, y=End)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
xlab("stress_state") +
ylab("sleep offset")
# Display the four charts together with the patchwork package
p1 + p2 + p3 +p4
1.2
-Method: First, I converted the start and end strings (sleep onset and offset time) to time format using strptime() and as.POSIXct(). I then used geom_boxplot() to show the effects of daily features on sleep onset and offset time.
-Result:
1) Sleep onset time by daily feature: No clear difference in sleep onset time was observed for any daily feature.
2) Sleep offset time by daily feature: Those who drank coffee woke up slightly later than those who did not have coffee that day. Those who had a stressful day woke up much later than those who did not, suggesting that stress may play a role in delaying wake-up time.
-Summary:
1) Coffee -> slightly later wake-up time
2) Stress -> later wake-up time
1.3 The relationship between sleep start time and sleep end time
sleep_data1_3 <-sleep_data[ , c(1,2,5,8:11)]
#use a different function, chron(), to represent time in the relationship between sleep
#onset and offset; the 24-hour scale is converted to 0~1, with 1 as the maximum value.
sleep_data1_3$chrons<-chron(times=sleep_data1_3$Start)
sleep_data1_3$chrone<-chron(times=sleep_data1_3$End)
#taking hour of time only for the relationship between sleep onset and offset
sleep_data1_3$start_h<-as_hms(sleep_data1_3$Start) %>% hour
sleep_data1_3$end_h<-as_hms(sleep_data1_3$End) %>% hour
# scatter plot from time (including minutes)
ggplot(data = sleep_data1_3, mapping = aes(x = chrons, y = chrone)) +
geom_point() +
xlab("sleep start time") +
ylab("wake-up time")+
ggtitle("sleep onset vs. sleep offset (0~24=>0~1)")+
geom_smooth()
# scatter plot from hour only of the time
ggplot(data = sleep_data1_3, mapping = aes(x = start_h, y = end_h)) +
geom_point() +
xlab("sleep start time") +
ylab("wake-up time")+
ggtitle("sleep onset vs. sleep offset (0~24)")+
geom_smooth()
1.3
-Method: I converted the times to whole hours using as_hms() and hour(), since the detailed information (i.e., minutes) did not make a meaningful difference compared with the data that have hour information only.
-Result: In the hour-based scatter plot, people woke up before 10 am regardless of when they fell asleep (except for one person who woke up at night), indicating that sleep onset time did not affect wake-up time.
-Summary:
1) No effects of sleep onset time on sleep offset time
2) Most people woke up before 10 am regardless of sleep onset time (except for 1 person).
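The two time scales used in 1.3 (whole hours on 0~24 and the 0~1 day fraction) can be reproduced with base R alone. This is a sketch under assumptions: the three clock times are illustrative, and the variable names start_chr, start_lt, start_dec and start_frac are made up for the example.

```r
# Clock times as "HH:MM:SS" strings (illustrative values).
start_chr <- c("21:17:50", "05:06:38", "23:45:00")

# Parse against a fixed time zone; only the clock fields are used below.
start_lt <- as.POSIXlt(start_chr, format = "%H:%M:%S", tz = "UTC")

# Whole hour of the day, comparable to hour(as_hms(x)).
start_h <- start_lt$hour

# Decimal hours keep the minutes and seconds on the 0~24 scale.
start_dec <- start_lt$hour + start_lt$min / 60 + start_lt$sec / 3600

# Fraction of a day, i.e. the 0~1 scale on which chron times are stored.
start_frac <- start_dec / 24
```

Plotting start_dec instead of start_h would keep the minute-level detail while still giving an axis that reads directly in hours.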
1.4 coffee and tea -> sleep features & stress -> sleep offset time?
############Sleep features with coffee and/or tea#############################
ggplot(sleep_data, aes(fill=coffee_state, y=Sleep.quality, x=coffee_state)) +
geom_bar(position="dodge", stat="identity") +
scale_fill_viridis(discrete = T, option = "E") +
ggtitle(" Sleep quality depending on coffee and tea intake") +
facet_wrap(~tea_state) +
#facet_wrap(~coffee_state)+
theme_ipsum() +
theme(legend.position="none") +
xlab("Coffee state") +
ylab("Sleep quality")
ggplot(sleep_data, aes(fill=coffee_state, y=Total.sleep.time, x=coffee_state)) +
geom_bar(position="dodge", stat="identity") +
scale_fill_viridis(discrete = T, option = "E") +
ggtitle(" Total sleep time depending on coffee and tea intake") +
facet_wrap(~tea_state) +
#facet_wrap(~coffee_state)+
theme_ipsum() +
theme(legend.position="none") +
xlab("Coffee state") +
ylab("Total sleep time")
ggplot(sleep_data, aes(fill=coffee_state, y=Heart.rate, x=coffee_state)) +
geom_bar(position="dodge", stat="identity") +
scale_fill_viridis(discrete = T, option = "E") +
ggtitle(" Heart rate depending on coffee and tea intake") +
facet_wrap(~tea_state) +
#facet_wrap(~coffee_state)+
theme_ipsum() +
theme(legend.position="none") +
xlab("Coffee state") +
ylab("Heart rate")
#####Sleep offset time in terms of stress level in different daily features#####
ggplot(sleep_data1_3, aes(fill=stress_state, y=end_h, x=stress_state)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
scale_fill_viridis(discrete = T, option = "E") +
ggtitle("Sleep offset time depending on stress and coffee intake") +
facet_wrap(~coffee_state) +
theme_ipsum() +
theme(legend.position="none") +
xlab("Stress state") +
ylab("Sleep offset time")
ggplot(sleep_data1_3, aes(fill=stress_state, y=end_h, x=stress_state)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
scale_fill_viridis(discrete = T, option = "E") +
ggtitle("Sleep offset time depending on stress and tea intake") +
facet_wrap(~tea_state) +
theme_ipsum() +
theme(legend.position="none") +
xlab("Stress state") +
ylab("Sleep offset time")
ggplot(sleep_data1_3, aes(fill=stress_state, y=end_h, x=stress_state)) +
geom_boxplot(fill="slateblue", alpha=0.2) +
scale_fill_viridis(discrete = T, option = "E") +
ggtitle("Sleep offset time depending on stress and working-out state") +
facet_wrap(~working_out_state) +
theme_ipsum() +
theme(legend.position="none") +
xlab("Stress state") +
ylab("Sleep offset time")
1.4
-Method: Based on the previous results (1.1~1.3), further analyses of coffee and tea were conducted using facet_wrap(). Since the numbers of those who had coffee and tea were almost the same in each category of each sleep feature, the further analysis was made with geom_bar(). In addition, stress seemed to delay wake-up time compared with days without stress (1.2), which led me to look at the effect of stress on sleep offset time depending on yes or no in the other daily features, using geom_boxplot().
-Result:
1) Sleep quality: This does not show any meaningful result. Sleep quality does not seem to be affected by tea and/or coffee.
2) Total sleep time: Participants slept the most when they had tea but no coffee. In contrast, total sleep time was lowest when they had neither coffee nor tea.
3) Heart rate: Heart rate during sleep appears to be higher, and far outside the normal range, on days the participants drank coffee, regardless of tea intake.
4) Sleep offset time with stress and coffee intake: Stress was associated with a slightly later sleep offset time regardless of coffee intake.
5) Sleep offset time with stress and tea intake: Those who had stress without tea woke up much later than those who had stress with tea.
6) Sleep offset time with stress and working-out state: Stress was associated with a slightly later wake-up time regardless of working out. The delay was slightly larger for stress without working out.
-Summary:
1) Increased total sleep time with tea but without coffee
2) Lower heart rate without coffee regardless of tea
3) Delayed sleep offset time with stress across all daily features, with the largest delay in wake-up time for stress without tea
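One point worth checking for the bar charts in 1.4: as far as I understand ggplot2, geom_bar(stat="identity") draws one bar per row, and bars that share the same x position and fill are drawn on top of each other, so the visible height is likely to be the largest value in each coffee/tea cell rather than its average. The sketch below, on made-up values, computes both the mean and the maximum per cell so the two can be compared; the toy data frame and the names cell_means, cell_max and p_mean are assumptions for illustration.

```r
# Toy data; column names mirror the cleaned data, values are made up.
toy <- data.frame(
  coffee_state     = c("Yes", "Yes", "No", "No", "Yes", "No"),
  tea_state        = c("Yes", "No", "Yes", "No", "Yes", "No"),
  Total.sleep.time = c(420, 450, 500, 380, 460, 400)
)

# Mean total sleep time per coffee x tea cell.
cell_means <- aggregate(Total.sleep.time ~ coffee_state + tea_state,
                        data = toy, FUN = mean)

# Largest value per cell, for comparison with the mean.
cell_max <- aggregate(Total.sleep.time ~ coffee_state + tea_state,
                      data = toy, FUN = max)

# Optional: a bar chart of the cell means (built only if ggplot2 is installed).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p_mean <- ggplot2::ggplot(
    toy,
    ggplot2::aes(x = coffee_state, y = Total.sleep.time, fill = coffee_state)
  ) +
    ggplot2::geom_bar(stat = "summary", fun = "mean") +
    ggplot2::facet_wrap(~tea_state)
}
```

Where the mean and the maximum of a cell differ noticeably, a chart of the means is the safer basis for statements about averaged sleep time or heart rate.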
2.1 Any relationship among sleep features?
sleep_data2 <- sleep_data[ , c(3,4,7)]
ggpairs(sleep_data2, title="2. Correlations among sleep features")
#relationship between total sleep time (time in bed) and sleep quality
ggplot(data = sleep_data, mapping = aes(x = Total.sleep.time, y = Sleep.quality)) +
geom_point() +
geom_smooth()
2.1
-Method: ggpairs() was used to examine the relationship between each pair of sleep features. Because it showed a strong relationship between total sleep time and sleep quality (corr: 0.722), that pair was also plotted as a scatter plot.
-Result & Summary: There is a strong correlation between total sleep time and sleep quality (corr: 0.722).
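The coefficient reported by ggpairs() for a pair of numeric columns is, to my understanding, a Pearson correlation by default, so it can be cross-checked with base R. The sketch below uses made-up values, so it does not reproduce the 0.722 above; the toy data frame and the names r and ct are assumptions for illustration.

```r
# Toy data; values are made up and only illustrate the calls.
toy <- data.frame(
  Total.sleep.time = c(300, 360, 420, 450, 480, 540),
  Sleep.quality    = c(50, 62, 70, 81, 85, 93)
)

# Pearson correlation between the two columns.
r <- cor(toy$Total.sleep.time, toy$Sleep.quality, method = "pearson")

# The same estimate together with a confidence interval and p-value.
ct <- cor.test(toy$Total.sleep.time, toy$Sleep.quality)
```

Running the same two calls on sleep_data$Total.sleep.time and sleep_data$Sleep.quality would give the value to compare with the ggpairs() panel.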
3. sleep features and sleep onset/offset time affect mood at awake
3.1 sleep features affect mood at awake?
sleep_data3_1 <-sleep_data[ , c(1,2,3,4,5,7)]
# relationship between mood at awake and sleep features
ggplot(sleep_data3_1, aes(x=Mood.at.awake, y=Sleep.quality, fill=Mood.at.awake)) +
# fill=name allow to automatically dedicate a color for each group
geom_violin() +
ggtitle("Quality") +
theme_ipsum()
ggplot(sleep_data3_1, aes(x=Mood.at.awake, y=Total.sleep.time, fill=Mood.at.awake)) +
# fill=name allow to automatically dedicate a color for each group
geom_violin() +
ggtitle("Total sleep time") +
theme_ipsum()
ggplot(sleep_data3_1, aes(x=Mood.at.awake, y=Heart.rate, fill=Mood.at.awake)) +
# fill=name allow to automatically dedicate a color for each group
geom_violin() +
ggtitle("Heart rate") +
theme_ipsum()
# Display both charts side by side thanks to the patchwork package
#p1 + p2 +p3
3.1
-Method: Mood at awake was shown for each sleep feature with geom_violin().
-Result & Summary: No clear differences between moods at awake were seen for any of the sleep features.
3.2 Sleep onset/offset affects mood at awake?
##############################Mood at awake######################################
###############################Sleep Start######################################
#Plot
sleep_data1_3 %>%
mutate(text = fct_reorder(Mood.at.awake, start_h)) %>%
ggplot(aes(y=Mood.at.awake, x=start_h, fill=text)) +
geom_density_ridges(alpha=0.6, stat="binline", bins=20) +
theme_ridges() +
theme(
legend.position="none",
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
xlab("Sleep Start Time (0~24)") +
ylab("Mood at awake")
###############################Sleep End######################################
#Plot
sleep_data1_3 %>%
mutate(text = fct_reorder(Mood.at.awake, end_h)) %>%
ggplot(aes(y=Mood.at.awake, x=end_h, fill=Mood.at.awake)) +
geom_density_ridges(alpha=0.6, stat="binline", bins=20) +
theme_ridges() +
theme(
legend.position="none",
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8)
) +
xlab("Sleep End Time(0~24)") +
ylab("Mood at awake")
3.2
-Method: I used ggplot() with geom_density_ridges() to plot the effects of sleep onset/offset time on mood at awake; the x axis indicates hours from 0 to 24.
-Result & Summary:
1) A later time to fall asleep is seen both among people who felt good and among people who felt bad at awake, which indicates that a later time to fall asleep affects mood at awake in different ways.
2) On the other hand, most participants woke up early, whether they felt good or bad at awake.
3.3 total sleep time and sleep quality -> mood at awake?
#Mood at awake with time in bed(TST) and sleep quality
ggplot(data = sleep_data, mapping = aes(x = Total.sleep.time, y = Sleep.quality)) +
geom_point(mapping = aes(color = Mood.at.awake)) +
geom_smooth()
mm<-table(sleep_data$Mood.at.awake)
as.data.frame(table(sleep_data$Mood.at.awake))
Var1 Freq
1 :) 147
2 :| 15
-Method: I used ggplot() with a scatter plot because I sometimes feel that either sleep amount or sleep quality affects my mood at awake, so I wanted to see whether both or either one affects mood at awake. The number of people in each mood at awake was then shown with table().
-Result & Summary: Overall, more people felt good at awake (n=147) than felt bad at awake (n=15), and no clear relationship between sleep features (such as sleep quality or total sleep time) and mood at awake was observed.
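The imbalance between the two mood groups can be expressed as shares as well as counts. This is a small sketch that rebuilds the mood vector from the two frequencies printed by table() above; the names mood, mood_counts and mood_share are assumptions for illustration.

```r
# Rebuild the mood column from the frequencies shown above (147 and 15).
mood <- c(rep(":)", 147), rep(":|", 15))

# Counts per mood, as in table(sleep_data$Mood.at.awake).
mood_counts <- table(mood)

# Percentage of rows in each mood, rounded to one decimal place.
mood_share <- round(100 * prop.table(mood_counts), 1)
```

With fewer than one in ten rows in the ":|" group, the violin and ridge shapes for that group rest on only 15 observations, which is worth keeping in mind when comparing the two moods.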
Using R for data analysis is necessary in the lab that I belong to. As I had not used it before, I wanted to learn it. At first, everything was unfamiliar to me. However, as I took the lectures and saw other classmates’ work, I got motivated and learned a lot from them. I am sure that knowing various functions and applying them to the homework will be very helpful for me. In particular, through this final project I learned that the packages loaded with library() need to be cited, which is important for me to remember.
I had a hard time figuring out how to plot time (hours) on the axes of some graphs. As the times were stored as character type, plotting them did not work in some graphs. Luckily, after spending a few weeks trying various functions, I was able to figure it out. I wish I had known about the “as_hms” and “chron” functions a little earlier, as they were exactly what I was looking for. A few days ago I noticed that the website from which I imported the dataset for this project had been updated with a new file of sleep information. The new dataset has the information that I originally wanted, namely the actual sleep time. Because it was updated a few weeks ago and I only noticed it a few days ago, I was not able to use the new data for the final project, as I had already done a lot of processing with my current dataset. I would therefore like to use the updated sleep information for extended analyses in addition to my current ones. I also saw many great projects from classmates, which had different topics and used different functions in R. I want to study them and try some functions that I have not used in my current project, which will help me learn more functions and algorithms.
a. Because similar numbers of participants had coffee and tea in each category of sleep
features, further analysis with facet_wrap() was conducted by controlling one effect
(e.g., coffee) to see the other effect (e.g., tea). Total sleep time was highest
without coffee but with tea, whereas the averaged sleep time was lowest in
the participants who had neither tea nor coffee. Heart rate was higher with coffee
regardless of having tea or not.
b. Stress showed an effect on sleep offset time in 4). Thus, the effect of stress
on sleep offset time was further analyzed when other daily habits were controlled.
Stress slightly delayed sleep offset time regardless of the other daily habits, except for
having tea. Those who had stress without tea on the day of sleep woke up much
later than those who had tea on the stressful day.
DANA DIOTTE. sleepdata.csv. Retrieved on 03/05/2022 from https://www.kaggle.com/datasets/danagerous/sleep-data.
Ameen, Mohamed S et al. “About the Accuracy and Problems of Consumer Devices in the Assessment of Sleep.” Sensors (Basel, Switzerland) vol. 19,19 4160. 25 Sep. 2019, doi:10.3390/s19194160
RStudio Team (2022). RStudio: Integrated Development Environment for R. RStudio, PBC, Boston, MA, http://www.rstudio.com/.
Hadley Wickham, Jim Hester and Jennifer Bryan (2022). readr: Read Rectangular Text Data. R package version 2.1.2. https://CRAN.R-project.org/package=readr
Wickham et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2022). dplyr: A Grammar of Data Manipulation. R package version 1.0.8. https://CRAN.R-project.org/package=dplyr
Garrett Grolemund, Hadley Wickham (2011). Dates and Times Made Easy with lubridate. Journal of Statistical Software, 40(3), 1-25. URL https://www.jstatsoft.org/v40/i03/.
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
Bob Rudis (2020). hrbrthemes: Additional Themes, Theme Components and Utilities for ‘ggplot2’. R package version 0.8.0. https://CRAN.R-project.org/package=hrbrthemes
Simon Garnier, Noam Ross, Robert Rudis, Antônio P. Camargo, Marco Sciaini, and Cédric Scherer (2021). viridis - Colorblind-Friendly Color Maps for R. R package version 0.6.2.
Claus O. Wilke (2021). ggridges: Ridgeline Plots in ‘ggplot2’. R package version 0.5.3. https://CRAN.R-project.org/package=ggridges
Wickham, H. (2021). forcats: Tools for Working with Categorical Variables (Factors). R package version 0.5.1. https://CRAN.R-project.org/package=forcats
Thomas Lin Pedersen (2020). patchwork: The Composer of Plots. R package version 1.1.1. https://CRAN.R-project.org/package=patchwork
Dean Attali and Christopher Baker (2022). ggExtra: Add Marginal Histograms to ‘ggplot2’, and More ‘ggplot2’ Enhancements. R package version 0.10.0. https://CRAN.R-project.org/package=ggExtra
David James and Kurt Hornik (2020). chron: Chronological Objects which Can Handle Dates and Times. R package version 2.3-56.
Erich Neuwirth (2022). RColorBrewer: ColorBrewer Palettes. R package version 1.1-3. https://CRAN.R-project.org/package=RColorBrewer
Kirill Müller (2021). hms: Pretty Time of Day. R package version 1.1.1. https://CRAN.R-project.org/package=hms
Wickham, H. & Grolemund, G. (n.d.). R for data science [eBook edition]. O’Reilly. https://r4ds.had.co.nz/index.html
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Noh (2022, May 19). Data Analytics and Computational Social Science: Final_Project (2nd), UPDATED VERSION. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomenoh901740/
BibTeX citation
@misc{noh2022final_project, author = {Noh, Eunsol}, title = {Data Analytics and Computational Social Science: Final_Project (2nd), UPDATED VERSION}, url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomenoh901740/}, year = {2022} }