HW3 using sleep data
A. Data information
I picked sleep-related data to look at different features of sleep and how they relate to different conditions. As I am a first-year PhD student in a sleep research lab, my interests are fairly biased toward sleep :). Because I wasn’t sure whether I could use data from the lab I belong to, I imported data from Kaggle (https://www.kaggle.com/danagerous/sleep-data).
The sleep cycle data and other features were recorded between 2014 and 2018 from approximately 180 subjects through the Northcube app. This Swedish app, which is available on iOS, lets users track their daily cycles, including sleep cycles (patterns).
B. Variable information
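From the app, the dataset includes sleep features regarding the start and end of sleep (Start, End), sleep quality (Sleep.quality), time in bed (Time.in.bed), mood at wake-up (Wake.up, renamed Mood.at.awake below), sleep notes (Sleep.Notes), heart rate (Heart.rate), and activity steps.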
The information above is from Ameen MS, Cheung LM, Hauser T, Hahn MA, Schabus M. About the Accuracy and Problems of Consumer Devices in the Assessment of Sleep. Sensors (Basel). 2019;19(19):4160. Published 2019 Sep 25. doi:10.3390/s19194160
coffee_state = Created from “Sleep.Notes”, which holds the subjects’ side notes. Only the coffee state is extracted for this variable.
tea_state = Created from “Sleep.Notes”, which holds the subjects’ side notes. Only the tea state is extracted for this variable.
working_out_state = Created from “Sleep.Notes”, which holds the subjects’ side notes. Only the working-out state is extracted for this variable.
stress_state = Created from “Sleep.Notes”, which holds the subjects’ side notes. Only the stress state is extracted for this variable.
**_TST_mins_** (a variable I created and added to the columns; TST) = time between Start and End (mins)
- Names in italics are variables I created for the further analyses.
C. Goals
1. First, because it wasn’t clear whether “time_in_bed” is different from the total sleep time (TST, my name for the time from “Start” to “End”), I will check whether they differ. In sleep studies, “time_in_bed” (TIB) usually means all the time spent in bed, including time awake as well as TST. TST is the time actually spent asleep, from falling asleep to waking up. So, if the two are different, I will compute sleep efficiency with the following equation: TST/TIB*100.
2. I will see whether there are relationships between the sleep features and the conditions.
2.1 One paper reported “worse mood” in subjects who woke up early compared with those who fell asleep late, assuming they had the same amount of sleep. So, I will look at the relationship between time to fall asleep or time to wake up and sleep quality.
2.2 I will see the effects of coffee consumed during the day on sleep features -> Drank coffee vs. time to fall asleep, time to wake up, sleep quality or TST
2.3 I will see the effects of tea consumed during the day on sleep features -> Drank tea vs. time to fall asleep, time to wake up, sleep quality or TST
2.4 I will see the effects of stress during the day on sleep features -> Stressful day vs. time to fall asleep, time to wake up, sleep quality or TST
2.5 I will see the effects of exercise during the day on sleep features -> Worked out vs. time to fall asleep, time to wake up, sleep quality or TST
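The sleep-efficiency equation from the first goal can be sketched in R as follows. This is only a minimal illustration, not the analysis itself: the helper name `sleep_efficiency` and the toy values are made up for the example, and both TST and TIB are assumed to be in minutes.

```r
# Hypothetical helper: sleep efficiency (%) = TST / TIB * 100.
# Both arguments are assumed to be in minutes.
sleep_efficiency <- function(tst_mins, tib_mins) {
  # Return NA instead of dividing by zero when no time in bed was recorded
  ifelse(tib_mins > 0, tst_mins / tib_mins * 100, NA_real_)
}

# Toy values (made up for illustration, not taken from the dataset)
toy <- data.frame(
  TST_mins    = c(420, 360, 450),
  Time.in.bed = c(480, 400, 450)
)
toy$efficiency <- sleep_efficiency(toy$TST_mins, toy$Time.in.bed)
print(toy)
```

If TST and TIB turn out to be identical for every night, the efficiency is 100 everywhere, which would suggest the two columns measure the same interval.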
library(dplyr) # rename(), select(), mutate(), arrange(), %>%
library(tidyr) # drop_na()
sleep <- read.csv(file = "sleepdatacsv.csv", sep = ";")
sleep_data <- rename(sleep, Mood.at.awake = Wake.up) # changed the col name of Wake.up to Mood.at.awake
sleep_data <- select(sleep_data, 1:7) # excluded the variable of "activity steps", which was the last variable (8th), as many subjects didn't include this information
sleep_data <- sleep_data %>%
  drop_na(Sleep.quality, Sleep.Notes, Mood.at.awake, Heart.rate) # excluded rows with NA in these variables (empty strings are not NA, so rows with blank notes are kept)
#changing hours to minutes
times<-as.POSIXlt(sleep_data$Time.in.bed, format="%H:%M")
sleep_data$Time.in.bed<-times$hour*60+times$min
#changing the percentage character to numeric
sleep_data$Sleep.quality<-as.numeric(sub("%","",sleep_data$Sleep.quality))
# mutate: for each of "Drank coffee", "Drank tea", "Worked out" and "Stressful day",
# flag in its own *_state column whether Sleep.Notes mentions it
sleep_data <- sleep_data %>%
  mutate(
    coffee_state      = if_else(grepl("Drank coffee", Sleep.Notes, fixed = TRUE), "Yes", "No"),
    tea_state         = if_else(grepl("Drank tea", Sleep.Notes, fixed = TRUE), "Yes", "No"),
    working_out_state = if_else(grepl("Worked out", Sleep.Notes, fixed = TRUE), "Yes", "No"),
    stress_state      = if_else(grepl("Stressful day", Sleep.Notes, fixed = TRUE), "Yes", "No"),
    # TST in whole minutes; units are set explicitly because difftime() otherwise chooses them automatically
    TST_mins = as.integer(difftime(End, Start, units = "mins"))
  )
head(arrange(sleep_data, Sleep.quality))
Start End Sleep.quality Time.in.bed
1 2014-12-30 21:17:50 2014-12-30 21:33:54 3 16
2 2015-01-19 05:06:38 2015-01-19 06:20:29 16 73
3 2015-06-05 03:45:52 2015-06-05 05:41:01 23 115
4 2015-05-06 21:47:25 2015-05-07 05:21:38 50 454
5 2015-04-28 21:41:45 2015-04-29 05:00:17 53 438
6 2015-03-04 20:53:47 2015-03-05 06:13:31 54 559
Mood.at.awake Sleep.Notes
1 :| Stressful day
2 :)
3 :)
4 :) Ate late:Drank coffee:Drank tea:Worked out
5 :) Drank coffee:Worked out
6 :) Drank coffee:Drank tea:Stressful day:Worked out
Heart.rate coffee_state tea_state working_out_state stress_state
1 72 No No No Yes
2 58 No No No No
3 57 No No No No
4 59 Yes Yes Yes No
5 59 Yes No Yes No
6 68 Yes Yes Yes Yes
TST_mins
1 16
2 73
3 115
4 454
5 438
6 559
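The six rows printed above already hint at an answer to the first goal: Time.in.bed and TST_mins match in each of them. A rough way to run the same check on a whole table might look like the sketch below. It re-enters only the six printed rows as a stand-in data frame (`printed_rows` and `n_different` are names made up here), so it says nothing yet about the rest of the dataset.

```r
# Stand-in data: Time.in.bed and TST_mins copied from the six rows printed above
printed_rows <- data.frame(
  Time.in.bed = c(16, 73, 115, 454, 438, 559),
  TST_mins    = c(16L, 73L, 115L, 454L, 438L, 559L)
)

# Difference in minutes between time in bed and the Start-to-End interval
printed_rows$diff_mins <- printed_rows$Time.in.bed - printed_rows$TST_mins

# Count the nights on which the two measures disagree
n_different <- sum(printed_rows$diff_mins != 0)
print(summary(printed_rows$diff_mins))
print(n_different)
```

For these six rows the count is zero. If the same holds for the full sleep_data, "Time.in.bed" is just the Start-to-End interval and sleep efficiency cannot be separated from it.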
#save(sleep_data, file = "sleep_data.csv")
# note: the file is written tab-separated even though it has a .csv extension
write.table(sleep_data, file = "sleep_data.csv",
            sep = "\t", row.names = FALSE)
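As a possible starting point for goals 2.2 to 2.5, each state column could be used to group the sleep features and compare their means. The sketch below uses base R's `aggregate` on a stand-in data frame (`demo` and `by_coffee` are names made up here) that copies Sleep.quality, TST_mins and coffee_state from the six rows printed earlier. Those rows are the six lowest sleep-quality nights, so the means are not representative; a real comparison would use the full sleep_data and probably a formal test.

```r
# Stand-in data copied from the six rows printed earlier
demo <- data.frame(
  Sleep.quality = c(3, 16, 23, 50, 53, 54),
  TST_mins      = c(16, 73, 115, 454, 438, 559),
  coffee_state  = c("No", "No", "No", "Yes", "Yes", "Yes")
)

# Mean sleep quality and mean TST for coffee vs. no-coffee nights
by_coffee <- aggregate(cbind(Sleep.quality, TST_mins) ~ coffee_state,
                       data = demo, FUN = mean)
print(by_coffee)
```

The same call with tea_state, working_out_state or stress_state on the right-hand side of the formula would give the corresponding summaries for the other conditions.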
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Noh (2022, March 23). Data Analytics and Computational Social Science: Homework3. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomenoh876604/
BibTeX citation
@misc{noh2022homework3, author = {Noh, Eunsol}, title = {Data Analytics and Computational Social Science: Homework3}, url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomenoh876604/}, year = {2022} }