Final Project Checkin-1

finalpart1
Template of course blog qmd file
Author

Xiaoyan

Published

March 17, 2023

Code
library(tidyr)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Code
library(readxl)
library(ggplot2)

Introduction and background

The Chinese government implemented the one-child policy in 1979, which increased the proportion of one-child families and produced the “four-two-one” family structure of four grandparents, two parents, and one child. Despite having relatively more family and social resources, only children may face physical and socio-psychological problems during development, including an elevated risk of overweight and obesity and negative psychosocial consequences. Previous studies have shown that only children are more likely to be overweight or obese than children with one or more siblings. Beyond obesity, mental health is also worth exploring: this project looks at how overweight/obesity and the number of siblings relate to mental health in young adolescents.

Research questions

  1. Is obesity associated with mental health (depression and anxiety)?
  2. Which factors affect mental health?
  3. Is the number of siblings or obesity directly related to mental health?

Key predictors

  1. mental health
  2. number of siblings
  3. obesity rate
  4. gender

Hypotheses

  1. A higher obesity rate increases the risk of depression.
  2. A higher family income increases the rate of obesity.
  3. Having more siblings reduces the risk of both depression and anxiety.
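One possible way to examine hypotheses like these is a linear regression of a depression score on BMI and the number of siblings. The sketch below is only an illustration, not the analysis used in this project: it runs on a simulated data frame called `toy` (a hypothetical stand-in, with made-up values) that borrows the column names `T1depression`, `BMI`, and `Number of siblings` from the dataset described below.

```r
# Simulated stand-in data; all values are made up for illustration only.
set.seed(603)
n <- 200
toy <- data.frame(
  T1depression = round(rnorm(n, mean = 40, sd = 8)),
  BMI = rnorm(n, mean = 18, sd = 3),
  `Number of siblings` = sample(0:3, n, replace = TRUE),
  check.names = FALSE  # keep the space in the column name, as in the real data
)

# Regress the depression score on BMI and number of siblings.
# Backticks are needed because the column name contains spaces.
fit <- lm(T1depression ~ BMI + `Number of siblings`, data = toy)

# Coefficient estimates, standard errors, and p-values.
summary(fit)
```

On the real data, the sign and significance of the `BMI` and `Number of siblings` coefficients would speak to hypotheses 1 and 3; additional covariates could be added to the formula in the same way.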

Data description

Code
data<-read_excel("/Users/cassie199/Desktop/23spring/603_Spring_2023-1/posts/_data/mentalhealth_data.xlsx")
head(data)
# A tibble: 6 × 29
  T0depres…¹ T0anx…² T1dep…³ T1anx…⁴ Height Weight    WC    HC   SBP   DBP   FBG
       <dbl>   <dbl>   <dbl>   <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1         31      35      41      35   153.   34.6    58  67      98    60   4.4
2         35      24      35      25   172.   46.1    63  78     110    70   3.9
3         31      34      37      26   146.   38.9    72  77.7   102    62   4.6
4         27      31      42      35   162.   46.8    62  80     116    80   4.5
5         31      26      49      33   154.   36.4    56  72      90    60   4.2
6         30      28      47      32   164.   40.6    55  73     102    70   3.7
# … with 18 more variables: TC <dbl>, TG <dbl>, `HDL-C` <dbl>, `LDL-C` <dbl>,
#   BMI <dbl>, WHR <dbl>, WtHR <dbl>, `Family location` <dbl>,
#   `Number of siblings` <dbl>,
#   `How much time do you spend with your father in elementary school?` <dbl>,
#   `How much time do you spend with your mother in elementary school?` <dbl>,
#   `Father’s education level` <dbl>, `Mother’s education level` <dbl>,
#   `Family financial situation` <dbl>, `Sleeping hours` <dbl>, …
Code
glimpse(data)
Rows: 1,348
Columns: 29
$ T0depression                                                        <dbl> 31…
$ T0anxiety                                                           <dbl> 35…
$ T1depression                                                        <dbl> 41…
$ T1anxiety                                                           <dbl> 35…
$ Height                                                              <dbl> 15…
$ Weight                                                              <dbl> 34…
$ WC                                                                  <dbl> 58…
$ HC                                                                  <dbl> 67…
$ SBP                                                                 <dbl> 98…
$ DBP                                                                 <dbl> 60…
$ FBG                                                                 <dbl> 4.…
$ TC                                                                  <dbl> 3.…
$ TG                                                                  <dbl> 0.…
$ `HDL-C`                                                             <dbl> 0.…
$ `LDL-C`                                                             <dbl> 2.…
$ BMI                                                                 <dbl> 14…
$ WHR                                                                 <dbl> 0.…
$ WtHR                                                                <dbl> 0.…
$ `Family location`                                                   <dbl> 2,…
$ `Number of siblings`                                                <dbl> 2,…
$ `How much time do you spend with your father in elementary school?` <dbl> 5,…
$ `How much time do you spend with your mother in elementary school?` <dbl> 5,…
$ `Father’s education level`                                          <dbl> 4,…
$ `Mother’s education level`                                          <dbl> 3,…
$ `Family financial situation`                                        <dbl> 3,…
$ `Sleeping hours`                                                    <dbl> 3,…
$ `Skipping breakfast`                                                <dbl> 1,…
$ Vigorous                                                            <dbl> 1,…
$ Moderate                                                            <dbl> 2,…
Code
sum(is.na(data))
[1] 728
Code
plot(data$T0depression~data$BMI)

This dataset contains 1,348 observations (rows) and 29 variables (columns), with 728 missing values (NA) in total. All variables are stored as numeric data. Descriptive characteristics such as education level, family financial situation, and depression are recorded as numeric levels or scores. A preliminary plot of T0depression against BMI suggests that some outliers may need to be handled, and it shows no clear pattern between the two variables. More data processing is needed in later stages.
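As a starting point for that processing, the missing values can be counted per column rather than for the whole table, and candidate outliers can be flagged with the 1.5 × IQR rule. The sketch below is a minimal illustration on a small made-up data frame (`toy`, hypothetical values); `flag_outliers()` is a helper written for this example, not a function from any package, and the 1.5 × IQR cutoff is an assumed choice rather than the rule used in this project.

```r
# Toy stand-in for the real data; values are made up for illustration only.
toy <- data.frame(
  T0depression = c(31, 35, 31, 27, NA, 30),
  BMI          = c(14.8, 15.6, 18.2, 17.8, 15.3, 45.0)
)

# Count missing values in each column.
na_per_column <- colSums(is.na(toy))

# Flag values lying more than 1.5 * IQR beyond the first or third quartile.
flag_outliers <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
  fence <- 1.5 * (q[2] - q[1])
  !is.na(x) & (x < q[1] - fence | x > q[2] + fence)
}

bmi_outlier <- flag_outliers(toy$BMI)

na_per_column
toy[bmi_outlier, ]
```

The same two steps could be applied to `data` directly, for example `colSums(is.na(data))` and `flag_outliers(data$BMI)`, before deciding whether to drop, cap, or keep the flagged rows.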