Challenge 9: Creating a Function

challenge_9
debt
Author

Surya Praneeth Reddy Chirasani

Published

January 29, 2023

Code
library(tidyverse)
library(readr)
library(readxl)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE)

Function that reads a dataset and cleans it

I created a function that reads in the “Debt in Trillions” dataset and adds a new Date column, parsed from the “Year and Quarter” column, so that each quarter is stored as a readable date.

Code
debt_data <- function(path = "_data/debt_in_trillions.xlsx") {
  # Read the workbook, then add a parsed Date column before `Year and Quarter`
  data <- read_excel(path) %>%
    mutate(Date = parse_date_time(`Year and Quarter`, orders = "yq"),
           .before = `Year and Quarter`)
  return(data)
}
Code
debt_data()
# A tibble: 74 × 9
   Date                Year and …¹ Mortg…² HE Re…³ Auto …⁴ Credi…⁵ Stude…⁶ Other
   <dttm>              <chr>         <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl>
 1 2003-01-01 00:00:00 03:Q1          4.94   0.242   0.641   0.688   0.241 0.478
 2 2003-04-01 00:00:00 03:Q2          5.08   0.26    0.622   0.693   0.243 0.486
 3 2003-07-01 00:00:00 03:Q3          5.18   0.269   0.684   0.693   0.249 0.477
 4 2003-10-01 00:00:00 03:Q4          5.66   0.302   0.704   0.698   0.253 0.449
 5 2004-01-01 00:00:00 04:Q1          5.84   0.328   0.72    0.695   0.260 0.446
 6 2004-04-01 00:00:00 04:Q2          5.97   0.367   0.743   0.697   0.263 0.423
 7 2004-07-01 00:00:00 04:Q3          6.21   0.426   0.751   0.706   0.33  0.41 
 8 2004-10-01 00:00:00 04:Q4          6.36   0.468   0.728   0.717   0.346 0.423
 9 2005-01-01 00:00:00 05:Q1          6.51   0.502   0.725   0.71    0.364 0.394
10 2005-04-01 00:00:00 05:Q2          6.70   0.528   0.774   0.717   0.374 0.402
# … with 64 more rows, 1 more variable: Total <dbl>, and abbreviated variable
#   names ¹​`Year and Quarter`, ²​Mortgage, ³​`HE Revolving`, ⁴​`Auto Loan`,
#   ⁵​`Credit Card`, ⁶​`Student Loan`
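To see what the "yq" order does in isolation, here is a minimal sketch on a small hand-typed vector in the same format as the “Year and Quarter” column. The vector name quarters and its values are illustrative and are not read from the file. Each label should map to the first day of its quarter, matching the Date column in the output above.

```r
library(lubridate)

# Hypothetical sample labels in the same format as the `Year and Quarter` column
quarters <- c("03:Q1", "04:Q3")

# orders = "yq" tells lubridate to read a year followed by a quarter number
parsed <- parse_date_time(quarters, orders = "yq")

# parse_date_time() returns date-times (POSIXct); as_date() drops the time part
as_date(parsed)
```

If the time component is not needed, wrapping the parsed values in as_date() inside the mutate() call would store the column as a plain date instead of a date-time.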

Function to calculate summary statistics

I used the debt dataset for this task because it has a lot of numeric data.

Code
data <- read_excel("_data/debt_in_trillions.xlsx")

I created a function to calculate the z-score for a variable. It takes a numeric vector as input, subtracts the mean from each value and divides by the standard deviation (ignoring missing values in both), and returns a vector of z-scores of the same length.

Code
z_score <- function(x){
  # Center on the mean and scale by the standard deviation, ignoring NAs
  stat <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  return(stat)
}
z_score(data$Mortgage)
 [1] -2.812097806 -2.695629898 -2.608700951 -2.206127094 -2.054212430
 [6] -1.947028196 -1.741943400 -1.615347848 -1.487064354 -1.331773810
[11] -1.154540036 -0.989121847 -0.707235750 -0.433789356 -0.193257805
[16] -0.033747409  0.124919017  0.364606597  0.549436104  0.697974886
[21]  0.810222942  0.843137786  0.860861163  0.829634260  0.726669878
[26]  0.665904012  0.565471540  0.480230535  0.472634802  0.362074686
[31]  0.282741473  0.150238128  0.227883400  0.204252230  0.108039610
[36] -0.005052417 -0.073414016 -0.107172830 -0.207605301 -0.203385450
[41] -0.288626455 -0.365427757 -0.318165417 -0.189881924 -0.091981363
[46] -0.150215318 -0.120676355 -0.087761512 -0.086917541 -0.133335911
[51] -0.011804180 -0.021087854  0.080188588  0.074280796  0.064153152
[56]  0.173869297  0.297932939  0.351947042  0.395833500  0.513145379
[61]  0.561251689  0.611889910  0.730889729  0.717386204  0.818662646
[66]  0.955385843  0.981548924  1.082825366  1.214484741  1.267654873
[71]  1.339392353  1.492994957  1.591739488  1.829739127
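A quick way to check a function like this is to run it on a small vector where the answer can be worked out by hand. The sketch below repeats an equivalent definition of z_score so that it runs on its own. The vector toy is a made-up example, not taken from the debt data. Its non-missing values are 1, 3 and 5, which have mean 3 and sample standard deviation 2, so the z-scores should come out as -1, 0 and 1.

```r
# Equivalent definition of z_score, repeated so this block runs on its own
z_score <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Hypothetical toy vector with one missing value (not taken from the debt data)
toy <- c(1, NA, 3, 5)

z <- z_score(toy)
z

# Standardized values should have mean 0 and standard deviation 1;
# all.equal() compares numbers with a small tolerance
all.equal(mean(z, na.rm = TRUE), 0)
all.equal(sd(z, na.rm = TRUE), 1)

# The NA stays in its original position, so the output lines up with the input
is.na(z)
```

Because the function works on any numeric vector, the same call can be repeated for the other debt columns, for example z_score(data$`Auto Loan`).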