library(tidyverse)
library(ggplot2)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Challenge 9
challenge_9
functions
audrey_bertin
Creating a function
For this week’s challenge, I’ll create a function that calculates the z score of a given value relative to a baseline, using the mean and standard deviation of a numeric vector.
It will take the following as input:
- baseline = a vector of numbers of any length
- value = the value to compute the z score for
It will then give the following as output:
- a data frame with the columns mean, sd (both computed from baseline), input_value and z_score
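Written out, the calculation looks roughly like this. The symbols are notation introduced here for illustration only: x_i are the baseline values, n is their count, and v is the input value. The standard deviation is assumed to be the sample standard deviation that R's sd() computes (n - 1 in the denominator). Note that a conventional z score keeps its sign; the function in this post reports the absolute value.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
z = \left|\frac{v - \bar{x}}{s}\right|
```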
Function Definition
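z_score <- function(baseline, value){
  # Summary statistics of the baseline vector
  baseline_mean <- mean(baseline)
  baseline_sd <- sd(baseline)
  # Absolute distance of value from the baseline mean, in standard deviations
  z <- abs((value - baseline_mean) / baseline_sd)
  # Collect the inputs and results in a tibble
  results <- tibble(mean = baseline_mean, sd = baseline_sd, input_value = value, z_score = z)
  return(results)
}
Testing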
Let’s use the following inputs to test the function:
baseline = c(4, 2, -1, 4, 9, 2, 3, 3, 1 - 5)  # the last element, 1 - 5, evaluates to -4, so there are nine values
value = 10
We expect the following results if this runs correctly:
- Mean should be: 2.4444444
- SD should be: 3.5746018
- Input value should be 10
- Z score should be the absolute value of (value - mean) / sd -> abs value of (10 - 2.4444444) / 3.5746018 -> 2.1136776
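As a cross-check on the numbers in this list, here is a minimal sketch that works them out from first principles in base R, without calling mean(), sd() or the z_score() function. The variable names (x, v, manual_mean and so on) are made up for this illustration, and the sample standard deviation (n - 1 denominator) is assumed because that is what R's sd() uses.

```r
# Baseline from the test above; its last element, 1 - 5, evaluates to -4
x <- c(4, 2, -1, 4, 9, 2, 3, 3, 1 - 5)
v <- 10

n <- length(x)                                          # number of baseline values
manual_mean <- sum(x) / n                               # arithmetic mean
manual_sd <- sqrt(sum((x - manual_mean)^2) / (n - 1))   # sample SD (n - 1 denominator)
manual_z <- abs((v - manual_mean) / manual_sd)          # absolute z score

print(c(mean = manual_mean, sd = manual_sd, z_score = manual_z))
```

If the hand calculation and the list agree, the expected values are at least internally consistent before the function itself is tested.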
We confirm this by running the function below and checking that the results match:
results = z_score(baseline, value)
results
test_zscore <- function(baseline, value){
  results = z_score(baseline, value)
  # Recompute each quantity directly and pair it with the function's output
  expected_mean = mean(baseline)
  actual_mean = results$mean
  expected_sd = sd(baseline)
  actual_sd = results$sd
  expected_input = value
  actual_input = results$input_value
  expected_z = abs((value - mean(baseline)) / sd(baseline))
  actual_z = results$z_score
  # Exact == works here because both sides come from the same calculation
  cat("Mean matches: ", expected_mean == actual_mean, "( Actual:", actual_mean, "Expected:", expected_mean, ")\n")
  cat("SD matches: ", expected_sd == actual_sd, "( Actual:", actual_sd, "Expected:", expected_sd, ")\n")
  cat("Input matches: ", expected_input == actual_input, "( Actual:", actual_input, "Expected:", expected_input, ")\n")
  cat("Z Score matches: ", expected_z == actual_z, "( Actual:", actual_z, "Expected:", expected_z, ")")
}
test_zscore(baseline, value)
Mean matches: TRUE ( Actual: 2.444444 Expected: 2.444444 )
SD matches: TRUE ( Actual: 3.574602 Expected: 3.574602 )
Input matches: TRUE ( Actual: 10 Expected: 10 )
Z Score matches: TRUE ( Actual: 2.113678 Expected: 2.113678 )
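One caveat about the test: it compares values with ==, which holds here because the expected and actual numbers come from the same calculation. Comparing against numbers typed in by hand, such as the rounded values printed above, usually needs a tolerance instead. The sketch below shows one way to do that with base R's all.equal(); the helper name matches() and the default tolerance are assumptions for illustration, not part of the original function.

```r
# Hypothetical helper (not in the original post): TRUE when two numbers
# agree within a relative tolerance, instead of requiring exact equality.
matches <- function(actual, expected, tolerance = 1e-6) {
  isTRUE(all.equal(actual, expected, tolerance = tolerance))
}

# Printed mean (2.444444) vs. the longer value from the expected-results list
rounded_ok <- matches(2.444444, 2.4444444)

# Classic floating-point case: the sum is not exactly 0.3
exact_equal <- (0.1 + 0.2 == 0.3)
tolerant_equal <- matches(0.1 + 0.2, 0.3)

print(c(rounded_ok = rounded_ok, exact_equal = exact_equal, tolerant_equal = tolerant_equal))
```

With a helper like this, the expected values could be hard-coded rather than recomputed, which would make the test independent of the formulas inside z_score().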
We can test again with a second set:
baseline = c(-12, 4, 7, -3, 1, 1, 0, 8, 23, -3, -8, 12, -14, 2, 16)
value = -6
test_zscore(baseline, value)
Mean matches: TRUE ( Actual: 2.266667 Expected: 2.266667 )
SD matches: TRUE ( Actual: 10.03185 Expected: 10.03185 )
Input matches: TRUE ( Actual: -6 Expected: -6 )
Z Score matches: TRUE ( Actual: 0.8240418 Expected: 0.8240418 )
The function seems to work as expected!
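One thing the second test makes visible: the input value (-6) sits below the baseline mean, yet the reported z score is positive, because the function takes the absolute value. A short base R sketch of the difference follows; the names signed_z and absolute_z are illustrative only, and the signed version is shown as a comparison, not as a change to the function above.

```r
# Second test set from the post
baseline <- c(-12, 4, 7, -3, 1, 1, 0, 8, 23, -3, -8, 12, -14, 2, 16)
value <- -6

signed_z <- (value - mean(baseline)) / sd(baseline)  # keeps the direction (negative here)
absolute_z <- abs(signed_z)                          # magnitude only, as z_score() reports

print(c(signed = signed_z, absolute = absolute_z))
```

Whether to keep the sign depends on the use case: the absolute value answers "how unusual is this value?", while the signed version also says whether it is above or below the baseline.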