library(tidyverse)
library(ggplot2)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
Challenge 9
challenge_9
functions
audrey_bertin
Creating a function
For this week's challenge, I'll create a function that calculates the z score of a given value, using the mean and standard deviation of a baseline vector.
It will take the following as input:
- baseline = vector of numbers of any length
- value = the value to compute the z score for
It will then give the following as output:
- a data frame with mean and sd (of baseline), input_value, and z_score
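Written out as a formula, this is the quantity I have in mind. The symbols are my own notation for illustration: $x$ is the input value, $b_1, \dots, b_n$ are the baseline values, $\bar{b}$ is their mean, and $s_b$ is their sample standard deviation (the $n - 1$ version, which is what R's `sd()` computes).

```latex
z = \left| \frac{x - \bar{b}}{s_b} \right|,
\qquad
\bar{b} = \frac{1}{n} \sum_{i=1}^{n} b_i,
\qquad
s_b = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(b_i - \bar{b}\right)^2}
```

Because of the absolute value, the result measures how many standard deviations the value lies from the baseline mean, but not whether it lies above or below it.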
Function Definition
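z_score <- function(baseline, value){
  # Summary statistics of the baseline vector
  mean <- mean(baseline)
  sd <- sd(baseline)
  # Absolute distance of value from the baseline mean, in standard deviations
  z_score <- abs((value - mean)/sd)

  results = tibble(mean = mean, sd = sd, input_value = value, z_score = z_score)
  return(results)
}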
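The definition above assumes a numeric baseline with at least two values that are not all identical. As a rough sketch of how those assumptions could be made explicit, here is a hypothetical variant called `z_score_checked`. The name, the error messages, and the choice to drop `NA` values are my own assumptions, not part of the challenge, and it returns a base R data frame so it runs without the tidyverse.

```r
# Hypothetical variant of z_score() with basic input checks (base R only)
z_score_checked <- function(baseline, value) {
  # Assumption: non-numeric input or a non-scalar value should be an error
  stopifnot(is.numeric(baseline), is.numeric(value), length(value) == 1)

  # Assumption: missing baseline values are dropped before computing
  baseline <- baseline[!is.na(baseline)]

  # sd() needs at least two observations, and a zero sd would divide by zero
  if (length(baseline) < 2) stop("baseline needs at least two non-missing values")
  baseline_sd <- sd(baseline)
  if (baseline_sd == 0) stop("baseline has zero standard deviation")

  baseline_mean <- mean(baseline)
  data.frame(
    mean = baseline_mean,
    sd = baseline_sd,
    input_value = value,
    z_score = abs((value - baseline_mean) / baseline_sd)
  )
}

# Example call with a small made-up baseline
z_score_checked(c(1, 2, 3, 4, 5), 6)
```

Using distinct local names (`baseline_mean`, `baseline_sd`) also avoids reusing the names of the `mean()` and `sd()` functions inside the function body, which works in R but can be confusing to read.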
Testing
Let’s use the following inputs to test the function:
# Note: the last element, 1 -5, evaluates to -4, so this vector has nine elements
baseline = c(4, 2, -1, 4, 9, 2, 3, 3, 1 -5)
value = 10
We expect the following results if this runs correctly:
- Mean should be: 2.4444444
- SD should be: 3.5746018
- Input value should be 10
- Z score should be: the absolute value of (value - mean) / sd -> abs value of (10 - 2.4444444) / 3.5746018 -> 2.1136776
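These expected numbers can also be reproduced by hand, without calling `mean()` or `sd()`. The sketch below recomputes them from the underlying sums; the `manual_*` variable names are just for illustration.

```r
# Test vector from this post; the last element, 1 -5, evaluates to -4
baseline <- c(4, 2, -1, 4, 9, 2, 3, 3, 1 -5)
value <- 10

n <- length(baseline)                # 9 elements
manual_mean <- sum(baseline) / n     # 22 / 9

# Sample standard deviation: squared deviations divided by n - 1
manual_sd <- sqrt(sum((baseline - manual_mean)^2) / (n - 1))

# Absolute z score of the input value
manual_z <- abs((value - manual_mean) / manual_sd)

c(mean = manual_mean, sd = manual_sd, z = manual_z)
```

The three values should agree with the expected mean, SD, and z score listed above.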
We check this by running the function and comparing its output against these values:
results = z_score(baseline, value)
results
test_zscore <- function(baseline, value){

  results = z_score(baseline, value)

  expected_mean = mean(baseline)
  actual_mean = results$mean

  expected_sd = sd(baseline)
  actual_sd = results$sd

  expected_input = value
  actual_input = results$input_value

  expected_z = abs((value - mean(baseline)) / sd(baseline))
  actual_z = results$z_score

  cat("Mean matches: ", expected_mean == actual_mean, "( Actual:", actual_mean, "Expected:", expected_mean, ")\n")
  cat("SD matches: ", expected_sd == actual_sd, "( Actual:", actual_sd, "Expected:", expected_sd, ")\n")
  cat("Input matches: ", expected_input == actual_input, "( Actual:", actual_input, "Expected:", expected_input, ")\n")
  cat("Z Score matches: ", expected_z == actual_z, "( Actual:", actual_z, "Expected:", expected_z, ")")
}
test_zscore(baseline, value)
Mean matches: TRUE ( Actual: 2.444444 Expected: 2.444444 )
SD matches: TRUE ( Actual: 3.574602 Expected: 3.574602 )
Input matches: TRUE ( Actual: 10 Expected: 10 )
Z Score matches: TRUE ( Actual: 2.113678 Expected: 2.113678 )
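The checks above compare doubles with `==`, which works here because both sides run exactly the same computation. If an expected value came from a different calculation, or was typed in by hand, tiny floating-point differences could make `==` return FALSE. A minimal sketch of a tolerance-based comparison, using a hypothetical helper I'm calling `nearly_equal` built on base R's `all.equal()`:

```r
# Hypothetical helper: compare two numbers up to a small numerical tolerance
nearly_equal <- function(actual, expected) {
  # all.equal() returns TRUE or a character description of the difference,
  # so isTRUE() is needed to turn the result into a plain logical
  isTRUE(all.equal(actual, expected))
}

x <- 0.1 + 0.2
exact_match <- x == 0.3               # FALSE: the two doubles differ in their last bits
close_match <- nearly_equal(x, 0.3)   # TRUE: within all.equal()'s default tolerance

cat("Exact match:", exact_match, "\n")
cat("Match within tolerance:", close_match, "\n")
```

Swapping `==` for a helper like this inside the test function would make the checks less sensitive to how the expected value was obtained.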
We can test again with a second set of inputs:
baseline = c(-12, 4, 7, -3, 1, 1, 0, 8, 23, -3, -8, 12, -14, 2, 16)
value = -6
test_zscore(baseline, value)
Mean matches: TRUE ( Actual: 2.266667 Expected: 2.266667 )
SD matches: TRUE ( Actual: 10.03185 Expected: 10.03185 )
Input matches: TRUE ( Actual: -6 Expected: -6 )
Z Score matches: TRUE ( Actual: 0.8240418 Expected: 0.8240418 )
In both tests all four outputs match the expected values, so the function seems to work as expected!