Challenge 4 Instructions

Author

Khadijat Adeleye

Published

March 29, 2023

Code

library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Read in data

Read in one (or more) of the following datasets, using the correct R package and command.

abc_poll.csv ⭐
poultry_tidy.xlsx or organiceggpoultry.xls ⭐⭐
FedFundsRate.csv ⭐⭐⭐
hotel_bookings.csv ⭐⭐⭐⭐
debt_in_trillions.xlsx ⭐⭐⭐⭐⭐

Code

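# read the federal funds rate data with base R's read.csv()
df <- read.csv("_data/FedFundsRate.csv")
# open the data frame in the interactive viewer
view(df)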

Dataset description

The federal funds dataset tracks the United States economy in relation to the federal funds rate. Economic conditions are measured by the unemployment rate, the inflation rate, and the percent change in real GDP. The dataset consists of 10 columns and 904 rows. The data frame contains 3,196 missing (NA) values in total, spread across seven of the ten columns.

Code

# number of rows in the dataset
nrow(df)

[1] 904

Code

# number of columns in the dataset
ncol(df)

[1] 10

Code

# names of all the columns
colnames(df)

 [1] "Year"                         "Month"                       
 [3] "Day"                          "Federal.Funds.Target.Rate"   
 [5] "Federal.Funds.Upper.Target"   "Federal.Funds.Lower.Target"  
 [7] "Effective.Federal.Funds.Rate" "Real.GDP..Percent.Change."   
 [9] "Unemployment.Rate"            "Inflation.Rate"
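
The dots in these column names are most likely a side effect of `read.csv()`, which by default (`check.names = TRUE`) passes headers through `make.names()` and turns spaces and parentheses into dots. The sketch below illustrates the conversion; the raw headers in it are an assumption, since the original CSV header row is not shown here.

```r
# Assumed raw CSV headers (hypothetical; inferred from the cleaned names above)
raw_headers <- c("Federal Funds Target Rate", "Real GDP (Percent Change)")

# read.csv() applies make.names() to headers when check.names = TRUE (its default)
clean_headers <- make.names(raw_headers)
print(clean_headers)
```

Passing `check.names = FALSE` to `read.csv()` would keep the headers as written in the file, at the cost of needing backticks around names that contain spaces.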

Code

# last six rows of the dataset
tail(df)

    Year Month Day Federal.Funds.Target.Rate Federal.Funds.Upper.Target
899 2016    12   1                        NA                       0.50
900 2016    12  14                        NA                       0.75
901 2017     1   1                        NA                       0.75
902 2017     2   1                        NA                       0.75
903 2017     3   1                        NA                       0.75
904 2017     3  16                        NA                       1.00
    Federal.Funds.Lower.Target Effective.Federal.Funds.Rate
899                       0.25                         0.54
900                       0.50                           NA
901                       0.50                         0.65
902                       0.50                         0.66
903                       0.50                           NA
904                       0.75                           NA
    Real.GDP..Percent.Change. Unemployment.Rate Inflation.Rate
899                        NA               4.7            2.2
900                        NA                NA             NA
901                        NA               4.8            2.3
902                        NA               4.7            2.2
903                        NA                NA             NA
904                        NA                NA             NA

Code

# count the total number of missing (NA) values
sum(is.na(df))

[1] 3196

Tidy dataset

Before tidying the data, we first check whether the dataset has any missing entries.

Code

# count the missing entries in each column
num_missing_cols <- colSums(is.na(df))
print(num_missing_cols)

                        Year                        Month 
                           0                            0 
                         Day    Federal.Funds.Target.Rate 
                           0                          442 
  Federal.Funds.Upper.Target   Federal.Funds.Lower.Target 
                         801                          801 
Effective.Federal.Funds.Rate    Real.GDP..Percent.Change. 
                         152                          654 
           Unemployment.Rate               Inflation.Rate 
                         152                          194
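
Raw counts are easier to judge as a share of the rows. For example, 801 of 904 rows (about 88.6%) are missing the upper and lower targets. A minimal sketch of that calculation on a small made-up data frame (`toy` and its values are hypothetical, not taken from the dataset):

```r
# Hypothetical toy data frame standing in for df
toy <- data.frame(
  Year = c(2016, 2016, 2017, 2017),
  Target = c(NA, NA, NA, 0.75),
  Unemployment = c(4.7, NA, 4.8, 4.7)
)

# is.na() gives a logical matrix; colMeans() turns it into the share missing per column
missing_share <- round(colMeans(is.na(toy)) * 100, 1)

# list the columns from most to least incomplete
print(sort(missing_share, decreasing = TRUE))
```

The same two lines should work on `df` itself, replacing `toy` with `df`.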

The federal funds rate columns were pivoted into a longer format, and rows with missing (NA) values in the pivoted columns were removed from the data frame.

Code

# pivot three of the federal funds rate columns into name/value pairs;
# values_drop_na = TRUE removes the rows whose rate is NA
clean_FedFundsRate <- pivot_longer(df, cols = c("Federal.Funds.Target.Rate", "Federal.Funds.Lower.Target", "Effective.Federal.Funds.Rate"),
                 names_to = "Federal Fund Type",
                 values_to = "Federal Fund Rate",
                 values_drop_na = TRUE)

Code

# print the pivoted tibble

clean_FedFundsRate

# A tibble: 1,317 × 9
    Year Month   Day Federal.Funds.Upp…¹ Real.…² Unemp…³ Infla…⁴ Feder…⁵ Feder…⁶
   <int> <int> <int>               <dbl>   <dbl>   <dbl>   <dbl> <chr>     <dbl>
 1  1954     7     1                  NA     4.6     5.8      NA Effect…    0.8 
 2  1954     8     1                  NA    NA       6        NA Effect…    1.22
 3  1954     9     1                  NA    NA       6.1      NA Effect…    1.06
 4  1954    10     1                  NA     8       5.7      NA Effect…    0.85
 5  1954    11     1                  NA    NA       5.3      NA Effect…    0.83
 6  1954    12     1                  NA    NA       5        NA Effect…    1.28
 7  1955     1     1                  NA    11.9     4.9      NA Effect…    1.39
 8  1955     2     1                  NA    NA       4.7      NA Effect…    1.29
 9  1955     3     1                  NA    NA       4.6      NA Effect…    1.35
10  1955     4     1                  NA     6.7     4.7      NA Effect…    1.43
# … with 1,307 more rows, and abbreviated variable names
#   ¹Federal.Funds.Upper.Target, ²Real.GDP..Percent.Change.,
#   ³Unemployment.Rate, ⁴Inflation.Rate, ⁵`Federal Fund Type`,
#   ⁶`Federal Fund Rate`
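
To see what `values_drop_na = TRUE` does during the pivot, here is a small sketch on made-up data. The `toy` data frame and the names `rate_type` and `rate` are hypothetical and chosen only for illustration.

```r
library(tidyr)

# Hypothetical toy data: each row has at most one non-missing rate
toy <- data.frame(
  Year = c(2016, 2016, 2017),
  Target = c(NA, 0.50, NA),
  Effective = c(0.54, NA, 0.65)
)

# Stack the two rate columns into name/value pairs and drop the NA values
toy_long <- pivot_longer(
  toy,
  cols = c("Target", "Effective"),
  names_to = "rate_type",
  values_to = "rate",
  values_drop_na = TRUE
)

print(toy_long)
```

Without `values_drop_na = TRUE`, the result would have six rows (three years times two rate columns), three of them with an NA rate.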

Code

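# pivot the economic indicator columns of the original df into name/value pairs;
# this starts again from df, so it replaces the earlier clean_FedFundsRate
clean_FedFundsRate <- pivot_longer(df, cols = c("Real.GDP..Percent.Change.", "Unemployment.Rate", "Inflation.Rate"),
                 names_to = "GDP Condition",
                 values_to = "GDP Rate",
                 values_drop_na = TRUE)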

clean_FedFundsRate

# A tibble: 1,712 × 9
    Year Month   Day Federal.Funds.Tar…¹ Feder…² Feder…³ Effec…⁴ GDP C…⁵ GDP R…⁶
   <int> <int> <int>               <dbl>   <dbl>   <dbl>   <dbl> <chr>     <dbl>
 1  1954     7     1                  NA      NA      NA    0.8  Real.G…     4.6
 2  1954     7     1                  NA      NA      NA    0.8  Unempl…     5.8
 3  1954     8     1                  NA      NA      NA    1.22 Unempl…     6  
 4  1954     9     1                  NA      NA      NA    1.06 Unempl…     6.1
 5  1954    10     1                  NA      NA      NA    0.85 Real.G…     8  
 6  1954    10     1                  NA      NA      NA    0.85 Unempl…     5.7
 7  1954    11     1                  NA      NA      NA    0.83 Unempl…     5.3
 8  1954    12     1                  NA      NA      NA    1.28 Unempl…     5  
 9  1955     1     1                  NA      NA      NA    1.39 Real.G…    11.9
10  1955     1     1                  NA      NA      NA    1.39 Unempl…     4.9
# … with 1,702 more rows, and abbreviated variable names
#   ¹Federal.Funds.Target.Rate, ²Federal.Funds.Upper.Target,
#   ³Federal.Funds.Lower.Target, ⁴Effective.Federal.Funds.Rate,
#   ⁵`GDP Condition`, ⁶`GDP Rate`
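
Because this pivot takes `df` as its input, the result holds only the indicator pivot, and the `Federal Fund Type` and `Federal Fund Rate` columns from the earlier step are gone. If both pivots are wanted in one table, one option is to feed the first result into the second call. The sketch below does this on made-up data; `toy` and all column names in it are hypothetical.

```r
library(tidyr)

# Hypothetical toy data with two rate columns and two indicator columns
toy <- data.frame(
  Year = c(2016, 2017),
  Target = c(NA, 0.75),
  Effective = c(0.54, 0.65),
  Unemployment = c(4.7, 4.8),
  Inflation = c(2.2, NA)
)

# Step 1: pivot the rate columns
rates_long <- pivot_longer(
  toy,
  cols = c("Target", "Effective"),
  names_to = "rate_type",
  values_to = "rate",
  values_drop_na = TRUE
)

# Step 2: pivot the indicator columns of the step 1 result, not of toy
both_long <- pivot_longer(
  rates_long,
  cols = c("Unemployment", "Inflation"),
  names_to = "indicator",
  values_to = "value",
  values_drop_na = TRUE
)

print(both_long)
```

Note that chaining pairs every remaining rate with every remaining indicator from the same original row, so the row count can grow quickly. Whether that shape suits the analysis is a judgment call.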

Mutating data

Next, the year, month, and day columns are merged into a single, easily readable “Date” column.

Code

# merge Year, Month, and Day into a single Date column that is easier to reference in code
clean_FedFundsRate <- mutate(clean_FedFundsRate, Date = make_date(Year, Month, Day))
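
`make_date()` comes from the lubridate package, which `library(tidyverse)` attaches from tidyverse 2.0.0 onward; with older versions it may need its own `library(lubridate)` call. A minimal sketch of what the function returns, using a hypothetical `toy` data frame with two dates taken from the `tail(df)` output:

```r
library(lubridate)

# Hypothetical toy data with the same Year/Month/Day layout as df
toy <- data.frame(
  Year = c(2016, 2017),
  Month = c(12, 3),
  Day = c(14, 16)
)

# make_date() builds a Date vector from numeric year, month, and day parts
toy$Date <- make_date(toy$Year, toy$Month, toy$Day)

print(toy)
```

A real `Date` column sorts chronologically and works directly as a plot axis, which three separate integer columns do not.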

Code

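# Inflation.Rate is no longer a column: the pivot above moved it into `GDP Condition` / `GDP Rate`,
# and values_drop_na = TRUE already dropped its missing values, so check the long-format value column instead
clean_FedFundsRate <- clean_FedFundsRate[complete.cases(clean_FedFundsRate$`GDP Rate`), ]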

Code

# the indicators now live in `GDP Condition` / `GDP Rate`, so select those columns together with the date and the federal funds columns
select(clean_FedFundsRate, Date, `GDP Condition`, `GDP Rate`, Effective.Federal.Funds.Rate, Federal.Funds.Target.Rate, Federal.Funds.Upper.Target, Federal.Funds.Lower.Target)
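
Once `Inflation.Rate` and the other indicators have been pivoted into `GDP Condition` and `GDP Rate`, they can no longer be referenced as columns by their old names. The sketch below shows one way to check for a column before using it and to pull out the inflation rows from the long format instead. The `toy_long` data frame is hypothetical and only mimics the shape of the pivoted table.

```r
# Hypothetical long-format data: Inflation.Rate is a value of `GDP Condition`, not a column
toy_long <- data.frame(
  Year = c(2017, 2017),
  `GDP Condition` = c("Unemployment.Rate", "Inflation.Rate"),
  `GDP Rate` = c(4.8, 2.3),
  check.names = FALSE
)

# Check whether the column exists before referring to it
has_inflation_col <- "Inflation.Rate" %in% names(toy_long)

# A missing column comes back as NULL, and complete.cases(NULL) raises an error
missing_col <- toy_long$Inflation.Rate
cc_result <- try(complete.cases(missing_col), silent = TRUE)

# In long format, filter on the name column to get the inflation values
inflation_only <- toy_long[toy_long$`GDP Condition` == "Inflation.Rate", ]
print(inflation_only)
```

In dplyr, wrapping the names in `any_of()` inside `select()` is another option: it keeps the columns that exist and silently skips the ones that do not.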