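library(tidyverse)
library(readxl)   # provides read_excel(); not attached by library(tidyverse)
library(ggplot2)
library(ggforce)  # provides facet_zoom()
library(plotly)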
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
Challenge 7
Challenge Overview
Today’s challenge is to:
- read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
- tidy data (as needed, including sanity checks)
- mutate variables as needed (including sanity checks)
- Recreate at least two graphs from previous exercises, but introduce at least one additional dimension that you omitted before, using ggplot functionality (color, shape, line, facet, etc.). The goal is not to create unneeded chart ink (Tufte), but to concisely capture variation in additional dimensions that were collapsed in your earlier 2- or 3-dimensional graphs.
- Explain why you chose the specific graph type
- If you haven’t tried in previous weeks, work this week to make your graphs “publication” ready with titles, captions, and pretty axis labels and other viewer-friendly features
R Graph Gallery is a good starting point for thinking about what information is conveyed in standard graph types, and includes example R code. Anyone not familiar with Edward Tufte should check out his fantastic books and courses on data visualization.
Read in data
Read in one (or more) of the following datasets, using the correct R package and command.
I have used the same dataset as in Challenge 6: the Debt dataset.
# read_excel() lives in readxl, so call it with an explicit namespace
debt <- readxl::read_excel("_data/debt_in_trillions.xlsx")
debt
Briefly describe the data
This dataset records the cumulative debt held by a nation's citizens, most likely those of the US. It has 74 rows and 8 columns, with one row per year and quarter, and breaks the debt down by type, such as auto loans, credit cards, and student loans.
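A quick way to back up a description like this is to print the dimensions and column types. The sketch below uses a small stand-in tibble, because the spreadsheet itself is not reproduced here: `debt_demo`, its subset of columns, and all of its values are made up for illustration. Only the `Year and Quarter` column name is taken from the code in this post.

```r
library(tibble)
library(dplyr)

# Hypothetical stand-in for the spreadsheet (made-up values, fewer columns).
debt_demo <- tribble(
  ~`Year and Quarter`, ~Mortgage, ~`Auto Loan`, ~Total,
  "03:Q1",             4.9,       0.6,          7.2,
  "03:Q2",             5.1,       0.6,          7.4,
  "03:Q3",             5.2,       0.7,          7.6
)

dim(debt_demo)      # number of rows, then number of columns
glimpse(debt_demo)  # one line per column with its type and first values
```

Running the same two calls on `debt` would show whether the row and column counts match the description.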
Tidy Data (as needed)
The dataset has a combined column for year and quarter, and each row gives the debt held across all citizens for that period. I therefore split the combined field into separate Year and Quarter columns.
# Split "Year and Quarter" at the colon into two separate columns
split_quarter_year <- debt %>%
  separate(`Year and Quarter`, c("Year", "Quarter"), sep = ":")
split_quarter_year
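The split can be sanity-checked by confirming that neither new column has missing values and that the quarters fall in the expected set. This is a minimal sketch on made-up rows: the `"03:Q1"` value format is an assumption inferred from `sep = ":"` and the two-digit year breaks used in the plots, and `demo` and `demo_split` are hypothetical names.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical rows in an assumed "YY:Qn" format; the values are made up.
demo <- tibble(
  `Year and Quarter` = c("03:Q1", "03:Q2", "04:Q1"),
  Total              = c(7.2, 7.4, 8.3)
)

demo_split <- demo %>%
  separate(`Year and Quarter`, c("Year", "Quarter"), sep = ":")

# Sanity checks: nothing went missing, and quarters look like quarters.
stopifnot(!anyNA(demo_split$Year), !anyNA(demo_split$Quarter))
stopifnot(all(demo_split$Quarter %in% c("Q1", "Q2", "Q3", "Q4")))

count(demo_split, Quarter)  # how many rows fall in each quarter
```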
With the column split, I now pivot the data longer so that I can filter by debt type. The pivot replaces the individual debt columns with two new columns, Debt_Type and Debt_Percentage. I expect the result to have 518 rows and 4 columns.
# Pivot every column except Year and Quarter into name/value pairs
pivot_debt <- split_quarter_year %>%
  pivot_longer(!c(Year, Quarter), names_to = "Debt_Type", values_to = "Debt_Percentage")
pivot_debt
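The expected dimensions can be computed before pivoting and compared with the result. The long table has one row per original row per pivoted column, so 518 rows across 7 debt columns corresponds to 518 / 7 = 74 original rows. The sketch below shows the same check on a small made-up table; `demo_split`, `demo_long`, and all values are hypothetical.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical already-split data: 2 rows, 2 id columns, 3 debt columns.
demo_split <- tibble(
  Year        = c("03", "03"),
  Quarter     = c("Q1", "Q2"),
  Mortgage    = c(4.9, 5.1),
  `Auto Loan` = c(0.6, 0.6),
  Total       = c(7.2, 7.4)
)

# Expected shape: one row per (original row x pivoted column),
# and the two id columns plus the new name and value columns.
expected_rows <- nrow(demo_split) * (ncol(demo_split) - 2)
expected_cols <- 2 + 2

demo_long <- demo_split %>%
  pivot_longer(!c(Year, Quarter), names_to = "Debt_Type", values_to = "Debt_Percentage")

# Sanity check: the pivot produced exactly the shape computed above.
stopifnot(nrow(demo_long) == expected_rows, ncol(demo_long) == expected_cols)
dim(demo_long)
```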
Visualization with Multiple Dimensions
Building on Challenge 6, I would like to improve the exploratory graphs by adding more dimensions using ggplot functionality.
Before making changes, I first display the original graph.
# Base plot: debt amount by year, reused by every chart below
pivot_debt_plot <- pivot_debt %>%
  ggplot(mapping = aes(x = Year, y = Debt_Percentage))
# Original chart: one point per year and quarter, colored by debt type
pivot_debt_plot +
  geom_point(aes(color = Debt_Type))
The main objective of the above graph is to display the different debt types and how they change over the years. I added a legend, axes, and color coding, but apart from Mortgage and Total, the other debt types are squeezed together near the bottom of the chart and their variation is lost.
Therefore, I want to improve my chart by adding a title and showing more detail for the other debt types.
pivot_debt_plot +
  geom_point(aes(color = Debt_Type)) +
  labs(title = "Total National Debt of the Mortgages", subtitle = "Student and Auto loan are the secondary category", caption = "From the dataset.") +
  theme(legend.position = "bottom") +
  # Zoom the y-axis to 0-2, the range where the smaller debt types sit;
  # ylim alone is enough, since it takes precedence over a y condition
  facet_zoom(ylim = c(0, 2))
The second chart I would like to improve is the third chart from Challenge 6. That graph shows very little: without free scales on the axes, the patterns are very difficult to see. I therefore experiment with the scales and also give the chart a title, so that it is easier to understand how the different types of debt change over the years.
# Original chart: one panel per debt type, all sharing the same y-axis
pivot_debt_plot +
  geom_point() +
  facet_wrap(~Debt_Type) +
  scale_x_discrete(breaks = c("03", "06", "09", "12", "15", "18", "21"))
# Improved chart: free y-axis per panel, points colored by quarter
pivot_debt_plot +
  geom_point(aes(color = Quarter), alpha = 0.9) +  # constant alpha, so no alpha legend
  facet_wrap(~Debt_Type, scales = "free_y") +
  scale_x_discrete(breaks = c("03", "06", "09", "12", "15", "18", "21")) +
  theme_light() +
  labs(title = "Debt Types per year", caption = "From the dataset.")
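The effect of the free scale can be seen in isolation on a tiny example. This is a sketch on made-up numbers, not the debt data: `demo_long` and `p_free` are hypothetical names, and the values are chosen only so that one debt type is much larger than the other.

```r
library(ggplot2)

# Hypothetical long-format data: one large and one small debt type.
demo_long <- data.frame(
  Year            = rep(c("03", "04", "05"), times = 2),
  Debt_Type       = rep(c("Mortgage", "Credit Card"), each = 3),
  Debt_Percentage = c(4.9, 5.9, 6.9, 0.69, 0.70, 0.72)
)

# scales = "free_y" gives each panel its own y-axis range, so the small
# series is no longer flattened by the range of the large one.
p_free <- ggplot(demo_long, aes(x = Year, y = Debt_Percentage)) +
  geom_point() +
  facet_wrap(~Debt_Type, scales = "free_y")

p_free
```

The trade-off is that heights are no longer comparable across panels, so free scales suit questions about the shape of each trend rather than about relative size.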