Code
library(tidyverse)
library(summarytools)
::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE) knitr
Cam Needels
August 17, 2022
month year large_half_dozen large_dozen extra_large_half_dozen
1 January 2004 126.000 230.000 132.000
2 February 2004 128.500 226.250 134.500
3 March 2004 131.000 225.000 137.000
4 April 2004 131.000 225.000 137.000
5 May 2004 131.000 225.000 137.000
6 June 2004 133.500 231.375 137.000
7 July 2004 133.500 233.500 137.000
8 August 2004 133.500 233.500 137.000
9 September 2004 129.750 233.500 135.875
10 October 2004 128.500 233.500 135.500
11 November 2004 128.500 233.500 135.500
12 December 2004 128.500 233.500 135.500
13 January 2005 128.500 233.500 135.500
14 February 2005 128.500 233.500 135.500
15 March 2005 128.500 233.500 135.500
16 April 2005 128.500 233.500 135.500
17 May 2005 128.500 233.500 135.500
18 June 2005 128.500 233.500 135.500
19 July 2005 128.500 233.500 135.500
20 August 2005 128.500 233.500 135.500
21 September 2005 128.500 233.500 135.500
22 October 2005 128.500 233.500 135.500
23 November 2005 128.500 233.500 135.500
24 December 2005 128.500 233.500 135.500
25 January 2006 128.500 233.500 135.500
26 February 2006 128.500 233.500 135.500
27 March 2006 128.500 233.500 135.500
28 April 2006 128.500 233.500 135.500
29 May 2006 128.500 233.500 135.500
30 June 2006 128.500 233.500 135.500
31 July 2006 128.500 233.500 135.500
32 August 2006 128.500 233.500 135.500
33 September 2006 128.500 233.500 135.500
34 October 2006 128.500 233.500 135.500
35 November 2006 128.500 233.500 135.500
36 December 2006 128.500 233.500 135.500
37 January 2007 128.500 233.500 135.500
38 February 2007 131.125 236.125 138.125
39 March 2007 132.000 237.000 139.000
40 April 2007 132.000 237.000 139.000
41 May 2007 132.000 237.000 139.000
42 June 2007 132.000 237.000 139.000
43 July 2007 132.000 237.000 139.000
44 August 2007 132.000 237.000 139.000
45 September 2007 132.000 237.000 139.000
46 October 2007 132.000 237.000 139.000
47 November 2007 132.000 237.000 139.000
48 December 2007 132.000 237.000 139.000
49 January 2008 132.000 237.000 139.000
50 February 2008 132.000 237.000 139.000
51 March 2008 132.000 237.000 139.000
52 April 2008 132.000 237.000 139.000
53 May 2008 132.000 237.000 139.000
54 June 2008 174.500 277.500 185.500
55 July 2008 174.500 277.500 185.500
56 August 2008 174.500 277.500 185.500
57 September 2008 174.500 277.500 185.500
58 October 2008 174.500 277.500 185.500
59 November 2008 174.500 277.500 185.500
60 December 2008 174.500 277.500 185.500
61 January 2009 174.500 277.500 185.500
62 February 2009 174.500 277.500 185.500
63 March 2009 174.500 277.500 185.500
64 April 2009 174.500 277.500 185.500
65 May 2009 174.500 277.500 185.500
66 June 2009 174.500 277.500 185.500
67 July 2009 174.500 277.500 185.500
68 August 2009 174.500 271.500 185.500
69 September 2009 174.500 271.500 185.500
70 October 2009 174.500 271.500 185.500
71 November 2009 174.500 271.500 185.500
72 December 2009 174.500 271.500 185.500
73 January 2010 174.500 271.500 185.500
74 February 2010 174.500 271.500 185.500
75 March 2010 174.500 268.000 185.500
76 April 2010 174.500 268.000 185.500
77 May 2010 174.500 268.000 185.500
78 June 2010 174.500 268.000 185.500
79 July 2010 174.500 268.000 185.500
80 August 2010 174.500 268.000 185.500
81 September 2010 174.500 268.000 185.500
82 October 2010 174.500 267.500 185.500
83 November 2010 174.500 267.500 185.500
84 December 2010 174.500 267.500 185.500
85 January 2011 174.500 267.500 185.500
86 February 2011 174.500 267.500 185.500
87 March 2011 174.500 267.500 185.500
88 April 2011 174.500 267.500 185.500
89 May 2011 174.500 267.500 185.500
90 June 2011 174.500 270.000 185.500
91 July 2011 174.500 270.000 185.500
92 August 2011 174.500 270.000 185.500
93 September 2011 174.500 270.000 185.500
94 October 2011 174.500 270.000 185.500
95 November 2011 174.500 270.000 185.500
96 December 2011 174.500 270.000 185.500
97 January 2012 174.500 267.500 185.500
98 February 2012 174.500 267.500 185.500
99 March 2012 174.500 267.500 185.500
100 April 2012 174.500 267.500 185.500
101 May 2012 173.250 267.500 185.500
102 June 2012 173.250 267.500 185.500
103 July 2012 173.250 267.500 185.500
104 August 2012 173.250 267.500 185.500
105 September 2012 173.250 267.500 185.500
106 October 2012 173.250 267.500 185.500
107 November 2012 178.000 267.500 188.130
108 December 2012 178.000 267.500 188.130
109 January 2013 178.000 267.500 188.130
110 February 2013 178.000 267.500 188.130
111 March 2013 178.000 267.500 188.130
112 April 2013 178.000 267.500 188.130
113 May 2013 178.000 267.500 188.130
114 June 2013 178.000 267.500 188.130
115 July 2013 178.000 267.500 188.130
116 August 2013 178.000 267.500 188.130
117 September 2013 178.000 267.500 188.130
118 October 2013 178.000 267.500 188.130
119 November 2013 178.000 267.500 188.130
120 December 2013 178.000 267.500 188.130
extra_large_dozen
1 230.000
2 230.000
3 230.000
4 234.500
5 236.000
6 241.000
7 241.000
8 241.000
9 241.000
10 241.000
11 241.000
12 241.000
13 241.000
14 241.000
15 241.000
16 241.000
17 241.000
18 241.000
19 241.000
20 241.000
21 241.000
22 241.000
23 241.000
24 241.000
25 241.000
26 241.000
27 241.375
28 241.500
29 241.500
30 241.500
31 241.500
32 241.500
33 241.500
34 241.500
35 241.500
36 241.500
37 241.500
38 244.125
39 245.000
40 245.000
41 245.000
42 245.000
43 245.000
44 245.000
45 245.000
46 245.000
47 245.000
48 245.000
49 245.000
50 245.000
51 245.000
52 245.000
53 245.000
54 285.500
55 285.500
56 285.500
57 285.500
58 285.500
59 285.500
60 285.500
61 285.500
62 285.500
63 285.500
64 285.500
65 285.500
66 285.500
67 285.500
68 285.500
69 285.500
70 285.500
71 285.500
72 285.500
73 285.500
74 285.500
75 285.500
76 285.500
77 285.500
78 285.500
79 285.500
80 285.500
81 285.500
82 285.500
83 285.500
84 285.500
85 285.500
86 285.500
87 285.500
88 285.500
89 285.500
90 285.500
91 285.500
92 285.500
93 285.500
94 285.500
95 285.500
96 285.500
97 285.500
98 288.500
99 288.500
100 288.500
101 288.500
102 288.500
103 288.500
104 288.500
105 288.500
106 288.500
107 290.000
108 290.000
109 290.000
110 290.000
111 290.000
112 290.000
113 290.000
114 290.000
115 290.000
116 290.000
117 290.000
118 290.000
119 290.000
120 290.000
This data is about the number of eggs coming from various dozen sizes. The dates vary from January 2004 until December 2013. This data looks solid however the extra large dozen column is all by itself and it isn’t very tidy. In order to make it tidy I decided to make it long by having each category of dozen be a different data point so that extra large dozen can fit in.
[1] 120
[1] 6
[1] 480
[1] 4
With 120 rows in the current data set and 6 columns. 2 columns describe the case of the data set which include the month and year. The remainder of the 4 describe the carton and size of eggs.
After creating the carton_type column and the price_per_pound column there are 4 columns total. We should now expect to see 480 columns total because the old 4 columns merged into new rows multiplying the 120 by four times.
# A tibble: 480 × 4
month year carton_size price_per_pound
<chr> <int> <chr> <dbl>
1 January 2004 large_half_dozen 126
2 January 2004 large_dozen 230
3 January 2004 extra_large_half_dozen 132
4 January 2004 extra_large_dozen 230
5 February 2004 large_half_dozen 128.
6 February 2004 large_dozen 226.
7 February 2004 extra_large_half_dozen 134.
8 February 2004 extra_large_dozen 230
9 March 2004 large_half_dozen 131
10 March 2004 large_dozen 225
# … with 470 more rows
The pivoted data looks great and has the expected dimensions of 480 rows x 4 columns!
---
title: "Challenge 3 Submission"
author: "Cam Needels"
description: "Tidy Data: Pivoting Featuring Eggs"
date: "08/17/2022"
format:
html:
toc: true
code-fold: true
code-copy: true
code-tools: true
categories:
- challenge_3
- eggs
- Cam Needels
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(summarytools)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
```{r}
library(readr)
eggs_data <- read.csv("B:/Needels/Documents/DACCS 601/601_Spring_2023/posts/_data/eggs_tidy.csv")
eggs_data
```
### Describing the Data
This data is about the number of eggs coming from various dozen sizes. The dates vary from January 2004 until December 2013. This data looks solid however the extra large dozen column is all by itself and it isn't very tidy. In order to make it tidy I decided to make it long by having each category of dozen be a different data point so that extra large dozen can fit in.
### Data prediction
```{r}
#existing rows/cases
nrow(eggs_data)
#existing columns/cases
ncol(eggs_data)
#expected rows/cases
nrow(eggs_data) * (ncol(eggs_data)-2)
# expected columns
2+2
```
With 120 rows in the current data set and 6 columns. 2 columns describe the case of the data set which include the month and year. The remainder of the 4 describe the carton and size of eggs.
After creating the carton_type column and the price_per_pound column there are 4 columns total. We should now expect to see 480 columns total because the old 4 columns merged into new rows multiplying the 120 by four times.
## Pivot the Data
```{r}
eggs_data <-pivot_longer(eggs_data, col = c(large_half_dozen, large_dozen, extra_large_half_dozen, extra_large_dozen),
names_to= "carton_size",
values_to= "price_per_pound")
eggs_data
```
The pivoted data looks great and has the expected dimensions of 480 rows x 4 columns!
```{r}
```