---
title: "Homework 3"
author: "Michaela Bowen"
description: "Data Wrangling/Visualization"
date: "11/30/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - hw3
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
library(lubridate)
library(ggplot2)
library(readr)
library(readxl)
library(summarytools)
knitr::opts_chunk$set(echo = TRUE)
```
::: panel-tabset
# Homework Two Summary
In Homework Two, I read in the full inventory report, including items with a quantity of 0 that had not yet been cleared from inventory. My goal was to reshape the data into a workable form so that I could explore and analyze a research question with the data available to me. I did this with several `mutate()` calls that create the needed columns and variables.
From this data I found that there are 6741 instances of 12 variables, where each instance is a product currently held in the Sales Floor room, the Quarantine room, or the Vault.
```{r, message = FALSE, warning= FALSE}
#| label: Homework 2 Readin
#| warning: FALSE
# Read the inventory export, replacing the header row with cleaner column names
Current_Inventory <- read_csv("../posts/2022-11-16 Export inventory.csv",
                              skip = 1,
                              col_names = c("delete", "product", "package ID", "batch", "available", "room", "price", "cost", "category", "expiration_date", "thc", "isCannabis")) %>%
  select(!contains("delete")) %>%
  # Split the test result into its numeric value and its unit (% or mg)
  separate(col = "thc", into = c("thc", "unit_thc (%/mg)"), sep = " ") %>%
  mutate(thc = as.numeric(thc),
         isCannabis = case_when(
           isCannabis == "Yes" ~ TRUE,
           isCannabis == "No" ~ FALSE
         )) %>%
  # Treat the "Invalid date" placeholder as missing, then parse month/day/year
  mutate(expiration_date = na_if(expiration_date, "Invalid date"),
         expiration_date = mdy(expiration_date)) %>%
  mutate(
    date = as.Date(expiration_date),
    day = day(expiration_date)) %>%
  mutate(
    format_date = format(date, "%m/%d/%Y")
  ) %>%
  arrange(desc(room), category, product)
```
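The cleaning steps in the chunk above can be tried on a few made-up rows. This is only a minimal sketch: the `toy_inventory` tibble, its values, and the shortened `unit` column name are invented for illustration and are not taken from the real export.

```r
library(tidyverse)
library(lubridate)

# Toy rows standing in for the inventory export (all values are invented)
toy_inventory <- tribble(
  ~product,     ~expiration_date, ~thc,     ~isCannabis,
  "FLW Sample", "11/30/2023",     "24.5 %", "Yes",
  "EDI Sample", "Invalid date",   "100 mg", "Yes",
  "ACC Sample", "Invalid date",   NA,       "No"
)

toy_clean <- toy_inventory %>%
  # Split "24.5 %" into a value column and a unit column
  separate(col = "thc", into = c("thc", "unit"), sep = " ") %>%
  mutate(
    thc = as.numeric(thc),
    # Recode the Yes/No flag as a logical; anything else becomes NA
    isCannabis = case_when(
      isCannabis == "Yes" ~ TRUE,
      isCannabis == "No" ~ FALSE
    ),
    # Turn the "Invalid date" placeholder into NA before parsing m/d/y
    expiration_date = mdy(na_if(expiration_date, "Invalid date"))
  )

toy_clean
```

Converting the placeholder with `na_if()` first means `mdy()` only ever sees real dates or `NA`, so rows without an expiration date end up as missing rather than raising parse warnings.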
# Homework Three
## Cleaning Current Inventory
Here I am reading in the lab results for the current inventory across three licenses: Worcester Adult Use, Worcester Medical, and Northampton Adult Use. The most commonly recognized cannabinoids are THC, CBD, CBG, CBN, and CBC; others are usually present in negligible amounts or are variants such as THCa or CBDa. For the purposes of this project I have included only the main cannabinoids and their acidic variants, noted with an "a" at the end of the abbreviation, as those are the most common in the products we sell.
With the test results separated from their units, my goal is to combine like products and compute summary statistics on the test results by product type.
```{r, message=FALSE}
#| label: cleaning and arranging
#| warning: FALSE
# Read the lab results report; columns named "delete*" are dropped right after
CurrentInvAllLicense_orig <- read_xlsx("../Homework 3/Current Inventory Lab Results Report 11_29_2022-12_6_2022.xlsx",
                                       skip = 4,
                                       col_names = c("license", "room", "product", "category", "package id", "delete", "delete1", "batch", "cbd", "thc", "delete3", "delete2", "cbc", "cbca", "delete4", "cbda", "delete5", "delete6", "delete7", "cbg", "cbga", "delete8", "delete9", "cbn", "cbna", "delete10", "delete11", "thca", "thc8", "thc9", "delete12", "thcv", "thcva")) %>%
  select(!contains("delete")) %>%
  filter(license != "Sandbox")
# Filter out accessories and clothing, plus items whose product name starts with
# EMP (employee product), SAM, or HTC (High Times Cannabis Cup), as the testing
# analysis does not need them
CurrentInvAllLicense_Cannabis <- CurrentInvAllLicense_orig %>%
  filter(category != "Accessories" & category != "Clothes") %>%
  mutate(
    category_abrv = substr(product, 1, 3)
  ) %>%
  relocate(category_abrv, .before = category) %>%
  filter(category_abrv != "EMP" & category_abrv != "SAM" & category_abrv != "HTC") %>%
  # Separate the units from every test result so the values are numeric;
  # the unit is kept once (from thc) and discarded for the other columns
  separate(col = "thc", into = c("thc", "unit (%/mg)"), sep = " ", convert = TRUE) %>%
  separate(col = "cbd", into = c("cbd", NA), sep = " ", convert = TRUE) %>%
  separate(col = "cbc", into = c("cbc", NA), sep = " ", convert = TRUE) %>%
  separate(col = "cbca", into = c("cbca", NA), sep = " ", convert = TRUE) %>%
  separate(col = "cbda", into = c("cbda", NA), sep = " ", convert = TRUE) %>%
  separate(col = "cbg", into = c("cbg", NA), sep = " ", convert = TRUE) %>%
  separate(col = "cbga", into = c("cbga", NA), sep = " ", convert = TRUE) %>%
  separate(col = "cbn", into = c("cbn", NA), sep = " ", convert = TRUE) %>%
  separate(col = "cbna", into = c("cbna", NA), sep = " ", convert = TRUE) %>%
  separate(col = "thca", into = c("thca", NA), sep = " ", convert = TRUE) %>%
  separate(col = "thc8", into = c("thc8", NA), sep = " ", convert = TRUE) %>%
  separate(col = "thc9", into = c("thc9", NA), sep = " ", convert = TRUE) %>%
  separate(col = "thcv", into = c("thcv", NA), sep = " ", convert = TRUE) %>%
  separate(col = "thcva", into = c("thcva", NA), sep = " ", convert = TRUE) %>%
  relocate("unit (%/mg)", .after = batch)
```
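The repeated `separate()` calls above could probably be collapsed into a single `across()` step. The sketch below shows one way, assuming every test result looks like a number followed by a space and a unit. The `toy_results` tibble, its values, and the `unit` column name are invented for illustration, and `parse_number()` is swapped in for `separate(..., convert = TRUE)`, so edge cases such as non-numeric entries may behave differently.

```r
library(tidyverse)

# Toy lab results (invented values) with the unit still attached to each number
toy_results <- tribble(
  ~product,     ~thc,     ~cbd,    ~cbg,
  "FLW Sample", "24.5 %", "0.1 %", "0.8 %",
  "EDI Sample", "100 mg", "5 mg",  "0 mg"
)

# Columns that hold "<number> <unit>" strings
cannabinoid_cols <- c("thc", "cbd", "cbg")

toy_numeric <- toy_results %>%
  # Keep the unit once, taken from the text after the last space in thc
  mutate(unit = str_extract(thc, "[^ ]+$")) %>%
  # Strip the unit from every cannabinoid column and convert to numeric
  mutate(across(all_of(cannabinoid_cols), parse_number))

toy_numeric
```

Listing the columns in one vector keeps the pipeline short and makes it harder to miss a cannabinoid when a new column is added to the report.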
## Statistical Analysis/Plotting
The most common question asked in dispensaries is:
***"What's your highest testing flower?"***
As cannabis educators, budtenders can explain to consumers the benefits of choosing among different cannabinoids while still getting a full high. Many consumers believe that higher THC means a more effective product; however, some studies suggest that the presence of other cannabinoids such as CBD, CBN, and CBG leads to a fuller-bodied high, known as the entourage effect. The cannabis industry serves consumers ranging from recreational users to patients managing serious pain from medical conditions. To maintain a successful business, it is important to stock products that contain many types of cannabinoids at a range of testing levels.
The graphs below are intended to show the distribution of cannabinoids by product type in the current inventory.
```{r}
#| label: plots
#| warning: FALSE
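# Products whose test results are measured in mg
# (assumes the unit column holds the literal strings "mg" and "%")
cannabinoids_mg <- CurrentInvAllLicense_Cannabis %>%
  filter(`unit (%/mg)` == "mg") %>%
  group_by(`unit (%/mg)`, category_abrv)
# Products whose test results are measured in %
cannabinoids_perc <- CurrentInvAllLicense_Cannabis %>%
  filter(`unit (%/mg)` == "%") %>%
  group_by(`unit (%/mg)`, category_abrv)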
```
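One possible way to summarise and plot cannabinoid levels by product type is sketched below on made-up data. The `toy_cannabis` tibble, its values, the `FLW`/`PRE`/`EDI` prefixes, and the `unit` column name are all hypothetical stand-ins for the cleaned lab results; the real analysis may need different groupings or plot types.

```r
library(tidyverse)

# Toy cleaned results (invented values); prefixes are hypothetical product types
toy_cannabis <- tribble(
  ~category_abrv, ~unit, ~thc,  ~cbd,
  "FLW",          "%",   24.5,  0.1,
  "FLW",          "%",   19.0,  0.3,
  "PRE",          "%",   22.0,  0.2,
  "EDI",          "mg",  100,   5,
  "EDI",          "mg",  50,    50
)

# One row per product and cannabinoid, which suits grouping and faceting
toy_long <- toy_cannabis %>%
  pivot_longer(cols = c(thc, cbd),
               names_to = "cannabinoid",
               values_to = "value")

# Mean test result and count by unit, product type, and cannabinoid
toy_summary <- toy_long %>%
  group_by(unit, category_abrv, cannabinoid) %>%
  summarise(mean_value = mean(value, na.rm = TRUE),
            n = n(),
            .groups = "drop")

# Distribution of results for products measured in %, one panel per cannabinoid
perc_plot <- toy_long %>%
  filter(unit == "%") %>%
  ggplot(aes(x = category_abrv, y = value)) +
  geom_boxplot() +
  facet_wrap(~cannabinoid, scales = "free_y") +
  labs(x = "Product type", y = "Test result (%)")

toy_summary
perc_plot
```

Keeping the `%` and `mg` products in separate summaries and plots avoids comparing values on two different scales.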
:::