Homework 3

hw3

Author

Michaela Bowen

Published

November 30, 2022

Code

library(tidyverse)
library(lubridate)
library(ggplot2)
library(readr)
library(readxl)
library(summarytools)

knitr::opts_chunk$set(echo = TRUE)

Homework Two Summary
Homework Three

In Homework two, I read in the inventory report for all data including items with 0 quantity, that had not yet been cleared from inventory. My goal was to manipulate the data into a workable form in which I could explore and analyze a research question for the data I had available to me. I did this by using several mutates in order to give the need columns, variables and instances.

From this data I was able to interpret that there are 6741 instances of 12 variables, meaning these current products are on the Sales Floor Room, Quarantine room, or vault.

Code

Current_Inventory <- read_csv("../posts/2022-11-16 Export inventory.csv",
                                         skip = 1,
                                         col_names = c("delete","product","package ID","batch","available","room","price","cost","category","expiration_date","thc","isCannabis"))%>%
  select(!contains("delete"))%>%
  separate(col = "thc", into = c("thc","unit_thc (%/mg)"), sep = " ")%>%
  mutate(thc = as.numeric(thc),
        isCannabis = case_when(
           isCannabis == "Yes" ~ TRUE,
           isCannabis == "No" ~ FALSE
         ))%>%
  mutate(expiration_date = na_if(expiration_date, "Invalid date"),
         expiration_date = mdy(expiration_date))%>%
  mutate(
    date = as.Date(expiration_date),
    day = day(expiration_date))%>%
  mutate(
    format_date = format(date, "%m/%d/%Y")
  )%>%
  arrange(desc(room), category, product)

Cleaning Current Inventory

Here I am reading in the Lab results from the current inventory across 3 License, Worcester Adult Use and Medical, and Northampton Adult Use. The most commonly recognized cannabinoids are THC, CBD, CBG, CBN, and CBC, others usually are present in negligible amounts or are sub variation such as THCa or CBDa. For the purposes of this project I have only included the main cannabinoids and their active sub variation which is noted with an (a) at the end of the abbreviation, as those are the most common in the products we sell.

With testing information separated from units, my goal is to combine like products and extract statistical analysis on the testing information by type of product

Code

CurrentInvAllLicense_orig <- read_xlsx("../Homework 3/Current Inventory Lab Results Report 11_29_2022-12_6_2022.xlsx",
                                         skip = 4,
                                         col_names = c("license","room","product","category","package id","delete","delete1", "batch","cbd","thc","delete3","delete2","cbc","cbca","delete4","cbda","delete5","delete6","delete7","cbg","cbga","delete8","delete9","cbn","cbna","delete10","delete11","thca","thc8","thc9","delete","thcv","thcva"))%>%
  select(!contains("delete"))%>%
  filter(!license == "Sandbox")

Error: `path` does not exist: '../Homework 3/Current Inventory Lab Results Report 11_29_2022-12_6_2022.xlsx'

Code

#here I will filter out accessories, clothing, employee product and high times cannabis cup items, as our testing analysis does not need them
CurrentInvAllLicense_Cannabis <- CurrentInvAllLicense_orig%>%
  filter(!category == "Accessories" & !category == "Clothes")%>%
  mutate(
    category_abrv = substr(product,1,3)
  )%>%
  relocate(category_abrv, .before = category)%>%
  filter(!category_abrv == "EMP" & !category_abrv == "SAM" & !category_abrv == "HTC")%>%
#here I have separate the units out for all of the testing results in order to have clean analysis  
  separate(col = "thc", into = c("thc","unit (%/mg)"), sep = " ", convert = TRUE)%>%
  separate(col = "cbd", into = c("cbd",NA), sep = " ", convert = TRUE)%>%
  separate(col = "cbc", into = c("cbc",NA), sep = " ", convert = TRUE)%>%
  separate(col = "cbca", into = c("cbca",NA), sep = " ", convert = TRUE)%>%
  separate(col = "cbda", into = c("cbda",NA), sep = " ", convert = TRUE)%>%
  separate(col = "cbg", into = c("cbg",NA), sep = " ", convert = TRUE)%>%
  separate(col = "cbga", into = c("cbga",NA), sep = " ", convert = TRUE)%>%
  separate(col = "cbn", into = c("cbn",NA), sep = " ", convert = TRUE)%>%
  separate(col = "cbna", into = c("cbna",NA), sep = " ", convert = TRUE)%>%
  separate(col = "thca", into = c("thca",NA), sep = " ", convert = TRUE)%>%
  separate(col = "thc8", into = c("thc8",NA), sep = " ", convert = TRUE)%>%
  separate(col = "thc9", into = c("thc9",NA), sep = " ", convert = TRUE)%>%
  separate(col = "thcv", into = c("thcv",NA), sep = " ", convert = TRUE)%>%
  separate(col = "thcva", into = c("thcva",NA), sep = " ", convert = TRUE)%>%
  relocate("unit (%/mg)", .after = batch)

Error in filter(., !category == "Accessories" & !category == "Clothes"): object 'CurrentInvAllLicense_orig' not found

Statistical Analysis/Plotting

The number one most common question asked in dispensaries is:

“What’s your highest testing flower?”

As cannabis educators, budtenders are able to inform consumers about the benefits of differentiating the types of cannabinoids they can consume, while still maintaining a full high. Many consumers believe, that the higher the THC the more effective, however studies have shown that the presence of different cannbinoids such as CBD, CBN, and CBG lead to a more full bodied high called the entourage effect. The cannabis industry deals with consumers ranging from recreational, to serious pain management for medical conditions. In order to maintain a successful business it is important to have products containing many types of cannabinoids as well as varying testing levels.

The graphs below are intended to demonstrate the distribution of cannabinoids by product type that are present in the current inventory.

Code


#chart continaing products that are measured in mg
cannabionids_mg <- CurrentInvAllLicense_Cannabis%>%
  group_by(`unit (%/mg)`, category_abrv)%>%
  

#chart containing products that are measured in %  
cannabinoids_perc <-

Error: <text>:11:0: unexpected end of input
9: 
10: 
   ^

--- title: "Homework 3" author: "Michaela Bowen" desription: "Data Wrangling/Vizualization" date: "11/30/2022" format: html: toc: true code-fold: true code-copy: true code-tools: true categories: - hw3 --- ```{r} #| label: setup #| warning: false library(tidyverse) library(lubridate) library(ggplot2) library(readr) library(readxl) library(summarytools) knitr::opts_chunk$set(echo = TRUE) ``` ::: panel-tabset # Homework Two Summary In Homework two, I read in the inventory report for all data including items with 0 quantity, that had not yet been cleared from inventory. My goal was to manipulate the data into a workable form in which I could explore and analyze a research question for the data I had available to me. I did this by using several mutates in order to give the need columns, variables and instances. From this data I was able to interpret that there are 6741 instances of 12 variables, meaning these current products are on the Sales Floor Room, Quarantine room, or vault. ```{r, message = FALSE, warning= FALSE} #| label: Homework 2 Readin #| warning: FALSE Current_Inventory <- read_csv("../posts/2022-11-16 Export inventory.csv", skip = 1, col_names = c("delete","product","package ID","batch","available","room","price","cost","category","expiration_date","thc","isCannabis"))%>% select(!contains("delete"))%>% separate(col = "thc", into = c("thc","unit_thc (%/mg)"), sep = " ")%>% mutate(thc = as.numeric(thc), isCannabis = case_when( isCannabis == "Yes" ~ TRUE, isCannabis == "No" ~ FALSE ))%>% mutate(expiration_date = na_if(expiration_date, "Invalid date"), expiration_date = mdy(expiration_date))%>% mutate( date = as.Date(expiration_date), day = day(expiration_date))%>% mutate( format_date = format(date, "%m/%d/%Y") )%>% arrange(desc(room), category, product) ``` # Homework Three ## Cleaning Current Inventory Here I am reading in the Lab results from the current inventory across 3 License, Worcester Adult Use and Medical, and Northampton Adult Use. The most commonly recognized cannabinoids are THC, CBD, CBG, CBN, and CBC, others usually are present in negligible amounts or are sub variation such as THCa or CBDa. For the purposes of this project I have only included the main cannabinoids and their active sub variation which is noted with an (a) at the end of the abbreviation, as those are the most common in the products we sell. With testing information separated from units, my goal is to combine like products and extract statistical analysis on the testing information by type of product ```{r, message=FALSE} #| label: cleaning and arranging #| warning: FALSE CurrentInvAllLicense_orig <- read_xlsx("../Homework 3/Current Inventory Lab Results Report 11_29_2022-12_6_2022.xlsx", skip = 4, col_names = c("license","room","product","category","package id","delete","delete1", "batch","cbd","thc","delete3","delete2","cbc","cbca","delete4","cbda","delete5","delete6","delete7","cbg","cbga","delete8","delete9","cbn","cbna","delete10","delete11","thca","thc8","thc9","delete","thcv","thcva"))%>% select(!contains("delete"))%>% filter(!license == "Sandbox") #here I will filter out accessories, clothing, employee product and high times cannabis cup items, as our testing analysis does not need them CurrentInvAllLicense_Cannabis <- CurrentInvAllLicense_orig%>% filter(!category == "Accessories" & !category == "Clothes")%>% mutate( category_abrv = substr(product,1,3) )%>% relocate(category_abrv, .before = category)%>% filter(!category_abrv == "EMP" & !category_abrv == "SAM" & !category_abrv == "HTC")%>% #here I have separate the units out for all of the testing results in order to have clean analysis separate(col = "thc", into = c("thc","unit (%/mg)"), sep = " ", convert = TRUE)%>% separate(col = "cbd", into = c("cbd",NA), sep = " ", convert = TRUE)%>% separate(col = "cbc", into = c("cbc",NA), sep = " ", convert = TRUE)%>% separate(col = "cbca", into = c("cbca",NA), sep = " ", convert = TRUE)%>% separate(col = "cbda", into = c("cbda",NA), sep = " ", convert = TRUE)%>% separate(col = "cbg", into = c("cbg",NA), sep = " ", convert = TRUE)%>% separate(col = "cbga", into = c("cbga",NA), sep = " ", convert = TRUE)%>% separate(col = "cbn", into = c("cbn",NA), sep = " ", convert = TRUE)%>% separate(col = "cbna", into = c("cbna",NA), sep = " ", convert = TRUE)%>% separate(col = "thca", into = c("thca",NA), sep = " ", convert = TRUE)%>% separate(col = "thc8", into = c("thc8",NA), sep = " ", convert = TRUE)%>% separate(col = "thc9", into = c("thc9",NA), sep = " ", convert = TRUE)%>% separate(col = "thcv", into = c("thcv",NA), sep = " ", convert = TRUE)%>% separate(col = "thcva", into = c("thcva",NA), sep = " ", convert = TRUE)%>% relocate("unit (%/mg)", .after = batch) ``` ## Statistical Analysis/Plotting The number one most common question asked in dispensaries is: ***"What's your highest testing flower?"*** As cannabis educators, budtenders are able to inform consumers about the benefits of differentiating the types of cannabinoids they can consume, while still maintaining a full high. Many consumers believe, that the higher the THC the more effective, however studies have shown that the presence of different cannbinoids such as CBD, CBN, and CBG lead to a more full bodied high called the entourage effect. The cannabis industry deals with consumers ranging from recreational, to serious pain management for medical conditions. In order to maintain a successful business it is important to have products containing many types of cannabinoids as well as varying testing levels. The graphs below are intended to demonstrate the distribution of cannabinoids by product type that are present in the current inventory. ```{r} #| label: plots #| warning: FALSE #chart continaing products that are measured in mg cannabionids_mg <- CurrentInvAllLicense_Cannabis%>% group_by(`unit (%/mg)`, category_abrv)%>% #chart containing products that are measured in % cannabinoids_perc <- ``` :::