Sasi Tansaraviput HW6

Draft for the final paper

Sasi Tansaraviput
2022-05-10

Introduction

Plant-based diet consumption has grown significantly following the rise in health awareness among consumers. Because most consumers want plant-based products that imitate the taste and texture of conventional ones, the food science community has taken an interest in developing plant-based foods. However, the plant-based sausages that are commercially available are still lacking. To develop more meat-like plant-based sausages, the textural attributes of conventional sausages should be considered. Using texture profile analysis (TPA), I gathered texture data from commercially available plant-based hotdogs and sausages and from a comparable number of conventional meat products. I then analyzed the data with analysis of variance (ANOVA) and principal component analysis (PCA) biplots to answer the research questions.

Research questions

To develop an accurate imitation of conventional meat hotdogs and sausages, two questions need to be answered in this study. The first is “How do the texture attributes of plant-based products differ from those of meat-based ones?” The second is “Which commercially available plant protein, among vital wheat gluten, pea protein, and soy protein, best imitates the texture of meat-based hotdogs and Italian sausages?”

Data

These data were gathered from research on plant-based hotdogs and sausages using texture profile analysis (TPA). Seven textural attributes were collected: Hardness, Adhesiveness, Resilience, Cohesion, Springiness, Gumminess, and Chewiness.
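In TPA, gumminess and chewiness are conventionally derived from the primary attributes: gumminess is hardness times cohesion, and chewiness is gumminess times springiness. The minimal sketch below checks that relationship against the first sample shown in the str(TPA) output of this draft. It assumes springiness is recorded as a percentage, and because the displayed values are rounded, only approximate agreement with the recorded Gumminess (2292) and Chewiness (2107) is expected.

```r
# First BallPark Beef Franks sample, values as displayed by str(TPA) (rounded)
hardness    <- 3820
cohesion    <- 0.6
springiness <- 92   # assumption: recorded as a percentage

# Derived TPA attributes
gumminess <- hardness * cohesion             # about 2292
chewiness <- gumminess * springiness / 100   # about 2109

gumminess
chewiness
```

Because gumminess and chewiness are products of the other attributes, they are expected to be strongly correlated with hardness in the later analyses.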

Read in dataset

library(readxl)
library(dplyr)  # provides %>%, mutate(), group_by(), and summarise_at() used below
TPA <- read_excel("TPA_all.xlsx")
View(TPA)

#delete diameter
TPA <- TPA[,c("Product","Brand","Type","Protein","Component","Sampling","Hardness","Adhesiveness","Resilence","Cohesion","Springiness","Gumminess","Chewiness")]

#change type of sausages
TPA <- TPA %>% dplyr::mutate(Type = ifelse(as.character(Type) != "HD","IT",as.character(Type)))

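The data were read with readxl and cleaned with dplyr: the unnecessary diameter column was dropped, and every sausage type other than "HD" (hotdog) was relabeled "IT" (Italian sausage) so that the Type variable is uniform.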
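The same two cleaning steps can be sketched on a small made-up table. The column name Diameter and the non-"HD" type labels below are assumptions for illustration, since the raw spreadsheet is not shown here.

```r
library(dplyr)

# Made-up stand-in for the raw spreadsheet (names and values are illustrative)
toy <- tibble(
  Product  = c("P1", "P2", "P3"),
  Type     = c("HD", "Mild Italian", "Hot Italian"),
  Diameter = c(22, 30, 31),
  Hardness = c(3800, 5200, 6100)
)

cleaned <- toy %>%
  select(-Diameter) %>%                              # drop the unused column
  mutate(Type = if_else(Type == "HD", "HD", "IT"))   # anything not "HD" becomes "IT"

cleaned
```

Dropping a column by name with select() avoids having to list every column that is kept.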

#Explain the variables
str(TPA)
tibble [145 x 13] (S3: tbl_df/tbl/data.frame)
 $ Product     : chr [1:145] "BallPark Beef Franks" "BallPark Beef Franks" "BallPark Beef Franks" "BallPark Beef Franks" ...
 $ Brand       : chr [1:145] "Ballpark" "Ballpark" "Ballpark" "Ballpark" ...
 $ Type        : chr [1:145] "HD" "HD" "HD" "HD" ...
 $ Protein     : chr [1:145] "M" "M" "M" "M" ...
 $ Component   : chr [1:145] "B" "B" "B" "B" ...
 $ Sampling    : num [1:145] 1 2 3 4 1 2 3 4 5 6 ...
 $ Hardness    : num [1:145] 3820 4522 3653 3594 2886 ...
 $ Adhesiveness: num [1:145] -11.35 -10.63 -11.91 -8.65 -8.82 ...
 $ Resilence   : num [1:145] 32.2 34.2 34.1 25.3 40.6 ...
 $ Cohesion    : num [1:145] 0.6 0.679 0.607 0.468 0.726 0.743 0.724 0.738 0.721 0.746 ...
 $ Springiness : num [1:145] 92 88.2 90.4 86.8 93.3 ...
 $ Gumminess   : num [1:145] 2292 3072 2217 1683 2094 ...
 $ Chewiness   : num [1:145] 2107 2710 2005 1461 1953 ...

The TPA data have 145 observations of 13 variables. Product (Variable 1), Brand (Variable 2), Type (Variable 3), Protein (Variable 4), and Component (Variable 5) are character variables, while Sampling (Variable 6) is stored as numeric and holds whole-number sample indices. The remaining variables, i.e. Hardness (Variable 7), Adhesiveness (Variable 8), Resilence (Variable 9, the dataset's spelling of resilience), Cohesion (Variable 10), Springiness (Variable 11), Gumminess (Variable 12), and Chewiness (Variable 13), are numeric.

Statistical Analysis of the data and visualizations

Mean, median, and standard deviations of each hotdog and sausage

TPA_mean <- TPA %>% group_by(Product,Type,Protein,Component) %>% summarise_at(vars(Hardness:Chewiness), list(mean = mean))
#round to 3 digits
TPA_mean <- mutate_if(TPA_mean , is.numeric, round, 3)

TPA_med <- TPA %>% group_by(Product,Type,Protein,Component) %>% summarise_at(vars(Hardness:Chewiness), list(med = median))
#round to 3 digits
TPA_med <- mutate_if(TPA_med , is.numeric, round, 3)
View(TPA_med)

TPA_sd <- TPA %>% group_by(Product,Type,Protein,Component) %>% summarise_at(vars(Hardness:Chewiness), list(sd = sd))
#round to 3 digits
TPA_sd <- mutate_if(TPA_sd , is.numeric, round, 3)
View(TPA_sd)

These three tables give the mean, median, and standard deviation of each texture attribute for every hotdog and Italian sausage product.
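As an alternative to three separate summarise_at() calls, the three statistics could be computed in a single pass with dplyr's across(). This is only a sketch on made-up data with two attributes; the grouping and column names would need to match the real TPA table.

```r
library(dplyr)

# Made-up stand-in for TPA (values are illustrative only)
toy <- tibble(
  Product  = rep(c("A", "B"), each = 3),
  Hardness = c(10, 12, 14, 20, 22, 27),
  Cohesion = c(0.5, 0.6, 0.7, 0.4, 0.4, 0.7)
)

toy_summary <- toy %>%
  group_by(Product) %>%
  summarise(
    # One output column per attribute and statistic, e.g. Hardness_mean
    across(Hardness:Cohesion, list(mean = mean, med = median, sd = sd)),
    .groups = "drop"
  ) %>%
  mutate(across(where(is.numeric), ~ round(.x, 3)))   # round to 3 digits

toy_summary
```

The resulting columns follow the same attribute_statistic naming (for example Hardness_mean) that the ANOVA code in this draft relies on.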

ANOVA and Tukey’s post hoc test

Hotdogs dataset

Hotdogs and Italian sausages should be analyzed separately, so I wrote this code to separate them using data-wrangling operations. First, I analyzed the hotdogs dataset.

#For hotdogs
HD <- TPA_mean %>%
  filter(Type == "HD")

HD_M <- HD %>%
  filter(Protein == "M")
  
  
#Hardness
A_M_Hardness_HD <- aov(Hardness_mean ~ Component, data = HD_M)
summary(A_M_Hardness_HD)
            Df  Sum Sq Mean Sq F value Pr(>F)
Component    3 3747142 1249047   0.483  0.728
Residuals    2 5168328 2584164               
#Adhesiveness
A_M_Adhesiveness_HD <- aov(Adhesiveness_mean ~ Component, data = HD_M)
summary(A_M_Adhesiveness_HD)
            Df Sum Sq Mean Sq F value Pr(>F)
Component    3  570.4   190.1   1.375  0.447
Residuals    2  276.5   138.3               
#Resilence
A_M_Resilence_HD <- aov(Resilence_mean ~ Component, data = HD_M)
summary(A_M_Resilence_HD)
            Df Sum Sq Mean Sq F value Pr(>F)
Component    3  35.03   11.68   1.154  0.495
Residuals    2  20.24   10.12               
#Cohesion
A_M_Cohesion_HD <- aov(Cohesion_mean ~ Component, data = HD_M)
summary(A_M_Cohesion_HD)
            Df   Sum Sq  Mean Sq F value Pr(>F)
Component    3 0.016121 0.005374   4.376  0.192
Residuals    2 0.002456 0.001228               
#Springiness
A_M_Springiness_HD <- aov(Springiness_mean ~ Component, data = HD_M)
summary(A_M_Springiness_HD)
            Df Sum Sq Mean Sq F value Pr(>F)  
Component    3  37.68  12.561   23.96 0.0403 *
Residuals    2   1.05   0.524                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Gumminess
A_M_Gumminess_HD <- aov(Gumminess_mean ~ Component, data = HD_M)
summary(A_M_Gumminess_HD)
            Df  Sum Sq Mean Sq F value Pr(>F)
Component    3  705663  235221   0.173  0.906
Residuals    2 2716263 1358132               
#Chewiness
A_M_Chewiness_HD <- aov(Chewiness_mean ~ Component, data = HD_M)
summary(A_M_Chewiness_HD)
            Df  Sum Sq Mean Sq F value Pr(>F)
Component    3  383833  127944   0.126  0.937
Residuals    2 2032661 1016331               

After separating the hotdogs from the Italian sausages, analysis of variance (ANOVA) was used to determine whether the plant-based hotdogs should be compared with the individual components of meat-based hotdogs. From the ANOVA results, the only attribute that differs significantly among the meat-based hotdog components at the 5% significance level is springiness (p = 0.0403).
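The seven near-identical aov() calls could also be written as a loop that fits one one-way ANOVA per attribute and collects the p-values. The sketch below uses made-up data with two attributes and two components; the data frame and its numbers are assumptions, not the study data.

```r
# Made-up stand-in for HD_M: six products, two components
toy <- data.frame(
  Component        = rep(c("B", "CP"), each = 3),
  Hardness_mean    = c(3800, 4100, 3950, 3900, 4000, 4050),
  Springiness_mean = c(88, 89, 87, 93, 94, 95)
)

attrs <- c("Hardness_mean", "Springiness_mean")

# Fit one one-way ANOVA per attribute and pull out the p-value for Component
p_values <- sapply(attrs, function(attr_name) {
  fit <- aov(reformulate("Component", response = attr_name), data = toy)
  summary(fit)[[1]][["Pr(>F)"]][1]
})

# Attributes whose Component effect is significant at the 5% level
significant_attrs <- names(p_values)[p_values < 0.05]
significant_attrs
```

Collecting the p-values in one named vector makes it easier to see at a glance which attributes warrant a post hoc test.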

TukeyHSD(A_M_Springiness_HD)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Springiness_mean ~ Component, data = HD_M)

$Component
           diff         lwr       upr     p adj
BP-B   2.804667 -2.98819995  8.597533 0.1871227
CP-B   5.373667 -0.41919995 11.166533 0.0577043
TCP-B  5.780667 -0.01219995 11.573533 0.0502022
CP-BP  2.569000 -4.52578368  9.663784 0.2966390
TCP-BP 2.976000 -4.11878368 10.070784 0.2365905
TCP-CP 0.407000 -6.68778368  7.501784 0.9739568
TK_M_Springiness_HD <- TukeyHSD(A_M_Springiness_HD)
plot(TK_M_Springiness_HD, las = 1)

Tukey’s post hoc test was then conducted to determine how the components differ. No pairwise comparison reaches significance at the 5% level: the closest is the turkey-chicken-pork blend (TCP) versus beef (B), with an adjusted p-value of 0.0502 and a confidence interval that narrowly includes zero. Thus, it was determined that plant-based hotdogs should be compared with meat-based hotdogs overall.
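The adjusted p-values in the Tukey output above can be screened against the 5% threshold programmatically rather than by eye. The sketch below copies the printed values into a named vector; with the fitted object, the same vector would be available as TK_M_Springiness_HD$Component[, "p adj"].

```r
# Adjusted p-values copied from the TukeyHSD output above
p_adj <- c(
  "BP-B"   = 0.1871227,
  "CP-B"   = 0.0577043,
  "TCP-B"  = 0.0502022,
  "CP-BP"  = 0.2966390,
  "TCP-BP" = 0.2365905,
  "TCP-CP" = 0.9739568
)

alpha <- 0.05

# Pairs that are significant at the chosen level (none for these values)
significant_pairs <- names(p_adj)[p_adj < alpha]

# Pair with the smallest adjusted p-value
closest_pair <- names(which.min(p_adj))

significant_pairs
closest_pair
```

For these values the closest pair is TCP-B, whose adjusted p-value sits just above the threshold.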

Analysis of variance between individual plant-based hotdogs’ components and overall meat hotdog

#Anova for HD
#Hardness
A_HD_hardness <- aov(Hardness_mean ~ Component, data = HD)
summary(A_HD_hardness)
            Df   Sum Sq Mean Sq F value Pr(>F)
Component    6 11693315 1948886    1.02  0.502
Residuals    5  9553978 1910796               
#Adhesiveness
A_HD_Adhesiveness <- aov(Adhesiveness_mean ~ Component, data = HD)
summary(A_HD_Adhesiveness)
            Df Sum Sq Mean Sq F value Pr(>F)
Component    6   5501   916.8    2.43  0.174
Residuals    5   1887   377.3               
#Resilence
A_HD_Resilence <- aov(Resilence_mean ~ Component, data = HD)
summary(A_HD_Resilence)
            Df Sum Sq Mean Sq F value Pr(>F)
Component    6  209.4   34.90   1.569  0.319
Residuals    5  111.2   22.24               
#Cohesion
A_HD_Cohesion <- aov(Cohesion_mean ~ Component, data = HD)
summary(A_HD_Cohesion)
            Df  Sum Sq  Mean Sq F value Pr(>F)
Component    6 0.02789 0.004648   2.128  0.212
Residuals    5 0.01092 0.002184               
#Springiness
A_HD_Springiness <- aov(Springiness_mean ~ Component, data = HD)
summary(A_HD_Springiness)
            Df Sum Sq Mean Sq F value Pr(>F)
Component    6 175.40   29.23   2.656  0.152
Residuals    5  55.03   11.01               
#Gumminess
A_HD_Gumminess <- aov(Gumminess_mean ~ Component, data = HD)
summary(A_HD_Gumminess)
            Df  Sum Sq Mean Sq F value Pr(>F)
Component    6 3572344  595391   0.562  0.749
Residuals    5 5300381 1060076               
#Chewiness
A_HD_Chewiness <- aov(Chewiness_mean ~ Component, data = HD)
summary(A_HD_Chewiness)
            Df  Sum Sq Mean Sq F value Pr(>F)
Component    6 3385321  564220   0.644  0.698
Residuals    5 4381654  876331               
#Tukey for HD
TK_HD_Adhesiveness <- TukeyHSD(A_HD_Adhesiveness)
plot(TK_HD_Adhesiveness, las = 1)

ANOVA was conducted to compare each attribute between the plant-based hotdog components and the meat hotdogs. In the output above, no attribute reaches significance at the 5% level; springiness (p = 0.152) and adhesiveness (p = 0.174) come closest. Tukey’s post hoc test was then run on adhesiveness to examine the pairwise differences. In the Tukey plot of adhesiveness, only the soy protein hotdogs stand apart from the meat hotdogs, which suggests that pea protein and vital wheat gluten imitate the adhesiveness of meat hotdogs better than soy protein does. Because the overall F-test is not significant, this should be read as a tendency rather than a confirmed difference.
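The section heading describes a comparison against meat hotdogs as a single group, whereas the 6 degrees of freedom for Component in the tables above indicate seven separate Component levels. If pooling the meat products is the intent, one possible sketch is to recode every non-plant component to a single label before fitting. The data below are made up, and the helper name pool_meat and the pooled label "M" are assumptions for illustration.

```r
# Hypothetical helper: keep plant-protein codes, pool everything else as "M"
pool_meat <- function(component, plant = c("PP", "SP", "VWG")) {
  ifelse(component %in% plant, component, "M")
}

# Made-up stand-in for the hotdog means
toy <- data.frame(
  Component = c("B", "BP", "CP", "TCP", "PP", "PP", "SP", "SP", "VWG", "VWG"),
  Adhesiveness_mean = c(-10, -12, -9, -11, -14, -16, -40, -44, -13, -15)
)
toy$Component <- factor(pool_meat(toy$Component))

n_groups <- nlevels(toy$Component)   # M, PP, SP, VWG

fit <- aov(Adhesiveness_mean ~ Component, data = toy)
summary(fit)             # Component now has n_groups - 1 degrees of freedom
tukey <- TukeyHSD(fit)   # includes plant-versus-meat pairs such as "SP-M"
tukey
```

With the meat products pooled, each plant protein is compared directly against one meat group, which matches the wording of the research questions more closely.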

Italian sausages dataset

#For Italian sausages
IT <- TPA_mean %>%
  filter(Type == "IT")

IT_M <- IT %>%
  filter(Protein == "M")
  
  
#Hardness
A_M_Hardness_IT <- aov(Hardness_mean ~ Component, data = IT_M)
summary(A_M_Hardness_IT)
            Df   Sum Sq Mean Sq F value Pr(>F)
Component    1   792104  792104    0.29   0.61
Residuals    6 16409919 2734986               
#Adhesiveness
A_M_Adhesiveness_IT <- aov(Adhesiveness_mean ~ Component, data = IT_M)
summary(A_M_Adhesiveness_IT)
            Df Sum Sq Mean Sq F value Pr(>F)
Component    1     12    12.0   0.026  0.877
Residuals    6   2750   458.4               
#Resilence
A_M_Resilence_IT <- aov(Resilence_mean ~ Component, data = IT_M)
summary(A_M_Resilence_IT)
            Df Sum Sq Mean Sq F value Pr(>F)
Component    1  49.31   49.31   1.493  0.268
Residuals    6 198.14   33.02               
#Cohesion
A_M_Cohesion_IT <- aov(Cohesion_mean ~ Component, data = IT_M)
summary(A_M_Cohesion_IT)
            Df  Sum Sq  Mean Sq F value Pr(>F)
Component    1 0.01594 0.015939   2.045  0.203
Residuals    6 0.04677 0.007795               
#Springiness
A_M_Springiness_IT <- aov(Springiness_mean ~ Component, data = IT_M)
summary(A_M_Springiness_IT)
            Df Sum Sq Mean Sq F value Pr(>F)  
Component    1  8.314   8.314   4.674 0.0739 .
Residuals    6 10.672   1.779                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Gumminess
A_M_Gumminess_IT <- aov(Gumminess_mean ~ Component, data = IT_M)
summary(A_M_Gumminess_IT)
            Df   Sum Sq Mean Sq F value Pr(>F)
Component    1  2870804 2870804   1.396  0.282
Residuals    6 12335681 2055947               
#Chewiness
A_M_Chewiness_IT <- aov(Chewiness_mean ~ Component, data = IT_M)
summary(A_M_Chewiness_IT)
            Df  Sum Sq Mean Sq F value Pr(>F)
Component    1 1685890 1685890   1.186  0.318
Residuals    6 8529505 1421584               

As for the Italian sausages, no attribute differs significantly between the meat-based sausage components at the 5% significance level, which means that plant-based sausages should also be compared with meat-based sausages overall.

Analysis of variance between individual plant-based sausages’ components and overall meat sausage

#Anova for IT
#Hardness
A_IT_hardness <- aov(Hardness_mean ~ Component, data = IT)
summary(A_IT_hardness)
            Df    Sum Sq  Mean Sq F value  Pr(>F)   
Component    4 124547244 31136811   10.29 0.00143 **
Residuals   10  30252951  3025295                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Adhesiveness
A_IT_Adhesiveness <- aov(Adhesiveness_mean ~ Component, data = IT)
summary(A_IT_Adhesiveness)
            Df Sum Sq Mean Sq F value Pr(>F)
Component    4   1899   474.7    1.41    0.3
Residuals   10   3366   336.6               
#Resilence
A_IT_Resilence <- aov(Resilence_mean ~ Component, data = IT)
summary(A_IT_Resilence)
            Df Sum Sq Mean Sq F value Pr(>F)  
Component    4  536.7  134.16   4.601 0.0229 *
Residuals   10  291.6   29.16                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Cohesion
A_IT_Cohesion <- aov(Cohesion_mean ~ Component, data = IT)
summary(A_IT_Cohesion)
            Df  Sum Sq Mean Sq F value Pr(>F)  
Component    4 0.14246 0.03561    5.93 0.0104 *
Residuals   10 0.06006 0.00601                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Springiness
A_IT_Springiness <- aov(Springiness_mean ~ Component, data = IT)
summary(A_IT_Springiness)
            Df Sum Sq Mean Sq F value   Pr(>F)    
Component    4  843.9  210.97   12.52 0.000661 ***
Residuals   10  168.5   16.85                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Gumminess
A_IT_Gumminess <- aov(Gumminess_mean ~ Component, data = IT)
summary(A_IT_Gumminess)
            Df   Sum Sq Mean Sq F value Pr(>F)  
Component    4 32987288 8246822    3.75  0.041 *
Residuals   10 21994209 2199421                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Chewiness
A_IT_Chewiness <- aov(Chewiness_mean ~ Component, data = IT)
summary(A_IT_Chewiness)
            Df   Sum Sq Mean Sq F value Pr(>F)  
Component    4 26188557 6547139    3.75  0.041 *
Residuals   10 17458701 1745870                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Tukey for IT
TK_IT_hardness <- TukeyHSD(A_IT_hardness)
plot(TK_IT_hardness, las = 1)
TK_IT_Resilence <- TukeyHSD(A_IT_Resilence)
plot(TK_IT_Resilence, las = 1)
TK_IT_Cohesion <- TukeyHSD(A_IT_Cohesion)
plot(TK_IT_Cohesion, las = 1)
TK_IT_Springiness <- TukeyHSD(A_IT_Springiness)
plot(TK_IT_Springiness, las = 1)
TK_IT_Gumminess <- TukeyHSD(A_IT_Gumminess)
plot(TK_IT_Gumminess, las = 1)
TK_IT_Chewiness <- TukeyHSD(A_IT_Chewiness)
plot(TK_IT_Chewiness, las = 1)

For the Italian sausages, the only attribute that does not differ significantly is adhesiveness.

Hardness: The soy protein-based sausages’ hardness is not significantly different from that of meat sausages, while vital wheat gluten-based sausages are not significantly different from pea protein and soy protein sausages. Additionally, although soy protein and pea protein sausages differ significantly in hardness, the difference is borderline. It can be inferred from this post hoc test that soy protein is appropriate for imitating the hardness of meat sausages.

In conclusion, based on the ANOVA tests, the best protein to use as the main component of plant-based Italian sausages is soy protein, as it can imitate all the texture attributes of meat-based sausages, while vital wheat gluten can only imitate the resilience, springiness, gumminess, and chewiness of conventional sausages. Additionally, the only texture attributes of pea protein sausages that are similar to those of meat sausages are resilience and cohesiveness.

Principal component analysis

The biplots below visualize every texture attribute, i.e. hardness, adhesiveness, resilience, cohesiveness, springiness, gumminess, and chewiness, and show how each hotdog and sausage is influenced by those attributes and how similar or different the products are to each other. This can be used to determine the best commercially used plant protein for imitating meat hotdogs and sausages.
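Before reading a biplot, it can help to check how much of the total variance the plotted components capture. The sketch below runs prcomp() on a small made-up table with three attributes; the values are illustrative and not taken from the study.

```r
# Made-up stand-in for the texture attribute means
toy <- data.frame(
  Hardness    = c(3800, 4100, 2900, 5200, 6100, 4500),
  Cohesion    = c(0.60, 0.68, 0.72, 0.45, 0.40, 0.55),
  Springiness = c(92, 88, 93, 80, 78, 85)
)

# Centering and scaling put attributes with different units on equal footing
pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Share of total variance captured by each principal component
explained <- pca$sdev^2 / sum(pca$sdev^2)

scores   <- pca$x          # product coordinates (the points on a biplot)
loadings <- pca$rotation   # attribute directions (the arrows on a biplot)

round(explained, 3)
```

If the first two components capture most of the variance, distances between products on the biplot are a reasonable summary of their overall texture similarity.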

#For hotdogs
HD <- HD %>% dplyr::mutate(Component = ifelse(as.character(Component) != "PP" & as.character(Component) != "SP"& as.character(Component) != "VWG","M",as.character(Component)))
HD <- HD %>% arrange(Protein)
HD$Product = c('M1','M2','M3','M4','M5','M6','PB1','PB2','PB3','PB4','PB5','PB6')

#For Italian sausages
IT <- IT %>% dplyr::mutate(Component = ifelse(as.character(Component) != "PP" & as.character(Component) != "SP"& as.character(Component) != "VWG","M",as.character(Component)))
IT <- IT %>% arrange(Protein)
IT$Product = c('M1','M2','M3','M4','M5','M6','M7','M8','PB1','PB2','PB3','PB4','PB5','PB6','PB7')

Principal component analysis of hotdogs

library(ggplot2)   # scale_color_discrete(), theme_classic(), theme()
library(ggbiplot)  # ggbiplot()
PCA_HD <- prcomp(HD[,c(5:11)], center = TRUE, scale. = TRUE)
ggbiplot(PCA_HD, choices = c(1,2), obs.scale = 1, var.scale = 1, ellipse = TRUE, labels = HD$Product, groups = HD$Component, varname.adjust = 1, varname.size = 2.5)+ scale_color_discrete(name = 'Protein Component') + theme_classic() + theme(legend.direction = 'horizontal', legend.position = 'bottom') 

Principal component analysis of Italian sausages

PCA_IT <- prcomp(IT[,c(5:11)], center = TRUE, scale. = TRUE)
ggbiplot(PCA_IT, obs.scale = 1, var.scale = 1.8, ellipse = TRUE, labels = IT$Product, groups = IT$Component, varname.adjust = 1, varname.size = 2.5)+ scale_color_discrete(name = 'Protein Component') + theme_classic() + theme(legend.direction = 'horizontal', legend.position = 'bottom')

Conclusion

Answer the questions

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Tansaraviput (2022, May 11). Data Analytics and Computational Social Science: Sasi Tansaraviput HW6. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomsnoutsnake900974/

BibTeX citation

@misc{tansaraviput2022sasi,
  author = {Tansaraviput, Sasi},
  title = {Data Analytics and Computational Social Science: Sasi Tansaraviput HW6},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomsnoutsnake900974/},
  year = {2022}
}