Blog Post 1

Author

Claire Battaglia

Published

September 18, 2022

Background

One application of analyzing text-as-data that I am particularly interested in is the analysis of open-text survey responses. Entities working in the food system development space are beginning to use what are becoming known as “community food assessments” more frequently. An important feature of many of these surveys is the open-text response, meaning that respondents can answer the question in their own words. In my experience, the analysis of these responses consists of reading through them (or some of them, depending on the number of responses) and perhaps identifying a theme that comes up once or twice, or even simply highlighting one or two responses that seemed particularly interesting. Not only is this approach potentially quite misleading, but it also likely misses rich, nuanced information. My goal is to learn several methods for analyzing those responses quantitatively.

Research Question

There are myriad research questions that interest me in my work in food system development. However, the dataset I plan to work with in this class comes from Missoula City-County’s inaugural Community Food Assessment, completed in 2021. In particular, I will focus on the responses to one (or more) of the following open-ended questions:

  1. What changes would you like to see for Missoula’s food system?
  2. In your opinion, what strengths and assets exist in Missoula’s food system (i.e. in what ways is the food system doing well)?
  3. In your opinion, what gaps or unmet needs exist in Missoula’s food system (i.e. in what ways could the food system do better)?

These particular questions are typical of open-ended questions asked on Community Food Assessments.

Articles

I haven’t been able to find any studies that apply quantitative and/or computational text analysis in my field of study, but I have found analogous work in other fields.

“Automated Quantitative Analysis of Open-Ended Survey Responses for Transportation Planning”1 describes a Naive Bayes approach to analyzing open-ended survey responses in transportation planning. In this study, 2,056 people responded to the Metro-North Customer Satisfaction survey, answering 58 closed-response questions and a single open-ended question: “If you are not satisfied with our performance in any of the areas in questions 1 through 58, please explain why below. Please also include any other comments or service suggestions.” The researchers then used a Naive Bayes classifier to label the open-ended responses according to the topics created by the forced-choice questions. This seems useful, although perhaps a bit simplistic.
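To make the labeling step concrete, here is a minimal sketch of that idea in Python using scikit-learn. The responses and topic labels below are hypothetical stand-ins (the study derived its labels from the 58 closed-response questions), and the original paper does not necessarily use this library.

```python
# Sketch of labeling open-ended responses with a Naive Bayes classifier.
# The training responses and topic labels are hypothetical examples only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled responses (each label is a topic drawn from a
# forced-choice question on the survey).
train_responses = [
    "Trains are frequently late during the evening commute.",
    "The cars are dirty and the seats are torn.",
    "Ticket prices keep going up every year.",
]
train_labels = ["on-time performance", "cleanliness", "fares"]

# Bag-of-words features fed to a multinomial Naive Bayes classifier.
model = make_pipeline(CountVectorizer(stop_words="english"), MultinomialNB())
model.fit(train_responses, train_labels)

# Assign the most probable topic label to a new open-ended response.
print(model.predict(["The evening trains are always late."]))
```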

“Structural Topic Models for Open-Ended Survey Responses”2 examines Structural Topic Models (STM) as a method of analyzing open-ended survey questions. This method allows topics of interest to be discovered in the responses, rather than assumed beforehand, and to be analyzed in relation to information about each respondent (such as demographic information). This seems extremely relevant for understanding our food systems, as individuals occupying the same geographic space (e.g. a neighborhood, a zip code, etc.) can perceive and occupy vastly different food systems.
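Below is a rough sketch of that general workflow, assuming hypothetical responses and a single made-up respondent covariate (neighborhood). STM itself, which models covariates directly, is implemented in the R stm package; this sketch substitutes ordinary LDA from scikit-learn and simply compares topic prevalence across the covariate afterward.

```python
# Illustrative only: discover topics in open-ended responses, then compare
# topic prevalence across a respondent covariate. Data are hypothetical.
import pandas as pd
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

responses = pd.DataFrame({
    "text": [
        "More community gardens and affordable produce downtown.",
        "Farmers markets are great but hard to reach without a car.",
        "We need better food access in low-income neighborhoods.",
        "Local farms and the co-op are real strengths.",
    ],
    "neighborhood": ["Northside", "Southside", "Northside", "Southside"],
})

# Fit a small topic model on bag-of-words counts.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(responses["text"])
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_shares = lda.fit_transform(counts)  # per-response topic proportions

# Compare average topic prevalence across the covariate.
responses[["topic_1", "topic_2"]] = topic_shares
print(responses.groupby("neighborhood")[["topic_1", "topic_2"]].mean())
```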

Questions

  1. Are there search terms that would help me narrow my search results to just those studies that use text analysis? I find it hard to believe that no one in the food system development space has thought to use the quantitative and/or computational analysis methods available to address the common problem of open-ended survey questions. It’s a relatively new space, however, and tends not to be populated by social scientists, so it’s possible.
  2. How should I word the survey questions to best facilitate the analysis method I want to use for the responses? I’ll be working with this survey in Survey Research Methods this semester as well and will examine how the questions can be improved.
  3. When using the Naive Bayes approach, as in the first study, what happens if the topics articulated in the open-ended responses don’t fit into any of the categories created by the forced-choice questions?

Methods to learn about:

  • Naive Bayes
  • Structural Topic Model (STM)
  • concept mapping – Is this a quantitative method?

Footnotes

  1. Severin, Karl, Swapna S. Gokhale, and Karthik C. Konduri. 2017. “Automated Quantitative Analysis of Open-Ended Survey Responses for Transportation Planning.” Paper presented at the IEEE Smart World Congress, San Francisco, August 4-8, 2017. https://doi.org/10.1109/UIC-ATC.2017.8397567

  2. Roberts, Margaret E., Brandon M. Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis, Shana Kushner Gadarian, Bethany Albertson, and David G. Rand. 2014. “Structural Topic Models for Open-Ended Survey Responses.” American Journal of Political Science 58 (4): 1064–1082. https://doi.org/10.1111/ajps.12103