Blog Post 1 - Literature Review

Author

Adithya Parupudi

Published

September 19, 2022

Articles Referred

Article 1 : Employing structural topic modelling to explore perceived service quality attributes in Airbnb accommodation

This study employs the structural topic model to extract service quality attributes from 242,020 Airbnb reviews in Malaysia. 22 service related topics were extracted from the corpus and four topics have not appeared in previous Airbnb studies. A widely used modified SERVQUAL questionnaire (MSQ) is cross-validated in this study by comparing its service quality attributes with the results of the topic modelling, which indicates that this MSQ can cover general Airbnb service quality attributes. This study also examines the different preferences of Malaysian and international Airbnb users and the changing patterns of the top six service quality attributes during a five-year period. The findings reveal that Malaysian Airbnb users care more about the appearance and location of the property, and international Airbnb users pay more attention to whether the property can accommodate a group of people. In addition, communication with the host is found to play an increasingly important role in Airbnb users’ lodging experiences.

Article 2 : Topic modelling for medical prescription fraud and abuse detection

Medical prescription fraud and abuse have been a pressing issue in the USA, resulting in large financial losses and adverse effects on human health. The size and complexity of the healthcare systems as well as the cost of medical audits make use of statistical methods necessary to generate investigative leads in prescription audits. We analyse prescriber–drug associations by utilizing the real world Medicare part D prescription data from New Hampshire. In particular, we propose the use of topic models to group drugs with respect to the billing patterns and exhibit the potential aberrant behaviours while using medical specialities as a covariate. The prescription patterns of the providers are retrieved with an emphasis on opioids and aggregated into distance-based measures which are visualized by concentration functions. This output can enable healthcare auditors to identify leads for audits of providers prescribing medically unnecessary drugs.

My Project Name

Topic Modelling of People’s Biographies

Research Questions

  1. Which areas of work do most of these people fall into?
  2. Where are they from?
  3. What is the percentage of men versus women in this list?
  4. In which time period did most of them live?
  5. What personality traits are shared by people in the same or similar professions?
  6. How do their personalities change before and after the major world wars?
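Questions 3 and 4 come down to simple counting once each biography has a gender label and a birth year attached. The sketch below is only an illustration of that arithmetic: the records, the field names (`gender`, `birth_year`) and the helper functions are all hypothetical placeholders, not values taken from the actual dataset.

```python
from collections import Counter

# Hypothetical placeholder records; the real fields would have to be
# extracted from the scraped biographies first.
people = [
    {"name": "person_a", "gender": "female", "birth_year": 1867},
    {"name": "person_b", "gender": "male", "birth_year": 1879},
    {"name": "person_c", "gender": "male", "birth_year": 1452},
    {"name": "person_d", "gender": "female", "birth_year": 1820},
]


def gender_percentages(records):
    """Return the share of each gender label as a percentage of all records."""
    counts = Counter(r["gender"] for r in records)
    total = sum(counts.values())
    return {gender: 100.0 * n / total for gender, n in counts.items()}


def century_of(year):
    """Map a year to its century, e.g. 1867 -> 19 (valid for years >= 1)."""
    return (year - 1) // 100 + 1


def most_common_century(records):
    """Return the (century, count) pair that occurs most often by birth year."""
    counts = Counter(century_of(r["birth_year"]) for r in records)
    return counts.most_common(1)[0]


print(gender_percentages(people))
print(most_common_century(people))
```

Binning by century is just one possible choice; decades or historical eras may turn out to fit the data better once it has been explored.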

Data Sources

I found a website (https://www.biographyonline.net/people/famous-100.html) that hosts biographies of numerous people, all compiled by “Tejvan Pettinger”. Some popular names did not have a biography there, so I pulled their data from their respective Wikipedia pages.
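As a rough idea of how the biography text could be separated from the rest of a page, here is a minimal sketch that uses only Python's standard `html.parser` module. It assumes the biography text sits inside `<p>` tags, which I have not verified for either site, and it runs on a small made-up HTML string instead of a downloaded page. The names `ParagraphExtractor` and `extract_paragraphs` are my own.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> tags and ignore everything else."""

    def __init__(self):
        super().__init__()
        self._in_p = False     # True while the parser is inside a <p> element
        self._buffer = []      # text fragments of the current paragraph
        self.paragraphs = []   # finished, whitespace-normalized paragraphs

    def _flush(self):
        # Join the fragments and collapse runs of whitespace into one space.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()      # handles a previous <p> that was never closed
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def extract_paragraphs(html):
    """Return the list of paragraph texts found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Made-up sample markup, standing in for a downloaded biography page.
sample_html = """
<html><body>
  <div class="menu">Home | About</div>
  <p>First paragraph of a <b>biography</b>.</p>
  <p>Second   paragraph.</p>
</body></html>
"""

print(extract_paragraphs(sample_html))
```

The menu text is dropped because it is not inside a paragraph tag. Real pages will likely need extra rules, for example skipping footers or advertisement blocks.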

Challenges

  1. Categorizing the data demographically is a difficult but interesting challenge.
  2. I have to study my data closely to get a feel for it. Since I wish to draw similarities between people, grouping the data well is highly important.
  3. There is a lot of unwanted data. I have never worked with regular expressions, so cleaning it will prove difficult. I want to avoid manual work as much as possible.
  4. Since topic modelling is an unsupervised learning method, what if my results do not match what I expect?
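For the third challenge, a few regular expressions may already remove much of the unwanted text. The sketch below is one possible starting point, not a finished cleaning pipeline. It assumes the noise consists of Wikipedia-style citation markers such as `[4]` or `[citation needed]`, plus digits and punctuation; the sample sentence and the function name `clean_text` are invented for illustration.

```python
import re

# Wikipedia-style markers such as [12] or [citation needed].
CITATION = re.compile(r"\[(?:\d+|citation needed)\]")
# Anything that is not a lowercase ASCII letter or whitespace.
# Note: this also drops digits (years) and accented letters in names.
NON_LETTERS = re.compile(r"[^a-z\s]")
WHITESPACE = re.compile(r"\s+")


def clean_text(raw):
    """Strip citation markers, lowercase, and keep only letters and single spaces."""
    text = CITATION.sub("", raw)
    text = text.lower()
    text = NON_LETTERS.sub(" ", text)
    return WHITESPACE.sub(" ", text).strip()


# Invented sample sentence, not taken from any real biography.
sample = "The author was born in 1900.[4]  Later,[citation needed] the family moved abroad."

print(clean_text(sample))
```

Because this pattern throws away years, any date information needed for the research questions would have to be extracted before this cleaning step.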

References

  1. Employing structural topic modelling to explore perceived service quality attributes in Airbnb accommodation

    URL : https://www.sciencedirect.com/science/article/pii/S0278431920302280?casa_token=XQjDpTgrM9AAAAAA:S_xRt5K-nQOIq6lh_Jm2Y7s2o5fmXgHjEfE1-4Hxk4717DKgSO-NY_gzwI97NKS2WQ0Wcd1l4Q

  2. Topic modelling for medical prescription fraud and abuse detection

    URL : https://rss.onlinelibrary.wiley.com/doi/epdf/10.1111/rssc.12332?author_access_token=prBEhp2cDXGPLt4-epz2nIta6bR2k8jH0KrdpFOxC67l8xg1ezJ6-z57QpMwRqse7KEvVg3w8lYDPJLZNCMapedziz1EMse3IkqLSJsWsWzA8jonTCuGjMLwUeuZfT0b

  3. Biography Online

    URL : https://www.biographyonline.net/people/famous-100.html