Negative Word Dictionary

The burgeoning volume of reviews calls their credibility into question. We asked whether ratings can be trusted and analyzed the review texts for more objective measures.

Published as "BEHIND CHICKEN RATINGS: An Exploratory Analysis of Yogiyo Reviews Through Text Mining" in the Journal of Korea Contents Association.

My Role:
Natural language processing, data analysis, empirical finding, dictionary

My Team:
Jugyeom Kim: text mining, exploratory analysis
Soohyun Yoon: data processing
Youbeen Lee: data processing
Advisor: Dongwhan Kim

Location
Yonsei University, Seoul, Korea

Time
6 months

Deliverables
Review Analysis, Negative Word Dictionary, Publication

Table of Contents

01 Problem Definition
02 Exploratory Study
03 Topic Modeling
04 Dictionary Design

Problem Definition

Every day 180,000 reviews are written on Tripadvisor...

Reviews are among the strongest influences on our decisions. However, as a colossal and growing number of reviews is produced every day, it is getting harder to assess their credibility. Reviews are also getting shorter, and readers rely heavily on the rating alone, so there is a pressing need to examine the current state of reviews on the food delivery app and to find a more reliable way of understanding a review's polarity.

  • "I almost never go to a restaurant without any reviews. It's safer and more trustworthy to find a highly-rated place. But often, it seems that the ratings don't always match what the place really is like..."

  • This study explores the texts in reviews and ratings of a delivery application and discovers ways to elevate review credibility and usefulness.

Study Flow

1. Text Mining and Natural Language Processing

Reviews are vital both for restaurants seeking to improve and for consumers seeking to make objective decisions. However, skimming through every review is demanding for both parties, and reviews and ratings do not necessarily match, because every customer applies different standards and weighs different values. We therefore collected reviews from a popular food delivery app through text mining and analyzed them with natural language processing (NLP).

2. Category Classification and Topic Modeling

We then looked into other methods of increasing review usefulness and credibility. Many studies have used topic modeling to classify reviews into several important topics such as ‘taste’, ‘time’, and ‘service’. These approaches help to automatically classify and filter reviews according to the information a user needs at a particular time.

3. Sentiment Analysis and Sentiment Dictionary

For restaurants, being able to identify and be notified of ‘negative’ comments and reviews is critical. If poor service is not addressed or corrected, the restaurant cannot improve, and dissatisfied customers will not return. We drew on previous studies that performed sentiment analysis on reviews and found a way to present negative words more accurately through a dictionary.

Arising Research Question:

How Is the Current Review System in a Food Delivery Application, and How Can We Improve Review Usefulness?

Literature Review

Methodology

We first collected food delivery app review data via web crawling. We then analyzed the ratings and reviews by comparing them with those of other platforms (Google and Naver). Next, we used KoNLPy as the natural language processing tool to explore trends in the review data. Finally, to see whether the text morphemes themselves showed a trend, we used the OKT (Open Korean Text) processor.

  1. Python Selenium: Web crawling of 5,000 reviews and rating trend comparisons

  2. KoNLPy: Korean natural language processing, standardization of data

  3. Open Korean Text processor: Observing part of speech and morpheme proportions
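
Step 3 in the list above can be illustrated with a minimal sketch. It assumes each review has already been split into (token, tag) pairs, which is the shape that KoNLPy's `Okt().pos(text)` returns. The sample review is hand-tagged for illustration only, so real OKT output may differ, and the function name `pos_proportions` is hypothetical rather than taken from the study.

```python
from collections import Counter


def pos_proportions(tagged_tokens):
    """Return the share of each part-of-speech tag in a list of (token, tag) pairs."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: count / total for tag, count in counts.items()}


# Hand-tagged toy review (hypothetical), roughly "the delivery was 40 minutes late".
# In practice the pairs would come from konlpy.tag.Okt().pos(review_text).
sample_review = [
    ("배달", "Noun"),
    ("이", "Josa"),
    ("40", "Number"),
    ("분", "Noun"),
    ("늦었어요", "Verb"),
]

proportions = pos_proportions(sample_review)
for tag, share in sorted(proportions.items()):
    print(f"{tag}: {share:.2f}")
```

Comparing these proportions between low-rated and high-rated reviews is one simple way to look for the kind of part-of-speech differences the study describes, including the share of `Number` tokens.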

Result 1

Rating System

Across five franchise chicken brands and 65 stores, the Yogi** food delivery application showed a five-star-oriented rating distribution compared with the two other review platforms. This suggests that reviews and ratings on Yogi** are less credible: the other platforms showed a more neutral distribution, which implies that ratings on the delivery app were more lenient than the reviewers' actual sentiments.
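
A comparison of this kind can be sketched as follows. The ratings below are made-up placeholders, not the study's data, and the platform labels and the function name `rating_shares` are hypothetical.

```python
from collections import Counter


def rating_shares(ratings):
    """Map each star value (1 to 5) to its share of all ratings."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    total = len(ratings)
    return {star: counts.get(star, 0) / total for star in range(1, 6)}


# Hypothetical star ratings per platform (illustration only).
platform_ratings = {
    "delivery_app": [5, 5, 5, 5, 4, 5, 5, 3, 5, 5],
    "other_platform": [5, 4, 3, 4, 2, 5, 3, 4, 1, 4],
}

for name, ratings in platform_ratings.items():
    shares = rating_shares(ratings)
    print(f"{name}: five-star share = {shares[5]:.0%}")
```

A platform whose five-star share dominates all other star values would show the five-star-oriented pattern described above.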

Result 2

Factor Analysis

We then analyzed the factors that influence ratings and checked whether ratings diverge from review content. The factor analysis revealed a strong positive correlation among the taste, quantity, and delivery ratings: each moves closely with the others. For example, when the delivery was late, the taste also tended to be rated as unsatisfactory.
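
The relationship among the three rating factors can be illustrated with pairwise Pearson correlations. This is a simpler stand-in for the factor analysis used in the study, not a reproduction of it, and the ratings below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-review sub-ratings (illustration only).
ratings = {
    "taste": [5, 4, 2, 5, 1, 3],
    "quantity": [5, 4, 3, 5, 2, 3],
    "delivery": [4, 4, 2, 5, 1, 3],
}

for a, b in combinations(ratings, 2):
    print(f"{a} vs {b}: r = {pearson(ratings[a], ratings[b]):.2f}")
```

Coefficients close to 1 for every pair would match the pattern reported above, where a poor delivery rating tends to come with a poor taste rating.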

Result 3

Grammatical Difference

We explored the grammatical makeup of the review sentences and observed a specific trend in the morphemes and parts of speech that made up the negative and positive reviews. Certain parts of speech, and how often they appeared, indicated negative content, and the presence of numbers could signal severely negative reviews.

Result 4

Negative Word Dictionary

Finally, based on the polarity identification, we created a negative word dictionary specialized for chicken reviews. After extracting a total of 367 negative words, we organized them under four main topics and 20 sub-topic classifications.

Results

We first observed the general rating systems and compared them with other review platforms. We then analyzed the factors involved in giving the ratings. Next, we analyzed the NLP results and identified unique grammatical traits in the sentences. Finally, we created a dictionary, through topic modeling, to easily identify the words that are critical to the business.
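
One way such a negative word dictionary could be applied is sketched below. The topics, sub-topics, and English words are hypothetical placeholders, since the published dictionary is Korean and its entries are not listed here. A real pipeline would match morphemes rather than whitespace-separated tokens.

```python
# Hypothetical two-level dictionary: main topic -> sub-topic -> negative words.
NEGATIVE_WORDS = {
    "taste": {
        "texture": ["soggy", "dry"],
        "seasoning": ["bland", "salty"],
    },
    "delivery": {
        "time": ["late", "delayed"],
        "condition": ["cold", "spilled"],
    },
}


def find_negative_hits(tokens, dictionary):
    """Return (main_topic, sub_topic, word) for every token found in the dictionary."""
    # Invert the nested dictionary into a flat word -> (topic, sub_topic) lookup.
    index = {
        word: (topic, sub_topic)
        for topic, sub_topics in dictionary.items()
        for sub_topic, words in sub_topics.items()
        for word in words
    }
    return [(*index[token], token) for token in tokens if token in index]


review_tokens = "The chicken arrived late and cold".lower().split()
for topic, sub_topic, word in find_negative_hits(review_tokens, NEGATIVE_WORDS):
    print(f"{word}: {topic} / {sub_topic}")
```

Grouping hits by main topic would let a restaurant owner see at a glance which aspect of the order drew negative wording.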

Conclusion

Although we focused primarily on one delivery app and one type of food, fried chicken (because we all love it), we examined whether ratings and reviews diverge. It turns out that, judging by the content of their reviews, many customers rate the restaurant higher than they actually feel. Amid the flood of reviews, we hope that a negative word dictionary based on sentiment polarity analysis can give both owners and customers indicators of review polarity. We also found that certain parts of speech and the presence of numbers can carry negative connotations. We hope that this approach to exploring delivery review data can assist the many parties involved in delivery app culture, which is flooded with new reviews every day because of its popularity.

Publication

The following article was published in the Journal of Korea Contents Association: "BEHIND CHICKEN RATINGS: An Exploratory Analysis of Yogiyo Reviews Through Text Mining", https://doi.org/10.5392/JKCA.2021.21.11.

Abstract: Ratings and reviews, despite their growing influence on restaurants’ sales and reputation, entail a few limitations due to the burgeoning of reviews and inaccuracies in rating systems. This study explores the texts in reviews and ratings of a delivery application and discovers ways to elevate review credibility and usefulness. Through a text mining method, we concluded that the delivery application ‘Yogiyo’ has (1) a five-star oriented rating dispersion, (2) a strong positive correlation between rating factors (taste, quantity, and delivery) and (3) distinct part of speech and morpheme proportions depending on review polarity. We created a chicken-specialized negative word dictionary under four main topics and 20 sub-topic classifications after extracting a total of 367 negative words. We provide insights on how the research on delivery app reviews should progress, centered on fried chicken reviews.