Automated Summarization of Restaurant Reviews

Manoj Pawar, Deepak Mallya

{mpawar,dmallya}@stanford.edu

June 4th, 2010

Introduction

Websites such as Yelp.com and we8there.com, along with other restaurant review sites, now host large numbers of user reviews. The number of reviews for each restaurant grows daily as Internet access becomes available everywhere, and a restaurant normally has more than 50 reviews. Sites like Yelp.com pair reviews with a 5-star rating system, but the rating does little to tell users what is good and what is bad about a particular restaurant. It is also nearly impossible for someone to read all of those reviews. Many people want to make a quick decision about where to have lunch or dinner. Reading through every review is not only cumbersome; it can also leave readers confused about the overall positive or negative sentiment toward specific features.

Objective

In this project, we aim to provide, for each restaurant, a set of features drawn from all of its available reviews. These features can range from general ones like “food” and “ambience” to very specific ones like “Caesar salad”. We also classify each feature as Good or Bad based on the positive or negative sentiments expressed in the reviews.

Overview

We view the solution to this problem as three steps:

1. Feature extraction
2. Sentiment analysis
3. Classification

Our first task is to extract the important features of the restaurant. Extracting features like “food”, “ambience”, and “staff” is an important part of the whole summarization process. Once we retrieve these features from the reviews, we analyze each sentence in which they are discussed. Extracting the sentiment expressed in the sentence and assigning it a positive or negative class is the second step. Once we have every feature and its sentiments, we summarize over these features: classification helps users understand the overall sentiment on a particular feature.

Related Work

Many ideas have recently been proposed for automatic summarization of reviews. Hu and Liu [1] discuss extracting frequent features and using bootstrapping techniques to analyze sentiments. Popescu and Etzioni [2] describe relaxation labeling for the semantics of product features. They introduce the OPINE system for extracting features and associating opinions with them. Pang and Lee [3] discuss a variety of classification and sentiment analysis methods in opinion mining. We take the simple approach of finding frequent nouns and adjectives in reviews, which gives us a very good idea of the features and the sentiments about them. Combining nouns and adjectives and counting their pairs helps us prune unwanted features. Bootstrapping and WordNet expansion of positive and negative sentiment words, along with the part-of-speech structure of the sentence, help us in sentiment classification.

Data Collection

We used reviews provided by Prof. Andrew Ng’s lab for our experiments. We worked with data for 196 restaurants and a total of 99693 reviews, all from we8there.com. The reviews have the following form.

Example:

<Restaurant>
<id>595</id>
<Review>
<overallRating>4.0</overallRating>
<foodRating>5.0</foodRating>
<ambianceRating>3.0</ambianceRating>
<serviceRating>4.0</serviceRating>
<noiseRating>2.0</noiseRating>
<text> Food and service is always top notch at Vinny's - We both had brunch and it was very well prepared and the service is always attentive, but not overbearing - Best value in the Windward area in our book </text>
</Review>
..
</Restaurant>

We wrote an initial script to retrieve the review text for each restaurant and to extract sentences from it. These sentences were then used for feature extraction and analysis. We found that some review texts have no sentence boundaries, so we also use delimiters such as commas, other punctuation marks, and hyphens to extract sentences for further analysis. As the example shows, every review carries an overall rating, a food rating, and so on, but we did not use these ratings; we analyzed only the text of each review.
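The delimiter-based sentence extraction described above can be sketched in Python as follows. This is only an illustration, not the script used in the project: the function name `split_review` and the exact delimiter set (sentence-ending marks, commas, semicolons, and hyphens surrounded by spaces) are assumptions.

```python
import re

# Assumed delimiter set: sentence-ending marks, commas and semicolons,
# plus a hyphen surrounded by whitespace (so "first-class" is not split).
_DELIMITERS = re.compile(r"[.!?;,]+|\s-\s")


def split_review(text):
    """Split review text into sentence-like fragments."""
    fragments = _DELIMITERS.split(text)
    return [f.strip() for f in fragments if f.strip()]


review = ("Food and service is always top notch at Vinny's - We both had "
          "brunch and it was very well prepared and the service is always "
          "attentive, but not overbearing - Best value in the Windward area "
          "in our book")
fragments = split_review(review)
```

Applied to the example review above, this produces four fragments, the first being "Food and service is always top notch at Vinny's".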

Feature Extraction

We extracted features from the 99693 available reviews. We ran the Stanford POS tagger over all of the reviews to tag each word with its part of speech. The following is an example of the tagged text for a sentence.

Example Sentence

Food and service is always top notch at Vinny's.

POS tagged Sentence:

Food_NNP and_CC service_NN is_VBZ always_RB top_JJ notch_NN at_IN Vinny_NNP ._.

Our intuition is that most restaurant features will be nouns (NN and NNS). We counted the occurrences of each noun across all 99693 reviews and sorted the nouns by frequency. This worked well for us, as it surfaced the most talked-about features of restaurants.

We wrote a Perl script to parse the tagged file and generate, for every POS tag, a count of each word carrying that tag. Here are two lines of the resulting file, with counts for NN and JJ for a few words.

Each line has the form POStag-> Word,WordCount

NN-> food,51179 service,32587 restaurant,20863 time,16012 experience,15294 menu,13775 dinner,11099 place,11097 table,10279 wine,10229 meal,8866 staff,8565 dining,6943 waiter,6738 night,6154 atmosphere,6053 server,5932 everything,5042 dessert,5000 bit,4879 evening,4860 steak,4763 reservation,4698 bar,4530 lunch,4363 salad,4228 ambiance,4091 list,3887 price,3763

JJ-> great,31836 good,30915 excellent,17761 wonderful,9861 nice,9808 special,8322 delicious,7846 outstanding,6443 friendly,5998 attentive,5906 first,5797 little,5678 other,5537 amazing,4391 fantastic,4294 small,4160 perfect,3958 . . .

We took the 200 most frequent words tagged "NN" or "NNS" as features. As the counts above show, people most often talk about general features such as "food", "service", "restaurant", "time", and "experience" when writing reviews. Note that “restaurant” itself is a feature, which indicates that people often comment on the restaurant as a whole. The other features are specific aspects of "food", "service", and "experience" that people commented on for a restaurant.
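The counting step can be sketched in Python as below. This is a minimal illustration and not the Perl script used in the project; it assumes tokens in the `word_TAG` form shown in the tagged example, and the function names and the lowercasing of words are assumptions.

```python
from collections import Counter, defaultdict


def count_by_tag(tagged_lines):
    """Count words per POS tag from whitespace-separated `word_TAG` tokens."""
    counts = defaultdict(Counter)
    for line in tagged_lines:
        for token in line.split():
            # Split on the last underscore so words containing "_" survive.
            word, sep, tag = token.rpartition("_")
            if sep and word:
                counts[tag][word.lower()] += 1  # lowercasing is an assumption
    return counts


def top_features(counts, tags=("NN", "NNS"), k=200):
    """Return the k most frequent words across the given noun tags."""
    merged = Counter()
    for tag in tags:
        merged.update(counts.get(tag, Counter()))
    return [word for word, _ in merged.most_common(k)]


tagged = [
    "Food_NNP and_CC service_NN is_VBZ always_RB top_JJ notch_NN at_IN Vinny_NNP ._.",
    "The_DT service_NN was_VBD great_JJ ._.",
]
counts = count_by_tag(tagged)
```

Here `counts["NN"]` holds the noun counts for the two toy lines, and `top_features(counts)` returns the nouns ordered by frequency.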

Bigram Feature Extraction

Some features consist of multiple words, which the approach described in the previous section cannot handle. We therefore extract bigram features separately from the POS-tagged reviews. We look for consecutive words tagged NN or NNS and keep the bigrams that occur together frequently. This let us extract features like “table service”, “wine menu”, “caesar salad” and “dining experience”. We combined the bigram features with the unigram features from the previous section to form a rich set of features for restaurants.
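A minimal Python sketch of the bigram step is shown below. It assumes each sentence is available as a list of (word, tag) pairs; the `min_count` threshold is hypothetical, since no cutoff is stated above.

```python
from collections import Counter

NOUN_TAGS = {"NN", "NNS"}


def noun_bigrams(tagged_sentence):
    """Yield adjacent noun pairs from a list of (word, tag) tuples."""
    for (w1, t1), (w2, t2) in zip(tagged_sentence, tagged_sentence[1:]):
        if t1 in NOUN_TAGS and t2 in NOUN_TAGS:
            yield (w1.lower(), w2.lower())


def frequent_bigrams(tagged_sentences, min_count=2):
    """Keep noun bigrams seen at least `min_count` times (assumed threshold)."""
    counts = Counter()
    for sentence in tagged_sentences:
        counts.update(noun_bigrams(sentence))
    return {" ".join(pair): n for pair, n in counts.items() if n >= min_count}


sentences = [
    [("the", "DT"), ("wine", "NN"), ("menu", "NN"), ("was", "VBD"), ("long", "JJ")],
    [("great", "JJ"), ("wine", "NN"), ("menu", "NN")],
    [("table", "NN"), ("service", "NN")],
]
bigram_features = frequent_bigrams(sentences)
```

With these toy sentences only "wine menu" passes the threshold, because "table service" occurs once.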

Automatic Feature Pruning

Because we collect frequent nouns, we also pick up high-frequency nouns like “wife” and “husband” that are not restaurant features. During sentiment analysis, however, we check each sentence for adjectives associated with a feature, and this helps us prune such nouns. High-frequency nouns that are not actual features rarely have adjectives associated with them, so the ratio of adjectives to mentions is low for them. This indicates that reviewers are not expressing opinions about them, so they are pruned automatically at a later stage of our approach.

Example:

Feature “wife” - my wife gave it her greatest compliment.

nsubj(gave, wife)

As you can see, “gave” will not be on our list of top adjectives, so the feature “wife” is dropped in the sentiment analysis phase.
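One way to picture this pruning is as a ratio test, sketched below in Python. The cutoff `min_ratio`, the function name, and the toy counts are all hypothetical; the approach described above prunes such nouns implicitly during sentiment analysis rather than with an explicit threshold.

```python
def prune_features(mention_counts, opinion_counts, min_ratio=0.1):
    """Drop candidate features that are rarely paired with an opinion adjective.

    mention_counts: feature -> number of sentences mentioning it
    opinion_counts: feature -> number of those with an extracted adjective
    min_ratio: assumed cutoff; no value is given in the text
    """
    kept = []
    for feature, mentions in mention_counts.items():
        ratio = opinion_counts.get(feature, 0) / mentions if mentions else 0.0
        if ratio >= min_ratio:
            kept.append(feature)
    return kept


# Toy counts for illustration only, not measured values.
mentions = {"food": 100, "wife": 80}
with_opinion = {"food": 60, "wife": 2}
kept_features = prune_features(mentions, with_opinion)
```

In this toy setting "wife" is dropped because only 2 of its 80 mentions carry an adjective.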

Sentence Pruning

Once we have the most frequent features of restaurants, we prune the review text and keep only those sentences that contain at least one feature from our list of frequent noun features. The motivation is that reviews contain many sentences that express no opinion about any feature of a restaurant. We discard such sentences and use only those that mention a feature obtained in the feature extraction step. We store these sentences together with their feature labels and pass them to the type dependency parser.

Example:

Actual review:

I have forgotten his first name but his last name is Frankel. Our food was delicious. The restaurant became noisier as the evening progressed but because we were in one of the intimate side sections, the noise never became overwhelming.

Data after Sentence Pruning:

food => Our food was delicious.

restaurant => The restaurant became noisier as the evening progressed but because we were in one of the intimate side sections, the noise never became overwhelming.

As you can see, the first sentence is pruned because it does not contain any extracted feature.
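The pruning shown in this example can be sketched in Python as follows. The sketch matches only single-word features by exact token; the function name and the tokenization pattern are assumptions, and bigram features would need an extra phrase match.

```python
import re


def prune_sentences(sentences, features):
    """Return (feature, sentence) pairs for sentences that mention a feature."""
    pairs = []
    for sentence in sentences:
        # Lowercased word tokens; a simple assumed tokenization.
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        for feature in features:
            if feature in words:
                pairs.append((feature, sentence))
    return pairs


review_sentences = [
    "I have forgotten his first name but his last name is Frankel.",
    "Our food was delicious.",
    "The restaurant became noisier as the evening progressed but because we "
    "were in one of the intimate side sections, the noise never became "
    "overwhelming.",
]
pruned = prune_sentences(review_sentences, ["food", "restaurant"])
```

With the feature list restricted to "food" and "restaurant", the first sentence is dropped and the other two are kept with their feature labels.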

Type Dependencies on Pruned Set of Sentences

We take the pruned set of sentences for each restaurant (those that contain features, from the Sentence Pruning step) and run a type dependency parser on them. We used the Stanford Parser to output the type dependencies for each sentence. The Stanford Parser treats the period, exclamation mark, and hyphen as sentence boundaries, and we used the same boundaries when extracting sentences from reviews. We store the type dependencies for each sentence.

Example:

We run the parser as follows, using LexicalizedParser, and output only the type dependencies. The feature => sentence pairs and the dependencies generated for them are shown after the command.

java -mx1200m -cp "$scriptdir/stanford-parser.jar:" edu.stanford.nlp.parser.lexparser.LexicalizedParser -outputFormat "typedDependencies" $scriptdir/englishPCFG.ser.gz $*

Feature => Sentence pair:

staff => staff was super attentive.

food => the food was brilliant.

ambiance => the ambiance and decor excellent.

Output of the parser:
nsubj(attentive-4, staff-1)
cop(attentive-4, was-2)
amod(attentive-4, super-3)
det(food-7, the-6)
nsubj(brilliant-9, food-7)
cop(brilliant-9, was-8)
ccomp(attentive-4, brilliant-9)
det(ambiance-12, the-11)
dep(excellent-15, ambiance-12)
conj_and(ambiance-12, decor-14)
dep(excellent-15, decor-14)
ccomp(attentive-4, excellent-15)

We generate a separate file for each restaurant (196 files) containing all sentences that contain features, followed by their type dependencies in the above format. We call this the dependencies file for the restaurant. Once we have the dependencies files, we extract the opinions about each feature by running our SentenceAnalyser script.

Opinion Extraction

We implemented a Perl script that performs opinion extraction from the dependencies file of each restaurant. The task is automated by a single script that calls our SentenceAnalyser on each restaurant's dependencies file. The script takes three input files and produces one output file. The first input file contains the restaurant's review sentences, the second contains the feature for each sentence, and the third contains the type dependencies (the dependencies file) for each sentence. The input files correspond line by line.

Restaurant1FeaturesFile Line 1 -> Restaurant1ReviewSentencesFile Line 1
Restaurant1FeaturesFile Line 2 -> Restaurant1ReviewSentencesFile Line 2
. . .
Restaurant1FeaturesFile Line N -> Restaurant1ReviewSentencesFile Line N

We use three important type dependencies to extract the opinion words for a given feature.

1) nsubj : nominal subject

A nominal subject is a noun phrase which is the syntactic subject of a clause. The governor of this relation might not always be a verb: when the verb is a copular verb, the root of the clause is the complement of the copular verb.

Example:
“Clinton defeated Dole” nsubj(defeated, Clinton)

“The baby is cute” nsubj(cute, baby)

2) amod: adjectival modifier

An adjectival modifier of an NP is any adjectival phrase that serves to modify the meaning of the NP.

Example:
“Sam eats red meat” amod(meat, red)

3) neg: negation modifier

The negation modifier is the relation between a negation word and the word it modifies.

Example:
“Bill is not a scientist” neg(scientist, not)

“Bill doesn’t drive” neg(drive, n’t)

The SentenceAnalyser takes the dependencies files, extracts the "nsubj", "amod" and "neg" relations, and interprets them to extract opinions about a feature. For example, consider the sentence below.

"staff was super attentive , the food was brilliant , the ambiance and decor excellent."

We have already extracted features such as "staff", "food" and "ambiance" in our feature extraction step. We then look for these features in the nsubj, amod and neg relations to obtain opinions. In the dependencies above, only the "nsubj" and "amod" relations identify opinions in this sentence.

nsubj(attentive-4, staff-1)

amod(attentive-4, super-3)

nsubj(brilliant-9, food-7)

We report opinions as follows for each sentence and store the results in a separate per-restaurant file called summaryFile.

"staff" -> "super","attentive"

"food" -> "brilliant"

Even though we identified "staff", "food", "ambiance" and "decor" as features, the dependencies yield opinion information only for "staff" and "food".

In the above example, we identified both "attentive" and "super" as opinion words for "staff". To do so, we check whether a feature appears as the subject of an nsubj relation; if it does, we take the governor as the opinion word and then check whether any other adjective modifies that opinion adjective. Here that adjective is "super", which modifies "attentive". Even though "super" appears to act as an intensifier of the opinion "attentive", we currently report it as an opinion.

We also try to identify negation of opinion words, which inverts the polarity of an opinion.

Example
"we both felt that the clam chowder broth was really thin and not as creamy and thick as previous trips"

nsubj(felt-3, we-1)

nsubj(thin-11, broth-8)

nsubj(creamy-15, broth-8)

advmod(creamy-15, as-14)

ccomp(felt-3, creamy-15)

amod(trips-20, previous-19)

prep_as(thin-11, trips-20)

We extract "broth" and "chowder" as features and identify the opinion words from the following type dependencies:

nsubj(thin-11, broth-8)

neg(creamy-15, not-13)

We identify the opinion words for "broth" as "~creamy" and "thin", as shown below. We put a tilde "~" before an opinion word to signify a negation in polarity.

"broth"->~creamy,thin

We did not get any opinion about "chowder" from the type dependencies.
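The extraction rules illustrated by the two examples above can be sketched in Python as follows. This is one possible reading of those rules, not the authors' Perl SentenceAnalyser; the function names and the regular expression are assumptions.

```python
import re
from collections import defaultdict

# Matches lines such as "nsubj(attentive-4, staff-1)".
_DEP = re.compile(r"^(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)$")


def parse_dependencies(lines):
    """Parse dependency lines into (relation, governor, dependent) triples.

    Governor and dependent are (word, position) pairs so that repeated
    words in one sentence stay distinct.
    """
    triples = []
    for line in lines:
        match = _DEP.match(line.strip())
        if match:
            rel, gov, gov_i, dep, dep_i = match.groups()
            triples.append((rel, (gov.lower(), int(gov_i)), (dep.lower(), int(dep_i))))
    return triples


def extract_opinions(lines, features):
    """Map each feature to opinion words found via nsubj, amod and neg."""
    triples = parse_dependencies(lines)
    negated = {gov for rel, gov, _ in triples if rel == "neg"}
    modifiers = defaultdict(list)
    for rel, gov, dep in triples:
        if rel == "amod":
            modifiers[gov].append(dep[0])

    opinions = defaultdict(list)
    for rel, gov, dep in triples:
        if rel == "nsubj" and dep[0] in features:
            # Adjectives modifying the opinion word are reported as opinions too.
            opinions[dep[0]].extend(modifiers.get(gov, []))
            # A neg relation on the opinion word flips its polarity ("~").
            opinions[dep[0]].append(("~" if gov in negated else "") + gov[0])
        elif rel == "amod" and gov[0] in features:
            # Adjective attached directly to the feature noun.
            opinions[gov[0]].append(dep[0])
    return dict(opinions)


staff_deps = [
    "nsubj(attentive-4, staff-1)",
    "amod(attentive-4, super-3)",
    "nsubj(brilliant-9, food-7)",
    "dep(excellent-15, ambiance-12)",
]
broth_deps = [
    "nsubj(felt-3, we-1)",
    "nsubj(thin-11, broth-8)",
    "nsubj(creamy-15, broth-8)",
    "amod(trips-20, previous-19)",
    "neg(creamy-15, not-13)",
]
staff_opinions = extract_opinions(staff_deps, {"staff", "food", "ambiance", "decor"})
broth_opinions = extract_opinions(broth_deps, {"broth", "chowder"})
```

On the two examples, the sketch yields "super" and "attentive" for "staff", "brilliant" for "food", and "thin" and "~creamy" for "broth"; "ambiance", "decor" and "chowder" receive no opinions, as in the text.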

The algorithm for the Sentence Analyzer using type dependencies is as follows.

O = Opinion Word, F = Feature Word, I = Intensifier, S = Sentence


Sentiment Analyzer

Bootstrapping and Expansion using WordNet

We extracted the frequent adjectives (JJ) produced by the POS tagger over all 99693 reviews. Our intuition is that most opinion words are adjectives. From this list of frequent adjectives we labeled 200 words as good and 200 words as bad. Since these words do not cover all opinion words, we decided to expand the lists automatically using WordNet.

We used the JAWS Java API for WordNet for the expansion. We generated the synsets for these good and bad words and kept only the synsets of the form AdjectiveSatellite. For each of these synsets we extracted all of the wordForms, producing the expanded sets of good and bad words.

Here are examples of good and bad word expansions.

For opinion Word "excellent"

excellent, first-class, fantabulous, splendid

For opinion Word "elegant"

elegant, graceful, refined

For opinion Word "disappointed"

disappointed, defeated, discomfited, foiled, frustrated, thwarted

For opinion Word "worst"

worst, bad, big, tough, spoiled, spoilt, uncollectible, risky, high-risk, speculative, unfit, unsound, forged, defective
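The expansion step can be sketched in Python as below. To keep the sketch runnable without WordNet, the synonym lookup is a hypothetical stand-in (`TOY_SYNONYMS`) filled with two of the expansions listed above; in the setup described here that lookup would be backed by WordNet synsets.

```python
def expand_lexicon(seed_words, synonyms_of):
    """Grow a seed opinion list with the synonyms of each seed word.

    synonyms_of: callable returning an iterable of synonyms for a word.
    It is a stand-in for a WordNet synset query (an assumption of this sketch).
    """
    expanded = set(seed_words)
    for word in seed_words:
        expanded.update(synonyms_of(word))
    return expanded


# Toy lookup copied from the expansion examples above, not a real WordNet query.
TOY_SYNONYMS = {
    "excellent": ["first-class", "fantabulous", "splendid"],
    "elegant": ["graceful", "refined"],
}

good_words = expand_lexicon(
    ["excellent", "elegant"], lambda word: TOY_SYNONYMS.get(word, [])
)
```

The seed words stay in the expanded set, so a word with no synonyms in the lookup is still kept.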

Once we have these lists of good and bad opinion words, deciding the polarity of a feature given an opinion word is straightforward.

Example:

"staff" -> "super","attentive"

As “super” and “attentive” are both present in our good opinion list, we increase the count of good for staff.

staff(good=2,bad=0)
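A minimal Python sketch of this counting step follows. It assumes the good and bad lists are available as sets and that a leading "~" marks a negated opinion word, as introduced earlier; how words missing from both lists are treated is an assumption (here they are skipped), and the word sets are toy values.

```python
from collections import Counter


def score_opinions(opinion_words, good_words, bad_words):
    """Count good and bad opinions for one feature.

    A leading "~" marks a negated opinion word, whose polarity is flipped.
    Words found in neither list are ignored (an assumption of this sketch).
    """
    counts = Counter(good=0, bad=0)
    for word in opinion_words:
        negated = word.startswith("~")
        base = word.lstrip("~")
        if base in good_words:
            polarity = "good"
        elif base in bad_words:
            polarity = "bad"
        else:
            continue
        if negated:
            polarity = "bad" if polarity == "good" else "good"
        counts[polarity] += 1
    return counts


# Toy lexicons for illustration only.
GOOD = {"super", "attentive", "good"}
BAD = {"stuffy"}
staff_score = score_opinions(["super", "attentive"], GOOD, BAD)
```

For the "staff" example this gives two good opinions and no bad ones, while a negated good word such as "~good" would be counted as bad.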

Sentiment Classifier

We maintain a count of good and bad adjectives for each feature and, at the end, assign the feature a class depending on the count of good opinions versus bad opinions.

For a given restaurant:

$\forall$ feature $F$: $\mathrm{opinions}_{good} > \mathrm{opinions}_{bad} \Rightarrow \mathrm{GOOD}(F)$

else: $\mathrm{opinions}_{good} < \mathrm{opinions}_{bad} \Rightarrow \mathrm{BAD}(F)$

Example:

Restaurant id “595”

food(good:8, bad:0)

service(good:3, bad:0)

breakfast(good:0, bad:1)

In the above example, “food” and “service” receive an overall sentiment of good across all the reviews, while “breakfast” falls under bad.

GOOD       BAD
Food       Breakfast
Service
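The classification step can be sketched in Python as follows, using the counts from the example for restaurant 595. The function name is hypothetical, and leaving tied features unlabeled is an assumption, since tie handling is not described.

```python
def classify_features(feature_counts):
    """Assign GOOD or BAD to each feature from its (good, bad) opinion counts.

    Features with equal counts are left unlabeled; that choice is an assumption.
    """
    labels = {}
    for feature, (good, bad) in feature_counts.items():
        if good > bad:
            labels[feature] = "GOOD"
        elif bad > good:
            labels[feature] = "BAD"
    return labels


summary = classify_features({"food": (8, 0), "service": (3, 0), "breakfast": (0, 1)})
```

This labels "food" and "service" as GOOD and "breakfast" as BAD, matching the table above.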

Error Analysis

We found that not every noun can be taken as a feature. Nouns like ‘something’ and ‘nothing’ were also produced as output of feature extraction. We kept only the top features with very high frequency; this pruned some words that were not features, but some unwanted nouns that occur frequently still remain.

Here are a few sentences with a "neg" type dependency, which helps in inverting polarity. The "~" symbol, as described earlier, inverts the polarity of the opinion word.

The examples below have the form

"feature"->"opinion Words"->"sentence".

These are some correct examples.

1) waiter->delightful,~attentive,->waiter was delightful but not especially attentive.

2) food->~disappointment,->every time i come here the presentation and food are never a disappointment.

3) wine->~good->the wine was n't good and the `` wine expert '' lead us astray.

4) atmosphere->~stuffy,first,class,->the building is historic and the restaurant's atmosphere is first class but not stuffy.

5) cobbler->~best,->my favorite dessert was the creme brulee but the mississippi mud pie and the apple cobbler were not the best.

6) grill->~disappoint,->as per usual water grill did not disappoint.