April 19, 2017

# Amazon Food Reviews, Part II

Ok, so now that we've explored the contents of the Amazon Fine Food Reviews data file, it's time to move on to the second task: how many reviews and products are perfectly positive or perfectly negative in sentiment?

TextBlob assigns a polarity value from -1.0 to +1.0, and scores at these extremes are usually outliers. For this exercise, I'm going to analyze what makes up these outliers: first by counting the products and comments involved, and then, in Part III, by looking at the key features mentioned in the comments.

``````
#polarity by ProductID (double brackets select a list of columns)
data_pid = data2.groupby(['ProductId'])[['tb_polarity', 'Score']].mean().reset_index()

#polarity by Profile Name
data_pn = data2.groupby(['ProfileName'])[['tb_polarity', 'Score']].mean().reset_index()

#Perfect Review Sets
pf_negr = data2.loc[(data2['tb_polarity'] == -1.0)]
pf_posr = data2.loc[(data2['tb_polarity'] == 1.0)]
``````

But here we can see that TextBlob's polarity doesn't always read the text correctly. Notice how for row 106 the summary reads "disappointing" and "text_cln" reads "not what I was expecting in terms of the ...", and yet it is assigned a polarity score of 1.0.
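One way to gauge how often this kind of mismatch happens is to compare the polarity against the star rating in `Score`. The sketch below uses a small made-up frame with the same column names as `data2`. The rows are hypothetical, not taken from the dataset, and treating a rating of 1 or 2 stars as "negative" is my own assumption.

```python
import pandas as pd

# Hypothetical stand-in for data2; only the column names match the real data.
toy = pd.DataFrame({
    'Summary': ['disappointing', 'best ever', 'perfect gift', 'awful'],
    'Score': [2, 5, 5, 1],
    'tb_polarity': [1.0, 1.0, 1.0, -1.0],
})

# Reviews that TextBlob scored as perfectly positive...
perfect_pos = toy.loc[toy['tb_polarity'] == 1.0]

# ...that the reviewer nevertheless rated 1 or 2 stars (assumed cutoff).
suspect = perfect_pos.loc[perfect_pos['Score'] <= 2]

print(len(suspect), 'of', len(perfect_pos),
      'perfectly positive reviews have a low Score')
```

Running the same filter on the real `pf_posr` would give a rough count of reviews worth reading by hand before trusting the 1.0 score.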

``````
#Perfect Review by ProductID
pf_pos_pid = data_pid.loc[(data_pid['tb_polarity'] == 1.0)]
pf_neg_pid = data_pid.loc[(data_pid['tb_polarity'] == -1.0)]
``````
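Because polarity is bounded at -1.0 and +1.0, a product's mean can only equal 1.0 when every one of its reviews scored exactly 1.0, so products with very few reviews are the likeliest to qualify. Here is a minimal sketch on made-up data (the product IDs and values are hypothetical) that keeps the review count next to the mean so this is easy to check:

```python
import pandas as pd

# Hypothetical reviews: product A is mixed, B has one review, C has two.
toy = pd.DataFrame({
    'ProductId': ['A', 'A', 'B', 'C', 'C'],
    'tb_polarity': [1.0, 0.5, 1.0, 1.0, 1.0],
})

# Mean polarity and number of reviews per product.
by_product = (
    toy.groupby('ProductId')['tb_polarity']
       .agg(['mean', 'count'])
       .reset_index()
)

# Products whose average polarity is exactly 1.0.
perfect = by_product.loc[by_product['mean'] == 1.0]
print(perfect)
```

Sorting the result by `count` would show whether the perfect products are mostly single-review items.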

Summary of Perfectly Positive and Perfectly Negative

``````
#How many reviews are perfectly positive?
print('The number of perfectly positive sentiment reviews are', len(pf_posr))
print(len(pf_posr)/len(data2), 'of all reviews have 1.0 positive sentiment')

#perfectly negative?
print('The number of perfectly negative sentiment reviews are', len(pf_negr))
print(len(pf_negr)/len(data2), 'of all reviews have -1.0 negative sentiment')

#How many productid's are perfectly positive?
print('Products with perfectly positive average reviews are', len(pf_pos_pid))

#perfectly negative?
print('Products with perfectly negative average reviews are', len(pf_neg_pid))
``````

Summary of Reviews

The number of perfectly positive sentiment reviews are 2888

0.00508044626302 of all reviews have 1.0 positive sentiment

The number of perfectly negative sentiment reviews are 303

0.000533024659867 of all reviews have -1.0 negative sentiment
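
The raw fractions are easier to read as percentages. Here is a small sketch using Python's built-in format specifier on the values printed above (the variable names are just for illustration):

```python
# Values copied from the review summary output.
pos_count, neg_count = 2888, 303
pos_share = 0.00508044626302
neg_share = 0.000533024659867

pos_pct = f'{pos_share:.2%}'   # share of perfectly positive reviews
neg_pct = f'{neg_share:.3%}'   # share of perfectly negative reviews
ratio = pos_count / neg_count  # positives per negative

print(pos_pct, 'of all reviews have 1.0 positive sentiment')
print(neg_pct, 'of all reviews have -1.0 negative sentiment')
print(f'About {ratio:.1f} perfectly positive reviews for each perfectly negative one')
```

So roughly half a percent of reviews sit at +1.0, and perfectly negative reviews are about ten times rarer.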

Summary of Products

Products with perfectly positive average reviews are 190