I obtained a database of millions of short beer reviews, generated graphs and word clouds from it, and performed a small amount of analysis.
I used matplotlib to generate the graphs and the Python wordcloud library to generate the word clouds.
The following graph depicts the number of beers for the top 20 beer styles.
The top style of beer by a wide margin is the IPA, followed by the American Pale Ale.
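A count like this can be reproduced in a few lines. The sketch below is not the exact code used for the graph. It assumes each review is a dict with hypothetical `beer_id` and `style` keys (the real column names in the dataset may differ), and it counts each beer once no matter how many reviews it has.

```python
from collections import Counter


def top_styles(reviews, n=20):
    """Return the n styles with the most distinct beers as (style, count) pairs."""
    # A beer usually has many reviews, so deduplicate on (beer_id, style) first.
    # "beer_id" and "style" are assumed field names, not the dataset's real ones.
    distinct_beers = {(r["beer_id"], r["style"]) for r in reviews}
    return Counter(style for _, style in distinct_beers).most_common(n)


def plot_top_styles(style_counts, path="top_styles.png"):
    """Draw a horizontal bar chart of (style, count) pairs and save it to path."""
    # Imported here so the counting helper above works without matplotlib installed.
    import matplotlib.pyplot as plt

    styles, counts = zip(*style_counts)
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.barh(styles, counts)
    ax.invert_yaxis()  # put the largest bar at the top
    ax.set_xlabel("Number of beers")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Calling `plot_top_styles(top_styles(reviews))` would then write the chart to `top_styles.png`.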
The following graph depicts the most popular words used in reviews (ignoring stop words).
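As a rough illustration of how such word counts can be produced, here is a minimal sketch using only the standard library. The stop-word list is a tiny made-up sample, and the tokenising rule (lowercase runs of letters and apostrophes) is an assumption rather than the rule used for the graph.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "of", "to", "with", "this", "in"}


def word_counts(review_texts, stop_words=STOP_WORDS):
    """Count lowercase words across all reviews, skipping stop words."""
    counts = Counter()
    for text in review_texts:
        # Treat any run of letters or apostrophes as a word.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stop_words)
    return counts
```

`word_counts(texts).most_common(20)` then gives the 20 most frequent words and their counts, ready for a bar chart.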
The following image depicts a word cloud of the most popular words.
Popular Words from very positive ratings
For this and the next word cloud, I removed words such as “bottle”, “flavour”, “flavor” and “really”, to try to obtain words that describe the beer's flavour and appearance.
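The filtering step might look something like the sketch below. It assumes each review is a dict with hypothetical `score` and `text` keys, and the score cut-offs are left as parameters because the thresholds for "very positive" and "very negative" are not specified here. The extra words are passed to the wordcloud library through its `stopwords` parameter.

```python
def texts_with_score(reviews, min_score=None, max_score=None):
    """Join the text of reviews whose score falls within the given bounds."""
    selected = []
    for r in reviews:
        # "score" and "text" are assumed field names.
        if min_score is not None and r["score"] < min_score:
            continue
        if max_score is not None and r["score"] > max_score:
            continue
        selected.append(r["text"])
    return " ".join(selected)


def make_word_cloud(text, extra_stop_words, path):
    """Render a word cloud for text, ignoring default and extra stop words."""
    # Imported here so the filtering helper works without the wordcloud package.
    from wordcloud import STOPWORDS, WordCloud

    stop_words = STOPWORDS | set(extra_stop_words)
    cloud = WordCloud(width=800, height=400, background_color="white",
                      stopwords=stop_words)
    cloud.generate(text).to_file(path)
```

For example, `make_word_cloud(texts_with_score(reviews, min_score=4.5), ["bottle", "flavour", "flavor", "really"], "positive.png")` would build the positive cloud, with 4.5 being an assumed cut-off.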
Popular Words from very negative ratings
ABV (Alcohol By Volume)
I found the following graph very interesting: there seems to be a noticeable correlation between the overall score of a beer and its alcohol content.
You can see the most common ABV of a beer is around 6%.
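To put a number on the relationship rather than judging it by eye, one option is the Pearson correlation coefficient between ABV and overall score. The sketch below is an illustration only and was not used to produce the graph. It takes two plain lists of numbers and also finds the most common ABV after rounding to whole percentages, which is an assumed bin width.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    # Undefined (division by zero) if either sequence is constant.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def most_common_abv(abvs):
    """Most frequent ABV after rounding to the nearest whole percent."""
    return Counter(round(a) for a in abvs).most_common(1)[0][0]
```

A value near 1 would indicate that higher-ABV beers tend to score higher, while a value near 0 would indicate little linear relationship.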
Number of beers produced by breweries
The following graph shows that Cigar City leads in the number of beers produced.
Distribution of scores
The following graph shows how scores, which range from 0 to 5, are distributed across reviews.
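A distribution like this can be tallied by grouping scores into bins. The sketch below assumes scores may be fractional and rounds each one to the nearest half point. The half-point step is an assumption, not necessarily the bin width used in the graph.

```python
from collections import Counter


def score_distribution(scores, step=0.5):
    """Count how many scores fall nearest to each multiple of step."""
    bins = Counter()
    for s in scores:
        # Snap the score to the nearest multiple of step, e.g. 4.4 -> 4.5.
        bins[round(s / step) * step] += 1
    # Return the bins in ascending score order for plotting.
    return dict(sorted(bins.items()))
```

The resulting keys and values can be passed straight to a matplotlib bar chart.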