Grouping Whisky Brands

By Wesley Satelis

April 9, 2022

In this post we will use Partitioning Around Medoids (PAM), an unsupervised clustering method, to group whisky brands based on ratings given by users of the website https://www.whiskybase.com/whiskies/brands. PAM is a variation of the widely known k-means. The main difference is that PAM uses actual observations from the dataset (the medoids) as cluster centers, whereas k-means uses the mean of each cluster.
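To make the medoid-versus-mean distinction concrete, here is a minimal sketch in plain Python. It is not the code used for this post: the ratings are made-up values, the function names `total_cost` and `pam_swap` are hypothetical, and the search only mimics PAM's swap step with a naive initialisation (the full algorithm also has a dedicated build step for choosing the starting medoids).

```python
from statistics import mean


def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(abs(p - m) for m in medoids) for p in points)


def pam_swap(points, k):
    """Greedy swap search: a simplified sketch of PAM's swap step."""
    medoids = list(points[:k])  # naive start, not PAM's build step
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in points:
                if candidate in medoids:
                    continue
                # Try replacing medoid i with a non-medoid observation.
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                if total_cost(points, trial) < total_cost(points, medoids):
                    medoids = trial
                    improved = True
    return sorted(medoids)


def cluster_means(points, medoids):
    """Assign each point to its nearest medoid, then average each cluster."""
    clusters = {m: [] for m in medoids}
    for p in points:
        nearest = min(medoids, key=lambda m: abs(p - m))
        clusters[nearest].append(p)
    return [mean(clusters[m]) for m in medoids]


ratings = [70.0, 72.0, 74.0, 88.0, 90.0, 95.0]  # made-up ratings
medoids = pam_swap(ratings, k=2)
means = cluster_means(ratings, medoids)

print("medoids:", medoids)       # always values taken from `ratings`
print("cluster means:", means)   # need not appear in `ratings`
```

In this toy example the medoids are ratings that actually occur in the data, while the mean of the higher-rated cluster is a value no brand has. That is the practical appeal of PAM: each cluster is represented by a real brand.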

The original dataset has the following variables.

  • Brand: Whisky brand;
  • Country: Country of origin of the whisky;
  • Whiskies: Number of different whiskies;
  • Votes: Number of votes given to that brand;
  • Rating: (0-100) Rating given by a regular user to that whisky;
  • WB Ranking: (A - G) Ranking based on ratings given by specialists in whisky.

The following table shows how many whisky brands each country has. I chose to discard countries with fewer than 10 whisky brands, since those wouldn’t yield very interesting results.

Country           Number of whisky brands
Scotland          3670
United States     1421
Germany           401
Ireland           322
Canada            161
Japan             130
France            118
Switzerland       106
United Kingdom    86
Australia         82
Austria           76
Netherlands       55
Sweden            41
Belgium           30
India             28
Denmark           24
New Zealand       22
Czech Republic    17
Spain             13
Taiwan            10
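
The country filter described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the code behind the table: the rows are invented placeholder records, "Exampleland" is a fictional country, and `filter_by_country` is a hypothetical helper. The only value taken from the post is the threshold of 10 brands.

```python
from collections import Counter

MIN_BRANDS = 10  # threshold stated in the post


def filter_by_country(rows, min_brands=MIN_BRANDS):
    """Count brands per country and keep rows from countries at or above the threshold."""
    counts = Counter(country for _, country in rows)
    kept = [(brand, country) for brand, country in rows
            if counts[country] >= min_brands]
    return counts, kept


# Invented placeholder records of the form (brand, country).
rows = (
    [(f"Brand S{i}", "Scotland") for i in range(12)]
    + [(f"Brand X{i}", "Exampleland") for i in range(3)]
)

counts, kept = filter_by_country(rows)

# Print the per-country counts, largest first, as in the table.
for country, n in counts.most_common():
    print(country, n)
print("rows kept:", len(kept))
```

A country with exactly 10 brands is kept, which matches Taiwan appearing in the table.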