Disclaimer – the title is a Quality Street pun only and bears no relation to the quality of the data or analysis presented below. This whole blog post is basically to discredit the personal chocolate preferences of a group member who shall remain nameless. Safe to say though, they Vostly overestimated people’s love for the Toffee Finger. Long live the Orange Creme.
In the run-up to Christmas, I was arguing with another member of the group about which are the best and worst Quality Street chocolates. This is clearly an important topic, with YouGov having previously dedicated vast resources (I assume) to answering this very question.
However, as the YouGov poll did not perfectly align with my personal and very accurate preferences, I decided to run another, better experiment. For this experiment, I bought a tub of Quality Street, counted all the chocolates, and then left the tub out in the common area for hungry opiglets to consume. I then recounted the chocolates at various points over the next two days to find out which flavours disappeared first, and perhaps more importantly, which sad chocolates were taken only after all other options were exhausted.
As expected, crowd favourites The Purple One and The Green Triangle were quick to go, along with the Fudge and Milk Choc Block. The cremes, controversially my personal favourites, sadly performed only averagely. However, to my great delight, the Toffee Finger comprehensively beat all other competition (including the Coconut Eclair!) to take the wooden spoon and provide me with a moderate degree of smugness in the end.
In an attempt to make this blog post somewhat useful, I’ve included the code I used to make the results plot below. This code should allow you to sort a DataFrame using a custom list, pivot the data when you’re an idiot and type it up the wrong way round, and make a DIY colour palette for your plots.
And remember, even though you may be appalled by others’ chocolate preferences, this actually makes them the perfect person to sit down and share a box with during these festive times.
Happy holidays!
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Read in data

# data I manually recorded in a csv
quality_df = pd.read_csv("Quality_data.csv", names=["flavour", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9"])

# YouGov ranking - https://twitter.com/yougov/status/940868550700527616?lang=en-GB
yougov_ranking = ["The Purple One", "The Green Triangle", "Caramel Swirl", "Strawberry Delight",
                  "Orange Creme", "Milk Choc Block", "Fudge", "Toffee Finger",
                  "Orange Chocolate Crunch", "Toffee Penny", "Coconut Eclair"]

# sort data (reverse so that plot is in sensible order later)
quality_df["flavour"] = quality_df["flavour"].astype("category")
quality_df["flavour"] = quality_df["flavour"].cat.set_categories(yougov_ranking)
quality_df = quality_df.sort_values(["flavour"], ascending=False).reset_index(drop=True)
quality_df
| | flavour | t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8 | t9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Coconut Eclair | 5 | 5 | 5 | 2 | 2 | 2 | 2 | 2 | 0 |
| 1 | Toffee Penny | 6 | 6 | 6 | 3 | 3 | 1 | 1 | 1 | 0 |
| 2 | Orange Chocolate Crunch | 5 | 5 | 5 | 5 | 4 | 2 | 1 | 0 | 0 |
| 3 | Toffee Finger | 7 | 6 | 6 | 5 | 5 | 5 | 5 | 4 | 2 |
| 4 | Fudge | 8 | 6 | 4 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | Milk Choc Block | 4 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | Orange Creme | 6 | 6 | 6 | 3 | 3 | 2 | 2 | 0 | 0 |
| 7 | Strawberry Delight | 7 | 5 | 5 | 3 | 2 | 0 | 0 | 0 | 0 |
| 8 | Caramel Swirl | 7 | 7 | 5 | 2 | 1 | 1 | 1 | 1 | 1 |
| 9 | The Green Triangle | 4 | 4 | 3 | 1 | 0 | 0 | 0 | 0 | 0 |
| 10 | The Purple One | 5 | 4 | 4 | 0 | 0 | 0 | 0 | 0 | 0 |
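For anyone who wants to reuse the custom-list sort on their own data, here is a minimal sketch of the same idea on a toy DataFrame. The names `toy_df` and `custom_order` and the `count` column are made up for illustration, and `pd.Categorical` is used as a one-step alternative to the `astype("category")` plus `set_categories` pair above.

```python
import pandas as pd

# Toy data for illustration only (column name "count" is hypothetical)
toy_df = pd.DataFrame({
    "flavour": ["Fudge", "The Purple One", "Toffee Penny"],
    "count": [8, 5, 6],
})

# The custom order to sort by, best first
custom_order = ["The Purple One", "Fudge", "Toffee Penny"]

# A categorical column sorts by position in custom_order, not alphabetically
toy_df["flavour"] = pd.Categorical(
    toy_df["flavour"], categories=custom_order, ordered=True
)

sorted_df = toy_df.sort_values("flavour").reset_index(drop=True)
print(sorted_df)
```

One thing to watch for: any value that is not in the custom list becomes NaN when the column is converted, so a typo in a flavour name in the csv would silently drop that label.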
Pivot data for plotting

quality_df = quality_df.T
new_header = quality_df.iloc[0]  # grab the first row for the header
quality_df = quality_df[1:]  # take the data less the header row
quality_df.columns = new_header  # set the header row as the df header
quality_df = quality_df.reset_index(drop=True)
quality_df.index.names = ["time"]
quality_df
| time | Coconut Eclair | Toffee Penny | Orange Chocolate Crunch | Toffee Finger | Fudge | Milk Choc Block | Orange Creme | Strawberry Delight | Caramel Swirl | The Green Triangle | The Purple One |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 6 | 5 | 7 | 8 | 4 | 6 | 7 | 7 | 4 | 5 |
| 1 | 5 | 6 | 5 | 6 | 6 | 3 | 6 | 5 | 7 | 4 | 4 |
| 2 | 5 | 6 | 5 | 6 | 4 | 1 | 6 | 5 | 5 | 3 | 4 |
| 3 | 2 | 3 | 5 | 5 | 0 | 0 | 3 | 3 | 2 | 1 | 0 |
| 4 | 2 | 3 | 4 | 5 | 0 | 0 | 3 | 2 | 1 | 0 | 0 |
| 5 | 2 | 1 | 2 | 5 | 0 | 0 | 2 | 0 | 1 | 0 | 0 |
| 6 | 2 | 1 | 1 | 5 | 0 | 0 | 2 | 0 | 1 | 0 | 0 |
| 7 | 2 | 1 | 0 | 4 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 8 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
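If you also typed your data up the wrong way round, a shorter route to the same shape is to make the label column the index before transposing. This is only a sketch on a cut-down toy frame (the first three counts for two flavours from the table above); `toy_df` and `pivoted` are illustrative names.

```python
import pandas as pd

# Two flavours and their first three counts, copied from the table above
toy_df = pd.DataFrame({
    "flavour": ["Fudge", "Toffee Finger"],
    "t1": [8, 7],
    "t2": [6, 6],
    "t3": [4, 6],
})

# Use the flavour names as the index, then transpose so each flavour is a column
pivoted = toy_df.set_index("flavour").T

# Swap the t1, t2, ... labels for a 0-based integer index named "time"
pivoted = pivoted.reset_index(drop=True)
pivoted.index.name = "time"
print(pivoted)
```

Because the text column is moved into the index before the transpose, the counts stay numeric instead of being converted to generic objects along with the flavour names.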
Plot data

flavours = quality_df.columns.tolist()
time = quality_df.index.tolist()
data = [quality_df[flavour].tolist() for flavour in flavours]

# normalise each time point so the counts become fractions of what is left in the tub
normalised_data = np.zeros_like(data).astype(float)
for i in range(len(time)):
    normalised_data[:, i] = (np.array(data)[:, i]) / np.array(data)[:, i].sum()

flavour_to_colour = {"The Purple One": "purple",
                     "The Green Triangle": "limegreen",
                     "Caramel Swirl": "gold",
                     "Strawberry Delight": "red",
                     "Orange Creme": "darkorange",
                     "Milk Choc Block": "darkgreen",
                     "Fudge": "fuchsia",
                     "Toffee Finger": "chocolate",
                     "Orange Chocolate Crunch": "orangered",
                     "Toffee Penny": "goldenrod",
                     "Coconut Eclair": "mediumblue"}
palette = [colour for colour in flavour_to_colour.values()]
palette.reverse()
# stacked area plot
plt.stackplot(time, normalised_data, labels=flavours, colors=palette)

# reverse the legend entries so they read top-to-bottom in the same order as the stack
handles, labels = plt.gca().get_legend_handles_labels()
plt.legend(handles[::-1], labels[::-1], bbox_to_anchor=(1.04, 1), loc="upper left")

plt.xlim([time[0], time[-1]])
plt.ylim([0, 1])
plt.xlabel("Random times I checked the tub")
plt.title("Quality Street tub composition over time\n(chocolates ordered according to YouGov ranking)")
plt.show()
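The normalisation loop and the reversed palette can also be written without the loop or the reliance on dictionary order. The sketch below is one possible way of doing it, not the code behind the plot above: it uses a cut-down set of three flavours with their first three counts from the table, and the names `counts`, `fractions` and `totals` are made up for the example.

```python
import numpy as np

# Rows are flavours, columns are time points (first three counts from the table)
flavours = ["Toffee Finger", "Fudge", "The Purple One"]
counts = np.array([
    [7, 6, 6],
    [8, 6, 4],
    [5, 4, 4],
], dtype=float)

# Total chocolates left at each time point, one value per column
totals = counts.sum(axis=0)

# Broadcasting divides every column by its own total in one step
fractions = counts / totals

# Look colours up by flavour name so the palette always matches the plot order
flavour_to_colour = {
    "The Purple One": "purple",
    "Fudge": "fuchsia",
    "Toffee Finger": "chocolate",
}
palette = [flavour_to_colour[flavour] for flavour in flavours]

print(fractions)
print(palette)
```

Building the palette from the list of flavours means the colours stay attached to the right chocolate even if the sort order changes. Note that a time point where the tub is completely empty would give a total of zero, so that case would need handling before dividing.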