“Perfecting the Enemy of the Good” — A New Cupping Score Card by Ian Fretheim

Posted on March 16th, 2017

Cafe Imports Director of Sensory Analysis Ian Fretheim has been working on a new Cupping Score Card for some time now. After careful development and refinement, based on years of on-again/off-again brainstorming and months of application, Ian has arrived at a new form that we are deeming the “Analytic Cupping Score Card” (Figure 1). Cafe Imports’ Sensory Analysis department has been working with this new form for the past several months, in an effort to increase both the accuracy and the descriptiveness of our cupping program. Please enjoy the following essay, written by Ian himself, in which he explores the new design and the metaphysical hurdles to its development.
Figure 1: The Analytic Cupping Score Card (available as a downloadable .pdf)


It is sometimes posited, even admonished, that we should not allow the Perfect to become the enemy of the Good. Less creative words were never said. Of course, early on the road from pragmatic compromise to tired platitude, this may yet be sound advice. Later, it is no more than Status defending Quo.
Or maybe the saying is true, and it is not the Perfect (head) but Perfection (heart) and Perfecting (hands) that we should look for in defense against the Good. In the knowing hands of experts, in the movements of the potter and the poet…we see that the Perfect is a far cry from Perfection. We see the ruse of the Perfect. Not the potter’s ruse but Status’s. The Perfect is nonexistent. And the status quoth neither heart nor skill of hand, but to claim the Perfect enemy that does not exist. Not so, Perfection. Not so, Perfecting.
Of course there are times when the Good needs an enemy. There are times when the Good needs an enemy Better than its friends. Without challenge, cross-breeze cold drought and deluge now and then, the Good loses pace, loses what makes it Robustly Good, defends weaknesses as pillars and afterthoughts as strengths.
Plato may have planted these seeds. He saw the human as dealing in imperfect approximations of the Real thing. The Perfect. But who is the enemy of whom? Until perhaps Plato, had there been no war? And was this, the opening shot, let fly not from the Perfect attacking the Good, but from the Good keeping the Perfect at bay? Is the warning not to let the Perfect become the enemy of the Good a sort of rally cry for the oppressive Good?
With Plato we begin in a cave being taught to not let the Good be the enemy of the Perfect. In a cave, Good dancing distraction before us. Distraction from?…the Perfect. Plato, original Perfectionist. But what did Plato see? By his own theory, and by his own eyes: Approximation. Of the Perfect.
Though he never tasted but 88s, Plato always scaled to 100.
How did he know?
It is not the Good for which we need to worry, but Perfection. Plato lacked his teacher’s knack for it. Lacked the Pliability to be wrong. To fall through the unknown and stick the landing. Or not. In Place of the unknown, Plato put the net of Perfect.
It may be that Plato once fell in love. Grecian air and all. One night they’re walking the coast…and this is where Odysseus… as the sun sets over the Mediterranean expanse, he turns and says to his beloved, “You’re really Good.” Plato. For him, she can not be the Perfect (for the Perfect [non]exists above all existents), nor can he handle perfection, cannot suffer perfection upon perfection. “You’re really good. 88. Maybe 89.” What if the moon catches her eye? “Point five.” Point five? Plato. Tell her/him s/he is perfection. But he cannot see it. So enamoured is he with the Perfect and its construction. Yes. Even for Plato there is no Perfect, only the Project. Perhaps the sweetest he can say is: “Dearest Danae, before you my understanding of the Perfect was less Perfect. But you’ve skewed the distribution and now I’ve added you to the accumulated data and the Perfect is more Perfect than I had imagined!”
Plato regrets his former youthful Proclamations of 87.5, so rash and lacking in Perspective. So absent knowledge and imagination of Danae. So 85 and a quarter. But there they are. Etched in the stone of some demon’s cave, imperfect approximations of the Perfect for all to see. Every 100 must be dragged down by gross material connection with lowly 86s. None withstand judgement and all come out 91 – 92.
But what if we, for a moment, set aside Plato’s approximation of the imperfect Good, which he called the Perfect? What if we take upon ourselves the task of Perfecting an enemy of the Good, rather than positing its ideal Perfect? An enemy of the Good set out not as the straw Perfect, but instead as the Better? Failing Perfection, might this enemy yet revitalize the dogged Good?
We have been working on a new cupping score card here at Cafe Imports for the last few months. The project began in earnest in November of 2016, but followed literally years of ongoing conversation and brainstorming. By early December we had a working model and by the end of the year we had refined it and begun training on its use.
What is a score card? We can think of it as a questionnaire. As the administrator of a sensory test, I give my panelists score cards that I intend for them to use to tell me about their experiences. There is certain information that I want to know, in our case, about the coffees on a cupping table. I cannot follow each cupper around the table, asking (for inspiration) at each aspiration, “uh, so, how’s the acidity?” “hmm, ok, how about now? Aftertaste?” “oh, hey, you getting jasmine in that?” “freshly cut? Or more tea-like?” Neither can I just set out the cups and tell everyone to have at it. The information received would scale between pictograms and this essay.
So the score card is a questionnaire. Great. Now we just have to decide what we want to ask. Oh, and how we want to ask it. Oh, and maybe what it is about what we want to ask, and how we can ask that.
Grab a bunch of cupping score cards and compare them. There are overlapping categories, and there are unique categories. In many cases they are arranged differently from one another, even where they are similar. What gives? Ever been to a Cup of Excellence competition? It’s no mistake that their form leads with Clean and Sweet. At every CoE orientation to which I’ve ever been, the head cupper emphasizes Clean and Sweet. If you are unsure about a coffee, ask yourself, is it Clean and Sweet? By leading the scorecard with Clean and Sweet, Cup of Excellence is helping to orient their panelists to find the coffees that best fit their criteria.
So, the order in which the questions get asked matters. What else? Well, how about what we do and do not ask? For example, coffee has bitterness. Even very good coffees have some bitterness. But specialty forms don’t ask about it. Why? Could be that, relative to lower-grade beans and Robustas, specialty Arabica stands out for its lack of bitterness. Could be that we’d rather spend our time assessing other, more positive attributes. Could be that we take it for given and that’s a wrap.
There’s a problem, though, which is that bitterness is there. Lurking in every Aftertaste, every Overall, every Final Score. While we can use form structuring and question selection to focus the efforts of our panelists, we cannot very well get them to leave out integral aspects of their experience — in particular when we are asking the very open questions of quality and perception through categories like Flavor and Aftertaste. Life finds a way and bitterness is going to get scored. As is lack of bitterness. Every time. Not providing space for bitterness is fine, but it also means that we have information that 1) is not getting reported and 2) is bleeding into other categories without clear specification.
Once we figure out all of our questions, we’ll figure out the order in which to present them. We know that we want to explicitly include the most basic and unavoidable aspects of the tasting experience, lest they find their own way in. For coffee, this will be the tastes sweet, sour, and bitter. We’ve already got two, and so adding the third will be easy. Just need to make room.
What is this? For everything we add we need to take two away? Not quite. But we do need to limit the number of questions we’re asking our panelists to consider. If you look beyond coffee, you’ll see that in many cases we’re trying to extract much more information from our sensory panels all at once than are other specialty industries. Ask too much and you’ll dilute the answers. Don’t give enough time to answer them, same result. It works in reverse, too. Ask too few questions or give too much time and you’re likely to get over-cooked (extracted) responses.
What have we been asking about? Aroma, Flavor, Aftertaste, Acidity, Body, Sweetness and Cupper’s Score. What have we been asking about these things? We’ve been asking what the quality is. What is the quality of the Flavor? Of the Acidity? Why? Perhaps for the same reason we don’t ask about bitterness. We are a quality-based industry. What would be the alternative? Intensity. However, raw intensity does little to tell us whether something is any good. Tons of acidity, but it’s all acetic? Enough said — and back to quality.
What is quality? If we look to acidity we can enumerate types of acid (citric, malic, etc.) and then designate which of those are considered good and which are considered bad. We can call this Q1. But that’s not the end of it. Maybe malic is a higher-rated type than citric, but maybe the citric acid in this Yirgacheffe offers a more pleasant experience, described as juicier, than the malic in this Huehuetenango. Let’s call this Q2. And again, what if Yirg number 2 has a similarly juicy citric acid as Yirg number 1, but Yirg number 2’s acid is somehow more concentrated, clear, or representative of citric acid in coffee? Q3. Q4 goes to preference, for while we might deny that preference enters the professional assessment, it’s there. Then, of course, so long as we are talking about positive attributes, we do indeed bring in intensity. We’ll call it I. Let’s call all of these the Indices of Quality. There may be others, and these may not each be weighted equally, though it would be better if they were…
How do we come to a score for the acidity of a coffee? Simplistically, Q1 +/- Q2 +/- Q3 +/- Q4 +/- I. And again for flavor, aftertaste, body, sweetness, etc. And again for the next coffee and the next. Either this, or we just use personalized shortcuts to loosen the bandwidth required to rapidly make the Quality calculation over and over. Intensity looks pretty good again, what with the Indices of Intensity being… intensity.
While quality needs significant simplification and specification, intensity needs precise elaboration and qualification.
Maybe there’s another way to bring in quality? Something less complex and ambiguous? Something that can allow it to elaborate and qualify intensity? Can we build a new score card?
Let’s go back to what we said about what we’re asking about. What do we want to know? What are we looking for? With these questions for lenses, let’s look at our old categories: Aroma, Flavor, Aftertaste, Acidity, Body, Sweetness, and Cupper’s Score. We use seven categories scored 0 – 10 and give a 30-point handicap to all coffees, adding up to 100 (if it’s Perfect). Limiting ourselves to seven categories is extremely functional, and the 30-point math is comfortable so we’ll try to stay with that.
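For anyone who wants the old arithmetic spelled out, here is a minimal sketch of it: seven categories scored 0 to 10, plus the 30-point handicap. The function and variable names are made up for illustration and are not part of any Cafe Imports tool.

```python
# Illustrative sketch of the old seven-category arithmetic described above.
# Names (OLD_CATEGORIES, HANDICAP, old_card_total) are hypothetical.
OLD_CATEGORIES = (
    "Aroma", "Flavor", "Aftertaste", "Acidity",
    "Body", "Sweetness", "Cupper's Score",
)
HANDICAP = 30  # points given to every coffee


def old_card_total(scores):
    """Add the 30-point handicap to seven category scores, each 0-10."""
    for name in OLD_CATEGORIES:
        if name not in scores:
            raise ValueError(f"missing category: {name}")
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    return HANDICAP + sum(scores[name] for name in OLD_CATEGORIES)


# Seven tens plus the handicap is the only road to 100.
perfect = {name: 10 for name in OLD_CATEGORIES}
print(old_card_total(perfect))
```

Seven categories at a maximum of 10 each give 70, and the handicap supplies the remaining 30, which is why the math stays comfortable as long as the category count stays at seven.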
We’ll definitely keep Acidity and Sweetness, to which we’ll add Bitterness. This means we’ve got 5 categories and only 4 spots remaining: Aroma, Flavor, Aftertaste, Body, and Cupper’s Score. Let’s keep Body as it is less vague than the others, is a variable attribute in coffees and lends itself readily to scaling. This essay has gotten long so I’ll cut some chase. Aroma and Cupper’s Score are both out. Cupper’s Score doesn’t tell me anything. It may as well be Stubborn Score when it doesn’t match the attributes and Meh Score when it does. Aroma is important. I always smell the grounds before I make coffee at home. Everyone always smells the grounds when they cup. But what happens when a coffee smells really nice and then cups poorly? It doesn’t get bought. When it doesn’t smell like much but it cups out well? It gets bought. Panelists often say things like: “I marked this an 85, but my aroma score is pulling it up/down.”
Acidity, Sweetness, Bitterness, Body: in. Aroma and Cupper’s Score: out. We’re left with three open spots, and only Flavor and Aftertaste to fill them. Scratch that. Aftertaste is out. People frequently use it to just amplify their Flavor score, it’s qualitatively vague, correlates with other attributes, and now that we’re assessing Bitterness, Aftertaste can probably be dropped.
What about Flavor and our final three spots? Flavor is dubious. Most coffee tasters are highly flavor-centric, and yet flavor is an exceptionally vague category. Are we asking people to draw on the entirety of their food- and beverage-consuming lives? Seriously? And we’re asking coffee people to do this?! It’s no wonder cupping notes at times read like the heavily curated menus of the bourgiest working-class-themed restaurants in your city. If the fennel isn’t roasted on Jim’s cherry wood in a shale-composite outdoor oven, is it even an upper Midwestern farm-style pizza pie? This is the kind of stuff that makes coffee interesting–to talk about. It’s not the stuff that makes coffee quality.
“Flavor?” is a vague question. Try it sometime. “Hey, you! Flavor?” “Uh, what?” “Quality!” Of course, all the complexity discussed above applies. If we’re not interested in fennel, what are we interested in? What are we looking for? With some qualification, we’re looking for coffees that are Fruity, Floral, and Caramely.
Importantly, the most differentiated and specific flavor experiences come from a cupping roast and a cupping preparation. The cupping process certainly highlights defects, but it can also highlight the most exceptionally nuanced, subtle and volatile qualities of a coffee. In other words, much of the ambrosia of these apples is Edenic — whereas the soft, sweet, malic description is not.
Of all the madeleine moments that coffee can conjure, the coffees that we’re consistently interested in buying boil right down to Fruity, Floral, and Caramely. That roasted-fennel dreamscape that I described earlier? Maybe a bit floral, not too fruity, and, by the sounds of it, fairly caramely. We need more info, but what was it, some sort of Pacamara?
Where do we stand now? Fruity, Floral, Caramely, Acidity, Sweetness, Bitterness and Body. Flavors (sought after), Tastes, and Tactility. OK. Now we just need to figure out how to scale them. If you’ve noticed, we’ve already qualified Flavor to some extent. We’ve divided it into positive flavor groups. We could get into some trouble with ferment, but we can deal with that later. What if these categories were scaled graphically from Absent to Intense? What if Intense Fruitiness were a 10 (instead of Perfect coffee flavor)? Granting that the world didn’t end then and there with the panelist’s closing of the 0, I would know that this coffee was very fruity. What if Absent Fruitiness were a 6?
We haven’t gotten too much into score compression, but let me just drop from my pocket that most coffee scorers, like Plato, believe in the Perfect 100, and the crushing insult of <80. We fear with the fear of the ancient mysteries to trespass anywhere near the former or much beyond sight of the latter. It’s called compression, and I am an acolyte-hypocrite in never having scored anything 100 points. How then can I know that an 88 has 12 points to go? If we posit a Perfect 100-point coffee (though ever unknown, for knowledge can only taint the Perfect), by necessity standing behind every Real experience, then our 91s and our 92s will be 8 and 9 points fear, 4 and 5 points hope, and terribly uncomfortable all squished together near the ceiling of our imagination.
Back to it. What if Fructus Absentia were a 6? Simple. I would not expect a fruity coffee. Works the same with the other flavors. How about the tastes? What about that bad acid? Let’s try it. Let’s scale acidity from Lacking at 6 to Intense at 10 (Mild, Moderate, and Strong making up the middle). Acetic acid? Minus 2. We can use a checkbox. Intense gets a 10, minus 2 gives an 8. Seems a bit much except that the questions are “What is the intensity?” and “Is there acetic acid?” High and low are not the only concerns when we’re talking about scaling. It is also important for the numbers we use to tell us specific information about the category that we’re assessing. Imagine a spider diagram in which acidity is drawn out to ten, but the region beyond eight is shaded with a contrasting color. This conveys more information about the tasting experience than a diagram that only extends to seven or eight, wherein the acidity score has been pre-discounted as lower quality. For those worried over the final maths, we’ve found that acetic coffees tend in the end to be less sweet and more bitter than those without acetic acid.
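The acidity question and its checkbox can be sketched in a few lines. This is only an illustration: the essay fixes Lacking at 6 and Intense at 10, so placing Mild, Moderate, and Strong at 7, 8, and 9 is an assumption, and the names below are hypothetical.

```python
# Illustrative sketch of the acidity scale with an acetic checkbox.
# Only the endpoints (Lacking = 6, Intense = 10) and the 2-point deduction
# come from the essay; the even 7/8/9 spacing in between is an assumption.
ACIDITY_SCALE = {
    "Lacking": 6,
    "Mild": 7,
    "Moderate": 8,
    "Strong": 9,
    "Intense": 10,
}
ACETIC_DEDUCTION = 2


def score_acidity(intensity, acetic=False):
    """Return (raw intensity score, score after the acetic deduction)."""
    raw = ACIDITY_SCALE[intensity]
    adjusted = raw - ACETIC_DEDUCTION if acetic else raw
    return raw, adjusted


raw, adjusted = score_acidity("Intense", acetic=True)
print(raw, adjusted)  # 10 8
```

Returning both numbers keeps the two answers separate, in the spirit of the shaded spider diagram: the raw value says how much acidity there was, and the adjusted value says what the checkbox cost it.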
CQAs? We can use another checkbox. Sweetness is easy. More is more. But bitterness? Graphically we can scale the same: Lacking through Intense. Numerically, we simply invert, such that Lacking is a 10, and Intense is a 6.
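The inversion for bitterness is small enough to show directly. In this sketch the five labels are borrowed from the acidity scale, on the assumption that "scale the same" means the same steps; only Lacking at 10 and Intense at 6 are stated outright, and the names are hypothetical. The CQA checkbox is left out because no deduction value is given for it here.

```python
# Illustrative sketch of the inverted bitterness scale.
# The graphic runs Lacking -> Intense, but the points run 10 -> 6.
# The three middle labels are assumed to match the acidity scale.
INTENSITY_LABELS = ("Lacking", "Mild", "Moderate", "Strong", "Intense")


def score_bitterness(label):
    """Map a bitterness intensity label to its inverted score."""
    position = INTENSITY_LABELS.index(label)  # 0 for Lacking, 4 for Intense
    return 10 - position


for label in INTENSITY_LABELS:
    print(label, score_bitterness(label))
```

The panelist still answers an intensity question; the form does the flipping, so less bitterness earns more points without anyone having to think in reverse at the table.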
This leaves us with Body. Thin – Normal – Thick. 6 – 8 – 10. Rough? Minus 2. Astringent? Minus 4. Are thick coffees objectively better than thin ones? No, but they are thicker. If an EP Excelso gets a 10 on body compared to some spindly Gesha with a 6, then that’s great because now I have some information. The Gesha can score very well elsewhere and the Excelso can be Thick. It’s OK; you don’t have to marry it.
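Body works the same way, with set-value deductions for the checkboxes. A rough sketch follows. Whether Rough and Astringent can both be checked and stacked is not settled in this essay, so letting them add up is an assumption, as are all of the names.

```python
# Illustrative sketch of the Body category with set-value checkboxes.
# Scale and deduction values come from the essay; allowing both boxes to be
# checked at once (and summing them) is an assumption.
BODY_SCALE = {"Thin": 6, "Normal": 8, "Thick": 10}
BODY_DEDUCTIONS = {"Rough": 2, "Astringent": 4}


def score_body(thickness, flags=()):
    """Return the body score after subtracting any checked deductions."""
    base = BODY_SCALE[thickness]
    deduction = sum(BODY_DEDUCTIONS[flag] for flag in set(flags))
    return base - deduction


print(score_body("Thick"))                  # the EP Excelso: 10
print(score_body("Thin"))                   # the spindly Gesha: 6
print(score_body("Thick", ["Astringent"]))  # thick but astringent: 6
```

Note that the thick, astringent cup and the thin, clean one land on the same number here, which is one reason to keep the checkboxes on record alongside the score.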
Accounting for defects and further qualitative refinements can be done surprisingly well with set value checkboxes, as you’ll see below.
The example of our score card in Figure 1 is a working draft. It has shortcomings. It raises questions. Can we include a slider for panelists to note roast level? What are the specific thresholds and definitions for each category? When should I mark “Variable?” or “Muddled?” What is “Tropical Fruit?” Are CQAs always astringent, and if so should they discount twice? Can coffees “plus-one” more than once in a single category (e.g. Tropical and Stone Fruit)? What happens when we no longer love “Tropical Fruit?” If a cupper scores a coffee higher than 100 points, does their spoon get revoked?
This isn’t the final word in score cards. For me, it’s just the second or third word. Remember: We’re Perfecting an enemy of the Good. Should we find success such that our score card is one day itself Good, another enemy will be needed.
— Ian Fretheim, Cafe Imports Director of Sensory Analysis