CUE: A Usability Testing Bake-Off

by Jared M. Spool

Practicing usability testing has a lot in common with baking an apple pie.

If you’ve never baked a pie and you’re like most people, you’ll probably learn from a friend or family member, such as Grandma, or you’ll learn from a cookbook recipe or TV cooking show. The first few pies you bake—while still tasting pretty good—probably won’t come out as good as you’d like, but, with practice, you’ll get better and better.

As you continue to practice, you’ll experiment—sometimes deliberately and sometimes through creative accidents, like leaving out a key ingredient—and from each experiment you’ll learn more about what makes a great pie. As your pie baking skills become more developed, people will beg you to bake more pies and will show great excitement when another one comes out of the oven.

At least, that’s been my experience with baking apple pies. Also, coincidentally, it’s been my experience with usability testing.

The Parallels

Many folks learned usability testing from a colleague, read about it in a book, or attended a course. The first few tests they conducted, while producing interesting results, probably didn’t come out as well as they would have liked. However, with practice, they got better and better.

As they continued to practice, they started to experiment, and from each experiment they learned more about what produces a successful usability test. As their testing skills became more developed, their coworkers begged them to conduct more and more tests and showed great excitement with every new test.

Solitary Activity

Both baking apple pies and conducting usability tests are solitary activities, for the most part. The baker and the test organizer typically produce the results by themselves.

Rarely do we get to see how someone else bakes their pies. Rarely do we get to work alongside other usability test organizers to see how they do their work.

Therefore, we develop our craft by ourselves, only really learning from our own experiences. Because of our own trial and error, our testing techniques and skills evolve in a unique direction.

The Value of the Bake-off

In the world of baking apple pies, communities of pie-bakers gather at events known as “bake-offs.” These festivals give the bakers a chance to compare their results, often under the guise of choosing the best pie.

While it’s great to win the bake-off, the real benefit comes from learning how others created their entries. Seeing the different techniques and ingredients they used can give great insight into ways to improve your pies in the future. The bake-off is an opportunity to compare and contrast your approach against the approaches of your peers.

CUE: A Usability Testing Bake-off

In 1998, Rolf Molich held what we could call the first usability testing bake-off. He called it a Comparative Usability Evaluation, or CUE, but the goals are the same.

So far, Rolf has held four CUE sessions, each one comparing top-notch practitioners demonstrating their skills. In each session, each practitioner evaluated the same interface, using his or her own personal techniques. The results have been fascinating.

In an apple pie bake-off, the chefs often compare other bakers’ recipes to their own. In Rolf’s sessions, participants compared their techniques, looking for interesting differences.

For example, each practitioner wrote a report that Rolf turned over to the client whose design they were evaluating. The participants could compare their reports to the other teams’ reports and see what they did differently.

In fact, all the teams produced very different reports. They also planned the projects differently, recruited their participants differently, and created different tasks, even though they were all testing the same interface.

Comparing Leads to Improvement

By comparing our work against that of others doing the same thing, we can easily see opportunities to improve our own. For example, by reading the various practitioners’ reports, we can get ideas for new sections to add to our own reports and clever ways to describe certain problems.

Even if we didn’t participate in the CUE session ourselves, there are still interesting things to learn. We can look at the instructions Rolf gave each participating practitioner and then ask ourselves, “How would we have executed this project?”

We can think about how we’d recruit users. How many users would we recruit? Would we exclude users already experienced with the interface, focusing only on newbies? Alternatively, would we try to balance the test for both experienced and inexperienced users?

How would we design the tasks? Would we focus purely on structured tasks, trying to collect timing information and success rates? Alternatively, would we look at problem discovery, trying to uncover every problem available?

By comparing what we would do against what the CUE practitioners actually did, we can see how our techniques differ and get ideas on how we could do things differently going forward.

Rating the Problems

Every apple pie tastes a little different. It’s neat to taste a sample from multiple pies, one after the other. You can really see how different approaches produce different results. Some are more tart and some are sweeter. Some have a chunky filling, while others have a crisp topping.

The same is true when you compare the problems that the CUE practitioners found. You can look at a problem, such as a particular function seeming unintuitive, and decide if you would have rated that as a severe problem or as something not worthy of much attention. Reading the CUE problem reports will make you think about how you report problems and rank their severity.

It is fascinating to see how some practitioners rated certain problems as critically important, while other practitioners rated the same problems as low-priority issues. Many colleagues, after looking at the CUE results, have revisited their severity ranking process to make sure they are accurately rating the problems they discover.

Striving for Constant Improvement

Because we’re often working by ourselves, having an external reference point to compare our work to is important. For apple pie bakers, the bake-off is where we can learn from others. In the world of usability testing, the CUE studies are quickly becoming the focal point for these comparisons.

About the Author

Jared M. Spool is a co-founder of Center Centre and the founder of UIE. In 2016, with Dr. Leslie Jensen-Inman, he opened Center Centre, a new design school in Chattanooga, TN, to create the next generation of industry-ready UX Designers. They created a revolutionary approach to vocational training, infusing Jared’s decades of UX experience with Leslie’s mastery of experience-based learning methodologies.
