A fellow UX professional recently came to me with a fun story: Her boss came to her desk and told her, “I want us to add benchmarking to our testing strategy.” When she asked him why, he told her that he’d recently heard that a competitor was doing it, and he wanted to keep up with the times.
Benchmarking is a hot topic in the UX industry, and if it's done properly, it can yield a wealth of insight into how your site or app is performing over time!
Unfortunately, it's often perceived as "just another" way to test the usability of a product. The truth is, benchmarking is a commitment, and it requires all the thought and reflection that any good commitment entails. Stick to the guidelines below, and you'll gain a rich collection of data that showcases your team's successes and setbacks across the life of your site or app.
Benchmarking is the process of testing a site or app's progress over time. That can mean progress:

- through different iterations of a prototype
- across different versions of an application
- or even across different sites: yours and your competitors'.
A lot of companies like to run benchmarking studies because they show actual data as it changes over time. It adds a quantitative dimension to the qualitative research you’re already doing, which can drive home your findings and add weight to your suggestions. Plus, it’s always nice to be able to share a pretty chart with the manager!
It’s intensely gratifying to be able to track a product’s successes and failures over time. In the above example, we can see that the app in question has gotten worse at helping users get started. Whatever the first round’s tutorial was doing, it was doing it well; it might be time to scrap the new tutorial flow and return to what worked.
Benchmarking can also clue you in to the ways that later iterations of your product are solving problems, like in the graph above. As you can see, navigation has steadily improved in each iteration---it even reached a perfect score for tablet users in the latest round! Without watching a single video, you can rest easier knowing that users are getting through your app without difficulty.
It starts with a conversation among your team. You need to agree on exactly what you're hoping to learn, because once you've decided on a script for a benchmarking study, you shouldn't make changes to it. (That could compromise the integrity of the research!)
Once you know the big questions that your team wants to answer, then you need a script---a very specific kind of script.
1. The tasks have to be goal-oriented and as basic as possible.
2. The tasks also have to avoid signposts or descriptive instructions, since those elements might not exist in the next round of testing.
3. Follow each task with evaluative questions, and keep a consistent pattern of tasks and questions. Using measurable questions like rating scales or multiple choice will give you the data you need to make those pretty charts!
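To see how rating-scale answers turn into chartable numbers, here's a minimal sketch (the task names, rounds, and ratings are all hypothetical) that rolls each round's 1-5 ratings up into a mean score per task, so rounds can be compared side by side:

```python
from statistics import mean

# Hypothetical benchmarking data: each round maps a task to the
# list of 1-5 ratings that users gave after completing it.
rounds = {
    "Round 1": {"Find a product": [3, 4, 2, 4, 3], "Check out": [4, 5, 4, 4, 5]},
    "Round 2": {"Find a product": [4, 4, 5, 4, 4], "Check out": [4, 5, 5, 4, 5]},
}

# Average each task's ratings so every round yields one comparable score per task.
for round_name, tasks in rounds.items():
    for task, ratings in tasks.items():
        print(f"{round_name} | {task}: {mean(ratings):.2f}")
```

Because the tasks and questions stay identical between rounds, each task's mean is directly comparable, which is exactly what a trend chart needs.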
Once you’ve got a script, it’s important to keep it as consistent as possible. Any shifts in the tasks or questions can compromise data and make it more difficult to compare results across different rounds of testing, so try not to tinker with it!
(Pro tip: The UserTesting dashboard makes it unbelievably easy to set up your next round of benchmarking: just click the “Create Similar Test” button!)
The Create Similar Test button copies the details from your previous test, so you can set up a second round of benchmarking with a click.
Finally, it’s important to capture the same number and type of users.
If your budget allows, include more users than you would in a typical study. For ordinary usability tests, 5-7 users will normally give you plenty of qualitative insight. But because we’re looking more at numeric trends in a benchmarking study, you can bump up your total number of users to 10, 20, or even 50.
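A rough sketch of why those larger samples matter: the margin of error on a metric like task-success rate shrinks as the number of users grows. This uses a simple normal approximation with an illustrative 80% success rate; the numbers are a rule of thumb, not a strict guarantee.

```python
from math import sqrt

success_rate = 0.8  # hypothetical: 80% of users completed the task

# 95% margin of error under a normal approximation: 1.96 * sqrt(p(1-p)/n)
for n in (5, 10, 20, 50):
    moe = 1.96 * sqrt(success_rate * (1 - success_rate) / n)
    print(f"n={n:>2}: 80% +/- {moe * 100:.0f} points")
```

With 5 users the estimate swings by roughly 35 points either way; with 50 it tightens to about 11, which is why numeric trends call for bigger samples than ordinary qualitative tests.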
You don't need to test with the exact same people each time; it’s okay to get fresh eyes on each version of your product.
However, the total number of users and any distinguishing characteristics should remain the same. For example, if you tested with 20 college students the first time around, you'll need to test with 20 college students for the next round too.
The frequency and spacing of new studies depends on a variety of things, including how quickly new designs are being produced and how much of a budget you have.
The important thing about benchmarking is seeing the progress over time, so whatever you do, don't run an initial benchmarking study and then never run another one. You'll never see the real benefit a benchmarking program can offer!