This article is available here.
This article investigates three methods for post-task questionnaires or ratings that are widely used in usability testing: Usability Magnitude Estimation (UME), the Subjective Mental Effort Question (SMEQ), and the Likert scale.
For UME, participants create their own scale of difficulty ratings. They can assign a task any rating greater than zero. The judgement is therefore based on the ratios between the ratings for all the tasks, not on the absolute values.
For SMEQ, participants are asked to either draw a line through or move a scroller on a vertical scale with nine labels ranging from “Not at all hard to do” to “Tremendously hard to do”.
And for the Likert scale, participants are simply asked to choose from a fixed number of ratings.
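To make the ratio idea concrete, here is a minimal sketch of how ratings from participants who invent very different personal scales could be put on a common footing. It assumes the ratings are rescaled by each participant's geometric mean, which is a common convention for magnitude estimation but is my assumption here, not necessarily the procedure used in the article. The function name `normalize_ume`, the task names, and the numbers are all made up for illustration.

```python
import math


def normalize_ume(ratings):
    """Rescale one participant's UME ratings by their geometric mean.

    `ratings` maps a task name to that participant's rating (> 0).
    After rescaling, only the ratios between tasks remain.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r <= 0 for r in ratings.values()):
        raise ValueError("UME ratings must be greater than zero")
    # Geometric mean computed through logs to avoid large products.
    log_mean = sum(math.log(r) for r in ratings.values()) / len(ratings)
    geo_mean = math.exp(log_mean)
    return {task: r / geo_mean for task, r in ratings.items()}


# Two hypothetical participants with very different personal scales
# but the same ratios between tasks (illustrative numbers only).
p1 = {"task_a": 10, "task_b": 20, "task_c": 40}
p2 = {"task_a": 100, "task_b": 200, "task_c": 400}

n1 = normalize_ume(p1)
n2 = normalize_ume(p2)

for task in p1:
    print(task, round(n1[task], 3), round(n2[task], 3))
```

Because both participants judged each task as twice as hard as the previous one, their rescaled ratings come out the same even though the raw numbers differ by a factor of ten.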
This study investigates the correlation of the three methods through three separate experiments. In experiment one, six users were asked to perform seven tasks on a web-based Supply Chain Management application. Prior to the test, participants practiced making judgements with UME. They were also asked to complete two 7-point Likert scale ratings and a UME rating. A think-aloud protocol was used in this test. In experiment two, 26 participants were recruited to test a travel and expenses application, and five tasks were assigned to them. Besides UME and the Likert scale, experiment two also included SMEQ. The researchers used the data collected from these two experiments to test for statistical significance.
Based on these experiments and the statistical analysis of the collected data, they concluded that with sample sizes above 10-12, any of the three question types can yield reliable results. Below 10 participants, however, none of the question types has a high detection rate.
The Likert question was easy for participants to use and easy for administrators to set up in an electronic form. The SMEQ question showed good overall performance as well, and with the scroller in the online version it was easy to learn. One drawback is the work of building the widget. Participants had difficulty learning to use the UME question type. It was less sensitive than the other question types and had lower correlations with other measures such as the System Usability Scale (SUS). Based on these findings, they suggest that if you want the additional information and benefits of SMEQ or UME, SMEQ is a good choice but UME is not.
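As a rough illustration of what "correlation of the methods" can mean in practice, the sketch below computes a Pearson correlation between the per-task mean ratings of two question types. This is only my own minimal example, not the researchers' actual analysis. The helper `pearson` and the rating values are hypothetical, and I am assuming that a higher Likert rating means an easier task while a higher SMEQ score means more effort.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)


# Made-up mean ratings for five tasks (NOT data from the study).
likert_means = [6.1, 5.4, 3.2, 4.8, 2.5]      # assumed: higher = easier
smeq_means = [15.0, 30.0, 60.0, 45.0, 85.0]   # assumed: higher = more effort

r = pearson(likert_means, smeq_means)
print(f"Likert vs. SMEQ correlation: {r:.2f}")
```

With the scales pointing in opposite directions, a strongly negative value would mean the two question types rank the tasks in much the same way.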
I came across this article while I was looking for ways to do post-task questionnaires. I found it on a blog called measuringusability.com. I think it’s quite interesting to see people testing the testing methods used in usability testing. And it’s interesting to see that they prove their points using statistical analysis as well.
This article convinced me that sometimes the simplest design can do the trick without overcomplicating things. That’s why in UR 4, my group went with just one post-task question: “Overall, this task was”. In this article, I also found references to a previous study on the number of options for Likert-type questions. It turns out that anything more than 7 will only confuse participants without revealing more detail.
Overall, I think this is a great article. And I’d love to read more about this.