Saturday, August 29, 2009

Aug 29 - Hollingsed & Novick, Usability Inspection Methods after 15 Years of Research and Practice

Usability Inspection Methods after 15 Years of Research and Practice.
Tasha Hollingsed. Lockheed Martin, 2540 North Telshor Blvd., Suite C, Las Cruces, NM 88011. +1 505-525-5267 tasha.hollingsed@lmco.com
David G. Novick. Department of Computer Science, The University of Texas at El Paso, El Paso, TX 79968-0518. +1 915-747-5725 novick@utep.edu

SIGDOC’07, October 22–24, 2007, El Paso, Texas, USA.

ABSTRACT
Usability inspection methods, such as heuristic evaluation, the cognitive walkthrough, formal usability inspections, and the pluralistic usability walkthrough, were introduced fifteen years ago. Since then, these methods, analyses of their comparative effectiveness, and their use have evolved in different ways. In this paper, we track the fortunes of the methods and analyses, looking at which led to use and to further research, and which led to relative methodological dead ends. Heuristic evaluation and the cognitive walkthrough appear to be the most actively used and researched techniques. The pluralistic walkthrough remains a recognized technique, although not the subject of significant further study. Formal usability inspections appear to have been incorporated into other techniques or largely abandoned in practice. We conclude with lessons for practitioners and suggestions for future research.


By the early 1990s, usability had become a key issue, and methods for assuring usability burgeoned. A central argument in the field concerned the relative effectiveness of empirical usability testing versus other, less costly, methods (see, e.g., [10], [19], [20], [24]). Full-blown usability testing was effective but expensive.
Other methods, generally grouped under the category of usability inspection methods [33], promised useful usability results at lower cost by relying on expert review or analysis of interfaces rather than on empirical observation of actual users. Several approaches to usability inspection were proposed, including heuristic evaluation [32], the cognitive walkthrough [51], the pluralistic walkthrough [3], and formal inspections [22].

To provide perspective on usability inspection methods in light of the 15 years of research and practice since the 1992 workshop, we review subsequently reported work for the four principal approaches described in [33]: heuristic evaluation, the cognitive walkthrough, formal usability inspections, and the pluralistic usability walkthrough.


HEURISTIC EVALUATION

In 1990, Nielsen and Molich introduced a new method for evaluating user interfaces called heuristic evaluation [28], [31]. In this method, a small group of usability experts evaluates a user interface against a set of guidelines, noting where each usability problem occurs and how severe it is. In evaluations of four interfaces, they found that the aggregated results of five to ten evaluators identified 55 to 90 percent of the known usability problems.
They concluded that heuristic evaluation was a cheap and intuitive method for evaluating the user interface early in the design process. The method was proposed as a substitute for empirical user testing [32], [35].

One study [19] compared the four best-known methods of usability assessment: empirical usability testing, heuristic evaluation, the cognitive walkthrough, and software guidelines. The study reported that heuristic evaluation found more problems than any other method, while usability testing revealed more severe, more recurring, and more global problems than heuristic evaluation. Other researchers found that heuristic evaluation identified more problems than the cognitive walkthrough, but only when the evaluators were usability experts [10]; system designers and non-experts found the same number of problems with either method. Another study [24], however, found that empirical testing yielded more severe problems than inspection methods, and it ascribed the differences from the results in [19] to differences in evaluator expertise.

Nielsen examined the role of expertise as a factor in the effectiveness of heuristic evaluation [32]. He compared evaluation results from three distinct groups of evaluators: novices, usability experts, and double experts, who have expertise both in usability and in the particular type of interface being evaluated. As might be expected, the double experts found more problems than the other two groups. Nielsen concluded that if only regular usability experts can be obtained, then a larger number of evaluators (between three and five) is needed than if double experts are available.
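A rough independence model helps show why aggregating even a handful of evaluators covers most of the known problems. This is our illustration, not a calculation from the paper: if each evaluator independently detects any given problem with probability p, then n evaluators together find an expected fraction 1 - (1 - p)^n of the problems. A minimal Python sketch, where the detection rate p is an assumed parameter rather than a figure reported by Nielsen:

# Illustrative sketch (an assumption for this post, not the paper's model):
# each evaluator independently finds any given problem with probability p,
# so aggregating n evaluators finds an expected fraction 1 - (1 - p)**n.

def fraction_found(n_evaluators: int, p: float) -> float:
    """Expected fraction of problems found by n independent evaluators."""
    return 1.0 - (1.0 - p) ** n_evaluators

if __name__ == "__main__":
    for p in (0.2, 0.3, 0.4):  # assumed per-evaluator detection rates
        cells = ", ".join(
            f"n={n}: {fraction_found(n, p):.0%}" for n in (1, 3, 5, 10)
        )
        print(f"p={p:.1f} -> {cells}")

With p around 0.3, for instance, three evaluators find roughly 66 percent of the problems and five find roughly 83 percent, with diminishing returns thereafter, which is broadly consistent with the 55 to 90 percent range reported for five to ten evaluators and with Nielsen's three-to-five recommendation.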

Jeffries and Desurvire [20] challenged the idea that heuristic evaluation could replace empirical user testing, listing the disadvantages of heuristic evaluation that emerged across these studies.
A first disadvantage is that the evaluators must be experts, as suggested by Nielsen’s findings earlier in the same year.
A second disadvantage is that several evaluation experts are needed. The authors pointed out that it is difficult for some development organizations to obtain even one expert, much less several, and that using these experts repeatedly throughout the development cycle, as Nielsen suggests, can become costly.
A third disadvantage is cost-effectiveness. Most of the issues identified by heuristic evaluation in the studies were minor, and few of the severe issues were identified. A further cost is that some of the reported issues may be “false alarms”: issues that would never bother users in actual use.

Nielsen responded to these comparison studies with one of his own [35]. He found that while user testing is more expensive than heuristic evaluation, it also better predicts the problems users will actually encounter in the interface.

The consensus view appears to be that empirical testing finds more of the severe issues that will likely impede users, but at a greater cost to the developer, while heuristic evaluation finds many of the interface's issues more cheaply and earlier in the development process.

Assessment of the effectiveness of heuristic evaluation continues as an active research thread.

(To be continued...)
