Wednesday, November 4, 2009

Nov 5 - Baker, Heuristic Evaluation of Shared Workspace Groupware (MSc thesis)

Kevin F. Baker

Comments: My interest in Baker's work stems from having encountered it indirectly in other people's dissertations and research.

Publications from this Research
An earlier version of the mechanics of collaboration heuristics (similar to Chapter 3) has appeared in the following peer-reviewed publication:
Baker, K., Greenberg, S. and Gutwin, C. (2001) Heuristic Evaluation of Groupware Based on the Mechanics of Collaboration. In M. Little and L. Nigay (Eds) Engineering for Human-Computer Interaction, LNCS Vol 2254, pp. 123-139.
The methodology, analysis, and results from this research study (Chapters 4 and 5) have been summarized in the report listed below. This report has been peer-reviewed and accepted for the upcoming ACM Computer Supported Cooperative Work conference (CSCW 2002).
Baker, K., Greenberg, S. and Gutwin, C. (2002) Empirical Development of a Heuristic Evaluation Methodology for Shared Workspace Groupware. Report 2002-700-03, Department of Computer Science, University of Calgary, Alberta, Canada.

Abstract
Despite the increasing availability of groupware, most systems are not widely used. One main reason is that groupware is difficult to evaluate. In particular, there are no discount usability evaluation methodologies that can discover problems specific to teamwork.
In this research study, I adapt Nielsen’s heuristic evaluation methodology, designed originally for single user applications, to help inspectors rapidly, cheaply, and effectively identify usability problems within groupware systems.
Specifically, I take Gutwin and Greenberg's (2000) mechanics of collaboration and restate them as heuristics for the purpose of discovering problems in shared visual work surfaces for distance-separated groups.
As a secondary objective, I revise existing Locales Framework heuristics and assess their compatibility with the mechanics.
I evaluate the practicality of both sets of heuristics by having individuals with varying degrees of HCI and CSCW expertise use them to uncover usability problems in two groupware systems. The results imply that practitioners can use the mechanics of collaboration heuristics to effectively inspect and evaluate groupware, identifying obstacles to real-time interaction over shared workspaces.
The Locales Framework heuristics are not as promising: while inspectors do identify problems inhibiting groupware acceptance, their practicality is limited and they require further improvement.


Chapter 1
Introduction

1.2 A brief survey of single-user evaluation techniques

Research in HCI has developed a multitude of evaluation techniques for analyzing and then improving the usability of conventional single-user interfaces. Each methodology highlights different usability issues and identifies different types of problems; therefore, evaluators can choose and mix techniques to fit the needs and nuances of their situation (McGrath 1996).
These methods are commonly grouped into three primary categories: user observations, field studies, and interface inspections.

1.2.1 User observations

Techniques in this category are conducted in a lab, ideally using a representative sample of the eventual users performing tasks that depict how the product will be used in the “real” world. Evaluators uncover problems, called ‘usability bugs’, by observing the participants completing the tasks with the interface under evaluation.
User observation methodologies include controlled experiments and usability testing.

Controlled experiments.
Controlled experiments are used to establish a cause-and-effect relationship; it must be shown with certainty that the variation of experimental factors (i.e. the independent variables), and only those factors, could have caused the effect observed in the data (i.e., the dependent variable).
Rigorous control is used in these experiments to ensure that extraneous variables (i.e., potential confounds) do not affect the results and their interpretation.

Usability testing.
The goal of usability testing is to identify and rectify usability deficiencies, with the broader intent of creating products that are easy to learn, satisfying to use, and that provide high utility and functionality for their users (Rubin 1994).
Designers can manipulate the design of the product to allow them to see how particular features encourage or discourage usability. Participants can do several perhaps unrelated tasks that allow an evaluator to see how the human computer system performs over a broad set of expected uses. The product itself can be changed as the test progresses. For example, if pilot testing reveals certain problems, then the product can be modified midway to correct them.
When the product is tested with one individual, the participant is encouraged to think aloud during the test. This involves users talking out loud while they perform a particular task in order to reveal their cognitive processes. An evaluator observes the participant performing the task in question, focusing on occurrences such as errors made and difficulties experienced.
The information collected can then be applied to remedy the observed usability problems by going through another design iteration of the product, eventually leading to another usability test.

1.2.2 Field studies

A significant problem with performing an evaluation within the laboratory is the failure to account for conditions, context, and tasks that are central to the system's real-world use. Part of this failure stems from the fact that many system developers have only partial or naïve knowledge of the "real world" setting where the end system will be used.
Field studies allow us to study systems in use on real tasks in real work settings, and to observe or discover important factors that are not easily found in a laboratory setting.
Two field study techniques are ethnography and contextual inquiry.

Ethnography
Ethnography is a naturalistic methodology grounded in sociology and anthropology (Bentley et al. 1992, Hughes et al. 1994, Randall 1996). Its premise is that human activities are socially organized; therefore, it looks into patterns of collaboration and interaction.
Randall (1996) stresses four features of ethnography that make it distinct as a method:
1. Naturalistic: involves studying real people and their activities within their natural environment. Only by studying work under these circumstances can one rightfully inform the system’s design.
2. Prolonged: it takes time to form a coherent view of what is going on especially for a complex domain.
3. Seeks to elicit the social world from the point of view of those who inhabit it: the appropriate level of analysis is the significance of the behaviour and not the behaviour itself.
4. Data resists formalization: the methodology stresses the importance of context; therefore, there is no ‘right’ data to be collected.

Data is gathered by the ethnographer observing and recording participants in their environment as they go about their work activities using the technology and tools available to them. This includes focusing on social relationships and how they affect the nature of work. To understand what the culture is doing, the ethnographer must become immersed in the cultural framework.
The goal of an ethnographic study for system design is to identify routine practices, problems, and possibilities for development within a given activity or setting. The data gathered usually takes the form of field notes but can be supplemented by audio and video data.

Contextual inquiry
To facilitate designing products, contextual inquiry employs an interview methodology to gain knowledge of what people do within their real-world context (Holtzblatt and Beyer 1996, 1999).
Specifically, this is accomplished by first conducting interviews through observations and discussions with users as they work. Target users are representatives of those for whom the system is being developed.

1.2.3 Inspection methods

Inspection methods have evaluators ‘inspect’ an interface for usability bugs according to a set of criteria, usually related to how individuals see and perform a task. These methods use judgement as a source of feedback when evaluating specific elements of a user interface (Mack and Nielsen 1994). Inspection techniques include heuristic evaluations, task-centered walkthroughs, pluralistic walkthroughs, and cognitive walkthroughs.

Heuristic evaluation
Heuristic evaluation is a widely accepted discount evaluation method for diagnosing potential usability problems in user interfaces (Mack and Nielsen 1994, Nielsen 1992,1993, 1994a,b).
With this methodology, a small number of usability experts visually inspect an interface and judge its compliance with recognized usability principles (the “heuristics”) (Nielsen 1992, 1993, 1994a).
Heuristics are general rules used to describe common properties of usable interfaces (Nielsen 1994a). During a heuristic evaluation, heuristics help evaluators focus their attention on aspects of an interface that are often trouble spots, making detection of usability problems easier. Noncompliant aspects of the interface are captured as interface bug reports, where evaluators describe the problem, its severity, and perhaps even suggestions for how to fix it.
Through a process called results synthesis, these raw usability problem reports are then transformed into a cohesive set of design recommendations that are passed on to developers (Cox 1998).
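As an illustration of what such a report and the subsequent synthesis step might look like, here is a minimal Python sketch. The record fields, the severity scale, and the grouping step are my own assumptions based on the description above; the thesis does not prescribe any particular format for problem reports.

# Hypothetical sketch only: the fields below simply mirror the elements a
# heuristic-evaluation bug report is described as containing (the problem,
# the violated heuristic, a severity rating, and an optional fix).
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ProblemReport:
    evaluator: str          # who reported the problem
    heuristic: str          # the heuristic the interface fails to comply with
    description: str        # what the problem is
    severity: int           # e.g., 0 (not a problem) to 4 (usability catastrophe)
    suggested_fix: str = "" # optional suggestion of how to fix it

def synthesize(reports):
    """Crude stand-in for results synthesis: group raw reports by the
    heuristic they violate so that duplicates can later be merged into a
    consolidated problem list."""
    grouped = defaultdict(list)
    for report in reports:
        grouped[report.heuristic].append(report)
    return grouped

raw_reports = [
    ProblemReport("Inspector A", "Provide consequential communication",
                  "Remote users cannot see a partner's cursor while dragging", 3),
    ProblemReport("Inspector B", "Provide consequential communication",
                  "No telepointer feedback during drag operations", 3,
                  "Show a ghosted telepointer while an artifact is dragged"),
]

for heuristic, reports in synthesize(raw_reports).items():
    print(f"{heuristic}: {len(reports)} raw report(s)")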

Cognitive walkthroughs
Cognitive walkthroughs (Wharton 1994) are intended to evaluate the design of an interface for ease of learning, particularly by exploration. This is an extension of a model of learning by exploration proposed by Polson and Lewis (1990). The model is related to Norman’s theory of action that forms the theoretical foundation for his work on cognitive engineering (Norman 1988). Cognitive walkthroughs also incorporate the construction-integration model developed by Kintsch (1988).
These ideas help the evaluators examine how the interface guides the user to generate the correct goals and subgoals to perform the required task, and to select the necessary actions to fulfill each goal.

Task-centered walkthroughs
The task-centered walkthrough is a discount usability variation of cognitive walkthroughs.
It was developed as one step in the task-centered design process (Lewis and Rieman 1994). This process seeks to involve end users in the design process and to provide context for the evaluation of the interface in question.

Pluralistic walkthroughs
Pluralistic walkthroughs (Bias 1994) are meetings where users, developers, and human factors people step through an interface for the purposes of identifying usability problems.
A pre-defined scenario dictates the participants’ interaction with the interface. The scenario ensures that the participants confront the screens just as they would during the successful conduct of the specified task online.
The walkthrough begins when the participants are presented a hardcopy snapshot of the first screen they would encounter in the scenario. Participants are asked to write on the hardcopy of the first panel the actions they would perform while attempting the specified task. After all participants have written their independent responses, the walkthrough administrator announces the “right” answer. The participants verbalize their responses and discuss potential usability problems due to “incorrect” answers.

1.3 Problems applying single-user techniques to groupware evaluation

1.3.3 Inspection methods

As in single-user applications, groupware must effectively support task work. However, groupware must also support teamwork, the ‘work of working together’. Inspection methods are thus limited when we use them ‘as-is’, for they do not address the teamwork components necessary for effective collaboration with groupware.
For example, Nielsen lists many heuristics to guide inspectors, yet none address ‘bugs’ particular to groupware usability.
Similarly, a cognitive walkthrough used to evaluate groupware gave mixed and somewhat inconclusive results (Erback and Hook 1994). Other researchers are providing a framework for typical groupware scenarios that can form a stronger basis for walkthroughs (Cugini et al. 1997).

I speculate in this research study that some of these inspection techniques can be altered to evaluate groupware.
Specifically, I chose to adapt Nielsen's heuristic evaluation methodology since it is popular with both researchers and industry for several important reasons. It is low cost in terms of time, since an evaluation can be completed in a few hours. End-users are not required; therefore, resources are inexpensive. Because the heuristics are well documented and worked examples have been made available (e.g., Nielsen 1994a, b), they are easy to learn and apply. Heuristic evaluation is also becoming part of the standard HCI curriculum (e.g., Greenberg 1996) and is thus known to many HCI practitioners. Non-usability experts can use the technique fairly successfully (Nielsen 1994a). As well, it is cost-effective: an aggregate of 3-5 usability specialists will typically identify ~75% of all known usability problems for a given interface (Nielsen 1994b). All these factors contribute to the significant uptake of heuristic evaluation in today's industry, since the technique can be easily and cost-effectively integrated into existing development processes while producing immediate results.
In expanding heuristic evaluation for the purposes of evaluating groupware, I look to capitalize on all these factors that make this methodology a success.
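The "3-5 evaluators find roughly 75% of problems" figure above follows from Nielsen and Landauer's (1993) aggregation model, in which each additional independent inspector finds a roughly constant fraction of the remaining problems. Below is a minimal sketch of that model; the default detection rate of about 0.31 is a typical value from Nielsen's single-user studies, not a number taken from Baker's thesis.

def proportion_found(evaluators: int, detection_rate: float = 0.31) -> float:
    """Expected proportion of known usability problems uncovered when the
    reports of `evaluators` independent inspectors are aggregated
    (Nielsen and Landauer's 1 - (1 - p)^n model)."""
    return 1.0 - (1.0 - detection_rate) ** evaluators

for n in range(1, 8):
    print(f"{n} evaluator(s): {proportion_found(n):.0%} of problems found")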

1.4 Problem statement and research goals

The motivation behind this research is that current real-time distributed groupware systems are awkward and cumbersome to use, a situation partly caused by the lack of practical groupware evaluation methodologies.
My general research goal is to develop and validate a groupware evaluation methodology that is practical in terms of time, cost, logistics, and evaluator experience, while still identifying significant problems in a groupware system.
To narrow the scope, I adapt an existing discount usability technique to real-time, distributed groupware supporting shared workspaces. Real-time distributed groupware encompasses collaborative systems that enable multiple people to work together at the same time but from different locations. A shared workspace is "a bounded space where people can see and manipulate artifacts related to their activities" (Gutwin 1997). This application genre is very common (e.g., real-time systems for sharing views of conventional applications).
Specifically, I focus on heuristic evaluation, as this methodology in its current state satisfies the practicality criteria of time, cost, logistics, and evaluator experience while still identifying significant problems in single-user systems. I believe that this technique and its strengths can be extended to assessing collaborative systems.

From this general research goal, my specific research sub-goals follow.
1. I will propose a new set of heuristics that can be used within the heuristic evaluation methodology to detect usability problems in real-time, distributed groupware with a shared workspace.
2. I will demonstrate that the adapted heuristic evaluation for groupware remains a ‘discount’ usability technique by analyzing the ability of inspectors to identify problems in collaborative applications.

1.5 Research direction

At this point, I need to elaborate on the circumstances, and my resulting decisions, that led to the main thrust of my research study: to derive and validate groupware heuristics based on the mechanics of collaboration. The purpose is to provide insight into why the Locales Framework heuristics receive comparatively less attention.

My original objective was to build upon Greenberg et al.'s (1999) preliminary work on the Locales Framework heuristics. While conventional heuristics are easy to learn and apply, an outstanding concern from the original study was that heuristics based on the Locales Framework are complex, which in turn might require a greater level of evaluator training and experience. To that end, I set out to assess these heuristics by studying how well inspectors unfamiliar with the Locales Framework were able to apply them to identify usability problems in groupware systems.
Shortly afterwards, Gutwin and Greenberg (2000) introduced the mechanics of collaboration framework. This framework was created with low-cost evaluation methods for groupware in mind; therefore, I decided to refocus my research in this direction.
Consequently, this research study concentrates on creating and validating the mechanics of collaboration heuristics.
While I still explore the Locales Framework heuristics, they are not my primary area of interest, and hence I have devoted less time and effort in this study to their validation.

1.6 Research overview

Chapter 2 chronicles Jakob Nielsen’s design, validation, and subsequent evolution of his original 10 heuristics for the purposes of evaluating single-user interfaces. ...I believe it is necessary to provide a brief history on how the existing heuristic methodology was developed, validated, and updated.

Chapter 3 describes in detail the eight heuristics derived from Gutwin and Greenberg’s (2000) mechanics of collaboration framework. These heuristics form the basis for the rest of the research. In addition, five complementary heuristics evolving from the Locales Framework (Fitzpatrick 1998) are also briefly introduced.

Chapter 4 details the two-step methodology I employed to validate the groupware heuristics as a discount usability method for groupware.
First, a pilot study was conducted to review and subsequently improve the heuristics. Next, two categories of inspectors with varying levels of expertise in HCI and CSCW used the revised heuristics to evaluate two groupware systems. The resulting problem reports form the raw data for the forthcoming analysis.

Chapter 5 describes the results synthesis process employed to transform the inspectors’ raw problem reports into a consolidated list of usability problems for each groupware system.
Next, I systematically analyze these lists to derive conclusions regarding the practicality of both sets of groupware heuristics.
Finally, I discuss some of the factors affecting my results and how I interpret these results.

Chapter 6 summarizes how the goals of my research have been satisfied and the contributions made. In addition, I look to the future and discuss what still needs to be done to help evolve the heuristics for the purposes of developing a robust and effective low-cost technique for evaluating groupware.


Source: Baker, Kevin F. Heuristic Evaluation of Shared Workspace Groupware based on the Mechanics of Collaboration. [Thesis, M.Sc.] University of Calgary, Calgary, Alberta, Canada. May 2002.
