After listing various evaluation methods, Jordan provides a guideline for choosing the most appropriate method(s) depending on the stage at which a prototype is developed, and shows an example of how such testing might be conducted. Preparing a final report involves summarizing the purpose, the method(s) used, the results, a discussion based on those findings, and recommendations for further development. Defining the purpose of the evaluation means, in practice, writing up a goal-driven questionnaire, a script for the test itself, the specific questions a moderator may want answered during the test, and so on.

Purpose of the Evaluation
At different stages of the design process there are different purposes for running a usability test, and these purposes shape the style of test and the questions employed.

Benchmarking existing products
Sometimes it helps to test available products with a similar use in order to understand how they perform and what your own product or interface has to offer (could it be faster, does it provide more practical options, etc.). Benchmarking existing products provides insight into where your product stands among comparable products in real life.

Requirements capture
Capturing requirements and product specifications should be done early in the process. Testing users at this stage usually means letting them explore a prototype freely, in ways that help designers observe their various physical and cognitive characteristics (e.g. height, reach).

Evaluating initial concepts
When several concepts are on the table, testing users is very useful at this stage, as the results help the team choose one option to develop further. This saves a lot of time and eliminates “useless debates,” as Steve Krug would put it.

Developing a concept
With the team set on a design concept and with knowledge of the user requirements, developing the concept means creating a working prototype (preferably an interactive one). Testing at this stage helps gather performance and attitude measures (quantitative and qualitative data).

Testing a design
Testing at the final stage of a concept, with a “fully functional prototype,” allows definitive decisions to be made about tweaking minor details.

Evaluating a finished product
After the release of a product, testing is often used to gain insight into how the design could be improved in future products. This stage provides the most valid data for designers and developers, as it shows how the product lives in the real world and what sorts of problems occur in context.

Selecting Evaluation Participants

It is important to pick participants who represent your end-users. However, some evaluation methods require no participants at all (i.e. non-empirical methods), and other techniques use colleagues, representative samples, or real end-users, depending on availability and the constraints of a project.

Asking colleagues to step in as participants can help uncover global problems (visibility, hierarchy, etc.). However, it will be difficult to understand user attitudes and beliefs, as colleagues have somewhat biased views of the product under test and will therefore project misleading behaviour.

When selecting a representative sample, it is usually advisable to look for users in the same age range and of the same gender as the intended users. Matching these physical characteristics helps when evaluating user performance. However, such participants will be less helpful at projecting valid attitudes toward the product and its broader connotations; Jordan gives the example of a vacuum cleaner and the context of domestic chores.

Real end-users become accessible after a product’s release and can provide real-life experience of the product.

Testing the entire user population is realistically impossible, but selecting as many variations of users as possible comes close. This kind of sampling is usually needed when a product must serve the population at large, which also includes disabled users, foreign users, and users with other specific needs.

Types of Data Required
Jordan suggests two types of data collection: quantitative data and qualitative data, each covering both performance and attitude.

Quantitative data (performance, attitude) is generally collected in the form of set responses. For example: How much time would you say you wasted looking for the product you wanted on this site? Responses could then be a set of choices such as 0-2 minutes, 2-4 minutes, 4-6 minutes, 6-8 minutes, 8+ minutes. Performance measures quantify the time taken and the error rate for different features of a product, again in terms of ‘efficiency’ and ‘effectiveness’. Attitude measures quantify subjective ratings of the product: for example, 5 out of 10 users responded ‘fairly easy’ to the question How would you rate the ease of use of finding your way around this site?, on a response scale ranging from ‘can’t use it’ to ‘very easy’.
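
To make the idea of set responses concrete, here is a minimal sketch of how such attitude ratings might be tallied once collected. The scale labels, the ten responses, and the use of Python are illustrative assumptions for this summary, not data or tooling from Jordan's text.

```python
# Hypothetical tally of attitude responses to the question
# "How would you rate the ease of use of finding your way around this site?"
# Scale labels and response counts below are invented for illustration.
from collections import Counter

SCALE = ["can't use it", "difficult", "neutral", "fairly easy", "very easy"]

responses = [
    "fairly easy", "fairly easy", "very easy", "neutral", "fairly easy",
    "difficult", "fairly easy", "very easy", "fairly easy", "neutral",
]

counts = Counter(responses)
total = len(responses)

# Print each scale point with its count and percentage of all responses.
for label in SCALE:
    n = counts.get(label, 0)
    print(f"{label:>12}: {n:2d} ({n / total:.0%})")
```

Tallied this way, the set responses turn directly into the kind of percentages ("5 out of 10 users found it fairly easy") that a report can cite.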

Qualitative data (performance, attitude) is generally collected in the form of broad questions with open-ended responses. For example: How does this site make you feel? Responses may be positive, negative, or both. Interpreting these responses can be complex for a moderator, and participants sometimes find it difficult to use technical terms or to express how they feel without rationalizing their experience. Qualitative data is important for finding solutions to a problem, as participants are less limited by set responses and can be more descriptive, explaining why and how they felt a certain way, for instance.

Constraints and Opportunities

Jordan reminds the reader of the contextual constraints and opportunities that shape how a usability test is prepared and conducted (deadlines, investigator time, participant time, money available, investigator knowledge, participants available, facilities and resources). Since a particular kind of project may call for a particular style of evaluation method, some constraints may obstruct the ideal scenario and demand alternative testing. This is when investigators might send questionnaires out to as many participants as possible, have someone else conduct the test, use colleagues or no participants at all, and so on, as a short-cut.

Reporting the Evaluation
Reports can be delivered in a number of ways: written reports, verbal presentations, and video presentations. Some teams may want a mix of written and visual representations in order to appeal to the engineers, designers, marketers, managers, and so forth, who may need more convincing data.

Written Reports

Writing a usability report starts first and foremost with a summary (test purpose, method(s) used, quantitative and qualitative results from user responses, discussion points from the team debriefing, and a set of recommendations for product or interface iterations based on the overall results). Assuming that people might not find the time to read the entire report, a summary at the very top, of no more than 150 words, is very useful either to get the point across quickly or to entice the reader to read the rest of the report in more detail.

The introduction includes the type of product being analyzed, the purpose of the evaluation, the reason for evaluating the concept at its current stage, and the major questions the investigator wants answered during the test. For instance: Were users able to identify each link and interpret its corresponding sub-links?

The method section should include the moderator’s script, a list of the questions asked of users and their subsequent responses, the types of tasks users were asked to perform (e.g. book a ticket), the protocol used (e.g. think-aloud), a description and visual of the prototype used, the technology employed during each session (e.g. a camcorder, voice recorder, etc.), and a description of each participant (gender, age, background, beliefs, etc.). This part should also specify the duration of the test (typically 30 minutes, including the introduction and interview) and the compensation offered for the participants’ time.

The conclusion can be presented as bullet points responding to the investigator’s initial goals (questions and answers). Finally, the recommendations concern further development of the concept based on the findings of the evaluation.

Verbal Presentations

Making a verbal presentation has its advantages: you can make sure the important points you want people to take from the report are made clearly and concisely, in a short span of time. The talk should essentially give an overview of the product, the method used, the results, the discussion, and a proposal for further development. This approach should also be accompanied by some written material, for instance when presenting quantitative data.

Video Presentation

Showing raw footage of people interacting with the prototype appeals to most audiences (designers, engineers, marketers, etc.) because they can select from those visuals the data that is most useful to them. Footage of people working with an unfinished concept is a particularly valid form of data, as it truly conveys the weaknesses and strengths of a product or interface. It is a convincing tool and can replace much of the written report.
