As clients become increasingly aware that ethnographic-ish research is an essential and integral part of the design process for a high-quality user experience offering, it is important for us (designers) to learn an appropriate format for approaching and customizing research on a project basis.

Frog’s Research Learning Spiral, as David Sherwin names it in A Five-Step Process for Conducting User Research, allows us to think of research not only as a set of insight-focused methodologies and collaborative practices, but also very much as a process of articulating and defining the focus area and scope of the research itself through its five learning stages: Objectives, Hypotheses, Methods, Conduct, and Synthesis.

I would like to focus on the three early stages of the spiral. These are fundamental in situating the research area and addressing design questions with methodologies geared towards feeding our knowledge lexicon of people and things in their habitual contexts, according to pre-defined objectives and hypotheses.

1 • Objectives focus on the framing of questions following the five Ws and an H: Who, What, When, Where, Why, and How. Together these help us define who the demographic user base is, what activities they might be involved in while using our service or product, when they would be engaged in such activities, where these activities would take place, under which emotional or rational states (why), and using which processes (how). These questions are in turn reformulated into simple statements of research objectives, which outline the scope of the research effort.

2 • Hypotheses are assumption-packed opinions or suppositions we have about a product or service, its users, and the contextual settings in which the product acts; they are meant to be tested and challenged. Sherwin lists three types of hypotheses: attitude (what users would like to get out of a service), behavior (what users would like to do with the service), and feature (which feature users would most enjoy using).

3 • Methods — such as contextual inquiry, surveys, interviews, and benchmarking — can help prove or disprove these hypotheses by revealing key data about a demographic user and their contextual environments, and by identifying leverage points where design can affect their everyday lives and provide positive change or support. Other, more participatory activities that involve probing users — such as diary studies, card sorting, and paper prototyping — can serve as experiential idea-generating methods with a capacity for drawing out design solutions and concepts that meet users’ needs and mental models. Finally, evaluative methods — such as usability testing, heuristic evaluations, and cognitive walkthroughs — will demonstrate whether these ideas are effective, useful, and desirable.

“[F]inding meaning in your data […] can mean reading between the lines and not taking a quote or something observed at face value. The why behind a piece of data is always more important than the what.” — Lauren Serota, Associate Creative Director at Frog Design

According to Sherwin, data tells us what users do and when they do it, but not why. Context is in fact king. Integrating such a framework for user research provides us with the contextual understanding — the understanding of a given demographic’s everydayness — for making more informed design decisions.

I am particularly interested in the name “learning spiral”: a looping process that doesn’t need to be lengthy, costly, or a one-off event. It is a spiral, with the potential to be a cyclical and iterative process that can be applied as needed at different stages of a design process and with a different scope. That spiral from which I can learn allows me to investigate more specific areas of my users’ everyday by defining learning objectives.

While every research endeavour has a plan and objectives, I find this interpretation particularly valuable because it gives importance to the planning and framing of research and integrates the definition of objectives as part of the research itself. Typically, research seems to begin right away with contextual inquiry and interviewing, as a recourse to inform the design approach and concept, which does not necessarily end up being a desirable solution. Involving research participants in the framing of the research seems a more inclusive and humane approach that is bound to have a worthwhile and desirable quality.

— source: http://bit.ly/1eCoUyH


When we speak of technology, what we are often referring to are communication platforms, such as computers and cellular phones. For Don Norman (1993), anything that is created by human beings is a technological artifact, whether physical (tangible objects, e.g. paper) or mental (information structures, e.g. language). Norman divides technological innovations into good and bad: those that “make us smart” by extending human capabilities and those that “make us stupid” by frustrating, captivating, or enslaving us. His focus involves the understanding of both technologies that extend cognition and those that manipulate cognition.

While innovation has historically contributed appropriate and efficient tools that enrich human knowledge and enhance mental capabilities (memory, thought, reflection), it has also created modes of entertainment that increasingly promote consumption rather than production, and it is accountable for the apparent social divide between the haves and have-nots of technology.

Moreover, new technologies create new social practices, specializations and knowledge requirements. This in turn adds a new set of social dividers between those who possess the required skills and those who don’t. The societal implications of new technologies have become more and more complex with the advent of computational systems. Norman’s aim is as follows: “If we learn the reasons for and the properties of these various technologies, then maybe we can control the impact.” His aim is to influence a human-centered approach to technology; a quest for technologies that demonstrate humane properties.

“Toward a Human-Centered View of Technology and People”

As technology becomes increasingly immaterial, its function concerns more and more the fulfillment of abstract needs, such as time management, decision making, methodologies, etc. Norman’s critique is that rather than aiding human cognition, technologies today add confusion. He notes that “distractibility”, “illogical” behaviour, and other such characterizations are responses to the rigidness of the way in which computers operate: they are related to technology’s constant imposition of a certain type of behavioral and mental effort (i.e. to pay attention, to speak grammatically, etc.). In addition, information overload creates a feeling of inadequacy in society, such as the inadequacy of the mind to memorize a certain amount of information, the incapacity to cope with innovation, etc. His premise is that while “people err,” the technology itself is often what is faulty, yet the blame is placed on the human.

“To say that people often act illogically is to say that their behavior doesn’t fit the artificial mathematics called ‘logic’ and ‘decision theory.’”

Hence, this adds yet another divide: machine-centered behavior versus human-centered behavior. For Norman, the problem lies in technology’s inadequacy at adapting to the human. Technology would benefit society more if it were built around the way human cognition works. Norman also criticizes technology for creating an information saturation in which humans are exposed to so much information that it results in a so-called mental “burnout”. To illustrate this side effect, Norman distinguishes between what technology is able to measure (e.g. hours worked per week) and what it can’t (e.g. satisfaction, rupture, experience), which he respectively refers to as the “hard sciences” and the ‘ “soft” sciences.’ The “hard sciences” make simple and objective measurements, leaving out what they cannot measure; whereas the ‘ “soft” sciences’ concern themselves with the complexity of subjective measurements, thereby making a link between technology and the living system.

“Two Kinds of Cognition”

The author identifies two kinds of cognition: experiential and reflective. Experiential cognition refers to the way we acquire expert skills through the use of our bodies; reflective cognition refers to mental thought processes that inform the development of new concepts and appropriate structures that serve as guides for action. The two are complementary and essential in the sense that they encompass both inner (mental) and outward (physical) practices.

The point here is that technologies that excite the mind (such as TV) cannot be a substitute for experience, and hence cannot provide adequate modes of cognitive enhancement. “Experiential thought” is only possible through active participation in and engagement with the world.

Source:
Norman, Don. “A Human-Centered Technology.” Things That Make Us Smart: Defending Human Attributes in the Age of the Machine. Perseus Books, U.S.A., 1993.


“What Is Usability Testing?”

Rubin & Chisnell define ‘usability testing’ as “a process that employs people as testing participants who are representative of the target audience to evaluate the degree to which a product meets specific usability criteria.” (p.21) For them, the overall methodology is the same, but each test will have different results depending on the stage at which a product is tested and the type of goals and concerns a development team has for testing. They propose four types of informal tests that can be employed iteratively, each corresponding to a different stage of the product under test and addressing issues of exploration, assessment, validation, and comparison.

Why Test? Goals of Testing

Testing has two main objectives: on the marketing level, it aims at improving sales; on a user-centered level, it aims at minimizing user frustrations and maximizing a product’s usability. As the authors point out, testing goals inform the design of a product; they are framed in terms of usefulness (or relevance), learnability, efficiency, effectiveness, and satisfaction.

Basics of the Methodology

The authors emphasize the need for ‘tight controls’ when screening participants. This means making sure you recruit participants with similar backgrounds, such as novice computer users. Assuming that you’ve picked the right participants for each test, testing will help you collect quantitative data that point to usability problems, their causes, and potential solutions.

Each test should involve: a purpose (goals and concerns), a participant (or two), an environment (i.e. a lab or an office room), an observation method (a camera and an observer who takes notes), a methodology (interview or survey questions, and key tasks), data collection (qualitative and quantitative performance measures), and a results debriefing (determining solutions for further implementations of the product).
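Just to make these ingredients concrete for myself, here is a minimal sketch (in Python, purely illustrative) of how such a session plan might be written down before testing; the field names and sample values are my own, not Rubin & Chisnell’s.

from dataclasses import dataclass, field
from typing import List

# A minimal sketch of a usability test plan, loosely mirroring the elements
# Rubin & Chisnell list. Field names are illustrative, not from the book.
@dataclass
class UsabilityTestPlan:
    purpose: str                 # goals and concerns driving the test
    participants: List[str]      # one or two representative users per session
    environment: str             # e.g. a lab or an office room
    observation: str             # e.g. camera plus a note-taking observer
    methodology: List[str]       # interview/survey questions and key tasks
    data_collected: List[str]    # qualitative and quantitative measures
    debrief_outputs: List[str] = field(default_factory=list)  # solutions to implement

plan = UsabilityTestPlan(
    purpose="Can first-time users complete checkout without assistance?",
    participants=["P1 (novice)", "P2 (novice)"],
    environment="office meeting room",
    observation="camera + one observer taking notes",
    methodology=["pre-test interview", "task: buy one item", "post-test survey"],
    data_collected=["task completion", "time on task", "errors", "comments"],
)
print(plan.purpose)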

Rubin & Chisnell make it a point that testing is not the ultimate answer, but it is always better than not testing at all. In this sense, testing should be thought of as a practical tool for eliminating major problems and finding appropriate solutions to them. However, “Testing does not guarantee success or even prove that a product will be usable.” (p.26) The environment is always somewhat artificial and never the ideal place or circumstance of eventual use. Also, the results of a test will depend greatly on the manner in which the test is conducted. This is why choosing an experienced moderator is crucial; making sure you have the right participants will also affect the validity of your data. Testing participants is one way of revealing usability problems, but different products call for different evaluation methods, some empirical and some non-empirical.

“When Should You Test?”

Based on the stage of development of your product, the authors suggest 4 corresponding types of tests, each focused on revealing different types of data in terms of your specific goals and concerns: exploratory (early stage), assessment (early and middle stages), validation (near-final stage), and comparison (all stages). When used as a complement to an iterative design process, those tests help ‘ “shape” the product to fit the end users’ abilities, expectations, and aptitude.’ (p.28)

The Exploratory or Formative Study is used at the early stage of the development process of your product. Here, testing will allow a high-level evaluation (horizontal functions) of the core concepts and ideas by revealing qualitative data. This stage also helps you reconsider your assumptions about your end users and understand them better in terms of their abilities to comprehend the visual language of an interface or product, their expectations of what a product does and how it functions, and their aptitude to learn and perform.

The Assessment or Summative Test is used early on and/or midway in the development process of your product. This test is a low-level evaluation (vertical functions) of the key tasks involved in using the product. How successful are the participants at completing a task? is typically your testing purpose, and it results in the collection of quantitative performance data.

The Validation or Verification Test is used late in the process. The objective of this study is to define an appropriate ‘usability standard’ for future products and to verify the efficiency and effectiveness of your product based on your two previous test results. Here, the test looks at various quantitative measurements such as: How well do users perform? How much time does it take them to complete a task? How consistent is the overall architecture of your design?

Finally, the Comparison Test usually comes in two forms: evaluating similar conceptual designs with slight variations or evaluating disparate alternatives of the same content. This method can be employed at any stage of the development process: to compare and explore different design layouts and visual (graphical or textual) languages used, and to determine the stance of your product against the competition.

 


Dumas & Redish lay out the steps and tactics for recruiting participants; these include: Finding Appropriate Participants, Reaching and Screening Potential Participants, Deciding Who Should Recruit Participants, Knowing What to Say When You Are Recruiting, Arranging for Payment or Other Incentives, and Having a Back-up — But Not Double Booking.

While testing non-representative participants may reveal some high-level problems, it will inevitably result in invalid and misleading feedback. The authors point to two major problems: tolerance of a product and understanding of the language used throughout an interface may both vary greatly between expert and novice users.

You may find appropriate participants by browsing through your competitor’s customer list, or through your own list of customers who have purchased an older version of your product, for instance. Such lists can be difficult to access, but these participants are more likely to be your target audience and will be willing to show up and try the product, as it is relevant to their job or specific goals.

Using company employees can lead to inappropriate results as they may be familiar with the language used and may have been in contact with some parts of the development process of the product. As a resource, the authors suggest working with temporary agencies, advertising, networking, and working with professional associations.

Reaching and Screening Potential Participants
concerns selecting participants according to a set of criteria that define your different end users in terms of their job description, expertise, and so forth. Creating subgroups of participants as you recruit helps you determine whether you want them to join this test or to attend a later one. Screening typically takes the form of a phone interview or a paper questionnaire to be filled out. Depending on the subgroups you have defined, your questions will have to be specific and short. Additional information about participants may eventually force you to create a new set of subgroups.

Reaching participants is probably the most important part of the process. How do you get people interested and excited about a product? An introductory letter involves introducing the type of product as it relates to the participant’s goals or practice, preparing them for the environment where they will be tested and the protocol employed (e.g. you will have to use a product while verbalizing your thoughts), indicating the length of the test, and mentioning the type of compensation for participating (money, product samples, etc.).

Who Should Recruit Participants?
Clearly, if you are not a people person, then by all means get someone else to do the recruiting for you. An excited voice at the other end of the phone is what tells you this person is interested and will try his/her best to show up.

How Many Calls Will It Take? According to Dumas & Redish, it may take from 4 up to 15 phone calls to find one participant who is interested in the product and who corresponds to one of your set subgroups. In their experience:

“The more specialized skills and experience you need, the more calls you should expect to make.” (p.145)


Depending on how specialized you need your participants to be, you will want to allow yourself enough time to find them prior to your ideal testing date. This means starting your search a month in advance. Some participants will be able to attend a test on one or two days’ notice. One solution to participants forgetting to show up, or being unable to attend for personal or unavoidable reasons, would be, as the writers suggest, to send a reminder a week or two days before the scheduled date.

Another practical tactic would be to send a confirmation email or message that they can go back and refer to when the test approaches. This letter would include pretty much the content of the introduction letter, the date, time and place, as well as a contact they can use in case they get lost on the way or need to reschedule. This also means that you need to be ready to test on other days in case of no-shows.


The Simplicity Shift – Scott Jenson

Date: October 28, 2009

“Feature Blindness”

Scott Jenson defines ‘feature blindness’ as users being blinded by a feature list. He identifies the bottom-up approach of creating a user persona and a task scenario as more efficient than the top-down approach for organizing a commonsensical hierarchy of a product’s features. The ideal would be to ‘tame’ the feature list and prioritize features in accordance with a set of usage requirements and their subsequent usage frequency.

UnFeatures: There’s More Than Meets The Eye
In considering the context of use, designing a product requires considerable knowledge of the end user’s lifestyle. In this sense, the product will have to fit a type of skilled user. UnFeatures, as Jenson explains, “is a broad set of problems that needs to grow over time.” (p.75) For instance, what would be the setup requirements of a product? Does it allow for direct usage, or are there instructions to follow prior to full use of the interface? With broad markets there is a need for easy and simple step-by-step instructions. A product strategy is planned to account for errors and facilitate error correction. Jenson suggests the example of a ‘search button’ that does the work for users, to avoid rejection and enhance user forgiveness and product acceptance. UnFeatures should be included in the product strategy plan at the very beginning of the design process. For instance, laying down the product’s features (functions offered by the product) and its unfeatures (concerns such as recovering from errors and user feedback) can help create a hierarchy of relevance.

The Priority Trick
Having features and unfeatures is great, but it becomes very confusing for users when too many options are available. Jenson notes that creating a persona/scenario helps shape the hierarchy of your features. Quantitative data is usually handy here: frequency of use and a preference ranking are some of Jenson’s examples of how to decide on what’s important to users.
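As a rough illustration of this kind of prioritization (the scoring below is my own invention, not Jenson’s), one could rank features by combining how often they are likely to be used with how highly users rank them:

# Rank features by estimated frequency of use and preference rank.
# The feature names, numbers, and weighting are invented for illustration.
features = {
    # feature: (estimated uses per week, average preference rank, 1 = best)
    "dial a number": (30, 1),
    "address book": (20, 2),
    "set a ringtone": (2, 4),
    "currency converter": (0.2, 9),
}

def priority(freq_per_week, pref_rank):
    # Higher frequency and a better (lower) rank yield a higher priority score.
    return freq_per_week / pref_rank

ranked = sorted(features, key=lambda f: priority(*features[f]), reverse=True)
for f in ranked:
    print(f, round(priority(*features[f]), 2))

The exact formula matters less than the habit it encodes: the features the persona uses constantly float to the top, and the rarely used ones sink into sub-menus.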

Make the Easy, Easy and the Hard, Hard
If we carefully follow Jenson’s method, with a persona/scenario determined and a un/features list prioritized, at this point the skeleton of the product can be built, and secondary features can be worked out through sub-menus, for example, and more elaborate layering. Keeping it simple, as the title of his book suggests, seems to be Jenson’s motto. One trick here would be to calibrate the flexibility of a product while weighing complexity against simplicity. If more flexibility creates ambiguity, then limited depth ensures a higher usability level.

Another trick is to consider both expert and emerging users; in doing so, expert features are hidden in the depth of the interface. This way, emerging users will not be overwhelmed by a multitude of features that may not be relevant to them. Features addressing specific sub-categories of users (expert users, for instance) thus demand a greater investment of time to dig out, which allows the broader audience to have access to minimal and immediate core functions.

If we look at a “feature list as a source of design confusion” (p.89), we can begin to break features down into categories and organize their hierarchical importance in terms of the end-user/persona characteristics.

“Innovation Blindness”

Scott Jenson defines ‘innovation blindness’ as team members holding “attitudes that discourage innovative thinking.” (p.113) He critiques corporations that follow previously established guidelines for designing rather than thinking outside the box and looking for new ways of designing meaningful products for meaningful experiences. In his opinion, it is important to question the design in context and its underlying rules.

“Where innovation occurs is when you get design dreamers and pragmatic programmers working together to get brilliant design implemented with as little work as possible.” (p.114)

See the Water
The water becomes hard to see when new products are designed in terms of how previous products work. Here the author gives two commonly cited examples: the digital watch and critical elements of desktop applications such as the scroll bar. He emphasizes that conventions and template designs are a great hindrance to innovative breakthroughs. Scroll bars do not need to be included everywhere simply because they are useful on the web. New products need new conceptions to create new meaning and new ways of navigating space.

Embrace the impossible
One way to go about encouraging change and innovation in a team is to create a synergistic environment for both pragmatic programmers and dreamer designers to talk, negotiate, collaborate, and agree on a compromise that meets both of their concerns and goals. Jenson suggests trying new ideas and failing in the process. This type of approach, however, is only possible in a team that promotes a positive attitude toward iterative design. Design that does not fail is mostly design that is done safely and that, more often than not, will be quite similar to its predecessors. Questioning design templates helps designers engage in innovative attempts and “See the Water.” One factor to consider here is user expectations, and how deviating too far from them may obstruct a product’s usability. Zeroing in again on scenarios and personas helps achieve a balance between pragmatism and innovation, and hence provides a meaningful ground for designers and programmers to articulate their respective needs and generate collective design solutions. Jenson calls this a “win/win situation.” (p.122)

Fail fast
By Fail fast, the author suggests starting with an idea or concept that may be frail, putting it on paper, working out what works and what doesn’t, and proceeding in a frequent, iterative fashion. He offers three typical methods used for innovating and evaluating concepts: Sketching (web sketches, increasing fidelity); Interactive Demos; and User Testing.

“Innovation doesn’t flow fully formed through your fingers on to the paper. It only comes in fits and starts, and, sometimes, you have to work your way through dozens of designs to find your way to the one that pulls it all together.” (p.132)


Before you begin testing, it is important to set your primary evaluation goals and concerns. Once those have been defined, it will be possible to plan a usability test accordingly. Dumas & Redish describe the following processes: making choices among goals and concerns; moving from general concerns to specific ones; and understanding the sources of goals and concerns.

Making Choices Among Goals and Concerns
Depending on the stage at which a product has been designed and prepared for testing, goals and concerns will differ. In any case, they should be defined in order to uncover the appropriate problems.

Moving From General Concerns to Specific Ones
The authors suggest both looking at general and specific concerns, those that help frame a type of user and those that may suggest specific tasks for evaluation.

Understanding Sources of Goals and Concerns

– Concerns from task analysis and quantitative usability goals:
Quantitative performance data may, for instance, refer to how much time participants will spend on a given task.
– Concerns from a heuristic analysis or expert review: Local problems raised by heuristic analyses or expert reviews help guide some of the questions you may want answered during an evaluation.
– Concerns from previous tests: In the context of iterative testing, questions may concern the problems revealed in previous tests.

Selecting who are the most appropriate participants for testing can be challenging. At this point you should seek representative users. The authors suggest: developing user profiles, selecting subgroups for a test, defining and quantifying characteristics for each subgroup, and finally deciding how many people to include in a test.

Developing user profiles

User profiles are developed collectively within the team, involving marketers, engineers, and designers. Profiles will differ depending on what type of product is to be tested, and whether it is a novel product or an upgraded version of an existing one.
– Relevant characteristics
are found in two categories: those that are common to all users and those that may be specific to some users. Grouping the target audience into different subgroups helps you decide which factors matter most. These subgroups may concern background information such as: users’ specializations, their general and specific computer skills (if any), their knowledge of the product at test, and their experience with similar products. It is also important when testing the usability of a product to think broadly about users and to seek non-representative users as well. These types of participants give insight into how well the product might work for non-expert users.

Selecting subgroups for a test
Dumas & Redish define subgroups as “people who share specific characteristics that are important in the user profile.” (p.123) They propose beginning with a pretest questionnaire or survey for defining these different categories or subgroups of users. A detailed understanding of user characteristics can be gained by collecting quantitative responses. This helps divide users into subgroups with limited and measurable requirements, such as novice, intermediate, and experienced users.
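A minimal sketch of how pretest questionnaire answers might be turned into such subgroups; the screening questions and thresholds below are my own illustrative assumptions, not Dumas & Redish’s:

# Assign respondents to novice / intermediate / experienced subgroups
# from two quantitative screener answers (thresholds are illustrative).
def classify(years_with_similar_products, hours_per_week):
    if years_with_similar_products < 1 and hours_per_week < 2:
        return "novice"
    if years_with_similar_products >= 3 and hours_per_week >= 10:
        return "experienced"
    return "intermediate"

respondents = {
    "P1": (0.5, 1),   # (years of experience, hours of use per week)
    "P2": (4.0, 12),
    "P3": (2.0, 5),
}

subgroups = {}
for pid, answers in respondents.items():
    subgroups.setdefault(classify(*answers), []).append(pid)
print(subgroups)  # e.g. {'novice': ['P1'], 'experienced': ['P2'], 'intermediate': ['P3']}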

Deciding how many people to include in a test

Finally, it is important to limit the number of subgroups and the number of participants within each subgroup before you begin testing. As suggested by Nielsen and Molich’s 1990 study, between 4 and 10 participants suffice to uncover 80-90% of usability problems. Dumas & Redish propose testing 6 to 12 users, making up 2 to 3 subgroups of at least 3 users each, to avoid collecting idiosyncratic data. While testing, you will find that similar problems emerge within subgroups as well as across subgroups, which helps determine a number of global problems to consider when redesigning.
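A rough way to see why a handful of participants can go this far is the commonly cited discovery model associated with Nielsen and Landauer (not taken from Dumas & Redish’s text): if each participant independently uncovers a given problem with probability p, then n participants are expected to find 1 − (1 − p)^n of the problems.

# Expected share of usability problems found after n participants,
# assuming each participant uncovers a given problem with probability p.
# p = 0.31 is Nielsen & Landauer's often-cited estimate; the real value
# varies by study and product, so treat this as an illustration only.
p = 0.31
for n in (3, 5, 8, 12):
    print(n, "participants ->", round(1 - (1 - p) ** n, 2), "of problems found")
# 3 -> 0.67, 5 -> 0.84, 8 -> 0.95, 12 -> 0.99 under this simplified model

Under this simplified model, five participants already find roughly 84% of the problems, which sits comfortably within the 80-90% range cited above.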


After having listed various evaluation methods, Jordan provides a guideline for choosing the most appropriate method(s) depending on the level at which a prototype is developed and shows an example of how such testing might be conducted. Preparing a final report involves a summary of the purpose, method(s) used, results, discussion based on those findings, and recommendations for further implementations. In terms of defining the purpose of the evaluation, this would mean writing up a goal-driven questionnaire, a script for the test itself and specific questions a moderator may have during the test, etc.

Purpose of the Evaluation
At different stages of the design process, there will be different purposes for doing a usability test, which will shape the style of tests and questions employed.

Benchmarking existing products
Sometimes it may help to test available products of similar usage to understand how they perform and what your own product or interface has to offer (could it be faster, does it provide more practical options, etc.). Benchmarking existing products provides insight into where your product stands among other products in real life.

Requirements capture
Capturing product requirements and specifications should be done early in the process. Testing users at this stage usually means that users will be free to explore a prototype in ways that help designers observe their various physical and cognitive characteristics (e.g. height, reach).

Evaluating initial concepts
With several concepts in hand, testing users is very useful at this stage, as the results will help a team choose one option for further implementation. This saves a lot of time and eliminates “useless debates,” as Steve Krug would put it.

Developing a concept
With the team set on a design concept and having knowledge of user requirements, developing a concept would mean creating a working prototype (preferably interactive). Testing at this stage helps gather performance and attitude measures (quantitative and qualitative data).

Testing a design
Testing at the latest stage of a concept with a “fully functional prototype” allows definitive decisions for tweaking minor details.

Evaluating a finished product
After the release of a product, testing is often used for getting insight into how the design could be improved for future products. This stage provides the most valid data for designers and developers, as it shows how the product lives in the real world and what sort of problems occur contextually.

Selecting Evaluation Participants

It is important to pick participants that represent your end user. However, some evaluation methods require no participants (i.e. non-empirical methods), while other techniques use colleagues, representative samples, or real end-users, depending on availability and the constraints of a project.

Asking colleagues to step in as participants can be helpful in terms of uncovering global problems (visibility, hierarchy, etc.). However, it will be difficult to understand user attitude and belief, as colleagues have somewhat biased views of the product at test and will hence project misleading behaviour.

When selecting a representative sample, it is usually advised to look for users in the same age range and of the same gender as the intended user. Matching the physical requirements helps evaluate user performance. However, such a user will not be helpful at projecting valid attitudes toward the product and its broader connotations. Jordan gives the example of a vacuum cleaner and the context of domestic chores.

Real end-users are accessible after a product’s release to provide real life experience of the product.

The entire user population
is realistically impossible to test, but selecting as many variations as possible may come close. This type of selection usually occurs when a product needs to respond to the population at large, which also includes users with disabilities, non-native speakers, and users with other specific needs.

Types of Data Required
Jordan suggests two types of data collection: quantitative data (performance, attitude), and qualitative data (performance, attitude).

Quantitative data (performance, attitude) is generally collected in the form of set responses. For example: How much time would you say you wasted looking for the product you wanted on this site? Responses could then be a set of choices such as under 2 minutes, 2-4 minutes, 4-6 minutes, 6-8 minutes, 8+ minutes. Performance measures quantify the time and error rate for different features of a product, again in terms of ‘efficiency’ and ‘effectiveness’. Attitude measures quantify subjective ratings of the product. So, out of 10 users, 5 responded ‘fairly easy’ to the question How would you rate the ease of use of finding your way around this site?, with a response scale from ‘can’t use it’ to ‘very easy’.
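To show what working with set responses looks like in practice, here is a small sketch that tallies made-up Likert-style answers from ten participants; the scale labels and data are invented for the example, not taken from Jordan.

from collections import Counter

# Tally set-response (quantitative) attitude data from ten participants.
SCALE = ["can't use it", "very difficult", "neutral", "fairly easy", "very easy"]
answers = ["fairly easy", "fairly easy", "very easy", "neutral",
           "fairly easy", "very difficult", "fairly easy", "fairly easy",
           "very easy", "neutral"]

counts = Counter(answers)
for label in SCALE:
    n = counts.get(label, 0)
    print(f"{label:>15}: {n:2d}  ({n / len(answers):.0%})")
# Set responses like these can be compared directly across participants,
# unlike open-ended (qualitative) answers.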

Qualitative data (performance, attitude) is generally collected in the form of broad questions with open-ended responses. For example, How does this site make you feel? Responses would be either positive or negative or both. Interpreting those questions can be very complex for a moderator and participants sometimes find it difficult to use technical terms or express how they feel without rationalizing their experience. Qualitative data is important for finding solutions to a problem as participants are less limited by set-responses and can be more descriptive in their responses explaining why and how they felt a certain way for instance.

Constraints and Opportunities

Jordan reminds the reader of the contextual constraints and opportunities to weigh when preparing and conducting a usability test (deadlines, investigator time, participant time, money available, investigator knowledge, participants available, facilities and resources). While a particular kind of project may call for a particular style of evaluation method, some constraints may obstruct the ideal scenario and demand alternative testing. This is when investigators might want to send questionnaires out to as many participants as possible, have someone else conduct the test, use colleagues or no participants at all, etc., as a shortcut.

Reporting the Evaluation
Reports can be delivered in a number of ways: Written Reports, Verbal Presentations, and Video Presentations. Some may want to have a variety of written and visual representations in order to appeal to the engineers, designers, marketers, managers, and so forth, who may need more convincing data.

Written Reports

Writing a usability report starts first and foremost with a summary (test purpose, method(s) used, quantitative and qualitative results from user responses, discussion results from the team debriefing, and a set of recommendations for product or interface iterations based on the overall results). Assuming that people might not find the time to read the entire report, a summary at the very top, no more than 150 words, is very useful to get the point across quickly and/or entice the reader to view the rest of the report in more detail. The introduction includes the type of product being analyzed, the purpose of the evaluation, the reason for evaluating the concept at its current stage, and the major questions the investigator wants answered during the test. For instance: Were users able to identify each link and interpret its corresponding sub-links? The method section should include the moderator’s script, a list of questions used with users and their subsequent responses, the types of tasks users were required to perform (e.g. book a ticket), the protocol used (e.g. talk-aloud), a description and visual of the prototype used, the technology employed during each session (e.g. a camcorder, voice recorder, etc.), and a description of each participant (gender, age, background, beliefs, etc.). This part must also specify the duration of the test (typically 30 minutes including introduction and interview) and the compensation offered for the participants’ time. The conclusion can be presented in the form of bullet points responding to the investigator’s initial goals (questions and answers). Finally, recommendations concern the further implementation of the concept based on the findings of the evaluation.

Verbal Presentations

Making a verbal presentation has its advantages, as you can make sure the important points you want people to read in the report are made clear and concise in your speech, and in a short span. The speech should essentially give an overview of the product, the method used, the results, the discussion, and a proposal for further implementation. This approach should also include some written formats, for instance when presenting quantitative data.

Video Presentation

Showing raw footage of people interacting with the prototype is appealing to most audiences (designers, engineers, marketers, etc.) because they can select from those visuals the data that is most useful to them. People in action with an unfinished concept is a much more valid type of data, as it truly conveys the weaknesses and strengths of a product or interface. It is a convincing tool and could replace a lot of the written report.


Joseph Dumas & Janice Redish analyze the potential benefits of combining various evaluation methods as complementary tools for usability testing in the context of usability engineering.

Getting experts to review the design

Calling in an expert allows for a heuristic evaluation of a product’s features. The authors give the example of Nielsen and Molich (1990) who, given their statistical findings, recommend that an interface be evaluated by a small group of 3 to 5 engineers, and who provide a set of guidelines for evaluating usability: Use simple and natural language; Speak the user’s language; Minimize user memory load; Be consistent; Provide feedback; Provide clearly marked exits; Provide shortcuts; Provide good error messages; and Prevent errors. (p.65)

The advantage of heuristic evaluations is that they reveal local problems, whereas usability testing reveals the global problems that users may encounter. The authors suggest using both methods to eliminate the most usability problems.

Having peers or experts walk through the design

Walkthroughs are somewhat similar to cognitive walkthroughs, as they involve a group of peers and usability experts analyzing the steps required to accomplish a task and finding solutions for minimizing the time cost and simplifying the complexity of that task. To distinguish the two variations: walkthroughs help revisit a design concept and determine programming alterations and layout, whereas cognitive walkthroughs are usually conducted before any prototype implementation. Cognitive walkthroughs focus on the human factor in terms of user expectations and goals.

For Redish and Dumas, “walkthroughs are less effective at finding usability problems than other evaluation methods such as heuristic evaluations”. (p.69)

Having users work with prototypes (static or interactive)

Prototypes usually come in two basic forms: static (paper-based) or interactive (software-based). Those have their advantages and disadvantages.

Using a static prototype means that one person plays the computer, acting out the interactivity of a finished product. This interplay between participant and paper helps moderators find issues of hierarchy and language interpretation in terms of the features available and the wording assigned to them in the sketches.

Interactive prototypes allow participants to interact with a ‘seemingly’ working interface. These prototypes facilitate the incorporation of user feedback, the exploration of several design concepts, and the evaluation of several iterations. (p.72)

Prototypes can also be tested in 3 ways: horizontal (surface interface), vertical (small number of paths available), or scenario prototypes (selected tasks made fully functional).

Getting user edits on early versions of documentation

Conducting a usability test in the early stages of a design helps save time and money in the long run for both designers and engineers, as rapid changes can be made and less work-load will be put on design alterations and programming modifications. However, it is important to test the intended end-users in order to make the right decisions in due time.

Following Atlas (1981), Redish and Dumas define the user edit as involving a user, a specific task, a set of product instructions, and an observer. A variation of this method is the usability edit (Soderston, 1985), defined as involving a user, written material, a camcorder and/or observer, and a system (i.e. a computer). Here, users not only look at the material but also interact with the information while thinking aloud. Working with real text helps moderators evaluate effectiveness and efficiency in terms of both “commission” and “omission.” (p.77)

Comparing usability testing with other usability evaluation methods

Finally, it is suggested that you combine usability testing with other evaluation methods to complement the advantages found in different approaches and diminish the risk of leaving out local problems that could make your product inadequate for your end user.

After comparing different styles of combining methods, the authors conclude that: (p.82)
– Usability testing uncovers more usability problems, finds more global problems, and more unique problems than other evaluation methods.
– Usability testing uncovers fewer local problems and takes more hours to conduct than other methods, but is cost-effective in the long run.
– Heuristic evaluations are better at uncovering usability problems than walkthroughs, are more effective when conducted by several experts working independently, and are better at uncovering local problems.

Again, they emphasize the need to conduct both usability tests and heuristic evaluations, as together they help uncover the most usability problems.


Carolyn Snyder proposes a definition of paper prototyping as “a variation of usability testing where representative users […] [interact] with a paper version of the interface that is manipulated by a person ‘playing computer’ ” (italicized in original, p.4).

The aim of her book is to extend the practice of paper prototyping to a variety of HCI platforms, for non-expert users to adopt as a practical tool for creating and testing their products during the development process. Paper prototyping (and usability testing, or user-centeredness more broadly) helps generate useful, intuitive, efficient, and pleasing products and user experiences.

In her book, she offers a brief template for organizing and conducting a usability test: “choose the type of user […], determine some typical tasks […], make screen-shots and/or hand-made [sketches], conduct a usability test [by asking the] user to attempt the tasks by interacting directly with the prototype, [as] one or two of you play the role of “computer” [have a] facilitator [conduct] the session while other members of the product team act as note-taking observers.” (p.5)

To further frame her definition of prototyping, Snyder emphasizes the need for realistic content, which, she observes, often tempts designers into confusing comps, wireframes, and storyboards with paper prototypes. She explains that although these represent different sketching styles, they can only be used for usability tests if they include realistic content (no dummy text or temporary placeholders such as ‘image info.’, etc.).

Here are her summarized prototyping benefits: gathering user feedback early on; permitting iterative design; facilitating designer-user dialogue; it “does not require any technical skills”; and it “encourages creativity.” (p.12)

She then discusses paper prototyping as related in its approach to participatory design since the 1980s. As she explains, prototyping concepts have become more and more accessible, and her aim is to make the practice more desirable as a design tool for the betterment of product implementation and user satisfaction.


Designing for usability is a user-centered approach to design. Jordan lays out the different considerations to keep in mind during the design process and proposes a set of methods (some empirical, others non-empirical) that can be used as testing tools and discusses the advantages and disadvantages that come with each method. His writing allows designers to understand better that there are different ways of testing users and that some styles are more appropriate depending on the content of tasks involved in using a given product or interface and its relation to a demographic/technographic target audience.

“Designing for Usability”

The author emphasizes the need for design to define its target audience (i.e. general public, expert group, etc.) in terms of its physical (height, reach, strength), and cognitive characteristics (specialist knowledge, attitudes, expectations). Understanding an audience’s characteristics is the first step that allows a better sense of usability requirements involved in the design of user-centered products.

Empirical methods — such as focus groups, interviews, questionnaires, and so forth — provide designers with adequate information to understand their audience in terms of attitudes and lifestyles; that is, being aware of the contexts in which the products will be used, for what purpose or specific task, and what other sorts of activities a given audience is likely to be engaged in at the same time. These processes also help designers calibrate the relevance of their products’ features in accordance with audience needs, beliefs, and expectations. Some products may also need to meet certain “legislative measures”. For instance, the author gives the example of a stereo interface in a car system and how the positioning of volume controls may affect safety requirements.

Moreover, Jordan writes of “Iterative Design”. Iterative design entails a series of evaluations of a sequence of prototypes. This means creating a concept for an initial evaluation, testing the given product through empirical methods, defining specific problems, and returning for a second iteration, and a third, and a fourth… until the product is deemed usable and appropriate for its audience. He lists different types of prototypes that can be employed for presentation and/or usability testing: written or oral descriptions (specs), visual prototypes, models (physical representations), screen-based interactive prototypes (simulated interactions), and fully working prototypes (pseudo-products).

“Methods for Usability Evaluation”

Furthermore, Jordan offers a detailed description of techniques for observing users and evaluating the ‘ease of use’ in user-centered products. He proposes practical methods for designers for “uncovering unexpected usability problems” (p.51), which he differentiates as empirical and non-empirical.

Empirical Methods:

“Private Camera Conversations” allow access to information concerning user lifestyle and attitudes. Here, participants are recorded on tape as they reveal the positive and negative aspects of their experience in terms of ease of use and purpose in their everyday lives. Participants are usually questioned privately, with no face-to-face interaction with the moderator. This setting may help them feel that they are in a space where they can speak freely and hence reveal information they might otherwise omit. The videos also represent raw evidence for designers to work with. The disadvantage arises when participants deviate from the purpose of the conversation and deliver an irrelevant monologue that designers won’t be able to analyze. Also, with loose responses it becomes very difficult for designers to compare different responses, due to the inconsistency of the content.

“Co-discovery” involves recruiting two participants (friends or acquaintances) and observing their reactions as they co-discover the controls of a new product and verbalize their thoughts naturally in the process. This method often results in more informal and honest ‘verbalisations’ between the two users, which may explain why they found problems in a product. As with the previous method, the moderator has no control over the issues raised during the conversation; and a moderator’s interference might disrupt the natural occurrence and the spontaneity of the subjects’ reaction.

“Focus Groups” are composed of various participants and a leader who directs and prompts topical discussions. Here there is potential for getting many perspectives on the aesthetic, functional, and other components of a product, as well as for finding solutions for new iterations. However, the leader’s role is crucial, as a large group may result in some voices dominating the discussion and preventing others from expressing their opinions and thoughts. The author suggests 5 or 6 participants to ensure equal opportunity.

In “User Workshops”, a group of users engage ‘hands-on’ in the creation process of a new product by imparting usability requirements and creating sketches and design solutions. This includes the user in the design and makes for more meaningful products that respond to user needs, wants, and attitudes, while nonetheless being a time-consuming and challenging performance.

“Thinking Aloud Protocols” serve as a way to enter a user’s mind as she uses an interface and formulates her thought process. Tasks can be specified or open to ‘free exploration’. This method may reveal ‘objective performance data’. Here, the moderator’s role is to minimize potential distractions when prompting a task or questioning specific user behaviour (e.g. Why did you click here and not there? Such questions could tempt users to change their attitude and responses).

“Incident Diaries” are a form of probe in which users note their experiences with the product. This approach can be both effective and incomplete. It is effective in that it helps designers measure the different components of a product on a scale from ‘very easy’ to ‘very difficult’ (a Likert scale) and documents ‘long-term usage’. However, it yields incomplete data when users are not faithful to the diary format and do not keep up with the agenda.

“Feature Checklists” include lists of the tools, links, and functions comprised in a product, which users check off as they successfully use them. Here the designer receives factual data rather than user-experience data.

“Logging Use” employs software to track users’ screen and mouse activity and determine the usefulness of a product’s features. The disadvantage here lies in the interpretation of the collected data. The author writes: “If parts of a product’s functionality have not been used, or have been used very little […], it could be that this aspect of functionality is not useful and so users do not bother with it, […] it is avoided as it is difficult to use, [or] users did not know it existed.” (p.62) He proposes adding an interview at the end of the test in order to understand data that might otherwise mislead design revisions.
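As a sketch of the kind of instrumentation this method relies on (the event names, log format, and file path are assumptions for illustration, not Jordan’s):

import json
import time
from collections import Counter

LOG_PATH = "usage_log.jsonl"  # illustrative path

def log_event(user_id, feature):
    # Append a time-stamped record of a feature being used.
    record = {"t": time.time(), "user": user_id, "feature": feature}
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")

# The application would call log_event() wherever a feature is invoked:
log_event("P1", "search")
log_event("P1", "advanced_filter")

# Aggregating the log shows which features are rarely touched, though,
# as Jordan notes, not why, hence the follow-up interview he recommends.
with open(LOG_PATH) as f:
    usage = Counter(json.loads(line)["feature"] for line in f)
print(usage)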

The “Field Observation” approach helps designers understand the place of a product in a user’s life and the potential interferences that may be present in their habitual environment at the time of product usage. For greater effectiveness, it is advised that the moderator be as invisible as possible. Jordan also raises the ethical counter-effectiveness of such an approach: taping users without their knowledge could “compromise the level of ecological validity” of the data (p.63). Testing a product near the end of its design can also be less useful for designers, as changes will be difficult to make.

“Questionnaires”, in brief, demand either ‘fixed’ or ‘open-ended’ responses from users. Fixed responses (quantitative) will at times be inaccurate, as users feel obliged to tick one answer among prefixed options, which can mislead the data analyst, but they provide comparable findings across participants. Open-ended responses (qualitative), on the other hand, allow users to freely raise issues in their own terms, making the findings more valid and reliable, but they are more costly for users to complete and may be left unanswered.

“Interviews”, as Jordan describes them, can be “unstructured, semi-structured and structured” (p.68). Similar to questionnaires, this approach covers a list of specific concerns the moderator wants answered, or broader investigations that help define user attitudes towards a product. Unstructured interviews pose open-ended questions with open-ended answers, providing insight into what sorts of problems were encountered in the user experience. When moderators have somewhat specific knowledge of potential errors or difficulties in their product, a “semi-structured” interview is advisable, as it involves both specific questions and open-ended investigations. “Structured” interviews ask quantitative questions suited to ‘requirements capture’. Interviews allow more valid data collection and minimize misunderstandings.

“Valuation Method” is a way of approximating the value (or cost) of a product by collecting quantitative data.

“Controlled Experiments” refer to testing users in a laboratory, devoid of excess noise and disruptions. The advantage is that the user will focus on the given task; the disadvantage, however, concerns the lack of environmental familiarity. The ‘experimental conditions’ could influence new user-behaviour, and hence create incoherent data.

Non-Empirical Methods:

“Task Analyses” focus on cognitive breakdowns of task performance. The analysis enumerates the number of steps required of users to complete a specific task and works toward redesigning and minimizing those steps for easier and simpler usage. Jordan mentions GOMS and the Keystroke Model as tactics for analyzing cognition. This method helps collect objective data on the one hand, but presents a risk of veering away from usability (or qualitative) data, which might vary from expert to beginner users.
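Since Jordan only names the Keystroke Model in passing, here is a small worked example in its spirit: estimating task time by summing standard operator times. The operator values are the commonly cited approximations from Card, Moran & Newell; the task breakdown itself is invented for illustration.

# Keystroke-Level-Model-style estimate: sum approximate operator times.
OPERATOR_TIME = {
    "K": 0.20,  # press a key
    "P": 1.10,  # point with the mouse
    "B": 0.10,  # press or release a mouse button
    "H": 0.40,  # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_estimate(sequence):
    # Sum operator times for a sequence such as "MPBB" (think, point, click).
    return sum(OPERATOR_TIME[op] for op in sequence)

# e.g. open a saved file via a menu: think, point to menu, click, point to item, click
print(round(klm_estimate("MPBBPBB"), 2), "seconds")

An analysis like this makes it easy to compare two task flows by step count and estimated time, which is exactly the kind of objective data the method yields; what it cannot tell you is how the task feels to a novice, hence the caveat above.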

“Property Checklists” consist of analyzing how well a product responds to the human factor. Is this product humane? Does it respond to physical and psychological needs? Does it employ coherent language? Are the controls placed at an appropriate and perceptible height and reach? Etc. This method need not involve participants; rather, it follows a set of required specifications and checks whether those have been applied in the design process in terms of “consistency, compatibility, and good feedback” (p.75).

“Expert Appraisals” seek ‘appraisals’ from experts in the field of product usability. These experts diagnose the potential obstacles users may be confronted by and provide solutions for fixing those problems. The disadvantage is that participants need not be present, and hence the diagnosis is somewhat inaccurate compared to “task performance data” (p.78).

Finally, “Cognitive Walkthroughs” are breakdowns of the steps involved in completing a specific task. Here, the expert impersonates a typical user and experiences the product. Since every user will have idiosyncratic behaviour, this method is valid for anticipating a wide range of problems but invalid for specifying empirical problems, as it relies on expert performance.

