INTRODUCTION

  • Purpose of paper
  • What is Usability Testing?
  • What are the goals of usability testing?
  • How does usability testing differ from the traditional assembly-line design approach?
  • Talk about how usability testing should be incorporated into the company as a policy, though this will require a change in attitude and culture.
  • Discuss the benefits and pitfalls
  • Difference between research and usability testing AND Beta testing and usability testing.
  • Different types of evaluation methods (field experiment/study; formal theory: cognitive walkthroughs; judgement studies: heuristic evaluation; sample survey: questionnaire; lab experiment; experimental simulation; McGrath, 1994)

CONDUCTING A USABILITY TEST

  • Introduce Observational testing and its benefits and pitfalls
  • Test methods (think aloud, constructive intervention, retrospective testing, coaching method, collaborative/co-discovery)
  • Test team members?
  • Establishing test goals and plans
  • Getting users and ethics
  • Choosing experimenters
  • Preparing test materials
  • Setting up a test room/lab
  • Participants
  • Test tasks and scenarios
  • Performance measurement

CONCLUSIONS

  • Overall benefits of usability testing
  • The need to educate others and win management confidence/support.
  • Continue to be an advocate and sell usability to everyone.

GOOD RESOURCES/ WHERE TO FIND MORE

  • U of C: Saul’s site
  • Links to other web sites (Jakob Nielsen, questionnaires, CHI, HFES)

INTRODUCTION

Purpose: This paper describes what usability testing is, its benefits, and the types of usability testing available, with a particular focus on observational testing. It then describes the factors to take into account when conducting an observational usability test.

What is it?: Usability testing is based on a philosophy which places the user at the centre of the design process. This user-centred design process involves the end user in all facets of the product’s development (initial ideas, models, prototypes, finished product, through to upgrades).

Usability testing is the method by which the end user provides feedback on the product design at these stages of development. End users complete a set of real tasks with the product while testers observe their behaviour, note their expectations, and collect other empirical data. The results of the testing are then used to make changes or improvements so that the product is easier to use, produces higher user satisfaction, and/or is more useful. This process of design, evaluate, redesign/modify, retest is referred to as iterative design. Since this process ultimately produces more usable and useful products, it should be applied to any product, or part of a product, that will be used by people.

Unfortunately, designers cannot embrace the iterative design process unless they have upper management’s agreement. Designers need company support (in the form of funding, facility space, staff, and decision-making power) to make the iterative design process happen. The following is adapted from Dumas and Redish’s book, A Practical Guide to Usability Testing (1993, Chapter 6, pp. 86-88).

A company needs to buy into the importance of usability testing and the iterative design cycle. The traditional design process tends to resemble an assembly line, where developers work to create a product driven by technology; user interface reviews and documentation are afterthoughts dealt with late in the project. Why doesn’t this model support the concept of user-centred design? The following six points explain why:

1-The assembly-line approach to development. The product is defined by one group and developed by another, and usability specialists are brought in (if at all) at the end of the design to do acceptance testing rather than to be involved in the design.

2-The battle over user ownership. Sales and marketing may feel that the users are their customers and may not appreciate developers discussing the product with them. Sales, marketing, and project managers are afraid that developers will promise features or a product that they do not plan to deliver. Even though this concern may be justified, developers must remember that information not received directly from the users is less helpful, since it may have been interpreted differently along the way.

3-User feedback isn’t passed on to the design team. Help-desk staff, trainers, and sales representatives may come to know of problems with the product, but these issues do not get back to the design team. This lack of feedback gives developers a false sense that the product has no problems and is usable.

4-Inter-departmental competition hinders respect. The “throw it over the wall” approach encourages people to remain isolated from the other development stages. This isolation makes it difficult to understand and respect others with different needs and goals, and it may result in a lack of respect for the user, who is seen as holding one of these seemingly opposing views.

5-Upper management focuses on head count rather than the overall budget. Managers may take the view that anyone can do particular tasks (writing, training, interface design, usability testing) and that no specialist is needed, so programmers or marketing staff take on dual roles, which can result in poorer work.

6-Usability and documentation are low on the list of priorities. Companies fail to see that even if the product meets schedule and budget, it remains worthless if the user cannot use it effectively and efficiently to accomplish real tasks.

As technology has spread into the everyday population, the demand for usable products has risen. It is no longer only the well-educated and “techies” who use technology; the general population uses it every day. This demand has flowed back to the developers and producers of technology, and those companies that recognize producing usable products as a way of being competitive have endorsed the iterative design approach.

What makes Usability Testing Different?

Some people will argue that user testing is already being performed in the traditional process through beta testing and academic research.

Usability vs. beta testing: In beta testing, a finished product is released early to certain groups of users for a trial period. The company supplying the product contacts the users to see how they like it after the trial period, and possibly during it. There are no testers running the trial, so the product may never actually be used; regulation of use is left up to the users. When asked, users report what they remember and choose to report, and they may not be able to describe the actions they took that led to a problem. Another characteristic of beta testing is that the tasks tested are whatever users happen to do, so some tasks may never arise and therefore go untested. The major difference, however, is that beta testing happens too late in the design process, when it is much more difficult to fix the problems identified by users (Dumas and Redish, 1993).

Usability vs. research studies: The two methods have many similarities (tests conducted in a lab, representative participants, controlled variables, recorded measures, and analysis of the data), but they differ in several ways. They have different goals: research aims to verify or refute a theory, whereas usability testing attempts to identify current problem areas. They select participants differently: usability testing draws users from an available pool rather than through scientific sampling. They treat control of variables differently: a usability test does not usually isolate specific variables and instead tries to reflect real tasks in a real environment. They also weigh observations differently: the test team’s observations and comments are often given more weight in identifying problem areas than they would be in a research study. The major similarity between the two methods is the physical setting in which they take place (for the most part, a laboratory), and the major difference is the purpose they serve (Dumas and Redish, 1993).

Real users. Those testing the product are representative of the user population. The benefit of real users is that they allow the developers to understand users’ specific needs, which can help in developing a better product.

Real tasks. The test tasks must be representative of how the product will be used in the “real” world. This will provide information on areas of the product that need to be changed/improved.

Observing and recording. Observations are made of participants’ performance, the comments and questions they make, and their behaviour. Observing and recording participants’ behaviour distinguishes a usability test from focus groups, surveys, and beta testing.

Converting data into recommendations. The data collected (quantitative and qualitative data from participants, together with observations and users’ comments) can be analyzed and converted into problem areas. This allows problems to be prioritized and given possible solutions; a rough sketch of this roll-up appears after this list.

Changing the product and the process. When applied appropriately, the results of a usability test will help improve both the product and the development process. A usability test is also a way to stimulate interest in usability within an organization and helps change people’s attitudes about users: the impact of watching a few people struggle with a product has more effect on attitudes than many hours of discussion. The ultimate organizational effect would be to change the design and development process that the organization uses (Dumas and Redish, 1993).
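As a rough illustration of how observations might be recorded and rolled up into prioritized problem areas, the Python sketch below uses hypothetical names (Observation, the example severity scale, the sample data); it is only one possible way to do this, not a method prescribed by the sources cited here.

# Hypothetical sketch: recording observations from test sessions and rolling
# them up into a prioritized problem list (most severe and most frequent first).
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Observation:
    participant: str   # anonymized ID, never the participant's name
    task: str          # task from the test scenario
    problem: str       # short description of the difficulty observed
    severity: int      # 1 = cosmetic ... 4 = task failure (example scale)

observations = [
    Observation("P1", "create invoice", "could not find Save button", 3),
    Observation("P2", "create invoice", "could not find Save button", 3),
    Observation("P1", "search records", "unclear error message", 2),
]

# Group identical problems, then rank by severity and by how many
# participants encountered them.
grouped = defaultdict(list)
for obs in observations:
    grouped[obs.problem].append(obs)

report = sorted(
    grouped.items(),
    key=lambda item: (max(o.severity for o in item[1]), len(item[1])),
    reverse=True,
)
for problem, hits in report:
    print(f"{problem}: {len(hits)} participant(s), max severity {max(o.severity for o in hits)}")

A roll-up like this makes it easier to decide which problem areas to address first and to attach possible solutions to each.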

Types of Testing: As described by McGrath (1994), there are four perspectives from which one can evaluate a product. He describes a taxonomy of research strategies made up of four quadrants. The first quadrant is Field strategies, which includes the field experiment and field study (beta testing; real tasks in a real setting). Quadrant two, Experimental strategies (which involve controlling the conditions of the study), includes the laboratory experiment and the experimental simulation. Quadrant three is Respondent strategies, which includes the sample survey (questionnaire) and the judgement study (interview). Quadrant four is Theoretical strategies, which includes formal theory (cognitive walkthroughs, heuristic evaluation) and computer simulation.

Usability testing lies in the Experimental quadrant. Observational testing is a specific type of usability testing in which the test team observes users as they work through a set of tasks designed by the team. Depending on the strategy adopted, testers may or may not interact with the users.
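For quick reference, the quadrant structure described above can be restated as a simple lookup table; the sketch below merely mirrors the preceding paragraph and is not McGrath's own notation.

# McGrath's (1994) four research strategy quadrants, restated as a lookup table.
research_strategies = {
    "Field strategies": ["field experiment / field study (e.g. beta testing; real tasks in a real setting)"],
    "Experimental strategies": ["laboratory experiment", "experimental simulation"],
    "Respondent strategies": ["sample survey (questionnaire)", "judgement study (interview)"],
    "Theoretical strategies": ["formal theory (cognitive walkthroughs, heuristic evaluation)",
                               "computer simulation"],
}

# Example: list the strategies in the quadrant usability testing falls into.
print(research_strategies["Experimental strategies"])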

Pitfalls of Usability testing:

  1. Reliability: whether the users participating in the test are representative of the full spectrum of users. Without representative users, the results cannot be generalized to the user population. This risk of collecting unreliable data can be reduced by including users classified as non-average in the testing, but because of individual variation within any population the test team will never be able to ensure the data collected is 100% reliable.
  2. Validity: whether the test actually measures something of relevance to the usability of the real product. Whether the data collected in a usability test is valid depends on the accuracy of the test tasks, scenarios, and testing environment. The information provided by a user profile or a user representative is necessary for recreating these important factors.

CONDUCTING AN OBSERVATIONAL USABILITY TEST

Deciding what type of test to do: The stage of product development will often dictate which type of usability testing is conducted. The following descriptions of the different test types are drawn from Rubin (1994).

Exploratory testing can be done with users at the very preliminary stages of design. At this stage the design team is collecting information on the validity of the user profile (already developed) and what the users’ mental model of the product is. This can be accomplished by having the user walk the team through the tasks with a prototype. The team could probe the user for information on user expectations and how the product should react in different situations (Rubin 1994).

An assessment test is the most common and simplest test, and can be conducted early or midway through the product development cycle. This type of test would be done after exploratory testing has occurred and the basic structure of the product has been established. It involves users completing realistic tasks. The goals are to capture information on users’ behaviour with the product and how well they can perform while using it (Rubin, 1994).

Rubin (1994) also describes a validation test and a comparison test. A validation test is used to measure a product against a standard or benchmark. A comparison test compares two or more products and can be conducted in lieu of the three previously mentioned tests.

Both the exploratory and assessment tests involve gathering information and results by observing users, and can loosely be described as observational testing.

Test Team Members: The ideal team would include a designer, a usability specialist, an evaluation specialist, technical communicators, trainers, marketing staff, and customer assistance personnel. If the funding or personnel are not available, then minimal testing with fewer testers is better than no testing at all.

Test Plan

Conducting a smooth observational usability test can be facilitated by creating a test plan. A test plan helps ensure that nothing is overlooked by the test team. A well-thought-out test plan can also act as a template for future studies and may help validate the usability test in the eyes of skeptics (Dumas and Redish, 1993; Rubin, 1994; Nielsen, 1996).

What is involved in creating a Test Plan?: Establishing a plan involves (Dumas & Redish, 1993):

defining test goals;

deciding who should be participants;

establishing tasks and task scenarios;

deciding how to measure usability;

preparing materials, the test environment, and the test team; and

conducting a pilot test and revising test procedures.

A test plan can also include other factors such as the budget, the time schedule, and a definition of a successful test. Whatever the test plan contains, it is important that the entire test team is involved in creating it; this ensures that all members of the team understand the test objectives.
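As an illustration only, a plan of this kind can be captured as a simple structured checklist so that nothing is overlooked and the plan can be reused as a template for later studies. The Python sketch below is hypothetical; the field names and example values are not a prescribed format.

# Hypothetical sketch: a test plan as a structured checklist.
test_plan = {
    "goals": ["evaluate the menu system", "check the labels of key buttons"],
    "participants": {"profile": "representative novice users", "per_subgroup": 4},
    "tasks_and_scenarios": ["create a new record", "find and correct an error"],
    "usability_measures": ["time on task", "errors", "assists", "satisfaction rating"],
    "materials": ["consent form", "task sheets", "post-test questionnaire"],
    "environment": "lab or quiet test room with observers",
    "pilot_test_scheduled": True,
    "budget": None,          # optional items mentioned in the text
    "schedule": None,
    "success_criteria": None,
}

# Flag any part of the plan that has not been decided yet.
undecided = [name for name, value in test_plan.items() if value in (None, [], {})]
if undecided:
    print("Still to be decided:", ", ".join(undecided))

Keeping the plan in one shared artifact like this also makes it easier for the whole test team to review and agree on the test objectives.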

Defining test goals: What is the usability test supposed to achieve? Most usability tests have general goals, such as improving the product, creating a product that is easier to learn and use, increasing user satisfaction, creating a benchmark, or reducing risk through validation. Usability tests also have specific goals that focus on characteristics of the product, such as evaluating a menu system, a navigation function, or the labels of certain buttons. The design team needs to identify what areas or characteristics need to be studied, so that these needs can be turned into goals.

Who should be participants: Participants should be representative of the intended users; the participants selected will determine how useful the results are. When recruiting, user characteristics should also be considered (novice or expert users, children or older adults). Whether participants will be compensated, and what form that compensation takes, must also be decided.

Dumas and Redish (1993) describe the process of selecting test participants. They recommend that a user profile be developed by marketing, usability, and design personnel. This profile can be based on market research, analysis of competitors’ products, focus groups, observations, and interviews with prospective users. The user profile should include two types of characteristics: those that all users will share, and those on which individual users differ. Examples of factors that may be included in a user profile are work experience, general computer experience, experience with this product, knowledge base, age, and experience with similar products.

Dumas and Redish go on to describe how to divide users into subgroups, and they make the important point that intermediate users can be left out: participants from the extremes of the range of the participant population will often give you more useful information than those from the middle of the range. The number of people to include in a subgroup will depend on whether the test goals require a certain number of people and on how large the budget is. Nielsen and Molich (1990; cited in Dumas & Redish) found that almost half of the major usability problems were detected with three participants, and other studies have found that four or five participants uncover about 90% of the problems. Dumas and Redish also state that once there are more than two subgroups of three to five people, the same problems keep arising.
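To give a rough feel for these numbers, the sketch below uses the commonly cited problem-discovery model, in which each participant is assumed to uncover a fixed proportion L of the usability problems. The value L = 0.31 is the often-quoted average associated with Nielsen and Landauer's work and is an assumption here, not a figure taken from the sources above; real values vary with the product and task set.

# Hedged illustration: expected proportion of problems found after n participants,
# assuming each participant independently uncovers a fixed proportion L of them.
L = 0.31  # assumed average detection rate per participant

def proportion_found(n, L=L):
    """Expected proportion of usability problems found after n participants."""
    return 1 - (1 - L) ** n

for n in (1, 3, 5, 10):
    print(f"{n} participants -> about {proportion_found(n):.0%} of problems")

# With L = 0.31 this gives roughly 31%, 67%, 84%, and 98%, which is broadly
# consistent with the findings quoted above (about half with three participants,
# around 90% with four or five).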

Ethics: It is important to remember that the participants in usability testing are a necessity and must be treated with respect. The test procedures and test conductors must not ask a participant to do anything they do not wish to do, or intentionally distress them. Every member of the test team should understand that users will try to perform as well as they can and may feel pressure to do “well”. This is why the test team must make the participant feel as comfortable as possible before, during, and after the test. Although the participant has been reassured that the product is being tested and not them, they may feel inadequate as they experience difficulties. The test team must recognize this and reassure the participant that their performance is not being tested. Being observed and/or recorded may increase the pressure felt; for example, the test conductor should not let the user struggle endlessly with a task if it is obvious the user is becoming desperate.

Users should also be assured that all information and observations regarding their performance and participation will remain confidential, and that their name will not be used to identify them in any resulting documentation. Before testing, the user should be informed that they have the right to withdraw from testing without consequence, and should be asked to read and sign an informed consent form that summarizes the procedures, the purpose of the test, and the user’s rights. After the test, the user should be debriefed and given the opportunity to discuss the test tasks, ask questions, give further comments, and answer any questions the test conductor or team may have. The user should then be thanked, given their “reward”, given a contact name or number in case they are interested in the test results or in seeing the “improved” product, and dismissed (Nielsen, 199?).