Performance
Measurement
Handbook

Version 3

Serco Usability Services

© Crown copyright 1993–95
Reproduced by permission of the Controller of HMSO

National Physical Laboratory
Teddington, Middlesex, United Kingdom
TW11 0LW

Reference: NPLUS/PMH/v3.0/Dec 95

No extracts from this document may be reproduced without the prior written consent of the Managing Director, National Physical Laboratory; the source must be acknowledged.

Serco Usability Services – Training courses and technical enquiries

This handbook describes the Performance Measurement Method for measuring the usability of a product.

Because the validity and reliability of the results obtained by applying the Method depend on its users interpreting the guidance in this handbook consistently, Serco recommends that all users of this book attend the Usability Evaluation training course run by Serco Usability Services.

Technical enquiries about the contents of this book and information about the associated Serco course should be addressed to:

Serco Usability Services
22 Hand Court
London
WC1V 6JF

Acknowledgements

Thanks are due to all those who have contributed to the development of the Performance Measurement Method, especially the authors of the earlier versions of this book: Ralph Rengger, Miles Macleod, Rosemary Bowden, Annie Drynan, and Miranda Blayney.

Editing and indexing for Version 3 are by David Cooper.

Appendix 2 is an updated version of the NPL Usability Library book The Structure and Contents of an NPL Usability Report by Ralph Rengger.

Appendix 5 is an updated version of the NPL Usability Library book A Guide to Conducting User Sessions by Ralph Rengger and Rosemary Bowden. Additional material for Appendix 5 is by Cathy Thomas and Rosemary Bowden.

All trademarks acknowledged

Contents

Part 1: Getting Started______1

Introduction...... 3

How to use this book...... 3

Introduction to the Method...... 3

Performance Measurement...... 3

Applying the Method...... 4

Performance Measurement Toolkit...... 6

This book – an overview of its parts...... 9

General...... 9

Preliminary pages...... 9

Part 1: Getting Started...... 10

Part 2: Applying the Method...... 11

Part 3: Quick Guide to Analysing Usability Sessions...... 11

Part 4: Guide to Analysing Task Output...... 12

Part 5: Guide to Analysing Video Records...... 12

Part 6: Deriving and Interpreting Metrics...... 12

Appendices...... 14

Back matter...... 15

Notes for owners of Version 2 of this book...... 15

Summary of how to use this book...... 16

Other parts of the Performance Measurement Toolkit...... 17

Usability Context Analysis Guide...... 17

Identifying Usability Design Issues...... 19

DRUM User Guide...... 20

Other books in the NPL Usability Library...... 20

Suggested readership for the library...... 21

DRUM software...... 22

DRUM hardware...... 22

Part 2: Applying the Method______25

Introduction...... 27

Why read this part of the handbook...... 27

Who should read this part...... 27

Steps in the Performance Measurement Method...... 27

Applying the Method...... 29

Step 1: Define the product to be tested...... 29

Step 2: Define the Context of Use...... 29

Step 3: Specify the evaluation targets and Context of Evaluation...... 31

Step 4: Prepare an evaluation...... 32

Step 5: Carry out user tests...... 35

Step 6: Analyse the test data...... 37

Step 7: Produce the usability reports...... 38

Relevant parts of the library...... 39

Part 3: Quick Guide to Analysing Usability Sessions______31

Introduction...... 43

Why read this part – the Quick Guide...... 43

About the Method...... 43

Measures and metrics...... 47

Measures from analysing task output (Sub-goal 6.1)...... 47

Measures from analysing video records (Sub-goal 6.2)...... 48

The metrics you derive from the measures (Sub-goal 6.3)...... 52

Analysing usability sessions – an overview...... 53

Stages of analysis...... 53

Familiarising yourself with the product...... 53

Familiarising yourself with the task...... 53

Attending the usability sessions...... 53

Analysing task output – Sub-goal 6.1...... 55

Analysing video records...... 55

Analysing task output and video records...... 57

How to analyse task output – Sub-goal 6.1...... 57

How to measure times from video records – Sub-goal 6.2...... 57

Part 4: Guide to Analysing Task Output______47

Introduction...... 67

Why read this part of the handbook...... 67

Who should read this part...... 67

Why analyse task output?...... 69

Overview...... 69

Measures derived...... 69

Metrics derived...... 71

Associated metrics...... 71

Summary...... 72

How to analyse task output – Sub-goal 6.1...... 75

The requirements for analysing task outputs...... 75

Dependent and independent subtasks...... 75

Quantity...... 78

Quality...... 79

Calculating Task Effectiveness...... 81

Case studies...... 83

Case 1 – Process controller...... 83

Case 2 – Holiday information system...... 86

Case 3 – Drawing package...... 88

Part 5: Guide to Analysing Video Records______67

Introduction...... 95

Why read this part...... 95

Who should read this part...... 95

Understanding session time...... 97

Introduction...... 97

Task Time...... 97

Productive and unproductive actions...... 97

Fuzzy problems...... 100

Categorising and measuring session time – Sub-goal 6.2...... 101

Categorising actions...... 101

Categorising pauses...... 101

Simultaneous actions...... 101

Measuring all types of time...... 104

Detailed descriptions of periods...... 107

Task Time...... 107

Help Time...... 108

Search Time...... 110

Snag Time...... 112

Negating actions...... 112

Cancelled actions...... 116

Rejected actions...... 118

How to analyse video records...... 121

Overview...... 121

Familiarisation with product and task...... 121

Attending the usability sessions...... 122

Analysing the tapes...... 123

Logging times with DRUM software (Sub-goal 6.2)...... 127

Associated tasks...... 127

Further examples of Help, Search, and Snag Time...... 131

Examples of Help Time...... 131

Examples of Search Time...... 132

Examples of Snag Time...... 136

Example usability sessions...... 143

Introduction...... 143

Usability session example 1...... 144

Usability session example 2...... 148

Part 6: Deriving and Interpreting Metrics ______105

Introduction...... 153

Why read this part...... 153

Who should read this part...... 153

Deriving the metrics...... 155

Individual user values (session metrics) – Sub-goal 6.3...... 155

Deriving group values (usability metrics) – Sub-goal 6.4...... 156

Interpreting the results...... 159

Setting target levels...... 159

The effect of zero Task Effectiveness...... 161

Relating the results to usability success factors...... 162

Understanding User Efficiency (UE) and Relative User Efficiency (RUE)...... 169

Appendices______123

Appendix 1: Performance Metrics Directory______125

Introduction to the metrics...... 175

Product-dependent measures of performance...... 175

Product-independent measures of performance...... 175

Usability metrics...... 177

Evaluation procedure...... 178

Product-independent measures of performance...... 179

Group 1: Duration measures...... 179

Group 2: Count measures...... 180

Group 3: Frequency measures...... 180

Group 4: Completeness measures...... 180

Group 5: Correctness measures...... 180

Usability metrics...... 181

Class 1: Goal achievement...... 181

Class 2: Work rate...... 181

Class 3: Operability...... 184

Class 4: Knowledge acquisition...... 185

Summary...... 187

Appendix 2: Example Format of a Usability Report______135

Introduction...... 189

Why read this appendix...... 189

Who should read this appendix...... 190

Basic structure of a report...... 191

Contents of the sections in a report...... 193

Appendix 3: MUSiC______145

The MUSiC project...... 203

Types of metric developed in MUSiC...... 203

MUSiC partners...... 205

Appendix 4: Reader Comment Form______147

Appendix 5: A Guide to Conducting User Sessions______151

Introduction...... 211

Why read this appendix...... 211

Who should read this appendix...... 211

Relaxation of copyright...... 211

Evaluation checklists...... 213

Planning...... 213

Preparing an evaluation (Step 4)...... 217

Running usability sessions (Step 5)...... 219

Producing results...... 220

Stages in conducting an evaluation...... 221

Back Matter______163

Bibliography______165

Papers...... 229

NPL Usability Library...... 230

Other MUSiC project publications and products...... 231

Miscellaneous...... 233

Glossary______169

Index______173

Some commonly used terms______Inside back cover

Figures

Figure 1: Major parts of the Performance Measurement Toolkit...... 7

Figure 2: Steps and tools in the Performance Measurement Method...... 28

Figure 3: Relationship between measures and metrics...... 45

Figure 4: Hierarchy of task actions...... 48

Figure 5: Decision tree to categorise actions...... 59

Figure 6: Example of a usability session profile...... 62

Figure 7: Hierarchy of action categories...... 103

Figure 8: Decision tree to categorise actions...... 125

Figure 9: Description of usability session example 1...... 144

Figure 10: Log showing the Measures required for calculating the metrics...... 146

Figure 11: Log showing the Measures and other possible information required for diagnostics...... 147

Figure 12: Description of usability session example 2...... 148

Figure 13: DRUM log of the session with diagnostic information and comments...... 149

Figure 14: Summary of usability metrics...... 187

Figure 15: Contents list of a model usability report...... 191

Figure 16: Example extract of performance-based results...... 197

Figure 17: Example extract of SUMI results...... 198

Figure 18: Stages in conducting an evaluation...... 221

Tables

Table 1: Parts of the NPL Usability Library relevant for some typical tasks...... 16

Table 2: Suggested readership for parts of the NPL Usability Library...... 21

Table 3: Steps in the Performance Measurement Method and relevant parts of the library...... 39

Table 4: Usability success factors relevant to measures and metrics...... 169

Table 5: Product-independent measures by group...... 177

Table 6: Usability metrics by class...... 177

Table 7: MUSiC partners...... 205


Part 1: Getting Started

Contents of this part

Introduction...... 3

How to use this book...... 3

Introduction to the Method...... 3

Performance Measurement...... 3

Applying the Method...... 4

Performance Measurement Toolkit...... 6

This book – an overview of its parts...... 9

General...... 9

Preliminary pages...... 9

Part 1: Getting Started...... 10

Part 2: Applying the Method...... 11

Part 3: Quick Guide to Analysing Usability Sessions...... 11

Part 4: Guide to Analysing Task Output...... 12

Part 5: Guide to Analysing Video Records...... 12

Part 6: Deriving and Interpreting Metrics...... 12

Appendices...... 14

Appendix 1: Performance Metrics Directory...... 14

Appendix 2: Example Format of a Usability Report...... 14

Appendix 3: MUSiC...... 14

Appendix 4: Reader Comment Form...... 14

Appendix 5: A Guide to Conducting User Sessions...... 14

Back matter...... 15

Bibliography...... 15

Glossary...... 15

Index...... 15

Notes for owners of Version 2 of this book...... 15

Summary of how to use this book...... 16

Which part do I need to read?...... 16

Other parts of the Performance Measurement Toolkit...... 17

Usability Context Analysis Guide...... 17

Identifying Usability Design Issues...... 19

Problem Descriptions...... 19

Information on Hierarchical Task Analysis (HTA)...... 19

DRUM User Guide...... 20

Other books in the NPL Usability Library...... 20

Suggested readership for the library...... 21

DRUM software...... 22

DRUM hardware...... 22

Figures

Figure 1: Major parts of the Performance Measurement Toolkit...... 7

Tables

Table 1: Parts of the NPL Usability Library relevant for some typical tasks...... 16

Table 2: Suggested readership for parts of the NPL Usability Library...... 21


Introduction

How to use this book

This book describes the Performance Measurement Method (often abbreviated to the Method).

Read this part first. After a brief overview of the Method, this part describes how this book is organised, and the contents and suggested readership for each part. It also describes other books and tools that support the Method.

From reading this, you will see where to find further information to match your needs.

Introduction to the Method

The Method is designed to be implemented by usability analysts who have undergone basic training. See the front matter of this book for details of whom to contact about training.

Origins of the Method

The Method was developed at the National Physical Laboratory (NPL) as part of the European ESPRIT II project 5429 – MUSiC (Measuring Usability of Systems in Context). MUSiC was concerned with developing the metrics, methods, and standards required to measure the usability of software.

MUSiC involved the development of four types of metric – analytic, performance, cognitive workload, and user satisfaction.

NPL developed the performance-related methods, measures and metrics that are the subject of this book. For more details about the MUSiC project and the other usability measures developed, see “Appendix 3: MUSiC”.

Performance Measurement

The Performance Measurement Method facilitates the measurement of performance metrics. It is designed for use with one or more of the other MUSiC methods, but can be used independently.

It aims to provide data on the effectiveness and efficiency of users' interaction with a product, thus enabling comparisons with similar products, or with previous versions of the product under development.

It can also highlight areas where a product can be enhanced to improve usability. When used with the other methods, you can build a complete picture of the usability of a system.

It gives you a way of evaluating the usability of a product by observing and analysing how successfully tasks can be performed by users of the product.

Applying the Method

The Performance Measurement Method takes you through all the stages of the evaluation, from deciding what and how to evaluate, to producing the final usability report. The steps involved are as follows:

1. Defining the product to be tested. You do this in a structured way using a form supplied as part of the Usability Context Analysis Guide (which is a companion volume to this book).

2. Defining the Context of Use. For the measures of usability to be meaningful, you must set up an evaluation test with:

  • Users who are representative of the population of users who use the product
  • Tasks that are representative of the ones for which the system is intended
  • Conditions that are representative of the normal conditions in which the product is used

With the help of the Usability Context Analysis Guide, you produce a specification of key factors concerning the users, the tasks they will perform, and the environments in which they will work.

3. Specifying the Context of Evaluation so that the evaluation can be carried out in conditions as close as possible to those in which the product will be used.

The Usability Context Analysis Guide provides a structured questionnaire format to assist you in defining and documenting the Evaluation Plan.

4. Preparing an evaluation to meet the specified Context of Evaluation. The evaluation measures the performance of users as they perform set tasks within this context. The Usability Context Analysis Guide describes a procedure for setting up an appropriate evaluation test.

5. Performing the user tests. When you are using the full Performance Measurement Method, evaluation sessions are recorded on video. DRUM – the Diagnostic Recorder for Usability Measurement – is a software tool that enables you to make an accurate and comprehensive record of the interaction and to analyse it.

The DRUM User Guide describes how to use the software and specifies the hardware set-up and connections.

6. Analysing the data, again with the help of DRUM. When you analyse a usability session, you analyse the task output that a user produces and the video record of the session. This yields measures of Task, Snag, Search, and Help Times.

You then use these measures to calculate metrics, which provide a quantitative measure of usability. The metrics are Effectiveness, Efficiency, Productive Period, and Relative User Efficiency.

If you just want to derive measures of Efficiency and Effectiveness, then a video recording is unnecessary.

7. Producing a usability report. This should give a description of the performance metrics of the system under test, and could be used to compare the system with similar systems, or with the same system as it is developed over time.

Priorities – for example, of speed or accuracy – can be assessed, and features of the product where the usability can be improved can be highlighted.

The steps just outlined are described in detail in Part 2.
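To make the Step 6 calculations concrete, here is a minimal sketch in Python. The function names, parameter names, and exact formulas are assumptions for illustration only; the detailed definitions given later in this handbook (Parts 4 and 6) are definitive.

```python
# Illustrative sketch of the performance metric calculations.
# All formula details are assumptions; consult Parts 4 and 6 for
# the handbook's authoritative definitions.

def task_effectiveness(quantity_pct: float, quality_pct: float) -> float:
    """Task Effectiveness from the Quantity and Quality percentages
    obtained by analysing task output (Sub-goal 6.1)."""
    return quantity_pct * quality_pct / 100.0  # a percentage

def productive_period(task_time: float, help_time: float,
                      search_time: float, snag_time: float) -> float:
    """Percentage of Task Time spent productively, using the times
    measured from the video record (Sub-goal 6.2)."""
    unproductive = help_time + search_time + snag_time
    return 100.0 * (task_time - unproductive) / task_time

def user_efficiency(effectiveness: float, task_time: float) -> float:
    """Efficiency: effectiveness achieved per unit of Task Time."""
    return effectiveness / task_time

def relative_user_efficiency(user_eff: float, expert_eff: float) -> float:
    """A user's efficiency as a percentage of an expert user's
    efficiency on the same task."""
    return 100.0 * user_eff / expert_eff
```

For example, under these assumed formulas, a session with a Task Time of 30 minutes that includes 3 minutes of Help Time, 2 minutes of Search Time, and 1 minute of Snag Time would have a Productive Period of 80%.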

Performance Measurement Toolkit

The Performance Measurement Method is supported by a Performance Measurement Toolkit (shown in Figure 1 on page 7), which consists of software, hardware, and paper tools.

The paper tools are contained in this book and other parts of the NPL Usability Library.

The Toolkit is described in “This book – an overview of its parts” on page 9 and “Other parts of the Performance Measurement Toolkit” on page 17.

Table 1 on page 16 shows which parts of the NPL Usability Library are relevant for certain typical tasks while Table 2 on page 21 shows which parts are relevant to typical users of the Method.

Figure 1: Major parts of the Performance Measurement Toolkit


This book – an overview of its parts

General

Organisation

The manual is divided into six parts plus appendices and back matter. Each part starts on a coloured divider page and covers a broad topic. The first three parts give you an understanding of this book and the Performance Measurement Method; the next three parts help you follow the Method.

Conventions used

Certain key terms, such as measures and metrics, are set in Title Case e.g. Productive Period.

Book titles, new terms, and references within the glossary to other items are all set in italic e.g. DRUM User Guide.

Bold is used for simple emphasis and for steps, goals, and sub-goals in the Performance Measurement Method e.g. Step 6 – Analyse the test data.

Numbered lists (1, 2, 3, …) show a sequence of steps where the order is important. Numbered lists are often used for tasks that you must perform.

A square bullet () denotes a task with just one step.

In round bullet (•) lists, the sequence of items is generally unimportant.

Access methods

Index There is a comprehensive index at the back of the book.

Contents lists There is an overview contents list in the preliminary pages. The first page of each part is printed on coloured card and has a contents list that gives more detail. There are also lists for figures and tables.

Page footers The page numbers are in the page footers. Even page footers give the part of the book; odd page footers are derived from the chapter title.

Preliminary pages

These contain an overview contents list and also include details of how to contact Serco.

Part 1: Getting Started

The part you are now reading. Read all this part because it describes how to use this book.

Part 2: Applying the Method

The Method is described as a sequence of numbered steps with associated goals and sub-goals. These are used throughout the book to help you see where a particular task fits into the overall Method.

Read all this part to get a picture of the whole Method.

However, this part does not contain detailed descriptions of the techniques used; these are described in more detail in Parts 3–5 and Appendix 1.

If the analyst uses the techniques in the sequence and manner described, the Method provides an efficient way of measuring performance-based metrics, and of obtaining measures that are repeatable and reproducible.

Efficient – The time taken to measure usability from the test data is of the same order as the time taken to record the data

Repeatable – The same analyst will measure the same level of usability if he or she repeats the test under the same conditions

Reproducible – Different analysts will measure the same level of usability if their test conditions are similar