Alla Tolchinsky, Irana Sheykh-Zade, Seema Moorjani, Hyu Kyu Choi

Portable Doctor Advisor: Pilot Usability Test

Introduction

System being evaluated

The system we are evaluating is the preliminary design for the Portable Doctor Advisor (PDA). The device would have diagnostic capabilities for mild symptoms and would recommend medicine to relieve the discomfort they cause. It would also keep track of the customer’s medicine schedule and give a reminder when it is time to take a medication. In addition, it would store the customer’s medical history and check that a new medicine does not conflict with the person’s allergies and/or current medications.
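To make the conflict check concrete, here is a minimal sketch of the kind of lookup the device would perform when a new medicine is added. The allergy list, current medications, and interaction table are illustrative placeholders, not a real drug database.

    # Minimal sketch of the conflict check described above; the data
    # below is a hypothetical placeholder, not a real drug database.
    ALLERGIES = {"penicillin"}
    CURRENT_MEDS = {"warfarin"}
    KNOWN_INTERACTIONS = {frozenset({"warfarin", "aspirin"})}

    def check_new_medicine(name):
        """Return a list of warnings for a newly prescribed medicine."""
        warnings = []
        if name.lower() in ALLERGIES:
            warnings.append(name + " matches a recorded allergy")
        for med in CURRENT_MEDS:
            if frozenset({name.lower(), med}) in KNOWN_INTERACTIONS:
                warnings.append(name + " may interact with " + med)
        return warnings

    print(check_new_medicine("Aspirin"))  # ['Aspirin may interact with warfarin']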

Purpose and rationale

The purpose of our experiment is to get customer feedback on our interface design at an early stage. By doing so we can measure the usability of the design and make changes before investing in the final product. We want to see whether customers have problems completing common tasks and, if they do, where they encounter those problems, so we can make improvements in our next design iteration.

Method

Participants

We have a wide range of target customers, so for our user testing we wanted users with a variety of backgrounds. Some of our target customers are people who will use the device only when they are sick, to diagnose mild symptoms. Other customers will use the device often, to keep track of the medicine they take on a daily basis. Some customers will be avid computer users, while others will be computer-illiterate. Of our three participants, two were female and one was male. One participant was in her twenties, another in his thirties, and the third in her fifties. Our users also represented customers with different degrees of computer literacy and familiarity with PDAs. One of our users works in the computer industry and uses a PDA often. Our second user uses her computer on a daily basis to check her e-mail and use office applications, such as Microsoft Word. She recently bought a PDA, but does not use it often. Our third customer also uses her computer for everyday tasks (e-mail, Word), and does not own, and has never used, a PDA.

Age (range) / Sex / Owns a PDA / Computer experience / Medication currently taken
20s / F / Yes (novice) / Intermediate / 2 pills a day
30s / M / Yes / Advanced / Only when sick
50s / F / No / Novice-intermediate / 6 pills a day

Apparatus

We used a handheld device, a Compaq iPAQ. We tested our users in different environments, which the users picked themselves so that they would feel most comfortable. In each case it was a quiet place, with no extra noise or distractions. It was important for us to test the device in different places because it is a portable device intended to be easily used anywhere. One of our participants chose to do the testing in his office at work, another asked us to do it at her home, and our third participant performed the user testing in a lunchroom at her workplace (no other people were around). In all the trials, we sat around the participant so we could see which buttons they were pressing as they went along.

Tasks

Task 1: Setting up personal profile and password

This is the customer’s first time using the device. They need to set up their personal profile, which includes entering their personal information such as name and age. Then they need to set up their password.

We chose to perform this task because all the target customers would need to complete it in order to personalize the device. This task also acquaints the customer with the way to input the information into our PDA.

Figure 1: Storyboard for Task 1
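
As a rough sketch of the data this task collects, the profile might be stored as a simple record; the field names and the hashing choice below are our assumptions for illustration, not the final schema.

    # Hypothetical profile record for Task 1; field names and the
    # hashing choice are assumptions, not the final schema.
    import hashlib

    profile = {
        "first_name": "Jane",   # first and last name are separate fields
        "last_name": "Doe",
        "age": 27,
    }

    # Store a hash of the password rather than the plain text.
    password = "example-password"   # entered on the password screen
    profile["password_hash"] = hashlib.sha256(password.encode()).hexdigest()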

Task 2: Input of new medication

The customer has just received a new prescription and must now enter it into the device so they can get reminders for when to take the medicine. They need to enter information about the medicine, such as the medicine name, the amount prescribed, and the dosage.

We chose this task because it will be performed frequently. It involves extensive customer input using the provided keyboard, as well as manipulating the drop-down menus on the screen.

Figure 2: Storyboard for Task 2
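
To illustrate, the record this task creates might look like the sketch below, with reminder times derived from the dosage; the field names and the scheduling rule are assumptions for illustration, not the implemented design.

    # Hypothetical medication record for Task 2, with reminder times
    # derived from it; field names and the scheduling rule are
    # illustrative assumptions.
    from datetime import time

    medication = {
        "name": "Amoxicillin",   # typed on the keyboard
        "amount": "500 mg",      # amount prescribed
        "doses_per_day": 3,      # chosen from a dropdown menu
    }

    def reminder_times(doses_per_day, start_hour=8, waking_hours=16):
        """Spread the doses evenly across the waking day."""
        step = waking_hours // doses_per_day
        return [time(start_hour + i * step) for i in range(doses_per_day)]

    print(reminder_times(medication["doses_per_day"]))
    # [datetime.time(8, 0), datetime.time(13, 0), datetime.time(18, 0)]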

Task 3: Diagnosis of mild symptoms: headache, sore throat, and fever of 101° Fahrenheit

The customer is not feeling well, so they need to use the PDA to try to diagnose their symptoms and check whether it is something they need to see a doctor about, or whether they can just take over-the-counter medication. In order to get a diagnosis, the customer needs to input all their symptoms into the device.

We chose this task because it is complex to complete: it comprises many steps, carried out on different screens. It is also one of the tasks that most potential customers would perform if they had the device.

Figure 3: Storyboard for Task 3


Procedure

At the beginning of each testing session, Irana explained to each participant the purpose of our experiment, asked them to sign the consent form, reassured them that we were testing our design and not them, and asked the participant to think out loud while completing the tasks. We all sat around the participant so we could see the screen and what the participant was doing. We also tape-recorded each session for later reference. Each task was written out on an index card. Irana gave an example of how to use the PDA's touch screen and then presented the tasks to the participant one at a time, in order of increasing difficulty. Alla timed each task and noted how many errors the participant made. All of us took notes whenever there was a critical event (when the participant was pleased to see something, or surprised, or had difficulty). During the tasks, we did not provide the participant with any help; for example, if there was something they were looking for but could not find, we waited silently until they found it. The only time we interfered was when a participant accidentally pressed something on the iPAQ that exited our application, in which case we helped them bring the application back.

Once they had completed all three tasks, we asked the participants what they had the most trouble with, what they liked best about our interface, and whether they had any suggestions for making it easier to learn and use.

Test Measures

We focused on qualitative data, such as where in the interface our customers had problems. This is important because it allows us to pinpoint exactly where the problems with our interface are, so we can change those features. To collect the qualitative data, we observed how participants interacted with the presented screens. We specifically looked for the times when a participant looked confused or surprised, or spent a long time looking for or at something. We noted each error a participant made during a task, which buttons they tended to press erroneously, and where their eyes first glanced in search of a specific button. We also counted how many times each participant used the “Help” button. In addition, we measured quantitative data: how long it took each participant to complete each task and how many errors were made during each task.
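
As an example of how the quantitative measures reduce to the tables below, here is a small sketch that reproduces the Task 1 averages from the raw measurements; the tuple format is ours, not the output of any actual logging tool.

    # Reproducing the Task 1 averages reported below from the raw
    # measurements: (seconds to complete, errors, uses of "Help").
    task1 = [(180, 0, 0), (210, 1, 0), (565, 2, 0)]

    times, errors, helps = zip(*task1)
    avg = sum(times) / len(times)
    print("%d min %d sec" % (avg // 60, avg % 60))  # 5 min 18 sec
    print(sum(errors) / len(errors))                # 1.0
    print(sum(helps))                               # 0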

Results

Task 1: Setting up profile

Participant / Time to complete task / Number of errors / # of times “Help” was used
1 / 3 min / 0 / 0
2 / 3 min 30 sec / 1 / 0
3 / 9 min 25 sec / 2 / 0
Average / 5 min 18 sec / 1 / 0

Since this was the first task, it took the participants some time to get comfortable with our application and the specific hardware (none of our participants, even the ones who use a PDA, had ever used an iPAQ before). The high completion time for participant 3 reflects the fact that she had never used a PDA before; it took her a while to understand how to use a handheld device.

The main problem participants experienced during this task was dealing with the keyboard: when the keyboard is displayed on the screen, it covers our navigation/action buttons, and to access those buttons you need to hide the keyboard. Since this was the first task, participants could not see our buttons right away, and took a long time to figure out that to move on to the next screen they needed to hide the keyboard.

One of the errors participant 3 made was due to the implementation of the keyboard provided by the hardware: she could not figure out that to move between input fields she had to touch the screen where the field is located, rather than use the “Tab” button on the keyboard. The other two errors, made by participants 2 and 3, were entering the first and last name together in the same field; only when they moved on to the next field did they realize that the two names had separate fields.

Task 2: Adding a new medicine

Participant / Time to complete task / Number of errors / # of times “Help” was used
1 / 2 min / 0 / 0
2 / 1 min 20 sec / 0 / 0
3 / 2 min / 1 / 0
Average / 1 min 47 sec / 0.33 / 0

Since the majority of the task involved using the keyboard to input information, and the participants had already learned how to use the keyboard and reach the navigation buttons, the task turned out to be easy. The only error encountered was due to the fact that the cursor is not placed in the first (or any) field when the screen is first displayed, as it usually is in computer interfaces. This caused one of our participants to type information for a field before realizing that the cursor was not there. Some of our participants were also surprised that there was no field for instructions on how to take the medicine (e.g., with milk). All our participants commented that they liked the drop-down menus.

Task 3: Diagnosing symptoms

Participant / Time to complete task / Number of errors / # of times “Help” was used
1 / 2 min 10 sec / 0 / 0
2 / 1 min / 0 / 0
3 / 2 min 20 sec / 0 / 0
Average / 1 min 50 sec / 0 / 0

This was our hardest task, and although our participants made no errors, they did not use our interface as we intended. The problem was that the participants went to the head symptoms screen to check off “headache” and “sore throat”, and then, instead of pressing “more symptoms”, tried to type in “fever” using the extra field labeled “add other symptoms one at a time” (Figure 4). Although this should work in the final product, our intent was that they would use a checkbox under “general symptoms” and a drop-down menu to select the temperature. In this prototype the diagnosis did not work by typing in symptoms, so the participants who typed in fever had to go back and figure out that they needed to go to a different screen to input the remaining symptoms. After completing their tasks, our participants told us that they had not read the directions for the task at first. As seen in our results, the participant who did read the instructions (participant 2) was able to complete the task in 1 minute.

Figure 4: Errors in Head Symptoms screen

General Comments

The major issue when testing our device was the integration of the iPAQ keyboard with our application. The participants did not know how to bring up the keyboard in order to type information into the fields: some did not see or recognize the icon, and some did not know where to look for it. Also, once the keyboard was displayed, all our testers had trouble figuring out how to hide it. Because they could not hide the keyboard, they could not reach our navigation/action buttons and could not move on to complete the task.

Discussion

From our pilot testing we learned that there were two major usability problems that we did not anticipate.

The first problem involved the keyboard:

  1. It was not clear how to bring up the keyboard.
  2. It was not clear how to hide the keyboard.
  3. The keyboard covered some of our navigation/action buttons, so the participants were confused about how to continue with the task.

The second problem involved our design of the steps to get a diagnosis:

  1. The participants skipped reading the instructions for how to input the symptoms (which screens to use for which symptoms).
  2. Instead of using "More Symptoms" to get to general symptoms, the participants typed in their symptoms manually. This should result in the same diagnosis (see the sketch after this list), but it defeats the purpose of our checkboxes, which provide ease of use and prevent spelling errors.
  3. Participants found it confusing that pressing the "More Symptoms" button takes them to the human figure screen and not to a list of more symptoms associated with that body part.
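
One way to make the typed path and the checkbox path equivalent, sketched below, is to normalize free-text input onto the same canonical symptom codes that the checkboxes set; the alias table and function names are illustrative, not our actual symptom database.

    # Illustrative sketch: free text from the "add other symptoms" field
    # is mapped onto the same canonical codes the checkboxes set, so both
    # input paths feed the diagnosis identically. The alias table is a
    # placeholder, not the real symptom database.
    CANONICAL = {
        "headache": "headache",
        "sore throat": "sore_throat",
        "fever": "fever",
        "temperature": "fever",
    }

    def normalize(typed):
        """Return the canonical symptom code for typed text, or None."""
        return CANONICAL.get(typed.strip().lower())

    symptoms = {"headache", "sore_throat"}   # set via checkboxes
    code = normalize("Fever")                # typed manually
    if code:
        symptoms.add(code)
    print(sorted(symptoms))  # ['fever', 'headache', 'sore_throat']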

For the “real” experiment we would not use an iPAQ as our target platform, but would instead have special hardware created just for our application. It would be almost the same, except with more screen space and no navigation on the hardware (all navigation would be on the screens). This would give us more space for the keyboard, which then would not cover any of our navigation buttons. We would also implement a database supporting more symptoms. For the experiment itself, we would recruit more participants, with a wider range of ages and medical problems, to get more accurate quantitative and qualitative data. We would also videotape each testing session so we could analyze facial expressions as part of our process data.