Assignment #1

Student ID: 5563 4294

NOTE:

1.  Do NOT put your name on this document anywhere or in the file name.

2.  Please include your student ID in your file name.

Answers should be brief. Most questions can be answered in 1-2 paragraphs. None should require more than one full page. Cite your references. You may use [Wickens], [Ritter], and [Norman] as shorthand for referencing the core class readings. All other references should receive a full citation as a footnote or endnote. Any web pages you visit as part of answering these questions should be referenced if you end up using information from them.

Please enter your responses in this document and turn your document in via the CTools assignments tool. Any papers that do NOT include your student ID on this first page and in the filename will be returned for revision, as will any papers that DO include your name or any identifying information other than ID.

Question 1

Watch this video about the “VoiceOver” feature on the iPhone

http://www.youtube.com/watch?v=tVruB7I2G14

Name 3 senses that someone must use in order to make use of VoiceOver and describe how those senses are used to receive information that would otherwise be picked up through vision.

Answer: The whole point of VoiceOver is its use of audio cues, so first and foremost the user must rely on hearing to interact with it. Normally, a user must be able to see and read items on the screen; VoiceOver removes that need by reading aloud just about everything the user can interact with.

Audio cues tell the user when they navigate to a new icon, when they reach the end of an icon row, when they reach the end of a folder's contents, when they scroll to another page, when the phone is locked, and when they return home. These cues, among others, replace the need for clear visual access to the interface.

The next most important sense is touch. Special gestures let the user work with the phone's otherwise predominantly visual interface, using one, two, or three fingers to perform different tasks. With VoiceOver active, the interface does not respond to single taps the way it normally does, so the user can drag freely around the screen without accidentally activating items. Instead, single taps and drags select an item to act on, and double taps activate the selected item. There are also special touch gestures for pausing the voice, selecting items in sequence, and scrolling through pages of information: three-finger gestures switch pages and scroll lists of content.

The last sense is proprioception, the user's overall sense of their body in space. Touch allows the user to interact, but a sense of where their finger lies in relation to the screen is essential. When typing, the user needs to be aware of the position of both thumbs, and VoiceOver assists with audio cues. The home button can sometimes be hard to locate without vision. While it could be found by touch, VoiceOver announces the orientation of the device, which helps in finding the home button: "Landscape. Home button to the left".

Question 2

What is the maximum number of “thumb-friendly” targets you can fit on an iPhone 4 (http://www.apple.com/iphone/)? What size would the targets be in terms of pixels?

Answer:

Assuming a pixel density of 326 pixels/inch on the iPhone 4 Retina display, and that the optimal thumb target is a 9.6 mm x 9.6 mm square (0.38 in x 0.38 in), each target would be

0.38 in * 326 px/in ≈ 124 px to a side.
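
The target count itself can be worked out from the target size. The following is a minimal sketch, assuming the iPhone 4's portrait resolution of 640 x 960 px and targets packed edge to edge with no spacing. Both are assumptions not stated in the answer above, and the function names are only illustrative.

```python
PPI = 326                # pixel density quoted in the answer above
TARGET_IN = 0.38         # target side in inches (9.6 mm), from the answer above
SCREEN_PX = (640, 960)   # ASSUMPTION: iPhone 4 portrait resolution (width, height)


def target_side_px(side_in=TARGET_IN, ppi=PPI):
    """Side length of one thumb target, rounded to whole pixels."""
    return round(side_in * ppi)


def max_targets(screen_px=SCREEN_PX, side_px=None):
    """Whole targets that fit edge to edge: (columns, rows, total)."""
    side = target_side_px() if side_px is None else side_px
    cols = screen_px[0] // side   # only whole targets count
    rows = screen_px[1] // side
    return cols, rows, cols * rows


print(target_side_px())   # 124
print(max_targets())      # (5, 7, 35)
```

Under these assumptions the screen holds 5 columns by 7 rows, or 35 targets. Any spacing between targets or reserved screen area (such as a status bar) would lower that number.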


Question 3

Predict the minimum time it would take a user to accomplish the following task given the interface shown here.

1.  Enter “Jane” into the First Name field.

2.  Enter “Doe” into the Last Name field.

3.  Click the “Female” checkbox.

4.  Click “OK.”

Only include predictions for targeting and typing. Assume the user is an average typist and use Ritter’s estimates of typing speed to estimate text entry time. Do NOT include other actions such as decision time, hand movement (e.g., keyboard to mouse), or button presses. Assume also that the cursor initially starts in the upper left corner and that each subsequent selection operation starts in the middle of each target. Show how you worked through your ultimate answer.

Answer:

[Ritter]: "We can extrapolate to keystroke times between 750 ms per keystroke for slow typists to 125 ms per keystroke for fast typists." Assuming "average" is 40 wpm and 5 characters = 1 word:

40 wpm = 200 char/min ≈ 3 char/sec (rounded down from 3.33), or about 1 char per 0.33 s ≈ 330 ms per keystroke.
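
The typing estimate can be sketched in a few lines. This is only an illustration with made-up function names. It assumes 5 characters per word, as above. Note that the unrounded conversion gives 300 ms per keystroke; the 330 ms figure used in this answer comes from rounding the rate down to 3 char/sec.

```python
def ms_per_keystroke(wpm, chars_per_word=5):
    """Convert words per minute to milliseconds per keystroke (no rounding)."""
    chars_per_minute = wpm * chars_per_word
    return 60_000 / chars_per_minute


def typing_time_ms(text, keystroke_ms):
    """Time to type `text` at a fixed per-keystroke cost."""
    return len(text) * keystroke_ms


KEYSTROKE_MS = 330  # figure used in this answer (rate rounded to 3 char/sec)

print(ms_per_keystroke(40))                  # 300.0 (unrounded conversion)
print(typing_time_ms("Jane", KEYSTROKE_MS))  # 1320
print(typing_time_ms("Doe", KEYSTROKE_MS))   # 990
```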

The sequence of operations:

1)  User navigates the mouse from the upper left to the middle of the "First Name" field.

2)  User types "J a n e" at a rate of 330 ms per keystroke.

3)  User navigates from the last position to the middle of the "Last Name" field.

4)  User types "D o e".

5)  User navigates from the last mouse position to the middle of the "Female" checkbox.

6)  User navigates to the middle of the "OK" button.

Fitts's Law:

MT = a + b * ID

ID = log2(Distance moved / Width of target + 1)

a = 548, b = 420
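
As a rough illustration of how each movement time is obtained, here is a minimal sketch using the standard form of Fitts's law, MT = a + b * ID. It assumes a and b are in milliseconds (ms and ms per bit). The distance and width in the example call are hypothetical values, not measurements taken from the form in the question.

```python
import math

A_MS = 548  # intercept a from above (ASSUMED to be milliseconds)
B_MS = 420  # slope b from above (ASSUMED to be milliseconds per bit)


def index_of_difficulty(distance, width):
    """ID = log2(distance / width + 1), in bits."""
    return math.log2(distance / width + 1)


def movement_time_ms(distance, width, a=A_MS, b=B_MS):
    """Standard Fitts's law: MT = a + b * ID."""
    return a + b * index_of_difficulty(distance, width)


# HYPOTHETICAL example: a 300 px move to a 100 px wide target.
print(index_of_difficulty(300, 100))  # 2.0
print(movement_time_ms(300, 100))     # 1388.0
```

The task total would then be the sum of the four movement times plus the two typing times.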

1)

Width here is calculated via triangulation:

2) Four chars @ 330ms/stroke = 1,320ms

3)

4) Three chars @ 330ms/stroke = 990ms

5)

6)

The total time here is 10,117 ms, or about 10 s, to complete this task.

Question 4

Discuss how transfer and interference would have impacted a user as they transitioned from the old version of Gmail to the new version, as shown below. Focus only on the part of the UI highlighted.

Answer:

The "archive" function transfers directly from the old Gmail interface to the new one. It behaves the same: checked messages are archived. The same applies to the "report spam" button and the refresh link. These elements exist in both versions and behave the same.

The operations for selecting messages involve both transfer and interference. The checkboxes remain the same, but there are no longer obvious ways to select all, none, read, etc., which creates some interference. However, since these options are now available via a dropdown, Google probably hoped that knowledge transfer would occur through the small vertical arrow indicating that interaction.

Labels look different in the new UI. They are much less pronounced, having lost the green box. This probably caused many users to wonder where the labels went.


Question 4a

In particular, address the transfer and/or interference that impacts the interaction with the “select all widget” introduced in the new version. Aza Raskin has written about this on his Flickr account: http://www.flickr.com/photos/azaraskin/4886664008/

If you have access to a gmail account, try it out. If not, here’s how it works:

1.  If you click the box while closed it selects all messages on the current page and the box is checked.

2.  If you click the arrow next to the box, it opens a menu.

3.  If you then select an item in the menu…

4.  The appropriate messages are selected and the box is checked, but grayed out.

For the discussion of this particular widget, consider the transfer / interference that occurs from previous interactions with “conventional” checkboxes such as the ones shown here, as well as previous interactions with gmail.

Answer:

The Gmail select dropdown has knowledge transfer only in its immediate interaction. Like a typical dropdown, the arrow indicates that there are further choices available related to the function of the visible option. It behaves as expected, with a single click on the arrow revealing these options.

However, this innocently small widget attempts to blend the functionality of two pervasive GUI elements and in the process creates quite a bit of interference.

"Normal" checkboxes behave as multiple-choice selections and allow the user to toggle them on and off to indicate whether a choice is selected. Typically they group related options of a single feature together. When a checkbox is greyed out, it's generally understood that the greyed choice is not available.

In the case of Gmail, the checkbox is dimmed when an incomplete set of the inbox is selected. This creates interference because the Gmail checkbox isn't actually inactive; to me, this is the biggest source of interference. That said, it's not as if this new behavior can't be learned quickly.

I also expected there to be some indication of what was selected once I chose it from this menu, i.e. if I selected "starred" I expected there to be a checkmark next to that selection in the menu to reinforce the fact that I made the selection. Gmail does not give any checkbox-like feedback that a selection was made, making it necessary to quickly scan the list of messages to determine what was selected. In my particular case, the only starred message was several pages down, so there was no visual indication that anything had occurred.

Also of note is Gmail's use of checkboxes and labels.

I expected these to be toggles for the messages that have the labels applied; instead, the checkboxes revealed a plethora of options that I didn't quite understand, and clicking the words themselves ran a search for the label text.


Question 5

For each of the following lists and the given task, say which would be easier for people to perform. Say why, giving multiple explanations if appropriate.

5.1 Paired Association

SUB-CAN TOP-OUT CUP-LOT MEN-CAN

TUB-MAN POT-OUT CUP-SAT LOT-SUB

5.2 Target Search

Q C O G U D (Target = G)

A U I Z O P (Target = Z)

5.3 Free Recall

SUB CAN TOP OUT CUP LOT MEN

RED TAN BLUE CUP FORK PLATE GREEN SPOON

5.4 Serial List Recall

SUB CAN TOP OUT CUP LOT MEN

BRU AUM HIR GIB SPO CAV MIH

5.5 Serial List Recall

SUB CAN TOP OUT CUP LOT MEN

S B U C N T A P O U P C O T L N M E

Answer:

5.1: The second list is easier. The first set pairs multiple stimulus words with the same response ("sub-can" and "men-can"), which generally makes a list harder to memorize than one that repeats no stimulus or response items.

5.2: The second list is easier. When scanning a list of items, the first step is perception of the items, by "analyzing the raw features of a stimulus or event" [Wickens, 124]. After perception the "selection of channels to attend (and filtering of channels to ignore) is typically driven by four factors: salience, effort, expectancy, and value" [Wickens, 123]. In the first set, the letter G has little visual salience relative to the other letters, because the entire set consists of similarly shaped letterforms. By contrast, the second set is made up of letters that have less in common. Further, according to [Wickens, 131], the first set would be harder "because of the greater confusability of the acoustic features", i.e. "C", "G", and "D" sound similar.

5.3: The second list is easier. Even though it is longer (while still falling within 7±2 items), a subject is more likely to form meaningful ties between its words. Because "The more associations there are among the words, the easier it is to recall the words" [Norman, 155], subjects will likely pair the words in ways such as "Red Cup" and "Blue Plate", thus effectively creating fewer chunks "clustered according to meaningful groups" [Norman, 155].

5.4: The first list is easier. The second list comes across less as words with associated meaning and more as random groups of letters. In essence, it becomes an arduous task of memorizing not seven chunks but twenty-one, well above the "upper limit … of working memory … around 7 ± 2 chunks of information" [Wickens, 129]. The first list, because of its relative familiarity, can have visual or object associations (e.g. sub = "submarine"), and the user might feasibly make paired associations wherever possible (e.g. "sub to the top") in order to reduce the number of chunks memorized.

5.5: As in 5.4, the first list is far easier to memorize because there are fewer chunks to memorize. Even though both lists are made up of the same letters, the three-letter groupings reduce the number of chunks and also create associations with words already in long-term memory.

Question 6

Research has shown that users follow an F-shaped pattern when scanning search results. See a report on this here: http://www.useit.com/alertbox/reading_pattern.html

Based on what we have been discussing and reading about attention, explain why this would be. Be sure to explain why the “F” peters out after only the first few results.

Answer:

When searching quickly for information on sites, the location of the desired information is often unknown, so the eye will first conduct survey dwells [WickensHollands, ch3, 77] "to establish those regions more likely to contain a target" [ibid]. These quick saccades around the page find areas of interest either through top-down processing, driven by "Searcher expectancies of where the target might be likely to lie." [Wickens, ch4, 81], or through bottom-up processing, driven by involuntary captures of attention based on movement or visual salience.

After scanning the page, a user then reads as expected in Western cultures, where sentences flow from left to right, thus creating the longer "examination dwell" [WickensHollands, ch3, 77] fixations that make up the F-shapes.

With valuable areas of the page detected through surveying, longer fixations occur because "the probability of locating a target will increase with more search time." [WickensHollands, ch3, 78], although this probability "increase[s] at a diminishing rate" [ibid]. Thus, as the search wears on, the fixations diminish as the resources available for attention decrease and the amount of information put through working memory increases.