Training Wheel Visualizations

VISUALIZATION USABILITY FOR NOVICE USERS - PROJECT IDEAS

Hundreds of visualizations are developed, and papers published, that boast of compact representation and fast loading times. But it is of utmost importance that visualizations make sense to users. I am Sujatha Krishnamoorthy, a master’s thesis student. My research is on making visualizations easily learnable for novices. Here are some project ideas related to this. I’ll be happy to discuss them with those interested! Much of this research was inspired by usability tests on the DataMaps software. Following is some background on the current version of this tool.

BACKGROUND - DATAMAPS TOOL

DataMaps [2] is a front-end tool for visualization and analysis of census data on the United States. Census databases are very large: data has been collected across states and counties for thousands of attributes. “Attributes” are items such as “total population”, “percent white population”, “percent black population”, “crime rate” and so on. There are approximately 8000 of these attributes in the database, grouped together by related categories.

DataMaps uses multiple coordinated views of data. It has a map view, a histogram view, a detailed spreadsheet and a scatter plot view (as seen in Figure 1). The histogram shows how many states fall in different value ranges of the given attribute. The map view shows the values of a chosen attribute (out of the 8000) using a color scale (the lighter the color of a state, the greater the value for that state). The scatter plot aids in comparing two attributes, for example, “what is the relationship between total population and crime rate?” The detailed spreadsheet is activated by clicking on a region of the map and presents numerical data values in tabular form. This is the textual representation of the same data.

DataMaps has been designed for a wide variety of users. Designing visualization tools for a very diverse user population has its challenges. Users range from young to old, from novices to experts in their domains, from novices to experts in using visualizations, and so on.
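To make the histogram and map encodings described above concrete, here is a minimal Python sketch. It is not the DataMaps implementation: the function names, the equal-width binning and the linear lightness scale are all assumptions made for illustration.

```python
from typing import Dict, List


def histogram_counts(values: Dict[str, float], bins: int) -> List[int]:
    """Count how many states fall into each of `bins` equal-width value ranges."""
    lo, hi = min(values.values()), max(values.values())
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal
    counts = [0] * bins
    for v in values.values():
        idx = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    return counts


def lightness(value: float, lo: float, hi: float) -> float:
    """Map a value to a lightness in [0, 1]; lighter (closer to 1) means greater."""
    if hi == lo:
        return 1.0
    return (value - lo) / (hi - lo)
```

With this encoding, the state holding the largest value of the chosen attribute gets lightness 1.0 and is drawn in the lightest color.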

A number of usability tests conducted on DataMaps reveal a basic problem that is perhaps one of the most important problems faced by all visualization tools: novice users have great difficulty getting accustomed to them. They are still “thinking textually” instead of visually.

There were questions in the usability test such as “out of these 20 states, which one has the highest white population?” for which users referred only to the table. They completely ignored the visualizations, which would have revealed the answer at a glance: the scatter plot for the 20 states would have shown the highest value immediately. Instead, users laboriously looked up columns and columns of data values in numeric form. The users’ behavior in the usability tests may be attributed to the “production bias” exhibited by novice users, as described in Carroll and Rosson’s “Paradox of the active user” [1]:

The users’ primary goal is task completion. There is a bias towards productivity (production bias) rather than learning the system features. The system is, after all, a means to an end. This automatically reduces the motivation needed to learn new procedures offered by the system that are more efficient methods for task completion. Users are likely to stick to what they know. (In this particular case, users borrowed from their general knowledge of tables.)

Figure 1: A screenshot of DataMaps

The paradox lies in this: if only they would spend some time understanding the system, they could get their tasks done faster. However, with the aim of doing their tasks, they ignore learning the system and hence end up doing their tasks very inefficiently. There is persistent use of inefficient procedures in interactive tasks by experienced or even expert users when demonstrably more efficient procedures exist. Users would save time in the long run by taking initial time to learn the system and discover optimal methods. However, we see that this is not how users in the real world behave, and we cannot design for the idealized rational user who would read a manual before starting their tasks.

It is commonly believed that users of an interface learn most of what they know early on and thereafter add only a little to their knowledge. It is therefore of great importance that visualizations address the problem of making novice users comfortable. A visualization is of use only if it is understood!

PROJECT IDEA 1: TRAINING WHEEL VISUALIZATIONS

The name of this project idea is derived from the analogy of using training wheels for people who are learning to ride a bicycle. Visualizations can provide “training wheels” too. The training wheels version would include features such as slow animation and accompanying text that make it clear to the users what the visualization means. Just as training wheels are removed after learning to ride a bicycle, this feature can be turned off once the users are comfortable with the visualizations.

This project would be to develop a “training wheels” version of a visualization, to make users understand the syntax and semantics of the visualization, choosing one or more of the visualizations:

(a) maps

(b) scatter plots

(c) histograms

Syntax may be defined as the visual encoding used to represent data. Semantics may be defined as the meaning or interpretation of the visualization, composed of all the visual cues put together. For example, “a small square represents a state” is syntax. “If the density of dots is great around a region, that means all those states have values in the same range” is semantics. The training wheels version may use the following design features or other features gleaned from your literature reviews:

- animation techniques: for example, consider animations in a scatter plot. Animated arrows can show how the x coordinate value and the y coordinate value determine the position of the dot in the plot. Then the dot and its label can slowly appear with a fade-in effect, followed by the next dot, and the next, and so on. (Refer to the concept link on the project ideas page of the course website.)

- corresponding verbal explanations of the semantics: for example, consider choosing regions on the scatter plot. The user can drag the mouse over a number of dots to select them. Density may be computed for the selected region and the semantics explained verbally, for example, “clusters of dots indicate that many states have this value”.
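The two design features above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `animation_steps` and `explain_selection`, the `fade_frames` and `factor` parameters, and the rule "a selection is a cluster when it is at least `factor` times denser than the plot as a whole" are all hypothetical choices, not part of DataMaps.

```python
from typing import Iterator, List, Tuple

Point = Tuple[str, float, float]  # (label, x value, y value)


def animation_steps(points: List[Point], fade_frames: int = 4) -> Iterator[str]:
    """Yield, in playback order, the steps of a slow scatter plot animation."""
    for label, x, y in points:
        yield f"arrow from x-axis at {x} up to the dot for {label}"
        yield f"arrow from y-axis at {y} across to the dot for {label}"
        for frame in range(1, fade_frames + 1):
            opacity = frame / fade_frames  # fade in from faint to fully opaque
            yield f"draw dot and label for {label} at opacity {opacity:.2f}"


def explain_selection(points: List[Point],
                      x_range: Tuple[float, float],
                      y_range: Tuple[float, float],
                      factor: float = 2.0) -> str:
    """Describe in words what the dots inside a dragged rectangle mean."""
    (x0, x1), (y0, y1) = x_range, y_range
    selected = [p for p in points if x0 <= p[1] <= x1 and y0 <= p[2] <= y1]
    if not selected:
        return "No dots are selected in this region."
    xs = [p[1] for p in points]
    ys = [p[2] for p in points]
    sel_area = (x1 - x0) * (y1 - y0)
    plot_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    n, total = len(selected), len(points)
    if sel_area <= 0 or plot_area <= 0:
        return f"{n} of {total} dots are selected."
    # Assumed rule: compare dots per unit area inside the selection with the whole plot.
    if n / sel_area >= factor * (total / plot_area):
        return f"{n} of {total} dots are clustered here: many states have values in this range."
    return f"{n} of {total} dots are spread thinly here: few states have values in this range."
```

A real training wheels version would render these steps graphically; generating them as text first keeps the ordering logic easy to check before any drawing code is written.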

Weeks / Phase / Details
1, 2 / Literature review / Look up papers in support of training wheels methods. Prepare a report which will go into the final documentation. Simultaneously generate design ideas.
3 / Design / Make final conclusions from the literature review. Choose the visualization for which the training wheels version is to be developed. Make detailed specifications for how the training wheels should be implemented, choosing from techniques such as animation, enhanced interactive ability, and verbal explanations. Prepare a design rationale report which will go into the final documentation.
4, 5, 6, 7, 8 / Implement / Make design changes if needed, consult with instructors/advisors, implement a working model. This would perhaps be the longest phase of all. Develop brief documentation.
9, 10, 11, 12 / Testing / Design test questions to check the understanding of novices. Invite volunteer novices and conduct the test. Document the testing procedure, results and conclusions.
13, 14 / Documentation / Clean up the final version of the documentation.

PROJECT IDEA 2: DATA CENTRIC DESIGN

Multi-dimensional visualization tools such as DataMaps provide several individual visualizations: histograms, scatter plots, a color-coded map, and tables for looking at the values numerically instead of visually.

However, novice users seldom grasp which visualizations to use for which tasks. For example, locating the state with maximum white population can be done at a glance by looking at the scatterplot and checking which dot is at the highest position.

But novice users, being more familiar with tables, resort to reading off values from columns of tables, trying to spot the greatest number, even though this procedure may be much more time-consuming. This has happened in several usability tests of DataMaps.

This may be attributed to the fact that users are most familiar with names of attributes and usage of tables. They carry over their previous knowledge and go about the task, even though the tool may provide faster ways of doing the same. Also, they just want to get the task done, without trying to learn about the tool.

Novice users want to:

- bring in their previous knowledge

- learn by doing (exploratory learning)

- just get the task done (without reading through manuals, help, etc.)

We can design a visualization tool that leverages these tendencies to teach novices the best way to do a task. A data-centric approach has been suggested as a solution to this.

In a data-centric approach, the mechanisms to explore the data attributes are made more prominent than the individual visualization tools. For example, an explorer bar similar to Windows Explorer, showing groupings of the 8000 attributes that can be visualized, is one way of implementing a data-centric visualization tool.
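As a rough illustration of the explorer bar idea, the Python sketch below groups attributes under categories and renders them as an indented tree. The category names ("Population", "Crime") and the functions `build_explorer` and `render` are hypothetical; only the attribute names come from the examples given earlier.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_explorer(attributes: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (category, attribute name) pairs into a sorted category -> names mapping."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for category, name in attributes:
        groups[category].append(name)
    return {category: sorted(names) for category, names in sorted(groups.items())}


def render(tree: Dict[str, List[str]]) -> str:
    """Render the grouping as text, one category per node with its attributes indented."""
    lines: List[str] = []
    for category, names in tree.items():
        lines.append(f"+ {category} ({len(names)})")
        lines.extend(f"    {name}" for name in names)
    return "\n".join(lines)


# Hypothetical categories; the attribute names are the examples used in this document.
tree = build_explorer([
    ("Population", "total population"),
    ("Population", "percent white population"),
    ("Population", "percent black population"),
    ("Crime", "crime rate"),
])
print(render(tree))
```

Showing the attribute count next to each category is one simple way to keep 8000 attributes browsable without listing them all at once.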

This project will be to implement a data-centric version of the DataMaps software and test it for usability. Currently the DataMaps tool just shows all visualizations prominently; it is vis-centric. The list of attributes is not displayed prominently, and interaction with them is limited. Since it is not possible to implement a full working version of data-centric DataMaps with all visualizations, the scope may be narrowed down to measuring the effects of constructing a data-centric version.

A data-centric version includes one or more of the following:

- prominent mechanism to explore all the attributes which can be visualized

- prominent mechanisms to choose an interesting subset of attributes to be visualized

- user-system dialogs that deal with what types of tasks a user wants to do (for example, “I want to compare different subsets of counties” )
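A user-system dialog of the kind listed above could start from a simple lookup from task to suggested view. The Python sketch below is a hypothetical illustration: the task phrasings and function names are invented here, and the view suggested for each task follows the descriptions of the views given in this document rather than any tested result.

```python
from typing import Optional

# Task phrasings are hypothetical; each suggested view follows this document's descriptions.
RECOMMENDED_VIEW = {
    "compare two attributes": "scatter plot",
    "locate highest value": "scatter plot",
    "count states per value range": "histogram",
    "compare clusters of counties": "map",
    "read exact values": "spreadsheet",
}


def recommend_view(task: str) -> Optional[str]:
    """Return the suggested view for a task, or None when the task is not recognized."""
    return RECOMMENDED_VIEW.get(task.strip().lower())


def dialog_reply(task: str) -> str:
    """Turn a recommendation into the sentence the tool would show to the user."""
    view = recommend_view(task)
    if view is None:
        return "Please pick a task from the list, or browse the attributes."
    return f"For this task, try the {view} view."
```

Keeping the mapping in one table makes it easy to revise once usability tests show which view actually works best for each task.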

Weeks / Phase / Details
1, 2 / Literature review / Look up papers in support of data-centric methods. These may be for visualizations other than geographic too. Glean the main lessons learnt and simultaneously generate design ideas. Prepare a report which will go into the final documentation.
3 / Design of data-centric approach and a test-bed of tasks / Make final conclusions from the literature review and make the final design plan. The design should include a mechanism to explore attributes, and layout and interaction mechanisms which help users choose the best way to visualize given their task. Formulate a task set including tasks like “locate region with highest value of x”, and pre-compute the optimal method of executing each. Document the design; it will go into the final version.
4, 5, 6, 7, 8 / Implement / Implement a working model. This would perhaps be the longest phase of all. Develop brief documentation.
9, 10, 11, 12 / Testing / Actual visualizations need not be used, but users can be tested on choosing the best procedure for executing a task. For example, for comparing population values of three clusters of counties, you may have pre-computed that choosing the map is the best option; here you test whether the data-centric approach causes novices to follow that. If they choose the correct option, it is noted as a success. Document the testing.
13, 14 / Documentation / Write up the final version of the documentation, including all portions.
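The success measure described in the testing phase can be sketched as below. This Python example assumes a hypothetical record format of (participant, task, chosen view); the participant IDs and task names are placeholders for illustration, not results from any actual test.

```python
from typing import Dict, List, Tuple


def success_rate(optimal: Dict[str, str],
                 observations: List[Tuple[str, str, str]]) -> Dict[str, float]:
    """Return, per task, the share of participants who chose the pre-computed best view."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for _participant, task, chosen in observations:
        totals[task] = totals.get(task, 0) + 1
        if chosen == optimal[task]:  # a success is choosing the pre-computed option
            hits[task] = hits.get(task, 0) + 1
    return {task: hits.get(task, 0) / count for task, count in totals.items()}


# Placeholder data for illustration only.
optimal = {
    "compare clusters of counties": "map",
    "locate highest value": "scatter plot",
}
observations = [
    ("P1", "compare clusters of counties", "map"),
    ("P2", "compare clusters of counties", "spreadsheet"),
    ("P1", "locate highest value", "scatter plot"),
    ("P2", "locate highest value", "scatter plot"),
]
print(success_rate(optimal, observations))
```

Comparing these rates between the vis-centric and data-centric versions would give one measure of whether the redesign steers novices toward the better procedure.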

PROJECT IDEA 3: INFORMATION DESIGN

Another interesting project would be to develop the system functionality of DataMaps: what must this visualization tool display? Currently, it has a map, a scatter plot and histograms. Each of these visualizations seeks to convey a kind of insight. For example, scatter plots convey the relationship between two attributes on the x and y axes. Is that all that can be inferred? There could be plenty more insights that can be extracted from the data. Different kinds of visualizations provide different kinds of insights.

For example, in the current version of DataMaps, there are no pie charts. A survey indicated that almost every user wanted a pie-chart view of percentage variables.

There are attributes such as “percentage white population”, “percentage black population”, “percentage Hispanic population”, “total population”. Taking Texas for instance, most users wanted to be able to see a pie chart of these percentages for Texas, with the races making up the slices of the pie. Pie charts could well be an important addition to the current version of DataMaps.
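A pie-chart view of percentage attributes mostly comes down to converting percentages into slice angles. The Python sketch below shows one way to do that; the function name, the added "other" slice for any remainder, and the sample numbers are all assumptions (the numbers are made-up placeholders, not census figures for any state).

```python
from typing import Dict


def pie_slices(percentages: Dict[str, float]) -> Dict[str, float]:
    """Convert percentage attributes into pie slice angles in degrees."""
    total = sum(percentages.values())
    if total > 100.0 + 1e-9:
        raise ValueError("percentages add up to more than 100")
    slices = dict(percentages)
    if total < 100.0:
        slices["other"] = 100.0 - total  # assumed: show the remainder as its own slice
    return {name: pct * 360.0 / 100.0 for name, pct in slices.items()}


# Made-up placeholder percentages for illustration; not real census values.
example = pie_slices({"white": 50, "black": 30, "hispanic": 15})
print(example)
```

Note that an attribute such as “total population” is a count, not a percentage, so it would label the chart rather than form a slice.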

The goal of this project would be to examine the 8000 attributes collected by the census department and determine what kinds of visualizations should be provided. This would be an information design project. It would involve formulating requirements based on surveys or interviews with people representative of potential users. Based on the requirements analysis, the system functionality, information design and interaction design can be determined.

REFERENCES

[1] Carroll, J.M. & Rosson, M.B. (1987). Paradox of the Active User. In (Ed. J.M. Carroll) Interfacing Thought. MIT Press.

[2] Datamaps for census visualization.